A Memory-Resident Malware Detection Framework Based On Memory Forensics and Deep Neural Network

The document presents MRm-DLDet, a novel memory-resident malware detection framework that utilizes memory forensics and deep learning to effectively identify malware that operates solely in memory. By converting memory dumps into ultra-high resolution RGB images and employing a neural network for feature extraction, MRm-DLDet achieves a detection accuracy of 98.34%, outperforming existing methods. The framework addresses challenges in current detection techniques, such as reliance on expert knowledge and limitations in handling large image sizes.

Liu et al. Cybersecurity (2023) 6:21
https://doi.org/10.1186/s42400-023-00157-w

RESEARCH Open Access

MRm-DLDet: a memory-resident malware detection framework based on memory forensics and deep neural network
Jiaxi Liu1,2, Yun Feng1, Xinyu Liu1,2, Jianjun Zhao1,2 and Qixu Liu1,2*

Abstract
Cyber attackers have constantly updated their attack techniques in recent years to evade antivirus software detection. One popular evasion method is to execute malicious code and perform malicious actions only in memory. Malicious programs that use this method are called memory-resident malware; they have excellent evasion capability and pose a serious threat to cyber security. Traditional static and dynamic methods are not effective in detecting memory-resident malware. In addition, existing memory forensics detection solutions achieve unsatisfactory detection rates and depend on extensive expert knowledge of memory analysis. This paper proposes MRm-DLDet, a state-of-the-art memory-resident malware detection framework, to overcome these drawbacks. MRm-DLDet first builds a virtual machine environment and captures memory dumps, then processes the memory dumps into RGB images using a pre-processing technique that combines deduplication and ultra-high resolution image cropping. Our neural network, MRmNet, then extracts high-dimensional features from the memory dump images and classifies them. MRmNet feeds the labeled sub-images cropped from the high-resolution RGB images into ResNet-18, which extracts the features of the sub-images, and then trains a network of gated recurrent units with an attention mechanism. Finally, a specially designed voting layer determines whether a program is memory-resident malware based on the detection results of each sub-image. We created a high-quality dataset consisting of 2,060 benign and memory-resident programs, which yields 1,287,500 labeled sub-images cut from the ultra-high resolution RGB images produced by MRm-DLDet. We implement MRm-DLDet for Windows 10, and it performs better than the latest methods, with a detection accuracy of up to 98.34%. Moreover, we measured the effects of mimicry and adversarial attacks on MRm-DLDet, and the experimental results demonstrated its robustness.
Keywords Memory-resident malware, Memory forensics, Malware detection, Deep learning, Ultra-high resolution image
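
As a quick sanity check on the dataset figures in the abstract, 1,287,500 sub-images over 2,060 programs works out to exactly 625 sub-images per program on average. The sketch below verifies that arithmetic and illustrates one way a non-overlapping sliding window could produce a grid of that size. The function name `tile_boxes` and the 240-pixel tile side are assumptions made purely for illustration; the abstract does not state the sub-image size, and 6000 × 6000 is only the approximate minimum image size reported in the paper.

```python
def tile_boxes(width, height, tile):
    """Return (left, top, right, bottom) boxes for a non-overlapping sliding window.

    Assumption of this sketch: edge remainders smaller than `tile` are dropped.
    """
    return [
        (x, y, x + tile, y + tile)
        for y in range(0, height - tile + 1, tile)
        for x in range(0, width - tile + 1, tile)
    ]


# Figures taken from the abstract.
programs = 2060
sub_images = 1287500
per_program = sub_images // programs  # average sub-images per program

# Illustrative only: a 6000 x 6000 image cut with a hypothetical 240-pixel window.
boxes = tile_boxes(6000, 6000, 240)
print(per_program, len(boxes))
```

Because the window never overlaps, every pixel inside the covered area belongs to exactly one sub-image, which is why cropping preserves the memory data that a direct resize would discard.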

*Correspondence: Qixu Liu, [email protected]
1 Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100085, China
2 School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China

© The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Introduction
Over the years, artificial intelligence (AI) techniques have significantly promoted the efficiency and ability of file-based malware detection engines. Yet, at the same time, cyber-attackers also keep exploring advanced methods to evade or compromise antivirus software. One such method is In-memory Code Execution (ICE) (Fewer 2008; Team 2021; Paschen 2020; Malik 2019; odzhan 2019; Microsoft 2018; Kumar et al. 2020). ICE attacks only execute malicious operations in memory and leave
almost no evidence on the disk, making it challenging for traditional static and dynamic analysis methods to detect (Arefi et al. 2018; Alrawi et al. 2021). For instance, in cyberattacks against the National Bank of Malawi (CERT 2018), attackers rewrote and recompiled multiple open-source codes, embedding the encrypted DarkComet (Lesueur 2020) remote access trojan (RAT) into the relevant code. At run time, the encrypted data is loaded, decrypted, and expanded into a complete DarkComet RAT portable executable (PE) file in memory. In this way, the payload implements an ICE attack to avoid being killed and to bypass anti-virus solutions. Advanced Persistent Threat (APT) groups and malware families such as Lazarus Group (MITRE 2021), Poweliks (O’Murchu and Gutierrez 2015), and Zeus (Binsalleeh et al. 2010) have all employed ICE attacks.

Kumar et al. (2020) briefly present memory-resident malware. In this work, we define memory-resident malware as malware that executes shellcode and PE files in memory using ICE techniques.

With the spread of memory-resident malware, memory forensics has become more critical. Memory forensics (MF) is a technique that captures volatile memory data in the form of memory dumps and analyzes them. Memory dumps contain the processes, network connections, open files, and registry modifications created during the malware’s runtime, which are significant traces for identifying memory-resident malware. There have been many memory forensics studies incorporating machine learning (Barabosch et al. 2017; Bozkir et al. 2021; Wang et al. 2020; Sihwail et al. 2021). These works have significantly improved the accuracy and efficiency of memory-resident malware detection. Unfortunately, variants of ICE attacks may inject into different processes and use different methods to load shellcode or PE files. Moreover, attackers always look for previously undetected vulnerable processes or methods to construct advanced attacks. Thus, manual feature engineering requires analysts to possess extensive domain knowledge to distinguish highly discriminative features from weakly discriminative ones.

Challenges
Based on the above discussion, existing memory-resident malware detection methods face two challenges: (1) The accuracy of detection frameworks relies on various hand-crafted features of memory-resident malware, which requires extensive expert knowledge of memory analysis and is somewhat subjective and not generalizable. (2) Existing detection tools do not take full advantage of the information in memory data.

In the malware detection field, many studies have used computer vision to convert malware into images and then classify malware programs by specific image features (Nataraj et al. 2011; Ni et al. 2018; Bozkir et al. 2021; O’Shaughnessy and Sheridan 2022). These studies obtained strong detection results. However, existing vision-based malware detection techniques usually analyze PE files directly. They still face the drawbacks of existing static and dynamic malware detection methods, i.e., they cannot effectively detect malware that runs only in memory. Moreover, a memory dump can be reshaped into an RGB image, but the size of a memory dump file is the same as the virtual machine’s memory, which is 2 GB or above. Therefore, the generated memory dump images are of ultra-high resolution, and their minimum size is about 6000 × 6000 pixels after our processing method (more details can be found in the “MRm-DLDet” section). However, existing vision-based malware detection methods only accept regular-size pictures as input. For example, Ni et al. (2018) used images of at most 32 × 32 pixels for their model; the detection model of Bozkir et al. (2021) extracts features from 256 × 256 pixel images converted from malware; and O’Shaughnessy and Sheridan (2022) proposed a hybrid malware classification model that extracts visual features from images with a maximum size of 512 × 512 pixels. None of these can handle our ultra-high resolution images. Thus, it can be inferred that existing vision-based detection methods cannot efficiently handle ultra-high resolution images.

Motivation
Therefore, we propose a novel approach that combines the information in the malware’s whole memory dump, such as memory pages, processes, and other related data, with a deep neural network for detection. It addresses the difficulty that traditional static and dynamic analysis methods have in detecting memory-resident malware, as well as the two challenges above. Our work can make better use of malware-specific execution data to detect memory-resident malware by converting memory dumps to pictures, without extensive and complex expert knowledge. A memory dump file can be converted into an RGB image, every pixel of a memory dump image is associated with memory data, and the difference between images can help separate benign from malicious memory dumps. Moreover, this paper designs a memory dump file preprocessing method to relieve the storage space pressure caused by the size of memory dump files and to solve the problem that existing vision-based malware detection methods cannot handle ultra-high resolution images.

In order to further discuss whether visualization can help detect memory-resident malware, we analyzed the memory dump of a Lazarus Group sample. An MS-DOS header is found at address 0xB14D000 of this memory dump, which is the start of a PE file. Lazarus
implements the ICE attack by decrypting the payload and loading it into its own memory space, so the PE file found at 0xB14D000 is the payload of the malware from Lazarus Group. We also analyzed a benign sample for comparison; Fig. 1 shows this motivating example. The two images on top are data from the same location in the two memory dump files. We converted the two memory dumps to RGB images with our framework. The bottom of Fig. 1 shows the parts of the two RGB images that correspond to the code fragments of the two dumps at addresses 0xB14D000 to 0xB1589DF. These images differ significantly in color, texture, and structure. That leads us to a driving thesis of our work: The semantic and structural differences between malicious memory-resident activities and benign memory dumps can be effectively identified by visually comparing memory dump RGB images. A detailed description of the memory dump files and their visualization can be found in the “MRm-DLDet” section.

Fig. 1 A motivating example of memory dump visualization

Our work
We propose a state-of-the-art memory forensic framework based on deep learning called MRm-DLDet (Memory-Resident malware Deep Learning Detector). MRm-DLDet first captures memory dumps and reduces their size by deleting duplicate memory pages. Then MRm-DLDet converts the memory dumps into ultra-high resolution RGB images and uses a non-overlapping sliding window to crop the images into sub-images, which serve as inputs to MRm-DLDet’s MRmNet neural network. MRmNet combines ResNet-18 (He et al. 2016), gated recurrent units (Cho et al. 2014), and an attention mechanism (Zhou et al. 2016). Our framework overcomes the two challenges faced by existing memory-resident malware detection methods and solves the problem that current image detection methods cannot handle ultra-high resolution images. Experiments show that MRm-DLDet has a high detection accuracy (98.34%).

To summarize, in this paper, we make the following contributions:

• We comb through the latest ICE methods from malware families and APT groups and are the first to define malware that uses ICE methods to execute shellcode or malicious PE files in memory as memory-resident malware.
• We propose the first memory-resident malware detection framework that combines memory forensics and deep learning, named MRm-DLDet, which focuses on capturing and analyzing memory dumps. MRm-DLDet has a virtual machine environment for capturing memory dumps, a novel memory dump preprocessing method combining data deduplication and ultra-high resolution image cropping, and a neural network named MRmNet.
• Because of the lack of publicly available open-source datasets for memory-resident malware detection, we collected a dataset of 2,060 benign and malicious programs. The memory dumps of the programs are converted into ultra-high resolution images and cropped into 1,287,500 sub-images. Our dataset is available publicly for non-commercial use.
• We studied the influence of different image sizes on the performance of MRm-DLDet and compared MRm-DLDet with the most advanced methods. Compared to the latest methods, our framework is better in all experimental evaluation metrics. Specifically, MRm-DLDet has a detection rate of up to 98.34%.

Related work
Memory forensics-based memory-resident malware detection
Since Malik (2019) first demonstrated how to manually load and run portable executables entirely from memory, in-memory malware execution has gradually become a prevalent attack method for cyber attackers. In this paper, we divide MF-based malicious code detection methods into two types based on their technical bases: methods based on the characteristics of the operating system and memory pages (OS Characteristics Based), and detection methods combining artificial intelligence
(AI Based). Figure 2 succinctly shows the latest MF-based malware detection techniques.

Fig. 2 Classification of related work on memory forensics-based malware detection

OS characteristics-based
Volatility (Foundation 2020) is the most widely used and authoritative open-source MF-based framework, with a plugin called malfind. It determines whether a process is suspicious by checking the virtual address descriptors (VADs) of its memory pages; a process’s VAD tree describes the layout of its memory segments. The VAD tree also holds information about the type and level of protection (read, write, execute) of the memory pages (Ligh et al. 2014), in addition to information related to the mapped object and several other flags. For example, if the protection field of a memory page is set to “PAGE_EXECUTE_READWRITE”, malfind will initially flag the process as malicious. However, malfind can be easily bypassed by malware developers. For instance, the “Bypass Malfind” method proposed by Block and Dewald (2019) allocates memory for the memory-resident malware with READONLY protection and then changes the protection of all contained pages to EXECUTE_READWRITE via VirtualProtectEx. What’s more, malfind requires an expert in memory forensics: it does not provide a post-processing algorithm to distinguish benign software from malicious software, so extensive expert knowledge of memory forensics is needed to analyze Volatility’s output and determine whether a program is malicious.

Arefi et al. (2018) reported a reverse engineering tool named FAROS to detect in-memory-only malware injection attacks. FAROS focused on only three in-memory code injection techniques and was implemented only on a Windows 7 VM, without considering new attack methods on Windows 10 systems, which are now more widely used.

A very recent effort by Alrawi et al. (2021) presented a post-detection technique named FORECAST to automatically predict capabilities that malware has staged for execution. FORECAST guides a symbolic analysis of the malware’s code by leveraging the execution context of the ongoing attack from the malware’s memory image.

AI-based
Wang et al. (2020) proposed PROVDETECTOR, a provenance-based approach that detects malware with steganography. PROVDETECTOR first uses a novel selection algorithm to identify potentially malicious parts of the process’s OS-level provenance data. Then it applies neural embedding and machine learning. In another study, Quincy (Barabosch et al. 2017) extracted 38 features from volatile memory and used Random Forests and Extremely Randomized Trees to classify memory regions, achieving an AUC score of 93.8% on Windows XP but only 84.4% on Windows 10.

Bozkir et al. (2021) represented the memory dumps of suspicious processes as RGB images and reported 96.39% prediction accuracy by combining the RBF kernel-based SMO algorithm with GIST+HOG feature vectors. Still, this study only investigated malware processes and did not consider malware that hides in benign processes to execute, such as UUID Shellcode (Team 2021) and Earlybird (spotheplanet 2020). We assume that this limits the accuracy of the study by Bozkir et al. (2021) to some extent.

Sihwail et al. (2021) applied memory forensics to extract memory-based features from malware memory images to expose the actual behavior of malware. They used feature engineering and the SVM algorithm, converted the features into binary vectors, and obtained a classification accuracy of 98.5% on Windows 7.

However, this task is time-consuming for manual feature extraction and performs poorly on Windows 10.

Deep learning-based malware detection vision methods
In recent years, computer vision has been applied to malware detection with good results. The idea of converting files into images before detection inspired our research. Nataraj et al. (2011) first visualized malware binaries as grayscale images, based on the observation that images belonging to the same malware family appear very similar in layout and texture. Bozkir et al. (2019) employed several convolutional neural networks to classify persistent malware files. They converted the binary bytes of PE files into images and reported 97.48% detection accuracy in experiments.

Pinhero et al. (2021) used three malware visualization methods (grayscale maps, RGB maps, and Markov images) and then extracted features of the three types of images using Gabor filters. Twelve different neural networks were trained, reaching an F-measure of up to 99.97%. Tekerek and Yapici (2022) proposed a new CNN-based method for malicious code classification that converts byte files to gray and RGB image formats respectively. O’Shaughnessy and Sheridan (2022) proposed a hybrid framework for malware classification that sets an entropy threshold to quickly determine whether a sample is packed, then analyzes the samples using static and dynamic methods respectively. Static PE files or memory dump files of processes are mapped into images by space-filling curves, and the model then extracts visual features from the images, reporting an accuracy of 97.6%.

However, most existing vision-based malware detection methods directly analyze the binary files. O’Shaughnessy and Sheridan (2022) converted memory dumps to images while the malicious program was running, but they do not analyze the complete memory data. No existing vision-based malware detection effort can deal with ultra-high resolution images. Our work solves this problem by deduplicating complete memory dumps and using non-overlapping sliding windows to cut images into multiple sub-images.

Framework overview
We first discuss the threat model, then introduce the overall framework of MRm-DLDet. Finally, we describe in detail the background of ultra-high resolution image classification, one of the essential techniques used in this study.

Threat model
MRm-DLDet is a framework for memory-resident malware detection in Windows 10. It takes PE files as input, converts the memory dumps of the running PE programs to RGB images, and then uses a deep neural network to detect memory-resident malware. In this paper, memory-resident malware is primarily used as an attack payload that is executed directly in the victim computer’s memory instead of being written to the hard drive, in order to evade increasingly capable malware detection and remain invisible on the target device.

We assume that the memory-resident malware is executing its attack when we capture memory dumps. Recent research (Wang et al. 2020) can help relax this assumption. Moreover, we assume that all ICE attacks leave traces in in-memory data.

Our framework
MRm-DLDet is a memory-resident malware detection tool that integrates computer vision and deep learning techniques with memory forensics to model ICE attacks. Figure 3 gives an overview of the MRm-DLDet architecture, which consists of three main parts. We outline the approach here, and the following two sections provide complete information.

First, MRm-DLDet captures memory dump files and removes duplicate memory pages (A in Fig. 3). Next, we convert the deduplicated memory dumps to RGB images. The RGB images are ultra-high resolution since the memory dumps still contain a large amount of data after removing duplicates. Inspired by ultra-high resolution image processing methods in remote sensing image recognition, we propose a vision-based solution for processing enormous images. To avoid the loss of important information caused by traditional image scaling through multiple downsampling layers, we crop the enormous images into sub-images (B in Fig. 3). After that, the sub-images are fed into MRmNet. MRmNet extracts the feature vector of each sub-image with a pre-trained ResNet-18 network. The feature vectors of the sub-images together form the feature of the whole memory dump file, which is then fed into the gated recurrent units (GRU) model. Then, we add an attention layer to retain important details and prevent information loss. Finally, we design a voting layer to output the memory-resident malware detection results (C in Fig. 3).

Background on ultra-high resolution image classification
Our MRm-DLDet framework first visualizes binary files as RGB images. We then want to obtain the features of these images (i.e., features of the memory dump files) with the ResNet-18 network. However, the images converted from deduplicated dumps still have a minimum size of 6000 × 6000. Limited by the GPU memory of typical devices at this stage, it is not possible to handle the computation of ultra-high resolution images, and
processing such large images directly with CNNs will lead to memory overflow. Thus, it is necessary to preprocess ultra-high resolution images to reduce the image size while retaining the complete memory information contained in the image. Direct resizing is the easiest and fastest method, but it causes a substantial loss of features, resulting in poor detection accuracy.

In this work, we find inspiration in methods from the remote sensing image field. Van Etten (2018) proposed YOLT to detect small objects in large swaths of imagery. YOLT first uses a sliding window, whose size and overlap (15% by default) are defined by the user, to partition ultra-high resolution images into cutouts, then feeds them into a network architecture to train and test the model. The F1-score of YOLT is higher than 0.8. Wang et al. (2019) cropped remote sensing images into several small sub-images by large-scale cropping, using the non-overlapping sliding window method, to ensure that the large pictures are not scaled during training and testing.

In the MRm-DLDet framework, we use a non-overlapping sliding window technique to cut memory dump images, which preserves as much information as possible from the memory dumps compared to resizing images directly to the target size.

Fig. 3 The overview of MRm-DLDet framework

MRm-DLDet
To solve the challenges presented in the “Introduction” section, we created a memory-resident malware detection framework. This section describes the MRm-DLDet framework and its three component modules in detail.

Memory data collection and preprocessing module
This module executes malicious samples and benign programs in the prepared virtual machines and generates a memory snapshot file for each program. This module also performs memory dump deduplication as the first memory dump preprocessing step. Specifically, it includes the following three steps.

Modify virtual machine
We use the virtualization software VMware Workstation (VMware 2022) to generate memory dumps for memory-resident malware and benign samples. Before capturing the memory dump files, we first made some changes to the virtual machine settings, since malware is likely to be sensitive to its operating environment. For example, according to cybersecurity experts from G DATA (Ebach 2017), malware of the Zeus family verifies whether it is being launched on a VMware system by checking whether the \\.\HGFS file, the \\.\vmci file, or the registry key HKLM\SOFTWARE\VMware, Inc.\VMware Tools
exists. If any of them is present, Zeus aborts execution and removes itself. Screen resolution is also commonly used by malware as an indicator for detecting virtual machines. For example, the banking Trojan TrickBot (Abrams 2020) checks the target device’s screen resolution: if it is 800 × 600 or 1024 × 768, the machine is considered a virtual machine. In addition, malware looks for user activity on the device, such as whether the mouse is moved or clicked or keys are typed, to determine whether it is being analyzed within a virtual machine (Miramirkhani et al. 2017; Yokoyama et al. 2016; Bulazel and Yener 2017). According to the Malware Behavior Catalog (2022), samples from the DarkComet family check whether the mouse is moving. The Darkhotel and Ursnif malware (MITRE 2021; Ionut Arghire 2017) check whether the mouse cursor position has changed to determine whether they are running on a real device. We mitigate these anti-VM checks by performing actual user actions while the malicious sample is running, including moving and clicking the mouse and typing characters on the keyboard.

Therefore, we modified the configuration of our Windows 10 virtual machine with the following steps to prevent it from being detected by memory-resident malware.

• Uninstall VMware Tools.
• Modify the .vmx file, for example by making the virtual machine use the same BIOS serial number as the physical machine.
• Change the MAC address to a random one that does not use a default VMware prefix (e.g., 00:0c:29, 00:50:56, 00:05:69).
• Set the screen resolution to any value except 800 × 600 and 1024 × 768. In our virtual machine, we set it to 1152 × 864.
• Mimic normal user behavior by clicking or moving the mouse and typing random characters on the keyboard when running samples in the virtual machine.

Generating memory dumps
To generate memory dumps, we first clean the data, removing samples that use outdated in-memory code execution methods and those that do not run in our Windows 10 VM. Then we automate the memory dump generation process by controlling the vmrun utility through Python scripts to improve generation efficiency. vmrun is a command-line utility that controls virtual machines to perform various tasks, such as powering on/off and creating a new snapshot. Finally, we use vmrun’s ‘createSnap’ operation to capture a snapshot that dumps the VM’s memory state to a file. We created a snapshot of the Windows 10 VM before any operations and named it ‘Initial State’. Before running each sample, the VM is rolled back to the ‘Initial State’. Küchler et al. (2021) suggest that most malicious behavior can be observed within the first two minutes of execution. Each malicious sample was therefore given two minutes to initialize and execute.

To further analyze the memory dump file, we present the structure of the captured memory dump files. Windows memory management can be summarized into three mechanisms: (1) virtual address space management, (2) physical page management, and (3) address translation and page swapping (Yu et al. 2015). MRm-DLDet analyzes the entire data of the memory dump, including the data of the physical and virtual memory space, where each process runs in its own virtual address space. In Fig. 4, we briefly show the layout of a typical process (Yosifovich et al. 2017), with each part of it described as follows.

Fig. 4 Windows Memory Layout overview of one typical process

• Kernel address space: Users do not have access to this part of the memory, which is managed by the operating system and used for paging pools, system cache, device drivers, etc.
• User address space: Programs running in user mode have no access to the kernel address space but are
allowed to enter the user address space to which they are assigned.
• Thread stack: Holds the memory allocated for the stack of each thread in the process and provides orderly short-term storage for local variables.
• Process heap: The dynamically allocated portion of memory; it holds the heap allocations of the process and is used by programs to store global variables.
• DLLs: Contains the DLL files that the sampled process needs to call.
• Program image: Holds the executable file.

Figure 4 illustrates a high-level layout only: current Windows operating systems use techniques such as address space layout randomization (ASLR) to defend against attacks such as buffer overflows, so several parts of Fig. 4 may not be contiguous in a complete memory dump file. Further exploration of these techniques is out of the scope of this paper. Furthermore, the “Experiments” section shows that MRm-DLDet obtains excellent detection results without turning off these techniques.
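
The dump-generation loop described in this section (roll back to ‘Initial State’, start the VM, run the sample for two minutes, take a snapshot) could be automated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' script. It uses vmrun's `revertToSnapshot`, `start`, and `snapshot` commands, taking `snapshot` to be the operation the text calls ‘createSnap’. The helper names, the `dump-<id>` snapshot naming, and the `launch_sample` hook are hypothetical; the paper does not say how samples are started inside the guest, and guest-side vmrun commands normally depend on VMware Tools, which the authors uninstall.

```python
import subprocess
import time

VMRUN = "vmrun"                  # assumes the vmrun binary is on PATH
BASE_SNAPSHOT = "Initial State"  # clean snapshot name used in the text
RUN_SECONDS = 120                # two minutes per sample, as in the text


def vmrun_cmd(*args):
    """Build a vmrun command line for VMware Workstation (-T ws)."""
    return [VMRUN, "-T", "ws", *args]


def dump_commands(vmx_path, snapshot_name):
    """Return the vmrun invocations for one sample: (revert, start, snapshot)."""
    return (
        vmrun_cmd("revertToSnapshot", vmx_path, BASE_SNAPSHOT),
        vmrun_cmd("start", vmx_path),
        vmrun_cmd("snapshot", vmx_path, snapshot_name),
    )


def capture_dump(vmx_path, sample_id, launch_sample,
                 run=subprocess.run, sleep=time.sleep):
    """Roll back, run one sample for RUN_SECONDS, then snapshot the running VM.

    `launch_sample` is a hypothetical callback that starts the sample in the
    guest; `run` and `sleep` are injectable so the flow can be dry-run.
    """
    revert, start, snapshot = dump_commands(vmx_path, f"dump-{sample_id}")
    run(revert, check=True)    # restore the clean 'Initial State'
    run(start, check=True)     # power on / resume the VM
    launch_sample(sample_id)   # start the sample (mechanism not specified)
    sleep(RUN_SECONDS)         # let the sample initialize and execute
    run(snapshot, check=True)  # snapshot of a running VM includes its memory
```

Injecting `run` and `sleep` keeps the sketch testable without a VMware installation: passing stand-in callables records the command sequence instead of executing it.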
Memory dump deduplication
The generated memory dumps are saved as snapshot files. A memory dump is the same size as the memory of the virtual machine, which is above 2 GB. To reduce storage consumption and improve MRm-DLDet's efficiency when analyzing memory dumps, we were inspired by Brengel and Rossow (2018) to design and implement a duplicate memory page deletion process, which is the first step of memory dump preprocessing. It is based on two observations: (1) memory-resident malware injects its payload into only small areas of memory compared to the whole memory; (2) rolling back to the 'Initial State' before each run guarantees that the memory is always the same each time we run a sample. Therefore, we retained only the memory data related to ICE attacks to improve the specificity of memory dumps. We define memory dump deduplication as: analyze the target memory dump and delete its pages that are the same as those of the 'Initial State' memory dump.

After memory dump deduplication, the memory dump files retain only the memory data that changed after running the samples, such as new processes and threads, added registry configurations, injected shellcode, etc. Algorithm 1 shows the deduplication process briefly.

Ultra-high resolution image preprocessing module
In this module, deduplicated memory dump files are converted into RGB images. We cut each ultra-high resolution RGB image into sub-images and label each sub-image. This is the second memory dump preprocessing step.

Visualization memory dumps
Current memory-resident malware detection methods rely on massive expert knowledge and do not fully exploit memory information. To overcome these challenges, we were inspired by the motivating example in Fig. 1. Moreover, from the "Generating Memory Dumps" section, we find that different parts of a process's memory correspond to different uses and structures. We infer that the injected memory regions can be distinguished from the benign memory dump structure. Therefore, we represent the memory dump files as RGB images. The differences between data contents after visualization are used to distinguish benign programs from memory-resident malware.

The deduplicated memory dump files from the previous module can essentially be represented as binary strings consisting of zeros and ones. We convert every 8 bits (1 byte) of a memory dump file to a pixel value (0x00 → 0, 0xFF → 255). We read three pixel values from a memory dump file at a time and fill them into a 3D array. The three dimensions are then loaded into the R, G, and B channels, respectively, to generate an RGB image. Moreover, to increase the variability in visual features between memory dump images and to improve detection accuracy, we apply CLAHE (Reza 2004) to all memory dump images. Figure 5 depicts the process of generating a memory dump image. The images we generate are all square, and the side length of an image is the square root of (dump file size)/3, i.e., of the number of pixels in the image. We choose the lossless PNG format for the images to minimize the loss of features. The resolution of the images ranges from 6000 × 6000 pixels to 10000 × 10000 pixels.

Fig. 5 Convert a memory dump file to an RGB image
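As an illustration of the two preprocessing steps described above (page-level deduplication against the 'Initial State' dump, followed by the byte-to-pixel conversion), the following minimal Python sketch may help. It is not the authors' Algorithm 1 or implementation: the 4096-byte page size, the comparison of pages at identical offsets, the zero padding used to complete the square, and the function names `deduplicate` and `dump_to_rgb` are all assumptions made for this example, and the CLAHE enhancement step is omitted.

```python
import math

PAGE_SIZE = 4096  # assumed page size in bytes; not stated in the paper


def deduplicate(dump: bytes, initial: bytes, page_size: int = PAGE_SIZE) -> bytes:
    """Drop every page of `dump` that equals the page at the same offset in `initial`.

    Comparing pages at identical offsets is an assumption of this sketch.
    """
    kept = []
    for offset in range(0, len(dump), page_size):
        page = dump[offset:offset + page_size]
        if page != initial[offset:offset + page_size]:
            kept.append(page)
    return b"".join(kept)


def dump_to_rgb(data: bytes) -> list:
    """Turn bytes into a square image: a list of rows of (R, G, B) tuples.

    Each byte becomes one channel value (0x00 -> 0, 0xFF -> 255) and three
    consecutive bytes form one pixel. The tail is zero-padded (assumption).
    """
    n_pixels = math.ceil(len(data) / 3)
    side = math.isqrt(n_pixels)
    if side * side < n_pixels:
        side += 1
    padded = data + bytes(side * side * 3 - len(data))
    pixels = [tuple(padded[i:i + 3]) for i in range(0, len(padded), 3)]
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Tiny demonstration: only the second of three pages differs from the initial state.
initial_state = bytes(3 * PAGE_SIZE)
after_run = bytearray(initial_state)
after_run[PAGE_SIZE:PAGE_SIZE + 4] = b"\x00\xff\x10\x20"
changed = deduplicate(bytes(after_run), initial_state)
image = dump_to_rgb(changed)
print(len(changed), len(image), len(image[0]))
```

On real dumps of more than 2 GB an array library would be far more practical than Python lists; the sketch is only meant to fix the byte layout of the image.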
Figure 6 shows six memory dump images of benign and memory-resident samples, with three images of benign memory dumps on the left and three images of memory-resident samples on the right. An empirical observation is that the benign and malicious memory dump images are visually distinct. In color, the benign images are lighter, while the memory-resident images are darker. In texture, benign and malicious images also differ. This observation is consistent with our inference above. The RGB images generated from memory dumps of memory-resident malware are therefore significantly different from those generated from benign programs, i.e., classifying images with deep neural networks is efficacious for memory-resident malware detection.

Fig. 6 Several RGB memory dump images belonging to benign samples and memory-resident malware

Ultra-high resolution image processing
The RGB images converted from memory dump files are ultra-high resolution images, and the CNNs commonly used for image classification usually do not support such large-scale inputs. As mentioned in the "Background on Ultra-High Resolution Image Classification" subsection, previous studies (Wang et al. 2019; Van Etten 2018) have proposed sliding window methods for ultra-high resolution images, but these are mainly used for target detection in remote sensing images and do not aim at image classification. We are therefore the first to use the non-overlapping sliding window method in vision-based malware detection. Our approach overcomes the drawback of traditional image scaling methods, which use multiple downsampling layers and lead to information loss.

We set the size of each sub-image to 224 × 224. However, the size of memory dumps varies, and so does the number of sub-images. To solve this problem, MRm-DLDet resizes all RGB images to the same size using bicubic interpolation. With this resizing method, the textural features are still visible (Vasan et al. 2020a). In addition, to choose an appropriate size, we investigate the influence of three image sizes on model detection results and choose 5600 × 5600 as the adjusted size of RGB images. The memory dump images are then cut into multiple sub-images, and each memory dump produces 625 sub-images after cropping.
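The cropping step described above can be sketched as follows. This is a minimal illustration, assuming the image has already been resized to 5600 × 5600 (the bicubic resizing itself is not shown); the helper names `tile_boxes` and `crop` and the list-of-rows image representation are hypothetical choices for this example, not part of MRm-DLDet.

```python
TILE = 224  # sub-image side length used in the paper


def tile_boxes(width: int, height: int, tile: int = TILE) -> list:
    """Boxes (left, upper, right, lower) of a non-overlapping sliding window, row by row."""
    if width % tile or height % tile:
        raise ValueError("image size must be a multiple of the tile size")
    return [(x, y, x + tile, y + tile)
            for y in range(0, height, tile)
            for x in range(0, width, tile)]


def crop(rows: list, box: tuple) -> list:
    """Cut one sub-image out of an image stored as a list of pixel rows."""
    left, upper, right, lower = box
    return [row[left:right] for row in rows[upper:lower]]


# 5600 / 224 = 25 tiles per side, hence 25 * 25 = 625 sub-images per dump.
boxes = tile_boxes(5600, 5600)
print(len(boxes), boxes[0], boxes[-1])
```

The boxes use the (left, upper, right, lower) order, which is also the order expected by Pillow's `Image.crop`, so the same list could drive an imaging library if one is available.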

The "Experiments" section describes in more detail how image size influences model detection accuracy.

Vision-based features extraction and detection module
This module mainly consists of MRmNet, the neural network in MRm-DLDet. The "Structure of MRmNet" section gives more details of this module.

Images features extraction
We use a pre-trained ResNet-18 network to extract a vector of length 512 for each sub-image cut from one memory dump file. The features of each memory dump file are represented as a matrix of shape [625, 512].

Memory-resident malware detection
We train the attention-based gated recurrent unit (GRU) network with the features of the sub-images, which are divided into a training set, a validation set, and a test set. Hyperparameters are adjusted according to the performance evaluation metrics, namely accuracy, precision, recall, and F1-score. Finally, the model adds a voting layer, which we are the first to propose. After the neural network outputs the prediction result of each sub-image, the voting layer first calculates the sum of each group of 625 outputs, i.e., each sub-image casts a vote of 1 or 0 for the final classification of the memory dump. The voting layer then calculates the average of the sub-image votes. If the average reaches or exceeds the threshold, the sample that the memory dump represents is detected as memory-resident malware.

Structure of MRmNet
We designed and implemented MRmNet, which consists of ResNet-18, an attention-based GRU, and a voting layer of our own design, in the Vision-Based Detection Module of the MRm-DLDet framework. It extracts high-dimensional features from memory dump images to improve the accuracy of our detection framework.

Recently, methods combining convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been widely used: the CNN extracts deep spatial features and retains valid information, and the RNN is then used for training and prediction, which allows the model to better express temporal and spatial features. This paper applies this approach to the malware detection problem, using the more advanced ResNet-18 and GRU networks to improve the CNN-RNN combination. The MRmNet structure in the MRm-DLDet framework is shown in Fig. 7.

Fig. 7 The MRmNet structure in MRm-DLDet framework

Input layer
In the MRm-DLDet framework, runtime memory dumps are first extracted from the malicious and benign programs to be detected. The memory dumps are then converted into RGB images and cropped into multiple sub-images, which are imported into the ResNet-18 layer.

In this study, $D_i$ ($i = 1, 2, 3, \ldots, n$) denotes the memory dump files generated by memory-resident samples and benign samples, where $i$ indicates the $i$-th sample of our dataset, while $d_{i,j}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, 625$) denotes each sub-image generated by sliding-window partitioning of the image of $D_i$, where $j$ is the serial number of the sub-image, 625 in total.

ResNet-18 layer
The second layer is the network that extracts the features of $d_{i,j}$. In this study, we use the ResNet-18 network (He et al. 2016). To overcome the difficulty that deep networks are not easily optimized, ResNet-18 uses a residual structure. Each residual block is a multilayer neural network consisting of a convolutional layer, a batch normalization layer, and an activation layer. The technique introduced by the ResNet model provides shortcut connections between non-contiguous convolutional layers, which allow the model to skip layers, mitigating vanishing gradients and achieving lower losses and better results.

To obtain the image features, we take the output of the ResNet-18 model's avgpool layer as the feature of one sub-image, which is a vector of length 512. In detail, the cropped sub-images $d_{i,j}$ obtained from the input layer enter the ResNet-18 layer as input. The pre-trained ResNet network extracts a [1, 512] vector $V_{i,j}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, 625$), generated by the avgpool layer, where $i$ indicates the $i$-th program of our dataset and $j$ is the serial number of the sub-image. The output of the ResNet-18 layer is formalized as:

$$V_{i,j} = \text{ResNet-18}(d_{i,j}) \quad (1)$$

GRU layer
The third layer of MRmNet is the GRU layer. It is well known that long short-term memory (LSTM) (Hochreiter and Schmidhuber 1997) addresses the RNN's difficulty in learning long-term dependencies by adding a gating mechanism and a memory cell. However, the LSTM network has many parameters and converges slowly. GRU (Cho et al. 2014) is a simplified and improved variant of the standard LSTM. GRU has only an update gate and a reset gate, while LSTM has three gates (forget gate, input gate, and output gate). GRU has fewer trainable parameters, so it saves much time when the training data is enormous.

This layer divides the feature vectors of the sub-images from the previous layer into the training set, validation set, and test set. For example, the memory-resident malware in the training set can be written as the vector sequence shown in Eq. (2), where $Train_c$ ($c = 0, 1$) represents the class of samples in the training set, i.e., 0 or 1.

$$Train_c = \{V_{1,1}, V_{1,2}, \ldots, V_{2,1}, V_{2,2}, \ldots, V_{n,625}\} \quad (2)$$

To further describe the GRU principle, let $V_t$ represent the input at the current moment, $z_t$ the update gate, $r_t$ the reset gate, $h_t$ the hidden state passed to the next moment, $h_{t-1}$ the previous state, and $\tilde{h}_t$ the candidate hidden state. A single gated recurrent unit is implemented as follows:

$$z_t = \sigma(W_z \cdot [h_{t-1}, V_t])$$
$$r_t = \sigma(W_r \cdot [h_{t-1}, V_t])$$
$$\tilde{h}_t = \tanh(W \cdot [r_t * h_{t-1}, V_t])$$
$$h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t \quad (3)$$

In addition, we applied dropout to our GRU layer to reduce the risk of overfitting. Overall, the output of the GRU layer at moment $t$ is represented as:

$$G_t = \text{GRU}(V_{i,j}) \quad (4)$$

Attention layer
The attention mechanism proposed by Zhou et al. (2016) assigns weights to data, computes their weighted sum, and is highly interpretable. It effectively retains important details and prevents critical information from being lost. For that reason, we added an attention layer after the GRU layer and let it assign weights to the output vectors to further improve the detection accuracy of the MRm-DLDet framework. In our neural network, the attention mechanism estimates the association level between features. $G_t$ is the vector output by the GRU layer at moment $t$, $W$ is the weighted summation of the output vectors of the GRU layer, $a_t$ is the weight of the hidden layer of the attention module, and $b$ is the bias:

$$W = a_t G_t + b \quad (5)$$

Let $L_{i,j}$ be the final output label:

$$L_{i,j} = \text{softmax}(a_t W) \quad (6)$$

The output $L_{i,j}$ is the classification result of a sub-image. The output of MRmNet has two cases: if the source binary of the sub-image is memory-resident malware, $L_{i,j}$ is set to 1; if the program that generated the sub-image is a benign file, $L_{i,j}$ is set to 0.

Voting layer
The final layer of MRmNet is the voting layer, a layer that we designed.
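To make the recurrent update of Eq. (3) and the voting rule outlined under "Memory-resident malware detection" more concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the names `gru_step` and `vote`, the list-based weight matrices, and the absence of bias terms (as in Eq. (3)) are choices made for this example; the attention layer of Eqs. (5) and (6) is left out; and the default threshold is the value 0.6 that the paper selects.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix: list, vector: list) -> list:
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def gru_step(h_prev: list, v_t: list, w_z: list, w_r: list, w_h: list) -> list:
    """One GRU update following Eq. (3); each weight has shape [hidden, hidden + input]."""
    concat = h_prev + v_t                                 # [h_{t-1}, V_t]
    z = [sigmoid(a) for a in matvec(w_z, concat)]         # update gate z_t
    r = [sigmoid(a) for a in matvec(w_r, concat)]         # reset gate r_t
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + v_t  # [r_t * h_{t-1}, V_t]
    h_cand = [math.tanh(a) for a in matvec(w_h, gated)]   # candidate state
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_cand)]


def vote(labels: list, threshold: float = 0.6) -> tuple:
    """Average the 0/1 sub-image labels of one dump and compare with the threshold."""
    mean = sum(labels) / len(labels)
    return mean, int(mean >= threshold)


# With all-zero weights both gates equal 0.5 and the candidate state is 0,
# so the new hidden state is half of the previous one.
zeros = [[0.0] * 5 for _ in range(2)]
h_next = gru_step([1.0, -2.0], [0.3, 0.1, 0.2], zeros, zeros, zeros)

# 500 of 625 sub-images voting "malicious" gives a mean of 0.8, above the threshold.
mean_vote, verdict = vote([1] * 500 + [0] * 125)
print(h_next, mean_vote, verdict)
```

In a trained model the weights would of course be learned, and a deep learning framework would replace the list arithmetic; the sketch only shows how the gates combine and how one dump-level verdict is derived from 625 sub-image labels.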

In the previous layers, the ResNet-18 features pass through the GRU, and the attention layer then assigns weights and gives detection results.

The detection result $L_{i,j}$ from the attention layer is only the classification result of one sub-image. In MRm-DLDet, a memory dump image is cropped into 625 sub-images. Therefore, a combined result of the 625 sub-images gives the classification result of the whole memory dump image, which is the detection result of the tested sample.

The challenge for effective memory-resident malware detection is how to best combine the classification results of the 625 sub-images from each memory dump image. We designed a voting layer that takes the attention layer's output $L_{i,j}$ as input. Every 625 sub-images represent one memory dump file and can be considered a group; for example, when $i = 1$, $Group_1 = \{L_{1,1}, L_{1,2}, \ldots, L_{1,625}\}$. The arithmetic mean of each group of sub-images is calculated and denoted $M_i$:

$$M_i = \frac{1}{625} \sum_{j=1}^{625} L_{i,j} \quad (7)$$

That is, each sub-image votes for the classification of the final executable, with a value of 1 for memory-resident malware or 0 for benign. The arithmetic mean is taken over those 625 sub-images, with a threshold of 0.6 for classification; the "Experiments" section describes the basis for selecting the threshold value.

• $M_i < 0.6$: detected as a benign program.
• $M_i \geq 0.6$: detected as memory-resident malware.

Experiments
This section introduces our basic experimental setup, discusses existing malware datasets, and presents our dataset. After that, we describe each of our experiments in detail and show the results.

Dataset
Many research institutions and companies have provided malware datasets that can be used for artificial intelligence and big data analysis, such as the 2015 Microsoft Malware Classification Challenge dataset (BIG2015 dataset) (Ronen et al. 2018), the EMBER dataset (Anderson and Roth 2018), and the SOREL-20M dataset (Harang and Rudd 2020).

However, these datasets have some drawbacks. On the one hand, to prevent the spread of malware, they all provide processed malicious sample data. For example, the BIG2015 dataset only provides processed byte files and asm files, EMBER only includes features extracted after parsing the PE file, and SOREL-20M offers malicious samples with the PE header set to 0. Since memory-based detection methods require memory data from a running program, these samples are not usable for memory forensics. On the other hand, these datasets provide few benign executables. The existing datasets therefore do not apply to our study, and we constructed a dataset that meets the following requirements:

• Malicious samples come from memory-resident malware families.
• Both malicious samples and benign programs are complete PE files and can be run on Windows 10.
• All the samples were built recently.

Malware samples
We collected memory-resident malware that uses up-to-date evasion methods from VirusShare, a malware sample repository that provides security researchers and forensic analysts with access to real-time malware samples. To cover as many existing ICE attack techniques as possible, we selected more than 80 malware families using ICE attack techniques such as process injection, based on security companies' publicly available technical analyses and the analysis in the ATT&CK matrix (MITRE), the globally accessible knowledge base of adversary tactics and techniques.

Benign samples
The benign binaries consist of system files and freeware with a large number of users. Some benign data is extracted from the "System32" directory of the Windows 10 system. In addition, we collected free popular programs from CNET Download (Ventures 2022). To ensure that the benign programs are not bundled with malware or adware, we verified the binary labels by uploading the benign samples to VirusTotal, which aggregates a large number of antivirus products and online scanning engines to detect malware. We removed the samples obtained from CNET Download whose VirusTotal detection ratio was above 3%.

We collected 1120 benign and 1648 malicious samples. After data cleaning, we eventually obtained 1010 benign and 1050 malicious binaries. Since every memory dump image is a set of 625 sub-images, there are 631,250 benign sub-images and 656,250 malicious sub-images as the input for MRmNet. Once all the data are processed, we divide them into training, validation, and test sets according to the ratio of 6 : 2 : 2.
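The sample and sub-image counts quoted above follow from simple arithmetic, which the short sketch below reproduces. It only checks the numbers: the function name `split_counts` and the rule that any rounding remainder goes to the test set are assumptions of this example, and how the authors actually assigned individual samples to each set (for instance, randomly or stratified by family) is not stated.

```python
SUB_IMAGES_PER_DUMP = 625   # 25 x 25 tiles of 224 x 224 pixels
RATIO = (6, 2, 2)           # train : validation : test


def split_counts(n_samples: int, ratio: tuple = RATIO) -> tuple:
    """Split a sample count by ratio; any remainder goes to the test set (assumption)."""
    total = sum(ratio)
    train = n_samples * ratio[0] // total
    val = n_samples * ratio[1] // total
    return train, val, n_samples - train - val


for name, n in (("memory-resident", 1050), ("benign", 1010)):
    samples = split_counts(n)
    sub_images = tuple(count * SUB_IMAGES_PER_DUMP for count in samples)
    print(name, samples, sub_images)
```

For 1,050 and 1,010 samples the ratio divides evenly, so the remainder rule never comes into play here.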

Table 1 Data distribution of benign and malicious samples

Class              Train set    Val set    Test set    Total
Memory-resident    630          210        210         1,050
Benign             606          202        202         1,010
Total              1,236        412        412         2,060

Table 2 Data distribution of sub-images

Class              Train set    Val set    Test set    Total
Memory-resident    393,750      131,250    131,250     656,250
Benign             378,750      126,250    126,250     631,250
Total              772,500      257,500    257,500     1,287,500

Table 1 describes the detailed division of benign and malicious data, and the division of processed sub-images for MRmNet's model training can be found in Table 2. Moreover, our datasets are now available for non-profit purposes (C1air3 2023).

Experiment settings
We implemented our execution environment on one ThinkPad T480 physical machine with an Intel Core i5-8250U 1.80 GHz processor and 24 GB of RAM. We generated memory dump files in VMware Workstation (VMware 2022), version 16.0, on which a Windows 10 OS was installed. The main programming language environment is Python 3.7.

After several evaluations and adjustments, the hyperparameter configuration of the attention-based GRU model in the experiment is shown in Table 3. In the experiments of this study, we chose accuracy, precision, recall, and F1-score to evaluate the effectiveness of our MRm-DLDet framework and the neural network models used for comparison. These four evaluation metrics have been widely used in previous studies and are an important basis for model performance evaluation.

Table 3 The hyperparameters during training

Configuration      Value
Epoch              60
Batch Size         2048
Learning Rate      0.001
ModelCheckpoint    monitor='val_acc', mode='max'

Threshold for detection
In MRmNet, the final voting layer needs a threshold to classify the voting results into memory-resident malware and benign. Selecting a reasonable threshold helps our model achieve the best detection results. Using the median value 0.5 as a guideline and a step size of 0.03, we tested the four evaluation metrics of the model for threshold values from 0.45 to 0.69; the results are presented in Fig. 8.

Fig. 8 Detection results of different thresholds

According to Fig. 8, a threshold of 0.6 gives the best accuracy, recall, and F1-score. Thus, 0.6 is an appropriate threshold for our detection framework.

Ultra-high resolution image preprocessing method evaluation
The first experiment explored the effect of different ultra-high resolution image processing methods on the MRm-DLDet framework's detection performance. In the first method, the ultra-high resolution images were directly reduced to 224 × 224 by bicubic interpolation. In the second, a non-overlapping sliding window cut every ultra-high resolution image into 625 sub-images. The memory dump images processed by each method were then separately fed into the MRmNet (ResNet-18+GRU+Attention) network for training and testing. The evaluation metrics of the two experiments are listed in Table 4.

Table 4 Comparison of different image processing methods

Method                            Accuracy (%)    Precision    Recall    F1-score
Directly resize                   90.55           0.8824       0.9002    0.8912
Non-overlapping sliding window    98.34           0.9896       0.9777    0.9836

The best detection result is in the non-overlapping sliding window row.
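The four metrics reported in Table 4 and the threshold sweep behind Fig. 8 can be sketched as below. This is a minimal illustration with hypothetical helper names (`metrics`, `thresholds`, `sweep`) and made-up toy inputs; it assumes that label 1 means memory-resident malware and that each dump has already been reduced to the mean of its sub-image votes. It does not reproduce the paper's data.

```python
def metrics(y_true: list, y_pred: list) -> dict:
    """Accuracy, precision, recall and F1-score for binary labels (1 = malicious)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision,
            "recall": recall, "f1": f1}


def thresholds(start: float = 0.45, stop: float = 0.69, step: float = 0.03) -> list:
    """Candidate thresholds from 0.45 to 0.69 in steps of 0.03, as in the paper."""
    n_steps = round((stop - start) / step)
    return [round(start + k * step, 2) for k in range(n_steps + 1)]


def sweep(mean_votes: list, y_true: list) -> dict:
    """Evaluate every candidate threshold on per-dump mean votes."""
    return {th: metrics(y_true, [int(m >= th) for m in mean_votes])
            for th in thresholds()}


# Toy example with three dumps (values invented for illustration only).
results = sweep([0.90, 0.58, 0.20], [1, 1, 0])
print(thresholds())
print(results[0.57]["accuracy"], results[0.6]["accuracy"])
```

As a consistency check on Table 4, a precision of 0.9896 and a recall of 0.9777 give an F1-score of about 0.9836 under the usual harmonic-mean definition used in `metrics`.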

Compared with the direct scaling method, using a non-overlapping sliding window significantly improves the memory-resident malware detection performance of MRm-DLDet.

Furthermore, to compare the effect of different neural networks with different image processing methods on the performance of visual detection of memory-resident malware, we selected three neural network models that are widely used in malware visualization detection methods as comparison baselines, and trained and tested them using the processed memory dump data. The models are briefly described in Table 5.

Table 5 Description of three CNN models

Model           Author                           Description
VGG16           Simonyan and Zisserman (2014)    The VGG16 network has a strong fitting ability. It is often used as a benchmark for malware identification, and its core design idea is to use smaller convolutional kernels and build deeper network layers.
Inception V3    Vasan et al. (2020b)             Proposed by Google; the highlight is the addition of decomposition techniques to decompose the convolutional kernel.
ResNet-18       He et al. (2016)                 ResNet-18 is one of the ResNet network family. Its network structure balances training efficiency and accuracy well, and it has achieved excellent results in visual malware classification.

Table 6 shows the results of training and testing the four neural network models separately with images processed by the two dimensionality reduction methods. MRm-DLDet using sub-images cropped by non-overlapping sliding windows as input has the best detection accuracy of 98.34% and an F1-score above 0.98, and the models using images processed with the non-overlapping sliding window method are better than those trained with directly resized images. This supports applying non-overlapping sliding windows to ultra-high resolution memory images, which preserves the features of memory dumps better than direct resizing and yields higher detection accuracy.

Table 6 Comparison of the MRm-DLDet framework evaluation using different neural networks

Method                            Model           Accuracy (%)    Precision    Recall    F1-score
Directly resize                   VGG16           87.76           0.8842       0.8733    0.8787
                                  ResNet-18       91.59           0.9124       0.9062    0.9093
                                  Inception V3    85.79           0.8649       0.8813    0.8730
                                  MRm-DLDet       90.55           0.8824       0.9002    0.8912
Non-overlapping sliding window    VGG16           92.36           0.9192       0.9210    0.9201
                                  ResNet-18       94.20           0.9276       0.9524    0.9398
                                  Inception V3    91.77           0.9193       0.9128    0.9160
                                  MRm-DLDet       98.34           0.9896       0.9777    0.9836

The best detection result is the MRm-DLDet row under the non-overlapping sliding window.

We analyzed the misreported programs. A malicious sample of the DarkComet family was misreported as benign. We found that, when the memory dump of the sample was obtained, the running sample had not successfully connected to its C&C server and therefore did not perform the subsequent attack behavior, resulting in characteristics similar to those of a benign program. Another observation is that memory-resident malware samples are misreported at a higher rate, probably because attackers are constantly improving ICE attacks to make their actions increasingly subtle, for example by using APIs more similar to those used by benign programs, to obtain a higher evasion capability.

Different memory dump image sizes and neural networks evaluation
In MRm-DLDet's Ultra-High Resolution Image Preprocessing Module, when the non-overlapping sliding window is used to crop memory dump RGB images, the number of sub-images varies depending on the resolution of the memory dumps. MRm-DLDet solves this problem by resizing all RGB images to the same size by bicubic interpolation. We evaluate the effect of three image sizes, 3360 × 3360, 4480 × 4480, and 5600 × 5600, on the detection performance of the model. Additionally, to choose the best CNN-RNN combination for MRmNet, we selected three CNN models (VGG16, Inception V3, ResNet-18) and three RNN models (RNN, LSTM, GRU) and cross-combined them. Each CNN-RNN combination uses the three sizes of memory dump images as input; the CNN models extract image features, and the features are then transferred to the RNN models, which are combined with the attention mechanism.

In total, 27 detection models were generated, 9 for each memory dump image size. Each sub-image is 224 × 224; through the non-overlapping sliding window, one RGB image of size 5600 × 5600, 4480 × 4480, or 3360 × 3360 produces 625, 400, or 225 sub-images, respectively. Table 7 shows the training and detection results of each model.
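The size of this experimental grid can be checked with a few lines of Python. The sketch below is only an enumeration aid: the constant and function names are hypothetical, and the per-sub-image feature lengths (4096 for VGG16, 2048 for Inception V3, 512 for ResNet-18) are the values the paper reports for the pre-trained CNNs.

```python
from itertools import product

TILE = 224
IMAGE_SIDES = (5600, 4480, 3360)
FEATURE_LENGTH = {"VGG16": 4096, "Inception V3": 2048, "ResNet-18": 512}
RNNS = ("RNN", "LSTM", "GRU")


def sub_image_count(side: int, tile: int = TILE) -> int:
    """Number of non-overlapping tile x tile sub-images in a side x side image."""
    if side % tile:
        raise ValueError("side must be a multiple of the tile size")
    return (side // tile) ** 2


# 3 image sizes x 3 CNNs x 3 RNNs = 27 detection models.
models = list(product(IMAGE_SIDES, FEATURE_LENGTH, RNNS))

# One feature matrix per (image size, CNN) pair: rows = sub-images, columns = feature length.
feature_shapes = {(side, cnn): (sub_image_count(side), FEATURE_LENGTH[cnn])
                  for side, cnn in product(IMAGE_SIDES, FEATURE_LENGTH)}

print(len(models), len(feature_shapes), feature_shapes[(5600, "ResNet-18")])
```

Because the RNN does not affect feature extraction, the 27 models share only 9 distinct feature matrices, one per image size and CNN.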

Table 7 Comparison of different memory dump image sizes and different neural networks

Memory dump image size    MRmNet's deep learning model    Accuracy    Precision    Recall    F1-score    Feature extraction time (minutes)

5600 × 5600(625 Sub-Images) VGG16 + RNN + Attention 96.49% 0.9474 0.9671 0.9572 0.5970
VGG16 + LSTM + Attention 97.51% 0.9554 0.9676 0.9614 0.5970
VGG16+GRU+Attention 97.62% 0.9546 0.9680 0.9613 0.5970
Inception V3 + RNN + Attention 94.27% 0.9328 0.9351 0.9340 3.9532
Inception V3 + LSTM + Attention 94.90% 0.9450 0.9526 0.9488 3.9532
Inception V3 + GRU + Attention 95.22% 0.9309 0.9674 0.9488 3.9532
ResNet-18 + RNN + Attention 97.66% 0.9817 0.9642 0.9729 0.4779
ResNet-18 + LSTM + Attention 98.11% 0.9828 0.9810 0.9819 0.4779
ResNet-18 + GRU + Attention 98.34% 0.9896 0.9777 0.9836 0.4779
4480 × 4480(400 Sub-Images) VGG16 + RNN + Attention 95.67% 0.9529 0.9524 0.9527 0.5837
VGG16 + LSTM + Attention 95.97% 0.9500 0.9575 0.9537 0.5837
VGG16 + GRU + Attention 96.05% 0.9581 0.9534 0.9557 0.5837
Inception V3 + RNN + Attention 93.28% 0.9251 0.9388 0.9319 1.1931
Inception V3 + LSTM + Attention 94.01% 0.9468 0.9418 0.9442 1.1931
Inception V3 + GRU + Attention 93.78% 0.9321 0.9375 0.9348 1.1931
ResNet-18 + RNN + Attention 97.81% 0.9808 0.9787 0.9797 0.4598
ResNet-18 + LSTM + Attention 97.66% 0.9821 0.9733 0.9777 0.4598
ResNet-18 + GRU + Attention 97.41% 0.9808 0.9714 0.9761 0.4598
3360 × 3360(225 Sub-Images) VGG16 + RNN + Attention 95.28% 0.9427 0.9531 0.9479 0.2674
VGG16 + LSTM + Attention 94.81% 0.9345 0.9463 0.9403 0.2674
VGG16 + GRU + Attention 95.59% 0.9464 0.9644 0.9553 0.2674
Inception V3 + RNN + Attention 94.98% 0.9316 0.9536 0.9425 1.4769
Inception V3 + LSTM + Attention 94.61% 0.9439 0.9404 0.9422 1.4769
Inception V3 + GRU + Attention 95.02% 0.9405 0.9562 0.9483 1.4769
ResNet-18 + RNN + Attention 95.58% 0.9807 0.9432 0.9616 0.3964
ResNet-18 + LSTM + Attention 96.00% 0.9794 0.9528 0.9659 0.3964
ResNet-18 + GRU + Attention 96.03% 0.9784 0.9643 0.9713 0.3964
Bold values indicate the best experimental results

Besides the four evaluation metrics, we also consider the feature extraction time, which is the time taken by each CNN model to extract features from one memory dump image of each size.

Table 7 shows that, in terms of image size, the accuracy of the neural networks generally decreases as the images are compressed more strongly: the evaluation metrics of the model gradually decrease from 5600 × 5600 to 3360 × 3360. For example, the ResNet-18+GRU+Attention model has 98.34% accuracy, 0.9896 precision, 0.9777 recall, and 0.9836 F1-score when the image size is 5600 × 5600, while the accuracy drops to 97.41% and the precision to 0.9808 when the image size is 4480 × 4480. When the size is reduced to 3360 × 3360, the detection accuracy is only 96.03% and the precision only 0.9784. A similar trend appears in the other combinations of neural network models. This may be because a 5600 × 5600 image yields more sub-images under the non-overlapping sliding window, which helps the model obtain better detection results. Therefore, we choose 5600 × 5600 for MRm-DLDet as the size of the memory dump image after bicubic interpolation in order to get the best detection results.

For the CNN models, five evaluation metrics are considered: accuracy, precision, recall, F1-score, and feature extraction time. In MRm-DLDet, the pre-trained CNN model extracts features for each memory dump file's sub-images, which are then transferred to the RNN model for training and testing. The feature extraction time is affected by two factors:

• The number of sub-images of each memory dump file differs with the size of the image.
• Differences in neural network structure result in different feature vector lengths extracted by each model.

In this experiment, the image feature vector extracted

from the ResNet-18 model’s avgpool layer is 512 in
length, the VGG16 model extracts features of 4096 in
length for each sub-image, and the pre-trained Inception
V3 model outputs sub-image features with 1 dimension
and 2048 in length. Therefore, for the 27 detection mod-
els, the combination of different memory dump image
sizes and pre-trained CNN models resulted in 9 different
feature matrices, and the corresponding feature extrac-
tion times are shown in Table 7 as well. Since the feature
extraction time only depends on the CNN model, every
three detection models using the same CNN share the
same feature extraction time.
Table 7 shows that among the nine models using the Fig. 9 Accuray and loss of each epoch
same memory dump image size, ResNet-18 with differ-
ent RNN models combination obtained better detec-
tion results than the other CNN models. Taking a
5600 × 5600 size image as an example, the highest accu- and attention achieved the best detection results out of
racy of the three models using the pre-trained ResNet-18 25 models.
extracted features was 98.34%, and the results of pre- In conclusion, the ResNet-18+GRU+Attention model
cision, recall, and F1-score were also the best. The best model that combines the VGG16 model with each of the three RNN models is VGG16+GRU+Attention, with a detection accuracy of 97.62%. The detection accuracy of the combination of Inception V3 and the three RNNs is lower than that of the models using the other two methods to extract features, with the highest detection accuracy of only 95.22%. The same conclusion can be found in other memory dump image sizes.

Comparing the feature extraction time consumption of the three CNN models in Table 7, Inception V3 takes much more time than the other two, which are therefore more efficient and better suited for further deployment. The difference between VGG16 and ResNet-18 in feature extraction efficiency is less than 0.1 min, which is not significant. The shortest feature extraction time is found when the image size is 3360 × 3360 and the pre-trained model is VGG16, which takes only 0.2674 min to extract the features of one sample. However, ResNet-18 is still selected as the CNN model for the vision-based detection module because the detection accuracy of the ResNet-18 models is much better than that of the VGG16 models, and the required feature extraction time is 0.4779 min. There is no significant difference in detection efficiency compared to VGG16.

Once the memory dump image size and CNN model are determined, analyzing the detection results of the 27 deep learning models in Table 7 shows that the best performing RNN model in MRmNet is GRU. Comparing the results of the different combinations of the three RNN models using the same image size and CNN model, the combination of CNN with GRU with memory dump images at the size of 5600 × 5600 achieved the best detection result, with an accuracy of 98.34%. Figure 9 shows the variation of both accuracy and loss of the model on the train and validation sets. We noticed that this model attains high accuracy on both the train and validation sets, and the training loss and validation loss are very close to each other, which means our model works well in memory-resident malicious code detection experiments.

Ablation study

In the previous subsection, we evaluated the effect of using different CNN-RNN combinations and different image sizes on the final detection results of MRm-DLDet. We selected the image size 5600 × 5600 and the model combination ResNet-18+GRU+Attention for MRmNet, which resulted in the best detection results. To further explore the impact of the different components in the MRmNet of MRm-DLDet on the final detection performance, we performed an ablation study. Three MRmNet combinations (only ResNet-18, ResNet-18+GRU, and ResNet-18+GRU+Attention) were considered separately to evaluate the effectiveness of each component. The results can be found in Table 8.

According to Table 8, when using only the ResNet-18 model, MRmNet has the same results as those in Table 6. This configuration, which resizes the ultra-high resolution images to 224 × 224 and inputs them directly into the ResNet-18 model for detection, achieves the lowest detection accuracy (91.59%). The detection accuracy of MRmNet improved significantly to 95.43% when the ultra-high resolution images were cropped into sub-images, ResNet-18 was used to extract the sub-image features, and the GRU network was then called for detection. Finally, the best detection results were obtained after adding the attention mechanism to the GRU network. Thus, MRmNet using ResNet-18+GRU+Attention, as in MRm-DLDet, gets the best results.

Table 8 Ablation Study for MRm-Net

Model used by MRmNet           Accuracy   Precision   Recall   F1-score
ResNet-18                      91.59%     0.9124      0.9062   0.9093
ResNet-18 + GRU                95.43%     0.9633      0.9504   0.9568
ResNet-18 + GRU + Attention    98.34%     0.9896      0.9777   0.9836

The last row (ResNet-18 + GRU + Attention) gives the best results
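The F1-scores reported in Table 8 can be cross-checked against the precision and recall columns, because F1 is their harmonic mean. The following Python sketch is only an illustration and is not taken from the authors' code; the names `f1_score`, `TABLE_8`, and `check_table` are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1-score) for each row of Table 8
TABLE_8 = {
    "ResNet-18": (0.9124, 0.9062, 0.9093),
    "ResNet-18 + GRU": (0.9633, 0.9504, 0.9568),
    "ResNet-18 + GRU + Attention": (0.9896, 0.9777, 0.9836),
}


def check_table(table):
    """Return {model: F1 recomputed from precision and recall, 4 decimals}."""
    return {name: round(f1_score(p, r), 4) for name, (p, r, _) in table.items()}


if __name__ == "__main__":
    for name, f1 in check_table(TABLE_8).items():
        reported = TABLE_8[name][2]
        print(f"{name}: recomputed F1 = {f1:.4f}, reported = {reported:.4f}")
```

Under this definition, the recomputed values agree with the reported F1-scores to four decimal places for all three configurations.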
Comparing existing techniques

To comprehensively evaluate the effectiveness of our MRm-DLDet framework, and because MRm-DLDet focuses on capturing and analyzing memory dumps, we compared MRm-DLDet with several up-to-date memory forensics-based memory-resident malware detection works. In addition, since our work detects memory-resident malware code using a vision-based approach, we also compare it with several of the latest vision-based malware detection methods. The methods used for both comparisons have been described in “Related work” and are only briefly described in this subsection.

Comparing memory forensics-based detection methods

We compared MRm-DLDet with four memory forensics-based malware detection works. Malfind is a plugin of Volatility, which detects malware by checking if the VAD protection is set to PAGE_EXECUTE_READWRITE. Quincy (Barabosch et al. 2017) extracted 38 features from a memory dump and used Random Forests and Extremely Randomized Trees to classify the memory storage area. Sihwail et al. (2021) converted the features extracted from malware memory into binary vectors and then used the SVM algorithm to detect malware. Bozkir et al. (2021) is the most advanced study for detecting memory-resident malware using memory forensics. It represents memory dumps as RGB images and uses an SMO algorithm with radial basis kernels combined with feature vectors extracted by GIST+HOG.

We reproduced the detection methods according to the principles described in the articles and trained and tested them on our dataset. The experimental results can be found in Fig. 10.

Fig. 10 Comparing MRm-DLDet with Memory Forensics-based detection methods

According to Fig. 10, the MRm-DLDet framework gets better results than the other methods in accuracy, precision, recall, and F1-score. The dataset used in this experiment was our memory-resident malware dataset, which contains samples that use state-of-the-art evasion methods. Malfind treats every non-empty memory area with RWX permissions as malicious, and once those permissions are well-tuned (i.e., only RX permissions are set), malfind performs poorly. The other methods were trained and tested mainly on Windows 7, and the features they selected may not take full advantage of the memory data information, resulting in poor performance on our dataset.

Comparing deep learning and vision-based malware detection methods

To evaluate MRm-DLDet as comprehensively as possible, we compared it with four recent malware detection methods that combine deep learning and vision-based methods. Bozkir et al. (2019) converted the raw binary bytes of PE files into RGB images and used convolutional neural networks to classify persistent malware files. Pinhero et al. (2021) visualized PE files as grayscale, RGB, and Markov images and extracted features of the three types of images with Gabor filters. VGG3 got the best classification result in their work. Tekerek and Yapici (2022) converted PE files to gray and RGB images for malware classification. O’Shaughnessy and Sheridan (2022) used a hybrid framework for malware classification, first setting an entropy threshold to determine whether a sample is packed or not, then analyzing the samples using static and dynamic methods respectively. Their dynamic methods converted malicious process memory dumps into space-filling curve images.

Again, we reproduced the methods according to the principles in the articles, then trained and tested them on our dataset. The results are shown in Fig. 11. Due to image layout limitations, Tekerek and Yapici (2022) is represented as “Tekerek (2022)”, and O’Shaughnessy and Sheridan (2022) is represented as “O’Shaughnessy (2022)”.

Fig. 11 Comparing MRm-DLDet with Vision-Based detection methods

Analyzing the results in Fig. 11, the methods of Bozkir et al. (2019), Pinhero et al. (2021) and Tekerek and Yapici (2022) perform similarly, but all differ significantly from the detection results of the other two methods. This may be because these three methods visualize PE files directly, and the packed samples affect the accuracy. The method of O’Shaughnessy and Sheridan (2022) applies static and dynamic analysis by distinguishing between packed and unpacked samples. Its dynamic approach can capture malware activities in memory, thus showing high accuracy on our dataset. However, its dynamic method only extracts the mini-processes containing the process thread and handle information of the malware, which means it does not analyze the complete memory data, resulting in lower detection results than our MRm-DLDet.

The two comparison experiments show that our framework, which converts the memory dumps of memory-resident malware to RGB images and uses deep learning models to detect them, is an effective ICE attack detector with high accuracy and recall.

Time consumption evaluation in realistic environment

We investigated the performance of MRm-DLDet working in a realistic environment and evaluated the time consumption of deploying it in a real environment. We deployed our framework on a Windows 10 computer and kept feeding suspicious PE files into MRm-DLDet for analysis to detect whether they were memory-resident malicious programs. Statistically, the average time taken by MRm-DLDet from receiving a PE file to the end of detection is 2.39 min. This is expected, as a PE file needs to be put into the machine and run for about two minutes to initialize and execute. MRm-DLDet may therefore appear somewhat time-consuming. However, from the previous experimental results, we found that the MRm-DLDet framework is far ahead of the existing memory-resident malware detection methods in accuracy. Therefore, we consider the time consumption of the memory dump preprocessing stage to be acceptable.

Robustness of MRm-DLDet framework

To measure the robustness of MRm-DLDet, we evaluate the impact of evasion attacks on our model. Neural network-based malware detection methods are easily attacked by mimicry attacks (Wagner and Soto 2002) and adversarial attacks. In this section, we analyze both types of attacks and evaluate the performance of our MRm-DLDet framework against these evasion attacks.

Mimicry attacks to MRm-DLDet framework

To evaluate MRm-DLDet’s robustness to mimicry attacks, we describe four up-to-date memory-resident evasion methods in Table 9, all of which have good evasion capability.

We compared our MRm-DLDet detection framework with the seven most popular AV engines (PCmag 2022). We generated 30 mimicry attack samples for each state-of-the-art memory-resident evasion method in Table 9 and ran them separately against the eight detectors (the seven AV engines and MRm-DLDet). None of the mimicry attack samples appear in the training dataset of MRm-DLDet. The detection results can be found in Table 10.

Table 10 shows that MRm-DLDet gets better detection results than the popular AV engines, and the only ICE attack that it does not detect is module stomping. We define a detection engine as robust if it detects three out of the four state-of-the-art attacks. By this criterion, only MRm-DLDet and Microsoft’s Windows Defender are robust against the latest mimicry attacks. Overall, this experiment confirmed that our MRm-DLDet is robust to the latest mimicry attacks.

Adversarial attacks to MRm-DLDet framework

Over the past few years, many adversarial attacks against deep learning-based malware detection tools have proved effective (Suciu et al. 2019; Hu and Tan 2017; Anderson et al. 2018; Grosse et al. 2017). In theory, MRm-DLDet is a deep learning-based detection framework that can also be attacked by an adversarial example carefully constructed by the attackers. We assume that the attackers will not obtain our model, so they can only modify the attack sample based on the model’s binary decision, i.e., a black-box attack.

Table 9 The latest evasion methods used by memory-resident malware

Evasion Method: Description

UUID Shellcode (Team 2021): The UuidFromStringA API takes a string-based UUID and converts it to its binary representation. Given a pointer to a heap address, UuidFromStringA decodes the data and writes it to memory without using common functions such as memcpy or WriteProcessMemory. The EnumSystemLocales function then executes the shellcode.

Earlybird (spotheplanet 2020): The Earlybird method creates a new legitimate process in a suspended state and allocates memory for the shellcode in the new process’s memory space. It declares an APC routine pointing to the shellcode, and the shellcode is written to the previously allocated memory. It then queues the APC to the main thread and resumes the thread, which executes the shellcode.

Phantom DLL Hollowing (orr 2021): Phantom DLL Hollowing first opens a TxF handle to a Microsoft-signed DLL file on disk and infects its .text section with shellcode. It generates a phantom section from this malware-implanted image and maps a view of it into the address space of a process of the attacker’s choice. The shellcode is hidden and then executed in the .text section with +RX permissions.

Module Stomping (ired.team 2020): Module Stomping first injects a benign Windows DLL into a remote (target) process. It overwrites the AddressOfEntryPoint of the DLL loaded in the first step with shellcode. It then starts a new thread in the target process at the benign DLL’s entry point, where the shellcode has been written.
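
The robustness criterion used in the mimicry attack evaluation (an engine counts as robust if it detects three of the four evasion methods) can be applied mechanically to the results reported in Table 10. The sketch below is illustrative only: the encoding of each row as a string of `M`/`U` flags, the `>= 3` threshold, and the helper names are assumptions made for this example, not part of the authors' tooling.

```python
ATTACKS = ("UUID Shellcode", "Earlybird", "PhantomDLL hollowing", "Module stomping")

# Table 10 results per engine, one flag per attack in ATTACKS order:
# "M" = flagged as memory-resident malware, "U" = undetected.
TABLE_10 = {
    "McAfee": "UUUM",
    "Bitdefender": "MUUM",
    "Webroot": "UUUU",
    "Malwarebytes": "UUUU",
    "ESET-NOD32": "UUUM",
    "Sophos": "UUUU",
    "Microsoft": "MMUM",
    "MRm-DLDet": "MMMU",
}


def detected_count(flags: str) -> int:
    """Number of attacks an engine detected."""
    return flags.count("M")


def robust_engines(table, threshold: int = 3):
    """Engines that detect at least `threshold` of the attacks."""
    return [name for name, flags in table.items() if detected_count(flags) >= threshold]


def missed_attacks(table, engine: str):
    """Names of the attacks that `engine` failed to detect."""
    return [attack for attack, flag in zip(ATTACKS, table[engine]) if flag == "U"]


if __name__ == "__main__":
    print("Robust engines:", robust_engines(TABLE_10))
    print("Missed by MRm-DLDet:", missed_attacks(TABLE_10, "MRm-DLDet"))
```

With the Table 10 data, only Microsoft and MRm-DLDet reach the threshold, and module stomping is the single attack that MRm-DLDet misses, which matches the discussion of the table.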

Table 10 Comparison of the detection results for the latest ICE attacks by different antivirus solutions

AV engine       UUID Shellcode   Earlybird   PhantomDLL hollowing   Module stomping
McAfee          U                U           U                      M
Bitdefender     M                U           U                      M
Webroot         U                U           U                      U
Malwarebytes    U                U           U                      U
ESET-NOD32      U                U           U                      M
Sophos          U                U           U                      U
Microsoft       M                M           U                      M
MRm-DLDet       M                M           M                      U

U means Undetected
M means Memory-Resident Malware

We believe that attacks against deep neural networks are not feasible in our environment. On the one hand, our method takes PE files as input and is deployed on the client system, so the attackers theoretically cannot obtain the feature data during detection for data poisoning attacks and similar purposes. On the other hand, existing black-box attack methods (Hu and Tan 2017; Anderson et al. 2018) are aimed at static PE anti-malware tools, for example by adding irrelevant junk characters to the malware to bypass detection. Since most of these methods are intended to alter the malware without affecting the program’s regular function, the malware’s operations in memory will not change. The existing adversarial attacks that modify PE files therefore have minimal impact on our detection framework.

Discussion

Our experimental results indicate that the MRm-DLDet framework is superior to the state-of-the-art memory-resident malware detection methods. Measurements of MRm-DLDet’s runtime have shown that it could also be used in real-world malware detection. We detect ICE attacks with memory forensics, which takes the memory dumps of the running program as its information source. To capture memory-resident malware, MRm-DLDet takes full advantage of the fact that memory forensics can directly analyze RAM data and detect malware operations in memory. Some challenges are still faced in memory-resident malware detection.

The risk of overfitting

Our dataset consists of 1010 benign and 1050 memory-resident samples, which seems insufficient for a deep learning model, so the model may be at risk of overfitting. In this paper, the risk of overfitting is concentrated in the neural network MRmNet of MRm-DLDet, where ResNet-18 is primarily used for extracting image features and the attention-based GRU network is used to complete the classification task. The high-resolution images are split into multiple sub-images by the ultra-high resolution image preprocessing module before they are input into the GRU network, so the samples for training and testing the GRU network are 631,250 benign sub-images and 656,250 malicious sub-images, and the sample size is large enough to mitigate overfitting. We have also added dropout and early stopping to the GRU network, techniques that help the network generalize and reduce overfitting to the training data.
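
The sub-image counts quoted in the overfitting discussion can be reproduced with a short calculation. The sketch below assumes 224 × 224 sub-images cut from a 5600 × 5600 image with a non-overlapping window; the tile size is an assumption that is consistent with the reported totals (625 sub-images per sample), and `tiles_per_image` and `split_into_tiles` are hypothetical helper names rather than functions from the authors' code.

```python
import numpy as np


def tiles_per_image(side: int, tile: int = 224) -> int:
    """Number of non-overlapping tile x tile windows in a side x side image."""
    return (side // tile) ** 2


def split_into_tiles(image: np.ndarray, tile: int = 224) -> np.ndarray:
    """Cut an H x W x C image into non-overlapping tile x tile sub-images.

    Rows and columns that do not fill a whole tile are dropped
    (an assumption made for this sketch).
    """
    h, w, c = image.shape
    rows, cols = h // tile, w // tile
    cropped = image[: rows * tile, : cols * tile]
    # (rows, tile, cols, tile, C) -> (rows, cols, tile, tile, C)
    tiles = cropped.reshape(rows, tile, cols, tile, c).swapaxes(1, 2)
    return tiles.reshape(rows * cols, tile, tile, c)


if __name__ == "__main__":
    per_image = tiles_per_image(5600)
    print("sub-images per sample:", per_image)
    print("benign sub-images:", 1010 * per_image)
    print("malicious sub-images:", 1050 * per_image)

    # Small demonstration of the split itself on a 448 x 672 RGB array.
    demo = np.zeros((448, 672, 3), dtype=np.uint8)
    print("demo tiles shape:", split_into_tiles(demo).shape)
```

Keeping the window non-overlapping means every pixel of the memory dump image appears in exactly one sub-image, so the per-sample count depends only on the image and tile sizes.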

Runtime time overhead

Obtaining a memory dump while a sample performs its malicious operations usually requires more than two minutes of execution time, which reduces the efficiency of the detection system. This is a challenge for memory forensics and deep learning-based malware detection techniques, and we identify it as future work to be addressed.

Apply the MRm-DLDet framework in a production environment

To run in a production environment, MRm-DLDet will be deployed to the client device, where it will analyze suspicious PE files according to the user’s requirements, obtain their memory dumps and run detection on them, and return the detection results to the client promptly so that malicious files can be handled in time to ensure the safety of the client device. On the one hand, future research will need to monitor the malicious sample analysis reports published by major security companies and forums to obtain updated samples. On the other hand, there is also a need to investigate ways to reduce the training and testing time for periodic model updates, with the aim of finding a solution that is efficient and does not cause a lag in model detection.

Conclusion

This study has applied deep neural networks to detect memory-resident malware in memory forensics. We proposed and evaluated a detection framework for memory-resident malware named MRm-DLDet. It combines the information of the malware’s whole memory dumps with neural networks to overcome the bottleneck faced by existing detection methods, which rely on massive expert knowledge and do not fully exploit memory information. We first deduplicate the large memory dump file. Then, in the process of visualizing memory dumps, we convert the memory dumps to RGB images. We use a non-overlapping sliding window to cut the generated high-resolution images in order to maximize the preservation of memory dump features. This aims to improve the accuracy of the deep neural networks and to determine whether there are suspicious processes in the computer’s memory. To support these findings, we collected a publicly available dataset consisting of state-of-the-art memory-resident malware and benign samples. We have designed and implemented MRmNet for detection in MRm-DLDet, which is composed of ResNet-18, GRU, and attention, and it was trained on our dataset.

We explored the performance of different detection methods. In our experiments, MRm-DLDet achieved an impressive accuracy rate and detected 98.34% of memory-resident malware. This method detects memory-resident malware in a reasonable time. Moreover, MRm-DLDet exhibits good robustness in detecting the latest memory-resident malware, demonstrating that it is potentially valuable in practical usage. As a result, MRm-DLDet is a powerful detection scheme for memory-resident malware.

Acknowledgements
The authors of the paper sincerely appreciate the anonymous reviewers who reviewed this manuscript and provided constructive comments.

Author Contributions
JL participated in all the work, proposed the framework with careful experiments, and wrote the manuscript. YF and XL provided suggestions on the detection model and joined the discussion of this work. JZ did some work on data collection. QL reviewed the manuscript and gave suggestions on the revision of the details of the article. All authors read and approved the final manuscript.

Funding
This work is supported by the Youth Innovation Promotion Association CAS (No.2019163), the Strategic Priority Research Program of Chinese Academy of Sciences (No. XDC02040100), the Key Laboratory of Network Assessment Technology at Chinese Academy of Sciences and Beijing Key Laboratory of Network security and Protection Technology.

Availability of data and materials
The full dataset used in this paper and the demo code for the detection framework are publicly available and can be accessed at: https://github.com/C1air3/MRm-DLDet.

Declarations

Competing interests
The authors declare that they have no competing interests.

Received: 16 January 2023 Accepted: 6 April 2023

References
Abrams L (2020) TrickBot malware now checks screen resolution to evade analysis. https://www.bleepingcomputer.com/news/security/trickbot-malware-now-checks-screen-resolution-to-evade-analysis/
Alrawi O, Ike M, Pruett M, Kasturi RP, Barua S, Hirani T, Hill B, Saltaformaggio B (2021) Forecasting malware capabilities from cyber attack memory images. In: 30th USENIX security symposium (USENIX security 21), pp 3523–3540
Anderson HS, Roth P (2018) Ember: an open dataset for training static pe malware machine learning models. arXiv preprint arXiv:1804.04637
Anderson HS, Kharkar A, Filar B, Evans D, Roth P (2018) Learning to evade static pe machine learning malware models via reinforcement learning. arXiv preprint arXiv:1801.08917
Arefi MN, Alexander G, Rokham H, Chen A, Faloutsos M, Wei X, Oliveira DS, Crandall JR (2018) Faros: illuminating in-memory injection attacks via provenance-based whole-system dynamic information flow tracking. In: 2018 48th annual IEEE/IFIP international conference on dependable systems and networks (DSN), pp 231–242. IEEE
Barabosch T, Bergmann N, Dombeck A, Padilla E (2017) Quincy: Detecting host-based code injection attacks in memory dumps. In: international conference on detection of intrusions and malware, and vulnerability assessment, pp 209–229. Springer
Binsalleeh H, Ormerod T, Boukhtouta A, Sinha P, Youssef A, Debbabi M, Wang L (2010) On the analysis of the zeus botnet crimeware toolkit. In: 2010 eighth international conference on privacy, security and trust, pp 31–38. IEEE


Block F, Dewald A (2019) Windows memory forensics: detecting (un) intentionally hidden injected code by examining page table entries. Digit Investig 29:3–12
Bozkir AS, Tahillioglu E, Aydos M, Kara I (2021) Catch them alive: a malware detection approach through memory forensics, manifold learning and computer vision. Comput Sec 103:102166
Bozkir AS, Cankaya AO, Aydos M (2019) Utilization and comparision of convolutional neural networks in malware recognition. In: 2019 27th signal processing and communications applications conference (SIU), pp 1–4. IEEE
Brengel M, Rossow C (2018) Memscrimper: Time-and space-efficient storage of malware sandbox memory dumps. In: international conference on detection of intrusions and malware, and vulnerability assessment, pp 24–45. Springer
Bulazel A, Yener B (2017) a survey on automated dynamic malware analysis evasion and counter-evasion: Pc, mobile, and web. In: proceedings of the 1st reversing and offensive-oriented trends symposium, pp. 1–21
C1air3: MRm-DLDet (2023). https://github.com/C1air3/MRm-DLDet
CERT A (2018) Analysis of cyberattacks against the national bank of Malawi. https://www.antiy.com/response/20181127.html
Cho K, Van Merriënboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder-decoder approaches. arXiv preprint arXiv:1409.1259
Ebach L (2017) Analysis Results of Zeus. Variant Panda G DATA, G DATA
Fewer S (2008) Reflective DLL injection
Foundation V (2020) The volatility framework. http://www.volatilityfoundation.org
Grosse K, Papernot N, Manoharan P, Backes M, McDaniel P (2017) Adversarial examples for malware detection. In: European symposium on research in computer security, pp 62–79. Springer
Harang R, Rudd EM (2020) Sorel-20m: A large scale benchmark dataset for malicious pe detection. arXiv preprint arXiv:2012.07634
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778
Hu W, Tan Y (2017) Generating adversarial malware examples for black-box attacks based on gan. arXiv preprint arXiv:1702.05983
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Ionut Arghire: Ursnif banking Trojan gets mouse-based anti-sandboxing (2017). https://www.securityweek.com/ursnif-banking-trojan-gets-mouse-based-anti-sandboxing/
ired.team: Module stomping for shellcode injection (2020). https://www.ired.team/offensive-security/code-injection-process-injection/modulestomping-dll-hollowing-shellcode-injection
Küchler A, Mantovani A, Han Y, Bilge L, Balzarotti D (2021) Does every second count? time-based evolution of malware behavior in sandboxes. In: proceedings of the network and distributed system security symposium, NDSS. The Internet Society
Kumar S et al (2020) An emerging threat fileless malware: a survey and research challenges. Cybersecurity 3(1):1–12
Lesueur J-P (2020) Darkcomet: remote administration tool. https://www.darkcomet-rat.com/
Ligh MH, Case A, Levy J, Walters A (2014) The art of memory forensics: detecting malware and threats in windows, Linux, and Mac memory. John Wiley, USA
Malik A (2019) In-memory execution of an executable. https://securityxploded.com/memory-execution-of-executable.php
Malware Behavior Catalog: Dark Comet (2022). https://github.com/MBCProject/mbc-markdown/blob/master/xample-malware/dark-comet.md#4
Microsoft: Out of sight but not invisible: Defeating fileless malware with behavior monitoring, AMSI, and next-gen AV - microsoft security (2018). https://www.microsoft.com/security/blog/2018/09/27/out-of-sight-but-not-invisibledefeating-fileless-malware-with-behavior-monitoring-amsi-and-next-gen-av
Miramirkhani N, Appini MP, Nikiforakis N, Polychronakis M (2017) spotless sandboxes: evading malware analysis systems using wear-and-tear artifacts. In: 2017 IEEE symposium on security and privacy (SP), pp 1009–1024. IEEE
MITRE: virtualization/sandbox evasion: user activity based checks (2021). https://attack.mitre.org/techniques/T1497/002/
Mitre: mitre attck. https://attack.mitre.org/
Mitre: lazarus group (2021). https://attack.mitre.org/groups/G0032/
Nataraj L, Karthikeyan S, Jacob G, Manjunath BS (2011) Malware images: visualization and automatic classification. In: proceedings of the 8th international symposium on visualization for cyber security, pp 1–7
Ni S, Qian Q, Zhang R (2018) Malware identification using visualization images and deep learning. Comput Sec 77:871–885
odzhan: Shellcode: in-memory execution of DLL (2019). https://modexp.wordpress.com/2019/06/24/inmem-exec-dll/
orr F (2021) Phantom DLL hollowing. https://github.com/forrest-orr/phantom-dll-hollower-poc
O’Murchu L, Gutierrez FP (2015) The evolution of the fileless click-fraud malware poweliks. Symantec Corp
O’Shaughnessy S, Sheridan S (2022) Image-based malware classification hybrid framework based on space-filling curves. Comput Sec 116:102660
Paschen C (2020) Avoiding get-injectedthread for internal thread creation. https://www.trustedsec.com/blog/avoiding-get-injectedthread-for-internal-thread-creation/
PCmag: the best antivirus protection for 2022 (2022). https://www.pcmag.com/picks/the-best-antivirus-protection
Pinhero A, Anupama M, Vinod P, Visaggio CA, Aneesh N, Abhijith S, AnanthaKrishnan S (2021) Malware detection employed by visualization and deep neural network. Comput Sec 105:102247
Reza AM (2004) Realization of the contrast limited adaptive histogram equalization (clahe) for real-time image enhancement. J VLSI Signal Proc Syst Signal, Image Video Technol 38(1):35–44
Ronen R, Radu M, Feuerstein C, Yom-Tov E, Ahmadi M (2018) Microsoft malware classification challenge. arXiv preprint arXiv:1802.10135
Sihwail R, Omar K, Ariffin KAZ (2021) An effective memory analysis for malware detection and classification. CMC-Comput Mater Continua 67(2):2301–2320
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
spotheplanet: Early Bird APC Queue Code Injection (2020). https://www.ired.team/offensive-security/code-injection-process-injection/early-bird-apc-queue-code-injection
Suciu O, Coull SE, Johns J (2019) Exploring adversarial examples in malware detection. In: 2019 IEEE security and privacy workshops (SPW), pp 8–14. IEEE
Team R (2021) RIFT: analysing a lazarus shellcode execution method. https://research.nccgroup.com/2021/01/23/rift-analysing-a-lazarus-shellcode-execution-method/ Accessed 23 January 2021
Tekerek A, Yapici MM (2022) A novel malware classification and augmentation model based on convolutional neural network. Comput Sec 112:102515
Van Etten A (2018) You only look twice: Rapid multi-scale object detection in satellite imagery. arXiv preprint arXiv:1805.09512
Vasan D, Alazab M, Wassan S, Naeem H, Safaei B, Zheng Q (2020) Imcfn: Image-based malware classification using fine-tuned convolutional neural network architecture. Comput Netw 171:107138
Ventures R (2022) Download.com. https://download.cnet.com/
VirusShare: VirusShare. https://virusshare.com/
VirusTotal: virustotal. https://www.virustotal.com/gui/home/upload
VMware I (2022) VMware. https://www.vmware.com/
Wagner D, Soto P (2002) Mimicry attacks on host-based intrusion detection systems. In: proceedings of the 9th ACM conference on computer and communications security, pp 255–264
Wang Q, Hassan WU, Li D, Jee K, Yu X, Zou K, Rhee J, Chen Z, Cheng W, Gunter CA et al (2020) You are what you do: hunting stealthy malware via data provenance analysis. In: NDSS
Wang L, Tao D, Wang R, Wang R, Li H (2019) Big map r-cnn for object detection in large-scale remote sensing images. Mathemat Foundations Comput 2(4):299
Yokoyama A, Ishii K, Tanabe R, Papa Y, Yoshioka K, Matsumoto T, Kasama T, Inoue D, Brengel M, Backes M et al (2016) sandprint: Fingerprinting malware sandboxes to provide intelligence for sandbox evasion. In: research in attacks, intrusions, and defenses: 19th international symposium, RAID 2016, Paris, France, September 19-21, 2016, Proceedings 19, pp 165–187. Springer
Yosifovich P, Solomon DA, Ionescu A (2017) Windows internals, part 1: system architecture, processes, threads, memory management, and more. Microsoft Press, USA, pp 113–202
Yu Z, Qing-Zhong L, Tao L, Li-Hua W, Chun S (2015) Research and development of memory forensics. J Software 26(5):1151–1172

Zhou P, Shi W, Tian J, Qi Z, Li B, Hao H, Xu B (2016) Attention-based bidirectional long short-term memory networks for relation classification. In: proceedings of the 54th annual meeting of the association for computational linguistics (volume 2: Short Papers), pp 207–212

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in pub-
lished maps and institutional affiliations.