Detecting Obfuscated Malware using Memory Feature Engineering


Tristan Carrier (1), Princy Victor (1), Ali Tekeoglu (2, a) and Arash Habibi Lashkari (1, b)
(1) Canadian Institute for Cybersecurity (CIC), University of New Brunswick (UNB), Fredericton, NB, Canada
(2) Johns Hopkins University Applied Physics Laboratory, Critical Infrastructure Protection Group, Maryland, U.S.A.

Keywords: Obfuscated Malware, Memory Analysis, Ensemble Learning, Malware Detection, Stacking, Machine
Learning

Abstract: Memory analysis is critical in detecting malicious processes as it can capture various characteristics and behaviors. However, while there is much research in the field, there are also some significant obstacles in malware detection, such as detection rate and advanced malware obfuscation. As advanced malware uses obfuscation and other techniques to stay hidden from detection methods, there is a strong need for an efficient framework that focuses on detecting obfuscated and hidden malware. In this research, VolMemLyzer, one of the most up-to-date memory feature extractors for learning systems, is extended to focus on hidden and obfuscated malware and is used with a stacked ensemble machine learning model to create a framework for efficiently detecting malware. Also, a specific malware memory dataset (MalMemAnalysis-2022) was created to test and evaluate this framework, focusing on simulating real-world obfuscated malware as closely as possible. The results show that the proposed solution can detect obfuscated and hidden malware using memory feature engineering extremely fast, with an Accuracy and F1-Score of 99.00% and 99.02%, respectively.

a https://orcid.org/0000-0001-7638-0941
b https://orcid.org/0000-0002-1240-6433

Carrier, T., Victor, P., Tekeoglu, A. and Lashkari, A.
Detecting Obfuscated Malware using Memory Feature Engineering.
DOI: 10.5220/0010908200003120
In Proceedings of the 8th International Conference on Information Systems Security and Privacy (ICISSP 2022), pages 177-188
ISBN: 978-989-758-553-1; ISSN: 2184-4356
Copyright © 2022 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved

1 INTRODUCTION

Since the advent of malware in the 1980s, it has become one of the focal points in the field of cybersecurity. Malware is any malicious software used by cybercriminals that harms the system or user by performing various criminal activities. With the fast advancement of technology and internet access, malware has also evolved regardless of the available security measures (Statista, 2021). Moreover, its ability to evade detection methods has made the process of malware detection complex. With diversity in malware categories and families, it is essential to cover all the bases.

There are several malware categories, such as Worms, Viruses, Bots, Botnets, Trojan Horses, Ransomware, Spyware, Rootkits, etc. As the malware families within each category have several functionalities, like infiltrating the system, gaining access to information, preventing access for authorized users, or performing other cybercrimes, the best solution to detect them should focus on both categories and families to prevent and stop them in the future.

As the complexity and time consumption of manual detection methods are very high, different learning systems like machine learning and deep learning are used, which can produce intelligent insights from the data automatically. The primary challenge with these learning systems is figuring out which training data to feed to the system to make the quickest and most accurate assessment. Machine learning systems take in a set of features that can be examined over a large sample size to compare and contrast differences. Furthermore, these features can be input in different formats, which is a factor in determining the machine learning system that should be used. While some machine learning algorithms are focused on speed, others are focused on accuracy and precision. Therefore, selecting the algorithms that correspond to the objective and input type has an enormous impact on the results of the system. Ensemble machine learning is one such model in machine learning that correlates well with these goals of detection and characterization.

There exist several approaches for obfuscated malware detection based on memory analysis. However, in most of the works, the complexity and time consumption are high, which makes them not suitable
for real-world application. This is the motivation to propose a fast, efficient, and easy to develop solution for obfuscated malware detection by making use of the most effective features captured through memory analysis.

Main Contributions: The main contributions of this research include:
• Proposing a malware analysis framework that uses a two-layer stacked ensemble learning model to improve the current obfuscated or hidden malware detection solution.
• Proposing 26 new memory-based features for the only available open-source memory analyzer for learning systems, VolMemLyzer, by focusing specifically on obfuscated and hidden malware detection, and implementing the new version of the open-source project, VolMemLyzer-V2.
• Generating and releasing a comprehensive dataset by executing more than 2500 malware samples from three common obfuscated and hidden categories, including Spyware, Ransomware, and Trojan Horse, to test and evaluate the proposed framework.

The paper is structured as follows. Section 2 introduces the related works on memory analysis models that used machine learning and deep learning for malware detection. Section 3 proposes an obfuscated and hidden malware detection framework that tackles the challenges identified in this study. Section 4 presents the dataset creation process and the malware types, families, and samples, while Section 5 presents the experimental analysis. Finally, Section 6 concludes the paper by discussing the findings and future works.

2 LITERATURE REVIEW

Since its inception, malware has gained enormous attention in the cybersecurity field due to its various delivery methods and categories. Although there exist several detection methods, each carries its own challenges. This section highlights the related works on malware detection through memory analysis and discusses the remaining issues and challenges in this research field.

2.1 Malware Memory Analysis

Memory analysis is a method that provides a strong understanding of the activities in the system by capturing memory snapshots and extracting features from them. (Shree et al., 2021) discusses the reliability in malware memory analysis since all critical information is stored in memory. Memory analysis that is not performed live needs snapshots, and when obtaining these snapshots it is essential to ensure that the memory files are not affected. Affected memory files could change the results of the memory analysis process and would remove the reliability of the analysis taking place.

(Stüttgen and Cohen, 2014) proposed a framework that shows how to capture memory from a Linux system with minimal impact by using relocation hooking that can copy the information safely. Furthermore, since this technique doesn't require the installation of an environment on the system, those tasks will not be in the analyzed memory. In addition to the memory snapshot capturing difficulties, automation and the complexity of the analysis process are other challenges. As a solution for this, (Socala and Cohen, 2016) explains the method of automatic profile generation for live memory analysis, which can automate the analysis process in a viable manner.

Moreover, the work by (Okolica and Peterson, 2010) discusses the importance of having a highly flexible memory analysis process that can work on different platforms and systems, as this would significantly reduce the amount of time needed to match the system with the profile. Furthermore, the work also discovered debugging structures in memory analysis that allow the tools to run on more systems. In another work, (Block and Dewald, 2017) introduced a memory analysis plugin that can be used to simplify the analysis process. This plugin focuses on the details of the heap objects in memory, and these heap objects can help a memory analysis professional understand what is going on in the system memory.

When the memory has been successfully captured, the next step to consider is how to extract the data from within it. (Okolica and Peterson, 2011) explains the importance of DLLs and Windows drivers, which are difficult to extract with no entry point to gain access, especially with no export functions. To get the information from these drivers, considerable work is needed from a memory forensic professional. The authors show a method of reversing the drivers to gain quicker and more efficient access to the driver information.

A work by (Dolan-Gavitt, 2008) discusses the importance of gaining access to the full registry in memory with the use of cell indexes. Similarly, (Zhang et al., 2011) also explains the extraction of registry information from physical memory for Windows systems and the importance of understanding the file structure. In the other work by (Zhang et al., 2009), the use of the data structure Kernel Processor Control Region is explained for translating virtual to physical memory in the address space, thus improving memory forensics on Windows machines. (Zhang et al., 2010) also studied converting virtual to physical addresses by using the paging structure for 2MB pages in a Windows 7 system.

Memory analysis can be used in many different ways to find out what happened to a victim. (Thantilage and Jeyamohan, 2017) discuss the usage of volatile memory analysis to gain information on social media evidence. The developed application focuses specifically on targeting volatile memory analysis to obtain social media evidence.

Updating lots of systems in an industry can be expensive and often goes unnoticed until an attack occurs. For this, (Sharafaldin et al., 2017) proposes a new tool called BotViz that uses a hybrid approach for detecting bots in a network. In addition to that, this model uses hooks to strengthen bot detection. The work by (Martin-Perez et al., 2021) presents an interesting concept of memory dump pre-processing with two different strategies that can relocate file objects to make the analysis process quicker and easier. The first strategy, called Guided De-Relocation, specifically selects a new space for the information. The second strategy is Linear Sweep De-Relocation, which sweeps through the memory to find a storage spot. Memory forensic tools have drastically changed how memory analysis is performed; however, they can still be refined and improved to be faster, more efficient, and easier to use. (Lewis et al., 2018) discusses the method by which defects are fixed and improvements are added to professional tools such as Volatility. Improving memory forensic algorithms to adapt to current standards is also needed, as they explain many novel memory analysis algorithms.

2.2 Malware Detection Using Memory Analysis

Sometimes, dynamic and static methods of malware classification can be both inaccurate and imprecise. According to (Dai et al., 2018), using an extracted malware memory dump file that is converted into a grayscale image results in higher accuracy and precision than static and dynamic methods. Results show a 20 percent increase in accuracy when converting to a grayscale image before comparison with other known malware.

According to (Yucel and Koltuksuz, 2019), using a three-dimensional heat mapping system can reduce the time it takes to classify malware along with increasing its accuracy. The work discussed layering the levels on top of each other, which shows the intensity of malware in each section of the malware memory dump. Moreover, this heat map can be compared to other malware systems to show a higher accuracy detection and classification rate. As malware classification analysis can be costly in time and accuracy, (Kang et al., 2019) suggests vectorizing assembly source code and using a Long Short-Term Memory (LSTM) based method for classifying malware. Using word2vec with the LSTM system, the accuracy was 0.5 percent higher than with other methodologies.

Malware detection and analysis are difficult for advanced systems; however, malware detection in a cloud is even more difficult, with more liabilities. It can be hard to examine whether malicious acts are happening with constant live processes running, especially while taking privacy into account. Using an unbiased training set, the minHash method was able to achieve a nearly perfect detection rate. With increasing cloud operations, using the minHash method can increase efficiency and reliability, as shown in the numerous experiments of (Nissima et al., 2019). In this work, the detection results differ drastically across the different classifiers, which shows the impact of the classifiers depending on the types of malware being input into the system. To reduce this variance, classifiers can work together to make up for their weaknesses.

The current direction of remote computing leads to more information being stored in the cloud; as such, cloud systems have become a bigger target for malware, with an increase in demand for security. With the static and dynamic approaches not being applicable to cloud computing, the need for new, specialized cloud computing security methods has increased. (Li et al., 2019) suggests a deep learning approach that collects a memory snapshot of the system and converts it into a grayscale image. A convolutional neural network then models the system and is trained to differentiate between malicious and benign memory snapshots. Results showed that this process reduces the runtime of the analysis as well as accurately identifying malware. After obtaining the target virtual machine introspection, it is fed to the extracting model, which converts it to the grayscale image passed from the target VM to the secure VM for analysis.

(Sai et al., 2019) developed the concept of managing memory with API call mining. This method analyzes API calls that access the system's memory and observes the transitions in the memory to watch the management and ensure that the system does not contain any malicious activity. This method can check the
allocated memory during runtime and detect roughly 95 percent of all malicious programs from the system memory behavior.

The importance of detecting new malware is extremely high to prevent new attacks from harming systems. Many techniques have high detection rates on known malware using in-depth training techniques. However, when comparing previous works, it can be seen that they do not deal with new, never-before-seen malware. As a solution for this, (Sihwail et al., 2019) suggests using memory forensics to extract artifacts from memory combined with memory feature extraction. Based on past known malware and the extracted artifacts, the framework can determine what future malware will consist of. Results showed that the model has an extremely high detection rate and accuracy while still keeping the time needed to run low.

As some malware, like Objective-C malware, also known as userland, puts MacOS X systems at risk, (Case and Richard, 2016) proposed a plugin for the Volatility framework that focuses on automatically analyzing the important artifacts of the system. This is done by monitoring the Objective-C runtime and outputting a file that can be analyzed. Based on this file, it can be determined how to deal with the current situation. This results in a fast analysis time and less work for the analysts, thus allowing more systems to be monitored in the same amount of time. As typical malware detection and unpacking tools can be detected by the malware through their debuggers, malware stays dormant during scans and avoids malware detection methods.

However, according to (Kawakoya et al., 2010), when a stealth debugger is used, malware is not aware of when to stay dormant or when to run to avoid malware detection scans. In addition to that, the stealth debugger takes the virtual machine memory and sends it to the guest operating system, after which it runs the analysis to identify the true origins of the code. Since most malware is advanced enough to contain obfuscation methods, this model can detect most packers at an incredibly high accuracy rate, with some packers getting a perfect detection rate. While static and dynamic approaches are a good start for detecting malware, they can often be exploited by obfuscated malware, leading to malware deactivating the detection methods. Using application-specific detection with machine learning, (Xu et al., 2017) was able to get a nearly perfect malware detection rate. This method starts at the top layer and works down to the kernel level, where many corruption attacks can occur. With this approach, corruption attacks were stopped 99 percent of the time with less than a five percent false positive rate.

To combat malware obfuscation techniques, the detection method needs to be designed with obfuscation in mind. This can be done using a specifically designed dataset to test how well a detection system deals with obfuscated malware. (Sadek et al., 2019) challenged detection methods by using a large dataset that consists of positive and negative memory snapshots, advanced payload systems, and malware obfuscation. (Bozkir et al., 2021) have come up with a novel approach that uses an RGB image to show memory dump files in their malware detection system.

Using the manifold learning technique called UMAP, (Javaheri and Hosseninzadeh, 2017) identified whether the original memory dump file showed malicious or benign activity. After testing with ten malware families and benign samples, the results were roughly 96 percent accuracy at the extremely fast speed of only 3.56 seconds. Moreover, a framework was also developed to combat the obfuscation of malware. Using the detection presence time of the malware at each level of the operating system down to the kernel, they were able to dump the malware memory at the precise time and view the malware installation. The framework was designed specifically with obfuscation and packaging in mind to challenge one of the biggest problems in malware detection. After testing, the framework obtained roughly 85 percent accuracy in detecting kernel-level malware. Though there are many different methods to detect obfuscated malware, each method has to be looked into for different situations.

Malware and botnets can be difficult to blacklist when they use obfuscation and concealment. Botnet command and control servers can also make real-time prediction of domain names extremely challenging. (S et al., 2019) discusses the use of a framework to counter obfuscation by using the LSTM network. This framework operates on both binary and multi-class data with a high recall rate and precision, producing a good F1 score. This F1 score is over 80 percent for binary class data and over 60 percent for multi-class data. Moreover, this framework can be used to help identify concealed and obfuscated malware in botnet systems.

VMShield, a method proposed by (Mishra et al., 2021), protects virtual domains in the cloud from obfuscated and stealthy malware attacks. This work used a state-of-the-art method that collects runtime behavior from the different processes and analyzes the results to make obfuscated and stealthy malware unable to sneak past detection. Passing down to the system, VMShield is able to monitor the results of
each layer, trace all of the system calls, and extract the features that have the biggest impact on the system. VMShield can detect more than 97 percent of the attacks using these introspection techniques, including hidden and obfuscated attacks. The work describes the VMShield cloud protection process step by step, discussing the tracing of the hypervisor from the virtual machine, feature extraction, the selection process, and profile generation. Finally, VMShield obtains the result of the model and delivers a status report that can be reviewed by the admin.

Virtual machine introspection has become a common tactic for detecting malware and other malicious sources, although it can miss hidden, dead, or obfuscated malware. With the use of a virtual machine monitor, otherwise known as a hypervisor, (Kumara and Jaidhar, 2016) discusses an automated internal and external system that can detect hidden, dead, and obfuscated malware inside the virtual machine with the aid of machine learning. After testing the system with an advanced data set using cross-validation, the authors found that their system has a 99.55 percent accuracy rate while still holding the extremely low false-positive rate of 0.004 percent.

There exist works like (Sklavos, 2017) that discuss the security issues in IoT devices by studying the malware for both system hardware and software. In this work, the most widespread malware categories, such as logic bombs, rootkits, bots, etc., were discussed from a software viewpoint. In addition to that, hardware security in IoT devices was also studied by mentioning power monitoring attacks, timing attacks, etc. The work also presented the existing malware detection approaches and summarized expected future directions.

Overall, it can be seen that several approaches exist for obfuscated malware detection based on memory analysis. To the best of our knowledge, no literature has focused on detection in memory through feature extraction, as the methods used are very complex and time-intensive. It is also interesting to notice that the works have focused on detecting malware found in different system layers for general and obfuscated malware cases. The VolMemLyzer was developed as the first memory-based malware analysis feature extractor for learning-based solutions, but it did not focus on obfuscated malware analysis (Lashkari et al., 2020). As a result, this work proposes an obfuscated malware memory analysis framework that focuses on a fast and low-cost solution, which will be discussed in the next section.

3 PROPOSED APPROACH

In most existing works, the complexity and time consumption are high, making them unsuitable for real-world application. As a solution for this, a fast, efficient, and easy to develop solution for obfuscated malware detection is proposed in this paper by using the most effective features captured through memory analysis.

3.1 General Overview

The overview of this obfuscated malware detection framework is depicted in Figure 1. The components of the proposed framework include:

• Memory Dump File: Memory dumps can be obtained by using programs such as MAGNET RAM, ManTech Memory DD, Forensic Tool Kit (FTK), or virtual machine managers with the memory capturing feature. This is a snapshot showing the activity that took place in memory on the system (MAG, 2021) (Man, 2021) (For, 2021).
• Volatility: A completely open collection of tools, implemented in Python under the GNU General Public License, to extract digital artifacts from volatile memory (RAM) samples (Vol, 2016).
• VolMemLyzer-V2: The memory feature extractor for learning-based solutions, with the 26 new features implemented as part of the proposed model to target obfuscated and hidden malware. VolMemLyzer extracts the features using Volatility plugins and generates a CSV file (Lashkari et al., 2020).
• CSV Feature File: This is the output from the VolMemLyzer feature extractor, which contains all the features that have been extracted in a compact comma-separated values (CSV) file.
• Ensemble Learning: A machine learning technique that focuses on combining classifiers to cover their weaknesses. As some classifiers are easily swayed by outliers or have a high bias, ensemble learning allows these weaknesses to have less impact on the overall results (Ens, 2021). The stacking ensemble technique, which has two layers of classifiers, was used for this framework.
• Malicious and Benign Classification Output: The binary output for each memory dump file that shows whether there is malicious activity or benign activity.


Figure 1: Malware Memory Analysis Process.
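The first stages of the process in Figure 1 (memory dump file to Volatility plugin output) can be sketched as follows. This is only an illustrative sketch and not the VolMemLyzer implementation: the helper `build_volatility_commands`, the dump name `sample.raw`, and the profile string `Win10x64` are placeholders, and mapping the "Process View" features to the `psxview` plugin is an assumption based on the feature names. The code only builds Volatility 2 command lines; it does not run them.

```python
# Volatility 2 plugins assumed to back the five feature categories used by the
# framework (Malfind, Ldrmodule, Handles, Process View, Apihooks).
# The psxview mapping for "Process View" is an assumption.
PLUGINS = ["malfind", "ldrmodules", "handles", "psxview", "apihooks"]


def build_volatility_commands(dump_path, profile, vol_script="vol.py"):
    """Return one Volatility 2 command line (as an argument list) per plugin."""
    return [
        ["python", vol_script, "-f", dump_path, "--profile=" + profile, plugin]
        for plugin in PLUGINS
    ]


# Placeholder dump file and profile name, for illustration only.
commands = build_volatility_commands("sample.raw", "Win10x64")
for cmd in commands:
    print(" ".join(cmd))
```

Each command list could be handed to `subprocess.run` on a machine with Volatility 2.6 installed; the plugin output would then be parsed into the per-dump feature values that end up in the CSV feature file.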

3.2 Proposed Features

The first step in this model is to obtain the memory dump, which must be compatible with Volatility version 2.6 (Vol, 2016), the version used for the framework with the VolMemLyzer (Lashkari et al., 2020). The memory dumps are then passed to the VolMemLyzer feature extractor, which uses Volatility to extract 58 features. Among these 58 features, the newly added 26 features specifically focus on targeting obfuscated and hidden malware, as explained in this section.

These features belong to five different Volatility categories. The first category is called Malfind, which detects potentially malicious executables that are usually DLLs associated with Trojan malware. The next category is called Ldrmodule, which gives information on code potentially injected into the system, which is often the way spyware enters the system. The Handle category looks at the type of information in memory and its classification. The Process View features show the process list with information that can be used to find malicious processes. The last feature type is the API-hook features, which show the total number of API-hooks of key types. Table 1 depicts the features extracted for the memory analysis framework.

3.3 Detection Model

Once these features are extracted, they are ready to be fed into the proposed ensemble learner. The ensemble learner has two stages: the training stage and the validation stage. In the training stage, the ensemble learner runs the base learners, and the prediction results from these classifiers are used as input for the second layer classifier. After the ensemble learner is trained, it is then validated, going through the same process as the training to validate the data set results and ensure that the training was successful.

There exist several ensemble learning techniques such as stacking, voting, boosting, bagging, etc. In voting, the weights given by the user are used to combine the classifiers, whereas stacking achieves this aggregation with a meta classifier. In this proposed work, stacking is selected as the method for ensemble learning due to its speed and variance performance. As mentioned above, the base learners in the first layer run independently, allowing them to run in parallel and reduce the time needed for classification. The second layer deals with a small number of inputs, which allows it to be run quickly after the first layer is finished. The different classifiers can compensate for each other's weaknesses, keeping the accuracy high while classifying fast.

As there is a plethora of machine learning classifiers, it is vital to identify the suitable classifiers that can be applied to the proposed model. Usually, the classifier is selected based on the type of dataset used for classification. This work's chosen base learners are SVM, Decision Tree, Linear Perceptron, Naive Bayes, Random Forest, and KNN, experimenting with different combinations. However, only three were selected at a time, as the proposed model, developed using Python, is built to use three base learners and one meta learner. Similar to the base learners, different meta learners such as SVM, KNN, Naive Bayes, and Logistic Regression were also used for finding the best classifier. All these classifiers were chosen with speed and accuracy in mind. The best combination among these will be selected in the experimentation phase. Finally, the meta learner outputs the binary results showing whether the memory snapshot was malicious or benign. Figure 2 shows the different base learners and meta learners used in the proposed model.

4 CREATING A NEW DATASET

As the proposed malware detection framework focuses explicitly on targeting obfuscated malware, a dataset is developed to simulate conditions as close as possible to the malware found in the real world.


Table 1: Extended Feature List.

Feature Type    Feature List         Feature Description
Malfind         commitCharge         Total number of Commit Charges
Malfind         protection           Total number of protection
Malfind         uniqueInjections     Total number of unique injections
Ldrmodule       avgMissingFromLoad   The average amount of modules missing from the load list
Ldrmodule       avgMissingFromInit   The average amount of modules missing from the initialization list
Ldrmodule       avgMissingFromMem    The average amount of modules missing from memory
Handles         port                 Total number of port handles
Handles         file                 Total number of file handles
Handles         event                Total number of event handles
Handles         desktop              Total number of desktop handles
Handles         key                  Total number of key handles
Handles         thread               Total number of thread handles
Handles         directory            Total number of directory handles
Handles         semaphore            Total number of semaphore handles
Handles         timer                Total number of timer handles
Handles         section              Total number of section handles
Handles         mutant               Total number of mutant handles
Process View    pslist               Average false ratio of the process list
Process View    psscan               Average false ratio of the process scan
Process View    thrdproc             Average false ratio of the third process
Process View    pspcid               Average false ratio of the process id
Process View    session              Average false ratio of the session
Process View    deskthrd             Average false ratio of the deskthrd
Apihooks        nhooks               Total number of apihooks
Apihooks        nhookInLine          Total number of in line apihooks
Apihooks        nhooksInUsermode     Total number of apihooks in user mode
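To make the Handles and Process View rows of Table 1 concrete, the following minimal sketch computes them from already-parsed plugin output. The input format (lists of dictionaries) and the helper names `handle_counts` and `false_ratios` are hypothetical, and reading "average false ratio" as the fraction of processes for which a detection method reports False is an assumption, not a definition taken from VolMemLyzer.

```python
from collections import Counter

# Handle types listed under "Handles" in Table 1.
HANDLE_TYPES = ["port", "file", "event", "desktop", "key", "thread",
                "directory", "semaphore", "timer", "section", "mutant"]

# Detection-method columns listed under "Process View" in Table 1.
PROCESS_VIEW_COLUMNS = ["pslist", "psscan", "thrdproc", "pspcid",
                        "session", "deskthrd"]


def handle_counts(handle_rows):
    """Count handles per type; each row is a dict with a 'type' key (assumed format)."""
    counts = Counter(row["type"].lower() for row in handle_rows)
    return {name: counts.get(name, 0) for name in HANDLE_TYPES}


def false_ratios(process_rows):
    """Fraction of processes for which each detection method reported False."""
    total = len(process_rows)
    if total == 0:
        return {name: 0.0 for name in PROCESS_VIEW_COLUMNS}
    return {
        name: sum(1 for row in process_rows if not row[name]) / total
        for name in PROCESS_VIEW_COLUMNS
    }


# Tiny hand-made example standing in for parsed plugin output.
handles = [{"type": "File"}, {"type": "Key"}, {"type": "File"}, {"type": "Mutant"}]
processes = [
    {"pslist": True, "psscan": True, "thrdproc": True,
     "pspcid": True, "session": True, "deskthrd": True},
    {"pslist": False, "psscan": True, "thrdproc": True,
     "pspcid": False, "session": True, "deskthrd": False},
]

# One feature row for one memory dump, keyed by the Table 1 feature names.
features = {**handle_counts(handles), **false_ratios(processes)}
print(features)
```

A row like `features` corresponds to one line of the CSV feature file; a process that is visible to one detection method but not to another raises the corresponding false ratio, which is the kind of signal hidden processes would be expected to produce.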

Figure 2: Stacked Ensemble Learning Classifiers.
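The two-layer arrangement in Figure 2 (three base learners feeding one meta learner) can be sketched with scikit-learn's `StackingClassifier`. This is a hedged illustration rather than the authors' code: the data is synthetic (`make_classification` stands in for the 58-feature CSV file), and the particular combination of Naive Bayes, Random Forest, and Decision Tree with a Logistic Regression meta learner is an arbitrary pick from the classifiers named in Section 3.3, not the combination reported as best.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the CSV feature file: 58 numeric features, binary label.
X, y = make_classification(n_samples=600, n_features=58, n_informative=10,
                           random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# First layer: three base learners (an arbitrary combination for illustration).
base_learners = [
    ("nb", GaussianNB()),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("dt", DecisionTreeClassifier(random_state=0)),
]

# Second layer: a meta learner trained on the base learners' predictions.
model = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

# Validation stage: binary output per sample (0 = benign, 1 = malicious here).
pred = model.predict(X_val)
acc = accuracy_score(y_val, pred)
f1 = f1_score(y_val, pred)
print(f"accuracy={acc:.3f} f1={f1:.3f}")
```

Note that `StackingClassifier` trains the meta learner on cross-validated predictions of the base learners, which may differ in detail from the training procedure described in Section 3.3. Its `n_jobs` parameter can be set to fit the base learners in parallel, in the spirit of running the first layer concurrently.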

4.1 Overview

In this research, a dataset (MalMemAnalysis-2022) of 58,596 records, with 29,298 benign and 29,298 malicious records, is created by capturing malicious as well as benign dumps. For capturing malicious memory dumps, 2,916 malware samples collected
from VirusTotal, covering different malware categories including Ransomware, Spyware, and Trojan Horse as listed in Table 2, are executed in a VM with 2 gigabytes of memory. Similarly, for the creation of benign memory dumps, normal user behaviour is captured by using various applications in the machine. The detailed process is discussed in the section below. (The new dataset will be available at “https://www.unb.ca/cic/datasets/MalMem-2022.html”).

Table 2: Malware sample count.

Malware Category   Malware Families   Count
Trojan Horse       Zeus               195
Trojan Horse       Emotet             196
Trojan Horse       Refroso            200
Trojan Horse       scar               200
Trojan Horse       Reconyc            157
Spyware            180Solutions       200
Spyware            Coolwebsearch      200
Spyware            Gator              200
Spyware            Transponder        241
Spyware            TIBS               141
Ransomware         Conti              200
Ransomware         MAZE               195
Ransomware         Pysa               171
Ransomware         Ako                200
Ransomware         Shade              220

4.2 Creating Dataset

Four main steps were considered in this dataset creation: research, memory dump extraction, memory dump transfer, and feature extraction.

• First step is the research of the malware category,

agement system. This ensures that the memory dump is not contaminated with a process that is not usually on a typical system. The memory dump is captured from a Windows 10 system rather than a Windows XP or older system that is not used as much. This is to ensure that the malware being tested is as close to a real-world simulation as possible. To expand the dataset, this process was automated, with 2,916 malware samples from three malware categories, including Trojan Horse, Ransomware, and Spyware, executed in the VM. As it is important to have some benign processes executing during the malicious memory dump creation, different applications in the Windows VM were opened along with executing the malware samples. This is done to make sure that the classifier is not able to determine the difference based on the benign processes alone. For each sample execution, 10 memory dumps, each with a 15 second gap, were captured to ensure no malware behaviour is left out, and 29,298 malicious memory dumps were extracted. For benign dumps, normal user behaviour is captured by using different applications in the machine, and oversampling is performed using the SMOTE algorithm to make the dataset balanced. Unlike other oversampling methods, SMOTE does not generate duplicates; instead, it produces synthetic values that are negligibly distinct from the actual values.
• The third step consists of transferring the resulting memory dump files to a Kali Linux machine to perform the feature extraction using the VolMemLyzer with the 26 new features added to target malware obfuscation.
• The fourth main step of the initial process was the feature extraction of the memory dump files and the creation of the final combined CSV file for all
family, and sample type. It is important to have tested memory dump files, which is to be used in
malware that simulates as close to a real-world ex- the ensemble learning system. After the memory
ample as possible. As such, malware designed to dumps were acquired, the VolMemLyzer feature
specifically target old systems that are no longer extractor ran on all the memory dumped files in
in use and do not work on newer systems would the folder and generated the resulting CSV file to
not accurately detect the malware of current sys- be used in the ensemble learning system.
tems. This is the reason why in-depth research
was done on each family and type of malware. sectionEXPERIMENTS To finalize our proposed
Based on the research, we have collected a mini- model, we have used the newly created dataset. The
mum of 100 and a maximum of 200 malware sam- detailed experimental setup, along with the finalized
ples from five different families in three malware model, is discussed in the below sections.
categories: Trojan Horse, Ransomware, and Spy-
ware. 4.3 Experimental Setup
• The second step is memory dumping. The mem-
A python code and a bash script are used to execute
ory dump can be activated outside the virtual ma-
the malware samples on a 64-bit Windows 10 isolated
chine, where the memory snapshot is captured
virtual machine inside Oracle Virtual Box and cap-
from using the VirtualBox virtual machine man-

184
Detecting Obfuscated Malware using Memory Feature Engineering

tured the local machine’s memory dumps. For the fea- Table 4: Ensemble Model Comparison.
ture extraction, we created the CSV file with features Base Meta
from the captured memory dump using VolMemLyzer Pre. Rec. F1 Acc.
Learner Learner
feature extractor for learning systems, publicly avail- NB, LP, DT SVM 0.96 0.95 0.95 0.95
able on GitHub (Lashkari et al., 2020). In addi- SVM, LP, DT KNN 0.97 0.96 0.96 0.96
tion to that, for developing stacked ensemble learn- NB, LP, RF LR 0.98 0.97 0.97 0.97
ers, python was used with the Sklearn library and de- NB, RF, DT LR 0.99 0.99 0.99 0.99
ployed in the Jupyter Notebook IDE for simplifying
the development of model (skl, 2021).
were Naive Bayes, Random Forest, and Decision Tree
for the base-learners and Logistic regression as the
4.4 Finalizing The Proposed Model
meta-learner in the finalized model. Figure 3 shows a
confusion matrix representing the true positive, false
This section finds the best combination of base learn-
positive, false negative and true negative.
ers and meta learners by performing several experi-
ments. First, each base-learner is evaluated using the
created dataset, and results are analyzed using differ-
ent evaluation metrics, including Accuracy, weighted
average Precision, weighted average Recall, and F1-
score, as shown in Table 3. From the results, it can
be identified that RF, Decision Tree, and KNN exhib-
ited better performance, whereas Linear Perceptron
has the least performance.

Table 3: Individual Classifiers Result.


Classifiers Pre. Rec. F1 Acc.
RF 0.98 0.97 0.97 0.97
NB 0.92 0.92 0.92 0.92
DT 0.97 0.97 0.97 0.97 Figure 3: Confusion matrix results.
KNN 0.95 0.95 0.95 0.95
SVM 0.91 0.90 0.90 0.90 Furthermore, since the memory dump used for
LP 0.61 0.59 0.53 0.60 feature extraction is taken from a Windows10 system,
the size of each memory dump is 2GB which is quite
large. However, the resulting CSV files’ size is around
To select the best stacking model and finalize
2KB, allowing for scalability in the proposed model.
it, different combinations of base-learners and meta-
Moreover, this model could solve the complexity and
learners were considered and the top four highest ac-
time-consuming issues found in the existing works.
curacy results among them are shown in Table 4. Re-
Table 5 shows the approximate time for samples
sults proved that the performance of the model is
to be classified as malicious or benign. From this, it
increased when ensemble methods like stacking are
can be identified that the classification time displayed
used. One of the main goals of the research was to fo-
is linear, which proves the model’s scalability. Over-
cus on overall speed of which classification speed has
all, the proposed stacked ensemble model has a clas-
been optimized. The classifiers that were considered
sification time of around 0.008 milliseconds per sam-
had a fast classification speed while also able to work
ple, which is significantly lower than most of the other
on a large variance of data for the first layer. The sec-
models listed in the related works (the second column
ond layer classifiers didn’t need a large variance so
of the Table 6).
strictly speed and accuracy was looked into. Stacking
was chosen to satisfy the goal of a fast speed in clas-
sification since stacking allows first layer classifiers 4.5 Comparing Proposed Model With
to run in parallel. At a fast speed, it is able to check Relevant Studies
the results from multiple classifiers and determine the
right binary classification. Moreover, it is also iden- The comparison mainly focused on existing works
tified that although some classifiers perform poorly, that targeted obfuscated malware or hidden malware
their performance can be enhanced when combined that belonged to one of the three categories of Ran-
with some other classifiers. somware, Spyware, and Trojan Horse malware. Table
Hence, based on the results, the selected classifiers 6 shows the results of the four various related works

in the targeted area. Due to the differences in datasets and approaches, exact results are difficult to compare and are often biased towards a specific method.

The four criteria for comparison were the overall accuracy of the technique, the overall speed per sample, the memory usage needed to classify a sample, and the overall model complexity. Based on these criteria, each work was rated on a scale from very low to very high. These four related works are then compared to the proposed model, Detecting Obfuscated Malware using Memory Feature Engineering.

The first work selected for comparison is by (Xu et al., 2017), which has very high accuracy and medium complexity. However, its memory usage is high, and the work did not mention speed. In the second work, (Bozkir et al., 2021) detected obfuscated malware with high accuracy at a rate of 3.56 seconds per sample. Although the complexity of this model is high, the primary concern is the memory usage, as the system needs to store the RGB image representation for each sample and three different images for the creation of the final image. The third work, by (Javaheri and Hosseninzadeh, 2017), presents a model that requires less memory and has high speed. However, this framework has increased complexity and lower accuracy than the previous methods. The fourth work is by (Nissima et al., 2019), and its model exhibits high accuracy with lower complexity than the previous two methods. However, it is slower, and the amount of memory used is not mentioned.

Comparing these related works shows the importance of a fast detection method for obfuscated or hidden malware. The high volume of obfuscated malware has made speed and scalability a requirement rather than a luxury. The results indicate that the new features and the stacked machine learning model improve the overall accuracy of obfuscated and hidden malware detection.

Table 5: Approximate Classification Time Based On The Number Of Samples.

No. of samples   Time (Sec)
50               0.4
100              0.8
150              1.2
200              1.6
250              2
300              2.4

5 CONCLUSIONS

As networking and the internet evolved, malware authors swiftly adapted their malicious code, most of which is used to exploit vulnerabilities in Microsoft Windows. Although several techniques exist to detect obfuscated or hidden malware based on memory analysis, their time consumption and complexity are high. As a solution to this problem, an obfuscated malware detection model is proposed, which extracts features from memory dumps using VolMemLyzer, a feature extractor for learning systems. For this, a dataset (MalMemAnalysis-2022) was constructed by executing malware samples from three main categories, Spyware, Ransomware, and Trojan Horse malware, in an isolated virtual machine.

The model was designed to use memory feature engineering with a stacked ensemble learner to achieve the goals of this research. Different combinations of base-learners and meta-learners were evaluated on the created dataset, and the final model was selected based on different evaluation metrics. The best results were obtained when Naive Bayes, Decision Tree, and Random Forest were used as base learners and Logistic Regression as the meta-learner, with an accuracy of 99%. Moreover, this model was compared with related works that focus on obfuscated malware detection in memory. The comparison showed that the proposed model has a lower classification time and better performance.

5.1 Future Work

Detecting obfuscated malware in memory using feature engineering with VolMemLyzer offers a quick and precise way of dealing with malware that attempts to hide in memory. The next step of this research is to work with more advanced designer malware that is specifically designed for different systems, including Mac and Linux-based systems. Moreover, this can ensure that older systems are protected by this detection model and that the model can be incorporated into upcoming systems. This would protect most systems from obfuscated malware attacks by focusing on automated detection. With ransomware running rampant in today's society, ensuring that this malware is detected and dealt with before it causes harm is extremely important. The speed of this model can help detect the malware before such harm is caused and thus reduce overall harm.


Table 6: Comparison Table With Other Obfuscated Malware Detection Methods.

Method                                Accuracy    Speed           Memory Usage    Complexity
(Xu et al., 2017)                     Very High   Not Mentioned   High            Medium
(Bozkir et al., 2021)                 High        Medium          Very High       High
(Javaheri and Hosseninzadeh, 2017)    Medium      High            Low             High
(Nissima et al., 2019)                High        Medium          Not Mentioned   Medium
Proposed Model                        High        Very High       Low             Medium

ACKNOWLEDGEMENTS

We thank the Mitacs Program for providing the Global Research Internship (GRI) opportunity to support this project.

REFERENCES

(2016). Volatility framework - volatile memory extraction utility framework. https://github.com/volatilityfoundation/volatility. (Accessed on 08/10/2021).
(2021). Ensemble methods in machine learning: What are they and why use them? https://towardsdatascience.com/ensemble-methods-in-machine-learning-what-are-they-and-why-use-them-68ec3f9fef5f5. (Accessed on 08/10/2021).
(2021). Ftk® forensic toolkit: The gold standard in digital forensics for over 15 years. https://www.exterro.com/forensic-toolkit. (Accessed on 08/9/2021).
(2021). Magnet ram capture: What does it do? https://www.magnetforensics.com/resources/magnet-ram-capture/. (Accessed on 08/11/2021).
(2021). Mantech memory dd version 1.3 for forensic analysis of computer memory. https://investor.mantech.com/press-releases/press-release-details/mantech-memory-dd-version-13-forensic-analysis-computer-memory. (Accessed on 08/12/2021).
(2021). Understanding logistic regression in python. https://realpython.com/logistic-regression-python/. (Accessed on 08/10/2021).
Block, F. and Dewald, A. (2017). Linux memory forensics: Dissecting the user space process heap. Digital Investigation, 22:66–75.
Bozkir, A. S., Tahilioglu, E., Aydos, M., and Kara, I. (2021). A malware detection approach through memory forensics, manifold learning and computer vision. Science Direct, 103.
Case, A. and Richard, G. G. (2016). Detecting objective-c malware through memory forensics. Digital Investigation, 18.
Dai, Y., Li, H., Qian, Y., and Lu, X. (2018). A malware classification method based on memory dump grayscale image. Digital Investigation, 27:30–37.
Dolan-Gavitt, B. (2008). Forensic analysis of the windows registry in memory. Digital Investigation, 5:26–32.
Javaheri, D. and Hosseninzadeh, M. (2017). A framework for recognition and confronting of obfuscated malwares based on memory dumping and filter drivers.
Kang, J., Jang, S., Li, S., Jeong, Y.-S., and Sung, Y. (2019). Long short-term memory-based malware classification method for information security. Computers and Electrical Engineering, 77.
Kawakoya, Y., Iwamura, M., and Itoh, M. (2010). Memory behavior-based automatic malware unpacking in stealth debugging environment. 5th International Conference on Malicious and Unwanted Software, Nancy, Lorraine, pages 39–46.
Kumara, A. and Jaidhar (2016). Leveraging virtual machine introspection with memory forensics to detect and characterize unknown malware using machine learning techniques at hypervisor. Digital Investigation, 23:99–123.
Lashkari, A. H., Li, B., Carrier, T. L., and Kaur, G. (2020). Volatility memory analyzer. https://github.com/ahlashkari/VolMemLyzer.
Lewis, N., Case, A., Ali-Gombe, A., and III, G. G. R. (2018). Memory forensics and the windows subsystem for linux. Digital Investigation, 26:3–11.
Li, H., Zhan, D., Liu, T., and Ye, L. (2019). Using deep-learning-based memory analysis for malware detection in cloud. 2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems Workshops (MASSW), Monterey, CA, USA, pages 1–6.
Martin-Perez, M., Rodriguez, R. J., and Balzarotti, D. (2021). Pre-processing memory dumps to improve similarity score of windows modules. Computer & Security, 101.
Mishra, P., Aggarwal, P., Vidyarthi, A., Singh, P., Khan, B., Alhelou, H. H., and Siano, P. (2021). Vmshield: Memory introspection-based malware detection to secure cloud-based services against stealthy attacks. IEEE Transactions on Industrial Informatics.
Nissima, N., Lahava, O., Cohena, A., and Rokacha, Y. E. L. (2019). Volatile memory analysis using the minhash method for efficient and secured detection of malware in private cloud. Computers & Security, 87.
Okolica, J. and Peterson, G. L. (2010). Windows operating systems agnostic memory analysis. Digital Investigation, 7:48–56.


Okolica, J. S. and Peterson, G. L. (2011). Windows driver memory analysis: A reverse engineering methodology. Computers & Security, 30:770–779.
S, A., S, S., Poornachandran, P., krishna Menon, V., and P, S. K. (2019). Deep learning framework for domain generation algorithms prediction using long short-term memory. ICACCS.
Sadek, I., Chong, P., Rehman, S. U., Elovici, Y., and Binder, A. (2019). Memory snapshot dataset of a compromised host with malware using obfuscation evasion techniques. Data in brief, 26.
Sai, K. V. N., Thanudas, B., Chakraborty, A., and Manoj, B. S. (2019). A malware detection technique using memory management api call mining. IEEE.
Sharafaldin, I., Gharib, A., and Lashkari, A. H. (2017). Botviz: A memory forensic-based botnet detection and visualization approach. International Carnahan Conference on Security Technology (ICCST).
Shree, R., Shukla, A. K., Pandey, R. P., Shukla, V., and Bajpai, D. (2021). Memory forensic: Acquisition and analysis mechanism for operating systems. Materials Today: Proceedings.
Sihwail, R., Omar, K., Ariffin, K. A. Z., and Afghani, S. A. (2019). Malware detection approach based on artifacts in memory image and dynamic analysis. Applied Sciences.
Sklavos, N. (2017). Malware in iot software and hardware. In Workshop on Trustworthy Manufacturing and Utilization of Secure Devices (TRUDEVICE'16), pages 8–11.
Socala, A. and Cohen, M. (2016). Automatic profile generation for live linux memory analysis. Digital Investigation, 16:11–24.
Statista (2021). Statista: annual number of malware attacks worldwide from 2015 to 2019. https://www.statista.com/statistics/873097/malware-attacks-per-year-worldwide/. (Accessed on 08/10/2021).
Stüttgen, J. and Cohen, M. (2014). Robust linux memory acquisition with minimal target impact. Digital Investigation, 11:112–119.
Thantilage, R. and Jeyamohan, N. (2017). A volatile memory analysis tool for retrieval of social media evidence in windows 10 os based workstations. National Information Technology Conference (NITC).
Xu, Z., Ray, S., Subramanyan, P., and Malik, S. (2017). Malware detection using machine learning based analysis of virtual memory access patterns. Design, Automation & Test in Europe Conference & Exhibition (DATE), Lausanne, pages 169–174.
Yucel, C. and Koltuksuz, A. (2019). Imaging and evaluating the memory access for malware. Forensic Science International: Digital Investigation, 32.
Zhang, R., Wang, L., and Zhang, S. (2009). Windows memory analysis based on kpcr. International Conference on Information Assurance and Security.
Zhang, S., Wang, L., and Zhang, L. (2011). Extracting windows registry information from physical memory. 3rd International Conference on Computer Research and Development.
Zhang, S., Wang, L., Zhang, R., and Guo, Q. (2010). Exploratory study on memory analysis of windows 7 operating system. International Conference on Advanced Computer Theory and Engineering (ICACTE), 3.
