A Comparative Analysis of Open Source Automated Malware Tools
Preeti
National Forensic Sciences University
Gandhinagar, India
[email protected]

Animesh Kumar Agrawal
National Forensic Sciences University
Gandhinagar, India
[email protected]
978-93-80544-44-1/22/$31.00 © 2022 IEEE
Authorized licensed use limited to: Universiti Malaysia Perlis. Downloaded on December 19,2024 at 17:38:03 UTC from IEEE Xplore. Restrictions apply.
across a network and target database and file servers, and can thus quickly paralyse an entire organization. It is a growing threat, generating billions of dollars in payments to cybercriminals and inflicting significant damage and expenses on businesses and governmental organizations [4].

E. Rootkits
A rootkit is malicious software that allows an unauthorized user to gain privileged access to a computer and to restricted areas of its software [5]. A rootkit may contain a number of malicious tools such as keyloggers, banking credential stealers, password stealers, antivirus disablers, and bots for Distributed Denial of Service (DDoS) attacks. This software remains hidden on the computer and gives the attacker remote access to it [5].

F. Malware Analysis
Malware analysis is the study of malware samples to determine their functionality, risk factor, origin, etc. Malware analysis is divided into two broad subcategories: static analysis and dynamic analysis.

1) Static Analysis
Static analysis is the process of inspecting a program or its code without actually executing the malware [6] [8]. Tools and techniques are used to quickly determine whether a file has malicious intent. This analysis is further divided into basic static analysis and advanced static analysis. Basic static analysis involves submitting the code to tools like VirusTotal [8], where it can be analysed by various anti-virus solutions, checking the strings in the code, and using tools like PEiD [9] to determine whether it is malicious or not. For advanced static analysis, the code is reverse engineered to determine whether it is malicious or not.

2) Dynamic Malware Analysis
Dynamic malware analysis is the process of gathering information about the malware by executing it. Various Sysinternals tools, debuggers, and disassemblers are used in this process. Some of them are Procmon, Process Explorer, IDA Pro (Interactive Disassembler), OllyDbg, Process Hacker, etc.

Automated malware analysis is the automation of both static and dynamic malware analysis. It is basically a framework/sandbox environment which takes malware sample files as input, executes them, and reports the functionality and the tasks attempted by that malware. It reduces the cost of doing the process manually and also saves time, as more samples can be analysed in less time. Automated malware analysis mostly relies on malware samples that have been discovered previously, which makes the process faster. Checking the hash value of the file against a database of known samples and using automated techniques to unpack or de-obfuscate files are part of automated malware analysis.

II. LITERATURE REVIEW
In [10], the authors discuss how to increase the successful emulation rate in a virtual environment. The paper describes how malware these days is able to detect whether it is running in a virtual environment, and specifically covers the techniques which malware uses to detect virtualization in Cuckoo Sandbox. The malware can detect whether a function has been hooked by CuckooMon to track process behaviour. The paper also explains how CuckooMon uses config files for its initialization steps and how malware can check those specific directories and, if it finds those files, hide its true behaviour. Also, the malware can check the SYSTEMDRIVE directory for folders such as dll/, bin/, lib/ etc., which are created to copy start-up, fix and disguise scripts before starting the emulation process; if it finds these, it will know it is running in a sandboxed environment. The paper also talks about various mitigation steps that can be taken to stop malware from detecting virtualization.

A comparative analysis of dynamic malware analysis tools is presented in [11], which demonstrates that it becomes very difficult for the analyst to decide which one to choose for a specific task. In [11], the authors discuss which tools capture which malware functions (such as monitoring function calls, hooking, registry change detection, entropy etc.) better than others. For example, Cuckoo Sandbox captures API (Application Programming Interface) hooking better than the others, while IDA Pro shouldn't be used to capture registry changes but works better for finding traces.

In [12], a state-of-the-art survey discusses the definition of malware, its types, how those specific types work, the different evasion techniques used by malware, and the different analysis layouts that can be used to perform dynamic or automated analysis. Some of the analysis layouts discussed include bare metal analysis, virtual machine, hypervisor, emulation, volatile memory acquisition etc. Different analysis techniques are discussed along with the tools to use, for example, Function Calls Analysis (tools used are TTAnalyze, CWSandbox etc.), Execution Control (tools used are Minesweeper, VAMPiRE, Cobra etc.), Flow Tracking, Tracing, etc. A mapping of analysis layouts to techniques is also given, i.e., which analysis techniques are better suited for which layouts. Comparisons based on functionalities and Research Evaluation Measurements are presented in the paper as well.

A comparative analysis of both static and dynamic malware analysis tools is done in [13]. Various tools for static analysis (such as PE (Portable Executable) studio, PE View, IDA Pro, UPX, etc.) and for dynamic analysis (such as Process Explorer, Process Monitor, Wireshark, OllyDbg, Burp Suite, etc.) are discussed. Because offline dynamic malware analysis poses a significant threat to the machine, the study mostly uses online dynamic malware analysis tools and compares their findings, such as the detection methods used by the tools and the time taken by each tool to perform the analysis. It then presents an algorithm that may be used for the comparative analysis.

A system known as "MalGene" is discussed in [14], which automatically extracts malware evasion signatures through a process that executes the malware first in a bare metal environment and then in a virtualized environment. Its evasion signature model works in two steps: sequence alignment followed by evasion signature extraction. Taking into account the unique call requests that evasive malware makes to fingerprint the analysis environment also helps in extracting the signature. One drawback is that other factors can cause gaps in the sequence alignment, and if MalGene fails to take those factors into account, the extracted signature may not be correct. In [15], the authors have focused on designing an automated analysis tool
2022 9th International Conference on Computing for Sustainable Global Development (INDIACom) 227
for malware that specifically attacks IoT devices. The study [15] discusses various attack surfaces in IoT (Internet of Things) and the vulnerabilities used to exploit them. IoT malware is basically of three types: hardware, firmware and network, and 10 out of the 18 attack surfaces listed by OWASP (Open Web Application Security Project) included network vulnerabilities. The automation environment is designed primarily for Linux-based malware, as 71.8% of IoT devices are Linux based. The model used existing open-source Linux tools for static analysis, with a new dynamic tool integrated with them. The analysis pipeline works in three phases: static analysis, dynamic analysis and then result interpretation to classify the malware. In the result interpretation phase the tool used YARA rules to classify the malware on the basis of the vulnerabilities it utilizes, the attack surface used, the behaviour it showed and the changes it made. There was not enough rule data available for the result interpretation stage, so a trial-and-error method was used to improve the model. The model only analysed files for 5 architecture models, so it could be improved by adding another 7 architecture models to the dynamic analysis environment.

In [16], an automated malware analysis model, "Sisyfos", is developed that can be accessed through a web interface or through the command line using a simple curl command. This model uses open-source static and dynamic analysis tools to analyse a file and classify it as benign or malicious. The platform is very modular and can easily be integrated with new tools. For static analysis it uses Cuckoo Sandbox and LaikaBOSS. For dynamic analysis, the sample is analysed through Cuckoo Sandbox. The samples for the training set were taken from VirusShare. To identify the best suited algorithm, the open-source tool TPOT (Tree-based Pipeline Optimization Tool) was used, which compares results from different algorithms; in the end the Random Forest algorithm was used for classification. The platform only works for Windows and Linux files, but Android can be included in it. Also, the size of the training dataset can be increased to get more accurate results.

In [17], the researchers have introduced MIMOSA, an automated malware analysis tool that uses coverings to find a small set of configurations that together mitigate the techniques used by most stealthy malware. The approach used by MIMOSA takes care of two important things: the number of artifacts that are mitigated and the cost of deploying those mitigation techniques. The model was tested with a set of stealthy malware samples which used one to five artifacts. The malware was then categorized on the basis of the artifacts it checked for. The number of stealthy malware samples used to test the model is 1300, which is far fewer than for other malware analysis models, so testing the model with a larger number would increase the reliability of MIMOSA.

ColdPress, an extensible integrated automated malware analysis and threat intelligence system that can run multiple modules at the same time based on system configurations, is discussed in [18]. It also integrates powerful Software Reverse Engineering Frameworks, which is a first. Tools integrated in ColdPress are presented as modules such as de-compilation (i.e., extraction of readable source code from a given binary), hashing, and malware threat intelligence; tools such as the MITRE Framework are used to map the TI (Threat Intelligence) indicators to actions and tactics. The output from ColdPress is compared against two other tools used in industry, IntelOwl and Hybrid Analysis, and the data shows the pros and cons of each tool. The tool works only for PE files, and the paper does not discuss the tool's ability to perform unpacking or de-obfuscation.

In the reviewed studies, it is observed that small samples are used. In the Sisyfos tool, functionality to classify Android files can be added. The analysis of file types other than PE, and their unpacking, is a good area for further research. In the comparative analysis paper discussed, only online tools have been taken into consideration. Based on the gaps found in the research reviewed above, a methodology is suggested which overcomes these challenges and presents an automated way of analysing malware samples.

III. METHODOLOGY
Fig. 1 presents the methodology used for the present research work.

Fig. 1. Flowchart for Methodology

In order to carry out the analysis, the following steps are followed:

• Obtain 3 malware samples. Three malware samples belonging to three different groups were obtained for this comparative analysis: (a) WannaCry - Ransomware, (b) Stuxnet - Worm + Rootkit, and (c) Resource Editor - Rootkit.

• Download/Familiarize with 3 tools. The tools used in this analysis are Cuckoo Sandbox, Any.Run and Intezer Analyze. Cuckoo Sandbox was downloaded and configured for offline use, whereas Any.Run and Intezer Analyze were used on their online platforms. In this step, the platforms were explored to ascertain how they operate, and their various functionalities were studied.

• Select the functionalities to check for each tool. There are a number of functionalities that may be noted while doing malware analysis, based on the artifacts found.
In this case the functionalities that were chosen are: Registry, Network, Files, and Process.

• Run each malware sample with each tool. Every malware sample was run in all three tools, i.e., first the WannaCry sample was run in Cuckoo, Any.Run and Intezer Analyze, followed by the other two samples.

• Note down results from each sample analysis. While running the samples in the tools, whatever information could be extracted with respect to the selected functionalities was noted down.

• Make separate tables for each of the tools. A table was made for each of the three tools, with the functionalities as column names and the samples as row names.

• Prepare comparative analysis table after combining all three tables. From the results in the tables made in the previous step, a comparative analysis table was made, which includes a yes mark if the functionality was found successfully by the tool in two or more samples.

IV. ANALYSIS AND RESULTS
This section reports the experimental findings with the three different tools considered in this research work.

A. Cuckoo Sandbox
Based on the results presented in Table I, it was observed that Cuckoo Sandbox failed to provide results for the networking functionality, possibly due to the low number of samples selected for the analysis.

TABLE I. RESULTS WITH CUCKOO SANDBOX

TABLE III. RESULTS WITH INTEZER ANALYZE
Samples            Registry   Network   Files   Process
WannaCry           ✓          ✓         ✓       ✓
Stuxnet            ✓          -         ✓       ✓
Resource Editor    -          -         -       -

D. Comparative Analysis
A comparative analysis is done for the three tools to find out which tool is better at finding artifacts related to each of the selected functionalities. One extra functionality, i.e., service, has also been added to the comparative analysis due to its apparent importance and presence while analysing the malware. Based on the comparative analysis presented in Table IV, the following results are obtained.

• Cuckoo Sandbox was found to be the better tool if files dropped and created are to be looked at. Intezer Analyze is somewhat useful for the same.

• Services were noted only by Cuckoo Sandbox.

• All three tools were able to see the processes created and also flagged which processes are malicious.

TABLE IV. COMPARATIVE ANALYSIS OF THREE TOOLS
Artifacts   Cuckoo Sandbox   Any.Run   Intezer Analyze
Files       ✓                -         ✓
Network     -                ✓         -
Registry    ✓                ✓         -
Process     ✓                ✓         ✓
Service     ✓                -         -
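The aggregation rule stated in the methodology (a yes mark when a tool finds a functionality in two or more samples) can be illustrated with a minimal Python sketch. The per-sample values are transcribed from Table III for Intezer Analyze. The names `INTEZER_RESULTS`, `MIN_SAMPLES` and `summarize_tool` are hypothetical and chosen only for this illustration. A mechanical count like this shows only the stated rule and is not necessarily a reproduction of every cell of Table IV.

```python
from typing import Dict

# Per-sample findings for one tool, keyed by sample and then by functionality.
# Values are transcribed from Table III (Intezer Analyze); True means the tool
# reported artifacts for that functionality. The variable name is hypothetical.
INTEZER_RESULTS: Dict[str, Dict[str, bool]] = {
    "WannaCry":        {"Registry": True,  "Network": True,  "Files": True,  "Process": True},
    "Stuxnet":         {"Registry": True,  "Network": False, "Files": True,  "Process": True},
    "Resource Editor": {"Registry": False, "Network": False, "Files": False, "Process": False},
}

# "Two or more samples" threshold taken from the methodology text.
MIN_SAMPLES = 2


def summarize_tool(results: Dict[str, Dict[str, bool]],
                   min_samples: int = MIN_SAMPLES) -> Dict[str, bool]:
    """Return, per functionality, whether the tool found it in enough samples."""
    counts: Dict[str, int] = {}
    for findings in results.values():
        for functionality, found in findings.items():
            counts[functionality] = counts.get(functionality, 0) + int(found)
    return {functionality: n >= min_samples for functionality, n in counts.items()}


if __name__ == "__main__":
    for functionality, mark in summarize_tool(INTEZER_RESULTS).items():
        print(f"{functionality:<10} {'yes' if mark else '-'}")
```

Running the same function over one such dictionary per tool would give the columns of a combined comparison table, under the assumption that each per-tool table has been recorded in this form.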
As only three tools were analysed during this research, the number of tools as well as the number of samples for analysis can be increased in future work. Each of the tools had a few drawbacks, so in future one tool can be picked and improved to provide better functionality; for example, the ability of Any.Run to check dropped files can be improved.

Other types of malware can be included in the sample base while increasing its size. Functionality to download PDF reports from Cuckoo offline can also be added in the future.

REFERENCES
[1] A. Johansen, "What Is A Trojan? Is it a Virus or is it Malware?". https://us.norton.com/internetsecurity-malware-what-is-a-trojan.html. [Accessed: June 20, 2021].
[2] R. Grimes, "9 Types of Malware and How to Recognize Them". https://www.csoonline.com/article/2615925/security-your-quick-guide-to-malware-types.html. [Accessed: June 21, 2021].
[3] "What is a Computer Worm and How does it Work?". Us.norton.com. https://us.norton.com/internetsecurity-malware-what-is-a-computer-worm.html. [Accessed: June 20, 2021].
[4] "What is Ransomware". https://www.mcafee.com/enterprise/en-in/security-awareness/ransomware.html. [Accessed: June 21, 2021].
[5] Comodo, "Endpoint Detection". https://enterprise.comodo.com/rootkit-definition/. [Accessed: June 22, 2021].
[6] Ö. Aslan and R. Samet, "Investigation of Possibilities to Detect Malware Using Existing Tools," in Proc. of the 2017 IEEE/ACS 14th International Conference on Computer Systems and Applications (AICCSA), 2017, pp. 1277-1284.
[7] S. Jamalpur, Y. S. Navya, P. Raja, G. Tagore and G. R. K. Rao, "Dynamic Malware Analysis using Cuckoo Sandbox," in Proc. of the 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), 2018, pp. 1056-1060.
[8] M. Ijaz, M. H. Durad and M. Ismail, "Static and Dynamic Malware Analysis Using Machine Learning," in Proc. of the 2019 16th International Bhurban Conference on Applied Sciences and Technology (IBCAST), 2019, pp. 687-691.
[9] S. Ninja, "Static Malware Analysis". https://resources.infosecinstitute.com/topic/malware-analysis-basics-static-analysis/. [Accessed: June 23, 2021].
[10] A. Chailytko and S. Skuratovich, "Defeating sandbox evasion: how to increase the successful emulation rate in your virtual environment," in Proc. of the Virus Bulletin Conference, 2016.
[11] M. Lebbie, S. R. Prabhu and A. K. Agrawal, "Comparative Analysis of Dynamic Malware Analysis Tools," in Proc. of the International Conference on Paradigms of Communication, Computing and Data Sciences. Algorithms for Intelligent Systems, 2022, doi: 10.1007/978-981-16-5747-4_31.
[12] O. Or-Meir, N. Nissim, Y. Elovici and L. Rokach, "Dynamic Malware Analysis in the Modern Era - A State of the Art Survey," ACM Comput. Surv., vol. 52, 2020.
[13] A. Datta, K. A. Kumar and D. Aju, "An Emerging Malware Analysis Techniques and Tools: A Comparative Analysis," International Journal of Engineering Research and Technology, vol. 10, no. 4, 2021.
[14] D. Kirat and G. Vigna, "MalGene: Automatic Extraction of Malware Analysis Evasive Signature," in Proc. of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS-15), 2015.
[15] S. Lee, H. Jeon, G. Park and J. Youn, "Design of Automation Environment for Analysing various IoT Malware," Tehnicki Vjesnik / Technical Gazette, vol. 28, no. 3, pp. 827-835, 2021.
[16] D. Serpanos, P. Michalopoulos, G. Xenos and V. Ieronymakis, "Sisyfos: A Modular and Extendable Open Malware Analysis Platform," Applied Sciences, vol. 11, no. 7, 2021.
[17] M. Ahmadi, K. Leach, R. Dougherty, S. Forrest and W. Weimer, "MIMOSA: Reducing Malware Analysis Overhead with Coverings," arXiv:2101.07328v1, 2021.
[18] H. Tan, M. Chandramohan, C. Cifuentes, G. Bai and R. K. L. Ko, "ColdPress: An Extensible Malware Analysis Platform for Threat Intelligence," arXiv.org, 2021.