2021 International Conference on Data Analytics for Business and Industry (ICDABI)

A Static Analysis Tool for Malware Detection


Haitham Ameen Noman
King Abdullah II School of Engineering
Princess Sumaya University for Technology
Amman, Jordan
[email protected]

Qusay Al-Maatouk
School of Technology
Asia Pacific University (A.P.U.)
Kuala Lumpur, Malaysia
[email protected]

Sinan Ameen Noman
Graduate School of Computer Science
The University of Alabama
Alabama, United States of America
[email protected]

2021 International Conference on Data Analytics for Business and Industry (ICDABI) | 978-1-6654-1656-6/21/$31.00 ©2021 IEEE | DOI: 10.1109/ICDABI53623.2021.9655866

Abstract—Malware detection refers to the process of detecting the presence of malware on a host system or of distinguishing whether a specific program is malicious. The different types of malware have created new challenges for researchers trying to develop a concrete detection solution that can tackle malware effectively. Malware analysis can be classified into two methods. The first analyzes the malware statically, without executing it. The second analyzes the malware dynamically, by monitoring it during its execution in an isolated, safe environment. This paper presents a tool that performs static analysis on malware to detect its behaviour. The tool works by extracting the suspected program's APIs and checking whether those APIs are malicious. The tool showed promising results and high accuracy in telling whether the analyzed program is a keylogger, ransomware, a backdoor or benign. However, some false-positive results appeared during the tests when analyzing software such as Zoom and TeamViewer.
Index Terms—Malware analysis, Static Analysis, Reverse Engineering

I. INTRODUCTION
Nowadays, technology is developing rapidly, and a considerable number of so-called smart devices are in use, especially personal computers, laptops and smartphones. The increased usage of these devices makes them attractive targets for attackers. Cyber-attacks increase massively every day: about 269 billion emails are sent every day worldwide [1], and at least 3.4 billion phishing emails are sent daily [2]. The AV-TEST Institute counts over 350,000 new malicious executables and potentially unwanted applications daily [3]. Moreover, sophisticated malware is being created with innovative evasion techniques, and traditional antivirus software struggles to detect some of it. The main problem revolves around the time specialists spend generating signatures for newly developed malware, in addition to the expense of detecting cyber-attacks. According to global studies and reports, the world lost approximately USD 3 trillion in 2015 because of cybercrime, and the cost was expected to exceed USD 6 trillion by 2021 [3].

Malware specialists build tools to analyze new malware and add its signatures to antivirus databases. These tools are categorized into two types: static analysis tools and dynamic analysis tools. Static analysis is itself classified into two types: basic and advanced. Basic static analysis gives the analyst a signature, string, or hash that reflects the program's structure. This step is accurate and fast if the malware has been seen before (known) and might fail if it is unknown. Advanced static analysis, on the other hand, is based on the use of disassembler and debugger tools to display the instructions of the analyzed program, since these instructions give the analyst a picture of the program's structure and functionality. Although advanced static analysis detects some unknown malware, applying it requires advanced knowledge, including operating system concepts. Static analysis tools mainly use signature and pattern matching techniques [4] and require manual work to analyze and detect malware, whereas dynamic malware analysis observes the behaviour of the malware while it is running in order to determine system behaviours. For unknown malware, dynamic analysis works better than static analysis; however, static analysis is considered faster for known malware [4].

Malware can be categorized into different groups based on how it works, such as viruses, worms, ransomware, keyloggers, spyware and backdoors. A keylogger can be software, for example an API-based keylogger, which uses the operating system's API to record keyboard strokes and mouse clicks, or hardware, a physical device installed on the machine to log data and keystrokes in the same way as a software keylogger. Hardware keyloggers can be prevented by solid physical security, whereas software keyloggers need to be analyzed with static or dynamic analysis tools to understand how they work [4]. Ransomware is another type of malware, generally used to encrypt the victim's files or lock the computer and demand a sum of money in exchange for the master key that decrypts the files [4]. A backdoor is malicious code or a program that can bypass or break authentication or other security controls to access an operating system or a computer's data. Backdoors allow attackers to gain access without authentication and execute commands on the compromised system [5].

II. MALWARE STATIC ANALYSIS
A. Keylogger
Keyloggers can be grouped into two categories: user-mode keyloggers, which record user input using the Win32 API, and kernel-mode keyloggers, which are implemented as a filter driver or device driver. It is possible to record the user's keystrokes by using

Authorized licensed use limited to: Universiti Malaysia Perlis. Downloaded on December 19,2024 at 17:41:22 UTC from IEEE Xplore. Restrictions apply.
the SetWindowsHookEx API or the GetAsyncKeyState API. GetAsyncKeyState, when called, determines whether a key is up or down, and can be used to record keystrokes from A-Z and 0-9. SetWindowsHookEx, on the other hand, installs an application-defined hook procedure into a hook chain; examples of hook chains are mouse hooks and keyboard hooks [5]. A keylogger usually appears as a process in Task Manager; for that reason, malware authors try to hide it inside another, legitimate program. To do that, they use the CreateRemoteThread API, which creates a thread in the virtual address space of the targeted program; the thread then records the user's keystrokes. When SetWindowsHookEx is used with a global hook, it targets a remote process by injecting a DLL into its address space [5].

B. Ransomware
Ransomware is malicious software that infects computers and mobile devices in different ways, such as through spam emails. When executed on the targeted machine, it prevents the user from accessing their data or the whole system unless they pay the ransom demanded by the malware author [6]. Ransomware can be categorized into two types. Locker ransomware locks the user's computer until the ransom is paid, such as CryptoLocker. Crypto ransomware encrypts the user's files and data until the ransom is paid, such as Cerber [6]. Traditional malware detection in antivirus software is signature-based. Some executables are obfuscated or packed, which makes it difficult to detect newly emerged ransomware effectively. Ransomware detection is performed here by testing more than one type of ransomware and detecting the APIs embedded within it. The following are some APIs commonly used in ransomware [7]:
1. FindFirstChangeNotificationA: creates a change notification handle and sets a change condition. A wait on the handle succeeds when a change matching the condition occurs.
2. SHEmptyRecycleBinA: empties the Recycle Bin on the targeted drive.
3. SHFileOperation: moves, deletes, renames, or copies a file system object.
4. SHLoadInProc: loads a DLL into the Windows Explorer process.
5. SHBrowseForFolder: shows a dialogue box that lets the user select a Shell folder.
6. SHGetFileInfo: returns information about an object in the file system.
7. SHQueryRecycleBinA: returns the size of the Recycle Bin and the number of items in it for a targeted drive.
8. SHPathPrepareForWriteA: checks whether the path exists.
9. EncryptFileA: encrypts a file or directory and all new files created in an encrypted directory.
10. DecryptFileA: decrypts an encrypted file or directory.
11. FileEncryptionStatusW: retrieves the encryption status of a file.
12. CryptGenKey: generates a random cryptographic session key or a public/private key pair.
13. BCryptGenerateSymmetricKey: creates a key object for use with a symmetrical key encryption algorithm from a supplied key.
14. CryptEncrypt: encrypts the different types of data.
15. CryptDestroyKey: releases the handle referenced by the hKey parameter.

C. Backdoor
A backdoor is mainly used as a way of bypassing or breaking through authentication to access a computer system or its data. A backdoor can be on a system or within an application. (a) System backdoors: this type sometimes relies on social engineering to be delivered to the victim and executed. (b) Application backdoors: this type modifies legitimate software to bypass security mechanisms; application backdoors may fully compromise the targeted machine. This type of backdoor mainly targets web applications, network server processes, operating systems and network appliances [8]. Application backdoors are mainly detected by statically inspecting the binary or the source code, keeping in mind that backdoor mechanisms can be heavily obfuscated.

On Microsoft Windows, it is necessary to know that the Portable Executable (PE) format comprises multiple sections. Each part of the PE provides different information, such as the file header and the number of DLLs and APIs. However, malware authors sometimes modify the PE file header so that it looks quite different from the original program; the API calls are modified too [9]. This drastic change in the PE consequently affects the analysis that determines the malicious behaviour and the type of malware. The primary purpose of this research is to extract API calls from Windows executable programs (PE) using static analysis. We created a tool that automatically extracts the APIs from a PE and identifies the malware behaviour accordingly.

III. DESIGN
The design of our tool, as mentioned above, is based on static analysis. The tool disassembles a portable executable (PE) file from machine code into assembly code and then checks whether the PE is packed. If so, the tool unpacks the PE; it then analyzes the API calls to check whether the PE is malware (ransomware, keylogger or backdoor) or not. After that, the tool decompiles the PE machine code into a high-level language and generates a summary report. One limitation of the tool is that it can unpack only one type of packer, UPX. This means that if the analyzed malware is packed with any other packer, the tool fails to extract the required information. The tool supports only three malware families: keyloggers, ransomware and backdoors [10]. Figure 1 illustrates the mechanism of the tool.

As mentioned before, Application Programming Interfaces (APIs) are functions used to execute standard low-level system functions. API functions are stored in Dynamic Link Libraries (DLLs) such as Kernel32.dll, User32.dll, sechost.dll and GDI32.dll. These DLLs contain functions that perform many tasks, such as creating and reading files, starting and ending processes, creating and accessing network communication, and allocating memory. Security analysts aim to extend the functionality of any application, but usually, the source code
of an application is not available, so the tool was developed to analyze each DLL one by one and extract the APIs used by the suspicious application. The extracted APIs were studied for more than two samples of each family. Keyloggers: Custom Keylogger, TusharPardhe's Keylogger and Master (GiacomoLaw keylogger). Ransomware: WannaCry, CryptoLocker and Cerber. The final stage of the analysis checks other information within the analyzed PE, such as ASCII and Unicode strings and hashes. Strings are printable characters embedded within the application; they enrich the analysis by giving the analyst clues. Microsoft provides a program called "strings.exe" that searches an executable for ASCII or Unicode strings of three or more characters. By linking strings.exe with our tool, the tool can automatically search for the most suspicious strings in the malware. Figure 2 shows the complete structure of our tool, which combines the two stages of analysis.

Fig. 1. The Flow Diagram of the Tool

IV. RESULTS
As a safety measure, we analyzed the suspicious programs in an isolated environment. The tested data in this research are Windows executable programs. The environment comprises a fully updated and patched virtual machine running the Windows operating system. We configured the machine's network settings to "Host-only" to prevent the malware from communicating with other network devices. We ran the tool against several types of suspicious programs.

The first malware selection was a keylogger; the tool took six seconds to complete the analysis. No packer was found, five APIs related to keyloggers were found, and the tool's decision was clearly "a Keylogger". When we uploaded the suspicious program to the online database VirusTotal for comparison, the result was also Keylogger, so we can safely say that the tool successfully analyzed this particular malware. Figure 3 shows that the tool found five malicious APIs related to keyloggers, and its decision was based on these five extracted APIs. After analyzing various keyloggers, the common behaviour we noticed was that they all use the same APIs responsible for keystrokes and mouse clicks, file manipulation APIs, and network APIs such as socket and file transfer. This is summarized in Figure 4.

The second selection was ransomware. The tool was fed the suspicious program and took four seconds to complete the task. No packer was found, four APIs related to ransomware were found, and the tool decided accordingly that it was ransomware. Figure 5 shows that no packer was found in this malware, together with some general information: the malware's architecture, time stamp, number of sections, file type and calculated hash. In total, five malicious APIs were found in the malware: four related to ransomware and one related to backdoors. Since four APIs were related to ransomware, the tool classified the malware as ransomware. After testing all the ransomware samples with our tool, we concluded that all of them used encryption, decryption and acquisition APIs, and used network APIs to spread to the victim's network. Figure 6 summarizes the behaviour of three ransomware samples based on their APIs. On average, the tested ransomware used eight networking APIs, almost eight encryption/decryption APIs, five file acquisition APIs and five file manipulation APIs.

The third and last selection was a backdoor. The tool took ten seconds to complete the analysis. No packer was found, three backdoor APIs were found, and the tool's result was Backdoor; the online database result was also Backdoor, so the test was successful. Figure 7 shows some of the APIs extracted from the malware and lists the malicious APIs found: four in total, three related to backdoors and one related to ransomware. Since three of the four APIs were related to backdoors, the tool identified the malware as "a backdoor". We ran the tool against three different backdoor samples and concluded that all of them perform network activity to gain access to, and gather information from, the victim's machine. Figure 8 summarises the result.

We calculated the accuracy of the tool from these tests, separately for each malware type:
1. Ransomware: three samples were tested; one failed and two passed, so the accuracy is 2/3, or 66.7%.
2. Keylogger: eight samples were tested; all were successful, so the accuracy is 8/8, or 100%.
3. Backdoor: three samples were tested; all three were successful, so the accuracy is 3/3, or 100%.

When using the tool to analyze a non-malware PE, a false positive result might
occur simply because some innocuous programs need functionality such as recording keystrokes or performing encryption, without any bad intention. If such programs are analyzed using our static analysis tool, the tool might identify the benign program as a malicious one.

Fig. 2. The complete Structure of the Tool
Fig. 3. Keylogger Sample Analysis
Fig. 4. Keyloggers' APIs behavior analysis

For example, the tool has been used to analyze the Zoom and TeamViewer software. While scanning Zoom, the tool
has identified it as a potential backdoor due to the APIs used in its development. TeamViewer, on the other hand, has been identified as a potential keylogger because of the keystroke APIs that it uses. We conclude that these APIs can be used by benign software as well as by malicious software.

Fig. 5. Ransomware Sample Analysis
Fig. 6. Ransomware APIs behavior analysis chart
Fig. 7. Backdoor's APIs Behavior Analysis

V. CONCLUSION
In this paper, we presented a tool that can identify three types of malware by relying on static analysis techniques. The tool uses two stages of analysis before decompiling the malware back to source code. Although the tool has shown high accuracy in detecting malware, it can sometimes produce false-positive results when analyzing a benign PE. To reduce false positives, the tool could be enhanced so that it does not rely only on APIs during its detection phase, for example by adding extra detection layers such as scanning strings and hashes. Additionally, the tool can be expanded to detect more types of malware, such as spyware, trojan horses, viruses and worms.

REFERENCES
[1] "How Many Emails Are Sent Per Day," Campaign Monitor, May 2019. [Online]. Available: https://www.campaignmonitor.com/blog/email-marketing/shocking-truth-about-how-many-emails-sent/. [Accessed: 31-Aug-2021].
[2] M. Shuaib, S. i. M. Abdulhamid, O. S. Adebayo, O. Osho, I. Idris, J. K. Alhassan, and N. Rana, "Whale optimization algorithm-based email spam feature selection method using rotation forest algorithm for classification," SN Applied Sciences, vol. 1, no. 5, pp. 1-17, 2019.
[3] M. H. Nguyen, D. Le Nguyen, X. M. Nguyen, and T. T. Quan, "Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning," Computers & Security, vol. 76, pp. 128-155, 2018.
[4] Ö. Aslan, "Performance comparison of static malware analysis tools versus antivirus scanners to detect malware," in International Multidisciplinary Studies Congress (IMSC), 2017.
[5] K. Nasaka, T. Takami, T. Yamamoto, and M. Nishigaki, "A keystroke logger detection using keyboard-input-related API monitoring," in 2011 14th International Conference on Network-Based Information Systems, 2011, pp. 651-656: IEEE.
[6] S. Sheen and A. Yadav, "Ransomware detection by mining API call usage," in 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2018, pp. 983-987: IEEE.
[7] P. Bajpai and R. Enbody, "An Empirical Study of API Calls in Ransomware," in 2020 IEEE International Conference on Electro Information Technology (EIT), 2020, pp. 443-448: IEEE.
[8] C. Wysopal, C. Eng, and T. Shields, "Static detection of application backdoors," Datenschutz und Datensicherheit-DuD, vol. 34, no. 3, pp. 149-155, 2010.
[9] M. Alazab, S. Venkataraman, and P. Watters, "Towards understanding malware behaviour by the extraction of API calls," in 2010 Second Cybercrime and Trustworthy Computing Workshop, 2010, pp. 52-59: IEEE.
[10] H. A. Noman, S. I. S. Al Satary, O. S. W. Al-Quraini, M. A. Yousfi, and Q. Al-Maatouk, "Designing and Implementing a Secured Smart Network That Can Resist Next-Generation State Surveillance," Journal of Theoretical and Applied Information Technology, vol. 99, no. 02, 2021.
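
The decision rule reported in Sections III and IV, in which the malware family with the most matching APIs determines the verdict, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The keylogger and ransomware API names are taken from Section II; the paper does not list its backdoor APIs, so the Winsock names below are placeholders chosen purely for illustration. The function name `classify`, exact-name matching, and the tie-breaking order are also assumptions.

```python
from collections import Counter

# API names per family. Keylogger and ransomware entries come from Section II.
FAMILY_APIS = {
    "keylogger": {"GetAsyncKeyState", "SetWindowsHookEx", "CreateRemoteThread"},
    "ransomware": {
        "FindFirstChangeNotificationA", "SHEmptyRecycleBinA", "SHFileOperation",
        "SHLoadInProc", "SHBrowseForFolder", "SHGetFileInfo",
        "SHQueryRecycleBinA", "SHPathPrepareForWriteA", "EncryptFileA",
        "DecryptFileA", "FileEncryptionStatusW", "CryptGenKey",
        "BCryptGenerateSymmetricKey", "CryptEncrypt", "CryptDestroyKey",
    },
    # ASSUMPTION: the paper does not name its backdoor APIs; these Winsock
    # functions are hypothetical stand-ins for "network behaviour".
    "backdoor": {"WSAStartup", "socket", "bind", "listen", "accept", "connect"},
}


def classify(imported_apis):
    """Return (label, per-family hit counts) for a list of imported API names."""
    counts = Counter()
    for name in set(imported_apis):  # count each distinct API once
        for family, apis in FAMILY_APIS.items():
            if name in apis:
                counts[family] += 1
    if not counts:
        return "benign", counts
    # The family with the most hits wins. Ties fall back to the order of
    # FAMILY_APIS, which is an assumption; the paper does not specify one.
    label = max(FAMILY_APIS, key=lambda family: counts[family])
    return label, counts


# Mirrors the ransomware sample in Section IV: four ransomware APIs, one backdoor API.
sample = ["EncryptFileA", "CryptGenKey", "CryptEncrypt", "CryptDestroyKey",
          "connect", "CreateFileW"]
label, counts = classify(sample)
print(label)  # ransomware
```

A rule of this kind also makes the reported false positives easy to see: any benign program that imports keystroke or networking APIs collects hits in the same way a malicious one does, which is why the conclusion suggests adding string and hash checks as further layers.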
