A Survey On Malware and Malware Detection Systems
Ali Abdelrahman
Prince Sattam bin Abdulaziz University
All content following this page was uploaded by Imtithal Saeed on 05 December 2015.
International Journal of Computer Applications (0975 – 8887)
Volume 67– No.16, April 2013
Obfuscated malware includes polymorphic and metamorphic malware, in which the original code is transformed into a form that is functionally the same but much more difficult to understand. The obfuscation techniques used are dead-code insertion (inserting some amount of code that accomplishes nothing), code transposition (inserting jumps into the program while its control flow remains the same), register renaming (replacing the register used in an instruction with another, unused one), and the toolkit paradigm, in which a set of malware variants is used and one type of malware is generated at each infection.

Remote execution of malware is carried out by hackers to achieve their aims remotely, using the infrastructure of the Internet and taking advantage of existing methods of remote execution.

4. MALWARE CLASSES
Several malware classifications have been proposed so far, each depending on certain characteristics of the malware. The purpose of such classifications is to facilitate tracking authorship, correlating information, and identifying new variants [30]. In this paper, however, the classification is based on the use of networks and the Internet, which serve either as the execution environment of malware or as its means of propagation. The idea behind this classification is that the use of networks and the Internet requires this type of software to be dealt with in special ways, for example through intrusion detection systems (IDS), prevention of SQL attacks, detection of worms spreading on a LAN, and real-time classification of malware. The classification groups the major common malware types according to their network and web usage.

4.1 Network-based Malware
Spyware is a kind of malware that is installed secretly on a user's computer to collect information about the user without their knowledge [31]. Even reputable software vendors such as Microsoft and Google intentionally collect information about their users by means of spyware [10].

Adware is short for advertising-supported software: software packages that automatically display advertisements on a user's computer without the user's consent. The objective of adware is to generate financial profit for its author. Adware is not harmful by nature, but it may take the form of pop-up windows that interrupt the user's concentration. Some adware comes with integrated spyware such as keyloggers and other privacy-invasive software [32].

Cookies are pieces of information stored on a user's computer by the web browser. The main purposes of a cookie are to authenticate users based on the stored information, to store site preferences, and to maintain server-based sessions. Cookies are sent by a web server to a client as a field in the header of the HTTP response, and are then sent back unchanged by the browser to the server with each subsequent request. Cookies are not executable, because they are plain text only, but they may be persistent or may expire at a specific date and time. Thus they are not harmful by themselves, but they can be used by spyware.

Backdoors, also called trapdoors, are malicious code written into applications or operating systems with the intention of granting programmers access to the system without requiring them to go through the ordinary methods of authentication. They are written by experts or specialized developers for friendly use. The security problem with trapdoors is that they grant full access without authentication, so these programs can be used remotely by adversaries to mount attacks.

A Trojan horse is code that appears to be a useful program but actually steals information or corrupts data [11, 32].

Sniffers are computer programs that can intercept and record traffic over a network. They capture each packet in order to decode it and obtain the raw data, showing the values of the various fields in the packet and analyzing its contents. Sniffer code can be used as an initial step toward an intrusion attack.

Spam, also known as junk email, refers to identical messages broadcast by software to numerous targeted recipients by email. Spam can slow a system down as large numbers of messages arrive, and it can also consume bandwidth. In some cases it is used in place of adware. In the United States, however, spam was declared legal by the CAN-SPAM Act of 2003, provided that the message meets certain specifications [33].

A botnet is a collection of infected computers, each with bot software embedded in it, that have been taken over by a hacker and are used to perform malicious functions without the hacker having to log into the client computers. A botnet can mount a DoS attack in which many client bots, under the control of the hacker's bot, each play a part in the attack [11, 31].

4.2 Ordinary Malware
A virus is any software code that is able to replicate itself, during infection, into another application or document. Viruses can perform harmful functions on a user's machine and can even destroy the whole system. Virus code is attached to an application program using one or more of three methods (pre-pending, embedding and post-pending). This type of malware uses the local file system to carry malicious code from an infected device to an uninfected one [11, 31, 32].

Example: The autorun.inf file resides on removable storage media for the purpose of playing the disk automatically. Malware developers target this file, placing their malicious code in it in place of the original content. When the removable disk is inserted, the operating system searches for "autorun.inf" and runs it, which makes infection practically unavoidable. In general, basic types of virus can be detected successfully by signature-based scanners, provided that signatures are available.

A worm is any software code that is able to replicate itself on a victim computer. Worms are independent; they do not need a host program to start their lifecycle. A worm can consume network bandwidth, preventing legitimate users from using it. Worms create new copies of themselves to increase their rate of spread in a system. AV scanners can make use of this characteristic to detect malware: when there are several files with the same attributes, this might be a sign of malware infection [11, 31, 32].

A logic bomb is a software program that remains dormant until a specific condition is met. The most common trigger for a logic bomb is a date and time. The logic bomb checks the system date and time regularly to see whether it must be activated. If so, it activates and executes its code [31].

From the preceding discussion, it is evident that malware that depends on networks dominates the current state of malware infection. Such malware can also have a major impact, including disclosure of confidential data, denial of online services, and sabotage of files. Table 1 summarizes the malware classes and their properties.
Table 1. Malware classes and their properties

                               Malware family
Factors of comparison          Spyware   Adware    Cookies   Trapdoor  Trojan    Sniffers  Spam      Botnet    Logic     Worm      Virus
                                                                       horse                                   bomb
Creation techniques
  Pattern                      ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔
  Obfuscated                   ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔
  Polymorphic                  ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔
  Toolkit                      ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔
Execution environment
  Network                      ✔         ✔         ✔         ✔         ✖         ✔         ✔         ✔         ✔         ✔         ✖
  Remote execution through web ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✖         ✖         ✖
  PC                           ✖         ✖         ✖         ✖         ✖         ✖         ✖         ✖         ✔         ✔         ✔
Propagation media
  Network                      ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔
  Removable disks              ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔
  Internet downloads           ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔         ✔
Negative impacts
  Breaching confidentiality    ✔         ✖         ✔         ✖         ✔         ✔         ✖         ✖         ✖         ✖         ✖
  Inconveniencing users        ✖         ✔         ✖         ✖         ✖         ✖         ✔         ✖         ✖         ✖         ✖
  Denying services             ✖         ✖         ✖         ✔         ✖         ✖         ✔         ✔         ✔         ✔         ✔
  Data corruption              ✖         ✖         ✖         ✔         ✖         ✖         ✔         ✔         ✔         ✖         ✔
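The worm heuristic noted in Section 4.2, where several files sharing the same attributes may indicate infection, can be illustrated with a minimal sketch. The Python below is not taken from the paper: it assumes that "same attributes" means identical size and identical content hash, and the function name find_replicated_files, the min_copies threshold, and the in-memory mapping of paths to contents (a stand-in for reading files from disk) are hypothetical choices made for illustration.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple


def find_replicated_files(files: Dict[str, bytes], min_copies: int = 3) -> List[List[str]]:
    """Group paths whose contents are byte-identical and report the large groups.

    `files` maps a path to its contents (a stand-in for reading from disk).
    A group is reported only when it has at least `min_copies` members.
    """
    groups: Dict[Tuple[int, str], List[str]] = defaultdict(list)
    for path, content in files.items():
        # Assumed notion of "same attributes": same size and same SHA-256 digest.
        key = (len(content), hashlib.sha256(content).hexdigest())
        groups[key].append(path)
    # Sort paths and groups so the result is deterministic.
    return sorted(sorted(paths) for paths in groups.values() if len(paths) >= min_copies)


if __name__ == "__main__":
    sample = {
        "a/x.exe": b"payload",
        "b/y.exe": b"payload",
        "c/z.exe": b"payload",
        "doc.txt": b"hello",
    }
    for group in find_replicated_files(sample):
        print("possible self-replication:", group)
```

Identical files are common for benign reasons (backups, shared libraries), so a hit here is only a hint that warrants further analysis, in line with the text's wording that it "might be a sign" of infection.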
MALWARE DETECTION
A malware detection program D is a computational function that operates on a domain containing a collection of application programs P, which includes both malicious and benign programs. The detector D analyzes a program p belonging to the set P to determine whether it is benign (a normal program) or malware (a malicious program) [11]. Formally, the definition can be written as:

\[
D(p) =
\begin{cases}
\text{malicious} & \text{if } p \text{ contains malicious code} \\
\text{benign} & \text{otherwise}
\end{cases}
\]

This function represents the main function of a malware detection program. The detection program determines the identity of a program by analysis or by identification. Sometimes, however, the function may produce false positives, false negatives, or undecidable objects, depending on the efficiency of D. The function can therefore be rewritten as:

\[
D(p) =
\begin{cases}
\text{malicious} & \text{if } p \text{ contains malicious code} \\
\text{benign} & \text{if } p \text{ is a normal program} \\
\text{undecidable} & \text{if } D \text{ fails to determine } p
\end{cases}
\]

... Alamos National Laboratory [36]. W&S was a rule-based system built on statistical analysis and used for anomaly detection. In 1990, the Time-based Inductive Machine (TIM) performed anomaly detection using inductive learning of sequential user patterns, implemented in Common Lisp on a VAX 3500 computer [37]. The Network Security Monitor (NSM) applied masking on access matrices for anomaly detection on a Sun-3/50 workstation [38]. The Information Security Officer's Assistant (ISOA) was a prototype that deployed a variety of strategies, including statistics, a profile checker, and an expert system [39].

4.4 Mechanism of Malware Detection
Software companies develop detection products in their laboratories, where they keep track of new programs, analyze them, and place valid software on a white list and malicious software on a black list. Software that cannot be decided either way goes on a so-called gray list, and the scanners run it in a controlled environment for further classification [3].

When analysis of a program on the gray list reveals new malware, the company releases online updates covering the new malicious software. Users can then update their product databases remotely over an Internet connection.
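The white list, black list, and gray list mechanism described above, together with the three possible outcomes of the detector D (malicious, benign, undecidable), can be sketched as a simple lookup. This is an illustrative sketch only, not the mechanism of any particular vendor: it assumes programs are identified by their SHA-256 digest, and the names Verdict, sha256_hex, and detect are hypothetical.

```python
import hashlib
from enum import Enum
from typing import Set


class Verdict(Enum):
    BENIGN = "benign"
    MALICIOUS = "malicious"
    UNDECIDABLE = "undecidable"  # candidate for the gray list


def sha256_hex(program: bytes) -> str:
    """Identify a program by the SHA-256 digest of its bytes (an assumption)."""
    return hashlib.sha256(program).hexdigest()


def detect(program: bytes, white_list: Set[str], black_list: Set[str]) -> Verdict:
    """Return the verdict of a list-based detector D for one program."""
    digest = sha256_hex(program)
    if digest in black_list:
        return Verdict.MALICIOUS
    if digest in white_list:
        return Verdict.BENIGN
    # Unknown to both lists: defer to analysis in a controlled environment.
    return Verdict.UNDECIDABLE


if __name__ == "__main__":
    known_good = b"trusted installer"
    known_bad = b"known malicious sample"
    white = {sha256_hex(known_good)}
    black = {sha256_hex(known_bad)}
    for sample in (known_good, known_bad, b"never seen before"):
        print(detect(sample, white, black).value)
```

Checking the black list first is a design choice made here so that a digest present in both lists by mistake is treated as malicious; a program returned as undecidable would then be run in a controlled environment, and the lists updated with the result.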
... effectiveness. However, the main disadvantage of signature-based techniques is that they cannot defend against unknown malware [3].

Anomaly-based systems detect any kind of computer misuse that falls outside the ordinary activity of a computer system, while signature-based systems detect malware that has a fingerprint in their databases [37, 40]. Anomaly-based systems detect malicious software by monitoring system activities and classifying them as either normal or anomalous. The pivotal difference between the two is that anomaly-based detection uses classification to detect malware, whereas signature-based detection uses patterns [41].

4.5.2 Heuristic based techniques
Artificial intelligence (AI) has been used with signature-based and anomaly-based techniques to enhance their efficiency. Neural networks (NNs) have been adopted for their adaptability to environmental changes and their predictive ability [42]. Fuzzy logic is an artificial intelligence approach derived from fuzzy theory, which uses approximate reasoning rather than precise classical logic. Genetic algorithms are another machine learning-based technique, used in the malware detection process to derive classification rules and to select appropriate features or optimal parameters. They apply principles of evolutionary biology such as inheritance, mutation, selection and combination. The main advantage of this technique is that solutions are derived from multiple directions with no need for prior knowledge of system behavior [43].

Statistical and mathematical techniques are used in malware detection by applying statistical and mathematical models to information about system activities such as network connections, bandwidth, memory usage, and the system calls used by objects [7, 42].

4.6 Malware Detection Technologies
Host-based intrusion detection systems monitor the dynamic behavior and state of a specific computer system to see whether any internal or external activities violate the system policy. Malware detection systems of this kind are commonly called "in-the-box" because they reside on the same host that they monitor [13].

Network-based intrusion detection systems (IDS) sniff all the packets on network nodes for analysis. In this type, a single sniffer module is placed in each network segment to monitor traffic in that segment. In contrast, a distributed network-based intrusion detection system has multiple modules, one placed in each node, to monitor traffic in those nodes [44]. Network-based malware detection systems are commonly called "out-of-the-box" because they reside outside the host that they monitor [19].

Hybrid intrusion detection systems combine host-based and network-based capabilities. This type of IDS consists of multiple subsystems located on separate nodes in the network, which monitor and gather data from those nodes. The data collected by these subsystems is sent to the main system for analysis and classification [45]. As regards effectiveness, both host-based and network-based detection systems have drawbacks: a host-based system protects the internal system effectively but is susceptible to external attack, while a network-based system can prevent external attack but cannot protect the inside of a host [45].

According to [19], a virtual machine (VM) is defined as an efficient, isolated duplicate of a real machine, characterized by conformity with the original system, efficiency, and full control of system resources. Virtual machine-based malware detection systems are built on this concept. Three classes of VM are used by malware detection systems. The first is the sandbox, in which computer resources must be reached through a specific API provided by the VM: the system receives a suspicious executable program from a user, analyzes its behavior by running it in a controlled environment (the sandbox), and sends an analysis report back to the user who submitted it. The second is emulation, in which the entire computer system is simulated in order to run the guest operating system, and the VMM provides programs with an execution environment that is identical to the original machine except for differences caused by the availability of system resources or by timing dependencies; efficiency is the core characteristic of emulators. An emulator is a piece of software that acts as hardware (for example, a CPU emulator simulates CPU functionality in software). The emulator does not execute code directly; instead, instructions are intercepted by the emulator and translated into a corresponding sequence of instructions compatible with the target platform. System emulators are hidden from detection code, so they are regarded as a suitable environment for malware analysis. The third class is the native system virtual machine, in which a virtual machine monitor (VMM), a small piece of privileged code, runs the VM on the host computer. This characteristic gives a native VM good performance, but makes it liable to errors and raises tamper-resistance concerns [14].

Agent-based intrusion and malware detection systems depend on characteristics of agent technology such as autonomy, decentralization, platform independence, scalability and mobility. They benefit from the notion that having no central station means having no central point of failure [21-23, 25, 46-49].

Both host-based IDS and distributed IDS designs suffer from drawbacks: a host-based IDS is effective internally but cannot detect outsider attacks, while a distributed IDS is effective externally but does not deal with internal attacks. Agent-based systems were devised to combine the characteristics of both host-based and distributed IDS [22].

Web-based scanning is provided by vendors who maintain websites with detection capabilities for scanning an entire local computer system, critical areas only, local disks, folders or files. Online scanning is a good option for those who do not want to run antivirus applications on their computers. Malicious software sometimes first attacks and disables any existing antivirus software and only then begins its attack. Turning to an online resource that is not already installed on the infected computer can be a reasonable solution [50].

An application protocol-based intrusion detection system (APIDS) is an intrusion detection system that focuses its monitoring and analysis on a specific application protocol. The system monitors the dynamic behavior and state of the application protocol. It consists of a service or an agent that sits between a group of servers, monitoring and analyzing the application protocol between them. A typical place for an APIDS would be between a web server and the database management system, monitoring the SQL protocol specific to the middleware/business logic as it interacts with the database.

Anti-spam systems are used to prevent e-mail spam. Both end users and administrators play a role in handling spam, alongside the embedded techniques applied automatically by e-mail server systems. Anti-spam techniques can be classified into four categories: those that require
actions by end-users, those that can be managed by e-mail administrators, those that can be automated by e-mail senders, and those deployed by researchers and law enforcement officials [33].

Multi-agent P2P intrusion detection is an agent-based, service-oriented system that makes use of a distributed security policy and distributed intrusion detection, on an architecture that provides an interactive environment for decision making [21].

Special tools for virus removal are available to help remove stubborn infections or certain types of infection. Examples include Trend Micro's Rootkit Buster and the rkhunter tool, which scans for rootkits on an Ubuntu Linux computer.

From the preceding discussion, it is clear that malware detection systems have evolved considerably over the past few decades. They have progressed from static programs based on static data analysis and regular algorithms to complex algorithms based on sophisticated techniques that use statistical and mathematical models and artificial intelligence. They have also been extended with technological solutions such as cloud computing, virtual machines, network-based applications and agent-based technology. Table 2 illustrates some of the malware detection systems.
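The anomaly-based and statistical techniques discussed above classify system activity as either normal or anomalous. As a minimal sketch, and not the method of any system cited here, the Python below flags an observation whose z-score against a baseline of earlier measurements exceeds a threshold. The metric (for example, network connections per minute), the function name is_anomalous, and the default threshold of 3.0 are assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(baseline: Sequence[float], observation: float, threshold: float = 3.0) -> bool:
    """Flag `observation` if it lies more than `threshold` sample standard
    deviations away from the mean of `baseline`."""
    if len(baseline) < 2:
        raise ValueError("baseline needs at least two measurements")
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation
    if sigma == 0:
        # A perfectly constant baseline: any deviation at all is unusual.
        return observation != mu
    return abs(observation - mu) / sigma > threshold


if __name__ == "__main__":
    # Hypothetical baseline: connections per minute during normal operation.
    normal_activity = [10, 12, 11, 9, 10, 11, 12, 10]
    for value in (11, 50):
        label = "anomalous" if is_anomalous(normal_activity, value) else "normal"
        print(value, label)
```

Unlike a signature lookup, this approach needs no prior fingerprint of the malware, which is why it can react to unknown threats; the price is that an unusual but legitimate workload can also be flagged, producing false positives.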
REFERENCES
[1] McAfee Labs, 2013 Threats Predictions. 2013.
[2] Berkenkopf, R.B.S., G-Data Malware Report. 2010.
[3] Ye, Y., et al., Intelligent file scoring system for malware detection from the gray list, in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. 2009, ACM: Paris, France. p. 1385-1394.
[4] Rieck, K., Malheur: A novel tool for malware analysis. 2012.
[5] Pinz, C.I., et al., Improving the security level of the FUSION@ multi-agent architecture. Expert Syst. Appl., 2012. 39(8): p. 7536-7545.
[6] Ammar Ahmed E. Elhadi, M.A. Maarof, and A.H. Osman, Malware Detection Based on Hybrid Signature Behaviour Application Programming Interface Call Graph. American Journal of Applied Sciences, 2012. 9(3): p. 283-288.
[7] Kevadia Kaushal, P.S., Nilesh Prajapati, Metamorphic Malware Detection Using Statistical Analysis. International Journal of Soft Computing and Engineering (IJSCE), 2012. 2(3).
[8] Yanfang Ye, T.L., Shenghuo Zhu, Weiwei Zhuang, Egemen Tas, Umesh Gupta, Melih Abdulhayoglu, Combining file content and file relations for cloud based malware detection, in Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. 2011, ACM: San Diego, California, USA. p. 222-230.
[9] Christodorescu, M., et al., Semantics-Aware Malware Detection, in Proceedings of the 2005 IEEE Symposium on Security and Privacy. 2005, IEEE Computer Society. p. 32-46.
[10] Yin, H., et al., Panorama: capturing system-wide information flow for malware detection and analysis, in Proceedings of the 14th ACM conference on Computer and communications security. 2007, ACM: Alexandria, Virginia, USA. p. 116-127.
[11] Vinod, P., et al., Survey on Malware Detection Methods. 2009.
[12] Zeltser, L., What is cloud Anti-Virus and how it does work.
[13] Jiang, X., X. Wang, and D. Xu, Stealthy malware detection through vmm-based "out-of-the-box" semantic view reconstruction, in Proceedings of the 14th ACM conference on Computer and communications security. 2007, ACM: Alexandria, Virginia, USA. p. 128-138.
[18] Ahmed, M., et al. NIDS: A Network Based Approach to Intrusion Detection and Prevention. in Computer Science and Information Technology - Spring Conference, 2009. IACSITSC '09. International Association of. 2009.
[19] Garfinkel, T. and M. Rosenblum, A virtual machine introspection based architecture for intrusion detection. 2003: p. 191-206.
[20] Lagar-Cavilla, H.A., Flexible Computing with Virtual Machines. 2009.
[21] Gorodetsky, V., et al., Multi-agent Peer-to-Peer Intrusion Detection, in Computer Network Security, V. Gorodetsky, I. Kotenko, and V.A. Skormin, Editors. 2007, Springer Berlin Heidelberg. p. 260-271.
[22] Ye, D., An Agent-Based Framework for Distributed Intrusion Detections. 2009.
[23] Ou, C.-M. and C.R. Ou, Agent-Based immunity for computer virus: abstraction from dendritic cell algorithm with danger theory, in Proceedings of the 5th international conference on Advances in Grid and Pervasive Computing. 2010, Springer-Verlag: Hualien, Taiwan. p. 670-678.
[24] Bijani, S. and D. Robertson, Intrusion detection in open peer-to-peer multi-agent systems, in Proceedings of the 5th international conference on Autonomous infrastructure, management, and security: managing the dynamics of networks and services. 2011, Springer-Verlag: Nancy, France. p. 177-180.
[25] Dong, H., et al. Research on adaptive distributed intrusion detection system model based on Multi-Agent. in Computer Science and Automation Engineering (CSAE), 2011 IEEE International Conference on. 2011.
[26] Ou, C.M., Multiagent-based computer virus detection systems: abstraction from dendritic cell algorithm with danger theory. Springerlink, 2011.
[27] Paritosh Das, R.N., A Temporal Logic Based Approach to Multi-Agent Intrusion Detection and Prevention. 2012.
[28] McGraw, G. and G. Morrisett, Attacking Malicious Code: A Report to the Infosec Research Council. IEEE Softw., 2000. 17(5): p. 33-41.
[29] Xufang, L., P.K.K. Loh, and F. Tan. Mechanisms of Polymorphic and Metamorphic Viruses. in Intelligence and Security Informatics Conference (EISIC), 2011 European. 2011.
[30] Carrera, E. and P. Silberman, State of Malware: Family Ties. 2010.
[31] Egele, M., et al., A survey on automated dynamic malware-analysis techniques and tools. ACM Comput. Surv., 2008. 44(2): p. 1-42.
[32] Idika, N. and A.P. Mathur, A Survey of Malware Detection Techniques. 2007.
[33] Goldman, E., Dissecting Spam's Purported Harms. 2003.
[34] Webster, M., Algebraic specification of computer viruses and their environments. Selected Papers from the First Conference on Algebra and Coalgebra in Computer Science Young Researchers Workshop (CALCO-jnr 2005), 2005.
[35] Grimes, R.A., Malicious Mobile Code: Virus Protection for Windows. O'Reilly Media, 2001.
[36] Vaccaro, H.S. and G.E. Liepins. Detection of anomalous computer session activity. in Security and Privacy, 1989. Proceedings., 1989 IEEE Symposium on. 1989.
[37] Teng, H.S., K. Chen, and S.C. Lu. Adaptive real-time anomaly detection using inductively generated sequential patterns. in Research in Security and Privacy, 1990. Proceedings., 1990 IEEE Computer Society Symposium on. 1990.
[38] Heberlein, L.T., et al. A network security monitor. in Research in Security and Privacy, 1990. Proceedings., 1990 IEEE Computer Society Symposium on. 1990.
[39] Winkler, J.R., A Unix Prototype for Intrusion and Anomaly Detection in Secure Networks (1990). Proceedings of the 13th National Computer Security Conference, 1990.
[40] P. García-Teodoro, J.D.-V., G. Maciá-Fernández, E. Vázquez, Anomaly-based network intrusion detection: Techniques, systems and challenges. Computers & Security, 2009. 28.
[41] Bolzoni, D. and S. Etalle, APHRODITE: an Anomaly-based Architecture for False Positive Reduction. 2006, Centre for Telematics and Information Technology, University of Twente: Enschede.
[42] Chandola, V., A. Banerjee, and V. Kumar, Anomaly detection: A survey. ACM Comput. Surv., 2009. 41(3): p. 1-58.
[44] Kozushko, H., Intrusion Detection: Host-Based and Network-Based Intrusion Detection Systems. 2003.
[45] Basicevic, F., M. Popovic, and V. Kovacevic. The use of distributed network-based IDS systems in detection of evasion attacks. in Telecommunications, 2005. advanced industrial conference on telecommunications/service assurance with partial and intermittent resources conference/e-learning on telecommunications workshop. aict/sapir/elete 2005. proceedings. 2005.
[46] Gou, X., W. Jin, and D. Zhao, Multi-agent system for Worm Detection and Containment in Metropolitan Area Networks. Journal of Electronics (China), 2006. 23(2): p. 259-265.
[47] Mechtri, L., F.D. Tolba, and S. Ghanemi. MASID: Multi-Agent System for Intrusion Detection in MANET. in Information Technology: New Generations (ITNG), 2012 Ninth International Conference on. 2012.
[48] Silva, M., D. Lopes, and Z. Abdelouahab, A Remote IDS Based on Multi-Agent Systems, Web Services and MDA, in Proceedings of the International Conference on Software Engineering Advances. 2006, IEEE Computer Society. p. 64.
[49] Pinz, C.I., et al., Real-time CBR-agent with a mixture of experts in the reuse stage to classify and detect DoS attacks. Appl. Soft Comput., 2011. 11(7): p. 4384-4398.
[50] Steroids, S.o., Malware online scanners. accessed 12/4/2013.
[51] Chaugule, A., Z. Xu, and S. Zhu, A specification based intrusion detection framework for mobile phones, in Proceedings of the 9th international conference on Applied cryptography and network security. 2011, Springer-Verlag: Nerja, Spain. p. 19-37.
[52] Blount, J.J., D.R. Tauritz, and S.A. Mulder. Adaptive Rule-Based Malware Detection Employing Learning Classifier Systems: A Proof of Concept. in Computer Software and Applications Conference Workshops (COMPSACW), 2011 IEEE 35th Annual. 2011.
[53] Asmaa S. Ashoor, S.G., Intrusion Detection System (IDS): case study. 2011 International Conference on Advanced Materials Engineering IPCSIT, 2011.