Unit Ii Ais
Malware Detection
Introduction:
Malware has become a challenge to the security of computer systems.
Many approaches have been proposed to cope with this situation; they are mainly classified
into three categories: static techniques, dynamic techniques, and heuristics. Artificial immune
systems (AIS), because of the natural similarities between the biological immune system and
computer security system, have been developed into a new field for anti-malware research,
attracting many researchers.
The immune mechanisms provide opportunities to construct malware detection models that are
robust and adaptive with the ability to detect unseen malware. In this chapter, the classic
malware detection approaches and immune-based malware detection approaches are briefly
introduced after the background knowledge of malware is presented.
With the rapid development of computer technology and the Internet, the computer
has become a part of daily life. Meanwhile, computer security garners more and more
attention. Malwares, the new variations and unknown malwares in particular, have
Nowadays malwares are becoming more complex, with faster breeding speeds and stronger abilities
for latency, destruction, and infection. A malware is able to spread over the globe in several
minutes and may result in huge economic losses.
How to protect computers from various kinds of malwares has become one of the most urgent
missions. The malware detection approaches based on immune principles have paved a new way
for anti-malware research.
Many companies have released anti-malware software, most of which is based on signatures.
The software detects known malwares very quickly with low false positive rates and
overheads. Unfortunately, the software fails to detect new variations and unknown malwares.
Based on metamorphic and polymorphic techniques, even a layman can develop new
variations of known malwares easily using virus automatons. For example, Agobot has been
observed to have more than 580 variations since its initial release, using polymorphism to evade
detection and disassembly. Thus, traditional malware detection approaches based on signatures
are no longer fit for the new environments, and dynamic techniques and heuristics have
started to emerge.
Dynamic techniques monitor the behavior of
a program with the help of application programming interface (API) call sequences
generated at runtime.
However, because of the huge overheads of monitoring API calls, it is very hard to deploy the
dynamic techniques on personal computers.
Data mining approaches, one of the most popular heuristics, try to mine frequent patterns or
association rules to detect malwares using classic classifiers.
These have led to some success. However, data mining loses the semantic information of
sitism, breed, and infection. In nature, the biological immune system (BIS) protects the
body from antigens, resolving the problem of unknown antigens [2], so applying
immune mechanisms to anti-malware has developed into a new field over the past few years.
Forrest applied the immune theory to computer anomaly detection for the first time in 1994.
Since then, many researchers have proposed various kinds of malware detection models and
achieved some success; most of them are mainly derived from ARTificial Immune System
(ARTIS).
Over time, more and more immune mechanisms have become clear. Immune-based malware
detection approaches make use of more immune theories, and the study deepens continuously.
The simulation of the BIS keeps advancing. The objects of malware detection now include raw
bit strings, process calls, and process call arguments.
3. Dynamic malware analysis: Dynamic malware analysis executes suspected malicious code
in a safe environment called a sandbox. This closed system enables security professionals to
watch and study the malware in action without the risk of letting it infect their system or escape
into the enterprise network.
4. Dynamic monitoring of mass file operations: Observing mass file operations such as rename
or delete commands to identify signs of tampering or corruption. Dynamic monitoring often uses
a file integrity monitoring tool to track and analyze the integrity of file systems through both
reactive forensic auditing and proactive rules-based monitoring.
5. File extensions blocklist/blocklisting: File extensions are letters occurring after a period in a
file name, indicating the format of the file. This classification can be used by criminals to
package malware for delivery. As a result, a common security method is to list known malicious
file extension types in a “blocklist” to prevent unsuspecting users from downloading or using the
dangerous file.
6. Application allowlist/allowlisting: The opposite of a blocklist/blocklisting, where an
organization authorizes a system to use applications on an approved list. Allowlisting can be very
effective in preventing nefarious applications through rigid parameters. However, it can be
difficult to manage and can reduce an organization’s operational speed and flexibility.
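The blocklist and allowlist ideas in items 5 and 6 can be sketched in a few lines. This is a minimal illustration, not a production policy engine: the extension list, the approved application names, and the function names `is_blocked_download` and `may_run` are all assumptions made up for the example.

```python
from pathlib import PurePath

# Hypothetical policy data, for illustration only.
BLOCKED_EXTENSIONS = {".exe", ".scr", ".vbs", ".js"}
ALLOWED_APPS = {"winword.exe", "chrome.exe"}


def is_blocked_download(filename: str) -> bool:
    """Blocklist check: reject a file whose final extension is on the list."""
    return PurePath(filename).suffix.lower() in BLOCKED_EXTENSIONS


def may_run(app_name: str) -> bool:
    """Allowlist check: only explicitly approved applications may run."""
    return app_name.lower() in ALLOWED_APPS


print(is_blocked_download("invoice.pdf.exe"))  # double extension still ends in .exe
print(may_run("unknown_tool.exe"))             # not on the approved list
```

The two checks fail in opposite directions: the blocklist lets through anything it has not listed, while the allowlist refuses anything it has not listed, which is why allowlisting is stricter but harder to manage.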
MALWARE
Malware can infect networks and devices and is designed to harm those devices, networks and/or
their users in some way.
Conversely, malware detection is a set of defensive techniques and technologies required to
identify, block and prevent the harmful effects of malware. This protective practice consists of a
wide body of tactics, amplified by various tools based on the type of malware that infected the
device.
Cybercriminals use and develop malware (malicious software) to infiltrate target computer
systems and achieve their objectives. Malware is offensive in nature and can cause destruction,
disruption and numerous other effects to computer systems to achieve criminal goals.
Depending on the type of malware and its goal, this harm may present itself differently to the
user or endpoint. In some cases, the effect malware has is relatively mild and benign, and in
others, it can be disastrous.
No matter the method, all types of malware are designed to exploit devices at the expense of the
user and to the benefit of the hacker, the person who has designed and/or deployed the malware.
Different types of malware have unique traits and characteristics. Types of malware include the
following:
● A virus is the most common type of malware that can execute itself and spread by
infecting other programs or files.
● A worm can self-replicate without a host program and typically spreads without any
interaction from the malware authors.
● A Trojan horse is designed to appear as a legitimate software program to gain access
to a system. Once activated following installation, Trojans can execute their
malicious functions.
● Spyware collects information and data on the device and user, as well as observes the
user's activity without their knowledge.
● Ransomware infects a user's system and encrypts its data. Cybercriminals then
demand a ransom payment from the victim in exchange for decrypting the system's
data.
● A rootkit obtains administrator-level access to the victim's system. Once installed,
the program gives threat actors root or privileged access to the system.
● A backdoor virus or remote access Trojan (RAT) secretly creates a backdoor into an
infected computer system that enables threat actors to remotely access it without
alerting the user or the system's security programs.
● Adware tracks a user's browser and download history with the intent to display
pop-up or banner advertisements that lure the user into making a purchase. For
example, an advertiser might use cookies to track the webpages a user visits to better
target advertising.
● Keyloggers, also called system monitors, track nearly everything a user does on their
computer. This includes emails, opened webpages, programs and keystrokes.
• Concealment: Malwares often attach themselves to benign programs and start up with
the host programs. They perform harmful operations in the background, hidden from
users.
• Latency: After intruding into a computer system, malwares hide themselves from users
instead of attacking the system immediately. This feature gives malwares longer
lives. They spread themselves and infect other programs in this period.
• Trigger: Most malwares have one or more trigger conditions. When these conditions
are satisfied, the malwares begin to destroy the system.
Other features of the malwares include illegality, expressiveness, and unpredictability.
Malwares evolve along with computer technology all the time. The development of
malwares generally goes through several phases, including:
• DOS boot phase: Figs. 2.1 and 2.2 illustrate the boot procedures of DOS without and
with a boot sector virus, respectively. Before the system obtains the right of control, the
malware starts up, modifies the interrupt vector, and copies itself to infect the disk. These are
the original infection procedures of malwares. Similar infection procedures can be found
in malwares now.
• DOS executable phase: In this phase, the malwares exist in a computer system in the
form of executable files. They control the system when users run applications infected by
the malwares. Most malwares now are executable files.
• Macro malware phase: Before the emergence of macro malwares, all malwares
merely infected executable files because this was almost the only way for the malwares to
obtain the right of execution. When users run a host of a malware, the malware starts up
and controls the system. Infecting data files could not help the malware to run itself. The
emergence of macro malwares changed this situation; their targets are data
files, mainly Microsoft Office files.
Malware analysis is the inspection of a malware’s core components and source code to
understand its behavior, origin, and intended actions, with the aim of mitigating its potential
threats.
Malware refers to any intrusive software designed to infiltrate a user’s computer or network
without their consent. Such intrusive files include spyware, scareware, rootkits, worms, viruses,
and Trojan horses.
Malicious programs can be programmed to steal users' data, spy on their online activities, or
even harm their system files. For example, early in January 2023, Pepsi Bottling Ventures
suffered a data breach when data-stealing malware infiltrated its network, stealing personal
information.
Similarly, the City of Oakland sustained a ransomware attack that caused a network outage.
Inspired by the human immune system, we explore the development of a new Multiple-Detector
Set Artificial Immune System (mAIS) for the detection of mobile malware based on the
information flows in Android apps. mAISs differ from conventional AISs in that
multiple-detector sets are evolved concurrently via negative selection. Typically, the first detector
set is composed of detectors that match information flows associated with malicious apps while
the second detector set is composed of detectors that match the information flows associated with
benign apps. The mAIS presented in this paper incorporates feature selection along with a
negative selection technique known as the split detector method (SDM). This new mAIS has
been compared with a variety of conventional AISs and mAISs using a dataset of information
flows captured from malicious and benign Android applications. This approach achieved a
93.33% accuracy with a true positive rate of 86.67% and a false positive rate of 0.00%.
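The idea of evolving two detector sets by negative selection can be sketched as follows. This is a toy illustration under stated assumptions, not the mAIS of the paper: it omits feature selection and the split detector method, and it assumes that each app is a short binary vector (one bit per information flow) and that a detector matches a sample when their Hamming distance is within a radius. All names and sample vectors are hypothetical.

```python
import random


def matches(detector, sample, radius):
    """A detector matches a sample when their Hamming distance is <= radius."""
    return sum(d != s for d, s in zip(detector, sample)) <= radius


def negative_selection(protected, n_detectors, n_bits, radius, rng, max_tries=10_000):
    """Keep random candidates that match none of the protected samples."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = tuple(rng.randint(0, 1) for _ in range(n_bits))
        if not any(matches(candidate, p, radius) for p in protected):
            detectors.append(candidate)
    return detectors


def classify(sample, malicious_detectors, benign_detectors, radius):
    """Compare how many detectors of each set fire on the sample."""
    mal_votes = sum(matches(d, sample, radius) for d in malicious_detectors)
    ben_votes = sum(matches(d, sample, radius) for d in benign_detectors)
    return "malicious" if mal_votes > ben_votes else "benign"


rng = random.Random(0)
# Hypothetical training data: one bit per information flow.
benign = [(0, 0, 0, 0, 1, 1, 0, 0), (0, 0, 0, 1, 1, 1, 0, 0)]
malicious = [(1, 1, 1, 0, 0, 0, 1, 1), (1, 1, 0, 0, 0, 0, 1, 1)]

# First set: censored against benign apps, so it can only fire on non-benign flows.
malicious_detectors = negative_selection(benign, 20, 8, 1, rng)
# Second set: censored against malicious apps, so it can only fire on non-malicious flows.
benign_detectors = negative_selection(malicious, 20, 8, 1, rng)
```

By construction, no detector in the first set fires on a benign training sample, so those samples can never be labeled malicious by `classify`.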
● Over-granting permissions
● Insecure transmission
The outputs from both the non-self detector set and self detector set are then used for
classification. For Android malware detection, we were able to achieve 93.33% accuracy
with a true positive rate of 86.67% and a false positive rate of 0.00%.
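For readers unfamiliar with these metrics, the sketch below shows how accuracy, true positive rate, and false positive rate are computed from a confusion matrix. The counts are hypothetical: the source does not state the size of the test set, and the values 13, 2, 0, and 15 were chosen only because they reproduce the reported percentages.

```python
def detection_metrics(tp, fn, fp, tn):
    """Return (accuracy, true positive rate, false positive rate)."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    tpr = tp / (tp + fn)   # share of malicious apps that were detected
    fpr = fp / (fp + tn)   # share of benign apps wrongly flagged
    return accuracy, tpr, fpr


# Hypothetical counts (an assumption, not taken from the paper).
accuracy, tpr, fpr = detection_metrics(tp=13, fn=2, fp=0, tn=15)
print(f"accuracy={accuracy:.2%} TPR={tpr:.2%} FPR={fpr:.2%}")
# prints: accuracy=93.33% TPR=86.67% FPR=0.00%
```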
Malware has become a major threat to the security of the computer and the Internet.
Using artificial immune system techniques for malware detection has two major benefits. First,
they increase the ability to overcome some of the drawbacks of traditional detectors, such as
dealing with new and polymorphic malware and the increased number of false alarms caused by
wrong decisions.
Second, they take advantage of the immune system's capabilities of learning, adaptation,
self-tolerance, and memory, which make it a good model for solving major problems in many
fields, including malware detection in computer security, which suffers from the rapid
increase in malware and from false positive alarms. In this paper, we try to
highlight the recent techniques applied in malware detection using the artificial immune system
from two points of view: self-nonself theory and danger theory.
In static malware analysis, security experts analyze a malware program without executing its
code. The aim is to identify malware families, how the malware operates, and its capabilities.
Since there’s no code execution, static malware analysis doesn’t require a live environment.
However, this can result in analysts missing critical information about the malware that can only
be discovered by watching it in operation.
Since they don’t need to execute the code, analysts can quickly identify the malware's
functionality and capabilities. It can also be automated using tools like disassemblers,
decompilers, and debuggers to quickly analyze large numbers of malware samples.
2. It’s Signature-Based
Static malware analysis uses a signature-based detection approach, which compares the sample
code's digital footprint against a database of known malicious signatures. Every malware sample has a
digital fingerprint that uniquely identifies it. This could be a cryptographic hash, a binary
pattern, or a data string.
Anti-virus programs work the same way. They scan for malware by reviewing the digital
footprints of known malware signatures and flag the file as malware if a scan finds matching
footprints.
While the signature-based malware analysis approach is good at detecting known malware
signatures, it’s unreliable when dealing with new or modified malware.
The method might also fail to detect malware samples programmed to activate only under certain
conditions, such as those triggered by a user’s log, date, time, or network traffic.
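A hash-based signature check can be sketched as below. It is a minimal illustration, assuming the signature database is simply a set of SHA-256 digests; the database contents and the function names are hypothetical, and real products use richer signatures (binary patterns, fuzzy hashes) than a single digest.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known-bad samples.
KNOWN_BAD_SHA256 = {hashlib.sha256(b"pretend-malware-bytes").hexdigest()}


def sha256_of(data: bytes) -> str:
    """Cryptographic fingerprint of a sample's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def is_known_malware(data: bytes) -> bool:
    """Flag the sample only if its fingerprint is already in the database."""
    return sha256_of(data) in KNOWN_BAD_SHA256


print(is_known_malware(b"pretend-malware-bytes"))   # exact copy of a known sample
print(is_known_malware(b"pretend-malware-bytes!"))  # one extra byte changes the hash
```

The second call illustrates the weakness described above: a trivially modified sample no longer matches any stored signature.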
3. Techniques Used
Static malware analysis uses different techniques to understand the nature of a threat. One
approach is comparing the digital fingerprint of the malware binary with available databases of
malicious signatures.
A technician can also use a disassembler or debugger to reverse engineer the binary to examine
its code. Alternatively, some analysts perform static malware analysis by extracting a sample’s
string metadata. Doing so reveals details like commands, filenames, messages, API calls, registry
keys, URLs, and other IOCs.
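String extraction can be sketched with a regular expression over the raw bytes, similar in spirit to the Unix `strings` utility. The function name, the minimum length of four characters, and the sample bytes are assumptions for illustration; the sample is invented and is not real malware.

```python
import re
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII characters at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.decode("ascii") for m in pattern.findall(data)]


# Invented bytes standing in for a binary under analysis.
sample = b"\x00\x01http://example.test/c2\x00\xffcmd.exe /c del\x00ab\x00RegOpenKeyExA\x90"
strings = extract_strings(sample)
urls = [s for s in strings if s.startswith("http")]
print(strings)
print(urls)
```

Here the short run `ab` is dropped because it is below the minimum length, while the URL, the command line, and the API name survive as candidate indicators.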
Dynamic malware analysis provides a more in-depth, accurate report, but the process can take longer. It also
requires specialized tools, and there’s the risk of infecting the analysis environment with the
malware.
1. It Requires a Sandbox
To safely run the malware and observe its activities, security analysts need a closed
testing environment (malware sandbox) where the malware can execute without
infecting the entire system or network.
By watching the suspicious file execute each of its commands, analysts can gain deep visibility
into the malware’s logic, functionality, and indicators of compromise. In other words, it shows
things that are harder to tell from a static analysis, such as what the malware was programmed to
do, how it communicates, and its evasion mechanism.
3. It’s Behavior-Based
While static analysis uses signature-based detection, dynamic analysis uses a behavior-based
detection approach. Quickly evolving malware or new types of malware can be hard to detect
using the signature-based approach. Some forms of malware can also obscure their signature,
making static analysis ineffective.
Since dynamic analysis uses the behavior-based detection approach, it makes it possible for
security analysts to identify and understand new and unknown threats.
With the AI market set to grow by over 38% annually between 2022 and 2029, we can expect the
number of new malware samples created via AI-based platforms like ChatGPT, as discussed before, to increase.
Dynamic malware analysis will play a crucial role in helping security analysts understand these
newly-emerging threats.
4. Techniques Used
Some of the techniques used during dynamic malware analysis include:
● Activity monitoring: This technique involves monitoring the system calls made by the
malware during execution, such as creating or modifying files, opening network
connections, and making changes to the registry.
● Network traffic analysis: Malware often contacts remote servers to receive commands or
exfiltrate data. Network traffic analysis involves monitoring the malware’s traffic during
execution to understand the servers it communicates with, the types of commands it
receives, and the data it exfiltrates.
● Dynamic code analysis: This technique involves tracing the execution flow of the
malware to understand how it operates.
● Memory analysis: Malware often attempts to hide its activities in memory, such as by
encrypting data or using process hollowing techniques. Analysts use memory analysis to
examine the contents of the system memory during and after malware execution to
identify any hidden activities.
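As a small illustration of network traffic analysis, the sketch below summarizes connection records captured while a sample runs in a sandbox. The record format `(remote_host, bytes_sent, bytes_received)`, the addresses, and the 10:1 upload ratio used to suggest exfiltration are all assumptions for the example, not a standard.

```python
from collections import defaultdict


def summarize_traffic(records):
    """Total bytes and connection count per remote host."""
    totals = defaultdict(lambda: {"sent": 0, "received": 0, "connections": 0})
    for host, sent, received in records:
        totals[host]["sent"] += sent
        totals[host]["received"] += received
        totals[host]["connections"] += 1
    return dict(totals)


def likely_exfiltration(summary, ratio=10):
    """Hosts that received far more data than they sent back (hypothetical rule)."""
    return sorted(
        host for host, t in summary.items()
        if t["sent"] > ratio * max(t["received"], 1)
    )


# Hypothetical capture from a sandbox run (documentation-range addresses).
records = [
    ("203.0.113.7", 50_000, 200),
    ("203.0.113.7", 70_000, 300),
    ("198.51.100.2", 400, 9_000),
]
summary = summarize_traffic(records)
print(likely_exfiltration(summary))
```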
As the threat of malware continues to grow, it’s important to understand the differences between
static and dynamic malware analysis to build effective defense strategies against malware threats.
Both techniques have their strengths and weaknesses, and the right one for you will depend on
the specific circumstances of your analysis. Static analysis provides quick and efficient results by
examining the malware's code and structure. In contrast, dynamic analysis gives you in-depth
insights by observing the malware's behavior as it runs in a controlled environment.
By combining these techniques, security teams can better understand malware threats and
develop more effective defense strategies to detect and mitigate potential attacks.
Heuristics:
Heuristic analysis detects and removes a heuristic virus by first checking files in your computer,
as well as code that may be behaving in a suspicious manner.
Heuristic evaluation or analysis works by looking for commands and instructions not normally
present in a benevolent application. For example, it may detect commands to deliver payloads
often disguised within a Trojan horse virus or those used to distribute a worm virus throughout
your network.
Heuristic analysis can pinpoint a virus through the way it replicates as it spreads. It is also at the
heart of user and entity behavior analytics (UEBA), which uses algorithms to study the behavior
of users, routers, endpoints, and servers.
Heuristic analysis is done using a couple of different techniques:
1. Static Heuristic Analysis
Static heuristic analysis involves examining the source code of a program and comparing it to the
source code of known viruses that have already been logged in a database. If enough of it
matches what is in the database, the code gets flagged as a potential threat.
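The "if enough of it matches" rule can be sketched as an overlap ratio with a threshold. This is one possible reading, not how any particular product works: it assumes programs are already reduced to instruction tokens, compares overlapping 3-token windows, and uses a 50% threshold. The token lists are invented.

```python
def ngrams(tokens, n=3):
    """Set of overlapping n-token windows."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def match_ratio(program_tokens, known_virus_tokens, n=3):
    """Fraction of the program's n-grams that also occur in a known virus."""
    program = ngrams(program_tokens, n)
    if not program:
        return 0.0
    return len(program & ngrams(known_virus_tokens, n)) / len(program)


def is_suspicious(program_tokens, database, threshold=0.5, n=3):
    """Flag the program if it overlaps enough with any database entry."""
    return any(match_ratio(program_tokens, known, n) >= threshold for known in database)


# Invented instruction sequences, for illustration only.
known_virus = ["push", "mov", "call", "xor", "jmp", "ret"]
variant = ["nop", "push", "mov", "call", "xor", "jmp", "int"]
benign = ["mov", "add", "sub", "ret"]

print(match_ratio(variant, known_virus))   # 3 of 5 windows match
print(is_suspicious(benign, [known_virus]))
```

Unlike an exact hash, the ratio still flags the variant even though it has extra instructions at both ends.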
2. Dynamic Heuristic Analysis
Dynamic heuristic analysis uses a virtual machine, which acts as a sandbox. A sandbox is a safe,
isolated environment in which a program can execute without affecting the rest of your system or
network. With dynamic heuristic analysis, the sandbox environment allows the file to run, so you
can see what it would do if it runs in a sensitive environment.
For example, during a dynamic heuristic analysis, the program under observation may
self-replicate, try to stay within resident memory after executing, overwrite files, or do other
things that viruses are often programmed to do.
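A dynamic heuristic verdict can be sketched as a weighted score over the behaviors observed in the sandbox. The behavior names, the weights, and the threshold of 7 are hypothetical values chosen for the example; a real engine would tune them.

```python
# Hypothetical behavior weights, for illustration only.
BEHAVIOR_WEIGHTS = {
    "self_replicate": 5,
    "stay_resident": 3,
    "overwrite_file": 4,
    "read_file": 0,
}


def heuristic_score(observed):
    """Sum the weights of the distinct behaviors seen during execution."""
    return sum(BEHAVIOR_WEIGHTS.get(behavior, 0) for behavior in set(observed))


def verdict(observed, threshold=7):
    """Flag the program when its score reaches the threshold."""
    return "flag" if heuristic_score(observed) >= threshold else "pass"


print(verdict(["read_file", "self_replicate", "overwrite_file"]))  # score 9
print(verdict(["read_file", "stay_resident"]))                     # score 3
print(verdict(["self_decrypt"]))                                   # unknown behavior scores 0
```

The last call shows a limitation of this design: a behavior that is not in the weight table contributes nothing, so a threat that only does unlisted things passes.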
There are a few advantages and disadvantages to heuristic analysis, but despite the drawbacks, it
is still a very powerful tool.
Advantages:
Heuristic analysis can detect more than just modified forms of current malicious programs. It can
also detect previously unknown malicious programs. This is because it analyses the behavior of a
potential threat instead of its file name.
This method of analysis also reduces the number of false positives because some behaviors are
very specific to malware, and heuristic analysis can identify them, pinpointing the threat. For
example, if a program tries to delete files that are needed by the operating system, it is most
likely malicious. Heuristic analysis can detect this kind of behavior and flag the threat so it can
be removed.
On the other hand, by merely examining the signature of a program and comparing it to those of
known threats, the threat may slip away unnoticed, simply because it does not match a known
threat. This is often the case when dealing with a zero-day or previously unknown threat.
Heuristic analysis can flag the threat based on what it does, regardless of whether it has already
been logged in a threat management system.
Disadvantages
Heuristic analysis is designed to detect known threat behavior. If the threat does not perform any
action the threat detection technology has been programmed to recognize, it can slip under the
radar.
To illustrate, suppose your antivirus software has been engineered to flag a program that tries to
delete files your operating system needs but not files that decrypt themselves. In this case, if it
comes across a self-decrypting file, it may not notice that it is a threat—even though this action
is typical of threats.
There is also a chance that the antivirus/anti-malware software uses heuristic scanning based on a
range of behavior that is too broad. In this heuristic analysis example, the process can result in
mislabeling innocent files as threats. However, this is more common in older heuristic analysis
programs, so if you have a newer one and it has been recently updated, chances are it uses
modern techniques, which limit the number of false positives.
It is easy to confuse the terms “heuristic analysis” and “heuristic virus.” However, in some ways,
they could not be more different. A heuristic virus can be detected using heuristic analysis. For
example, the malware known as Heur.Invader is designed to make changes to your system’s
settings. Therefore, it can be detected using heuristic analysis.
Heuristic analysis, on the other hand, identifies programs or applications that behave
suspiciously. In other words, heuristic analysis is a methodology used to identify a heuristic
virus.
How Does Heuristic Analysis Help to Detect and Remove a Heuristic Virus?
Heuristic analysis detects and removes a heuristic virus by first checking files in your computer,
as well as code that may be behaving in a suspicious manner. Once a potential threat has been
identified, it gets flagged.
At this point, the threat can be removed from your system. The antivirus system can also
quarantine the threat, which can give IT teams the opportunity to study it and gain a better
understanding of what it is and how it works.
Heuristic analysis is a method of threat detection that works by looking for commands and
instructions that would not normally be present in a benevolent application.
Heuristic analysis evaluates the actions of programs using a couple of different techniques. For
example, you can use static heuristic analysis, which involves examining the source code and
comparing it to the source code of known viruses. You can also use dynamic heuristic analysis,
which uses a virtual machine that acts as a sandbox. With dynamic heuristic analysis, the
sandbox environment allows the file to run, so you can see what it would do in a sensitive
environment.
APPROACHES
When the AIS-based detection method is presented with a new file, it can use the immune cells
to search for these patterns and determine whether the file is likely to be malware. If the file is
deemed to be malicious, the AIS-based method can take appropriate actions, such as
quarantining the file or alerting the user.
Abstract—Malicious apps use various methods to spread viruses, take control of computers
and/or IoT devices, and steal sensitive data such as credit card numbers or other personal
information. Despite the numerous existing means of intrusion detection, malware code is not
easily detectable. The primary issue with current malware detection approaches is their inability
to identify novel attacks and obfuscated malware, as they rely on static bases of malware
examples, making them susceptible to new unseen malware behaviors. To address this, we
propose a new method for malware recognition, which consists of two processes: the first
process creates new instances of malware using a memetic algorithm, and the second process
detects these new instances of attacks through solid detectors produced by an artificial immune
system-based algorithm. Our new malware recognition method has proven its merits through
thorough experiments on widely used datasets and evaluation metrics, and has been compared to
prominent state-of-the-art methods. Index Terms—Malware, Memetic algorithms, Artificial
immune system
Malware is malicious software created to access a computer or any other device and cause
damage to it. Malicious programs try to find their way to the targeted systems, which, with the
spread of IoT technology, are usually connected to the Internet most of the time, either for work
or for personal use. IoT technology is vulnerable to malware
attacks, especially because IoT devices lack robust security measures [1]. In this
regard, different methods have been proposed which try to detect malware programs relying on
specific features within the apps. Those features can be classified with regard to their type either
as static or dynamic, which depends on the nature of the detection method used [2] (i.e.,
signature-based, behavioral-based, or heuristic-based method). Various methods and techniques
were proposed in the literature to enhance the array of computer security means, where the use of
(deep) neural networks [3] and evolutionary algorithms (EAs) [4], [5] was particularly present in
recent works. Those methods showed interesting results in detecting malware when assessed
using static stored malicious samples, which is no longer the case when tested against new
unknown variants of malware. This can be explained by the lack of diversity of the malware
samples. From another perspective, many works relied on machine learning classifiers [6] to set new
detection rules, but those rules led to high percentages of false positives. In this paper, we
propose a two-step detection approach, named IMMU-Det, which is distinguished by the
combination of a Memetic Algorithm (MA) and an Artificial Immune System (AIS) based
algorithm relying on a clonal selection process to generate a diverse population of immune cells
(detectors in our case). The first step generates a new set of "memes", which are
the malicious variants (vectors of Application Programming Interface (API) calls) that will serve,
in the following step, as input (antigens) to the AIS-based algorithm which, in turn, will produce
detectors. Those detectors will help reveal the true nature of unknown applications.
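The clonal selection step can be illustrated with a generic CLONALG-style sketch. This is not the IMMU-Det algorithm itself: it assumes that an antigen is a binary vector of API-call indicators, that affinity is the number of agreeing positions, and that clones of better detectors mutate less. The function names, parameters, and the example antigen are hypothetical.

```python
import random


def affinity(detector, antigen):
    """Number of positions where detector and antigen agree."""
    return sum(d == a for d, a in zip(detector, antigen))


def mutate(detector, rate, rng):
    """Flip each bit independently with the given probability."""
    return tuple(1 - bit if rng.random() < rate else bit for bit in detector)


def clonal_selection(antigen, rng, pop_size=10, n_best=3, n_clones=5, generations=20):
    """Evolve one detector toward an antigen by cloning and hypermutation."""
    n_bits = len(antigen)
    score = lambda d: affinity(d, antigen)
    population = [tuple(rng.randint(0, 1) for _ in range(n_bits)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        best = population[:n_best]
        clones = []
        for parent in best:
            # Higher affinity means a lower mutation rate (never below 1/n_bits).
            rate = max(1.0 - score(parent) / n_bits, 1.0 / n_bits)
            clones.extend(mutate(parent, rate, rng) for _ in range(n_clones))
        # Elitist replacement: parents compete with clones, so the best is never lost.
        population = sorted(population + clones, key=score, reverse=True)[:pop_size]
    return max(population, key=score)


# Hypothetical antigen: 1 means the variant invokes that API call.
antigen = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1)
detector = clonal_selection(antigen, random.Random(1))
print(affinity(detector, antigen), "of", len(antigen), "positions agree")
```

In a two-step pipeline such as the one described above, the antigens would be the variants produced by the first step, and the evolved detectors would then be compared against the API-call vectors of unknown applications.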
Let us mention that an API is a collection of protocols, procedures, and functions that enables
data exchange between numerous programs and gadgets. An analysis of the API calls invoked
within an app’s running process regarding their number or their nature (i.e., sensitive or not) will
greatly help anti-malware producers to examine the app’s behavior and categorize it afterwards
as a safe or a dangerous app. The key contributions of our work are as follows:
1) The suggestion of a new malware detection process based on AIS combined with MA, where
the AIS-based clonal selection algorithm generates a set of effective and reliable detectors capable
of detecting new malicious codes generated by an MA.
2) The MA-based schema serves as an excellent example of the advantages of a genetic algorithm
extended by a local search module, which optimizes the search space of potential generated
malicious codes (i.e., memes). It helps select the most challenging memes, which will have an
impact on the detection quality of the detectors output by the AIS-based clonal selection
algorithm.
3) The advantages of coupling an MA and an AIS-based clonal selection algorithm are highlighted
in the experimental results, showing a more accurate prediction of the nature of new apps and a
consequent decrease in erroneous decisions.
4) Our IMMU-Det approach has outperformed a number of malware detection techniques and
engines in terms of accuracy maximization and false alarm decrease.
The rest of this paper is
organized as follows: Section III-A presents the essential descriptions tied to MA and AIS-based
clonal selection algorithm used in this work. Section II presents related previous works. Section
III describes our proposed approach. The experimental setup and the results of the performance
analysis are given in Section IV, and the conclusion is given in Section V.
a) Memetic Algorithms:
Memetic algorithms are a type of optimization algorithms that combine elements of genetic
algorithms and local search techniques. They are named "memetic" because they are inspired by
the concept of memes, which are self-replicating ideas or behaviors that can spread and evolve
within a population. In memetic algorithms, a population of solutions to a problem is evolved
through a process of selection, mutation, and recombination, similar to genetic algorithms.
However, in addition to these genetic operators, memetic algorithms also incorporate local
search procedures that can fine-tune individual solutions and improve their quality.
The general steps for implementing a memetic algorithm are as follows:
1) Define the optimization problem and determine the parameters of the memetic algorithm, such as the
population size, the number of generations, and the crossover and mutation rates.
4) Apply genetic operators (e.g., crossover and mutation) to the population to generate new
solutions.
5) Apply local search to each solution in the population to improve its quality.
6) Repeat steps 3-5 for the specified number of generations or until a satisfactory solution is
found.
7) Return the best solution found by the MA.
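The steps above can be sketched on a toy problem. This is an illustrative sketch only: the objective (OneMax, counting 1 bits) stands in for a real fitness function, the population initialization and tournament selection are assumptions filling in the steps not listed above, and all parameter values are arbitrary.

```python
import random


def fitness(solution):
    """Toy objective (OneMax): the number of 1 bits."""
    return sum(solution)


def local_search(solution):
    """Hill climbing: flip each bit once and keep the flip only if fitness improves."""
    best = list(solution)
    for i in range(len(best)):
        candidate = best.copy()
        candidate[i] = 1 - candidate[i]
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


def tournament(population, rng):
    """Pick the fitter of two randomly drawn solutions."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def memetic_algorithm(n_bits=16, pop_size=8, generations=5,
                      crossover_rate=0.9, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Assumed initialization: a random population of bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = tournament(population, rng), tournament(population, rng)
            # Genetic operators: one-point crossover, then bit-flip mutation.
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_bits)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1.copy()
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            # Local search refines every new solution.
            offspring.append(local_search(child))
        population = offspring
    return max(population, key=fitness)


best = memetic_algorithm()
print(fitness(best))
```

On this toy objective the local search alone already reaches the optimum, so the example shows only how the genetic operators and the local refinement are combined, not their relative benefit on a hard problem.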
which is referred to as the second brain. AIS, as a dynamic, adaptive, robust, distributed learning
system, have the abilities of fault tolerance and noise resistance, and
AIS have been applied to many complex problem domains, such as optimization,
pattern recognition, fault and anomaly diagnosis, network intrusion detection,
The steps of the general artificial immune algorithm are shown in Algorithm 5.
1. Input antigens.
There are three typical algorithms in AIS: negative selection algorithm (NSA),
tected body from antigens from the beginning of life, resolving the problem of
unknown antigens [2]. The computer system is designed from the prototype of
human beings, and the computer security system has functions similar to those of the BIS.
Furthermore, the features of AIS, such as being dynamic, adaptive, and robust, are needed in
the computer anti-malware system (CAMS). To sum up, applying immune mechanisms
to computer security systems, especially the CAMS, is reasonable and has
developed into a new field, attracting many researchers. The relationship of BIS and
ognize new variations and unknown malwares, using existing knowledge. The
CAMS with immune mechanisms would be more robust to make up the fault
BIS                                      CAMS
Antigens                                 Malwares
Binding of an antigen and an antibody    Pattern matching of the malwares and detectors
In the bottom layer, a non-stochastic but guided candidate virus gene library is generated by
statistical information of viral key codes. Then a detecting virus gene library is upgraded from
the candidate virus gene library using negative selection. In the middle layer, a novel storage
method is used to keep a potential relevance between different signatures on the individual level,
by which the mutual cooperative information of each instruction in a virus program can be
collected.
In the top layer, an overall matching process can reduce the information loss considerably.
Experimental results indicate that the proposed model can recognize obfuscated viruses
efficiently with an average recognition rate of 94%, including new variants of viruses and
unknown viruses.