A Survey of Anomaly Detection Methods in Networks

Weiyu Zhang, Qingbo Yang, Yushui Geng


Modern Educational Technology Center
Shandong Institute of Light Industry
Jinan, China
[email protected]

Abstract--Despite the advances made over the last 20 years, anomaly detection in networks is still an immature technology. Nevertheless, substantial benefits could be obtained from a better understanding of the problem itself and from the improvement of these methods. Therefore, in this paper we present a survey of anomaly detection in networks. To distinguish between the different approaches used for anomaly detection in networks in a structured way, we classify the methods into four categories: statistical anomaly detection, classifier-based anomaly detection, anomaly detection using machine learning, and finite state machine anomaly detection. We describe each method in detail and give examples of its application in networks.

Keywords-Anomaly detection; Machine learning; Intrusion detection; Network security

I. INTRODUCTION
Communication networks make physical distances meaningless. While we enjoy the ease of being connected, it is also recognized that an intrusion by malicious users in one place can cause severe damage over wide areas. Computer network security has become a critical issue, and it is important to develop mechanisms to defend against intrusions.

The existing intrusion detection methods fall into two major categories: signature recognition and anomaly detection [1]. In signature recognition techniques, signatures of known attacks are stored and monitored events are matched against them. The techniques signal an intrusion when there is a match. An obvious limitation of these techniques is that they cannot detect new attacks whose signatures are unknown. Anomaly detection, on the other hand, builds models of normal data and detects any deviation from the normal model in the observed data. Given a set of normal data to train from and a new piece of test data, the goal is to determine whether the test data belong to "normal" or to anomalous behavior. Anomaly detection techniques have the advantage that they can detect new types of intrusions as deviations from normal usage [2]. Their weakness, however, is a high false alarm rate.

The remainder of this article is organized as follows. In Section II, we introduce the problem through the presentation of theoretical considerations. In Section III, we provide detailed discussions of the various techniques used in anomaly detection. Finally, in Section IV we conclude the paper.

II. PROBLEM STATEMENT
Posing the problem of anomaly detection in any system implies the existence of an underlying concept of normality. The notion of "normal" is usually provided by a formal model that expresses relations between the fundamental variables involved in the system dynamics. Consequently, an event is classified as anomalous when its degree of deviation from the profile of characteristic behavior of the system, as specified by the model of normality, is high enough.

Formally, an anomaly detection system S can be defined as a pair S = (M, D), where M is the model of normal behavior of the system and D is a similarity measure that, given an activity record, yields the degree of deviation of that activity from the model M. The core of the system therefore consists of two main modules: the modeling subsystem and the detection subsystem. The first works during a training stage and processes events in order to obtain the model M of the normal behavior of the system. The obtained model is subsequently used by the detection engine to evaluate new events. As stated before, this evaluation is a measurement of the degree of deviation that such events present in relation to the model of the system. These two modes of operation are usually carried out separately. However, it is important to note that systems evolve and, therefore, the model should be rebuilt periodically so that it adapts to the new environment.

III. ANOMALY DETECTION METHODS

A. Anomaly detection using statistics
In statistical methods for anomaly detection, the system observes the activity of subjects and generates profiles to represent their behavior. Typically, two profiles are maintained for each subject: the current profile and the stored profile. As network events are processed, the system updates the current profile and periodically calculates an anomaly score by comparing the current profile with the stored profile, using a function of the abnormality of all measures within the profile. If the anomaly score is higher than a certain threshold, the system generates an alert.
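As a minimal sketch of the pair S = (M, D) from Section II, applied to the profile comparison of Section III-A, the following Python example assumes that the stored profile M keeps a mean and a standard deviation for each measure and that D is the sum of squared z-scores of the current profile. The measure names, the sample values, the scoring function and the threshold are all hypothetical choices made for illustration; they are not taken from any of the surveyed systems.

```python
import statistics

def build_model(training_records):
    """Modeling subsystem: learn the stored profile M as (mean, std) per measure."""
    model = {}
    for name in training_records[0]:
        values = [record[name] for record in training_records]
        mean = statistics.mean(values)
        # Fall back to 1.0 so a constant measure does not cause division by zero.
        std = statistics.pstdev(values) or 1.0
        model[name] = (mean, std)
    return model

def anomaly_score(model, current_profile):
    """Detection subsystem: D as the sum of squared z-scores over all measures."""
    score = 0.0
    for name, (mean, std) in model.items():
        z = (current_profile[name] - mean) / std
        score += z * z
    return score

def is_anomalous(model, current_profile, threshold):
    """Raise an alert when the anomaly score exceeds the threshold."""
    return anomaly_score(model, current_profile) > threshold

# Hypothetical training data describing normal sessions of one subject.
normal_sessions = [
    {"logins_per_hour": 2, "kbytes_sent": 100},
    {"logins_per_hour": 3, "kbytes_sent": 120},
    {"logins_per_hour": 2, "kbytes_sent": 110},
    {"logins_per_hour": 3, "kbytes_sent": 90},
]
stored_profile = build_model(normal_sessions)
THRESHOLD = 9.0  # assumed value; in practice it must be tuned

typical = {"logins_per_hour": 3, "kbytes_sent": 105}
unusual = {"logins_per_hour": 40, "kbytes_sent": 5000}
print(is_anomalous(stored_profile, typical, THRESHOLD))
print(is_anomalous(stored_profile, unusual, THRESHOLD))
```

Rebuilding the model periodically, as Section II recommends, would amount to calling build_model again on more recent records.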

978-1-4244-5273-6/09/$26.00 ©2009 IEEE


Authorized licensed use limited to: UNIVERSIDAD VERACRUZANA. Downloaded on July 01,2021 at 00:20:10 UTC from IEEE Xplore. Restrictions apply.
Statistical anomaly detection has a number of advantages. First, these systems do not require prior knowledge of security flaws or of the attacks themselves. In addition, statistical approaches can provide accurate notification of malicious activities that typically occur over extended periods of time. However, statistical anomaly detection schemes also have drawbacks. First, it can be difficult to determine thresholds that balance the likelihood of false positives against the likelihood of false negatives. In addition, statistical methods need accurate statistical distributions, but not all behaviors can be modeled using purely statistical methods.

Haystack [3] is one of the earliest examples of a statistical anomaly-based intrusion detection system. It used both user-based and group-based anomaly detection strategies, and modeled system parameters as independent Gaussian random variables. Haystack defined a range of values that were considered normal for each feature. If a feature fell outside the normal range during a session, the score for the subject was raised. It was designed to detect six types of intrusions. One drawback of Haystack was that it was designed to work offline.

Statistical Packet Anomaly Detection Engine (SPADE) [4] is a statistical anomaly detection system. SPADE was among the first proposals to use the concept of an anomaly score to detect port scans, instead of the traditional approach of looking at p attempts over q seconds. In [4], the authors used a simple frequency-based approach to calculate the "anomaly score" of a packet. The fewer times a given packet was seen, the higher its anomaly score. Once the anomaly score crossed a threshold, the packets were forwarded to a correlation engine that was designed to detect port scans. The major drawback of SPADE is its very high false alarm rate, which is due to the fact that SPADE classifies all unseen packets as attacks regardless of whether they are actually intrusions.

B. Anomaly detection using a classifier
In this section we focus on anomaly detection using a classifier. Anomaly detection depends on the idea that normal characteristic behavior can be distinguished from abnormal behavior. A classifier can be used to predict the normal incoming event given the current event. If, during the monitoring phase, the next event is not the one predicted by the classifier, it is considered an anomaly. The classification process typically involves the following steps:
1. Identify class attributes and classes from training data.
2. Identify attributes for classification.
3. Learn a model using the training data.
4. Use the learned model to classify the unknown data samples.
A variety of classification techniques have been proposed in the literature. These include inductive rule generation techniques, fuzzy logic, and techniques based on genetic algorithms.

Inductive rule generation algorithms typically involve the application of a set of association rules and frequent episode patterns to classify the audit data. The advantage of using rules is that they tend to be simple and intuitive, unstructured and less rigid. Their drawbacks are that they are difficult to maintain and, in some cases, inadequate to represent many types of information. A number of inductive rule generation algorithms have been proposed in the literature. Some of them first construct a decision tree and then extract a set of classification rules from the decision tree. Other algorithms directly induce rules from the data by employing a divide-and-conquer approach.

Fuzzy logic techniques have been in use in the area of network security since the late 1990s [5]. Dickerson et al. [6] developed the Fuzzy Intrusion Recognition Engine (FIRE) using fuzzy sets and fuzzy rules. FIRE uses simple data mining techniques to process the network input data and generate fuzzy sets for every observed feature. The fuzzy sets are then used to define fuzzy rules to detect individual attacks. FIRE does not establish any sort of model representing the current state of the system, but instead relies on attack-specific rules for detection.

Genetic algorithms, a search technique used to find approximate solutions to optimization and search problems, have also been extensively employed in the domain of intrusion detection to differentiate normal network traffic from anomalous connections. The major advantage of genetic algorithms is their flexibility and robustness as a global search method. The earliest attempt to apply genetic algorithms to the problem of intrusion detection was made by Crosbie and Spafford [7] in 1995, when they applied multiple agent technology to detect network-based anomalies.

C. Anomaly detection using machine learning
Machine learning aims to answer many of the same questions as statistics. However, unlike statistical approaches, which tend to focus on understanding the process that generated the data, machine learning techniques focus on building a system that improves its performance based on previous results. In other words, systems based on the machine learning paradigm have the ability to change their execution strategy on the basis of newly acquired information.

A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest. When used in conjunction with statistical techniques, Bayesian networks have several advantages for data analysis [8]. Several researchers have adapted ideas from Bayesian statistics to create models for anomaly detection. Valdes et al. [9] developed an anomaly detection system that employed naive Bayesian networks to perform intrusion detection on traffic bursts.

Bayesian techniques have also been frequently used for classification and for the suppression of false alarms. Kruegel et al. [10] proposed a multi-sensor fusion approach in which the outputs of different IDS sensors were aggregated to produce a single alarm. This approach is based on the assumption that no single anomaly detection technique can classify a set of events as an intrusion with sufficient confidence. Although using Bayesian networks for intrusion detection can be effective in certain applications, their limitations should be considered in the actual implementation. Because the accuracy of this method depends on assumptions that are typically based on the behavioral model of the target system, selecting an accurate model is the most important step toward solving the problem. Unfortunately, selecting an accurate behavioral model is a difficult task, as typical networks are complex.
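To make the idea of a naive Bayesian classifier concrete, the following Python sketch implements a categorical naive Bayes model with add-one (Laplace) smoothing. It is only an illustration under stated assumptions and does not reproduce the system of [9] or the fusion approach of [10]: the three features (protocol, TCP flag, service), the labels and the training rows are hypothetical, and the smoothing scheme is a common textbook choice.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Categorical naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        # value_counts[label][feature_index][value] -> number of occurrences
        self.value_counts = defaultdict(lambda: defaultdict(Counter))
        # feature_values[feature_index] -> set of values seen in training
        self.feature_values = defaultdict(set)
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[label][i][value] += 1
                self.feature_values[i].add(value)
        return self

    def log_posteriors(self, row):
        """Return the unnormalized log posterior of each class for one event."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            log_p = math.log(count / total)  # class prior
            for i, value in enumerate(row):
                seen = self.value_counts[label][i][value]
                distinct = len(self.feature_values[i])
                # Smoothed estimate of P(feature_i = value | label)
                log_p += math.log((seen + 1) / (count + distinct))
            scores[label] = log_p
        return scores

    def predict(self, row):
        scores = self.log_posteriors(row)
        return max(scores, key=scores.get)

# Hypothetical labeled events: (protocol, TCP flag, service)
rows = [
    ("tcp", "SYN", "http"), ("tcp", "ACK", "http"),
    ("udp", "NONE", "dns"), ("tcp", "ACK", "ssh"),
    ("tcp", "SYN", "telnet"), ("tcp", "SYN", "ssh"), ("tcp", "SYN", "ftp"),
]
labels = ["normal", "normal", "normal", "normal", "attack", "attack", "attack"]

classifier = NaiveBayes().fit(rows, labels)
print(classifier.predict(("tcp", "SYN", "telnet")))
print(classifier.predict(("udp", "NONE", "dns")))
```

The "naive" part is the assumption that the features are conditionally independent given the class, which is what allows the per-feature probabilities to be multiplied (here, summed in log space). As the text notes, the quality of the result depends on how well such assumptions match the behavior of the target system.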

Typical datasets for intrusion detection are very large and multidimensional. To tackle the problem of high-dimensional datasets, researchers have used a dimensionality reduction technique known as principal component analysis (PCA). PCA is a technique in which n correlated random variables are transformed into d < n uncorrelated variables. The uncorrelated variables are linear combinations of the original variables and can be used to express the data in a reduced form. Shyu et al. [11] proposed an anomaly detection scheme in which PCA was used as an outlier detection scheme and was applied to reduce the dimensionality of the audit data and arrive at a classifier that is a function of the principal components.

Mahoney et al. [12–14] presented several methods that address the problem of detecting anomalies in the usage of network protocols by inspecting packet headers. The common denominator of all of them is the systematic application of learning techniques to automatically obtain profiles of normal behavior for protocols at different layers. Packet Header Anomaly Detector (PHAD) [12], LEarning Rules for Anomaly Detection (LERAD) [13] and Application Layer Anomaly Detector (ALAD) [14] use time-based models in which the probability of an event depends on the time since it last occurred. For each attribute, they collect a set of allowed values and flag novel values as anomalous. PHAD, ALAD, and LERAD differ in the attributes that they monitor. PHAD monitors 33 attributes from the Ethernet, IP and transport layer packet headers. ALAD models incoming server TCP requests: source and destination IP addresses and ports, opening and closing TCP flags, and the list of commands in the application payload. Depending on the attribute, it builds separate models for each target host, port number (service), or host/port combination. LERAD also models TCP connections. The authors break down the multivariate problem into a set of univariate problems and sum the weighted results from range matching along each dimension. The advantage of this approach is that it makes the technique more computationally efficient and effective at detecting network intrusions.

D. Anomaly detection using finite state machines
A finite state machine (FSM) is a model of behavior composed of states, transitions and actions. In this model, a state stores information about the past, and a transition indicates a state change and is described by a condition that must be fulfilled to enable the transition. An action is a description of an activity that is to be performed at a given moment.

The finite state machine has been used to detect attacks on the DSR (Dynamic Source Routing) protocol in [15]. First, an algorithm for selecting monitors for the distributed monitoring of all nodes in the network was proposed, and then the correct behaviors of the nodes according to DSR were manually abstracted. This method has the advantage of detecting intrusions without the need for training data or signatures; unknown intrusions can also be detected with few false alarms. As a result, a distributed network monitor architecture that traces the data flow on each node by means of a finite state machine was proposed. In Ref. [16], Sekar et al. present a specification-based model as well as a prototype with excellent detection performance. The model proposed by the authors consists of developing protocol specifications by using Extended Finite State Automata (EFSA).

IV. CONCLUSIONS
Networks are becoming increasingly complex at the same time that security concerns continue to grow and require more and more attention. Hence, there is a strong need for anomaly detection as a frontline research area in network security. To give a clear view of the use of this technique, we have presented in this paper a classified survey of the methods that are used for anomaly detection in networks. We believe that deeper knowledge is required before this technology reaches solid maturity.

REFERENCES
[1] H.S. Javitz, A. Valdes, The SRI Statistical Anomaly Detector, in: Proceedings of the 1991 IEEE Symposium on Research in Security and Privacy, May 1991.
[2] D.E. Denning, An Intrusion Detection Model, IEEE Transactions on Software Engineering SE-13, 1987, pp. 222–232.
[3] S.E. Smaha, Haystack: An intrusion detection system, in: Proceedings of the IEEE Fourth Aerospace Computer Security Applications Conference, Orlando, FL, 1988, pp. 37–44.
[4] S. Staniford, J.A. Hoagland, J.M. McAlerney, Practical automated detection of stealthy portscans, Journal of Computer Security 10, 2002, pp. 105–136.
[5] H.H. Hosmer, Security is fuzzy!: applying the fuzzy logic paradigm to the multipolicy paradigm, in: Proceedings of the 1992-1993 Workshop on New Security Paradigms, Little Compton, RI, United States, 1993.
[6] J.E. Dickerson, J.A. Dickerson, Fuzzy network profiling for intrusion detection, in: Proceedings of the 19th International Conference of the North American Fuzzy Information Processing Society (NAFIPS), Atlanta, GA, 2000, pp. 301–306.
[7] M. Crosbie, G. Spafford, Applying genetic programming to intrusion detection, in: Working Notes for the AAAI Symposium on Genetic Programming, Cambridge, MA, 1995, pp. 1–8.
[8] D. Heckerman, A Tutorial on Learning With Bayesian Networks, Microsoft Research, Technical Report MSR-TR-95-06, March 1995.
[9] A. Valdes, K. Skinner, Adaptive model-based monitoring for cyber attack detection, in: Recent Advances in Intrusion Detection, Toulouse, France, 2000, pp. 80–92.
[10] C. Kruegel, D. Mutz, W. Robertson, F. Valeur, Bayesian event classification for intrusion detection, in: Proceedings of the 19th Annual Computer Security Applications Conference, Las Vegas, NV, 2003.
[11] M.-L. Shyu, S.-C. Chen, K. Sarinnapakorn, L. Chang, A novel anomaly detection scheme based on principal component classifier, in: Proceedings of the IEEE Foundations and New Directions of Data Mining Workshop, Melbourne, FL, USA, 2003, pp. 172–179.
[12] M.V. Mahoney, P.K. Chan, PHAD: Packet Header Anomaly Detection for Identifying Hostile Network Traffic, Department of Computer Sciences, Florida Institute of Technology, Melbourne, FL, USA, Technical Report CS-2001-4, April 2001.
[13] M.V. Mahoney, P.K. Chan, Learning Models of Network Traffic for Detecting Novel Attacks, Computer Science Department, Florida Institute of Technology, Technical Report CS-2002-8, August 2002.
[14] M.V. Mahoney, P.K. Chan, Learning nonstationary models of normal network traffic for detecting novel attacks, in: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, Canada, 2002, pp. 376–385.
[15] P. Yi, Y. Jiang, Y. Zhong, S. Zhang, Distributed Intrusion Detection for Mobile Ad hoc Networks, in: Proceedings of the 2005 Symposium on Applications and the Internet Workshops (SAINTW'05), pp. 94–97.
[16] R. Sekar, A. Gupta, J. Frullo, T. Shanbhag, A. Tiwari, H. Yang, S. Zhou, Specification-based anomaly detection: a new approach for detecting network intrusions, in: Proceedings of the Ninth ACM Conference on Computer and Communications Security, Washington, DC, USA, November 18–22, 2002, pp. 265–274.
