Intrusion Detection System Using Hierarchical GMM and Dimensionality Reduction

The document discusses an effective intrusion detection system (IDS) for web servers that combines signature-based and anomaly-based detection techniques using Gaussian Mixture Models (GMM) and dimensionality reduction. It highlights the importance of reducing the number of attributes from 41 to 14 and 7 for improved classification accuracy and efficiency, utilizing data from the KDD Cup 99 dataset. The proposed approach aims to enhance the detection of intrusions while managing large amounts of data and rules effectively.


www.ijcait.com International Journal of Computer Applications & Information Technology
Vol. 1, No.1, July 2012

Intrusion Detection System Using Hierarchical GMM and Dimensionality Reduction

L. Maria Michael, Assistant Professor, Velammal Institute of Technology, Chennai
J. Indra Mercy, Student, M.E., Saveetha Engineering College, Chennai
N.R. Rejin Paul, Assistant Professor, Velammal Institute of Technology, Chennai

ABSTRACT

The focus of this paper is to provide an effective intrusion detection technique to protect a Web server. The IDS protects a server from malicious attacks from the Internet if someone tries to break in through the firewall and gain access to any system on the trusted side, and it alerts the system administrator in case there is a breach in security. Gaussian Mixture Models (GMMs) are among the most statistically mature methods for clustering data. Intrusion detection can be divided into anomaly detection and misuse detection. The misuse detection model collects behavioral features of non-normal operation and establishes a related feature library. In the existing anomaly-based Intrusion Detection System, the work is based on the number of attacks on the network and uses decision tree analysis for rule matching and grading. We propose an IDS approach that uses both signature-based and anomaly-based identification schemes. We also propose a rule pruning scheme with GMM (Gaussian Mixture Model), which facilitates efficient handling of a large number of rules, and we plan to compare the performance of the IDS on different models. The Dimension Reduction focuses on using information obtained from the KDD Cup 99 data set for the selection of attributes to identify the type of attacks. The dimensionality reduction is performed from 41 attributes to 14 and 7 attributes based on the Best First Search method, and then the two classifying algorithms ID3 and J48 are applied.

Keywords
Intrusion detection, reliable networks, malicious routers, internet dependability, tolerance, ID3, KDD, IDS, Dimensionality Reduction, NIDS.

1. INTRODUCTION

An IDS is concerned with the detection of hostile actions. This network security tool uses either of two main techniques. The first one, anomaly detection, explores issues in intrusion detection associated with deviations from normal system or user behavior. The second, signature detection, compares observed attack patterns against known intrusion signatures. Both methods have their distinct advantages and disadvantages as well as suitable application areas of intrusion detection. Data are grouped using the rule pruning scheme with GMM (Gaussian Mixture Model), which facilitates efficient handling of a large number of rules.

The issue of web server safety consists of two parts: one is transmission security, including protection against eavesdropping and data integrity; the other is the security of the web server side and the client side themselves. The former can be enhanced by a variety of security protocols. The latter, however, requires precautions in the form of firewalls and intrusion detection techniques. Generally speaking, a firewall's anti-attack capability is powerful, but not invulnerable. Intrusion detection techniques need to be applied to protect the web server because merely relying on the firewall is not enough.

Host intrusion detection refers to the class of intrusion detection systems that reside on and monitor an individual host machine. A network intrusion detection system (NIDS) monitors the packets that traverse a given network link. Network data has a variety of characteristics that are available for a NIDS to monitor: most operate by examining the IP and transport layer headers of individual packets, the content of these packets, or some combination thereof.

In this paper a data mining classification algorithm is used with the concept of Dimension Reduction. Dimension Reduction is applied using Best First Search, which reduces the feature selection from 41 attributes to 14 and 7 potential attributes for classification. The proposed approach focuses on using information obtained from the KDD Cup 99 data set for the selection of attributes to identify the type of attack and then compares the performance of ID3 with J48 on a randomly selected initial dataset with the reduced dimensionality. Furthermore, the results indicate that our approach provides more accurate results compared to the purely random one in a reasonable amount of time.

2. ANALYSIS TECHNIQUE

Misuse Detection: The essence of misuse detection centers around using an expert system to identify intrusions based on a predetermined knowledge base. As a result, misuse systems are capable of attaining high levels of accuracy in identifying even very subtle intrusions that are represented in their expert knowledge base. These techniques are able to automatically retrain intrusion detection models on different input data that include new types of attacks, as long as they have been labeled appropriately. Their disadvantage is that they cannot detect unknown intrusions and they rely on signatures extracted by human experts. This method uses specifically known patterns of unauthorized behavior to predict and detect subsequent similar attempts. These specific patterns are called signatures.
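As a rough illustration of misuse detection, the sketch below matches a packet payload against a small knowledge base of byte patterns. The `Signature` class, the two example patterns and the function name are assumptions made for illustration only; they are not taken from the paper or from any real rule set.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Signature:
    name: str       # hypothetical label for the attack
    pattern: bytes  # byte pattern that identifies the attack


# Hypothetical knowledge base; a real IDS relies on many expert-written entries.
KNOWLEDGE_BASE = [
    Signature("directory-traversal", b"../"),
    Signature("sql-injection", b"' OR '1'='1"),
]


def match_signatures(payload: bytes, signatures=KNOWLEDGE_BASE) -> List[str]:
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [s.name for s in signatures if s.pattern in payload]


# A known attack pattern is reported by name.
known = match_signatures(b"GET /../../etc/passwd HTTP/1.1")
# A URL-encoded variant of the same attack is not in the knowledge base.
unknown = match_signatures(b"GET /index.html?x=%2e%2e%2f HTTP/1.1")
```

The second call returns an empty list, which mirrors the disadvantage stated above: an intrusion that is not represented in the knowledge base goes undetected.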
Anomaly Detection
Anomaly detection is concerned with identifying events that appear to be anomalous with respect to normal system behavior. Designed to uncover abnormal patterns of behavior, the IDS establishes a baseline of normal usage patterns, and anything that widely deviates from it gets flagged as a possible intrusion. Thus these techniques identify new types of intrusion as deviations from normal usage. It is an extremely powerful and novel tool, but a potential drawback is the high false alarm rate; that is, previously unseen (yet legitimate) system behaviours may also be recognized as anomalies, and hence flagged as potential intrusions. For example, if a user in the graphics department suddenly starts accessing accounting programs or compiling code, the system can properly alert its administrators.
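The baseline idea can be sketched with a single numeric feature. The example below assumes a hypothetical "logins per hour" feature and a simple rule that flags values more than k standard deviations from the mean of normal usage; the paper does not prescribe this particular rule.

```python
from statistics import mean, stdev


def fit_baseline(normal_values):
    """Summarise normal usage by its mean and sample standard deviation."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, baseline, k=3.0):
    """Flag a value that lies more than k standard deviations from the mean."""
    mu, sigma = baseline
    return abs(value - mu) > k * sigma


# Hypothetical feature: logins per hour observed during normal operation.
normal_logins = [4, 5, 6, 5, 4, 6, 5, 5]
baseline = fit_baseline(normal_logins)

# With k = 3 only the extreme value 40 is flagged.
flags = [is_anomalous(v, baseline) for v in (5, 7, 40)]
# With a tighter threshold (k = 2) the unusual but plausible value 7 is flagged too.
tight_flag = is_anomalous(7, baseline, k=2.0)
```

Lowering `k` makes the detector more sensitive, but it also starts flagging legitimate yet unusual behaviour, which is the false alarm problem described above.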
In a network-based system, or NIDS, every individual packet flowing through a network is analyzed. The NIDS can detect malicious packets that are designed to be overlooked by a firewall's simplistic filtering rules. In a host-based system, the IDS examines the activity on each individual computer or host.
A wide variety of techniques including neural networks, decision tree approaches and hidden Markov models have been explored as different ways to cluster the data for rule creation. Each technique has its own pros and cons. The hidden Markov model is slow: a full search on a database of 400,000 sequences can take 15 hours. The decision tree approach is unstable when handling large volumes of data.

Data Collection Issues: For accurate intrusion detection, we must have reliable and complete data about the target system's activities. Reliable data collection is a complex issue in itself. Most operating systems offer some form of auditing that provides an operations log for different users. These logs might be limited to the security-relevant events (such as failed login attempts) or they might offer a complete report on every system call invoked by every process. Similarly, routers and firewalls provide event logs for network activity. These logs might contain simple information, such as network connection openings and closings, or a complete record of every packet that appeared on the wire.

The amount of system activity information a system collects is a trade-off between overhead and effectiveness. A system that records every action in detail could have substantially degraded performance and require enormous disk storage. For example, collecting a complete log of a 100-Mbit Ethernet link's network packets could require hundreds of Gbytes per day.

3. OVERVIEW OF GAUSSIAN MIXTURE MODEL
Mixture models are a type of density model that comprise a number of component functions, usually Gaussian. The distribution of feature vectors is extracted from packets in the network. A Gaussian Mixture Model (GMM) is used to construct a Bayesian classification procedure on the observations and leads to the system behavior model. The parameters of the mixture model are estimated by the Expectation Maximization (EM) algorithm.

Fig. 1 Overall system Architecture

Reducing The Data Features For Intrusion Detection Systems Using GMM
Current intrusion detection systems aimed at web server attacks all adopt the rule-based method, like the well-known intrusion detection system Snort 2.0, whose detection rules are written after features are refined from every intrusion behavior. Thus a rules library is formed. The captured data packets are then matched against the rules library. If a match succeeds, the behavior is regarded as an intrusion.

Since the amount of audit data that an IDS needs to verify is very large even for a small network, rule matching is difficult even with computer assistance because extraneous features can make it harder to detect suspicious behavior patterns. Complex relationships exist between the features, which are difficult for humans to discover. An IDS must therefore reduce the amount of data to be processed. This is very important if real-time detection is desired. Reduction can occur in one of several ways. Data that is not considered useful can be filtered, leaving only the potentially interesting data. Data can be grouped or clustered to reveal hidden patterns; by storing the characteristics of the clusters instead of the data, overhead can be reduced. Finally, some data sources can be eliminated using feature selection.

4. IMPROVING THE RULE MATCHING SPEED BY GMM
Gaussian Mixture Models (GMMs) are among the most statistically mature methods for clustering. They take a model-based approach to clustering problems, which consists in using certain models for clusters and attempting to optimize the fit between the data and the model. In practice, each cluster can be mathematically represented by a parametric distribution, like a Gaussian. The entire data set is therefore modeled by a mixture of these distributions. An individual distribution used to model a specific cluster is often referred to as a component distribution.
A mixture model with high likelihood tends to have component distributions with high “peaks”, and the mixture model “covers” the data well. The main advantages of model-based clustering are:
• well-studied statistical inference techniques are available;
• flexibility in choosing the component distribution;
• a density estimation is obtained for each cluster;
• a “soft” classification is available.
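To make the "soft" classification concrete, the following sketch computes, for a one-dimensional two-component mixture, the posterior probability that a point belongs to each component. The mixture parameters and the helper names are hypothetical, and the one-dimensional case is an assumption chosen to keep the example short.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def responsibilities(x, weights, mus, sigmas):
    """Posterior probability of each component given x (the soft classification)."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
    total = sum(joint)  # mixture density at x
    return [p / total for p in joint]


# Hypothetical two-cluster mixture, e.g. "normal" and "suspicious" traffic scores.
weights, mus, sigmas = [0.7, 0.3], [0.0, 4.0], [1.0, 1.0]

# Halfway between the two means the assignment falls back to the mixing weights.
soft_mid = responsibilities(2.0, weights, mus, sigmas)
# Close to the first mean the point belongs almost entirely to the first cluster.
soft_near_first = responsibilities(0.0, weights, mus, sigmas)
```

Instead of a hard cluster label, each point receives a probability for every cluster, and `total` inside `responsibilities` is the density estimate of the whole mixture at that point.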
Mixture of Gaussians
The most widely used clustering method of this kind is the one based on learning a mixture of Gaussians: we can consider clusters as Gaussian distributions centred on their barycentres, as shown in Fig. 2, where the grey circle represents the first variance of the distribution.

Fig. 2 GMM Cluster

The algorithm first chooses a component (a Gaussian) $j$ at random with probability $\pi_j$, and it then samples a point from $\varphi(x; \theta_j)$.

Let's suppose we have the samples $x_1, x_2, \ldots, x_n$.

We can obtain the likelihood of a sample $x_i$ under component $j$: $\varphi(x_i; \theta_j)$.

What we really want to maximise is $f_k(x_i)$ (the probability of a datum given the centres of the Gaussians).

$$f_k(x_i) = \sum_{j=1}^{k} \pi_j \, \varphi(x_i; \theta_j)$$

is the base to write the likelihood function:

$$L = \prod_{i=1}^{n} f_k(x_i)$$

Now we should maximise the likelihood function by calculating $\partial L / \partial \theta_j = 0$, but it would be too difficult. That's why we use a simplified algorithm called Expectation-Maximization.

Fig. 3 Overview of the structure of GMM

In the Expectation Maximization algorithm the number of mixture components $k$ is decided beforehand. The algorithm updates the parameters of a given k-component mixture with respect to the data set $X_n = \{x_1, \ldots, x_n\}$ such that the likelihood of $X_n$ is never smaller under the new mixture. The estimates are obtained by iterating the following equations for all components $j \in \{1, \ldots, k\}$:

$$P(j \mid x_i) = \frac{\pi_j \, \varphi(x_i; \theta_j)}{f_k(x_i)}$$

$$\pi_j = \frac{1}{n} \sum_{i=1}^{n} P(j \mid x_i)$$

$$\mu_j = \frac{\sum_{i=1}^{n} P(j \mid x_i) \, x_i}{n \pi_j}$$

$$\Sigma_j = \frac{\sum_{i=1}^{n} P(j \mid x_i) (x_i - \mu_j)(x_i - \mu_j)^T}{n \pi_j}$$

where $\theta_j$ is the model of component $j$ with mean $\mu_j$ and covariance matrix $\Sigma_j$, $\pi_j$ is the mixing weight, and $\varphi(x; \theta_j)$ is the mixture component.

The complete Gaussian mixture model is parameterized by the mean vectors, covariance matrices and mixture weights from all component densities.

Snort and Snort Rules
SNORT is one of the most popular NIDS. SNORT is Open Source, which means that the original program source code is available to anyone at no charge, and this has allowed many people to contribute to and analyse the program's construction. SNORT uses the most common open-source licence, known as the GNU General Public License.
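Returning to the EM update equations for the mixing weight, mean and covariance given earlier in this section, the sketch below applies them to a one-dimensional, two-component mixture. The data, the initial guess, the function names and the small variance floor are all assumptions added for illustration; the scalar variance stands in for the covariance matrix of the general case.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_step(xs, weights, mus, variances):
    """One EM iteration for a 1-D k-component mixture; returns updated parameters."""
    n, k = len(xs), len(weights)
    # E-step: P(j | x_i) = pi_j * phi(x_i; theta_j) / f_k(x_i)
    post = []
    for x in xs:
        joint = [weights[j] * gaussian_pdf(x, mus[j], variances[j]) for j in range(k)]
        f_k = sum(joint)
        post.append([p / f_k for p in joint])
    # M-step: re-estimate the mixing weight, mean and variance of each component
    new_w, new_mu, new_var = [], [], []
    for j in range(k):
        resp = sum(post[i][j] for i in range(n))  # equals n * pi_j
        mu_j = sum(post[i][j] * xs[i] for i in range(n)) / resp
        var_j = sum(post[i][j] * (xs[i] - mu_j) ** 2 for i in range(n)) / resp
        new_w.append(resp / n)
        new_mu.append(mu_j)
        new_var.append(max(var_j, 1e-6))  # floor is an added safeguard, not in the equations
    return new_w, new_mu, new_var


def log_likelihood(xs, weights, mus, variances):
    """Log of the likelihood of the data set under the current mixture."""
    return sum(
        math.log(sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, mus, variances)))
        for x in xs
    )


# Hypothetical 1-D data with two visible groups.
xs = [0.8, 1.0, 1.2, 0.9, 1.1, 4.9, 5.0, 5.2, 5.1, 4.8]
params = ([0.5, 0.5], [0.0, 6.0], [1.0, 1.0])  # initial guess, k = 2 fixed beforehand
history = [log_likelihood(xs, *params)]
for _ in range(20):
    params = em_step(xs, *params)
    history.append(log_likelihood(xs, *params))
weights, mus, variances = params
```

The `history` list records the log-likelihood after each iteration, which gives a simple way to check the property stated for EM: the likelihood of the data set should not decrease from one iteration to the next.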
Rule generalization
We propose to generate new rules by generalising SNORT rules. Given an Internet packet that contains a variation of a known attack, there should be some automated way to identify the packet as nearly matching a NIDS attack signature. If a particular statement has a set of conditions against it, an item may match some of the conditions.

Whereas Boolean logic would give the value false to the query 'does this item match the conditions', our logic could allow the item to match to a lesser extent rather than not at all. This principle can be applied when comparing an Internet packet against a set of conditions in a SNORT rule. Our hypothesis is that if all but one of the conditions are met, an alert with a lower priority can be issued against the Internet packet, as the packet may contain a variation of a known attack. In our implementation, generalisation in the case of matching network packets against rules involves allowing a packet to generate an alert if:
• the conditions in the rule do not all match, yet most of them do;
• the only conditions that do not match exactly nearly match.

As an example, assume a certain rule states that an alert should be generated if a packet is a particular length, on a particular port and contains a certain bit pattern. Using our generalisation, a packet matching those criteria, except perhaps on a different port, or with a slightly different bit pattern, would still count as matching, and a (modified) alert would be generated.

Fig. Block diagram of a complete network intrusion detection system consisting of Snort, MySQL, Apache, ACID, PHP, GD Library and PHPLOT

5. PROPOSED SYSTEM
The proposed system uses a hardware-implementable pattern matching algorithm for content filtering applications, which is scalable in terms of speed, the number of patterns and the pattern length. The algorithm is based on a memory-efficient multihashing data structure called a Bloom filter, using embedded on-chip memory blocks in field programmable gate array/very large scale integration chips. In the proposed system we have detected anomalies from Internet connections, automated the generation of rules and signatures for the anomalies, detected unknown or new attacks, and reduced the false alarm rate. Models of network intrusion detection systems are an important component of the issue. An Intrusion Detection System based on the Back Propagation algorithm can promptly detect attacks, whether they are known or not. In this model, the Back Propagation algorithm is used to learn about the normal users' behavior and the abnormal users' behavior. Pattern matching runs at multigigabit per second speeds with a moderately small amount of embedded memory and a few megabytes of external memory.

Dimensionality Reduction Algorithm
Dimension Reduction techniques are proposed as a data pre-processing step. This process identifies a suitable low-dimensional representation of the original data. Reducing the dimensionality improves the computational efficiency and accuracy of the data analysis.

Steps:
• Select the dataset.
• Perform discretization for pre-processing the data.
• Apply the Best First Search algorithm to filter out redundant and superfluous attributes.
• Using the reduced attributes, apply the classification algorithms and compare their performance.
• Identify the best one.

The original dataset consists of 41 attributes and one class label. The attribute names are listed below.

(i) 41 Attributes: duration, protocol_type, service, flag, src_bytes, dst_bytes, land, wrong_fragment, urgent, hot, num_failed_logins, logged_in, num_compromised, root_shell, su_attempted, num_root, num_file_creations, srv_count, serror_rate, srv_serror_rate, rerror_rate, srv_rerror_rate, same_srv_rate, diff_srv_rate, srv_diff_host_rate, dst_host_count, dst_host_srv_count, dst_host_same_srv_rate, dst_host_diff_srv_rate, dst_host_same_src_port_rate, dst_host_srv_diff_host_rate, dst_host_serror_rate, dst_host_srv_serror_rate, dst_host_rerror_rate, dst_host_srv_rerror_rate.

Using the Best First Search method we obtained two sets of reduced dimensionality: 7 potential attributes and 14 potential attributes, which are listed in Tables 2 and 3 respectively.

(ii) 14 Attributes: duration, service, flag, src_bytes, dst_bytes, count, srv_count, serror_rate, rerror_rate, dst_host_same_srv_count, dst_host_srv_rate, dst_host_rerror_rate, dst_host_diff_srv_byte, dst_host_same_src_port_rate.

(iii) 7 Attributes: protocol_type, service, src_bytes, dst_bytes, count, diff_srv_rate, dst_host_srv_count.

Simulation Result
The Receiver Operating Characteristic (ROC) curve is usually used to measure the performance of a classification method. Here the ROC curve is a graphical plot of sensitivity against 1 − specificity for each attribute set.

Table 1. Sensitivity, Specificity And Accuracy Based On 41 Attribute Feature Selections

Algorithm   Sensitivity   Accuracy
ID3         96%           95%
J48         94.2%         97%
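For reference, the evaluation measures reported for ID3 and J48 follow directly from the counts of a confusion matrix. The sketch below uses the standard definitions; the function name is an assumption and the counts are hypothetical, not taken from the paper's experiments.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate: attacks correctly detected
    specificity = tn / (tn + fp)  # true negative rate: normal traffic correctly passed
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "false_positive_rate": 1.0 - specificity,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
```

Each point on an ROC curve corresponds to one such pair of sensitivity and false positive rate, obtained at a particular decision threshold.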
6. CONCLUSION
In order to protect a web server, the intrusion detection system is indispensable as a security tool. The GMM technique has been introduced for the classification of the rule set so as to improve on the traditional classification technique, reduce the matching time and eventually improve the detection efficiency. In this paper we proposed a novel method based on the Hierarchical Gaussian Mixture Model (HGMM) for the intrusion detection mechanism. HGMM is an effective model for detecting computer attacks of unknown patterns. The Expectation-Maximization algorithm is used to compute the parameters of a parametric mixture model distribution. If the threshold value is made too low, the IDS engine suffers from a high false alarm rate. Here, new scan detection techniques that have a much lower false alarm rate and much higher coverage than existing techniques are used to reduce the overall false alarm rate. Some of the methods used are filtering the unwanted packets and setting a medium level of threshold value.

Using Dimensionality Reduction, attacks are classified for three dimensionalities (41 attributes, 14 attributes and 7 attributes), and by applying the evaluation criteria the corresponding Specificity, Accuracy and Sensitivity are evaluated to obtain the respective true positive and false positive rates for both algorithms.

SNORT RESULTS:
