Building An Intrusion Detection System Using A Filter

This document proposes a mutual information based feature selection algorithm to build an intrusion detection system. Existing intrusion detection systems are ineffective at protecting against modern cyber attacks and cannot handle large datasets. The proposed algorithm analytically selects optimal features to classify network traffic data and handle linearly and nonlinearly dependent features. An intrusion detection system using the selected features and a least square support vector machine classifier is shown to achieve better accuracy and performance than state-of-the-art methods on evaluation datasets.

Uploaded by

Chetan Raju
Copyright
© All Rights Reserved

Building an intrusion detection system using a filter-based feature selection algorithm
ABSTRACT:
Redundant and irrelevant features in data have long been a problem in network traffic
classification. These features not only slow down classification but also prevent a classifier
from making accurate decisions, especially when coping with big data. In this paper, we
propose a mutual information based algorithm that analytically selects the optimal features
for classification. This mutual information based feature selection algorithm can handle both
linearly and nonlinearly dependent data features. Its effectiveness is evaluated on network
intrusion detection. An Intrusion Detection System (IDS), named Least Square Support Vector
Machine based IDS (LSSVM-IDS), is built using the features selected by the proposed feature
selection algorithm. The performance of LSSVM-IDS is evaluated on three intrusion detection
evaluation datasets, namely KDD Cup 99, NSL-KDD and Kyoto 2006+. The evaluation results show
that our feature selection algorithm contributes more critical features for LSSVM-IDS to
achieve better accuracy and lower computational cost compared with the state-of-the-art
methods.
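The quantity at the heart of the abstract, the mutual information between a feature and the class label, can be estimated from simple counts when both are discrete. The sketch below is only an illustration under that assumption; the class and method names are hypothetical, and the synopsis does not say how continuous features are estimated or discretised.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch (hypothetical names): mutual information of two discrete variables. */
final class MutualInformationSketch {

    /** Returns I(X;Y) in bits, estimated from the empirical joint distribution of x and y. */
    static double mutualInformation(int[] x, int[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("x and y must be non-empty and of equal length");
        }
        int n = x.length;
        Map<Integer, Integer> countX = new HashMap<>();
        Map<Integer, Integer> countY = new HashMap<>();
        Map<Long, Integer> countXY = new HashMap<>();
        for (int i = 0; i < n; i++) {
            countX.merge(x[i], 1, Integer::sum);
            countY.merge(y[i], 1, Integer::sum);
            // Pack the (x, y) pair into one long so it can serve as a map key.
            long key = ((long) x[i] << 32) | (y[i] & 0xffffffffL);
            countXY.merge(key, 1, Integer::sum);
        }
        double mi = 0.0;
        for (Map.Entry<Long, Integer> e : countXY.entrySet()) {
            int xv = (int) (e.getKey() >> 32);
            int yv = (int) (e.getKey() & 0xffffffffL);
            double pxy = e.getValue() / (double) n;
            double px = countX.get(xv) / (double) n;
            double py = countY.get(yv) / (double) n;
            // Sum p(x,y) * log2( p(x,y) / (p(x) p(y)) ) over observed pairs only.
            mi += pxy * (Math.log(pxy / (px * py)) / Math.log(2));
        }
        return mi;
    }

    public static void main(String[] args) {
        int[] x = {-1, 0, 1, -1, 0, 1};
        int[] xSquared = {1, 0, 1, 1, 0, 1}; // y = x^2: nonlinear dependence on x
        System.out.println(mutualInformation(x, xSquared));
    }
}
```

In the `main` example, `xSquared` has zero linear correlation with `x`, yet the estimate is about 0.918 bits, which illustrates why a mutual information criterion can pick up nonlinearly dependent features that a correlation-based filter would miss.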

EXISTING SYSTEM:
• A significant amount of research has been conducted to develop intelligent intrusion
detection techniques, which help achieve better network security. Bagged boosting based
on C5 decision trees and Kernel Miner are two of the earliest attempts to build intrusion
detection schemes.
• Mukkamala et al. investigated the possibility of combining various learning methods,
including Artificial Neural Networks (ANN), SVMs and Multivariate Adaptive Regression
Splines (MARS), to detect intrusions.
DISADVANTAGES OF EXISTING SYSTEM:
• Existing solutions remain incapable of fully protecting internet applications and computer
networks against threats from ever-advancing cyber attack techniques such as DoS attacks
and computer malware.
• Current network traffic data, which are often huge in size, present a major challenge to
IDSs. These "big data" slow down the entire detection process and may lead to
unsatisfactory classification accuracy because of the computational difficulty of handling
such data.
• Classifying a huge amount of data usually causes many mathematical difficulties, which
in turn lead to higher computational complexity.
• Large-scale datasets usually contain noisy, redundant, or uninformative features, which
present critical challenges to knowledge discovery and data modeling.
PROPOSED SYSTEM:
• We previously proposed a hybrid feature selection algorithm (HFSA), which consists of two
phases.
• The upper phase conducts a preliminary search to eliminate irrelevant and redundant
features from the original data. This lets the wrapper method (the lower phase) narrow its
search from the entire original feature space to the pre-selected features (the output of
the upper phase). The key contributions of this paper are listed as follows.
• This work proposes a new filter-based feature selection method, in which a theoretical
analysis of mutual information is introduced to evaluate the dependence between features
and output classes.
• The most relevant features are retained and used to construct classifiers for the
respective classes. As an enhancement of Mutual Information Feature Selection (MIFS) and
Modified Mutual Information based Feature Selection (MMIFS), the proposed feature selection
method has no free parameter of the kind used in MIFS and MMIFS. Its performance therefore
cannot be degraded by an inappropriate value assigned to such a parameter. Moreover, the
proposed method is applicable to various domains and is more efficient than HFSA, which
relies on a computationally expensive wrapper-based feature selection mechanism.
• We conduct complete experiments on two well-known IDS datasets in addition to the KDD Cup
99 dataset. This is important in evaluating the performance of an IDS because the KDD
dataset is outdated and does not contain most novel attack patterns. In addition, these
datasets are frequently used in the literature to evaluate the performance of IDSs.
Moreover, they have various sample sizes and different numbers of features, so they pose
more challenges for comprehensively testing feature selection algorithms.
• Unlike the earlier detection framework, which was designed only for binary
classification, our proposed framework also considers multiclass classification problems.
This is to show the effectiveness and the feasibility of the proposed method.
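The filter idea described above (keep features that are relevant to the class, drop those that are irrelevant or redundant, and do so without a tunable parameter) can be sketched as a greedy loop over precomputed mutual information values. The scoring rule used here, relevance minus the average redundancy with the features already chosen, is an mRMR-style stand-in chosen for illustration; it is an assumption and not necessarily the exact criterion of the proposed method. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch (hypothetical names): parameter-free greedy filter selection. */
final class GreedyFilterSelectionSketch {

    /**
     * relevance[i]     : mutual information between feature i and the class label.
     * redundancy[i][j] : mutual information between features i and j.
     * Returns indices of selected features in the order they were chosen.
     */
    static List<Integer> select(double[] relevance, double[][] redundancy, int maxFeatures) {
        int n = relevance.length;
        List<Integer> selected = new ArrayList<>();
        boolean[] used = new boolean[n];
        while (selected.size() < maxFeatures) {
            int best = -1;
            double bestScore = 0.0; // a candidate must score strictly above zero to be kept
            for (int i = 0; i < n; i++) {
                if (used[i]) {
                    continue;
                }
                // Assumed criterion: average redundancy against the selected set.
                double penalty = 0.0;
                for (int s : selected) {
                    penalty += redundancy[i][s];
                }
                if (!selected.isEmpty()) {
                    penalty /= selected.size();
                }
                double score = relevance[i] - penalty;
                if (score > bestScore) {
                    bestScore = score;
                    best = i;
                }
            }
            if (best < 0) {
                break; // every remaining feature is irrelevant or too redundant
            }
            used[best] = true;
            selected.add(best);
        }
        return selected;
    }

    public static void main(String[] args) {
        double[] relevance = {0.9, 0.85, 0.3, 0.0};
        double[][] redundancy = {
            {0.0, 0.8, 0.05, 0.0},
            {0.8, 0.0, 0.05, 0.0},
            {0.05, 0.05, 0.0, 0.0},
            {0.0, 0.0, 0.0, 0.0}
        };
        System.out.println(select(relevance, redundancy, 4)); // prints [0, 2, 1]
    }
}
```

In this made-up example, feature 1 is almost a copy of feature 0, so it is picked after the less relevant but more novel feature 2, and feature 3 is never selected because it carries no information about the class.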
ADVANTAGES OF PROPOSED SYSTEM:
• FMIFS, the proposed feature selection method, is an improvement over MIFS and MMIFS.
• FMIFS suggests a modification to Battiti's algorithm to reduce the redundancy among
features.
• FMIFS eliminates the redundancy parameter required in MIFS and MMIFS.
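To see why a redundancy parameter is a liability, it helps to look at how it enters the score. The sketch below assumes the commonly cited form of Battiti's MIFS criterion, in which a candidate's relevance is reduced by beta times the sum of its mutual information with the already selected features; the synopsis itself does not spell this formula out, and the names and numbers are hypothetical.

```java
/** Illustrative sketch (hypothetical names): sensitivity of a MIFS-style score to beta. */
final class MifsBetaSketch {

    /** Assumed MIFS-style score: relevance minus beta times the summed redundancy. */
    static double mifsScore(double relevance, double[] redundancyToSelected, double beta) {
        double sum = 0.0;
        for (double r : redundancyToSelected) {
            sum += r;
        }
        return relevance - beta * sum;
    }

    /** Returns the index of the candidate with the highest score for the given beta. */
    static int bestCandidate(double[] relevance, double[][] redundancyToSelected, double beta) {
        int best = 0;
        for (int i = 1; i < relevance.length; i++) {
            double current = mifsScore(relevance[i], redundancyToSelected[i], beta);
            double leader = mifsScore(relevance[best], redundancyToSelected[best], beta);
            if (current > leader) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Two candidates, one feature already selected.
        double[] relevance = {0.85, 0.3};
        double[][] redundancyToSelected = {{0.8}, {0.05}};
        System.out.println(bestCandidate(relevance, redundancyToSelected, 0.1)); // prints 0
        System.out.println(bestCandidate(relevance, redundancyToSelected, 1.0)); // prints 1
    }
}
```

With the same data, a small beta keeps the highly relevant but redundant candidate, while a large beta prefers the less relevant but more novel one. The selected subset therefore depends on a value the user has to guess, which is the dependence that a parameter-free method removes.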
SYSTEM ARCHITECTURE:
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
• System : Pentium Dual Core
• Hard Disk : 120 GB
• Monitor : 15" LED
• Input Devices : Keyboard, Mouse
• RAM : 1 GB
SOFTWARE REQUIREMENTS:
• Operating System : Windows 7
• Coding Language : Java/J2EE
• Tool : NetBeans 7.2.1
• Database : MySQL