Machine Learning Enabled Framework For Classification and Detection of Intrusion in MANET
Abstract: An intrusion detection system (IDS) is necessary to ensure the security of an existing network and of the user data travelling across it. Because of the rapid development of network technologies, the identification of threats based on the analysis of contextual information may be application- and network-specific. A remedy to this challenge may be found in a hybrid IDS that uses machine learning algorithms for optimization. Many attacks are possible in a network, but our study is limited to a few of them. In one popular attack, the denial-of-service (DoS) attack, the approach is to overwhelm the victim's MANET with a very large number of packets. This kind of attack can create substantial problems for networks of any scale. When analyzing high-performance hybrid intrusion detection systems, one of the most critical challenges is processing vast volumes of information with a large number of features. An excessive number of features may slow down training and testing, increase resource usage, and decrease detection accuracy, which in turn inhibits malicious pattern recognition. This article provides a detailed description of a framework for an intrusion detection system that makes use of machine learning. It is therefore essential to bring the size of the benchmark dataset down to a more manageable level and to remove unnecessary features. Data sets may be cleaned through preprocessing, which involves removing out-of-range values, implausible combinations of data, and missing information. To improve the accuracy of classifiers, it is common practice to apply feature selection (FS) methods to rid a dataset of features that are extraneous or unneeded. Three machine learning techniques, namely SVM, Naïve Bayes and ID3, were considered for a comparative study. It was observed that SVM achieves higher accuracy in intrusion data classification and detection.

Index Terms: Intrusion Detection, Machine Learning, SVM, Accuracy

I. INTRODUCTION
Protecting sensitive data, ensuring the continued dependability of essential systems, and preventing unauthorized access to systems and data are the primary objectives of computer security [1]. An unwarranted entry into a computer system, network, or other service is referred to as an intrusion [2]. An intrusion might refer to any concerted attempt to achieve such access. An Intrusion Detection System, often known as an IDS, is primarily used to gather and analyze data regarding security events that occur in computer systems and networks. Its subsequent purpose is either to prevent these events from happening or to report them to the administrator of the system. As a result of the increasing number of attacks, users' mistrust in the Internet has increased. Attacks that cause denial of service (DoS) are a major violation of security.

An IDS is a type of network monitoring software that monitors and analyses data from an organization's computer networks and traffic in order to spot any signs of malicious intrusion, regardless of whether it originates from outside or from within the organization. An IDS is often compared to a burglar alarm. For example, a potential thief is discouraged from stealing a car by the locking system. However, if the lock on the vehicle is broken and the thief still attempts to take it, the burglar alarm will go off and the owner will be notified. In a similar manner, an IDS serves as a warning mechanism in a computer system or network, notifying and reporting on any potentially hazardous behavior.

Attackers are always coming up with new ways to infiltrate the infrastructure of a system and carry out unwanted operations. Because of its immense scale, its complexity, and the wide diversity of operating systems used by end hosts, the Internet is especially susceptible to security issues. In light of these challenges, current best practices for the Internet depend on data that shows how to recognize harmful patterns in attacks, locate security flaws, and fix them as rapidly as is practically possible. The number of false alarms produced by deployed intrusion detection systems has risen. Streamlining the CI components of the IDS might help bring these numbers down. The researchers used a broad range of CI techniques and assessed how well they performed on typical datasets.

Intrusion detection systems use a wide variety of techniques to search both incoming and outgoing network data for abnormalities and then filter
those anomalies out. A database, an analysis engine, and a response manager are the standard components of an intrusion detection system [4].

Figure 1: Intrusion Detection with Machine Learning

The database, which is also often referred to as an event driver, is the central component of any intrusion detection system. There are four major groups of information sources: host-based, network-based, application-based, and target-based monitors.

The analysis engine, which is responsible for analyzing potential threats, is the second component of an intrusion detection system. In this stage, information is retrieved from the data source and analyzed to see whether it contains any indications of intrusion or other breaches of the relevant rules. Misuse (signature-based) detection and anomaly (statistical) detection are the common approaches used in IDS data analysis.

Third, a response manager must be included in an intrusion detection system. The role of the response manager is contingent on the discovery of possible flaws in the system, which then triggers an alert to be sent to a person or another system as a response. A denial-of-service (DoS) attack is one in which a large number of messages that are not relevant to the victim resource are generated and sent to that resource, overloading it to the point where it can no longer serve users who are actually authorized to use it [3]. This makes networks unreliable and inaccessible, and interferes with data transmission and the formation of connections. Figure 1 shows a simple view of how an IDS works with a machine learning approach.

II. RELATED METHODOLOGY
A good amount of work has been done in the area of IDS. To increase the efficiency of existing IDSs, machine learning approaches have been integrated with them. Many authors have presented work in this direction and shown ways for other researchers to extend it. Such work includes the integration of machine learning techniques such as neural networks, fuzzy logic, genetic algorithms, and many others.

The authors of [23] and [24] emphasize devising a methodology to safeguard against nodes compromised by black hole and other attacks. The paper [30] by Padmalaya Nayak also refers to the impact of black hole and sinkhole attacks on WSNs. The paper [24] discusses a strategy for detecting intrusion using machine learning techniques such as SVM, random forest, and naive Bayes, in which trace files generated with the NS2 simulator and the DSR routing algorithm were used as the data set. Selected features were taken and the intrusion detection performance was recorded. It was also noted that which of the algorithms under consideration performed best depended on the parameter considered.

In [25] the authors present another novel way of implementing an IDS, in which malicious or suspicious activity is flagged on the basis of a minimum hop count or an early reply. Such a node is isolated so that it does not hamper the activities of the network. The paper uses clusters for route selection and packet formation; a node detected as malicious is removed from its cluster and treated as an independent node so that it does not participate in forwarding RREP packets, and the system is thereby protected from the malicious activity.

The authors of [18] and [26] emphasize the importance of machine learning in IDSs. They derive a new mathematical model for intrusion detection. The model is trained and automatically assigns a threshold value using the selected features of the training set, and this threshold is valid only for a specific time slot, which accounts for the dynamic nature of the proposed model. Each time slot is treated as a single iteration, and each subsequent iteration obtains an updated threshold value, which yields optimal accuracy. The proposed model considers classifiers such as SVM, nearest neighbor, and decision tree. The authors were able to estimate a minimum time overhead for detecting abnormal behavior in the system. In addition, the authors of [29] propose a mobile agent that helps to extend the overall lifetime of a network. The same agent also helps in collecting data from different nodes while aiming for minimum energy usage at the node level.

The researchers in [27] indicate that machine learning techniques such as Support Vector Machine, J48, Random Forest, and Naïve Bayes can be used for both binary and multi-class classification in a network-based intrusion detection system. The various techniques of data mining helped to extract
the features that contributed most and to correlate them with the features of the intrusion detection system. The proposed model enhances the feature selection process by separating the pertinent features from the unrelated ones in the KDD data set, emphasizing the significance of feature selection. Correlation was used to take conditionally related features into account. The researchers used multi-class classification techniques and provided insight by tabulating evaluation metrics for the set of attacks. As many as five classes were identified for approximately 20 attacks applicable to different systems. Of the machine learning algorithms under consideration, Random Forest performed best.

The Decision Tree approach to intrusion detection is discussed in [28], where a Classification and Regression Tree is used for detection. Four classes of attack are identified with this approach: U2R (unauthorized access to local superuser privileges), R2L (unauthorized access from a remote machine), DoS (denial of service), and probing (surveillance and other probing). The dataset used is from the KDD data. The algorithm was able to classify 27 categories of attacks under these four classes.

III. PROPOSED METHODOLOGY
The paper proposes intrusion detection using machine learning techniques. Three algorithms are considered: Support Vector Machine, Naïve Bayes, and ID3. The Support Vector Machine relies on intuitive geometric principles to achieve optimal classification by linearly separating the training data, and it incorporates polynomial or radial basis functions to turn the linear algorithm into a nonlinear one through a feature space mapping.

The Naïve Bayes algorithm is renowned for its simplicity, elegance, and robustness in constructing a classifier. Instead of calculating the entire covariance matrix, it uses a limited amount of training data to estimate the mean and variance for each class it examines. Moreover, the assumption of attribute independence within a given class in Naïve Bayes allows for some flexibility to enhance its classification accuracy. Another algorithm considered here is ID3, an inductive learning method that follows a top-down, greedy approach based on information entropy theory. By selecting the attribute with the highest information gain as the splitting attribute, it iteratively expands the branches and gradually constructs the decision tree.

The steps of the proposed model can be summarized as follows:
• The data set is loaded and preprocessing is performed for any missing values. There are various attributes, some of which may be ignored for good classification.
• Feature selection is one of the most important steps, as selecting the most appropriate features on which training and testing are performed helps to optimize the work.
• Test and evaluate the model with the KDD dataset.

IV. RESULT FINDING AND ANALYSIS
The data set used here is a mix of NSL-KDD and the implementation results of a few attacks, such as black hole, grey hole, and wormhole attacks. It comprises more than 125,000 instances with more than 40 distinct features. Approximately seventy percent of the NSL-KDD dataset was used as training data (25192 records), and the remaining approximately thirty percent as testing data (100781 records). A denial-of-service attack aims to disrupt or restrict the availability of services for legitimate users. It includes attack techniques such as smurf, teardrop, SYN flooding, and Neptune attacks. These attacks overload a target system with a flood of requests or malicious traffic, making it unresponsive. Other attacks gain control over a user's machine by exploiting some vulnerability. Attacks such as U2R typically involve attackers escalating their own privileges to root level and harming the system.

The attacks are classified with the help of a classifier system. The accuracy of a classifier, understood as the percentage of correct classifications it produces, is the indicator used most often to determine how successful it is. Several important metrics are useful in comparing the different classifiers that are available; they are given in equations (1) to (4). Although such an evaluation of classifiers is essential, other factors are often ignored. When a classifier is tested on a collection of data, four outcomes are possible: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). When determining the usefulness of a classifier, it is essential to be as objective as possible, and the aforementioned metrics are of great assistance in this. The metrics are as follows.
● Accuracy
Accuracy refers to the proportion of predictions that are correctly predicted or classified.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

● Sensitivity
Sensitivity, also known as recall, refers to the proportion of actual positive cases that are identified as positive.

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{2}$$
● Correctness
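As a minimal sketch of equations (1) and (2), the snippet below computes accuracy and sensitivity from the four confusion counts. It assumes binary labels in which the string "attack" marks the positive class; the label names, the sample data, and the function names are illustrative and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred, positive="attack"):
    """Return (TP, TN, FP, FN) for a binary problem.

    `positive` is the label treated as the positive class (an assumption
    made for this sketch; any other label counts as negative).
    """
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # attack correctly flagged
            else:
                fp += 1  # normal traffic flagged as attack
        else:
            if actual == positive:
                fn += 1  # attack missed
            else:
                tn += 1  # normal traffic correctly passed
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Equation (1): share of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def sensitivity(tp, fn):
    """Equation (2): share of actual positives that are detected."""
    positives = tp + fn
    return tp / positives if positives else 0.0


# Hypothetical labels, for illustration only.
y_true = ["attack", "attack", "normal", "normal", "attack"]
y_pred = ["attack", "normal", "normal", "attack", "attack"]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(tp, tn, fp, fn))
print("sensitivity:", sensitivity(tp, fn))
```

Guarding against a zero denominator is a choice made for this sketch: a test split with no positive records would otherwise raise a division error.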
V. CONCLUDING REMARKS AND FUTURE ENHANCEMENTS
Since the wireless medium of data transfer is open and it is impossible to include heavyweight but robust encryption in the protocols, attackers are able to take advantage of certain vulnerabilities present in WSN routing protocols. These vulnerabilities can be exploited by malicious users and open the way for many attacks, including black hole, wormhole, man-in-the-middle, denial of service, Sybil, and power depletion attacks, amongst others. Many intrusion detection systems (IDSs) have been built on top of conventional routing protocols in order to defend networks from the typical dangers described above. At the moment, there is a great deal of interest in intrusion detection systems, not just in the research community but also inside organizations. It has also been noticed that, since its inception, intrusion detection has consistently served as a vital component in the establishment of comprehensive security policies. Because of this, technology for preventing intrusions into computer networks has been created to compensate for the weaknesses of more traditional approaches to the security of computer networks.

It was observed that the data had to be ported to a centralized server in order to work on the training and testing data sets. A suggested future enhancement is to use a cloud-based or distributed server to store the training and testing data. However, the overheads of this approach should first be analyzed, as distributed and cloud-based approaches pose many challenges. A hybrid approach combining the above approaches can also be considered. Sub-classification of the attacks can also be performed, and other machine learning approaches can be analyzed for better results.

REFERENCES
[1] Othman, S.M.; Alsohybe, N.T.; Ba-Alwi, F.M.; Zahary, A.T. Survey on Intrusion Detection System Types. Int. J. Cyber Secur. Digit. Forensics 2018, 7, 444–463.
[2] , B.B.; Miani, R.S.; Kawakani, C.T.; de Alvarenga, S.C. A survey of intrusion detection in Internet of Things. J. Netw. Comput. Appl. 2017, 84, 25–37.
[3] Alexandros G. Fragkiadakis, Vasilios A. Siris, Nikolaos E. Petroulakis and Apostolos P. Traganitis, Anomaly-based intrusion detection of jamming attacks, Local versus collaborative detection, Published online in Wiley Online Library (wileyonlinelibrary.com), DOI: 10.1002/wcm.2341, February 2013.
[4] S. Yan and Y. Chung, “Improved ad hoc on-demand distance vector routing (AODV) protocol based on blockchain node detection in ad hoc networks,” International Journal of Internet, Broadcasting and Communication, vol. 12, no. 3, pp. 46–55, 2020.
[5] A. K. Biswas and M. Dasgupta, “AODV-DSR hybrid reactive routing protocol and its generalization for mobile ad-hoc networks,” in 2019 3rd International Conference on Electronics, Materials Engineering & Nano-Technology (IEMENTech), pp. 1–5, IEEE, 2019.
[6] K. L. Arega, G. Raga, and R. Bareto, “Survey on performance analysis of AODV, DSR and DSDV in MANET,” Computer Engineering and Intelligent Systems, vol. 11, no. 3, pp. 23–32, 2020.
[7] G. K. Wadhwani, S. K. Khatri, and S. K. Mutto, “Trust framework for attack resilience in MANET using AODV,” Journal of Discrete Mathematical Sciences and Cryptography, vol. 23, no. 1, pp. 209–220, 2020.
[8] S. Shrestha, R. Baidya, B. Giri, and A. Thapa, “Securing blackhole attacks in MANETs using modified sequence number in AODV routing protocol,” in 2020 8th International Electrical Engineering Congress (iEECON), pp. 1–4, IEEE, 2020.
[9] S. Hossain, M. S. Hussain, R. R. Ema, S. Dutta, S. Sarkar, and T. Islam, “Detecting black hole attack by selecting appropriate routes for authentic message passing using SHA-3 and Diffie-Hellman algorithm in AODV and AOMDV routing protocols in MANET,” in 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pp. 1–7, IEEE, 2019.
[10] A. Patel and A. Jain, “A study of various Black Hole Attack techniques and IDS in MANET,” International Journal of Advanced Computer Technology, vol. 4, no. 3, pp. 58–62, 2015.
[11] J. Visumathi and K. L. Shunmuganathan: A computational intelligence for evaluation of intrusion detection system, Indian Journal of Science and Technology, Vol. 4 No. 1, Jan 2011.
[12] V. Jain and M. Agrawal, "Applying Genetic Algorithm in Intrusion Detection System of IoT Applications," 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI)(48184), 2020, pp. 284-287, doi: 10.1109/ICOEI48184.2020.9143019.
[13] Kunhare, N., Tiwari, R. & Dhar, J. Particle swarm optimization and feature selection for intrusion detection system. Sādhanā 45, 109 (2020). https://doi.org/10.1007/s12046-020-1308-5
[14] Win, T.Z.; Kham, N.S.M. Information Gain Measured Feature Selection to Reduce High Dimensional Data. In Proceedings of the 17th International Conference on Computer Applications (ICCA 2019), Novotel hotel, Yangon, Myanmar, 27 February–1 March 2019; pp. 68–73.
[15] A. Chaudhary, V. N. Tiwari and A. Kumar: Analysis of Fuzzy Logic Based Intrusion Detection Systems in Mobile Ad Hoc Networks, International Journal of Information Technology, Vol. 6, No. 1, June 2014.
[16] A. Chaudhary, V. N. Tiwari and A. Kumar: Analysis of Fuzzy Logic Based Intrusion Detection Systems in Mobile Ad Hoc Networks, International Journal of Information Technology, Vol. 6, No. 1, June 2014.
[17] Nagar, P.; Menaria, H.K.; Tiwari, M. Novel Approach of Intrusion Detection Classification Deeplearning Using SVM. In First International Conference on Sustainable Technologies for Computational Intelligence,