Project Document
CERTIFICATE
This is to certify that the project work entitled “NETWORK INTRUSION DETECTION
SYSTEM USING MACHINE LEARNING” is a bona fide work done by M. Pavithra devi
(21991A0522), G. Ramya (21991A0556), A. Prasad (21991A0555), V. Govind (21991A0507), and
N. Lakshmanarao (21991A0544) under my supervision in partial fulfilment for the award
of the degree of Bachelor of Technology in Computer Science and Engineering, Avanthi’s St.
Theressa Institute of Engineering & Technology, Garividi, Vizianagaram, during the academic
year 2021-2025.
EXTERNAL EXAMINER
ACKNOWLEDGEMENT
hearted co-operation, unfailing inspiration and valuable guidance. Throughout the work, his
valuable suggestions and constant encouragement have helped us a long way. We thank him for
giving his valuable time at odd hours and for the patience and understanding he showed, which
greatly helped us complete the seminar work successfully.
We convey our sincere thanks to Dr. V. JOSHUA JAYA PRASAD, Principal of Avanthi’s St.
Theressa Institute of Engineering & Technology, who provided an opportunity to take up project
work in the well-equipped laboratories of the Computer Science department in our college.
At the outset, we thank SRI M. SRINIVASA RAO, beloved chairman of the Avanthi group of
colleges, who is the backbone of the college. Thank you, sir.
We declare that this project entitled “NETWORK INTRUSION DETECTION SYSTEM USING
MACHINE LEARNING” is an original work done by M. Pavithra devi (21991A0522), G.
Ramya(21991A0556), A. Prasad (21991A0555), V. Govind (21991A0507), N. Lakshmanarao
(21991A0544) for B. Tech Degree and we assure that this project work hasn’t been submitted
by us towards any degree or diploma of this or any other university.
(Regd. No:21991A0555)
A. Prasad
(Regd. No:21991A0507)
V. Govind
CERTIFICATE
ACKNOWLEDGEMENT
DECLARATION
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
1 INTRODUCTION
1.3 Motivation
1.4 Objective
2 SYSTEM ANALYSIS
3.2.1 Methodology
4 SYSTEM IMPLEMENTATION
REFERENCE
LIST OF FIGURES
CHAPTER-1
INTRODUCTION
NETWORK INTRUSION DETECTION SYSTEM USING MACHINE LEARNING
1. INTRODUCTION
In today's interconnected digital world, cybersecurity has become a critical concern for
individuals, organizations, and governments. With the widespread use of internet-enabled
devices, cloud services, and enterprise networks, the volume of network traffic has grown
exponentially. As a result, the frequency and sophistication of cyberattacks have also increased,
posing significant risks to data confidentiality, integrity, and availability. To counter these
threats, advanced security mechanisms such as Intrusion Detection Systems have become an
essential layer in modern cybersecurity infrastructure.
An Intrusion Detection System is a network security tool that monitors traffic patterns and
detects unauthorized, anomalous, or malicious activities within a network. There are two
primary types of intrusion detection systems: signature-based and anomaly-based. Signature-
based IDS detect threats by comparing network activity against a database of known attack
patterns or signatures. These systems are highly accurate in detecting previously identified
threats but struggle to recognize new or unknown attacks, including zero-day vulnerabilities.
In contrast, anomaly-based IDS analyze the behavior of users and systems to establish a
baseline of normal activity. Deviations from this baseline are flagged as potential threats.
Although anomaly-based systems are better at detecting novel attacks, they often suffer from
high false positive rates, which can overwhelm analysts with irrelevant alerts.
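The contrast between the two detection styles can be illustrated with a minimal sketch. It is not the detector built in this project: the signature strings, the packets-per-second baseline, and the three-standard-deviation threshold are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

# Hypothetical signature database: substrings of known-bad payloads.
SIGNATURES = ["' OR '1'='1", "../../etc/passwd"]


def signature_match(payload: str) -> bool:
    """Flag a payload only if it contains a known attack signature."""
    return any(sig in payload for sig in SIGNATURES)


def build_baseline(values):
    """Summarise normal behaviour as a mean and population standard deviation."""
    return mean(values), pstdev(values)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical packets-per-second readings observed during normal operation.
normal_pkts_per_sec = [98, 102, 100, 97, 103, 101, 99, 100]
baseline = build_baseline(normal_pkts_per_sec)
```

With this narrow baseline, a reading of 107 packets per second is already flagged even though it may be legitimate, which is the false-positive problem described above. The signature matcher, in turn, says nothing about a payload it has never seen.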
This project focuses on developing a machine learning-based Network Intrusion Detection
System using real-world datasets. The system aims to accurately classify normal and malicious
network traffic by leveraging powerful ML algorithms like Random Forest, XGBoost, and
Logistic Regression. The ultimate goal is to create a robust, scalable, and intelligent intrusion
detection framework that enhances the security posture of modern digital infrastructures while
addressing the shortcomings of traditional IDS approaches.
1.1 Brief Introduction:
In recent years, the number and complexity of cyberattacks have significantly increased,
targeting both individuals and organizations. This creates a strong need for effective security
systems that can monitor network traffic and detect malicious activities in real time. One such
system is an Intrusion Detection System (IDS), which plays a vital role in identifying abnormal
or unauthorized behaviour within a network.
This project focuses on designing a Robust Network Intrusion Detection System using machine
learning techniques. The system is built using three key algorithms: XGBoost, Random Forest,
and Logistic Regression. These models are trained and tested using the CICIDS2018 dataset,
which is a realistic and comprehensive dataset containing both normal and various attack traffic
types. The system aims to accurately detect different types of intrusions and minimize false
alarms, making it practical for real-time network security applications.
Traditional IDS solutions rely heavily on signature-based detection, where known patterns of
malicious behavior are identified and blocked. However, this approach fails to detect zero-day
attacks or newly evolving threats that do not match any known signatures. To overcome this
limitation, machine learning-based approaches have emerged as a more intelligent and adaptive
solution for building robust intrusion detection systems.
In this project, we aim to implement a Robust Network Intrusion Detection System (RNIDS)
that leverages machine learning techniques to accurately detect intrusions in real-time. We
utilize three well-established algorithms (XGBoost, Random Forest, and Logistic Regression)
to train classification models on a realistic dataset. By applying advanced data preprocessing,
feature selection, model evaluation, and real-time prediction, this system is designed to be
effective, scalable, and practical for real-world security environments.
Adaptability: ML models can adapt to new types of attacks as they learn from fresh
data.
Accuracy: Models can learn from complex patterns and improve accuracy over time.
These systems are often deployed in various cloud-based or on-premises environments.
Their services may range from real-time monitoring to historical log analysis,
depending on the needs of the organization.
Improved Threat Detection: ML models can detect complex and evolving cyber
threats that may go unnoticed by traditional systems.
Automation: Reduces the need for manual rule updates, allowing systems to learn and
adapt autonomously.
Efficiency: Accelerates the detection process, allowing for faster incident response and
mitigation.
Pattern Recognition: Learns from historical data to identify hidden patterns, which can
make the system more robust against previously unseen attacks.
These benefits make ML a powerful tool in modern cybersecurity frameworks, paving the
way for intelligent, self-learning security systems capable of defending against a wide array
of network-based threats.
The study examines ML and DL methods for NIDS (network intrusion detection
systems). Effectiveness, problems including false alarms, and real-time processing are covered.
According to the study, feature selection and dataset quality are crucial.[1] For a deep learning-
based NIDS, the study employs Self-Taught Learning (STL) with sparse autoencoders and
softmax regression. It increases intrusion detection efficiency and accuracy when tested on the
NSL-KDD dataset.[2] A probabilistic approach to estimating discrete values in huge datasets is
presented in the paper. It presents an enhanced approach for less computationally complex and
more accurate estimations. For database administration and query optimization, this is
essential.[3] Using Random Forest modeling, the paper suggests an NIDS that exhibits increased
efficiency and accuracy on the KDD'99 dataset. It draws attention to how ensemble learning
may improve NIDS performance.[4] The study suggests a NIDS that uses deep learning to detect
known and unknown threats. The system exhibits flexibility and durability in identifying
network intrusions using the UNSW-NB15 dataset. This study demonstrates how deep neural
networks can improve cybersecurity protocols.[5] The work highlights adaptive strategies and
real-time monitoring while presenting an intrusion detection strategy utilizing statistical
anomaly detection. It improves the fundamental knowledge of cybersecurity intrusion
detection.[6] The report examines IDS approaches, pointing out shortcomings and difficulties in
hybrid, anomaly-based, and signature-based approaches. It ends with suggestions for future
study aimed at enhancing IDS models.[7] Using the KDD Cup 99 dataset, the study suggests a
fuzzy logic-based NIDS that produces detection rules from frequently occurring itemsets with
good accuracy. It draws attention to how fuzzy logic might improve intrusion detection.[8]
A hybrid NIDS that combines anomaly and misuse detection algorithms is presented in the
paper. It lowers false positives while increasing detection accuracy.[9] A hybrid Network
Intrusion Detection System (NIDS) that blends anomaly-based and signature-based detection
techniques is presented in this research. In detecting network intrusions, this method lowers
false positives and improves detection accuracy.[10] With an emphasis on deep learning
methodologies, the study investigates machine learning-based NIDS in Software Defined
Networking (SDN) systems. It assesses different approaches, points out difficulties, and makes
recommendations for future lines of inquiry.[11] The study examines intrusion detection
systems and divides them into three categories: stateful protocol analysis, anomaly-based, and
signature-based. It draws attention to their advantages, disadvantages, and the requirement for
flexible real-time monitoring.[12] The UNSW-NB15 dataset, which is intended to assess
network intrusion detection systems using contemporary attack scenarios, is introduced in this
research. With the goal of advancing NIDS research and development, it provides a wide range
of features and attack kinds.[13] A bibliometric study of research on oxidative stress in
intervertebral disc degeneration (IDD) is presented in this paper. It covers prominent authors,
publication patterns, and new studies on senescence and mitochondria.[14] The study suggests
a Hidden Naïve Bayes classifier with PKI discretization and INTERACT feature selection for
network intrusion detection. It outperforms conventional models in terms of accuracy and
predictive performance.[15] An unsupervised network intrusion detection system that can
recognize unknown attacks without any prior knowledge is presented in the paper. The
technology efficiently identifies anomalies by using clustering algorithms and examining
network traffic patterns, strengthening network security against new threats.[16]
Using the Kyoto 2006+ dataset and the J48 decision tree technique for NIDS classification,
the study classifies traffic as normal, known attack, or unknown attack with 97.2% accuracy. It draws
attention to the shortcomings of conventional signature-based NIDS and illustrates how
successful anomaly detection methods are.[17] The UNSW-NB15 and KDD99 datasets are
used to compare the efficacy of feature selection methods in NIDS. By improving feature
selection through the use of Association Rule Mining, it shows that UNSW-NB15 attributes
increase detection accuracy but have a higher False Alarm Rate.[18] Using a linear classifier
for attack detection and a sparse autoencoder for feature learning, the study suggests a deep
learning-based NIDS. It shows improved recall, accuracy, and precision in recognizing known
and novel attacks.[19] By integrating Genetic Algorithms into SVM-based IDS, the study
improves feature selection and parameter modification, increasing detection efficiency and
accuracy. The advantages of the GA-SVM fusion over standalone SVM techniques are confirmed by
experimental findings on intrusion detection datasets.[20] By concentrating on network-wide
event analysis rather than host-based monitoring, network-based intrusion detection has
developed to handle multiple network threats. By modeling attacks with state diagrams and
hypergraphs, the State Transition Analysis Technique enhances event monitoring and IDS
settings.[21] According to the research, machine learning-based classifiers cannot be
effectively evaluated and compared across different datasets when Network Intrusion
Detection System datasets lack a uniform feature set. Researchers have suggested NetFlow-
based feature sets as a solution to this problem; these have shown better classification
performance and enable a more thorough evaluation of NIDS models.[22]
Debar, Dacier, and Wespi give a thorough overview of intrusion detection systems in their
2002 work "Intrusion Detection Systems: Technology and Development," outlining the
difficulties in the field and going over different detection techniques.[23] Through an analysis
of several attack strategies and accompanying response mechanisms, the study "Adversarial
Machine Learning for Network Intrusion Detection Systems: A Comprehensive Survey" offers
a thorough overview of how adversarial machine learning techniques pose problems for
NIDS. It highlights how important it is to have strong NIDS that can withstand hostile attacks
in order to preserve network security.[24] An overview of different deep learning approaches
used with intrusion detection systems is given in the publication "A Survey of Intrusion
Detection Systems Based on Deep Learning". It highlights the need for more study to improve
IDS performance while going over the advantages and disadvantages of these strategies.[25]
The proceedings of the 2018 APAN Research Workshop, which focused on experimental
performances associated with the NSL-KDD dataset, seem to be the paper in question. A
thorough literature review is not possible due to restricted access to the complete text.[26]
The literature emphasizes the drawbacks of standalone anomaly-based and misuse-based
intrusion detection systems and the benefits of hybrid IDSs, which integrate the two strategies
for better attack detection. According to studies, Snort's integration with anomaly-based
systems such as PHAD and NETAD improves the accuracy of intrusion detection, especially
when it comes to new threats.[27] The study "A Survey of Intrusion Detection Systems in
Wireless Sensor Networks" offers a thorough analysis of the advantages and disadvantages of
several intrusion detection strategies designed for wireless sensor networks. It highlights the
need for reliable and energy-efficient IDS systems to handle the particular difficulties these
networks provide.[28] By combining anomaly and misuse detection, hybrid intrusion detection
systems increase intrusion detection accuracy while lowering false positives. SDN-based IDSs
have recently advanced to use machine learning for effective and dynamic threat detection.[29]
IDSs based on genetic algorithms increase the effectiveness of anomaly detection and feature
selection. By optimizing rule generation, GA reduces false positives and increases detection
accuracy.[30]
1.3 Motivation:
The motivation behind this project arises from the escalating threat landscape in the realm
of cybersecurity. In recent years, there has been a significant increase in the frequency,
complexity, and impact of cyberattacks on modern digital infrastructures. Organizations
across the world face constant risks of data breaches, denial-of-service (DoS) attacks,
malware infections, and unauthorized access. As these threats continue to evolve, traditional
security mechanisms such as signature-based Intrusion Detection Systems are proving to be
inadequate in ensuring real-time protection.
Signature-based IDS are fundamentally reactive: they rely on predefined patterns of known
attacks to detect malicious activity. While these systems are effective against previously
encountered threats, they fail to recognize novel, emerging, or polymorphic attacks that do
not match existing signatures. This limitation creates a significant gap in an organization’s
defence, leaving it vulnerable to zero-day exploits and adaptive intrusions. Additionally,
these systems often suffer from a high rate of false positives, which can lead to alert fatigue
among security analysts and divert attention from real threats.
The growing scale and complexity of network traffic also pose challenges in real-time
analysis and response. As networks expand and data flows become more intricate, the ability
to detect and respond to intrusions promptly becomes increasingly difficult. There is a clear
need for intelligent, scalable solutions that can go beyond static rule-based systems and offer
proactive defence mechanisms.
Machine Learning presents a powerful alternative to traditional IDS approaches. Unlike rule-
based systems, ML models are capable of learning from large datasets, identifying complex
patterns, and adapting to new and unseen attack types. These models not only enhance the
detection capabilities but also significantly reduce false alarms by understanding the context
and variations in network behaviour. They can be trained to distinguish between normal and
malicious traffic with high precision, offering a more reliable and efficient approach to
network security.
The key motivations for integrating machine learning into this project include:
The ability to detect both known and unknown attacks with improved accuracy.
A reduction in false positives and false negatives, enhancing trust in alerts.
The capacity to automate learning from labeled datasets without manual
intervention.
The flexibility to adapt to dynamic network environments and changing attack
vectors.
Support for real-time traffic monitoring and prediction, enabling timely threat
response.
By leveraging machine learning techniques, this project aims to develop a robust and
intelligent intrusion detection system that addresses the limitations of conventional solutions
and strengthens the defence mechanisms of modern networks.
1.4 Objective:
The primary objective of this project is to design and implement a Robust Network Intrusion
Detection System capable of identifying and classifying various types of network-based
attacks with high accuracy and minimal false alarms. As the number and complexity of cyber
threats continue to rise due to the increasing reliance on digital services, traditional rule-
based detection systems often fail to detect new or unknown attacks. To address this, the
project leverages machine learning techniques to analyse large volumes of network traffic
and detect anomalous behaviour. The system is developed using the CICIDS2018 dataset,
which contains a wide range of realistic attack scenarios and normal traffic patterns. By
training and evaluating three different machine learning models (Random Forest, XGBoost,
and Logistic Regression), the goal is to identify the most effective algorithm for robust
intrusion detection. The final objective is to create a scalable and intelligent IDS solution
that can be deployed in real-world environments to enhance the security of modern
networks.
Handle missing values, anomalies, and outliers using techniques like imputation and
scaling.
Select the most relevant features using domain knowledge and experimentation.
o XG Boost Classifier
o Logistic Regression
Handle infinite or missing values in the dataset and apply StandardScaler for
normalization.
Train each model on the same dataset and evaluate them based on performance
metrics.
Use metrics such as accuracy, precision, recall, and F1-score to assess model
performance.
Identify the best-performing model and store it using Python’s pickle library for
future use.
Implement a function to accept new network session data, preprocess it using the
same pipeline, and predict whether it is an attack or normal.
Ensure that the detection pipeline is optimized for speed and efficiency, allowing
integration into real-time monitoring systems.
Explore the potential for extending this system into an intrusion detection and
prevention framework in future versions.
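The objective of handling infinite or missing values can be sketched as follows. This is one possible approach under stated assumptions, not necessarily the one used in the project: invalid entries are replaced by the median of the valid entries in the same column, and the helper name `clean_column` and the sample values are hypothetical.

```python
import math
from statistics import median


def is_valid(value):
    """True for real, finite numbers; False for None, NaN, and +/-inf."""
    return value is not None and math.isfinite(value)


def clean_column(values):
    """Replace invalid entries with the median of the valid entries."""
    valid = [v for v in values if is_valid(v)]
    fill = median(valid) if valid else 0.0  # assumption: fall back to 0.0 for an empty column
    return [v if is_valid(v) else fill for v in values]


# Hypothetical byte-rate column: rate features can become infinite
# when a flow's duration is zero.
flow_bytes_per_sec = [1200.0, float("inf"), 800.0, float("nan"), 1000.0]
cleaned = clean_column(flow_bytes_per_sec)
```

The median is used rather than the mean because a few extreme flows would otherwise distort the fill value. Dropping the affected rows is an equally reasonable alternative when they are rare.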
In this context, there is a critical need for intelligent intrusion detection systems that can
analyse large volumes of network traffic, recognize patterns, and detect both known and
unknown threats in real-time. Machine learning offers a promising solution to these
challenges. Unlike traditional systems, ML-based IDS can learn from historical data,
identify complex patterns, and adapt to new types of attacks. They can also reduce the rate
of false positives by learning the difference between normal and malicious behaviour based
on context and data trends.
CHAPTER-2
SYSTEM ANALYSIS
2. SYSTEM ANALYSIS
In the era of increasing cyber connectivity, safeguarding computer networks from
malicious threats is more crucial than ever. The exponential rise in internet traffic, cloud-
based services, and Internet of Things devices has led to an upsurge in sophisticated
cyberattacks. Organizations rely heavily on Intrusion Detection Systems as a defensive layer
in their cybersecurity infrastructure. However, with evolving attack strategies, traditional
IDS methods often fall short in effectively identifying and mitigating modern threats. This
section critically examines the existing systems used in network intrusion detection,
outlines their limitations, and introduces a more adaptive and intelligent proposed system
built on machine learning principles.
2.1 Existing System:
Traditional Intrusion Detection Systems have played a crucial role in monitoring and
securing network environments. These systems are mainly categorized into two types:
signature-based IDS and anomaly-based IDS. Signature-based systems detect intrusions by
comparing network activity against a database of known attack signatures. They are highly
accurate in detecting previously known threats and are widely deployed due to their
simplicity and efficiency. Anomaly-based systems, on the other hand, establish a baseline of
normal network behaviour and flag any deviations as potential threats. They are capable of
identifying unknown attacks, making them more flexible. However, both types of systems
have their own drawbacks. Signature-based IDS are ineffective against zero-day or
previously unseen attacks, as they cannot recognize patterns that are not already in their
database. Anomaly-based systems may raise frequent false alarms due to slight variations in
legitimate behaviour. In general, traditional IDS require extensive manual configuration,
lack adaptability, and struggle to cope with large-scale and high-speed network traffic. These
limitations highlight the need for more intelligent, automated, and scalable intrusion
detection mechanisms, particularly in the face of increasing cyber threats.
Limitations of the Existing System:
Inability to Detect Unknown Attacks:
Signature-based systems are ineffective against newly emerging threats that do not have a
predefined pattern or signature.
including both benign traffic and a variety of attack types such as DDoS, brute force, port
scans, and web intrusions. The dataset undergoes preprocessing steps such as handling
missing values, encoding categorical variables, extracting time-based features (Hour,
Minute, Second), and scaling numerical features using StandardScaler to ensure model
accuracy and stability. Once preprocessed, the data is split into training and testing sets using
stratified sampling to preserve the class distribution. Each model is trained and evaluated using the
F1-score to determine the best-performing algorithm. The system selects the model with the
highest F1-score as the final classifier and saves it using Python's pickle module for future
use. Additionally, a prediction function is implemented to classify new, unseen network
sessions as either normal or attack, allowing for real-time integration in practical
environments.
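The stratified split mentioned above can be sketched in plain Python. This is an illustration of the idea only; in practice scikit-learn's `train_test_split` with its `stratify` argument gives the same behaviour. The function name, the seed, and the 90/10 label mix are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) with each class keeping its share in both sets."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one test sample per class, even for rare classes.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical imbalanced label column: 90% benign, 10% attack.
labels = ["Benign"] * 90 + ["Attack"] * 10
train_idx, test_idx = stratified_split(labels)
```

Splitting per class matters for intrusion data because a purely random split of a heavily imbalanced dataset can leave the test set with very few attack samples, which makes the F1-score unstable.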
By combining modern machine learning approaches with a realistic intrusion dataset, the
proposed system is designed to reduce false positives and improve detection across a
wide variety of threats. This automated solution is scalable and adaptable,
making it suitable for real-world deployment in enterprise networks where cybersecurity is
a critical concern. It aims to offer a more accurate, efficient, and proactive
approach to intrusion detection than traditional methods.
Key features of the proposed system:
Data Preprocessing and Feature Engineering
The system begins by cleaning and preprocessing the raw data. Missing values, infinite
values, and outliers are handled to ensure clean input to the models. Time-based features
such as Hour, Minute, and Second are extracted from timestamps to help in identifying time-
sensitive patterns of attacks. Additionally, categorical features like protocol types are
encoded using Label Encoding, making the data suitable for model training.
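A minimal sketch of these two steps follows. The timestamp format `%d/%m/%Y %H:%M:%S` and the textual protocol names are assumptions about the input. The integer mapping follows the convention of scikit-learn's `LabelEncoder`, which assigns codes to the sorted distinct classes.

```python
from datetime import datetime


def time_features(timestamp, fmt="%d/%m/%Y %H:%M:%S"):
    """Split a timestamp string into Hour, Minute, and Second features."""
    ts = datetime.strptime(timestamp, fmt)
    return {"Hour": ts.hour, "Minute": ts.minute, "Second": ts.second}


def fit_label_encoder(values):
    """Map each distinct category to an integer code, in sorted order."""
    return {category: code for code, category in enumerate(sorted(set(values)))}


# Hypothetical protocol column with textual categories.
protocols = ["TCP", "UDP", "TCP", "ICMP"]
encoder = fit_label_encoder(protocols)
encoded = [encoder[p] for p in protocols]
```

The mapping learned on the training data has to be kept and reused at prediction time. Refitting it on new data could assign different codes to the same protocol.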
Feature Selection and Normalization
Not all features are useful for detecting intrusions. The system selects relevant features such
as Flow Duration, Protocol, Tot Fwd Pkts, Flow Byts/s, etc., which have a strong correlation
with intrusion patterns. These features are then standardized using StandardScaler to bring
them onto a common scale, which mainly helps gradient-based models such as Logistic
Regression converge; tree-based models are largely insensitive to feature scale.
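The scaling step amounts to computing z = (x - mean) / std for each feature. The sketch below shows that arithmetic for a single column using the population standard deviation, which is also what scikit-learn's `StandardScaler` uses. The helper names and the sample durations are hypothetical.

```python
from statistics import mean, pstdev


def fit_scaler(column):
    """Learn the mean and population standard deviation from training data only."""
    mu = mean(column)
    sigma = pstdev(column)
    # Guard against constant columns, which would otherwise divide by zero.
    return mu, (sigma if sigma > 0 else 1.0)


def transform(column, params):
    """Apply z = (x - mean) / std using previously fitted parameters."""
    mu, sigma = params
    return [(x - mu) / sigma for x in column]


# Hypothetical Flow Duration values.
train_flow_duration = [10.0, 20.0, 30.0]
new_flow_duration = [40.0]

params = fit_scaler(train_flow_duration)
train_scaled = transform(train_flow_duration, params)
new_scaled = transform(new_flow_duration, params)  # reuses the training statistics
```

Fitting on the training split and reusing those parameters for test and live data avoids leaking information from the test set into the model.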
Model Training and Evaluation
The selected machine learning models are trained using 80% of the dataset, and the
remaining 20% is used for testing. Models are evaluated based on F1-Score, which considers
both precision and recall, ensuring that the system performs well not only in identifying
attacks but also in minimizing false positives.
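For reference, the F1-score is the harmonic mean of precision and recall. The sketch below computes it per class and then averages the per-class scores weighted by class support, which is the "weighted" variant offered by scikit-learn's `f1_score`. The labels and predictions are made-up examples, not results from this project.

```python
from collections import Counter


def f1_for_class(y_true, y_pred, positive):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighted by each class's share of y_true."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(f1_for_class(y_true, y_pred, c) * n / total for c, n in support.items())


# Made-up labels: 3 attacks and 5 normal sessions, with one error in each direction.
y_true = ["Attack", "Attack", "Attack", "Normal", "Normal", "Normal", "Normal", "Normal"]
y_pred = ["Attack", "Attack", "Normal", "Attack", "Normal", "Normal", "Normal", "Normal"]
```

In this example the Attack class scores 2/3 and the Normal class 0.8, giving a weighted F1 of 0.75. Accuracy, by comparison, is also 0.75 here, but on a heavily imbalanced dataset it can stay high even when most attacks are missed.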
Model Selection and Serialization
The best-performing model based on weighted F1-Score is saved using Python’s pickle
module for future use. This enables easy deployment of the model in real-world applications
without retraining.
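Selection and serialization can be sketched as below. The scores are placeholder numbers rather than results from this project, the models are stand-in dictionaries rather than trained classifiers, and the file name `best_model.pkl` is an assumption.

```python
import pickle
import tempfile
from pathlib import Path


def select_best(scores):
    """Return the name of the model with the highest score."""
    return max(scores, key=scores.get)


def save_model(model, path):
    """Serialize a model object to disk with pickle."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    """Load a previously pickled model object."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Placeholder scores and stand-in model objects, for illustration only.
placeholder_scores = {"model_a": 0.2, "model_b": 0.3, "model_c": 0.1}
placeholder_models = {name: {"model_name": name} for name in placeholder_scores}

best_name = select_best(placeholder_scores)
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "best_model.pkl"
    save_model(placeholder_models[best_name], model_path)
    restored = load_model(model_path)
```

Two practical points follow from this design. The fitted scaler and label encoder need to be saved alongside the model, since predictions are only valid on identically preprocessed input. Pickle files should also be loaded only from trusted sources, because unpickling can execute arbitrary code.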
Real-Time Prediction Module
A prediction function is implemented to classify new sessions or real-time network flows as
either "Attack" or "Normal". This allows the system to be integrated into live monitoring
tools for real-time intrusion detection and alert generation.
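A prediction function of this kind might look like the sketch below. Everything here is illustrative: `ThresholdModel` is a stand-in for the trained classifier, the feature list is a small subset, and the per-feature (mean, std) pairs are hypothetical training statistics.

```python
# Assumed feature order; it must match the order used during training.
FEATURES = ["Flow Duration", "Tot Fwd Pkts", "Flow Byts/s"]


class ThresholdModel:
    """Stand-in for a trained classifier: predicts 1 (attack) on a high scaled byte rate."""

    def predict(self, rows):
        return [1 if row[2] > 2.0 else 0 for row in rows]


def predict_session(session, scaler_params, model):
    """Scale one session with training-time statistics and return 'Attack' or 'Normal'."""
    row = [
        (session[name] - mu) / sigma
        for name, (mu, sigma) in zip(FEATURES, scaler_params)
    ]
    return "Attack" if model.predict([row])[0] == 1 else "Normal"


# Hypothetical (mean, std) pairs, one per feature, as learned on the training split.
scaler_params = [(1000.0, 500.0), (10.0, 5.0), (2000.0, 1000.0)]
model = ThresholdModel()

suspicious = {"Flow Duration": 1200.0, "Tot Fwd Pkts": 12, "Flow Byts/s": 9000.0}
ordinary = {"Flow Duration": 900.0, "Tot Fwd Pkts": 9, "Flow Byts/s": 2100.0}
```

Here `predict_session(suspicious, scaler_params, model)` returns "Attack" and the call with `ordinary` returns "Normal". Any object exposing a `predict` method that takes a list of rows, such as a scikit-learn estimator, could replace the stand-in.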
Advantages of the Proposed System:
Higher Detection Accuracy: Machine learning models learn complex patterns in the data,
allowing for better identification of sophisticated attacks.
Detection of Unknown Threats: Unlike signature-based systems, the ML models can
generalize to detect previously unseen attack types.
Reduced False Positives: The system can differentiate between genuine anomalies and
legitimate deviations, reducing unnecessary alerts.
Scalability: Once trained, the models can process large volumes of traffic efficiently, making
the system suitable for modern enterprise environments.
Automation and Adaptability: The system requires minimal manual intervention and can be
retrained with new data to adapt to evolving threats.
2.3 Feasibility Study:
Before initiating the development of any system, it is important to determine whether the
proposed idea is feasible and worth implementing. The feasibility study is conducted to
evaluate the practicality and effectiveness of the proposed machine learning-based Network
Intrusion Detection System (NIDS) from multiple perspectives. It helps identify potential
risks, limitations, and benefits that could affect the success and adoption of the system.
This study considers three major aspects of feasibility:
Economic Feasibility: Determines if the project is financially viable.
Technical Feasibility: Assesses whether the technology, tools, and expertise are
sufficient.
Social Feasibility: Evaluates how the solution will be received and its societal
impact.
2.3.1 Economic Feasibility:
Economic feasibility examines the cost-effectiveness of the project and its return on
investment. In today’s digital environment, organizations spend vast amounts of money on
securing their networks from cyberattacks. Despite this investment, many companies still
fall victim to advanced threats due to the limitations of traditional security systems.
The proposed system provides a low-cost, high-impact solution by leveraging freely
available datasets (e.g., CICIDS2018) and open-source machine learning tools like Python,
Scikit-learn, XGBoost, and Pandas. There is no need to purchase expensive software or
licenses, which significantly reduces the development and deployment cost.
Moreover, this system can potentially save organizations from heavy financial losses that
might result from data breaches, service disruptions, or ransomware attacks. Early detection
and prevention of such threats help minimize financial damage, legal penalties, and
reputational loss. From a long-term perspective, maintenance costs are also low, as the
system can be updated using new datasets and retrained periodically with minimal resources.
Overall, the project is economically sustainable and offers a strong return on investment,
especially for medium to large-scale organizations that manage sensitive or valuable data.
2.3.2 Technical Feasibility:
Technical feasibility assesses whether the project can be built using existing tools,
technologies, and resources.
The proposed system is technically feasible and utilizes widely adopted technologies that
are well-supported and documented. It is developed using:
Python programming language, known for its simplicity and vast library support.
Machine learning algorithms such as Random Forest, XGBoost, and Logistic
Regression.
Preprocessing tools for data cleaning and feature scaling (e.g., StandardScaler,
LabelEncoder).
Matplotlib and Seaborn for visualization and result interpretation.
2.3.3 Social Feasibility:
Social feasibility focuses on the human aspect: how the system will be perceived, accepted,
and used by people and organizations.
The rise in data breaches, identity theft, and ransomware attacks has made cybersecurity a
top priority for businesses, governments, and individuals alike. In this context, a machine
learning-based NIDS is socially relevant and in high demand.
Unlike traditional systems that require manual rule updates and generate large volumes of
false alarms, this system offers:
Reduced false positives, which increases trust among security teams,
Automation and intelligence, which lessens the need for constant manual
intervention,
Ease of use, since the system does not require deep technical knowledge to operate
once deployed.
2.4 System Requirement Specifications:
This section defines the key functionalities and performance expectations of the machine
learning-based Network Intrusion Detection System. It outlines what the system should do (functional
requirements) and how well it should perform those actions (non-functional requirements).
These requirements ensure that the system is developed in line with the intended objectives
and serves the needs of users effectively.
2.4.1 Functional Requirements:
Functional requirements describe the core features and operations the system must support
to meet its objectives. The main functional requirements of the proposed NIDS include:
Data Collection and Preprocessing
The system should be able to ingest structured network traffic data from datasets like
CICIDS2018. It must perform data cleaning, handle missing values, and preprocess
categorical features for model compatibility.
Feature Selection and Extraction
The system should allow selection of relevant features for model training. Feature
engineering steps like normalization, scaling, and transformation should be supported.
Model Training and Evaluation
The system should train machine learning models (e.g., XGBoost, Random Forest, Logistic
Regression) using labeled network data. It must evaluate model performance using metrics
like accuracy, precision, recall, and F1-score.
The system should be easy to use by individuals with basic machine learning or
cybersecurity knowledge. Output reports and visualizations should be clearly
understandable and interpretable.
Maintainability
The codebase should follow modular programming practices, making it easy to update or
debug. Retraining or fine-tuning the model with new data should not require rewriting the
entire code.
Security
The system should process data securely and ensure no sensitive information is exposed. It
must comply with basic data handling protocols, especially if extended to real-time
environments.
Portability
The system should be platform-independent and run on commonly available hardware and
operating systems like Windows or Linux. It should support execution on standard laptops
with at least 8 GB RAM and an Intel Core i5 processor.
2.5 Hardware requirements:
CHAPTER -3
SYSTEM DESIGN
3. SYSTEM DESIGN
real-time. This architecture ensures the system is scalable, adaptable, and capable of
accurately detecting both known and unknown cyber threats, thereby enhancing the overall
security posture of the network.
3.2 System Modules
The system consists of five major modules:
Data Preprocessing Module: Responsible for cleaning the dataset, encoding categorical
variables, handling missing values, and scaling features to standardize input for the models.
Feature Engineering Module: Extracts time-based features and selects the most relevant
columns for classification based on domain knowledge and statistical correlation.
Model Training Module: Implements training pipelines for multiple machine learning
models. It includes cross-validation and hyperparameter tuning for optimal performance.
Evaluation Module: Evaluates each trained model using metrics like Accuracy, Precision,
Recall, and F1-score. Confusion matrices and classification reports are generated for
detailed analysis.
Prediction Module: Loads the serialized model and takes new input for real-time
classification of network traffic as normal or malicious. This module can be integrated into
network monitoring systems for live use.
3.2.1 METHODOLOGY
The methodology adopted in this project is structured into systematic stages to ensure a
robust and reliable intrusion detection system. It includes data acquisition, data
preprocessing, feature selection, model building and training, evaluation, and real-time
intrusion prediction. The entire pipeline is designed to simulate the practical deployment of
a machine learning-based Network Intrusion Detection System (NIDS) using realistic
network traffic data.
Fig. 3.2.1 Attack recognition
1. Data Acquisition
The CICIDS2018 dataset was selected as the benchmark dataset for training and testing the
intrusion detection models. Developed by the Canadian Institute for Cybersecurity, this
dataset comprises realistic traffic, which includes both benign and malicious activities. It is
labeled and contains features such as flow duration, packet statistics, bytes per second, and
attack categories such as:
DoS/DDoS
Brute Force
Port Scanning
2. Data Preprocessing
Before training any machine learning models, the dataset underwent multiple preprocessing
steps to ensure quality and consistency:
Handling Missing Values
Numeric columns were filled using their mode values to maintain the integrity of
the data distribution.
Categorical columns were filled using their most frequent values to avoid
introducing bias.
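The imputation rule above can be sketched as follows. This is a minimal illustration, assuming the data is held in a pandas DataFrame; the helper name fill_missing and the toy values are hypothetical and are not taken from the project code.

```python
import numpy as np
import pandas as pd


def fill_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Fill each column's missing entries with that column's most frequent value."""
    filled = df.copy()
    for col in filled.columns:
        modes = filled[col].mode(dropna=True)
        if not modes.empty:  # a column that is entirely missing is left untouched
            filled[col] = filled[col].fillna(modes.iloc[0])
    return filled


# Hypothetical toy frame standing in for a slice of the traffic dataset.
toy = pd.DataFrame({
    "Flow Duration": [100.0, np.nan, 100.0, 250.0],
    "Protocol": ["TCP", "TCP", None, "UDP"],
})
clean = fill_missing(toy)
```

Because both numeric and categorical columns are filled with their most frequent value, a single helper covers both cases in this sketch.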
3. Logistic Regression
A baseline linear model suitable for multiclass classification with good
interpretability.
Useful for evaluating how simpler models compare to complex ensemble methods.
Each model was trained on the standardized training data and evaluated using the test set.
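The training step described above can be sketched with scikit-learn as follows. The data here is synthetic and only stands in for the flow features; the split ratio, the number of trees, and the random seeds are assumptions made for illustration. XGBoost is left out to keep the sketch dependency-light; its XGBClassifier follows the same fit/predict interface and could be added to the dictionary in the same way.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the flow features: 3 classes, 11 features.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 100)
X = rng.normal(size=(300, 11)) + y[:, None] * 3.0  # shift each class so it is separable

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit the scaler on the training split only, then reuse it for the test split.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}
scores = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    scores[name] = model.score(X_test_s, y_test)  # plain accuracy on the held-out split
```

Fitting the scaler on the training split alone keeps test-set statistics out of the training process.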
Model Evaluation
To evaluate the performance of each model, the following metrics were used:
Accuracy: Overall correctness of the model.
Precision: Proportion of predicted positives that are actually positive.
Recall: Proportion of actual positives correctly identified.
F1-Score: Harmonic mean of precision and recall, providing a balanced metric.
A weighted F1-score was used to handle class imbalance and evaluate multiclass
classification performance fairly.
The model with the highest F1-score was selected as the best model and saved using
Python’s pickle module for later use in real-time prediction.
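As a worked illustration of these metrics, the sketch below computes per-class precision, recall, and F1, then averages the F1 scores weighted by class support. The labels and predictions are hypothetical toy values, and the function names are illustrative.

```python
def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for every label in y_true."""
    out = {}
    for label in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        out[label] = (precision, recall, f1, tp + fn)
    return out


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    metrics = per_class_metrics(y_true, y_pred)
    total = sum(m[3] for m in metrics.values())
    return sum(m[2] * m[3] for m in metrics.values()) / total


# Hypothetical toy labels: six benign sessions and two attacks.
y_true = ["Benign"] * 6 + ["Attack"] * 2
y_pred = ["Benign"] * 5 + ["Attack"] + ["Attack", "Benign"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = weighted_f1(y_true, y_pred)
```

In this toy case the benign class has an F1 of 5/6 and the attack class an F1 of 1/2, so the support-weighted average is (6 x 5/6 + 2 x 1/2) / 8 = 0.75.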
Intrusion Detection and Prediction
A prediction function was implemented to classify new network session data using the
trained model:
New input data is preprocessed (scaled using the same StandardScaler).
The best saved model is loaded using pickle.
The function outputs either “Normal” or “Attack” depending on the prediction
result.
This simulates the real-time usage of the system where new traffic data can be analyzed and
flagged automatically.
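The prediction steps above can be sketched as follows. This is a minimal illustration, not the project's actual function: the tiny two-feature model, the name predict_session, and the choice to pickle the scaler and model together in one in-memory bundle are all assumptions.

```python
import pickle

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Train and serialize a tiny stand-in model (two features; 0 = normal, 1 = attack).
X = np.array([[1.0, 10.0], [2.0, 12.0], [1.5, 11.0],
              [9.0, 90.0], [10.0, 95.0], [9.5, 92.0]])
y = np.array([0, 0, 0, 1, 1, 1])
scaler = StandardScaler().fit(X)
model = LogisticRegression().fit(scaler.transform(X), y)
blob = pickle.dumps({"scaler": scaler, "model": model})


def predict_session(features, serialized=blob):
    """Scale one session with the saved scaler and map the predicted class to a label."""
    bundle = pickle.loads(serialized)
    row = np.asarray(features, dtype=float).reshape(1, -1)
    label = bundle["model"].predict(bundle["scaler"].transform(row))[0]
    return "Attack" if label == 1 else "Normal"
```

Storing the fitted scaler next to the model ensures that new input is scaled with exactly the statistics used during training.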
3.3 Introduction to UML Diagrams
Unified Modelling Language (UML) is a standardized visual language used to model and
design the structure and behaviour of software systems. It provides a set of diagrams and
symbols that help developers, analysts, and stakeholders understand, plan, and communicate
the components and flow of a system. UML diagrams serve as blueprints, representing both
the static aspects (such as classes and objects) and dynamic behaviours (such as interactions
and workflows) of a system. Common UML diagrams include Class Diagrams, Use Case
Diagrams, Sequence Diagrams, Activity Diagrams, and State Diagrams, each serving a
specific purpose in software development. By using UML, teams can visually map out the
system’s architecture, identify relationships between components, and ensure clear
communication throughout the development lifecycle.
The goal is for UML to become a common language for creating models of object-oriented
computer software. In its current form, UML comprises two major components: a meta-model
and a notation. In the future, some form of method or process may also be added to, or
associated with, UML. The Unified Modelling Language is a standard language for
specifying, visualizing, constructing, and documenting the artifacts of software systems, as
well as for business modelling and other non-software systems.
3.3.1 Use Case Diagram
Use Case Diagrams are a type of Unified Modelling Language (UML) diagram that visually
represent the functional requirements of a system from the user's perspective. They illustrate
the interactions between users (called actors) and the system itself through various use
cases, which describe specific functionalities or services the system provides. Use case
diagrams help developers and stakeholders understand what the system is supposed to do,
who will use it, and how different users interact with different parts of the system. Typically,
the diagram includes actors (users or external systems), use cases (oval shapes describing
actions), and associations (lines connecting actors to their relevant use cases). These
diagrams are particularly useful during the early stages of software development to gather
and communicate requirements clearly and effectively.
The UML use case diagram visually represents how a user interacts with the core functions
of the Network Intrusion Detection System (NIDS). In the diagram, the Apps User is the
main actor who initiates interactions with the system. The system, shown as a rectangular
boundary, contains four main functions: Monitoring network, Port scan attack detector,
Analyze the network, and Receive notification. These use cases describe the system's role
in securing a network. The Monitoring network use case enables real-time tracking of all
incoming and outgoing data packets, ensuring every flow of traffic is observed. The Port
scan attack detector is responsible for detecting suspicious scanning activities where
attackers probe multiple ports to find vulnerabilities. This early detection is vital to prevent
further intrusion. The Analyze the network function processes and inspects network traffic
using various machine learning techniques to determine if the traffic is normal or malicious.
This analysis uses predefined models trained on datasets like CICIDS2018 to ensure
accurate detection. Once a threat is detected, the Receive notification use case sends an alert
to the user, allowing quick and
system is composed of several interrelated classes, each with its own role in the detection
process. At the heart of the system is the IDS class, which has attributes like id and methods
such as requestService(), detectIntrusion(), and issueAlert(). It acts as the main interface that
manages intrusion detection requests and communicates with other components.
The IDS class forwards requests to the EventProcessor, which contains the name attribute
and a method processDataForDetector(). This component plays a crucial role in
preprocessing and filtering network data before sending it to the core detection unit. The
processed data is then forwarded to the AttackDetector class, which includes methods like
detectIntrusion() and checkAttackInfo(). This is the component responsible for comparing
incoming data against known signatures to detect malicious behavior.
Supporting the AttackDetector is the SignatureInformation class, which manages the
signatures. It contains functions like addSignature() and removeSignature(), allowing the
system to dynamically update its knowledge base. This class is associated with multiple
Signature objects, each having an id and a signature string. These signatures represent
known patterns of malicious traffic.
Once the AttackDetector processes the data and identifies a threat, it interacts with the
Response class, which includes the method createResponse(). This class handles how the
system responds to detected intrusions, whether by generating alerts, logging activity, or
taking preventive measures. The sendResult and sendResponse associations between
classes ensure that alerts or outputs are delivered properly back to the IDS and ultimately to
the user or administrator.
Each class in the diagram is connected with meaningful associations, clearly showing the
communication and workflow in the system. For example, the use of composition between
SignatureInformation and AttackDetector signifies a strong dependency, indicating that
attack detection relies on accurate and up-to-date signature data. Overall, this class diagram
offers a clear, modular, and scalable architecture for building a signature-based intrusion
detection system. It separates concerns efficiently, ensuring maintainability and extensibility
for future improvements like adding more detection methods or response types.
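The relationships in the class diagram can be sketched in Python as follows. The class names follow the diagram, but the method names are written in Python style, and the matching rule (a plain substring test) and the sample signatures are assumptions chosen only to make the sketch runnable.

```python
class Signature:
    """One known malicious pattern, identified by an id."""

    def __init__(self, sig_id, signature):
        self.id = sig_id
        self.signature = signature


class SignatureInformation:
    """Knowledge base of signatures that can be updated at run time."""

    def __init__(self):
        self.signatures = {}

    def add_signature(self, sig):
        self.signatures[sig.id] = sig

    def remove_signature(self, sig_id):
        self.signatures.pop(sig_id, None)


class AttackDetector:
    """Compares incoming data against the stored signatures."""

    def __init__(self, info):
        self.info = info

    def detect_intrusion(self, payload):
        # Assumed matching rule: a signature matches if it occurs in the payload.
        return [s.id for s in self.info.signatures.values() if s.signature in payload]


class Response:
    @staticmethod
    def create_response(matched_ids):
        return f"ALERT: signatures {matched_ids}" if matched_ids else "OK"


info = SignatureInformation()
info.add_signature(Signature(1, "/etc/passwd"))
info.add_signature(Signature(2, "' OR 1=1"))
detector = AttackDetector(info)
result = Response.create_response(detector.detect_intrusion("GET /../../etc/passwd"))
```

Because AttackDetector holds a reference to SignatureInformation, adding or removing a signature changes detection results immediately, without restarting the detector.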
Response module is called. The Response module uses the createResponse method to form
an appropriate reaction to the detected threat. This could include alerting administrators or
activating emergency measures.
The sequence diagram clearly shows how each class and module works together in a time-
based order to detect intrusions in the network. The vertical lines in the diagram represent
each component's lifeline, while the horizontal arrows indicate the messages being passed
from one to another. The numbering (like 1.1.1.2.2.1) shows the nested calls and their order
of execution. The IDS acts as the entry point, and everything follows a well-organized flow
from detecting to responding to a threat. This setup makes the system capable of identifying
and managing network intrusions efficiently. Each module does a specific task, and their
interaction ensures a complete check of possible attacks and a quick reaction. Overall, the
diagram helps in understanding the logical sequence and communication among different
parts of the intrusion detection system, ensuring that the system runs smoothly, detects
problems quickly, and responds accurately.
time and creating a Normal Pattern Database. This database stores baseline behavior that
serves as a reference to identify any deviations in real-time network activity. On the other
side, the information collection subsystem continuously monitors incoming data and
network traffic, collecting important behavioral metrics.
Once both modules are operational, the system compares the collected data with the entries
in the normal pattern database. This comparison plays a crucial role in identifying anomalies.
A decision-making point, often represented by a threshold value, evaluates whether the
current behavior exceeds the acceptable limit defined during training. If the data does not
exceed the threshold, it is classified as Normal, and no further action is taken. However, if
the data does exceed the threshold, it is flagged as an Anomaly, indicating a potential threat
or suspicious activity in the network.
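The threshold decision described above can be sketched as follows. This is one simple way to realize it, assuming the normal pattern database is reduced to a mean and standard deviation of a single metric and that the threshold is expressed as a z-score; the readings and the default threshold of 3.0 are hypothetical.

```python
from statistics import mean, stdev


def build_baseline(normal_values):
    """Summarize training-period observations as (mean, standard deviation)."""
    return mean(normal_values), stdev(normal_values)


def classify(value, baseline, threshold=3.0):
    """Flag a value whose z-score against the baseline exceeds the threshold."""
    mu, sigma = baseline
    if sigma == 0:  # degenerate baseline: any deviation at all is anomalous
        return "Anomaly" if value != mu else "Normal"
    z = abs(value - mu) / sigma
    return "Anomaly" if z > threshold else "Normal"


# Hypothetical packets-per-second readings collected during normal operation.
baseline = build_baseline([98, 102, 100, 97, 103, 101, 99, 100])
```

With these readings the baseline mean is 100 and the sample standard deviation is 2, so a reading of 104 (z = 2) is classified as Normal while a reading of 150 (z = 25) is flagged as an Anomaly.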
This entire activity flow reflects a typical Anomaly-Based Intrusion Detection System,
where machine learning or statistical analysis is used to establish a model of normal behavior
and detect deviations from it. The diagram simplifies the detection process into a decision-
driven mechanism, showing how automated systems can help administrators quickly
identify threats without manually analyzing traffic.
The use of thresholds in this diagram is a key security strategy. A well-defined threshold
helps reduce false positives (normal behavior incorrectly flagged as intrusion) and false
negatives (actual threats missed by the system). The feedback from the decision process can
also be looped back to the training module to update the normal behavior patterns, making
the system adaptive and intelligent over time.
Furthermore, the simplicity of the activity diagram allows system designers and developers
to visualize the workflow at a high level and make necessary improvements. It captures the
step-by-step logic and decisions made by the IDS engine without going into complex
technical implementation. The focus remains on the logical transition from data collection
to decision-making, which is essential for designing and validating an effective intrusion
detection model.
In conclusion, the activity diagram of NIDS provides a clear, structured representation of
how suspicious network behaviors are identified, categorized, and responded to. It plays a
foundational role in modeling the internal logic of an IDS system. Through continuous data
monitoring, comparison against normal behavior, and threshold-based decisions, the activity
diagram highlights how the system maintains network integrity and prevents unauthorized
access or attacks. This high-level understanding is essential for researchers, developers, and
security professionals working on IDS implementations.
In parallel, CTI (Cyber Threat Intelligence) feeds act as an external source of threat
intelligence, supplying up-to-date indicators of compromise (IoCs) and contextual threat
information. These feeds enrich the analysis performed by the Traffic Analyser. The CTI data
is integrated into the detection mechanism to ensure the system can identify threats that go
beyond static signatures and include dynamic, emerging threats. The Traffic Analyser
generates events based on the findings from traffic inspection. These events are passed to the
Event Correlation component.
The Event Correlation component plays a critical role in analyzing multiple events together
to determine patterns, identify coordinated attacks, or deduce whether an incident is isolated
or widespread. It collects individual detection events and correlates them over time or across
multiple devices to produce high-confidence network incidents. These network incidents are
then forwarded to the CSIRT Analyst (Computer Security Incident Response Team analyst),
who manually reviews the generated reports.
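One simple correlation rule of the kind described above can be sketched as follows. The event tuple layout, the 60-second window, and the minimum of three events are assumptions for illustration; the component in the diagram may apply different or more elaborate logic.

```python
from collections import defaultdict


def correlate(events, window=60, min_events=3):
    """Group events by source and report sources with a burst inside the window.

    events: iterable of (timestamp_seconds, source_ip, event_type) tuples.
    Returns {source_ip: [event_type, ...]} for every source that produced at
    least `min_events` events within any `window`-second span.
    """
    by_source = defaultdict(list)
    for ts, src, kind in sorted(events):
        by_source[src].append((ts, kind))

    incidents = {}
    for src, items in by_source.items():
        start = 0
        for end in range(len(items)):
            # Shrink the window from the left until it spans at most `window` seconds.
            while items[end][0] - items[start][0] > window:
                start += 1
            if end - start + 1 >= min_events:
                incidents[src] = [kind for _, kind in items[start:end + 1]]
                break
    return incidents


# Hypothetical detection events from two sources.
events = [
    (0, "10.0.0.5", "port_scan"),
    (20, "10.0.0.5", "ssh_login_failed"),
    (45, "10.0.0.5", "ssh_login_failed"),
    (10, "10.0.0.9", "port_scan"),
    (500, "10.0.0.9", "port_scan"),
]
incidents = correlate(events)
```

Here only 10.0.0.5 is reported, since its three events fall within 45 seconds, while the two events from 10.0.0.9 are too far apart to count as one incident.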
The CSIRT Analyst provides human expertise by validating the automated findings,
conducting further investigation, and initiating appropriate response actions. This interaction
closes the feedback loop between the system's automated detection and human-driven
validation. The system is modular and maintains separation between input data, processing
units, data flows, and storage, as represented in the legend.
Each component operates independently yet collaboratively to ensure scalability and
flexibility. The Traffic Analyser continuously updates itself using fresh CTI data. The Event
Correlation module applies logic, thresholds, or machine learning algorithms to make
decisions more intelligent and context-aware. The Blacklist database evolves over time as
new threats are identified and stored for future use. The CSIRT Analyst serves as the final
check before any response or alert is escalated to the organization.
This component diagram effectively conveys a modern architecture for an intrusion
detection system that blends automation with expert oversight. The clear separation of
responsibilities between data ingestion, analysis, correlation, and human validation ensures
a well-structured and reliable security mechanism. Overall, this design demonstrates how
both static rules and dynamic intelligence can be combined to create a more robust and
proactive intrusion detection capability. The architecture reflects best practices in
If a potential threat is identified, alerts are generated and sent to administrators for review
and action. These administrators are responsible for managing, responding to, and mitigating
any identified security incidents. The IDS is also connected to a central server, which stores
logs, alerts, and historical data necessary for post-attack investigations and system audits.
This server supports the IDS by providing a repository for storing large volumes of traffic
data, which is crucial for training detection models and refining rules over time.
The diagram emphasizes the collaborative role of each component in maintaining network
security. The firewall blocks known threats, the router manages traffic flow, and the IDS
inspects network behavior in-depth. The data analysis process transforms raw network data
into meaningful security insights. These insights are interpreted by administrators, who
maintain system integrity and respond to incidents. In this system, real-time data processing
ensures timely threat detection, while archived data supports trend analysis and forensic
investigations.
Deploying a robust NIDS in this manner enhances an organization's ability to detect and
respond to cyber threats effectively. It forms a comprehensive security framework where
prevention, detection, and response mechanisms work in tandem. The architecture is
designed for scalability, allowing the addition of more sensors or data sources as network
traffic grows. It is also adaptable to evolving cyber threats, as new detection models can be
trained and deployed within the analysis component. Thus, this deployment diagram
represents a layered and intelligent defense strategy for securing modern digital
infrastructures.
CHAPTER - 4
SYSTEM IMPLEMENTATION
      <div class="col-md-6">
        <label for="tot_bwd_pkts" class="form-label">Total Backward Packets:</label>
        <input type="number" class="form-control" id="tot_bwd_pkts" name="Tot Bwd Pkts" required>
      </div>
      <div class="col-md-6">
        <label for="fwd_pkt_len_max" class="form-label">Forward Packet Length Max:</label>
        <input type="number" class="form-control" id="fwd_pkt_len_max" name="Fwd Pkt Len Max" required>
      </div>
      <div class="col-md-6">
        <label for="bwd_pkt_len_max" class="form-label">Backward Packet Length Max:</label>
        <input type="number" class="form-control" id="bwd_pkt_len_max" name="Bwd Pkt Len Max" required>
      </div>
      <div class="col-md-6">
        <label for="flow_byts_s" class="form-label">Flow Bytes/s:</label>
        <input type="number" class="form-control" id="flow_byts_s" name="Flow Byts/s" step="any" required>
      </div>
      <div class="col-md-6">
        <label for="flow_pkts_s" class="form-label">Flow Packets/s:</label>
        <input type="number" class="form-control" id="flow_pkts_s" name="Flow Pkts/s" step="any" required>
      </div>
      <div class="col-md-4">
        <label for="hour" class="form-label">Hour:</label>
        <input type="number" class="form-control" id="hour" name="Hour" min="0" max="23" required>
      </div>
      <div class="col-md-4">
        <label for="minute" class="form-label">Minute:</label>
        <input type="number" class="form-control" id="minute" name="Minute" min="0" max="59" required>
      </div>
      <div class="col-md-4">
        <label for="second" class="form-label">Second:</label>
        <input type="number" class="form-control" id="second" name="Second" min="0" max="59" required>
      </div>
    </div>
    <div class="text-center mt-4">
      <button type="submit" class="btn btn-primary">Submit</button>
    </div>
  </form>
  <center><h1>Predicted Result : {{prediction}}</h1></center>
</div>
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/js/bootstrap.bundle.min.js"></script>
</body>
</html>
4.3.2 Back End Code:
from flask import Flask, render_template, request
import numpy as np
import pickle

app = Flask(__name__)

# Load the trained model
with open("best_nids_model.pkl", "rb") as model_file:
    model = pickle.load(model_file)

# Define the selected features (names must match the form field names)
selected_features = [
    "Flow Duration", "Protocol", "Tot Fwd Pkts", "Tot Bwd Pkts",
    "Fwd Pkt Len Max", "Bwd Pkt Len Max", "Flow Byts/s", "Flow Pkts/s",
    "Hour", "Minute", "Second"
]

# Class labels, indexed by the integer class the model predicts
lab = ['Benign', 'FTP-BruteForce', 'SSH-Bruteforce']


@app.route("/")
def home():
    return render_template("index.html")  # Ensure "index.html" is the HTML form


@app.route("/predict", methods=["POST"])
def predict():
    try:
        # Get form values as a list of floats
        values = [float(request.form[feature]) for feature in selected_features]
        # Convert to a NumPy array and reshape for model input
        input_array = np.array(values).reshape(1, -1)
        # Make prediction (cast to int so it can index the label list)
        prediction = int(model.predict(input_array)[0])
        # Return the prediction result
        return render_template("index.html", prediction=lab[prediction])
    except Exception as e:
        return f"Error: {str(e)}"


if __name__ == "__main__":
    app.run(debug=True)
CHAPTER- 5
SYSTEM TESTING
While slightly behind Random Forest, XGBoost was more efficient in handling data
with noise and showed good generalization.
Logistic Regression:
Logistic Regression had the lowest F1-score of 0.943.
It performed decently but struggled with multiclass classification and complex
relationships in the dataset.
Being a linear model, it failed to capture the non-linear dependencies between features
effectively.
Visual Analysis
The performance was also visualized using confusion matrices and classification reports,
helping us understand misclassification patterns. Most attack types were correctly identified,
with minor confusion between similar patterns like DDoS and PortScan, which often exhibit
close network behaviors.
Visualizations using Seaborn heatmaps of confusion matrices revealed:
High true positive rates for both normal and attack classes in Random Forest and
XGBoost.
Logistic Regression had relatively higher false negatives, especially for minority
classes.
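The false-negative observation above can be read directly off a confusion matrix. The sketch below counts (true, predicted) label pairs and derives the share of missed samples per class; the labels and predictions are hypothetical toy values, not the project's results.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (true_label, predicted_label) pairs; matching pairs are correct predictions."""
    return Counter(zip(y_true, y_pred))


def false_negative_rate(counts, label):
    """Share of samples of `label` that were predicted as some other class."""
    total = sum(n for (t, _), n in counts.items() if t == label)
    missed = sum(n for (t, p), n in counts.items() if t == label and p != label)
    return missed / total if total else 0.0


# Hypothetical labels: a large benign class and a small attack class.
y_true = ["Benign"] * 8 + ["SSH-Bruteforce"] * 2
y_pred = ["Benign"] * 8 + ["SSH-Bruteforce", "Benign"]
counts = confusion_counts(y_true, y_pred)
```

In this toy case a single missed attack already gives the minority class a false-negative rate of 0.5, which shows why minority classes need to be inspected separately from overall accuracy.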
Observations
Random Forest consistently produced the most reliable and stable results, making it
the most suitable model for real-world deployment.
XGBoost, while slightly less accurate, is faster in prediction time, which could be an
advantage in real-time systems.
Logistic Regression is lightweight and interpretable but may not be sufficient for large-
scale, complex traffic environments like CICIDS2018.
Feature selection and preprocessing (e.g., normalization, label encoding, filling
missing values) played a crucial role in improving model accuracy.
Real-Time Testing
The system was further tested on a sample network session with manually fed input features
such as flow duration, packet sizes, and timestamp-based variables. The model successfully
predicted whether the session was “Normal” or “Attack”, demonstrating its applicability in
live environments.
The prediction function using the saved model (best_nids_model.pkl) worked accurately with
unseen data, confirming that the system generalizes well.
4.6 Discussion
The results confirm that machine learning can significantly improve network intrusion
detection over traditional rule-based methods. This system is capable of learning from vast and
evolving datasets, automatically identifying complex patterns and adapting to new types of
attacks.
The CICIDS2018 dataset ensured that models were trained on realistic traffic, which enhanced
their effectiveness and robustness. Furthermore, implementing timestamp features (Hour,
Minute, Second) added temporal awareness, which improved model context in detecting time-
specific attacks.
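Deriving the Hour, Minute, and Second features from a flow timestamp can be sketched as follows. The timestamp format string is an assumption about how the timestamp column is written and may need adjusting to the actual data; the function name is illustrative.

```python
from datetime import datetime


def time_features(timestamp, fmt="%d/%m/%Y %H:%M:%S"):
    """Split a flow timestamp string into the Hour, Minute and Second features."""
    parsed = datetime.strptime(timestamp, fmt)
    return {"Hour": parsed.hour, "Minute": parsed.minute, "Second": parsed.second}


# Hypothetical timestamp in the assumed day/month/year format.
features = time_features("14/02/2018 08:31:05")
```

The resulting dictionary keys match the "Hour", "Minute", and "Second" field names used by the form and the back end.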
CHAPTER – 6
CONCLUSION
6.1 CONCLUSION:
In today’s digital era, the exponential growth of internet usage, online transactions, and digital
services has significantly increased the vulnerability of networks to cyber threats. Traditional
intrusion detection methods, which rely on manually crafted rules or known attack signatures,
are increasingly becoming ineffective in detecting sophisticated and evolving cyberattacks.
Hence, the development of intelligent, adaptable, and robust network security systems is
essential.
This project focused on building a Robust Network Intrusion Detection System (NIDS) using
machine learning algorithms such as Random Forest, XGBoost, and Logistic Regression,
trained on the CICIDS2018 dataset. The dataset provided a realistic mix of normal and attack
traffic, which was crucial for developing models that perform well in real-world environments.
The selected features from the dataset, along with proper preprocessing, scaling, and encoding
techniques, contributed to the enhanced performance of the models.
The results demonstrated that Random Forest achieved the highest accuracy and F1-score
among the three models, showcasing its strength in handling multiclass classification problems
and its ability to generalize well. XGBoost also performed exceptionally well, with faster
training and prediction times, making it a strong candidate for real-time applications. Logistic
Regression, although simpler and faster, showed limitations in capturing complex data patterns,
especially in multiclass scenarios.
The system was further validated through real-time testing with unseen data, and it successfully
predicted whether the given network session was normal or an attack. This confirms the
practicality and efficiency of the proposed system in real-time network environments.
Page | 57
MONITORING VITAL SIGNS AND AUTOMATED PRESCRIPTION GENERATION
REFERENCES
Page | 58
MONITORING VITAL SIGNS AND AUTOMATED PRESCRIPTION GENERATION
REFERENCE
1.Ahmad, Z., Shahid Khan, A., Wai Shiang, C., Abdullah, J., & Ahmad, F. (2021). Network
intrusion detection system: A systematic study of machine learning and deep learning
approaches. Transactions on Emerging Telecommunications Technologies, 32(1), e4150.
2. Javaid, A., Niyaz, Q., Sun, W., & Alam, M. (2016, May). A deep learning approach for
network intrusion detection system. In Proceedings of the 9th EAI International Conference on
Bio-inspired Information and Communications Technologies (formerly BIONETICS) (pp. 21-
26).
3.Sekar, R., Guang, Y., Verma, S., & Shanbhag, T. (1999, November). A high-performance
network intrusion detection system. In Proceedings of the 6th ACM Conference on Computer
and Communications Security (pp. 8-17).
4.Farnaaz, N., & Jabbar, M. A. (2016). Random forest modeling for network intrusion detection
system. Procedia Computer Science, 89, 213-217.
5. Ashiku, L., & Dagli, C. (2021). Network intrusion detection system using deep
learning. Procedia Computer Science, 185, 239-247.
6. Mukherjee, B., Heberlein, L. T., & Levitt, K. N. (1994). Network intrusion detection. IEEE
network, 8(3), 26-41.
7. Abdulganiyu, O. H., Ait Tchakoucht, T., & Saheed, Y. K. (2023). A systematic literature
review for network intrusion detection system (IDS). International journal of information
security, 22(5), 1125-1162.
8. Shanmugavadivu, R., & Nagarajan, N. (2011). Network intrusion detection system using
fuzzy logic. Indian Journal of Computer Science and Engineering (IJCSE), 2(1), 101-111.
9. Raghunath, B. R., & Mahadeo, S. N. (2008, July). Network intrusion detection system
(NIDS). In 2008 first international conference on emerging trends in engineering and
technology (pp. 1272-1277). IEEE.
10. Zhang, J., Zulkernine, M., & Haque, A. (2008). Random-forests-based network intrusion
detection systems. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications
and Reviews), 38(5), 649-659.
11. Sultana, N., Chilamkurti, N., Peng, W., & Alhadad, R. (2019). Survey on SDN based
network intrusion detection system using machine learning approaches. Peer-to-Peer
Networking and Applications, 12(2), 493-501.
12. Liao, H. J., Lin, C. H. R., Lin, Y. C., & Tung, K. Y. (2013). Intrusion detection system: A
comprehensive review. Journal of Network and Computer Applications, 36(1), 16-24.
13. Moustafa, N., & Slay, J. (2015, November). UNSW-NB15: a comprehensive data set for
network intrusion detection systems (UNSW-NB15 network data set). In 2015 Military
Communications and Information Systems Conference (MilCIS) (pp. 1-6). IEEE.
14. Aldarwbi, M. Y., Lashkari, A. H., & Ghorbani, A. A. (2022). The sound of intrusion: A
novel network intrusion detection system. Computers and Electrical Engineering, 104,
108455.
15. Koc, L., Mazzuchi, T. A., & Sarkani, S. (2012). A network intrusion detection system based
on a Hidden Naïve Bayes multiclass classifier. Expert Systems with Applications, 39(18),
13492-13500.
16. Casas, P., Mazel, J., & Owezarski, P. (2012). Unsupervised network intrusion detection
systems: Detecting the unknown without knowledge. Computer Communications, 35(7), 772-
783.
17. Sahu, S., & Mehtre, B. M. (2015, August). Network intrusion detection system using J48
Decision Tree. In 2015 International Conference on Advances in Computing, Communications
and Informatics (ICACCI) (pp. 2023-2026). IEEE.
18. Moustafa, N., & Slay, J. (2015, November). The significant features of the UNSW-NB15
and the KDD99 data sets for network intrusion detection systems. In 2015 4th International
Workshop on Building Analysis Datasets and Gathering Experience Returns for Security
(BADGERS) (pp. 25-31). IEEE.
19. Gurung, S., Ghose, M. K., & Subedi, A. (2019). Deep learning approach on network
intrusion detection system using NSL-KDD dataset. International Journal of Computer
Network and Information Security, 11(3), 8-14.
20. Kim, D. S., Nguyen, H. N., & Park, J. S. (2005, March). Genetic algorithm to improve
SVM based network intrusion detection system. In 19th International Conference on Advanced
Information Networking and Applications (AINA'05) Volume 1 (AINA papers) (Vol. 2, pp. 155-
158). IEEE.
21. Vigna, G., & Kemmerer, R. A. (1999). NetSTAT: A network-based intrusion detection
system. Journal of Computer Security, 7(1), 37-71.
22. Sarhan, M., Layeghy, S., & Portmann, M. (2022). Towards a standard feature set for
network intrusion detection system datasets. Mobile Networks and Applications, 1-14.
23. Bai, Y., & Kobayashi, H. (2003, March). Intrusion detection systems: technology and
development. In 17th International Conference on Advanced Information Networking and
Applications, 2003. AINA 2003. (pp. 710-715). IEEE.
24. He, K., Kim, D. D., & Asghar, M. R. (2023). Adversarial machine learning for network
intrusion detection systems: A comprehensive survey. IEEE Communications Surveys &
Tutorials, 25(1), 538-566.
25. Van, N. T., & Thinh, T. N. (2017, July). An anomaly-based network intrusion detection
system using deep learning. In 2017 International Conference on System Science and
Engineering (ICSSE) (pp. 210-214). IEEE.
26. Mohammadpour, L., Ling, T. C., Liew, C. S., & Chong, C. Y. (2018). A convolutional neural
network for network intrusion detection system. Proceedings of the Asia-Pacific Advanced
Network, 46(0), 50-55.
27. Aydın, M. A., Zaim, A. H., & Ceylan, K. G. (2009). A hybrid intrusion detection system
design for computer network security. Computers & Electrical Engineering, 35(3), 517-526.
28. Hodo, E., Bellekens, X., Hamilton, A., Dubouilh, P. L., Iorkyase, E., Tachtatzis, C., &
Atkinson, R. (2016, May). Threat analysis of IoT networks using artificial neural network
intrusion detection system. In 2016 International Symposium on Networks, Computers and
Communications (ISNCC) (pp. 1-6). IEEE.
29. Alzahrani, A. O., & Alenazi, M. J. (2021). Designing a network intrusion detection system
based on machine learning for software defined networks. Future Internet, 13(5), 111.
30. Pillai, M. M., Eloff, J. H., & Venter, H. S. (2004, October). An approach to implement a
network intrusion detection system using genetic algorithms. In ACM International Conference
Proceeding Series (Vol. 75, pp. 221-221).