Intrusion Detection System Final 5

The document is a mini project report on an Intrusion Detection System (IDS) submitted by students from Dr. Sivanthi Aditanar College of Engineering. It covers the definition, types, advantages, and disadvantages of IDS, as well as its architecture, data collection, analysis methods, alert generation, and testing. The report emphasizes the importance of IDS in network security and the need for ongoing maintenance and evaluation to effectively detect and respond to threats.

INTRUSION DETECTION SYSTEM

A MINI PROJECT REPORT

Submitted by

ASWATH RS (950520104005)

KARTHIK RAJA G (950520104020)

SELVAMANI ALIAS PINEKAS N (950520104039)

VELUKUMARASAMY A (950520104046)

BACHELOR OF ENGINEERING

IN

COMPUTER SCIENCE AND ENGINEERING

DR. SIVANTHI ADITANAR COLLEGE OF ENGINEERING,


TIRUCHENDUR-628 215

ANNA UNIVERSITY: CHENNAI 600 025

MAY 2023
ANNA UNIVERSITY: CHENNAI 600 025

BONAFIDE CERTIFICATE

Certified that this project report "INTRUSION DETECTION SYSTEM" is the
bonafide work of "ASWATH RS (950520104005), KARTHIK RAJA G
(950520104020), SELVAMANI ALIAS PINEKAS N (950520104039),
VELUKUMARASAMY A (950520104046)" who carried out the Mini Project work
under my supervision.

SIGNATURE
Dr. G. Wiselin Jiji, M.E., Ph.D.,
PRINCIPAL & HEAD
HEAD OF DEPARTMENT
Department of Computer Science and Engineering
Dr. Sivanthi Aditanar College of Engineering
Tiruchendur-628215

SIGNATURE
Dr. D. Kesavaraja, M.E., Ph.D.,
SUPERVISOR
ASSOCIATE PROFESSOR
Department of Computer Science and Engineering
Dr. Sivanthi Aditanar College of Engineering
Tiruchendur-628215

Submitted to the B.E Mini Project viva-voce examination held on …………………

INTERNAL EXAMINER EXTERNAL EXAMINER


ACKNOWLEDGEMENT

First and foremost, we would like to thank The God Almighty, who by
his abundant grace sustained us to complete the course and the project
successfully.

Our sincere thanks to our honorable founder Padma Shri Dr. B. Sivanthi
Aditanar and our beloved chairman Sri. S. Balasubramanian Adityan for
providing us with an excellent infrastructure and conducive atmosphere for
developing our project.

We also thank our respected Principal and Head of Department of
Computer Science and Engineering Dr. G. Wiselin Jiji M.E., Ph.D. for
giving us the opportunity to display our professional skills through this
project.

We are greatly indebted to Dr. R. Jensi M.E., Ph.D. Associate
Professor, Department of Computer Science and Engineering for her
motivation and guidance throughout the course of this project work.

We are greatly thankful to our guide Dr. D. Kesavaraja M.E., Ph. D.
Associate Professor, Department of Computer Science and Engineering, for
his valuable guidance and motivation, which helped us to complete this
project on time.

We thank all our teaching and non-teaching staff members of the
Computer Science department for their passionate support, for helping us
to identify our flaws and also for the appreciation they gave us in achieving
our goal. Also, we would like to record our deepest gratitude to our parents
for their constant encouragement.

ABSTRACT

An intrusion detection system (IDS) is a critical component of
modern network security. It is designed to monitor network traffic and
identify potential security threats, including unauthorized access, misuse, or
other malicious activities. IDSs work by analyzing network traffic and
comparing it against a database of known attack signatures or behavior
patterns. They can be deployed at various points in a network and can generate
alerts or take automated actions to respond to threats. There are two main
types of IDSs: signature-based and behavior-based. Signature-based IDSs use
a database of known attack patterns to identify threats, while behavior-based
IDSs use machine learning and other techniques to analyse network traffic
and detect anomalies. IDSs can be deployed as standalone appliances or
integrated into existing security architectures to provide real-time threat
intelligence and response capabilities. IDSs are essential tools for protecting
networks from a wide range of threats, including malware infections, network
breaches, and insider attacks. However, they can also generate false positives,
which can be time-consuming to investigate, and they require ongoing
maintenance and updates to remain effective against evolving threats. Regular
testing and evaluation of the IDS is important to ensure it is providing
adequate protection against emerging threats. In addition to generating alerts,
IDSs can also take automated actions to respond to threats. However, these
automated responses should be carefully configured and tested to avoid
disrupting legitimate traffic or causing other unintended consequences.

TABLE OF CONTENTS

CHAPTER NO   TITLE

             ABSTRACT

1            INTRODUCTION
             1.1 Intrusion Detection System
             1.2 Types of IDS
             1.3 Advantages of IDS
             1.4 Disadvantages of IDS

2            DESIGN AND IMPLEMENTATION
             2.1 System Architecture
             2.2 Data Collection
             2.3 Data Analysis
             2.4 Alert Generation
             2.5 Reporting

3            SYSTEM ARCHITECTURE
             3.1 System Requirements
             3.2 System Design
             3.3 Hardware and Software Requirements

4            DATA COLLECTION
             4.1 Packet Capture
             4.2 Data Pre-processing
             4.3 Data Storage

5            DATA ANALYSIS
             5.1 Signature-based Detection
             5.2 Anomaly-based Detection
             5.3 Network Behaviour Analysis
             5.4 Machine Learning in IDS

6            ALERT GENERATION
             6.1 Alert Criteria
             6.2 Alert Prioritization
             6.3 Alert Notification

7            REPORTING
             7.1 Reporting Requirements
             7.2 Report Generation
             7.3 Report Analysis

8            ID3 ALGORITHM
             8.1 Introduction
             8.2 History of ID3
             8.3 ID3 in IDS
             8.4 ID3 Algorithm
             8.5 ID3 in IDS code
             8.6 Scope of ID3
             8.7 Advantage of ID3
             8.8 Disadvantage of ID3

9            TESTING AND VALIDATION
             9.1 Test Plan
             9.2 Test Cases
             9.3 Validation

10           CONCLUSION

11           REFERENCES

CHAPTER 1

OVERVIEW OF INTRUSION DETECTION SYSTEM

1.1. INTRUSION DETECTION SYSTEM

An Intrusion Detection System (IDS) is a security mechanism
that monitors network traffic and alerts the administrator when an
intrusion occurs. IDS can detect various types of attacks, including port
scanning, denial of service (DoS), buffer overflow, and malware. IDS
can be classified into three types: network-based, host-based, and
application-based.

An Intrusion Detection System (IDS) is a security technology
designed to detect unauthorized access or malicious activities within a
computer network or system. Its primary function is to monitor network
traffic and identify any suspicious or potentially harmful behavior.

The IDS operates by examining network packets, system logs, and
other network activities, comparing them against a set of predefined
rules or patterns. These rules define what is considered normal or
acceptable behavior and what is deemed anomalous or potentially
malicious. When an IDS detects a deviation from the established
patterns or identifies suspicious activities, it generates an alert or takes
appropriate action to notify network administrators or trigger an
automated response.

1.2. TYPES OF IDS

Network-based IDS (NIDS) monitors network traffic
and detects attacks based on network activity. Host-based IDS (HIDS)
monitors system events and detects attacks based on system activity.
Application-based IDS (AIDS) monitors application activity and
detects attacks based on application behavior.

There are several types of IDS, each with its own approach to detecting
and responding to security threats. Here are brief explanations of the
main types:

Signature-based IDS:

This type of IDS relies on a database of known attack
signatures or patterns. It compares network traffic or system behavior
against these signatures to identify known threats. If a match is found,
an alert is generated. Signature-based IDS is effective against known
attacks but may struggle with detecting new or zero-day attacks.

Heuristic-based IDS:

Heuristic-based IDS uses a set of predefined rules or
algorithms to detect patterns that indicate malicious activity. It
combines aspects of both signature-based and anomaly-based
detection. Heuristic-based IDS can identify certain types of attacks that
are not caught by signature-based IDS, but it may also produce false
positives or miss sophisticated attacks.

Network Behavior Analysis (NBA) IDS:

NBA IDS focuses on analyzing network traffic patterns
and behavior to identify abnormal or malicious activities. It looks for
deviations from expected traffic patterns, such as unusual data
transfers, port scanning, or unauthorized network connections. NBA
IDS can provide insights into potential threats and help detect zero-day
attacks, but it may require more computational resources.

Intrusion Prevention System (IPS):

IPS goes beyond detection and actively takes measures to
prevent intrusions. It can block or drop malicious traffic,
reconfigure network settings, or terminate suspicious connections. IPS
can operate as a standalone system or as an additional feature within an
IDS.

Hybrid IDS:

Hybrid IDS combines multiple detection techniques, such as
signature-based, anomaly-based, and heuristic-based approaches, to
provide comprehensive threat detection. By leveraging the strengths of
different detection methods, hybrid IDS can enhance accuracy and
reduce false positives and false negatives.

1.3. ADVANTAGES OF IDS

IDS can detect attacks that firewalls cannot. IDS can detect
internal and external attacks. IDS can detect known and unknown
attacks. IDS can generate alerts when an attack occurs.

IDS helps in the early detection of potential security threats, including
unauthorized access attempts, malware infections, suspicious network
activities, and system vulnerabilities. It provides real-time monitoring
and alerts administrators or security teams to take appropriate actions
promptly.

IDS logs and reports can assist in meeting regulatory compliance
requirements. They provide evidence of security measures in place, aid
in audits, and demonstrate adherence to security policies and industry
regulations.

1.4. DISADVANTAGES OF IDS

IDS can generate false positives. IDS can generate false
negatives. IDS can be bypassed by attackers. IDS can consume network
bandwidth and system resources.

While Intrusion Detection Systems (IDS) provide several benefits,
they also have some potential disadvantages that organizations should
consider:

IDS systems require ongoing maintenance, including regular updates
to signature databases and system software, to effectively detect new
threats. Proper configuration and tuning are crucial to minimize false
positives and false negatives, but this process can be complex and time-
consuming, requiring expertise and ongoing monitoring.

As networks become more complex and distributed, deploying and
managing IDS systems across the entire infrastructure can be
challenging. Ensuring coverage across various network segments,
dealing with encrypted traffic, or managing IDS deployments in cloud
or virtualized environments can pose logistical and technical
difficulties.

While IDS plays a valuable role in network security, organizations
should carefully consider these disadvantages and address them
through appropriate planning, configuration, and continuous
monitoring to maximize the effectiveness of their IDS deployment.

CHAPTER 2

DESIGN AND IMPLEMENTATION

2.1. SYSTEM ARCHITECTURE

The IDS system architecture consists of three main
components: data collection, data analysis, and alert generation. The
data collection component captures network traffic and stores it in a
database. The data analysis component analyses network traffic and
generates alerts. The alert generation component generates alerts and
notifies the administrator. The system architecture should be scalable
and fault tolerant.

Anomaly-based IDS establishes a baseline of normal behavior by
analyzing network traffic or system activities over time. It then looks
for deviations from this baseline and raises an alert if it detects unusual
or suspicious behavior.

IDS systems monitor network traffic and perform analysis, which can
introduce latency and impact network performance. High network
traffic volume or resource-intensive analysis can strain the IDS, leading
to potential performance degradation or network bottlenecks. Careful
capacity planning and optimization are necessary to minimize these
performance impacts.

The system architecture of an Intrusion Detection System (IDS) refers
to the underlying design and components that work together to enable
the detection and monitoring of security threats. The specific
architecture may vary depending on the type of IDS and its
implementation; the key components typically found in an IDS
architecture are described in the sections that follow.

2.2. DATA COLLECTION

The data collection component captures network traffic
using a packet capture library such as libpcap. The captured packets are
pre-processed to remove irrelevant packets and stored in a database
such as MySQL. Data collection can be done using various techniques
such as port mirroring, SPAN ports, or network taps.

Network-based IDS (NIDS) collects data by monitoring network
traffic. It deploys sensors or network taps at strategic locations within
the network infrastructure to capture network packets. The sensors
capture packets traversing the network and forward them to the IDS for
analysis. This data includes information such as source and destination
IP addresses, port numbers, protocols, packet payloads, and other
relevant metadata.

Host-based IDS (HIDS) collects data from individual hosts or servers
by monitoring system logs and event data. It utilizes agents installed on
the hosts to capture and analyse various types of logs, including system
logs, application logs, authentication logs, and other relevant events.
These logs provide information about system activities, user behavior,
file changes, and other events that can indicate potential security
incidents.

IDS can also incorporate external feeds or threat intelligence data from
trusted sources. These feeds provide additional information about
known malicious IP addresses, domains, or other indicators of
compromise. By leveraging external feeds, IDS can enhance its
detection capabilities by cross-referencing observed network traffic
against known threats.
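
The cross-referencing step described above can be sketched as follows. This is a minimal illustration, not the report's implementation: the `BLOCKLIST` entries are hypothetical indicators taken from reserved documentation address ranges, and `is_known_malicious` is an assumed helper name.

```python
import ipaddress

# Hypothetical threat-intelligence indicators (documentation address ranges).
BLOCKLIST = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]

def is_known_malicious(ip: str, blocklist=BLOCKLIST) -> bool:
    """Return True if the address falls inside any blocklisted network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in blocklist)

# Cross-reference the source addresses of observed traffic against the feed.
observed_sources = ["192.168.1.20", "203.0.113.50"]
flagged = [ip for ip in observed_sources if is_known_malicious(ip)]
```

A real feed would be refreshed periodically from a trusted provider; here the list is hard-coded only to keep the sketch self-contained.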

By collecting and analyzing data from multiple sources, IDS can gain
visibility into the network, host activities, and system events. This
comprehensive data collection enables the IDS to detect anomalies,
identify potential security threats, and generate alerts or notifications
for further investigation and response by security personnel.

2.3. DATA ANALYSIS

The data analysis component analyses network traffic
using two detection methods: signature-based detection and anomaly-
based detection. The signature-based detection method matches
network traffic against known attack signatures. The anomaly-based
detection method detects abnormal behavior in network traffic that
deviates from normal patterns. The data analysis component should be
able to handle large amounts of data and provide fast results.

Signature-based IDS compares the collected data against a database of
known attack signatures or patterns. It looks for exact matches between
the observed data and the signatures in its database. If a match is found,
it indicates the presence of a known attack or malicious activity.
Signature matching is effective in detecting well-known attacks but
may miss novel or unknown threats.

Anomaly-based IDS focuses on identifying deviations from normal or
expected behavior. It establishes a baseline of normal activity by
analyzing historical data or learning from observed patterns. Any
significant deviation from the baseline is flagged as an anomaly and
potentially indicative of a security threat. Anomaly detection can
identify previously unknown or zero-day attacks but may also generate
false positives if the baseline is not accurately established.

Behavior-based IDS monitors and analyzes the behavior of entities within the
network or system, such as users, hosts, or applications. It looks for
patterns that deviate from expected behavior and may indicate
malicious intent. Behavior analysis considers factors such as
communication patterns, resource usage, access privileges, or data
transfer volumes. It helps identify insider threats, unauthorized
activities, or abnormal user behaviours.

2.4. ALERT GENERATION

The alert generation component generates alerts
when an attack is detected. Alerts should be prioritized based on
severity and relevance. The alert generation component should be able
to generate alerts in real-time and notify the administrator via email or
SMS.
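
Prioritizing alerts by severity can be sketched as a simple sort. The severity names and their ordering in `SEVERITY_RANK`, and the alert dictionary layout, are assumptions made for illustration; email or SMS delivery is left out.

```python
# Assumed severity scale; lower rank means more urgent.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3, "informational": 4}

def prioritize(alerts):
    """Order alerts by severity first, then by time of detection."""
    return sorted(alerts, key=lambda a: (SEVERITY_RANK[a["severity"]], a["time"]))

pending = [
    {"time": "2023-05-01T10:02:00", "severity": "low", "attack_type": "port_scan"},
    {"time": "2023-05-01T10:05:00", "severity": "critical", "attack_type": "dos"},
    {"time": "2023-05-01T10:01:00", "severity": "high", "attack_type": "brute_force"},
]
queue = prioritize(pending)
```

The administrator would then be notified about the entries at the front of `queue` first.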

The IDS analyzes the collected data using various detection techniques,
such as signature matching, anomaly detection, heuristic analysis, or
behavior analysis. It compares the observed data against known attack
signatures, baseline behavior, or predefined rules to identify potential
security threats.

IDS may utilize predefined thresholds or triggers to determine when an
event or activity should be flagged as suspicious or indicative of a
security incident. These thresholds can be based on factors such as the
number of failed login attempts, network traffic volume, or abnormal
system behavior. When a threshold is crossed or a trigger condition is
met, an alert is generated.

IDS assigns a severity level to each alert to indicate the potential
impact of the detected event. Severity levels can range from
informational or low to critical or high, allowing security teams to
prioritize their response based on the potential risk or impact of the
event.
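
A threshold-based trigger of this kind can be sketched with a counter over failed login events. The threshold of 5 attempts and the cut-offs inside `severity_for` are arbitrary values assumed for the example, not figures from the report.

```python
from collections import Counter

FAILED_LOGIN_THRESHOLD = 5  # assumed value; would be tuned per deployment

def severity_for(failed_attempts: int) -> str:
    """Map a failed-login count to an assumed severity level."""
    if failed_attempts >= 20:
        return "critical"
    if failed_attempts >= 10:
        return "high"
    return "medium"

def failed_login_alerts(events, threshold=FAILED_LOGIN_THRESHOLD):
    """events: iterable of (source_ip, login_succeeded) pairs."""
    failures = Counter(ip for ip, succeeded in events if not succeeded)
    return [
        {"source_ip": ip, "failed_attempts": count, "severity": severity_for(count)}
        for ip, count in sorted(failures.items())
        if count >= threshold
    ]

login_events = (
    [("10.0.0.1", False)] * 12
    + [("10.0.0.2", False)] * 3
    + [("10.0.0.2", True)]
)
alerts = failed_login_alerts(login_events)
```

Only the first source crosses the threshold here; the second stays below it and produces no alert.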

2.5. REPORTING

The reporting component provides reports on IDS activity and alerts.
Reports should include details such as the type of attack, source and
destination IP addresses, and the time of the attack. Reports can be
generated in various formats such as PDF or CSV.
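
As an illustration of the CSV option, the sketch below writes alerts with the fields named above using Python's standard `csv` module. The column names and the `alerts_to_csv` helper are assumptions chosen for the example.

```python
import csv
import io

REPORT_FIELDS = ["time", "attack_type", "src_ip", "dst_ip"]  # assumed column names

def alerts_to_csv(alerts) -> str:
    """Render a list of alert dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=REPORT_FIELDS)
    writer.writeheader()
    writer.writerows(alerts)
    return buffer.getvalue()

report = alerts_to_csv([
    {"time": "2023-05-01T10:00:00", "attack_type": "port_scan",
     "src_ip": "10.0.0.1", "dst_ip": "192.168.1.5"},
])
```

Writing to a `StringIO` buffer keeps the example free of file paths; the same writer works on a file opened with `newline=""`.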

IDS reports typically include event summaries that provide an
overview of the detected security events during a specific time period.
These summaries may include information such as the total number of
events, the distribution of events by severity level, the top detected
threats or attack types, and the affected hosts or systems.
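
An event summary of this shape can be sketched with `collections.Counter`. The event dictionary keys (`severity`, `attack_type`, `dst_ip`) are assumed for illustration.

```python
from collections import Counter

def summarize(events, top_n=3):
    """Build a summary: totals, severity distribution, top attacks, affected hosts."""
    return {
        "total_events": len(events),
        "by_severity": dict(Counter(e["severity"] for e in events)),
        "top_attack_types": Counter(e["attack_type"] for e in events).most_common(top_n),
        "affected_hosts": sorted({e["dst_ip"] for e in events}),
    }

sample_events = [
    {"severity": "high", "attack_type": "port_scan", "dst_ip": "192.168.1.5"},
    {"severity": "high", "attack_type": "port_scan", "dst_ip": "192.168.1.6"},
    {"severity": "low", "attack_type": "dos", "dst_ip": "192.168.1.5"},
]
summary = summarize(sample_events)
```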

Reporting includes detailed information about the generated alerts,
including the nature of the detected events, associated timestamps,
source and destination IP addresses, affected hosts or systems, and
severity levels. This helps in understanding the specific security
incidents and their impact on the network or system.

CHAPTER 3

SYSTEM ARCHITECTURE

3.1. SYSTEM REQUIREMENTS

This section outlines the functional and non-functional
requirements of the Intrusion Detection System. Functional
requirements are specific features or functions that the system must
have, such as the ability to detect and alert for unauthorized access
attempts. Non-functional requirements are performance-related
characteristics, such as system availability, scalability, and reliability.

IDS may have specific hardware requirements depending on the
deployment type and the volume of network traffic to be monitored.
The hardware requirements typically include sufficient processing
power, memory (RAM), and storage capacity to handle the data
analysis and storage needs of the IDS. Additionally, network interface
cards (NICs) or specialized network sensors may be required for
capturing and monitoring network traffic.

IDS software typically runs on a specific operating system (OS)
platform. The system requirements will specify the compatible OS
versions, patches, and updates needed for the IDS to operate effectively.
It is essential to ensure that the IDS software is compatible with the
chosen OS and that the OS is properly configured and secured to
support the IDS functions.

The IDS relies on a robust network infrastructure to capture and analyse
network traffic. Sufficient network bandwidth is essential to handle the
volume of traffic being monitored without causing network congestion
or performance degradation. Proper network segmentation, SPAN
ports, or network taps may be required to facilitate the traffic
monitoring and data collection process.

3.2. SYSTEM DESIGN

The system design section describes the overall architecture
of the Intrusion Detection System. It includes the network topology,
data flow, and component interactions. This section may also include
diagrams, flowcharts, and other visual aids to help readers understand
the system's design.

IDS system design begins with determining the appropriate
architecture for the organization's needs. There are various architecture
options, such as network-based IDS (NIDS), host-based IDS (HIDS),
or a combination of both (hybrid IDS). The chosen architecture
depends on factors like the network infrastructure, security goals,
monitoring scope, and deployment complexity.

In network-based IDS (NIDS), sensor placement is a critical aspect of
the system design. Sensors are strategically positioned within the
network to capture and monitor network traffic. The placement depends
on factors such as network topology, traffic patterns, critical network
segments, and potential attack vectors. Proper sensor placement
ensures comprehensive visibility into network traffic.

System design includes planning for data collection mechanisms. This
involves capturing network traffic or host-level events for analysis.
Data collection methods can include network taps, port mirroring, span
ports, or agent-based monitoring on individual hosts. The design should
consider the scalability, performance, and reliability of the data
collection mechanisms to handle the desired level of monitoring.

IDS system design includes defining the processes and techniques for
data analysis. This involves selecting appropriate detection methods
such as signature-based analysis, anomaly detection, heuristic analysis,
or behavior-based analysis. The design should consider the
computational resources, algorithms, and rule sets required for efficient
and accurate detection of security threats.

3.3. HARDWARE AND SOFTWARE REQUIREMENTS

Hardware and software requirements are essential
considerations when deploying an Intrusion Detection System (IDS).
These requirements ensure that the IDS operates effectively, efficiently,
and securely. Here is a brief explanation of hardware and software
requirements for an IDS:

Hardware Requirements

The IDS requires sufficient processing power to analyze
network traffic or host events in real-time. The hardware should include
modern processors with multiple cores to handle the computational
demands of the IDS algorithms.

Adequate memory is crucial for the IDS to store and process data
efficiently. The amount of required RAM depends on the volume of
network traffic or the number of hosts being monitored. Sufficient
memory ensures smooth data analysis and reduces the risk of
performance degradation.

The IDS requires storage space to store collected data, event logs, and
alerts. The storage capacity depends on factors such as the duration of
data retention, the expected volume of network traffic, and the
organization's compliance or auditing requirements. Sufficient storage
capacity is necessary for forensic analysis, incident response, and
historical trend analysis.

Software Requirements

IDS software runs on a specific operating system (OS). The
software requirements specify compatible OS versions and any
required patches or updates. It is important to select an OS that is
known for stability, security, and compatibility with the IDS software.

Some IDS solutions require a DBMS for storing and managing
collected data, event logs, and alert information. The software
requirements may specify a particular DBMS version or provide
compatibility with multiple options. It is necessary to ensure that the
selected DBMS meets the performance, scalability, and security needs
of the IDS.

CHAPTER 4

DATA COLLECTION

4.1. PACKET CAPTURE

Packet capture is the process of capturing network traffic.
Packet capture can be done using various techniques such as port
mirroring, span ports, or network taps. Packet capture can be done on
different layers of the OSI model, such as layer 2 or layer 3. Packet
capture can capture packets based on different criteria such as source
IP address, destination IP address, and port number.

Packet capture in an Intrusion Detection System (IDS) refers to the
process of capturing and inspecting network packets to detect potential
security threats. It involves capturing packets from the network traffic
flow and analyzing their content to identify suspicious or malicious
activities. Here is a brief explanation of packet capture in IDS:

IDS relies on packet capture to monitor network traffic and analyze the
packets in real-time or offline. The IDS system captures packets from
different network segments, interfaces, or specific points in the network
where the traffic needs to be monitored.

The IDS uses packet sniffing techniques to capture packets. It can be
achieved by employing specialized network sensors, network interface
cards (NICs), or by configuring network switches to forward a copy of
packets to the IDS sensors. Packet sniffing allows the IDS to inspect
the content and analyse the packets for potential security threats.

Packet capture enables the IDS to perform protocol analysis. The IDS
examines the packet headers and payload to identify the protocol being
used, source and destination IP addresses, port numbers, and other
relevant information. Protocol analysis helps the IDS understand the
nature of the traffic and identify any anomalies or deviations from
expected behavior.

4.2. PACKET PRE-PROCESSING

Packet pre-processing is the process of filtering and selecting
relevant packets from the captured packets. Packet pre-processing can
remove irrelevant packets such as broadcast packets or packets that do
not belong to the network segment being monitored. Packet
pre-processing can also extract relevant fields from the packet
header such as source and destination IP addresses and port numbers.
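
The filtering rule described above can be sketched with the standard `ipaddress` module. The monitored segment `192.168.1.0/24` and the helper name `is_relevant` are assumptions made for the example.

```python
import ipaddress

MONITORED_SEGMENT = ipaddress.ip_network("192.168.1.0/24")  # assumed segment
LIMITED_BROADCAST = ipaddress.ip_address("255.255.255.255")

def is_relevant(src_ip: str, dst_ip: str, network=MONITORED_SEGMENT) -> bool:
    """Keep a packet only if it is not a broadcast and touches the monitored segment."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if dst == LIMITED_BROADCAST or dst == network.broadcast_address:
        return False  # broadcast traffic is discarded
    return src in network or dst in network

captured = [
    ("192.168.1.10", "8.8.8.8"),
    ("192.168.1.10", "192.168.1.255"),
    ("10.0.0.1", "8.8.8.8"),
]
kept = [pair for pair in captured if is_relevant(*pair)]
```

Whether broadcast traffic is really irrelevant depends on the deployment; some attacks rely on it, so this filter is a design choice rather than a rule.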

In the case of network protocols that split data across multiple packets
(e.g., IP fragmentation or TCP segmentation), packet reassembly is
performed to reconstruct the original data stream. This ensures that the
IDS analyzes the complete payload of network communications,
avoiding potential gaps or incomplete information.
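
A simplified sketch of reassembly follows. It assumes each fragment is given as a tuple of byte offset, a more-fragments flag, and data; real IP fragments express offsets in 8-byte units and need per-datagram bookkeeping, which is omitted here.

```python
def reassemble(fragments):
    """Rebuild a payload from (offset, more_fragments, data) tuples.

    Returns the payload as bytes, or None while fragments are still missing.
    """
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    payload = bytearray()
    for offset, _more_fragments, data in ordered:
        if offset != len(payload):
            return None  # gap or overlap: treat as incomplete
        payload.extend(data)
    if not ordered or ordered[-1][1]:
        return None  # the final fragment has not arrived yet
    return bytes(payload)

# Fragments may arrive out of order.
complete = reassemble([(8, False, b"ex.html"), (0, True, b"GET /ind")])
partial = reassemble([(0, True, b"GET /ind")])
```

Rejecting overlapping fragments outright is a conservative choice for the sketch; overlapping fragments are a known evasion technique and real systems need an explicit policy for them.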

Packet pre-processing includes parsing the packet headers to extract
relevant information about the network protocol being used. This
involves extracting details such as source and destination IP addresses,
port numbers, protocol types, sequence numbers, and other header
fields. Protocol parsing helps the IDS understand the context and
characteristics of the network traffic.
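
Header parsing of this kind can be sketched with the standard `struct` module. The example assumes a raw IPv4 packet without a link-layer header and handles only TCP; the sample packet is hand-built with illustrative values.

```python
import socket
import struct

TCP_PROTOCOL = 6  # IANA protocol number for TCP

def parse_ipv4_tcp(packet: bytes) -> dict:
    """Extract basic header fields from a raw IPv4 packet."""
    # Fixed 20-byte IPv4 header: version/IHL, TOS, total length, identification,
    # flags/fragment offset, TTL, protocol, checksum, source, destination.
    (ver_ihl, _tos, total_len, _ident, _frag, _ttl,
     protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    header_len = (ver_ihl & 0x0F) * 4  # IHL counts 32-bit words
    fields = {
        "src_ip": socket.inet_ntoa(src),
        "dst_ip": socket.inet_ntoa(dst),
        "protocol": protocol,
        "total_length": total_len,
    }
    if protocol == TCP_PROTOCOL:
        # First 8 bytes of the TCP header: ports and sequence number.
        src_port, dst_port, seq = struct.unpack(
            "!HHI", packet[header_len:header_len + 8])
        fields.update(src_port=src_port, dst_port=dst_port, sequence=seq)
    return fields

# Hand-built sample packet (checksum left as zero for simplicity).
sample_packet = (
    struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, TCP_PROTOCOL, 0,
                socket.inet_aton("192.168.1.10"), socket.inet_aton("10.0.0.5"))
    + struct.pack("!HHI", 51514, 80, 1000)
    + bytes(12)  # remainder of the TCP header, zeroed
)
parsed = parse_ipv4_tcp(sample_packet)
```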

In some cases, certain packet headers may be removed during
pre-processing to optimize the analysis process and reduce resource
consumption. Non-essential headers or fields that do not contribute to
intrusion detection may be stripped to streamline the subsequent
analysis and detection steps.

4.3. PACKET STORAGE

The pre-processed packets are stored in a database
such as MySQL. The database can be optimized for performance by
using techniques such as indexing and partitioning. The database
should be able to handle large amounts of data and provide fast retrieval
times.
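
The storage and indexing idea can be sketched as below. Python's built-in `sqlite3` module is used in place of MySQL purely so that the example is self-contained; the table layout and function names are assumptions.

```python
import sqlite3

def open_store(path=":memory:"):
    """Create the packet table and an index that speeds up per-source lookups."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS packets (
               id INTEGER PRIMARY KEY,
               ts REAL NOT NULL,
               src_ip TEXT NOT NULL,
               dst_ip TEXT NOT NULL,
               dst_port INTEGER,
               protocol INTEGER
           )"""
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_packets_src_ts ON packets (src_ip, ts)")
    return conn

def store_packets(conn, rows):
    """Insert (ts, src_ip, dst_ip, dst_port, protocol) rows in one transaction."""
    with conn:
        conn.executemany(
            "INSERT INTO packets (ts, src_ip, dst_ip, dst_port, protocol) "
            "VALUES (?, ?, ?, ?, ?)", rows)

def packets_from(conn, src_ip):
    """Return (ts, dst_ip, dst_port) rows for one source, oldest first."""
    return conn.execute(
        "SELECT ts, dst_ip, dst_port FROM packets WHERE src_ip = ? ORDER BY ts",
        (src_ip,)).fetchall()

store = open_store()
store_packets(store, [
    (2.0, "10.0.0.1", "192.168.1.5", 443, 6),
    (1.0, "10.0.0.1", "192.168.1.5", 80, 6),
    (1.5, "10.0.0.2", "192.168.1.6", 22, 6),
])
rows = packets_from(store, "10.0.0.1")
```

Partitioning, also mentioned above, is a server-side feature of databases such as MySQL and is not shown here.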

The IDS typically uses a capture buffer to temporarily store the
captured packets before further processing or analysis. The capture
buffer is a portion of memory or disk space that holds the packets. The
size of the capture buffer determines how many packets can be stored
temporarily before they are processed or analyzed.
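
A fixed-size in-memory capture buffer can be sketched with `collections.deque`. The `CaptureBuffer` class and its drop-oldest policy are assumptions for illustration; a real sensor might instead drop the newest packet or block the producer.

```python
from collections import deque

class CaptureBuffer:
    """Bounded buffer that evicts the oldest packet when full."""

    def __init__(self, capacity: int):
        self._packets = deque(maxlen=capacity)
        self.dropped = 0  # number of packets lost to overflow

    def push(self, packet: bytes) -> None:
        if len(self._packets) == self._packets.maxlen:
            self.dropped += 1  # the oldest packet is about to be evicted
        self._packets.append(packet)

    def drain(self) -> list:
        """Hand all buffered packets to the analysis stage and empty the buffer."""
        packets = list(self._packets)
        self._packets.clear()
        return packets

buffer = CaptureBuffer(capacity=2)
for pkt in (b"a", b"b", b"c"):
    buffer.push(pkt)
batch = buffer.drain()
```

Tracking `dropped` makes buffer overflow visible, which helps when sizing the buffer for the expected traffic volume.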

For long-term storage of packets, IDS solutions may employ various
storage architectures, such as disk-based storage or network-attached
storage (NAS). The choice of storage architecture depends on factors
like the expected volume of captured packets, storage requirements,
performance considerations, and the organization's data retention
policies.

CHAPTER 5

DATA ANALYSIS

5.1. SIGNATURE-BASED DETECTION

Signature-based detection is a detection method that
matches network traffic against known attack signatures. Attack
signatures are patterns of network traffic that are associated with
specific attacks. Signature-based detection can be effective against
known attacks, but it may not detect unknown attacks or attacks that
have been modified to evade detection.

The signature database contains specific patterns or characteristics
associated with known attacks, such as network packet sequences, file
hashes, or behavior patterns. These signatures are typically derived
from analyzing past attacks and identifying their unique attributes.
When the IDS encounters network traffic or system behavior that
matches a signature in its database, it generates an alert to notify system
administrators or security personnel about the potential intrusion.
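
Signature matching against packet payloads can be sketched as a byte-pattern search. The two entries in `SIGNATURES` are hypothetical examples, not rules from any real signature database, and production systems use far richer rule languages.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Signature:
    name: str
    pattern: bytes  # byte sequence associated with the attack

# Hypothetical signatures for illustration only.
SIGNATURES = [
    Signature("path-traversal", b"../../"),
    Signature("sql-injection", b"' OR '1'='1"),
]

def match_signatures(payload: bytes, signatures=SIGNATURES) -> list:
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [sig.name for sig in signatures if sig.pattern in payload]

alerts = match_signatures(b"GET /../../etc/passwd HTTP/1.1")
```

A payload that matches no pattern yields an empty list, which is also what happens for an attack that has no signature yet.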

Signature-based IDS is highly effective at detecting known attacks and
malware that have been previously identified and analyzed. It can
quickly recognize and respond to well-known threats by comparing the
observed patterns with the signatures in its database. This approach is
particularly useful for detecting common and widely used attack
methods.

However, signature-based IDS has limitations. It relies on the
availability of up-to-date signatures, so it may miss new or zero-day
attacks that have not yet been added to the database. Additionally,
attackers can attempt to evade detection by modifying their attacks
slightly or using encryption to obfuscate their activities, making them
harder to match with existing signatures.

5.2. ANOMALY-BASED DETECTION

Anomaly-based detection is a detection method that detects
abnormal behavior in network traffic that deviates from normal
patterns. Anomaly-based detection can be effective against unknown
attacks and attacks that have been modified to evade detection.
Anomaly-based detection requires a baseline of normal network
behavior to compare against.

Anomaly-based IDS analyzes and profiles network traffic, system logs,
or other monitored activities over a period of time to learn what
constitutes normal behavior. It takes into account factors such as
network protocols, traffic volumes, communication patterns, resource
usage, and user behavior. By establishing a baseline of normal activity,
it can identify anomalies or deviations that may indicate malicious
activities or security breaches.

When an anomaly is detected, the IDS generates an alert to notify
system administrators or security personnel. The alerts can indicate
potential security threats, such as network scanning, unauthorized
access attempts, unusual data transfers, or abnormal system activities.
The IDS may also use statistical analysis, machine learning, or artificial
intelligence techniques to improve its ability to distinguish between
normal and abnormal behavior.
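As a rough illustration of baselining and anomaly alerts, the sketch below uses a simple z-score test on a single metric. The choice of metric (requests per minute), the sample values, and the threshold of three standard deviations are assumptions made for the example; practical systems profile many metrics at once.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarise normal behaviour as the mean and standard deviation of a metric."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly constant baseline: any different value is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical requests-per-minute counts observed during normal operation.
normal_traffic = [98, 102, 101, 97, 100, 103, 99, 100]
baseline = build_baseline(normal_traffic)
```

With this baseline, `is_anomalous(104, baseline)` is not flagged while `is_anomalous(250, baseline)` is. Lowering the threshold catches more attacks but also raises more false positives.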

Anomaly-based IDS is particularly effective at detecting unknown or
novel attacks, including zero-day exploits or attacks that have not yet
been identified and characterized by signature-based IDS. It can adapt
to new attack vectors and emerging threats by focusing on deviations
from the established baseline. However, it may also generate false
positives if there are significant variations in legitimate activities or if
the baseline is not properly calibrated.

5.3. NETWORK BEHAVIOUR ANALYSIS

A network behaviour analysis (NBA) IDS collects and analyzes network traffic data,
looking for deviations from expected or normal behavior. It examines
various network parameters, including packet sizes, traffic volumes,
communication protocols, connection patterns, and data transfer rates.
By establishing a baseline of normal network behavior, the NBA IDS
can identify anomalies that may signify security breaches, network
attacks, or unauthorized activities.

Unlike signature-based IDS that relies on predefined attack signatures
or anomaly-based IDS that focuses on deviations from overall system
behavior, NBA IDS specifically looks for anomalies in network-level
behavior. It identifies patterns such as port scanning, network
reconnaissance, unusual traffic flows, unauthorized connections, or
other suspicious network activities that may indicate malicious intent.

When an anomaly is detected, the NBA IDS generates alerts or triggers
automated responses to notify network administrators or security
teams. These alerts can help initiate further investigation, incident
response, or mitigation measures to address potential security threats
promptly.
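One network-level pattern mentioned above, port scanning, can be sketched as counting how many distinct targets each source contacts within a time window. The function name `detect_port_scans`, the tuple layout of the flow records, and the limit of ten targets are assumptions for illustration only.

```python
from collections import defaultdict


def detect_port_scans(connections, max_targets=10):
    """Return source IPs that contacted more than `max_targets` distinct targets.

    `connections` is an iterable of (source_ip, destination_ip, destination_port)
    tuples, for example flow records collected within one time window.
    A target is a distinct (destination_ip, destination_port) pair.
    """
    targets_by_source = defaultdict(set)
    for source_ip, destination_ip, destination_port in connections:
        targets_by_source[source_ip].add((destination_ip, destination_port))
    return sorted(
        source
        for source, targets in targets_by_source.items()
        if len(targets) > max_targets
    )
```

Repeated connections to the same service count once, so a busy but legitimate client is not flagged merely for sending a lot of traffic.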

5.4. MACHINE LEARNING IN IDS

Machine learning can be used to enhance the
detection capabilities of an IDS. Machine learning algorithms can learn
patterns of normal network behavior and detect deviations from these
patterns. Machine learning algorithms can also learn new attack
signatures and adapt to new attack vectors.

In the training phase, the IDS uses a labeled dataset that contains
examples of both normal and malicious network traffic. The labeled
data is used to train machine learning models to recognize patterns and
characteristics associated with different types of attacks or anomalies.
Various supervised learning algorithms, such as decision trees, support
vector machines (SVM), or neural networks, are commonly employed
for this purpose.

Machine learning in IDS involves extracting relevant features from
network traffic data to represent the behavior or characteristics of the
traffic. Features can include attributes such as source and destination IP
addresses, port numbers, packet sizes, protocols, or statistical measures
derived from the network traffic. Proper feature selection or
engineering is crucial to ensure the machine learning model captures
the most discriminative and informative aspects of the data.
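Feature extraction of the kind described above might look like the following sketch. The packet is assumed to be a dictionary with the keys `src_ip`, `dst_port`, `protocol`, and `size`; these field names and the size buckets are hypothetical choices, not a fixed format.

```python
import ipaddress


def extract_features(packet):
    """Map one packet record (a dict) to a flat dictionary of categorical features."""
    size = packet["size"]
    return {
        "protocol": packet["protocol"].lower(),
        "dst_port": packet["dst_port"],
        # Ports below 1024 are the well-known service ports.
        "well_known_port": packet["dst_port"] < 1024,
        "src_is_private": ipaddress.ip_address(packet["src_ip"]).is_private,
        # Bucket the numeric size so that categorical learners can use it.
        "size_bucket": "small" if size < 128 else "medium" if size < 1024 else "large",
    }
```

Derived features such as `src_is_private` usually generalize better than raw IP addresses, which a model can simply memorize.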

The selected machine learning algorithm is trained using the labeled
dataset and the extracted features. The training process involves
adjusting the model's parameters to minimize errors and maximize its
ability to distinguish between normal and malicious network behavior.
Techniques such as cross-validation and hyperparameter tuning are
employed to evaluate and fine-tune the model's performance.
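Cross-validation can be sketched in plain Python as below. This is a simplified version under stated assumptions: folds are contiguous and unshuffled, and the `fit` and `predict` callables are placeholders for whichever learning algorithm is being evaluated.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_validate(samples, labels, fit, predict, k=5):
    """Average accuracy of a model over k folds.

    `fit(train_x, train_y)` returns a model; `predict(model, x)` returns a label.
    """
    scores = []
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)
```

Because the folds are not shuffled, data ordered by class should be shuffled first; otherwise a fold may contain only one class.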

CHAPTER 6

ALERT GENERATION

6.1. ALERT CRITERIA

Alert criteria are the conditions or rules that an IDS uses
to identify and classify events as potential security threats. These
criteria can be based on various factors, such as signatures of known
attacks, abnormal traffic patterns, user behavior anomalies, and other
indicators of compromise. Alert criteria can be predefined by security
experts or customized based on the organization's specific security
policies and requirements.

Signature-Based Alerts: In signature-based IDS, alerts are generated
based on predefined attack signatures or patterns. The IDS compares
the characteristics of the captured network traffic against a database of
known attack signatures. If a match is found, indicating a potential
attack, an alert is triggered. The alert criteria in this case are the specific
signatures or patterns associated with known attacks.

Anomaly-Based Alerts: In anomaly-based IDS, alerts are generated
when network traffic deviates from the expected or normal behavior.
The IDS establishes a baseline of normal behavior by analyzing
historical or representative network traffic data. Any significant
deviation or anomaly from this baseline triggers an alert. The alert
criteria here are the statistical or behavioral anomalies that exceed
predefined thresholds.

IDS may assign severity levels to alerts based on the perceived impact
or risk associated with the detected activity. The severity levels help
prioritize alerts and allocate resources effectively. For example, high-
severity alerts may indicate critical threats that require immediate
attention, while low-severity alerts may represent less significant
events or potential false positives.

6.2. ALERT PRIORITIZATION

Alert prioritization is the process of assigning a level of
severity or priority to the generated alerts based on their potential
impact on the organization's security. This prioritization helps security
analysts to focus their attention on the most critical alerts first. The
severity level can be based on various factors, such as the type and
sophistication of the attack, the affected assets' criticality, and the
potential damage to the organization's reputation and finances.

Severity levels indicate the potential harm or damage that can result
from an identified threat. Common severity levels include critical,
high, medium, and low. Assigning them helps prioritize alerts by
criticality and allocate resources accordingly.

Alert prioritization involves conducting an impact analysis to assess
the potential consequences of a security incident associated with each
alert. The impact analysis considers factors such as the affected
systems, data sensitivity, potential business disruption, regulatory
compliance implications, or the potential for data loss or unauthorized
access. Alerts with higher potential impact are prioritized for
immediate attention.

IDS can leverage threat intelligence feeds or external sources of
information to prioritize alerts. Threat intelligence provides insights
into the current threat landscape, including known malicious indicators,
attack campaigns, or emerging vulnerabilities. Alerts associated with
known threat actors, sophisticated attack techniques, or active
campaigns may be given higher priority.
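The prioritization factors in this section (severity, asset criticality, and threat intelligence) could be combined into a single score, as in the sketch below. The weights, the 1 to 5 criticality scale, the fixed boost of 5, and the alert field names are all assumptions chosen for illustration.

```python
# Hypothetical numeric weights for the four common severity levels.
SEVERITY_WEIGHT = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def priority_score(alert, known_bad_ips=frozenset()):
    """Combine severity, asset criticality (1-5), and threat intelligence."""
    score = SEVERITY_WEIGHT[alert["severity"]] * alert["asset_criticality"]
    if alert["source_ip"] in known_bad_ips:
        score += 5  # boost alerts that match a threat-intelligence indicator
    return score


def prioritize(alerts, known_bad_ips=frozenset()):
    """Return alerts ordered from highest to lowest priority."""
    return sorted(
        alerts, key=lambda alert: priority_score(alert, known_bad_ips), reverse=True
    )
```

Multiplying severity by asset criticality means that a medium-severity alert on a critical server can outrank a high-severity alert on a test machine.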

6.3. ALERT NOTIFICATION

Alert notification is the process of informing security
personnel about the generated alerts through various communication
channels, such as email, SMS, phone calls, or dashboard displays. The
notification can include information such as the alert severity, the
affected asset, the attack type, and recommended actions to mitigate the
threat. Timely and effective alert notification is critical for incident
response and reducing the impact of security breaches.

IDS utilizes various notification channels to deliver alerts to the
appropriate recipients. Common notification channels include email,
SMS, instant messaging platforms, ticketing systems, or dedicated
security incident management platforms. The choice of channels
depends on the organization's communication infrastructure and the
preferences of the security team.

The IDS identifies the relevant recipients who need to be notified about
the alert. This typically includes security analysts, incident response
teams, network administrators, or other designated personnel
responsible for handling security incidents. The recipients may be
determined based on predefined escalation procedures, roles and
responsibilities, or a specific incident response plan.
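Channel selection and message content can be sketched as a severity-based routing table. The table `ROUTES`, the channel names, and the alert fields used here are hypothetical; actual delivery (sending the email or SMS) is deliberately left out.

```python
# Hypothetical routing table: which channels receive alerts of each severity.
ROUTES = {
    "critical": ["sms", "email", "ticket"],
    "high": ["email", "ticket"],
    "medium": ["ticket"],
    "low": ["dashboard"],
}


def build_notifications(alert):
    """Return one (channel, message) pair per channel configured for the alert."""
    message = (
        f"[{alert['severity'].upper()}] {alert['attack_type']} "
        f"on {alert['asset']}: {alert['recommended_action']}"
    )
    return [(channel, message) for channel in ROUTES[alert["severity"]]]
```

Keeping routing in a table rather than in code lets the security team adjust escalation procedures without touching the notification logic.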

CHAPTER 7

REPORTING

7.1. REPORTING REQUIREMENTS

Reporting requirements in an Intrusion Detection System
(IDS) refer to the need to generate comprehensive and informative
reports on the system's performance, detected threats, and security
incidents. These reports serve multiple purposes, including compliance
auditing, incident analysis, performance monitoring, and
communication with stakeholders. Here is a brief explanation of
reporting requirements in IDS:

IDS may need to generate reports that demonstrate compliance with
specific regulations, industry standards, or internal security policies.
These reports provide evidence of the IDS's effectiveness in detecting
and mitigating security threats and can be used during audits or
assessments. Compliance reports may include details such as the
number of alerts, incident response times, system availability, and
adherence to specific security controls.

IDS reports play a crucial role in analyzing security incidents and
investigating potential threats. These reports provide in-depth
information about the detected events, including attack vectors,
affected systems, timestamps, and other contextual details. Incident
reports may include forensic data, packet captures, logs, and analysis
of the attack or anomaly. They help security teams understand the
nature of the incidents, identify trends or patterns, and formulate
effective response strategies.

7.2. REPORT GENERATION

Report generation in an Intrusion Detection
System (IDS) involves the process of creating comprehensive and
informative reports based on the data collected and analyzed by the
IDS. These reports provide valuable insights into the security events,
threats, and system performance, enabling security teams,
management, and stakeholders to make informed decisions. Here is a
brief explanation of report generation in IDS:

The first step in report generation is the collection of relevant data from
the IDS. This data can include information such as network traffic logs,
alerts, event logs, system performance metrics, and contextual details
related to detected incidents. The IDS gathers this data from various
sources, including packet captures, log files, sensor outputs, and other
monitoring mechanisms.

The collected data is then aggregated and analyzed to derive
meaningful insights. The IDS processes the data using algorithms,
statistical methods, and correlation techniques to identify patterns,
anomalies, and potential security threats. This analysis helps identify
the most critical events, assess their impact, and determine the
significance of detected incidents.

IDS should provide flexible reporting capabilities that allow users to
design and customize reports according to their specific requirements.
Users can choose the content, format, and layout of the reports to align
with their needs and the intended audience. Customization options may
include selecting specific data fields, adding charts or graphs, applying
filters, and defining time ranges.
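The aggregation and time-range filtering steps can be sketched as a small summary function. The alert fields `timestamp`, `severity`, and `attack_type` and the shape of the returned summary are assumptions for this example.

```python
from collections import Counter
from datetime import datetime


def summarize_alerts(alerts, start: datetime, end: datetime):
    """Count alerts by severity and attack type within the range [start, end)."""
    selected = [alert for alert in alerts if start <= alert["timestamp"] < end]
    return {
        "total": len(selected),
        "by_severity": dict(Counter(alert["severity"] for alert in selected)),
        "by_attack_type": dict(Counter(alert["attack_type"] for alert in selected)),
    }
```

The resulting dictionary can be rendered as a table or chart, or serialized for a dashboard, depending on the audience of the report.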

7.3. REPORT ANALYSIS

Report analysis in an Intrusion Detection System (IDS)
involves the examination and interpretation of generated reports to gain
insights into security events, identify patterns or trends, assess the
effectiveness of security measures, and make informed decisions.
Report analysis plays a crucial role in incident response, threat
mitigation, compliance management, and overall security
enhancement. Here is a brief explanation of report analysis in IDS:

IDS reports provide valuable information about detected security
incidents, including the type of attack, affected systems, timestamps,
and other relevant details.
Report analysis allows security teams to investigate and understand the
nature and scope of each incident. By analyzing incident reports,
security analysts can identify the root causes, attack vectors, and
potential vulnerabilities that were exploited. This analysis helps in
formulating effective mitigation strategies, applying necessary patches
or configurations, and preventing similar incidents in the future.

IDS reports can be analyzed to identify emerging threats, attack trends,
or changes in the threat landscape. By analyzing reports over time,
security teams can detect patterns, observe the evolution of attack
techniques, and anticipate potential future threats. Trend analysis helps
in proactive threat detection and allows for adjustments in security
measures and countermeasures to align with the evolving threat
landscape. Reports can also be compared against external threat
intelligence sources.

CHAPTER 8

ID3 ALGORITHM

8.1. INTRODUCTION

The ID3 algorithm, short for Iterative Dichotomiser 3, is a
decision tree algorithm used for machine learning and data
classification tasks. It was developed by Ross Quinlan in 1986 and is
widely known for its simplicity and effectiveness in building decision
trees from labeled training data. The ID3 algorithm is specifically
designed for handling categorical data and is capable of handling both
binary and multi-class classification problems.

The ID3 algorithm operates in a top-down, recursive manner to
construct decision trees. It uses a statistical measure called information
gain to determine the best attribute to split the data at each step.
Information gain quantifies the reduction in uncertainty achieved by
partitioning the data based on a particular attribute. The attribute with
the highest information gain is chosen as the splitting criterion at each
node of the decision tree.
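A small worked example may make information gain concrete. For a set S with class proportions p_i, the entropy is H(S) = -sum(p_i * log2(p_i)), and the gain of an attribute is H(S) minus the weighted entropy of the subsets produced by splitting on it. The four-row dataset below is invented purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of `labels` minus the weighted entropy after splitting on `values`."""
    total = len(labels)
    remainder = 0.0
    for value in set(values):
        subset = [label for v, label in zip(values, labels) if v == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


labels = ["normal", "normal", "attack", "attack"]
protocol = ["tcp", "tcp", "udp", "udp"]  # separates the two classes perfectly
flag = ["syn", "ack", "syn", "ack"]      # carries no information about the class
```

Here the labels have an entropy of 1 bit. Splitting on `protocol` yields two pure subsets and a gain of 1 bit, while splitting on `flag` leaves both subsets evenly mixed and a gain of 0, so ID3 would choose `protocol` as the splitting attribute.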

The ID3 algorithm continues to recursively split the data based on the
selected attribute until all instances in a subset belong to the same class
or when there are no more attributes available for splitting. In the
resulting decision tree, each internal node represents a decision based
on a specific attribute, and each leaf node represents a class label or a
prediction.

8.2. HISTORY OF ID3

The development of ID3 was influenced by earlier work
in the field of decision tree and rule learning, including the CART
(Classification and Regression Trees) algorithm by Leo Breiman,
Jerome Friedman, Richard Olshen, and Charles Stone, and the AQ
rule-learning algorithm by Ryszard S. Michalski.
Quinlan's objective was to create a simpler and more efficient decision
tree algorithm that could handle categorical data effectively.

ID3 was particularly well-suited for handling categorical features and
became widely recognized for its simplicity and effectiveness in
building decision trees. The algorithm utilized a statistical measure
known as information gain to determine the best attribute for splitting
the data at each step. Information gain quantified the reduction in
uncertainty achieved by partitioning the data based on a specific
attribute, enabling the algorithm to select the attribute that provided the
most valuable information for classification.

8.3. ID3 IN IDS

The ID3 algorithm, as described in the previous sections, is primarily
used for machine learning and data classification tasks. However, it can
also be applied in the context of an Intrusion Detection System (IDS)
for certain purposes. Here is a brief explanation of how the ID3
algorithm can be used in IDS:

An IDS needs to analyze network traffic and identify suspicious or
malicious activities. The ID3 algorithm can be utilized to select
relevant features or attributes that contribute to the detection of
intrusions. These features can include network protocols, source and
destination IP addresses, port numbers, packet size, and other
characteristics of network packets.

To apply the ID3 algorithm in an IDS, a labeled training dataset needs
to be prepared. This dataset consists of historical network traffic
instances or packets, each labeled as either normal or malicious. The
features extracted from the packets serve as input variables, and the
labels indicate the corresponding class (normal or malicious).

The ID3 algorithm is employed to construct a decision tree based on
the labeled training dataset. The algorithm recursively selects the best
attribute to split the data, aiming to maximize the information gain at
each node. The decision tree represents a set of rules that help classify
new instances of network traffic as normal or malicious based on their
attribute values.

8.4. ID3 ALGORITHM

import numpy as np


class Node:
    """Decision-tree node: internal nodes test an attribute, leaves hold a label."""

    def __init__(self, attribute=None, label=None):
        self.attribute = attribute  # attribute tested here (None for a leaf)
        self.label = label          # class label (None for an internal node)
        self.children = {}          # attribute value -> child Node


def entropy(labels):
    """Shannon entropy, in bits, of a 1-D NumPy array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    probabilities = counts / len(labels)
    return -np.sum(probabilities * np.log2(probabilities))


def information_gain(data, attribute_index, labels):
    """Reduction in label entropy obtained by splitting on one column of data."""
    total_entropy = entropy(labels)
    column = data[:, attribute_index]
    attribute_values, counts = np.unique(column, return_counts=True)
    weights = counts / np.sum(counts)
    attribute_entropy = np.sum(
        [w * entropy(labels[column == v]) for v, w in zip(attribute_values, weights)]
    )
    return total_entropy - attribute_entropy


def majority_label(labels):
    """Most frequent class label."""
    unique_labels, counts = np.unique(labels, return_counts=True)
    return unique_labels[np.argmax(counts)]


def id3(data, attributes, labels):
    """Build a decision tree. attributes[i] names column i of the 2-D array data;
    labels is a NumPy array with one class label per row."""
    # Create a root node
    root = Node()

    # If all instances have the same class label, return a leaf node
    if len(np.unique(labels)) == 1:
        root.label = labels[0]
        return root

    # If there are no attributes left, return a leaf node with the majority class label
    if len(attributes) == 0:
        root.label = majority_label(labels)
        return root

    # Find the attribute with the highest information gain
    gains = [information_gain(data, i, labels) for i in range(data.shape[1])]
    best_attribute_index = int(np.argmax(gains))

    # Set the attribute of the current node
    root.attribute = attributes[best_attribute_index]

    # Create a child node for each observed value of the chosen attribute
    best_column = data[:, best_attribute_index]
    remaining_attributes = np.delete(attributes, best_attribute_index)
    for value in np.unique(best_column):
        mask = best_column == value
        # Drop the used column so the data stays aligned with remaining_attributes
        subset_data = np.delete(data[mask], best_attribute_index, axis=1)
        subset_labels = labels[mask]
        root.children[value] = id3(subset_data, remaining_attributes, subset_labels)

    return root

8.5. ID3 IN IDS PROGRAM

import numpy as np


class Node:
    """Decision-tree node: internal nodes test an attribute, leaves hold a label."""

    def __init__(self, attribute=None, label=None):
        self.attribute = attribute
        self.label = label
        self.children = {}


def entropy(labels):
    """Shannon entropy, in bits, of a 1-D NumPy array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    probabilities = counts / len(labels)
    return -np.sum(probabilities * np.log2(probabilities))


def information_gain(data, attribute_index, labels):
    """Reduction in label entropy obtained by splitting on one column of data."""
    total_entropy = entropy(labels)
    column = data[:, attribute_index]
    attribute_values, counts = np.unique(column, return_counts=True)
    weights = counts / np.sum(counts)
    attribute_entropy = np.sum(
        [w * entropy(labels[column == v]) for v, w in zip(attribute_values, weights)]
    )
    return total_entropy - attribute_entropy


def majority_label(labels):
    """Most frequent class label."""
    unique_labels, counts = np.unique(labels, return_counts=True)
    return unique_labels[np.argmax(counts)]


def id3(data, attributes, labels):
    """Build a decision tree. attributes[i] names column i of the 2-D array data."""
    # Create a root node
    root = Node()

    # If all instances have the same class label, return a leaf node
    if len(np.unique(labels)) == 1:
        root.label = labels[0]
        return root

    # If there are no attributes left, return a leaf node with the majority class label
    if len(attributes) == 0:
        root.label = majority_label(labels)
        return root

    # Find the attribute with the highest information gain (as in Section 8.4)
    gains = [information_gain(data, i, labels) for i in range(data.shape[1])]
    best_attribute_index = int(np.argmax(gains))
    root.attribute = attributes[best_attribute_index]

    # Create a child node for each observed value of the chosen attribute
    best_column = data[:, best_attribute_index]
    remaining_attributes = np.delete(attributes, best_attribute_index)
    for value in np.unique(best_column):
        mask = best_column == value
        # Drop the used column so the data stays aligned with remaining_attributes
        subset_data = np.delete(data[mask], best_attribute_index, axis=1)
        subset_labels = labels[mask]
        root.children[value] = id3(subset_data, remaining_attributes, subset_labels)

    return root


def predict(tree, instance):
    """Classify one instance by walking the tree from the root to a leaf.

    Only the loop body of this function survives in the original listing;
    the name, the signature, and the loop condition are assumed. `instance`
    is taken to be a dict mapping attribute names to values.
    """
    while tree.label is None:
        value = instance[tree.attribute]
        if value not in tree.children:
            return "Unknown"  # Return "Unknown" for unobserved attribute values
        tree = tree.children[value]
    return tree.label

8.7. ADVANTAGES OF ID3


The ID3 (Iterative Dichotomiser 3) algorithm offers
several advantages that contribute to its popularity and applicability in
various domains.

Simplicity and Ease of Implementation: ID3 has a straightforward and
intuitive algorithmic approach. It is relatively easy to understand and
implement compared to more complex machine learning algorithms.
This simplicity makes it accessible to individuals with limited machine
learning experience and allows for quick experimentation and
prototyping.

8.8. DISADVANTAGES OF ID3

While ID3 (Iterative Dichotomiser 3) offers several
advantages, it also has certain limitations and disadvantages that need
to be considered. Here are some key disadvantages of ID3:

ID3 is designed to handle categorical attributes and is not well-suited
for datasets that contain continuous or numeric attributes. It requires
discrete values for attribute splitting, which means continuous
attributes need to be discretized or transformed into discrete bins before
using ID3.

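Discretizing a continuous attribute before applying ID3 can be as simple as mapping each value to a named bin. The bin boundaries and names below are hypothetical; in practice they would be chosen from the data, for example by equal-width or equal-frequency binning.

```python
from bisect import bisect_right


def discretize(value, edges, names):
    """Map a numeric value to the name of the bin it falls into.

    `edges` are the sorted boundaries between bins, so len(names) must equal
    len(edges) + 1. A value equal to a boundary goes into the upper bin.
    """
    return names[bisect_right(edges, value)]


# Hypothetical packet-size bins, in bytes.
SIZE_EDGES = [128, 1024]
SIZE_NAMES = ["small", "medium", "large"]
```

For instance, `discretize(64, SIZE_EDGES, SIZE_NAMES)` falls into the first bin and `discretize(1500, SIZE_EDGES, SIZE_NAMES)` into the last, so the resulting column is categorical and can be used as an ID3 attribute.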
ID3 has a tendency to create decision trees that are overly complex and
specific to the training data, leading to overfitting. Overfitting occurs
when the decision tree captures noise or random variations in the
training data, reducing its ability to generalize to unseen data. Proper
techniques such as pruning or using other tree algorithms like C4.5 can
be employed to mitigate this issue.

CHAPTER 9

TESTING AND VALIDATION

9.1. TEST PLAN

The test plan begins by defining the objectives of the
testing process. This includes specifying the desired outcomes, such as
evaluating the detection capabilities, assessing the accuracy of alerts,
or measuring the system's response time.

The test plan outlines the methodologies and techniques to be employed
during testing. This may involve using various types of test data, such
as synthetic attack scenarios, known intrusion patterns, or real-world
network traffic captures. The methodologies also specify the tools,
software, and hardware resources required for the tests.

The test procedures describe the step-by-step instructions for executing
the test plan. This includes configuring the IDS with the necessary
settings, deploying the test data or attack scenarios, and monitoring the
system's response. The procedures should be clear, well-documented,
and reproducible to ensure consistency across multiple test runs.

The test plan defines the metrics and criteria for evaluating the
performance of the IDS. This may involve measuring the detection rate,
false positive rate, response time, resource utilization, or other relevant
factors. The plan specifies how the results will be collected, analyzed,
and compared against the predefined thresholds or benchmarks.
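Two of the metrics named above, detection rate and false positive rate, can be computed from a labeled test run as sketched below. The label strings `"malicious"` and `"normal"` are assumptions; any two-class labeling works with the `positive` parameter.

```python
def detection_metrics(true_labels, predicted_labels, positive="malicious"):
    """Detection rate (recall) and false positive rate for a labeled test run."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)  # attacks caught
    fn = sum(t == positive and p != positive for t, p in pairs)  # attacks missed
    fp = sum(t != positive and p == positive for t, p in pairs)  # false alarms
    tn = sum(t != positive and p != positive for t, p in pairs)  # correctly ignored
    return {
        "detection_rate": tp / (tp + fn) if tp + fn else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }
```

The two rates should be reported together, since a detector that flags everything reaches a perfect detection rate at the cost of a false positive rate of 1.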

9.2. TEST CASE

The test case begins with a clear statement of the
objective or goal of the test. This identifies what is being tested, such
as a specific type of intrusion, detection rule, or response mechanism.

The test case specifies the input data or conditions required to execute
the test. This can include network traffic captures, log files, attack
patterns, or any other relevant information necessary to simulate the
desired test scenario.

The test case outlines the step-by-step instructions for executing the
test. It includes the actions or operations to be performed, such as
configuring the IDS, deploying the test data or attack scenario, and
monitoring the system's response.

The test case defines the expected outcomes or behavior of the IDS for
the given test scenario. This includes the expected detection of
intrusions, generation of accurate alerts, appropriate responses or
countermeasures, and any specific criteria for evaluating the results.
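The elements of a test case listed above (objective, input, procedure, and expected outcome) can be captured in a small data structure. The class `IDSTestCase`, its fields, and the idea of a detector as a function from payload to boolean are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class IDSTestCase:
    objective: str         # what is being tested
    input_payloads: list   # test data fed to the detector
    expected_alerts: int   # number of alerts the IDS should raise


def run_test_case(case: IDSTestCase, detector: Callable[[bytes], bool]) -> bool:
    """Pass when the detector raises exactly the expected number of alerts."""
    raised = sum(1 for payload in case.input_payloads if detector(payload))
    return raised == case.expected_alerts
```

Mixing benign and malicious payloads in `input_payloads` lets a single test case check both that attacks are detected and that normal traffic raises no alert.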

9.3. VALIDATION

The validation process begins by defining the
objectives and goals of the validation. This includes specifying the
desired outcomes, such as evaluating the detection capabilities,
assessing the accuracy of alerts, measuring the system's response time,
or validating compliance with regulatory standards.

To validate the IDS, appropriate test data is prepared. This may involve
using real-world network traffic captures, synthetic attack scenarios, or
known intrusion patterns. The test data should represent a wide range
of potential threats and include both normal and malicious activities.

CHAPTER 10

CONCLUSION

In conclusion, the development of an intrusion detection system is
an important aspect of maintaining the security of computer networks
and systems.

This project report has outlined the design, implementation, and testing
of an intrusion detection system that aims to detect and prevent various
types of attacks.

Chapter 1 provided an overview of the project, including the objectives
and scope of the system.

Chapter 2 discussed the literature review, which provided a
background on intrusion detection systems and the different types of
attacks that can be prevented using these systems.

Chapter 3 presented the methodology used in the development of the
system, including the requirements gathering process, the system
architecture, and the implementation details.

Chapter 4 presented the details of the system design, including the
components of the system and the algorithms used to detect and prevent
attacks.

Chapter 5 described the analysis methods used by the system:
signature-based detection, anomaly-based detection, network behaviour
analysis, and machine learning.

Chapter 6 covered alert generation, including alert criteria, alert
prioritization, and alert notification.

Chapter 7 covered the reporting aspect of the system, including the
reporting requirements, report generation, and report analysis.

Chapter 8 gave an overview of the ID3 algorithm and its implementation.

Chapter 9 presented the testing plan, test cases, and validation of the
system.

Overall, the development and implementation of an intrusion detection
system is a complex process that requires a thorough understanding of
security threats and the tools and technologies needed to prevent them.

The intrusion detection system developed in this project
report has the potential to provide an effective solution to the security
challenges faced by computer networks and systems.

With ongoing development and refinement, this system could be an
important tool in maintaining the security of digital assets.

CHAPTER 11

REFERENCES

Bishop, M. (2003). Computer Security: Art and Science. Addison-Wesley.

Chen, T., Zhou, S., & Chen, Y. (2019). A Survey on Intrusion Detection
Systems: Recent Advances and Future Trends. IEEE Access, 7, 66380-
66398.

Duan, K., Gao, L., Chen, Y., Zhang, J., & Hu, Y. (2019). A survey on
deep learning for network intrusion detection. Neurocomputing, 335,
27-38.
Suthaharan, S. (2014). Machine learning models and algorithms for big
data classification: Thinking with examples for effective learning.
Morgan Kaufmann.

Zhan, Y., & Zhang, Y. (2019). Deep learning for network intrusion
detection: A survey. IEEE Communications Surveys & Tutorials, 21(4),
2860-2881.

Ye, Y., Li, Y., Liu, Y., & Zhou, W. (2020). An intrusion detection system
based on extreme learning machine. Journal of Ambient Intelligence
and Humanized Computing, 11(3), 1003-1015.
