
Group 7


PROGRAMME: BSc.

CYS III

MODULE NAME: DATA STRUCTURE AND ALGORITHMS

MODULE CODE: CYU 08104

FACILITATOR: MR. KABEYA ALMASI

NATURE OF ASSIGNMENT: GROUP ASSIGNMENT

ACADEMIC YEAR: 2024/2025

GROUP MEMBERS
S/N NAMES REGISTRATION NUMBER
1 KELVIN ALEX PETER BCSe-01-0204-2022
2 MICHAEL ISSA AMANZI BCSe-01-0016-2022
3 ALEN DAUSON BCSe-01-0174-2022
4 FAUSTIN MASAGA BCSe-01-0098-2022
5 GILBERT MBUJI BCSe-01-0165-2022
6 ELLY MATHIAS BCSe-01-0058-2022
7 THADEO THOBIAS BCSe-01-0049-2022
8 BENARD KAKIZIBA BCSe-01-0158-2022
9 IAN LESIKA BCSe-01-0232-2022
10 DIANA ISACK BCSe-01-0024-2022
11 ESTHER MWAIBINDE BCSe-01-0031-2022
12 MAGDALENA LYANGA BCSe-01-0125-2022
13 MINZA SILVANO BCSe-01-0177-2022
14 MARIA NASORO BCSe-01-0176-2022
15 OLIVER OBED BCSe-01-0104-2022
QUESTIONS
Binary Tree Applications in Cybersecurity
1. Describe the Binary Tree Structure and Algorithm:
 Explain the properties of a Binary Tree and how search operations are performed
within it.
 Provide an example of a Binary Search Tree (BST) and outline the process of
inserting, deleting, and searching for nodes.
 Explain tree traversal methods

2. Time Complexity Analysis:


 Analyze the average-case and worst-case time complexity of searching in a
Binary Search Tree.
 Discuss how balanced and unbalanced trees affect search efficiency, particularly
in large datasets.

3. Cybersecurity Application:
 Explain how a Binary Search Tree could be applied in a cybersecurity context,
such as maintaining a list of known malicious IP addresses.
 Describe how the BST structure can help speed up search operations compared to
a linear search, especially in high-frequency scenarios like real-time threat
detection.

4. Limitations and Improvements:


 Discuss potential limitations of Binary Trees, especially in the case of unbalanced
trees.
 Suggest alternative or enhanced data structures (e.g., AVL Tree, Red-Black Tree)
that maintain balanced properties and analyze their potential impact on search
efficiency in cybersecurity applications.
1. Describe the Binary Tree Structure and Algorithm
A Binary Tree Data Structure is a hierarchical data structure in which each node has at
most two children. It is commonly used in computer science for efficient storage and
retrieval of data, with various operations such as insertion, deletion and traversal.

Properties of Binary Tree Structure and Algorithms


i. Node Properties
 Each node can have at most two children: These are called the left child
and the right child.
 The number of children can range from 0 to 2
ii. Height and Depth
 Height of a tree: The length of the longest path from the root to a leaf.
 Depth of a node: The distance from the root to that node.

iii. Levels
 A binary tree grows in levels, starting at level 0 (the root).
 Level n can hold at most $2^n$ nodes.

iv. Number of nodes

 A binary tree of height h contains at most $2^{h+1} - 1$ nodes; a perfect binary tree
of height h contains exactly that many.

v. Traversal Order
 Nodes are accessed in a specific order using traversal methods such as in-
order, preorder, post-order or level-order.

vi. Balance
 A balanced binary tree ensures that the height difference between the left
and right subtrees of any node is at most 1. This keeps operations efficient.

vii. Special Binary Tree Types


 Full Binary Tree: Every node has either 0 or 2 children.
 Perfect Binary Tree: All internal nodes have two children, and all leaves
are at the same level.
 Complete Binary Tree: All levels except possibly the last are completely
filled, and nodes are added from left to right.

How search operations are performed within it


Search operations in binary tree involve traversing the tree to locate a node with a
specific value. The process depends on the type of binary tree:

a. In General Binary Tree


- A general binary tree does not impose any ordering on its nodes (unlike a
Binary Search Tree). To search for a value:
 Use a tree traversal method:
 In-order: Visit the left subtree, the root, then the right
subtree.
 Pre-order: Visit the root, the left subtree, then the right subtree.
 Post-order: Visit the left subtree, the right subtree, then the root.
 Level-order: Visit nodes level by level, starting from
the root.
 Compare the value of each visited node with the target value.
 Stop when the value is found or when all nodes are visited.

Time Complexity:
O(n) in the worst case, where n is the number of nodes, because
every node may need to be visited.
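
The traversal-based search described above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the `Node` class, the `contains` function and the sample tree are assumptions made for the example, and pre-order is chosen arbitrarily (any traversal order would do).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def contains(root: Optional[Node], target: int) -> bool:
    """Pre-order search of an unordered binary tree; may have to visit every node."""
    if root is None:
        return False                      # ran off the tree on this branch
    if root.value == target:
        return True                       # compare the visited node with the target
    # No ordering to exploit, so both subtrees may need to be searched.
    return contains(root.left, target) or contains(root.right, target)


# Hypothetical unordered tree, used only for illustration.
tree = Node(7, Node(42, Node(3)), Node(1, None, Node(19)))
print(contains(tree, 19))  # True
print(contains(tree, 8))   # False
```

Because the tree has no ordering, an unsuccessful search such as `contains(tree, 8)` visits all five nodes before giving up.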

b. In Binary Search Tree (BST)

- In a Binary Search Tree, the nodes are arranged such that, for every node:

 The left subtree contains only values smaller than that node's value.
 The right subtree contains only values larger than that node's value.
This property allows for more efficient searching.

Search Process in a BST

 Start at the root node.


 Compare the target value x with the current node's value:
o If x equals the current node’s value, the search is
successful.
o If x is smaller, move to the left subtree.
o If x is larger, move to the right subtree.

 Repeat this process until:


o x is found (successful search), or
o An empty (null) subtree is reached without finding x
(unsuccessful search).
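
As a rough sketch, the search process can be written iteratively in Python. The `Node` class and the `bst_search` function are illustrative names, and the small tree is an assumption chosen to be consistent with the worked examples that follow (the original figures are not reproduced here).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def bst_search(root: Optional[Node], x: int) -> tuple[bool, list[int]]:
    """Search a BST for x; return (found, values of the nodes visited)."""
    path = []
    node = root
    while node is not None:               # stop when an empty spot is reached
        path.append(node.value)
        if x == node.value:
            return True, path             # successful search
        # Smaller targets live in the left subtree, larger ones in the right.
        node = node.left if x < node.value else node.right
    return False, path                    # unsuccessful search


# Assumed example tree: 50 at the root, 30 and 70 below it, then 20 and 60.
tree = Node(50, Node(30, Node(20)), Node(70, Node(60)))
print(bst_search(tree, 60))  # (True, [50, 70, 60])
print(bst_search(tree, 25))  # (False, [50, 30, 20])
```

Each loop iteration discards one whole subtree, which is why the number of nodes visited is bounded by the height of the tree rather than by its size.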

Example: Binary Search Tree (BST)


Search operation in a BST

Goal: Let’s search for the number 60 in this tree

Step 1: Start at the root node

 The root node is 50


 Compare 60 with 50:
o Is 60 = 50? No.
o Is 60 > 50? Yes. Move to the right subtree of 50, where the larger numbers are
stored.

Step 2: Move to the right subtree:

o Now you are at node 70.


o Compare 60 with 70:
o Is 60 = 70? No.
o Is 60 < 70? Yes. Move to the left subtree of 70, where smaller numbers are stored.

Step 3: Move to the left subtree:

 Now you are at node 60.


 Compare 60 with 60:
o Is 60 = 60? Yes. You found the value.

Summary of the example

 You visited three nodes, in this order: 50, 70 and 60.


 Because the tree is organized (smaller values go left, larger values go right), you don’t
need to check all the nodes. This is why BST search is efficient.

Example 2: Consider the following example


Let’s search for 25:

Step 1: Start at the root (50).

 Is 25 < 50? Yes. Move to the left subtree.

Step 2. Move to the node 30.

 Is 25 < 30? Yes. Move to the left subtree.

Step 3: Move to node 20.

 Is 25 < 20? No. Move to the right subtree.

Step 4: The right child of 20 is empty (null). That means 25 is not in the tree.

Note: The search ends when you reach an empty spot in the tree.

Provide an example of a Binary Search tree (BST) and outline the process of inserting, deleting
and searching for nodes.

Example of creating a binary search tree

Suppose the data elements are 45, 15, 79, 90, 10, 55, 12, 20, 50

Step-by-Step: Inserting nodes into a BST

Step 1: Insert 45.

 The tree is empty, so 45 becomes the root.

Step 2: Insert 15.

 15 is smaller than 45, so it becomes the left child of 45.

Step 3: Insert 79.

 79 is greater than 45, so it becomes the right child of 45.

Step 4: Insert 90.

 90 is greater than 45 and 79, so it becomes the right child of 79.

Step 5: Insert 10.

 10 is smaller than 45 and 15, so it becomes the left child of 15.

Step 6: Insert 55.

 55 is greater than 45 and smaller than 79, so it becomes the left child of 79.

Step 7: Insert 12.

 12 is smaller than 45 and 15 but greater than 10, so it becomes the right child of 10.

Step 8: Insert 20.

 20 is smaller than 45 but greater than 15, so it becomes the right child of 15.

Step 9: Insert 50.

 50 is greater than 45 but smaller than 79 and 55, so it becomes the left child of 55.
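
The nine insertion steps can be reproduced with a short recursive sketch. The `Node` class and `insert` function are illustrative names, and ignoring duplicate keys is an assumption, since the steps above do not cover duplicates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], value: int) -> Node:
    """Insert value into the BST rooted at root and return the (possibly new) root."""
    if root is None:
        return Node(value)                # empty spot found: the new key becomes a leaf
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    # Equal keys are ignored (an assumption made for this sketch).
    return root


root = None
for key in [45, 15, 79, 90, 10, 55, 12, 20, 50]:
    root = insert(root, key)

print(root.value, root.left.value, root.right.value)  # 45 15 79
print(root.right.left.left.value)                      # 50 (left child of 55)
```

The order of insertion determines the shape: the same nine keys inserted in sorted order would produce a single chain instead of this tree.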
Searching for a node in a BST

Let’s say we want to search for node 55.

Step 1: start at the root 45

 Is 55 equal to 45? No
 Is 55 greater than 45? Yes, so move to the right subtree.

Step 2: move to node 79

 Is 55 equal to 79? No
 Is 55 less than 79? Yes, so move to the left subtree of 79.

Step 3: move to node 55

 Is 55 equal to 55? Yes, search is successful.

Nodes visited during search: 45, 79 and 55.


Deletion of a node

When deleting a node from a binary search tree, the BST property must not be violated.
There are three possible situations:

 The node to be deleted is the leaf node


 The node to be deleted has only one child
 The node to be deleted has two children

a) When the node to be deleted is the leaf node


This is the simplest case. We replace the leaf node with NULL (that is, clear the
parent's link to it) and free the allocated space.
For example, to delete node 90 from the tree built above: 90 is a leaf node, so it is
replaced with NULL and its allocated space is freed.

b) When the node to be deleted has only one child


In this case, we replace the target node with its only child: the link from the target's
parent is redirected to that child, so the child's whole subtree moves up one level, and
the space allocated to the target node is freed.

For example, once 90 has been removed, node 79 has only one child, 55. To delete 79,
its parent 45 is linked directly to 55, and 55 (together with its child 50) takes the
place of 79.
c) When the node to be deleted has two children
This case is the most involved of the three. The steps are as follows:
 First, find the in-order successor of the node to be deleted.
 Next, copy the successor's value into the node to be deleted.
 Finally, delete the successor from the right subtree. The successor has no left
child, so this falls under case (a) or case (b).

The in-order successor is used when the right child of the node is not empty. We
can obtain the in-order successor by finding the minimum element in the right subtree
of the node.

For example, suppose we have to delete node 45, the root node of the tree built above.
It has two children, so its value is replaced with that of its in-order successor, 50
(the minimum of the right subtree). The original node 50 is a leaf, so it can then be
deleted easily.
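
The three deletion cases can be sketched in Python as below. This is one common way to implement them, under the assumption that a node with one child is removed by returning that child to the parent; the names `Node`, `insert`, `delete` and `in_order` are illustrative. Python reclaims unreferenced nodes automatically, so no explicit freeing appears.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, value):
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    return root


def delete(root: Optional[Node], value: int) -> Optional[Node]:
    """Delete value from the BST rooted at root and return the new root."""
    if root is None:
        return None                         # value not present
    if value < root.value:
        root.left = delete(root.left, value)
    elif value > root.value:
        root.right = delete(root.right, value)
    elif root.left is None:                 # case (a) leaf, or case (b) right child only
        return root.right
    elif root.right is None:                # case (b) left child only
        return root.left
    else:                                   # case (c) two children
        successor = root.right
        while successor.left is not None:   # minimum of the right subtree
            successor = successor.left
        root.value = successor.value        # copy the successor's value up
        root.right = delete(root.right, successor.value)
    return root


def in_order(root):
    return [] if root is None else in_order(root.left) + [root.value] + in_order(root.right)


root = None
for key in [45, 15, 79, 90, 10, 55, 12, 20, 50]:
    root = insert(root, key)

root = delete(root, 90)   # case (a): leaf
root = delete(root, 79)   # case (b): one child (55)
root = delete(root, 45)   # case (c): two children, successor is 50
print(root.value)         # 50
print(in_order(root))     # [10, 12, 15, 20, 50, 55]
```

The in-order output stays sorted after every deletion, which is a quick way to check that the BST property has been preserved.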
Insertion in Binary Search Tree

A new key in a BST is always inserted as a leaf. To insert an element, we start searching
from the root node: if the key to be inserted is less than the current node's value, we
continue searching for an empty location in the left subtree.

Otherwise, we search for an empty location in the right subtree. The key is inserted at the
first empty location reached. Insertion is similar to searching, as we always have to maintain
the rule that the left subtree holds smaller values than the root and the right subtree holds
larger values.

Tree Traversal methods

Tree traversal refers to the process of visiting all nodes in a binary tree in a specific order.
There are several methods for traversing a binary tree, each providing a unique way to access
and process nodes.

i. In-order Traversal (Left, Root, Right)


- The process is as follows:
 Visit the left subtree.
 Visit the root node.
 Visit the right subtree.
- For a BST, in-order traversal visits the nodes in ascending (sorted) order.
In-order traversal of the example tree built above: 10, 12, 15, 20, 45, 50, 55, 79, 90.

ii. Pre-order Traversal (Root, left, Right)


- The process is as follows:
 Visit the root node.
 Visit the left subtree
 Visit the right subtree.
- Useful for copying the tree or creating a tree from a sequence.

Pre-order Traversal: 45, 15, 10, 12, 20, 79, 55, 50, 90

iii. Post-order Traversal (Left, Right, Root)


- The process is as follows:
 Visit the left subtree.
 Visit the right subtree
 Visit the root node.
- Useful for deleting a tree or evaluating an expression tree (for example, an
arithmetic expression).
Post-order Traversal: 12, 10, 20, 15, 50, 55, 90, 79, 45.
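
The three depth-first traversals differ only in where the root is visited relative to its subtrees, which a compact sketch makes visible. The `Node` class and function names are illustrative; the tree is the one built earlier from 45, 15, 79, 90, 10, 55, 12, 20, 50, written out by hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def in_order(n):      # Left, Root, Right
    return [] if n is None else in_order(n.left) + [n.value] + in_order(n.right)


def pre_order(n):     # Root, Left, Right
    return [] if n is None else [n.value] + pre_order(n.left) + pre_order(n.right)


def post_order(n):    # Left, Right, Root
    return [] if n is None else post_order(n.left) + post_order(n.right) + [n.value]


# Example tree from the insertion steps.
root = Node(45,
            Node(15, Node(10, None, Node(12)), Node(20)),
            Node(79, Node(55, Node(50)), Node(90)))

print(in_order(root))    # [10, 12, 15, 20, 45, 50, 55, 79, 90]
print(pre_order(root))   # [45, 15, 10, 12, 20, 79, 55, 50, 90]
print(post_order(root))  # [12, 10, 20, 15, 50, 55, 90, 79, 45]
```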

iv. Level-order Traversal (Breadth-First Traversal)


- The process is as follows:
 Visit nodes level-by-level from top to bottom and left to right.
 Use a queue to keep track of nodes to be visited.
- Useful for finding the shortest path or examining nodes level by level.

Level-order traversal for this example is: 45, 15, 79, 10, 20, 55, 90, 12,
50.
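
Level-order traversal is usually implemented with a queue, as noted above. The following is a minimal sketch using `collections.deque` from the Python standard library; the `Node` class and `level_order` function are illustrative names, and the tree is the example tree from the insertion steps.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def level_order(root: Optional[Node]) -> list[int]:
    """Visit nodes level by level, left to right, using a FIFO queue."""
    order = []
    queue = deque([root] if root is not None else [])
    while queue:
        node = queue.popleft()            # oldest waiting node comes out first
        order.append(node.value)
        for child in (node.left, node.right):
            if child is not None:
                queue.append(child)       # children wait behind the current level
    return order


root = Node(45,
            Node(15, Node(10, None, Node(12)), Node(20)),
            Node(79, Node(55, Node(50)), Node(90)))

print(level_order(root))  # [45, 15, 79, 10, 20, 55, 90, 12, 50]
```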
2. Analyze the average-case and worst-case complexity of searching in a Binary Search
Tree

i. Average-Case and Worst-Case Time Complexity

 Average Case:
When keys are inserted in random order, the BST tends to stay roughly balanced,
with nodes evenly distributed between subtrees.
On average, the height of the tree is proportional to log(n), where n
is the number of nodes.
Average-Case Complexity: O(log n).

 Worst Case:
An unbalanced BST can occur if data is inserted in sorted order (either
increasing or decreasing).
The tree becomes a degenerate (linked-list-like) structure whose
height grows linearly with n.
Worst-Case Complexity: O(n).

Summary of Time Complexity

Operation    Average Case    Worst Case
Search       O(log n)        O(n)
Insertion    O(log n)        O(n)
Deletion     O(log n)        O(n)
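
The gap between the two columns can be observed directly by measuring the height of a plain (non-balancing) BST for two insertion orders. This is a small experiment rather than a proof; `bst_height`, the list-based node layout, the size `n = 1000` and the random seed are all assumptions made for the sketch.

```python
import random


def bst_height(keys):
    """Insert keys into a plain BST and return its height in edges."""
    root = None                            # each node is a list: [key, left, right]
    height = 0
    for key in keys:
        if root is None:
            root = [key, None, None]
            continue
        node, depth = root, 0
        while True:
            side = 1 if key < node[0] else 2   # index 1 = left child, 2 = right child
            depth += 1
            if node[side] is None:
                node[side] = [key, None, None]
                break
            node = node[side]
        height = max(height, depth)
    return height


n = 1000
sorted_keys = list(range(n))
shuffled_keys = sorted_keys[:]
random.Random(0).shuffle(shuffled_keys)    # fixed seed, chosen arbitrarily

print(bst_height(sorted_keys))     # 999: one long right-leaning chain
print(bst_height(shuffled_keys))   # far smaller; the exact value depends on the shuffle
```

With sorted input every key becomes the right child of the previous one, so the height equals n - 1; with shuffled input the height stays within a small multiple of log2(n).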

Discuss how unbalanced and balanced trees affect search efficiency, particularly in large
datasets.

i. Balanced Trees

What makes a tree balanced?

 A balanced tree is a tree where the height (the number of levels or edges from the root to
the farthest leaf) is kept as small as possible.
 In a balanced tree, the number of nodes on the left and right subtrees of any node are
roughly the same.
In a balanced tree, the height is approximately log (n) where n is the number of nodes. This
means that for a tree with a large number of nodes, the height of the tree stays relatively small,
which is good for performance.

Why does a balanced tree improve search efficiency?

 When searching for a value, you only need to traverse a small number of levels (around
log(n)) to find the value.
 For example, if you have a balanced tree with 1,000,000 nodes, the height of the tree
will be around 20 (since log₂(1,000,000) ≈ 20). So, you only need about 20 comparisons
to find a value, even in a large dataset.

Example of balanced tree

ii. Unbalanced Trees

What makes a tree unbalanced?

 An unbalanced tree is a tree where the height is too large relative to the number of
nodes.
 This happens when the tree is not structured well, for example, when you keep inserting
values in sorted order (e.g., 1, 2, 3, 4, 5...).
 In this case, the tree ends up being like a linked list, with each node having only one
child.

In an unbalanced tree, the height can be as large as n, where n is the number of nodes. This
makes search operations much slower.

Why does an unbalanced tree decrease search efficiency?

 In an unbalanced tree, you might have to traverse all the way from the root to the last
node, taking n comparisons (where n is the number of nodes).
 For example, in a tree with 1,000,000 nodes, the height of the tree could be 1,000,000 (if
it is completely unbalanced), and you'd have to make 1,000,000 comparisons just to find
a value.

Example of unbalanced tree:

Impact of Balanced Vs Unbalanced Trees on Large Datasets

Balanced Tree:

 Search Efficiency: The time it takes to search is much smaller. Even if you have
millions of nodes, the height will be relatively small (around log (n)).
For example, if you have 1,000,000 nodes, the height will be around 20,
and you'll only need about 20 comparisons to find a value.
 Performance: The tree remains efficient for searching, inserting, and deleting
values, even with very large datasets.

Unbalanced Tree:

 Search Efficiency: The time it takes to search increases significantly. As the tree
becomes more unbalanced, the height increases to n, so searching takes longer.
For example, if you have 1,000,000 nodes and the tree is unbalanced, the
height might be 1,000,000, meaning you'd have to check 1,000,000 nodes
to find the value.

 Performance: The operations (search, insert, delete) become inefficient and take a long
time as the tree grows. This becomes a major problem when dealing with large datasets.

3. Cybersecurity Application of a Binary Search Tree (BST)


i. Maintaining a List of Known Malicious IP Addresses.
In cybersecurity, a Binary Search Tree (BST) can be used to maintain a list of known malicious
IP addresses. Here's how it can be applied:
 Insertion: Malicious IP addresses are added to the BST as they are identified. Each IP
address acts as a unique key in the tree.
 Search: When incoming traffic is detected, its IP address can be checked against the BST
to determine if it is malicious.
 Deletion: If an IP address is no longer deemed malicious (e.g., false positive), it can be
removed from the BST.
 Efficiency in Updates: With dynamic updates, the BST efficiently handles the addition
and removal of entries, making it suitable for dynamic threat databases.
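
A blocklist of this kind could be sketched as follows, assuming IPv4 addresses are converted to integers with the standard `ipaddress` module so that they can be compared numerically. The `Node` class, `add` and `is_malicious` are hypothetical names, the sample addresses come from ranges reserved for documentation, and deletion is left out to keep the example short. The tree here is a plain BST, so it does not rebalance itself.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int                              # IPv4 address as a 32-bit integer
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def ip_key(ip: str) -> int:
    return int(ipaddress.IPv4Address(ip))


def add(root: Optional[Node], ip: str) -> Node:
    """Insert an address into the blocklist BST and return the root."""
    key = ip_key(ip)
    if root is None:
        return Node(key)
    node = root
    while True:
        if key == node.key:
            return root                   # already listed
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))
            return root
        node = child


def is_malicious(root: Optional[Node], ip: str) -> bool:
    """Check an incoming address against the blocklist."""
    key = ip_key(ip)
    node = root
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


# Hypothetical blocklist entries (documentation address ranges).
blocklist = None
for address in ["203.0.113.7", "198.51.100.23", "192.0.2.45"]:
    blocklist = add(blocklist, address)

print(is_malicious(blocklist, "198.51.100.23"))  # True
print(is_malicious(blocklist, "192.0.2.1"))      # False
```

Comparing integers rather than strings keeps the ordering numerically meaningful, since "10.0.0.2" would sort after "10.0.0.10" as text.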

ii. How BST Speeds Up Search Operations
In a cybersecurity context, such as real-time threat detection, quick lookups of malicious IPs are
critical. Here's how the BST structure enhances efficiency:
 Search Time Complexity: In a balanced BST, the time complexity for searching is O(log n), where
n is the number of nodes (malicious IPs). This is significantly faster than a linear search (O(n)),
especially as the number of entries grows.
 Real-Time Detection: A balanced BST ensures fast lookups even when the list of
malicious IP addresses is extensive, enabling real-time threat detection without slowing
down network performance.
 Comparison with Linear Search: In a linear search, each IP would need to be checked
sequentially, leading to delays as the dataset grows.
For example, with 1,000,000 IP addresses:
Linear Search: Up to 1,000,000 comparisons.
BST Search (balanced): Approximately 20 comparisons (log₂(1,000,000) ≈ 20).

iii. Practical Example


Imagine a cybersecurity system monitoring a network with a dynamically updated list of 100,000
malicious IP addresses:
Using a BST: When an IP is detected, it is compared with the root node and the search then
moves left or right depending on its value, significantly reducing the number of comparisons.
E.g., checking if 192.168.1.10 is malicious takes at most 17 steps in a perfectly balanced tree,
since log₂(100,000) ≈ 16.6.
Using Linear Search: Each detected IP would need to be compared sequentially against the
100,000 entries, potentially taking 100,000 steps.
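
The two step counts can be checked with a small experiment. As a simplification, the sketch below uses binary search over a sorted list as a stand-in for a perfectly balanced BST, since both halve the remaining candidates at every step; plain integers stand in for the 100,000 IP addresses, and the function names are illustrative.

```python
def linear_steps(items, target):
    """Number of entries examined by a sequential scan."""
    for steps, item in enumerate(items, start=1):
        if item == target:
            return steps
    return len(items)


def binary_steps(sorted_items, target):
    """Number of entries examined by binary search over sorted data."""
    lo, hi, steps = 0, len(sorted_items) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_items[mid] == target:
            return steps
        if sorted_items[mid] < target:
            lo = mid + 1                   # discard the lower half
        else:
            hi = mid - 1                   # discard the upper half
    return steps


blocklist = list(range(100_000))           # stand-in integers for 100,000 addresses

print(linear_steps(blocklist, 99_999))                     # 100000
print(max(binary_steps(blocklist, t) for t in blocklist))  # 17
```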

iv. Advantages of BST in High-Frequency Scenarios.


 Scalability: BSTs can handle large datasets without a significant decrease in performance,
unlike linear search.
 Dynamic Nature: As new malicious IPs are detected or old ones are removed, BSTs
support efficient updates.
 Real-Time Response: Faster search operations mean quicker responses to potential
threats, which is crucial in mitigating attacks.
 A balanced BST (like an AVL tree or a Red-Black tree) is recommended to maintain
optimal performance. Without balancing, the BST can degrade to a linked list in the
worst case, making searches O(n).

4. Limitations of Binary Trees (BSTs)


 Unbalanced Tree Structure:
In the worst-case scenario (e.g., when data is inserted in sorted order), the BST can degenerate
into a linked list. This results in O(n) time complexity for search, insertion, and deletion,
negating the advantages of the BST.
 Inefficiency with Large Datasets:
Unbalanced trees can lead to longer search times, especially in real-time applications like
cybersecurity, where rapid lookups are critical.
 Complexity in Maintaining Balance:
Implementing balancing mechanisms manually in a BST can be challenging and error-prone.
 Potential for Redundant Traversals:
Searching for non-existent keys may still require traversing a significant portion of the tree,
especially if it is unbalanced.

Improvements Using Enhanced Data Structures


1. AVL Tree
Definition: A self-balancing BST where the height difference (balance factor) between the left
and right subtrees of any node is at most 1.
Impact on Search Efficiency:
Guarantees O(log n) time complexity for search, insertion, and deletion. Ensures the tree remains
balanced after every operation using rotations.
Use Case in Cybersecurity: Ideal for maintaining malicious IP addresses or attack patterns,
where new data is frequently added or removed.
Consistent performance ensures real-time threat detection without delays.
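
To illustrate how rotations keep the height small, here is a compact sketch of AVL insertion only (deletion is omitted). It follows the textbook scheme of storing a height in each node and rebalancing on the way back up the recursion; all class and function names are illustrative, and duplicate keys are ignored by assumption.

```python
class AVLNode:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 1


def height(n):
    return n.height if n else 0            # height counted in levels; empty tree = 0


def update(n):
    n.height = 1 + max(height(n.left), height(n.right))


def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y           # x becomes the new subtree root
    update(y); update(x)
    return x


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x            # y becomes the new subtree root
    update(x); update(y)
    return y


def avl_insert(node, key):
    if node is None:
        return AVLNode(key)
    if key < node.key:
        node.left = avl_insert(node.left, key)
    elif key > node.key:
        node.right = avl_insert(node.right, key)
    else:
        return node                         # duplicate key: nothing to do
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:                         # left-heavy
        if key > node.left.key:             # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance < -1:                        # right-heavy
        if key < node.right.key:            # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def keys_in_order(n):
    return [] if n is None else keys_in_order(n.left) + [n.key] + keys_in_order(n.right)


root = None
for k in range(1, 1001):                    # sorted input: worst case for a plain BST
    root = avl_insert(root, k)

print(root.height)                          # stays logarithmic in the number of keys
```

The same sorted input would give a plain BST a chain of 1,000 nodes, whereas the AVL tree stays within the known bound of about 1.44 log2(n) levels.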

2. Red-Black Tree
Definition: A self-balancing BST that maintains balance through color-coding nodes (red or
black) and enforcing specific rules during insertion and deletion.
Impact on Search Efficiency:
Guarantees O(log n) search, insertion, and deletion. Insertion and deletion are typically slightly
faster than in AVL trees because fewer rotations are needed.
Use Case in Cybersecurity:
Suitable for dynamic environments where frequent updates (additions or deletions) occur, such
as maintaining live threat feeds or firewall rules.

3. Hash Tables
Definition: A data structure that maps keys to values using a hash function for constant-time
lookups.
Impact on Search Efficiency:
Provides O(1) average-case search time. Does not require balancing but might experience collisions,
requiring a resolution strategy (e.g., chaining or open addressing).
Use Case in Cybersecurity:
Excellent for quickly searching large datasets like malicious IP addresses.
Collisions can slow performance, making it less suitable for scenarios with poor hash function
distribution.
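
In Python, the built-in `set` type is implemented as a hash table, so a hash-based blocklist needs very little code. The addresses below are hypothetical entries taken from documentation ranges.

```python
# Hypothetical blocklist entries (documentation address ranges).
blocked = {"203.0.113.7", "198.51.100.23"}

blocked.add("203.0.113.99")        # newly identified threat
blocked.discard("198.51.100.23")   # false positive removed

print("203.0.113.7" in blocked)    # True
print("198.51.100.23" in blocked)  # False
```

Membership tests, insertions and removals all take constant time on average, at the cost of losing the sorted order that a tree provides.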
4. B-Trees (and B+ Trees)
Definition: A self-balancing multi-way search tree commonly used in databases and file systems.

Impact on Search Efficiency:


Ideal for disk-based systems due to reduced height and fewer disk reads.
Guarantees O(log n) operations, even for large datasets.
Use Case in Cybersecurity:
Efficient for storing large sets of historical threat data or IPs in disk-based systems.

Comparison of Enhanced Data Structures


Dynamic Threat Databases: Use Red-Black Trees for balancing the need for quick lookups and
frequent updates.
Static or Frequent Lookups: Use Hash Tables to maximize speed, ensuring a well-distributed
hash function.
Large Historical Data Storage: Use B-Trees to efficiently manage large-scale, disk-stored
datasets.
By selecting the right data structure based on the application’s requirements, the limitations of
standard BSTs can be overcome, ensuring efficient and robust real-time cybersecurity defenses.
