DSA - Notes 2
May 1, 2025
Contents
1 Complexity Analysis
  1.1 Definition
  1.2 Common Time Complexity Classes
  1.3 Asymptotic Notation
  1.4 Growth Rate Comparison Table
3 Hashing
  3.1 Definition
  3.2 Terminology
  3.3 Ideal Hash Function Properties
  3.4 Common Hashing Techniques
  3.5 Python-like Pseudocode for Hash Table
4 Collision Handling
  4.1 Separate Chaining
    4.1.1 Separate Chaining
  4.2 Open Addressing
    4.2.1 Linear Probing
    4.2.2 Quadratic Probing
    4.2.3 Double Hashing
  4.3 Comparison of Methods
5 Recurrence Relations
  5.1 Definition
  5.2 Types of Recurrence Relations
  5.3 Solving Recurrences
    5.3.1 Example 1: Simple Recursion
    5.3.2 Example 2: Divide and Conquer Recurrence
    5.3.3 Example 3: Exponential Recurrence
7 Trees and Binary Search Trees (BSTs)
  7.1 Tree Definition
  7.2 Binary Search Tree (BST)
  7.3 BST Operations
    7.3.1 Insertion
    7.3.2 Deletion
  7.4 Time Complexity of BST Operations
8 Graph Representation
  8.1 Graph Definition
  8.2 Types of Graph Representation
    8.2.1 Edge List
    8.2.2 Adjacency List
    8.2.3 Adjacency Matrix
  8.3 Choosing the Right Representation
9 Solving 10 Questions Using Recurrence Relations
10 Solving 4 More Problems Using Back-Substitution
11 Final Exam Style Questions with Solutions
12 Conclusion
1 Complexity Analysis
1.1 Definition
Algorithmic complexity refers to the rate at which the computational resources used by an
algorithm (such as time or space) increase as the input size n increases. It allows us to:
• Compare the efficiency of algorithms.
• Predict performance on large inputs.
• Choose optimal solutions based on input constraints.
Example
Show that f(n) = 3n + 2 = O(n).
Proof: Let g(n) = n and choose c = 5. Then
3n + 2 ≤ 5n for all n ≥ 1,
so f(n) ≤ c · g(n) for all n ≥ n₀ = 1, and hence f(n) = O(n).
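A quick numeric sanity check of the chosen witness constants c = 5 and n₀ = 1 (an assumed throwaway snippet, not part of the original notes):

# Verify 3n + 2 <= 5n for the first several values of n >= 1.
assert all(3 * n + 2 <= 5 * n for n in range(1, 1000))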
Example 4: Linear-Logarithmic
for i in range(n):
    j = 1
    while j < n:
        j = j * 2
Time Complexity: O(n log n)
Example 5: Conditional with Constant Time
if a > n:
    a = a + 1
else:
    a = a - 1
Time Complexity: O(1)
3 Hashing
3.1 Definition
Hashing is a technique used to map data (keys) to fixed-size indices in a hash table using a hash
function. The goal is to provide constant-time O(1) access for insertion, deletion, and retrieval.
3.2 Terminology
• Key: The unique identifier used to store/retrieve values.
• Slot Index: The location in the hash table where a key-value pair is stored.
• Hash Table: A data structure that stores key-value pairs.
• Hash Function: Converts a key into a valid slot index.
• Synonyms: Different keys that hash to the same index.
• Collision: Occurs when two keys map to the same slot.
• Probing: The process of searching for an alternative free slot when a collision occurs.
2. Arithmetic Methods:
• Addition: h(k) = (k + x)
• Subtraction: h(k) = |k − x|
• Multiplication: h(k) = k × x
• Division: h(k) = k ÷ x
When out of range: h(k) = h(k) mod N , where N is the table size.
3. Modulo Division:
h(k) = k mod N
Choose N as a prime number to reduce collisions.
4. Digit Extraction:
• Extract selected digits from key to form index.
• Example: Extract 1st, 3rd, and last digit.
5. Midsquare Method:
• Square the key, extract middle digits.
6. Folding Method:
• Split key into equal parts and sum them.
• Fold Boundary: Reverse first and last part before summing.
7. Rotation Method:
• Rotate digits in the key before applying a modulus.
8. Pseudorandom Generation:
h(k) = (a · k + c) mod N
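A few of these techniques written out as small Python functions. This is a minimal sketch: the table size N = 11 and the constants a = 7, c = 3 are arbitrary illustrative choices, not values from the notes.

N = 11  # table size; a prime, as recommended for modulo division

def modulo_division(k):
    # h(k) = k mod N
    return k % N

def midsquare(k, digits=2):
    # Square the key, then take `digits` digits from the middle of the result.
    s = str(k * k)
    mid = len(s) // 2
    return int(s[max(0, mid - digits // 2): mid + (digits + 1) // 2]) % N

def folding(k, part=2):
    # Split the key into `part`-digit pieces and sum them.
    s = str(k)
    total = sum(int(s[i:i + part]) for i in range(0, len(s), part))
    return total % N

def pseudorandom(k, a=7, c=3):
    # h(k) = (a*k + c) mod N
    return (a * k + c) % N

print(modulo_division(4371), midsquare(4371), folding(4371), pseudorandom(4371))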
4 Collision Handling
When two keys hash to the same slot index, we must resolve the collision to avoid data loss.
The two primary strategies are separate chaining and open addressing.
Search:
1. Compute index: i = h(k).
2. Search linearly through the chain at slot i for key k.
Deletion:
1. Compute index: i = h(k).
2. Remove the entry with key k from chain at slot i.
Time Complexity:
• Average Case: O(1 + λ), where λ is the load factor.
• Worst Case: O(n), if all keys collide into one chain.
Example Code
class ChainHashTable:
    def __init__(self, size):
        # One empty chain (Python list) per slot.
        self.table = [[] for _ in range(size)]
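For completeness, one possible sketch of the remaining separate-chaining operations. The method names and the use of Python's built-in hash() reduced modulo the table size are illustrative assumptions, not part of the original notes.

class ChainHashTable:
    def __init__(self, size):
        # One chain (Python list) per slot.
        self.table = [[] for _ in range(size)]

    def _index(self, key):
        # Assumed hash function: built-in hash reduced modulo the table size.
        return hash(key) % len(self.table)

    def insert(self, key, value):
        chain = self.table[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                 # key already present: overwrite the value
                chain[i] = (key, value)
                return
        chain.append((key, value))       # otherwise append to the chain

    def search(self, key):
        for k, v in self.table[self._index(key)]:
            if k == key:
                return v
        return None                      # key not in the table

    def delete(self, key):
        chain = self.table[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False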
Insertion:
1. Compute base index: j = h(k).
2. If slot j is occupied, try (j + 1) mod N , then (j + 2) mod N , etc.
Time Complexity:
• Average Case: O(1/(1 − λ)), where λ is the load factor.
• Worst Case: O(n) if the table is nearly full.
Code Snippet
def linear_probe_insert(table, key, value):
    N = len(table)
    i = key % N                      # base index h(k) = k mod N
    for step in range(N):
        idx = (i + step) % N         # probe sequence: i, i+1, i+2, ... (mod N)
        if table[idx] is None:
            table[idx] = (key, value)
            return
    raise Exception("Hash table is full")
5 Recurrence Relations
5.1 Definition
A recurrence relation is an equation that expresses the value of a function in terms of its previous
values. It is commonly used to describe the time complexity of recursive algorithms.
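For instance, the running time of the simple recursive function below (an assumed illustration, not from the notes) is described by the recurrence T(n) = T(n − 1) + O(1), which solves to O(n):

def count_down(n):
    # One unit of work per call, plus a recursive call on n - 1:
    # T(n) = T(n - 1) + O(1), T(1) = O(1)  =>  T(n) = O(n)
    if n <= 1:
        return 1
    return 1 + count_down(n - 1)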
• Master’s Theorem
Time Complexity: O(V²) using a simple array, O((V + E) log V) using a priority queue.
6.2.2 Example:
In a graph with nodes {0, 1, 2, 3}, Dijkstra's algorithm can be used to find the shortest path
from node 0 to all other nodes.
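A minimal sketch of this idea using Python's heapq module. The adjacency-list layout and the edge weights below are assumed purely for illustration and are not taken from the notes.

import heapq

def dijkstra(adj, source):
    # adj: dict mapping node -> list of (neighbor, weight); returns shortest distances.
    dist = {source: 0}
    pq = [(0, source)]                       # (distance, node) min-heap
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):    # stale queue entry, skip it
            continue
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd                 # relax edge (u, v)
                heapq.heappush(pq, (nd, v))
    return dist

# Nodes {0, 1, 2, 3} as in the example; the weights below are arbitrary.
adj = {0: [(1, 4), (2, 1)], 1: [(3, 1)], 2: [(1, 2), (3, 5)], 3: []}
print(dijkstra(adj, 0))   # -> {0: 0, 2: 1, 1: 3, 3: 4}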
7.2 Binary Search Tree (BST)
A binary search tree is a binary tree with the following properties:
• Each node has at most two children (left and right).
• The left subtree of a node contains only nodes with values less than the node's value.
• The right subtree of a node contains only nodes with values greater than the node's value.
7.3.2 Deletion
There are three cases when deleting a node:
1. The node is a leaf.
2. The node has one child.
3. The node has two children.
For the third case, we can replace the node with either its inorder predecessor or successor.
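A compact sketch of how these three cases might look in code. The Node class, the function name, and the choice of the inorder successor for the two-child case are illustrative assumptions, not taken verbatim from the notes.

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def delete(root, value):
    # Delete `value` from the BST rooted at `root` and return the new root.
    if root is None:
        return None
    if value < root.value:
        root.left = delete(root.left, value)
    elif value > root.value:
        root.right = delete(root.right, value)
    else:
        # Cases 1 and 2: zero or one child -> splice the node out.
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # Case 3: two children -> copy the inorder successor's value,
        # then delete the successor from the right subtree.
        succ = root.right
        while succ.left is not None:
            succ = succ.left
        root.value = succ.value
        root.right = delete(root.right, succ.value)
    return root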
8 Graph Representation
8.1 Graph Definition
A graph is a non-linear data structure consisting of nodes (vertices) and edges (arcs) that
connect pairs of nodes. A graph is represented as G(V, E), where:
• V is the set of vertices, e.g., {V1, V2, V3, ..., Vn}
• E is the set of edges, e.g., {E1, E2, E3, ..., En}
A graph can be either directed or undirected, and edges can be weighted or unweighted.
Example:
E = {(u, v), (v, u), (u, w), (w, u), (v, w), (w, v)}
For a weighted graph:
E = {(u, v, 5), (v, u, 5), (u, w, 3), (w, u, 3)}
Operations:
• Space complexity: O(n + m), where n is the number of vertices and m is the number of
edges.
• Time complexity for inserting or removing edges: O(1).
• Searching for an edge: O(m).
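As a concrete sketch, the weighted edges above stored as a flat Python list; the O(m) edge search simply scans every stored tuple (the helper name has_edge is an assumption for illustration):

# Edge list: one (u, v, weight) tuple per directed edge.
edges = [("u", "v", 5), ("v", "u", 5), ("u", "w", 3), ("w", "u", 3)]

def has_edge(edges, a, b):
    # O(m): examine every edge in the list.
    return any(u == a and v == b for u, v, _ in edges)

print(has_edge(edges, "u", "w"))   # -> True
print(has_edge(edges, "v", "w"))   # -> False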
V[u] → [(v, 5), (w, 3)], V[v] → [(u, 5), (w, 2)], V[w] → [(u, 3), (v, 2)]
Operations:
• Space complexity: O(n²).
• Time complexity for accessing an edge: O(1).
• Time complexity for listing neighbors: O(n).
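To make the example above concrete, a possible sketch of the same small graph as an adjacency list (a dict of neighbor lists) and as an adjacency matrix. The vertex names and weights follow the example; the dict/list layout itself is an assumption.

# Adjacency list: each vertex maps to a list of (neighbor, weight) pairs.
adj_list = {
    "u": [("v", 5), ("w", 3)],
    "v": [("u", 5), ("w", 2)],
    "w": [("u", 3), ("v", 2)],
}

# Adjacency matrix for the same graph (0 = no edge), with row/column order u, v, w.
index = {"u": 0, "v": 1, "w": 2}
adj_matrix = [
    [0, 5, 3],
    [5, 0, 2],
    [3, 2, 0],
]

# O(1) edge lookup in the matrix; O(degree) scan in the list.
print(adj_matrix[index["u"]][index["w"]])   # -> 3
print(dict(adj_list["u"]).get("w"))         # -> 3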
9 Solving 10 Questions Using Recurrence Relations
In this section, we will solve 10 problems using recurrence relations. These problems illustrate
various approaches to solving recurrence relations, including back-substitution, recursion tree,
and applying the master theorem.
T (n) = T (n − 1) + 1, T (1) = 1
T (n) = T (n − 1) + 1
Substitute for T (n − 1):
T (n − 1) = T (n − 2) + 1
Substitute again:
T (n) = T (n − 2) + 1 + 1 = T (n − 2) + 2
Continuing this process:
T (n) = T (n − k) + k
When k = n − 1, we get:
T (n) = T (1) + (n − 1) = 1 + (n − 1) = n
Solution: This is a classic divide and conquer recurrence. We will apply the master theorem
for divide-and-conquer recurrences.
The recurrence matches the standard form T(n) = aT(n/b) + f(n), with:
log_b a = log_2 2 = 1
9.3 Problem 3: Exponential Recurrence
Given the recurrence relation:
T (n) = 2T (n − 1) + 1, T (1) = 1
Solution: Using back-substitution:
T (n) = 2T (n − 1) + 1
Substitute for T (n − 1):
T (n − 1) = 2T (n − 2) + 1
Substitute again:
T(n) = 2(2T(n − 2) + 1) + 1 = 2²T(n − 2) + 2 + 1 = 2²T(n − 2) + 3
Continuing this:
T(n) = 2^k T(n − k) + (2^k − 1)
When k = n − 1, we get:
T(n) = 2^(n−1) T(1) + (2^(n−1) − 1)
Since T(1) = 1:
T(n) = 2^(n−1) + (2^(n−1) − 1) = 2^n − 1
Thus, T(n) = O(2^n).
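A quick numerical check of this closed form (an assumed throwaway script, only to confirm the algebra above):

def T(n):
    # Evaluate the recurrence T(n) = 2*T(n-1) + 1 with T(1) = 1 directly.
    return 1 if n == 1 else 2 * T(n - 1) + 1

for n in range(1, 11):
    assert T(n) == 2 ** n - 1    # matches the closed form 2^n - 1 derived above
print("closed form verified for n = 1..10")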
9.6 Problem 6: Polynomial Recurrence
Given:
T(n) = 3T(n/2) + n², T(1) = 1
Solution: Using the master theorem:
T(n) = aT(n/b) + f(n)
Here a = 3, b = 2, and f(n) = n². Calculate log_b a:
log_b a = log_2 3 ≈ 1.585
We compare f(n) with n^(log_b a) ≈ n^1.585. Since f(n) = n² = Ω(n^(log_b a + ε)) for some ε > 0, and the regularity condition a·f(n/b) = 3(n/2)² = (3/4)n² ≤ c·n² holds with c = 3/4 < 1, the recurrence falls into the third case of the master theorem:
T(n) = Θ(n²)
10 Solving 4 More Problems Using Back-Substitution
In this section, we will solve 4 additional problems using the back-substitution method. Each
problem will demonstrate the iterative process of substituting values and observing the growth
of the recurrence.
T (n) = T (n − 1) + 1
Substitute for T (n − 1):
T (n − 1) = T (n − 2) + 1
Substitute again:
T (n) = T (n − 2) + 1 + 1 = T (n − 2) + 2
Expanding further:
T (n) = T (n − 3) + 3
Continue this process until we reach T (1):
T (n) = T (1) + (n − 1)
Since T (1) = 1:
T (n) = 1 + (n − 1) = n
Thus, T (n) = O(n).
T (n) = T (n/2) + n
Substitute for T (n/2):
T (n/2) = T (n/4) + n/2
Substitute again:
T (n) = T (n/4) + n/2 + n = T (n/4) + 3n/2
Expanding further:
T(n) = T(n/8) + n/4 + 3n/2 = T(n/8) + 7n/4
Continue until the argument reaches 1. After k substitutions, the recurrence becomes:
T(n) = T(n/2^k) + n · Σ_{i=0}^{k−1} 1/2^i
When k = log₂ n, T(n/2^k) = T(1), so:
T(n) = T(1) + n · Σ_{i=0}^{log n − 1} 1/2^i
The sum is a geometric series bounded above by 2:
Σ_{i=0}^{log n − 1} 1/2^i < 2
Thus, the solution is:
T(n) ≤ 1 + 2n = O(n)
10.3 Problem 3: A Quadratic Recurrence
Given:
T(n) = T(n − 1) + n², T(1) = 1
Solution: Start by expanding the recurrence.
T(n) = T(n − 1) + n²
Substitute for T(n − 1):
T(n − 1) = T(n − 2) + (n − 1)²
Substitute again:
T(n) = T(n − 2) + (n − 1)² + n²
Expanding further:
T(n) = T(n − 3) + (n − 2)² + (n − 1)² + n²
Continuing:
T(n) = T(1) + Σ_{k=2}^{n} k²
Since T(1) = 1 = 1², this is exactly the sum of squares:
T(n) = Σ_{k=1}^{n} k² = n(n + 1)(2n + 1) / 6
Thus, the final solution is:
T(n) = n(n + 1)(2n + 1) / 6 = O(n³)
T (n) = 2T (n − 1) + 1
Substitute for T (n − 1):
T (n − 1) = 2T (n − 2) + 1
Substitute again:
T(n) = 2(2T(n − 2) + 1) + 1 = 2²T(n − 2) + 2 + 1
Expanding further:
T(n) = 2³T(n − 3) + 2² + 2 + 1
Continuing the pattern:
T(n) = 2^k T(n − k) + Σ_{i=0}^{k−1} 2^i
When k = n − 1, we get:
T(n) = 2^(n−1) T(1) + (2^(n−1) − 1)
Since T(1) = 1:
T(n) = 2^(n−1) + (2^(n−1) − 1) = 2^n − 1
Thus, T(n) = O(2^n).
11 Final Exam Style Questions with Solutions
This section presents a selection of representative questions sourced from Harvard University
and University of Waterloo final exam practice materials. These questions cover core topics in
data structures and algorithms and include detailed solutions.
Q2: True or False: The runtime complexity of range query for kd-trees depends on the spread
factor of points.
A2: True. The spread factor (distribution of data) affects the tree balance, and hence the
query time.
T(n) = Θ(n^2.5)
11.4 Graphs and Shortest-Path Algorithms
Q7: Which algorithm is used to find the shortest path in a graph with non-negative edge
weights?
A7: Dijkstra's Algorithm. It maintains a priority queue and updates distances with the relaxation step dist[v] = min(dist[v], dist[u] + w(u, v)).
Q8: True or False: Cuckoo Hashing may require rehashing even with a low load factor.
A8: True. Cuckoo Hashing can enter a cycle due to placement conflicts, requiring a full rehash.
Q10: True or False: Inserting the same set of keys in different orders into a compressed trie
always gives the same structure.
A10: False. The structure of compressed tries depends on the insertion order.
12 Conclusion
You all need help... You actually read through this whole document? Why? Here is the suicide
prevention helpline number: 0311 7786264
Also you won... but at the cost of being awake at 2 in the morning staring at a blank screen?