Ch-2: Abstract Data Structures
Abstract Data Types: An abstract data type (ADT) is a set of objects together with a set of
operations defined on them.
Common ADTs:
● Stacks: LIFO (Last-In-First-Out) data structure.
○ Operations: push, pop, peek, isEmpty.
● Queues: FIFO (First-In-First-Out) data structure.
○ Operations: enqueue, dequeue, peek, isEmpty.
● Lists: Ordered collection of elements.
○ Operations: insert, delete, search, traverse.
● Trees: Hierarchical data structure.
○ Types: binary trees, binary search trees, AVL trees, etc.
● Graphs: Non-linear data structure made of nodes (vertices) connected by edges.
○ Types: directed graphs, undirected graphs, weighted graphs, etc.
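As a quick illustration of the stack and queue operations above, here is a sketch using Python's built-in list and collections.deque:

```python
from collections import deque

# Stack: LIFO -- push and pop both happen at the same end.
stack = []
stack.append(1)    # push
stack.append(2)
stack.append(3)
top = stack.pop()  # pop -> 3 (last in, first out)

# Queue: FIFO -- enqueue at the back, dequeue from the front.
queue = deque()
queue.append("a")        # enqueue
queue.append("b")
first = queue.popleft()  # dequeue -> "a" (first in, first out)

print(top, first)  # 3 a
```

A deque is used for the queue because popping from the front of a plain list is O(n), while deque.popleft() is O(1).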
Maps:
Maps or dictionaries are abstract data types that store key-value pairs. This
allows for efficient retrieval of values based on their corresponding keys.
Key Components of a Map ADT:
● Keys: Unique identifiers used to access values.
● Values: Data associated with the keys.
● Operations:
○ insert(key, value): Adds a new key-value pair to the map.
○ get(key): Retrieves the value associated with the given key.
○ remove(key): Removes the key-value pair from the map.
○ contains(key): Checks if the map contains the given key.
○ size(): Returns the number of key-value pairs in the map.
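In Python, the built-in dict already provides these Map operations; a minimal sketch mapping each ADT operation to its dict equivalent:

```python
# Each line's comment names the Map ADT operation it corresponds to.
ages = {}
ages["alice"] = 30          # insert(key, value)
ages["bob"] = 25
value = ages["alice"]       # get(key)
has_bob = "bob" in ages     # contains(key)
del ages["bob"]             # remove(key)
count = len(ages)           # size()

print(value, has_bob, count)  # 30 True 1
```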
Hash Tables:
Hash tables are a type of data structure that use a hash function to map keys to indices in
an array. This allows for efficient storage and retrieval of key-value pairs.
Collision Resolution:
1. Separate Chaining: Each array index holds a list (chain) of all the entries that hash to it.
2. Open Addressing: When a collision occurs, the algorithm probes other indices in the
array until an empty slot is found.
a. Linear probing: Search sequentially from the collision point.
b. Quadratic probing: Search using a quadratic function of the probe index.
c. Double hashing: Use a second hash function to determine the probe sequence.
Worst-Case Performance: In the worst case, when all keys hash to the same index, operations can
degrade to O(n).
Linear Probing
When a collision occurs, linear probing sequentially searches for the next available index in
the hash table.
How Linear Probing Works:
1. Hash the key: Calculate the hash value using the hash function.
2. Check the index: If the corresponding index in the hash table is empty, insert the
key-value pair.
3. Collision: If the index is occupied, increment the index by 1 and repeat step 2.
x = [18, 41, 22, 44, 59, 32, 31]
y = 13 // table size
hash(x) = x mod y = [5, 2, 9, 5, 7, 6, 5]
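The worked example above can be reproduced with a short sketch of linear-probing insertion (it assumes the table never fills; inserting into a full table would loop forever):

```python
def linear_probe_insert(table, key):
    """Insert key using linear probing; empty slots hold None.
    Assumes at least one slot is free."""
    n = len(table)
    i = key % n                   # 1. hash the key
    while table[i] is not None:   # 3. collision: step forward by 1 (wrapping)
        i = (i + 1) % n
    table[i] = key                # 2. empty slot found: insert
    return i

table = [None] * 13
for k in [18, 41, 22, 44, 59, 32, 31]:
    linear_probe_insert(table, k)

print({i: k for i, k in enumerate(table) if k is not None})
# {2: 41, 5: 18, 6: 44, 7: 59, 8: 32, 9: 22, 10: 31}
```

Note how 44, 32, and 31 all hash to occupied slots and slide right, forming exactly the cluster around indices 5-10 that the next section warns about.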
Cons of Linear Probing:
● Clustering: Can lead to clustering of elements, where consecutive indices are
filled. This can degrade performance, especially for high load factors.
● Deletion difficulties: Deleting elements can create "holes" in the table, which can
make subsequent searches less efficient.
Quadratic Probing:
Collision: If the index is occupied, offset the original index by a quadratic function of the
probe number. The quadratic function is typically of the form j², where j is the probe number
(j = 0 gives the initial hash index). Note that, unlike linear probing, quadratic probing is
not guaranteed to find an empty bucket even when one exists.
x = [18, 41, 31, 54, 28, 44, 15]
y = 13 // table size
hash(x) = x mod y = [5, 2, 5, 2, 2, 5, 2]
When a collision occurs at index i, the key is placed in the first empty slot among
A[(i + j²) mod N], for j = 1, 2, 3, …, N−1.
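A sketch of quadratic-probing insertion on the keys above, probing (h + j²) mod N for j = 0, 1, 2, …:

```python
def quadratic_probe_insert(table, key):
    """Insert key using quadratic probing: try (h + j*j) % N for
    j = 0, 1, 2, ...  May fail even when a free slot exists."""
    n = len(table)
    h = key % n
    for j in range(n):
        i = (h + j * j) % n
        if table[i] is None:
            table[i] = key
            return i
    raise RuntimeError("quadratic probing found no free slot")

table = [None] * 13
for k in [18, 41, 31, 54, 28, 44, 15]:
    quadratic_probe_insert(table, k)

print({i: k for i, k in enumerate(table) if k is not None})
# {1: 15, 2: 41, 3: 54, 5: 18, 6: 31, 9: 44, 11: 28}
```

The last key, 15, needs six probes (indices 2, 3, 6, 11, 5 are all taken before slot 1 is found), showing how the quadratic offsets jump around instead of clustering.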
Double Hashing
Aims to minimize clustering and improve performance by using two hash functions.
How Double Hashing Works:
1. Primary hash function
2. Secondary hash function
3. Probe sequence: Use the secondary hash value to determine the probe sequence.
The probe sequence is calculated as follows:
○ index = (initial_hash + j * secondary_hash) % table_size
○ j starts from 0 and increments with each probe.
Worked example (N = 13; primary hash h1(k) = k mod 13; the secondary hash consistent
with the values below is h2(k) = 7 − (k mod 7)):

key | h1 | h2 | final index
18  |  5 |  3 | 5
41  |  2 |  1 | 2
22  |  9 |  6 | 9
44  |  5 |  5 | 10, since (5 + 1 · 5) mod 13 = 10
59  |  7 |  4 | 7
32  |  6 |  3 | 6
31  |  5 |  4 | 0, since (5 + 1 · 4) mod 13 = 9 is occupied and (5 + 2 · 4) mod 13 = 0
73  |  8 |  4 | 8

Resulting table (indices 0 to 12):
index:  0   2   5   6   7   8   9  10
key:   31  41  18  32  59  73  22  44
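A sketch of double-hashing insertion reproducing the placements above. The secondary hash 7 − (k mod 7) is an assumption inferred from the worked values; the important property is that it is never 0, so every probe actually moves:

```python
def double_hash_insert(table, key):
    """Insert key with double hashing: probe (h1 + j * h2) % N for
    j = 0, 1, 2, ...  h2(k) = 7 - (k % 7) matches the worked example."""
    n = len(table)
    h1 = key % n
    h2 = 7 - (key % 7)   # in [1, 7], so never 0
    j = 0
    while table[(h1 + j * h2) % n] is not None:
        j += 1
    index = (h1 + j * h2) % n
    table[index] = key
    return index

table = [None] * 13
for k in [18, 41, 22, 44, 59, 32, 31, 73]:
    double_hash_insert(table, k)

print({i: k for i, k in enumerate(table) if k is not None})
# {0: 31, 2: 41, 5: 18, 6: 32, 7: 59, 8: 73, 9: 22, 10: 44}
```

Unlike linear probing, keys that collide at the same slot (18, 44, 31) take different step sizes (3, 5, 4), so they scatter instead of piling up next to each other.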
Load Factor:
The load factor of a hash table is the ratio of the number of elements stored in the table to
the total size of the table. It is calculated as:
Load factor (α) = Number of elements (n) / Table size (N)
α = n / N
If p is the expected number of probes per operation:
● α ≈ 0: p is a constant.
● α → 1: p tends to infinity.
The ideal load factor is α < 0.5; then the expected running time of a search, insertion, or
deletion is O(1).
Rehashing
1. Create a new hash table with a new size.
2. Remove all elements from the old table (one at a time).
3. Insert them into the new table (one at a time).
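A minimal sketch of the three rehashing steps, using linear probing for the inserts (the sizes 7 and 13 are illustrative):

```python
def linear_probe_insert(table, key):
    """Linear-probing insert; assumes the table is not full."""
    n = len(table)
    i = key % n
    while table[i] is not None:
        i = (i + 1) % n
    table[i] = key

def rehash(old_table, new_size):
    """Step 1: create a new table; steps 2-3: take each element out of
    the old table and insert it into the new one. Indices usually change
    because `key % size` now uses the new size."""
    new_table = [None] * new_size
    for key in old_table:
        if key is not None:
            linear_probe_insert(new_table, key)
    return new_table

old = [None] * 7
for k in [18, 41, 22, 44]:
    linear_probe_insert(old, k)

# Load factor is 4/7 > 0.5, so grow the table.
new = rehash(old, 13)
print({i: k for i, k in enumerate(new) if k is not None})
# {2: 41, 5: 44, 6: 18, 9: 22}
```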
Cuckoo hashing
It's named after the cuckoo bird, which lays its eggs in other birds' nests, causing the
original eggs to be evicted. In cuckoo hashing, when a collision occurs, one of the
conflicting elements is evicted and rehashed to a different location.
How it works:
1. Multiple hash functions: Cuckoo hashing uses two independent hash functions, h1
and h2.
2. Initial insertion:
a. Compute h1(key) and h2(key).
b. Try to place the key in table 1 at index h1(key).
3. Collision:
a. The new key kicks out the current occupant and takes its slot in table 1.
b. The evicted key is then placed in table 2 at its own h2 index (which may evict
another key in turn).
Worked example (insertion order A, B, D, C, F, E):

key | h1 | h2
A   |  0 |  2
B   |  0 |  0
D   |  1 |  0
C   |  1 |  4
F   |  3 |  4
E   |  3 |  2

Final state: Table 1 holds B at 0, C at 1, E at 3; Table 2 holds D at 0, A at 2, F at 4.
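The kick-out procedure can be sketched as follows, using the hash values from the example (insertion order A, B, D, C, F, E is assumed; a real implementation would rehash when it detects a cycle):

```python
def cuckoo_insert(t1, t2, h1, h2, key, max_kicks=10):
    """Place key in table 1; on collision, kick the occupant out and
    re-place it in table 2 via h2, possibly cascading further kicks."""
    for _ in range(max_kicks):
        slot = h1[key]
        key, t1[slot] = t1[slot], key   # place key, capture old occupant
        if key is None:
            return
        slot = h2[key]
        key, t2[slot] = t2[slot], key   # evictee goes to table 2
        if key is None:
            return
    raise RuntimeError("too many evictions -- rehash needed")

# Hash values taken from the worked example.
h1 = {"A": 0, "B": 0, "C": 1, "D": 1, "E": 3, "F": 3}
h2 = {"A": 2, "B": 0, "C": 4, "D": 0, "E": 2, "F": 4}
t1, t2 = [None] * 5, [None] * 5
for k in ["A", "B", "D", "C", "F", "E"]:
    cuckoo_insert(t1, t2, h1, h2, k)

print(t1, t2)
# ['B', 'C', None, 'E', None] ['D', None, 'A', None, 'F']
```

For instance, inserting B evicts A from slot 0 of table 1, and A lands in slot 2 of table 2, exactly like the cuckoo chick pushing the original egg out of the nest.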
Advantages of Cuckoo Hashing:
● Constant-time lookups: a key can be in only one of two possible slots, so search and
deletion are O(1) in the worst case; insertion is expected O(1).
● No clustering: unlike linear probing or quadratic probing, cuckoo hashing doesn't
suffer from clustering issues.
● High load factors: variants with more hash functions or multi-slot buckets can sustain
high load factors without significant performance degradation (the basic two-table
scheme needs a load factor below about 0.5).
Binary Heap
A heap is a binary tree satisfying the following properties:
▪ Heap-Order (min-heap convention): for every node v other than the root,
key(v) ≥ key(parent(v))
▪ Complete Binary Tree: let h be the height of the heap
➢ for i = 0, …, h−1, there are 2^i nodes at depth i
➢ at depth h, the external nodes are arranged to the
left of the tree
Height of a Heap
• Theorem: A heap storing n keys has height O(log n).
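Python's heapq module maintains exactly this structure, a min-heap stored in a list encoding of the complete binary tree; a quick check of the heap-order property:

```python
import heapq

# heapq keeps a min-heap in a plain list: the children of the node at
# index i live at indices 2*i + 1 and 2*i + 2.
heap = []
for key in [59, 18, 44, 31, 22]:
    heapq.heappush(heap, key)

# Heap-order: every node's key is >= its parent's key.
assert all(heap[(i - 1) // 2] <= heap[i] for i in range(1, len(heap)))

print(heap[0])  # 18 -- the minimum is always at the root
```

Because the tree is complete, its height is ⌊log₂ n⌋, which is why push and pop each cost O(log n).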