ADS Unit 3

The document defines a dictionary as an abstract data type that stores unique key-value pairs and discusses its applications when duplicates are allowed, such as in database indexing and full-text search engines. It explains open hashing (separate chaining) and closed hashing (open addressing) with examples, detailing how collisions are handled in each method. Additionally, it covers the representation of hash tables, types, operations, advantages, disadvantages, and introduces double hashing as a collision resolution technique.

Uploaded by

coe

UNIT-3

1. Define dictionary. Give the applications of dictionary with duplicates in which sequential access is desired.

Definition of Dictionary
A dictionary is an abstract data type (ADT) that stores key-value pairs, where each key is
unique and maps to a corresponding value. Dictionaries provide efficient operations for
inserting, deleting, and searching for elements. In programming, dictionaries are often
implemented using data structures such as hash tables, binary search trees, or balanced trees.
When duplicates are allowed, the dictionary can be modified to store multiple values for a
single key, enabling use cases that require multi-mapping.
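The multi-mapping idea can be sketched in Python as follows. This is a minimal sketch, assuming the common approach of keeping a list of values per key; the class name MultiDict, its methods, and the sample employee IDs are illustrative and not taken from any library.

```python
from collections import defaultdict


class MultiDict:
    """Hypothetical multimap: each key maps to a list of values in insertion order."""

    def __init__(self):
        self._items = defaultdict(list)

    def insert(self, key, value):
        # Duplicate keys are allowed; values are kept in arrival order.
        self._items[key].append(value)

    def search(self, key):
        # Return every value stored under key (empty list if absent).
        return list(self._items.get(key, []))

    def delete(self, key, value):
        # Remove one occurrence of (key, value); return True if removed.
        values = self._items.get(key)
        if not values or value not in values:
            return False
        values.remove(value)
        if not values:
            del self._items[key]
        return True


index = MultiDict()
index.insert("Smith", "emp-101")
index.insert("Jones", "emp-102")
index.insert("Smith", "emp-103")
print(index.search("Smith"))  # ['emp-101', 'emp-103']
```

Because each key holds its values in a list, all duplicates for a key can be read back sequentially in the order they were inserted.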
Applications of Dictionary with Duplicates in Sequential Access
Dictionaries that allow duplicate keys and provide sequential access are useful in various
applications, including:
1. Database Indexing
o In databases, dictionaries are used to maintain indexes that support efficient
searching and retrieval of data records.
o When duplicates exist (e.g., multiple employees with the same name),
sequential access ensures proper organization and retrieval of data.
2. Full-Text Search Engines
o Search engines use inverted indexes (a type of dictionary) to map words to
document IDs.
o If a word appears in multiple documents, sequential access helps retrieve all
related documents efficiently.
3. Multi-Valued Caching Systems
o Caching mechanisms in web applications store multiple responses for the
same request (e.g., different versions of a webpage for different users).
o A dictionary with duplicates keeps these versions in order, so the cache can scan
them sequentially and return the appropriate one.
4. Symbol Tables in Compilers
o Compilers use dictionaries to store information about identifiers (variables,
functions, etc.).
o If an identifier is overloaded (i.e., multiple functions with the same name but
different parameters), sequential access helps resolve function calls.
5. Log Management Systems
o Log management tools store multiple log entries under the same category
(e.g., errors, warnings).
o Sequential access allows analyzing logs in the order they were recorded.
6. Multimap in Graph Representations
o Graphs can be represented using adjacency lists stored in dictionaries where
multiple edges exist for the same node.
o Sequential access helps traverse connections in an ordered manner.
7. Auto-Suggestion and Predictive Text
o Dictionaries with duplicate entries store possible word completions for a given
prefix.
o Sequential access allows smooth navigation through suggestions.
2. Explain how open hashing and closed hashing are done, with examples.
Open Hashing (Separate Chaining)
Concept
 In Open Hashing, also known as Separate Chaining, each index of the hash table
contains a linked list (or another structure) to store multiple elements that hash to the
same index.
 When a collision occurs (i.e., multiple keys map to the same index), the new key is
stored in a linked list at that index.
Example of Open Hashing
Given Data
Let’s consider a hash table of size 7 and use the hash function:
hash(key)=key mod 7
We insert the following keys: 50, 700, 76, 85, 92, 73, 101
Step-by-Step Insertion
1. 50 mod 7 = 1 → Insert 50 at index 1
2. 700 mod 7 = 0 → Insert 700 at index 0
3. 76 mod 7 = 6 → Insert 76 at index 6
4. 85 mod 7 = 1 → Collision! Append 85 to the linked list at index 1
5. 92 mod 7 = 1 → Collision! Append 92 to the linked list at index 1
6. 73 mod 7 = 3 → Insert 73 at index 3
7. 101 mod 7 = 3 → Collision! Append 101 to the linked list at index 3
Final Hash Table (After Insertions)
Index Elements (Linked List)
0 700
1 50 → 85 → 92
2 —
3 73 → 101
4 —
5 —
6 76
Searching in Open Hashing
To search for 92:
 Compute 92 mod 7 = 1 → Go to index 1
 Traverse 50 → 85 → 92 → Found ✅
To search for 100:
 Compute 100 mod 7 = 2 → Go to index 2
 No elements → Not found ❌
Deleting in Open Hashing
To delete 85:
 Compute 85 mod 7 = 1 → Go to index 1
 Traverse 50 → 85 → 92
 Remove 85, update list to 50 → 92
Final Hash Table (After Deleting 85)
Index Elements (Linked List)
0 700
1 50 → 92
2 —
3 73 → 101
4 —
5 —
6 76
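
The open hashing steps above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the class name ChainedHashTable is hypothetical, and Python lists stand in for the linked lists at each index.

```python
class ChainedHashTable:
    """Separate chaining with Python lists standing in for linked lists."""

    def __init__(self, size=7):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _hash(self, key):
        return key % self.size

    def insert(self, key):
        chain = self.buckets[self._hash(key)]
        if key not in chain:      # keep keys unique
            chain.append(key)     # a collision simply extends the chain

    def search(self, key):
        return key in self.buckets[self._hash(key)]

    def delete(self, key):
        chain = self.buckets[self._hash(key)]
        if key in chain:
            chain.remove(key)
            return True
        return False


table = ChainedHashTable(7)
for k in [50, 700, 76, 85, 92, 73, 101]:
    table.insert(k)
print(table.buckets)     # [[700], [50, 85, 92], [], [73, 101], [], [], [76]]
table.delete(85)
print(table.buckets[1])  # [50, 92]
```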

Closed Hashing (Open Addressing)


Concept
 In Closed Hashing, also known as Open Addressing, all elements are stored inside
the hash table itself.
 If a collision occurs, probing is used to find the next available slot within the table.
 Common probing techniques include:
1. Linear Probing → Search for the next available slot sequentially.
2. Quadratic Probing → Use a quadratic function to find a new position.
3. Double Hashing → Use a second hash function to resolve collisions.
Example of Closed Hashing (Using Linear Probing)
Given Data
We take a hash table of size 7 and use the hash function:
hash(key)=key mod 7
We insert the keys: 50, 700, 76, 85, 92, 73, 101
Step-by-Step Insertion (Using Linear Probing)
1. 50 mod 7 = 1 → Insert 50 at index 1
2. 700 mod 7 = 0 → Insert 700 at index 0
3. 76 mod 7 = 6 → Insert 76 at index 6
4. 85 mod 7 = 1 → Collision! Search next available slot → Insert 85 at index 2
5. 92 mod 7 = 1 → Collision! Search next available slot → Insert 92 at index 3
6. 73 mod 7 = 3 → Collision! Search next available slot → Insert 73 at index 4
7. 101 mod 7 = 3 → Collision! Search next available slot → Insert 101 at index 5
Final Hash Table (After Insertions)
Index Key
0 700
1 50
2 85
3 92
4 73
5 101
6 76
Searching in Closed Hashing
To search for 92:
 Compute 92 mod 7 = 1
 Check index 1 → Found 50 (not a match)
 Check index 2 → Found 85 (not a match)
 Check index 3 → Found 92 ✅
To search for 100:
• Compute 100 mod 7 = 2
• Check index 2 → Found 85 (not a match)
• Check index 3 → Found 92 (not a match)
• Check index 4 → Found 73 (not a match)
• Check index 5 → Found 101 (not a match)
• Check index 6 → Found 76 (not a match)
• Wrap around to index 0 → Found 700 (not a match)
• Check index 1 → Found 50 (not a match)
• All 7 slots have been probed (the table is full) → 100 not found ❌
Deleting in Closed Hashing
To delete 85:
 Compute 85 mod 7 = 1
 Check index 1 → Found 50 (not a match)
 Check index 2 → Found 85 ✅
 Mark index 2 as deleted (⛔)
Final Hash Table (After Deleting 85)
Index Key
0 700
1 50
2 ⛔ (Deleted)
3 92
4 73
5 101
6 76
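
The closed hashing operations above can be sketched in Python. This is a minimal sketch under stated assumptions: the class name LinearProbingTable and the DELETED marker are hypothetical names, and the marker plays the role of the ⛔ entry in the table above.

```python
EMPTY = None
DELETED = object()  # tombstone marker for removed keys


class LinearProbingTable:
    def __init__(self, size=7):
        self.size = size
        self.slots = [EMPTY] * size

    def _probe(self, key):
        # Visit every slot once, starting at the home index and wrapping around.
        start = key % self.size
        for i in range(self.size):
            yield (start + i) % self.size

    def insert(self, key):
        if self.search(key) is not None:
            return True  # already present
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY or self.slots[idx] is DELETED:
                self.slots[idx] = key
                return True
        return False  # table is full

    def search(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return None      # a never-used slot ends the search
            if slot is not DELETED and slot == key:
                return idx
        return None              # every slot was probed

    def delete(self, key):
        idx = self.search(key)
        if idx is None:
            return False
        self.slots[idx] = DELETED  # keep the probe chain intact
        return True


table = LinearProbingTable(7)
for k in [50, 700, 76, 85, 92, 73, 101]:
    table.insert(k)
snapshot = list(table.slots)
print(snapshot)          # [700, 50, 85, 92, 73, 101, 76]
table.delete(85)
print(table.search(92))  # 3
```

Marking the slot as deleted instead of emptying it matters here: if index 2 were reset to empty, the search for 92 would stop at index 2 and wrongly report that 92 is missing.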
Comparison: Open Hashing vs. Closed Hashing
Feature | Open Hashing (Separate Chaining) | Closed Hashing (Open Addressing)
Collision Handling | Uses linked lists at each index | Uses probing to find next available slot
Memory Usage | Extra memory for linked lists | Uses only the hash table itself
Deletion Complexity | Easy (remove from list) | Hard (requires special marking)
Search Time Complexity | O(1) for small chains, O(n) for long chains | O(1) in best case, O(n) in worst case

3. How do you represent Hash Table? Explain.


Representation of a Hash Table
A hash table is a data structure that stores key-value pairs and uses a hash function to
compute an index (or position) into an array of buckets or slots, from which the desired value
can be found. The hash table offers constant time complexity O(1) for operations like
insertions, deletions, and lookups on average, although in the case of collisions, this
performance might degrade.

1. Hash Table Structure


A hash table consists of two primary components:
 Array of Buckets: This is the underlying structure that stores data. Each element of
the array (also called a bucket) is used to store key-value pairs.
 Hash Function: This function takes a key and computes an index (or slot) where the
key-value pair should be stored in the table.
The table is usually represented as an array of a fixed size, where each index can store one or
more entries. When multiple keys hash to the same index (a collision), techniques like
chaining or open addressing (such as linear probing or double hashing) are used to resolve
the collision.
2. Hash Table Representation with Chaining
In chaining, each index of the hash table contains a linked list (or other data structure like a
tree) of elements that hash to the same index. Each linked list node stores a key-value pair.
How It Works:
1. When a key is inserted into the hash table, the hash function is applied to compute the
index.
2. If no other element exists at that index, the key-value pair is placed there.
3. If other key-value pairs exist at that index (collision), the new key-value pair is added
to the linked list at that index.
Example:
Hash Table Size: 7
Hash Function: h(key) = key mod 7
Keys to Insert: 10, 20, 30, 40, 50, 60
Step-by-Step Insertion:
• Insert 10: h(10) = 10 mod 7 = 3 → Place (10, value) at index 3.
• Insert 20: h(20) = 20 mod 7 = 6 → Place (20, value) at index 6.
• Insert 30: h(30) = 30 mod 7 = 2 → Place (30, value) at index 2.
• Insert 40: h(40) = 40 mod 7 = 5 → Place (40, value) at index 5.
• Insert 50: h(50) = 50 mod 7 = 1 → Place (50, value) at index 1.
• Insert 60: h(60) = 60 mod 7 = 4 → Place (60, value) at index 4.
Final Hash Table with Chaining:
Index Key-Value Pairs
0
1 (50, value)
2 (30, value)
3 (10, value)
4 (60, value)
5 (40, value)
6 (20, value)
Each index holds a linked list of key-value pairs. In this example no two keys share an
index, so every chain holds at most one pair; a key such as 17 (17 mod 7 = 3) would be
appended to the chain at index 3, after (10, value).
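
The chained representation with key-value pairs can be sketched in Python. This is a minimal sketch, assuming integer keys; the class name HashMap, the "value-N" strings, and the extra key 17 (added only to force a collision at index 3) are illustrative and not part of the original example.

```python
class HashMap:
    """Hypothetical key-value table: an array of buckets plus a hash function."""

    def __init__(self, size=7):
        self.size = size
        self.buckets = [[] for _ in range(size)]  # each bucket is a chain

    def _index(self, key):
        return key % self.size  # h(key) = key mod 7 for the default size

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default


table = HashMap(7)
for key in [10, 20, 30, 40, 50, 60]:
    table.put(key, f"value-{key}")
table.put(17, "value-17")  # 17 mod 7 = 3, so it chains after 10
print(table.buckets[3])    # [(10, 'value-10'), (17, 'value-17')]
```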

3. Hash Table Representation with Open Addressing


In open addressing, when a collision occurs, the algorithm searches for the next available
slot within the hash table itself (rather than using linked lists). This is achieved through
probing techniques like linear probing, quadratic probing, or double hashing.
Example with Linear Probing:
Hash Table Size: 7
Hash Function: h(key) = key mod 7
Keys to Insert: 10, 20, 30, 40, 50, 60
Step-by-Step Insertion:
• Insert 10: h(10) = 10 mod 7 = 3 → Place 10 at index 3.
• Insert 20: h(20) = 20 mod 7 = 6 → Place 20 at index 6.
• Insert 30: h(30) = 30 mod 7 = 2 → Place 30 at index 2.
• Insert 40: h(40) = 40 mod 7 = 5 → Place 40 at index 5.
• Insert 50: h(50) = 50 mod 7 = 1 → Place 50 at index 1.
• Insert 60: h(60) = 60 mod 7 = 4 → Place 60 at index 4.
Final Hash Table with Open Addressing (Linear Probing):
Index Key
0
1 50
2 30
3 10
4 60
5 40
6 20

4. Types of Hash Tables


There are two common ways to represent hash tables based on the collision resolution
technique:
1. Chained Hash Table: Each index holds a linked list (or another data structure like a
tree) to store multiple key-value pairs that hash to the same index. This method is
suitable when the number of collisions is expected to be high.
2. Open Addressing Hash Table: In this method, the hash table itself stores all
elements. If a collision occurs, the algorithm searches for another available slot within
the table itself. The probing technique used (linear probing, quadratic probing, or
double hashing) affects the performance of the hash table.

5. Operations in a Hash Table

• Insertion: A key-value pair is inserted using the hash function to determine the index.
If there is a collision, the collision resolution method (like chaining or probing) is
used.
• Search: To find a value, the key is hashed, and the corresponding index is checked. If
there’s a collision, the method searches for the key using the resolution technique.
• Deletion: The key is hashed to find the index, and the key-value pair is removed from
the hash table. With chaining the pair is unlinked from its list; with open addressing
the slot is marked as deleted so that later searches can still probe past it.

Advantages of Hash Tables


1. Fast lookups: Average time complexity for search, insert, and delete operations is
O(1).
2. Efficient space usage: Memory grows linearly with the number of stored keys and
values, O(n), although some slots are normally left empty to keep collisions rare.
3. Scalable: Hash tables can grow dynamically, typically by allocating a larger array and
rehashing the existing keys into it.
Disadvantages of Hash Tables
1. Collision Handling: Collisions can degrade performance, especially if the hash
function isn’t well-designed.
2. Memory Overhead: Hash tables can sometimes require extra space if there are too
many empty slots.
3. Non-ordered data: Hash tables do not maintain any specific order of the keys.

Conclusion
A hash table is a highly efficient data structure for storing and retrieving key-value pairs. It
uses a hash function to map keys to indices in an array, and there are different techniques for
handling collisions, such as chaining and open addressing. Hash tables offer constant time
complexity for operations in the average case, making them ideal for many real-world
applications like databases, caches, and sets.

4. Explain double hashing with an example.

Double Hashing is a collision resolution technique in open addressing, where a second
hash function is used to determine the next available slot in case of a collision. This technique
reduces clustering and distributes keys more evenly across the hash table.

1. How Double Hashing Works

Formula for Double Hashing

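If a collision occurs at index h1(key), the next slot is determined using:

Index = (h1(key) + i × h2(key)) mod Table Size

Where:

• h1(key) = First hash function
• h2(key) = Second hash function
• i = Probe number (0, 1, 2, ...)

The second hash function h2(key) must never return 0; otherwise every probe lands on the
same slot and the search never advances. Its value should also be relatively prime to the
table size so that the probe sequence can reach every slot.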
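
The formula can be sketched in Python as a probe-sequence generator. This is a minimal sketch: the function name double_hash_probes is hypothetical, and h1(key) = key mod 7 and h2(key) = 1 + (key mod 5) are the hash functions used in the example that follows.

```python
def double_hash_probes(key, size=7):
    """Yield the slots visited for key: (h1 + i * h2) mod size, i = 0, 1, ..."""
    h1 = key % size
    h2 = 1 + (key % 5)  # always in 1..5, so never 0
    for i in range(size):
        yield (h1 + i * h2) % size


print(list(double_hash_probes(92)))  # [1, 4, 0, 3, 6, 2, 5]
```

Because the table size 7 is prime and h2 always lies between 1 and 5, each key's sequence visits all seven slots exactly once.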

2. Example of Double Hashing

Given Data

• Hash Table Size = 7
• Keys to Insert = {50, 700, 76, 85, 92, 73, 101}
• Primary Hash Function (h1): h1(key) = key mod 7
• Secondary Hash Function (h2): h2(key) = 1 + (key mod 5)

Step-by-Step Insertion

Step 1: Insert 50

• h1(50) = 50 mod 7 = 1
• No collision → Insert 50 at index 1

Step 2: Insert 700


 h1(700)=700mod 7=0
 No collision → Insert 700 at index 0

Step 3: Insert 76

 h1(76)=76mod 7=6
 No collision → Insert 76 at index 6

Step 4: Insert 85

 h1(85)=85 mod 7=1 (Collision with 50)


 Compute second hash: h2(85)=1+(85mod 5)=1+0=1
 Probe: (1+1×1)mod 7=2
 No collision → Insert 85 at index 2

Step 5: Insert 92

 h1(92)=92mod 7=1 (Collision with 50)


 Compute second hash: h2(92)=1+(92mod 5)=1+2=3
 Probe: (1+1×3)mod 7=4
 No collision → Insert 92 at index 4

Step 6: Insert 73

 h1(73)=73mod 7=3
 No collision → Insert 73 at index 3

Step 7: Insert 101

 h1(101)=101mod 7=3 (Collision with 73)


 Compute second hash: h2(101)=1+(101mod 5)=1+1=2
 Probe 1: (3+1×2)mod 7=5
 No collision → Insert 101 at index 5

Final Hash Table After Insertions

Index Key
0 700
1 50
2 85
3 73
4 92
5 101
6 76
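
The insertion steps above can be reproduced with a short Python sketch. It is only an illustration: the function names h1, h2, and insert_double_hashing are hypothetical, and empty slots are represented by None.

```python
def h1(key, size):
    return key % size


def h2(key):
    return 1 + (key % 5)


def insert_double_hashing(table, key):
    """Place key in table (a list with None for empty slots); return its index."""
    size = len(table)
    for i in range(size):
        idx = (h1(key, size) + i * h2(key)) % size
        if table[idx] is None:
            table[idx] = key
            return idx
    raise RuntimeError("no free slot found")


table = [None] * 7
placed = [insert_double_hashing(table, k) for k in [50, 700, 76, 85, 92, 73, 101]]
print(placed)  # [1, 0, 6, 2, 4, 3, 5]
print(table)   # [700, 50, 85, 73, 92, 101, 76]
```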

3. Searching in Double Hashing

To search for 92:


1. Compute h1(92)=1 → Index 1 contains 50 (not a match).
2. Compute h2(92)= 3, check next index:
o (1+1×3)mod 7=4 → Found 92 ✅

To search for 100:

1. Compute h1(100) = 100 mod 7 = 2
2. Index 2 contains 85 (not a match).
3. Compute h2(100) = 1 + (100 mod 5) = 1 + 0 = 1
4. Probe (2 + 1×1) mod 7 = 3 → Contains 73 (not a match).
5. Next probe (2 + 2×1) mod 7 = 4 → Contains 92 (not a match).
6. Continue probing: indices 5, 6, 0, and 1 contain 101, 76, 700, and 50 (no match). All 7
slots have been probed, so 100 is not found ❌

4. Advantages of Double Hashing

✅ Reduces clustering (better than linear probing)


✅ More uniform distribution of keys
✅ Efficient space utilization

5. Disadvantages of Double Hashing

❌ Slightly slower than linear probing due to extra hash function calculations
❌ If the table is almost full, probing might take longer

Conclusion

Double hashing is an effective collision resolution technique that minimizes clustering by
using a second hash function to determine alternative positions. It provides better
distribution of keys compared to linear probing and quadratic probing, making it a good
choice for large datasets.
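
To make the comparison with linear probing concrete, the sketch below counts how many slots are examined while inserting the example keys under each scheme. It is a small illustration on one data set, not a benchmark; the function name count_probes and its step parameter are hypothetical.

```python
def count_probes(keys, size, step):
    """Insert keys with open addressing; step(key) is the probe increment.

    Returns the total number of slots examined across all insertions.
    """
    table = [None] * size
    total = 0
    for key in keys:
        home = key % size
        for i in range(size):
            total += 1
            idx = (home + i * step(key)) % size
            if table[idx] is None:
                table[idx] = key
                break
    return total


keys = [50, 700, 76, 85, 92, 73, 101]
linear = count_probes(keys, 7, lambda key: 1)            # linear probing
double = count_probes(keys, 7, lambda key: 1 + key % 5)  # double hashing
print(linear, double)  # 13 10
```

For these seven keys, double hashing examines 10 slots in total against 13 for linear probing, because keys that collide at the same index follow different probe sequences.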

5. What do you mean by collision and how can you handle it by using linear probing?

Collision in Hashing
A collision occurs when two or more keys hash to the same index in a hash table. In open
addressing each slot holds only one element, so when two keys generate the same hash value
(i.e., the same index), they cannot both be stored there and the collision must be resolved.
For example, given a hash table of size 7 and the hash function key mod 7, the keys 13 and 20
both hash to index 6 (13 mod 7 = 6 and 20 mod 7 = 6), so a collision occurs.

How to Handle Collisions?


There are different ways to handle collisions. One common method is linear probing, which
is a form of open addressing. In linear probing, when a collision occurs, the algorithm
checks the next index in the table (sequentially) until it finds an empty slot. This method
does not prevent collisions; it resolves them by storing each key in the first free slot
after its home index, so every key stays inside the table.

Linear Probing in Hashing


In linear probing, when a collision occurs, we search for the next available position in the
hash table. The index to check is computed as:
Next index=(hash(key)+i)mod table size
Where:
 i is the probe number (starting from 0 and increasing by 1 for each collision).
 hash(key) is the index calculated by the hash function.
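
The formula can be sketched in Python as a generator of probe positions. This is a minimal sketch, assuming hash(key) = key mod table size as in the examples of this unit; the function name linear_probes is hypothetical.

```python
def linear_probes(key, size=7):
    """Yield (hash(key) + i) mod size for i = 0, 1, ..., size - 1."""
    home = key % size
    for i in range(size):
        yield (home + i) % size


print(list(linear_probes(92)))  # [1, 2, 3, 4, 5, 6, 0]
print(list(linear_probes(76)))  # [6, 0, 1, 2, 3, 4, 5]
```

The second call shows the wrap-around: a key whose home index is 6 continues at index 0.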
Linear Probing Example
Given Data
 Hash Table Size = 7
 Hash Function (h1): h1(key)=key mod 7
 Keys to Insert: {50, 700, 76, 85, 92, 73, 101}

Step-by-Step Insertion (Using Linear Probing)


1. Insert 50
o h1(50)=50mod 7=1
o No collision → Insert 50 at index 1
2. Insert 700
o h1(700)=700mod 7=0
o No collision → Insert 700 at index 0
3. Insert 76
o h1(76)=76mod 7=6
o No collision → Insert 76 at index 6
4. Insert 85
o h1(85) = 85 mod 7 = 1 (Collision with 50 at index 1)
o Check next index: (1+1) mod 7 = 2
o No collision → Insert 85 at index 2
5. Insert 92
o h1(92) = 92 mod 7 = 1 (Collision with 50 at index 1)
o Check next index: (1+1) mod 7 = 2 (Collision with 85 at index 2)
o Check next index: (2+1) mod 7 = 3
o No collision → Insert 92 at index 3
6. Insert 73
o h1(73) = 73 mod 7 = 3 (Collision with 92 at index 3)
o Check next index: (3+1) mod 7 = 4
o No collision → Insert 73 at index 4
7. Insert 101
o h1(101) = 101 mod 7 = 3 (Collision with 92 at index 3)
o Check next index: (3+1) mod 7 = 4 (Collision with 73 at index 4)
o Check next index: (4+1) mod 7 = 5
o No collision → Insert 101 at index 5

Final Hash Table (After Insertions)

Index Key
0 700
1 50
2 85
3 92
4 73
5 101
6 76

Searching in Linear Probing


To search for a key, say 92:
1. Compute h1(92)=92mod 7=1.
2. Index 1 contains 50 (not a match).
3. Compute next index using linear probing:
o (1+1)mod 7=2→ Index 2 contains 85 (not a match).
o (2+1)mod 7=3 → Found 92 ✅
To search for a non-existent key like 100:
1. Compute h1(100)=100mod 7=2.
2. Index 2 contains 85 (not a match).
3. (2+1)mod 7=3→ Index 3 contains 92 (not a match).
4. (3+1)mod 7=4→ Index 4 contains 73 (not a match).
5. (4+1)mod 7=5→ Index 5 contains 101 (not a match).
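6. (5+1) mod 7 = 6 → Index 6 contains 76 (not a match).
7. Wrap around: (6+1) mod 7 = 0 → Index 0 contains 700 (not a match).
8. (0+1) mod 7 = 1 → Index 1 contains 50 (not a match).
9. All 7 slots have been probed without reaching an empty slot (the table is full), so 100
is not found ❌.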

Advantages of Linear Probing


✅ Simple to implement and efficient if the table is not too full.
✅ No extra space required for linked lists or other data structures.
✅ Good cache performance since elements are stored close together in memory.
Disadvantages of Linear Probing
❌ Clustering: As more elements are inserted, consecutive empty slots are filled up, causing
long probe sequences. This is called primary clustering.
❌ Performance deteriorates as the table becomes more populated, requiring more probes to
find an empty slot.

Conclusion
In linear probing, when a collision occurs, the algorithm searches the next sequential index
in the hash table. It provides a simple way to resolve collisions but may suffer from
clustering, which can degrade performance if the table is almost full. Proper resizing and
load factor management can help mitigate this issue.
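
Resizing driven by the load factor can be sketched in Python. This is a minimal sketch under stated assumptions: the class name ResizingTable, the 0.7 threshold, and the growth rule (2 × size + 1) are illustrative choices, not values prescribed by these notes.

```python
class ResizingTable:
    """Linear probing table that grows when the load factor passes a limit."""

    def __init__(self, size=7, max_load=0.7):
        self.slots = [None] * size
        self.count = 0
        self.max_load = max_load  # assumed threshold; tune for the workload

    def load_factor(self):
        return self.count / len(self.slots)

    def _place(self, slots, key):
        size = len(slots)
        idx = key % size
        while slots[idx] is not None:
            if slots[idx] == key:
                return False          # already stored
            idx = (idx + 1) % size    # linear probing with wrap-around
        slots[idx] = key
        return True

    def insert(self, key):
        # Grow first if this insertion would push the load factor too high.
        if (self.count + 1) / len(self.slots) > self.max_load:
            self._resize(2 * len(self.slots) + 1)
        if self._place(self.slots, key):
            self.count += 1

    def _resize(self, new_size):
        new_slots = [None] * new_size
        for key in self.slots:
            if key is not None:
                self._place(new_slots, key)  # rehash with the new size
        self.slots = new_slots


keys = [50, 700, 76, 85, 92, 73, 101]
table = ResizingTable()
for k in keys:
    table.insert(k)
print(len(table.slots), table.count)  # 15 7
```

Keeping the load factor below the threshold guarantees that an empty slot always exists, so the probing loop in _place always terminates.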

6. Explain linear probing with an example.

Linear Probing in Hashing

Linear Probing is a method used to handle collisions in open addressing (a type of collision
resolution technique in hash tables). When a collision occurs, linear probing checks the next
consecutive position (in a linear manner) in the hash table until it finds an empty slot.
In linear probing, when a key hashes to an index, if that index is already occupied, the
algorithm searches for the next index by increasing the index by 1, wrapping around to the
beginning of the table if necessary.

Formula for Linear Probing

To compute the next index when a collision occurs:

Next index=(hash(key)+i)mod table size

Where:

 i is the probe number, starting from 0 and incremented by 1 for each subsequent
probe.
 hash(key) is the index computed by the hash function.

Example of Linear Probing

Given Data

 Hash Table Size = 7


 Hash Function (h1): h1(key)=key mod 7
 Keys to Insert: {50, 700, 76, 85, 92, 73, 101}

Step-by-Step Insertion Using Linear Probing

Step 1: Insert 50

 Compute h1(50)=50mod 7=1


 No collision → Insert 50 at index 1

Step 2: Insert 700

 Compute h1(700)=700mod 7=0


 No collision → Insert 700 at index 0

Step 3: Insert 76

 Compute h1(76)=76mod 7=6


 No collision → Insert 76 at index 6

Step 4: Insert 85

 Compute h1(85)=85mod 7=1 (Collision with 50 at index 1)


 Probe: Check next index:
o (1+1)mod 7=2
o No collision → Insert 85 at index 2
Step 5: Insert 92

 Compute h1(92)=92mod 7=1 (Collision with 50 at index 1)


 Probe: Check next index:
o (1+1)mod 7=2 (Collision with 85 at index 2)
o Probe: Check next index:
 (2+1)mod 7=3
 No collision → Insert 92 at index 3

Step 6: Insert 73

 Compute h1(73)=73mod 7=3 (Collision with 92 at index 3)


 Probe: Check next index:
o (3+1)mod 7=4
o No collision → Insert 73 at index 4

Step 7: Insert 101

 Compute h1(101)=101mod 7=3 (Collision with 92 at index 3)


 Probe: Check next index:
o (3+1)mod 7=4 (Collision with 73 at index 4)
o Probe: Check next index:
 (4+1)mod 7=5
 No collision → Insert 101 at index 5

Final Hash Table (After Insertions)

Index Key
0 700
1 50
2 85
3 92
4 73
5 101
6 76

Searching in Linear Probing

Search for 92:

1. Compute h1(92)=92mod 7=1.


o Index 1 contains 50 (not a match).
2. Check next index:
o (1+1)mod 7=2→ Index 2 contains 85 (not a match).
3. Check next index:
o (2+1)mod 7=3 → Found 92 ✅
Search for 100:

1. Compute h1(100)=100mod 7=2.


2. Index 2 contains 85 (not a match).
3. Check next index:
o (2+1)mod 7=3 → Index 3 contains 92 (not a match).
4. Check next index:
o (3+1)mod 7=4 → Index 4 contains 73 (not a match).
5. Check next index:
o (4+1)mod 7=5 → Index 5 contains 101 (not a match).
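6. Check next index:
o (5+1) mod 7 = 6 → Index 6 contains 76 (not a match).
7. Wrap around to the start of the table:
o (6+1) mod 7 = 0 → Index 0 contains 700 (not a match).
o (0+1) mod 7 = 1 → Index 1 contains 50 (not a match).
8. All 7 slots have been probed without reaching an empty slot (the table is full), so 100
is not found ❌.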
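
The two searches can be checked with a short Python sketch that also reports the number of probes. It is only an illustration; the function name linear_search is hypothetical, and the list is the final table from the example. Since that table is full, an unsuccessful search has no empty slot to stop at, so the loop is capped at the table size.

```python
def linear_search(table, key):
    """Return (index, probes) for key, or (None, probes) if it is absent."""
    size = len(table)
    home = key % size
    for i in range(size):
        idx = (home + i) % size
        if table[idx] is None:
            return None, i + 1   # an empty slot proves the key is absent
        if table[idx] == key:
            return idx, i + 1
    return None, size            # full table: every slot was checked


table = [700, 50, 85, 92, 73, 101, 76]  # final table from the example
print(linear_search(table, 92))   # (3, 3)
print(linear_search(table, 100))  # (None, 7)
```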

Advantages of Linear Probing

1. Simple and easy to implement.


2. Efficient in terms of space as it uses the array directly, unlike methods that require
extra memory (like linked lists in separate chaining).
3. Good cache performance because the elements are stored close together in memory.

Disadvantages of Linear Probing

1. Clustering: As more elements are inserted, primary clustering occurs, where
consecutive slots become occupied, leading to longer probe sequences.
2. As the table gets full, probing becomes more costly and the performance degrades.

Conclusion

Linear probing is a simple and effective collision resolution strategy in hash tables. When a
collision occurs, it sequentially checks the next index in the array for an empty slot. Though
simple and memory-efficient, linear probing can suffer from clustering, especially when the
hash table is nearly full, which can slow down the insertion and search operations.
