Hashing
Objective
•To learn the Hash Table as a dictionary data structure
•To understand the difference between direct address table and
hash table
•To understand the concept of constant time searching
•To learn about various hash functions
•To identify collision as an inevitable event in hashing
•To learn various collision resolution techniques
•To be able to differentiate between open addressing and
chained hashing
Dictionary data structure
•Dictionary:
•Dynamic-set data structure for storing items indexed using keys.
•Supports operations Insert, Search, and Delete.
•Applications:
• Symbol table of a compiler.
• Memory-management tables in operating systems.
• Large-scale distributed systems.
•Hash Tables:
•Effective way of implementing dictionaries.
•Generalization of ordinary arrays.
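The three dictionary operations can be sketched with Python's built-in dict, which is itself implemented as a hash table. The symbol-table contents below are hypothetical and only serve as an illustration.

```python
symbols = {}                      # hypothetical compiler symbol table

symbols["count"] = "int"          # Insert
symbols["ratio"] = "float"        # Insert

kind = symbols.get("count")       # Search: returns None if the key is absent
missing = symbols.get("total")    # Search for a key that was never inserted

del symbols["ratio"]              # Delete
print(kind, missing, symbols)
```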
Direct Address Table (DAT)
• Suppose each element in the data
structure has a key drawn from {0, 1, …, 9},
and no two elements have the same key
• An array T[0…9] can be used
• Each key in the universe U={0, 1, …,9}
corresponds to an index in the table
• The set K = {2, 3, 5, 8} of actual keys
determines the slots in the table that
contain pointers to elements
• Other slots, heavily shaded, contain NIL
• This structure is called a direct-address
table, and Search, Insert, and Delete all run
in O(1), since they simply correspond to
accessing the elements in the array
• The same logic applies equally to
any other universe of keys
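The direct-address table described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the class name, the method names, and the stored payload strings are assumptions, and None stands in for NIL.

```python
class DirectAddressTable:
    """Direct-address table over the key universe {0, 1, ..., universe_size - 1}."""

    def __init__(self, universe_size):
        # One slot per possible key; None plays the role of NIL.
        self.slots = [None] * universe_size

    def insert(self, key, value):
        self.slots[key] = value      # O(1): plain array write

    def search(self, key):
        return self.slots[key]       # O(1): plain array read; None if absent

    def delete(self, key):
        self.slots[key] = None       # O(1): reset the slot to NIL


table = DirectAddressTable(10)                 # T[0…9]
for key in (2, 3, 5, 8):                       # the set K of actual keys
    table.insert(key, f"element-{key}")        # hypothetical payloads

occupied = [k for k, v in enumerate(table.slots) if v is not None]
print(occupied)
```

Each operation is a single array access, which is why all three run in O(1).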
Problem with DAT
Direct-address tables are impractical
when:
i. The number of possible keys is
large – the storage space required is
proportionally large
ii. The number of possible keys far exceeds the number of
keys that are actually stored –
this results in a large number of empty
slots
A hash table is a better option
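The space argument can be made concrete with a small sketch. Everything specific here is an assumption chosen for illustration: the universe size, the stored keys, the bucket count m, the hash function h(k) = k mod m, and the use of chaining to handle collisions.

```python
UNIVERSE_SIZE = 10**6                    # assumed |U|, for illustration only
keys = [2, 300_003, 500_005, 999_998]    # hypothetical stored keys

# A direct-address table needs one slot per possible key.
dat_slots = UNIVERSE_SIZE
empty_fraction = (dat_slots - len(keys)) / dat_slots

# A hash table can be sized to the number of stored keys instead.
m = 8                                    # assumed number of buckets
buckets = [[] for _ in range(m)]
for k in keys:
    buckets[k % m].append(k)             # h(k) = k mod m (assumed hash function)


def search(k):
    """Return True if k is stored; only the bucket h(k) is scanned."""
    return k in buckets[k % m]


print(f"DAT slots: {dat_slots}, empty fraction: {empty_fraction:.4%}")
print(f"hash table buckets: {m}")
```

With these assumed numbers, almost every slot of the direct-address table stays empty, while the hash table uses a number of buckets proportional to the keys actually stored.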
Hash table – an introduction
Quadratic Probing
Q.5 Given the following input (4322, 1334, 1471, 9679, 1989, 6171, 6173, 4199) and the
hash function x mod 10, which of the following statements are true? (GATE CS 2004)
i. 9679, 1989, 4199 hash to the same value
ii. 1471, 6171 hash to the same value
iii. All elements hash to the same value
iv. Each element hashes to a different value
(A) i only (B) ii only (C) i and ii only (D) iii or iv
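To check the statements in Q.5, the keys can be grouped by their hash value. The sketch below uses only the hash function stated in the question; the variable names are illustrative.

```python
from collections import defaultdict

keys = [4322, 1334, 1471, 9679, 1989, 6171, 6173, 4199]

groups = defaultdict(list)
for x in keys:
    groups[x % 10].append(x)     # hash function from the question: x mod 10

# Print each occupied slot with the keys that hash to it.
for slot in sorted(groups):
    print(slot, groups[slot])
```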
Q.6 Which one of the following hash functions on integers will distribute keys most
uniformly over 10 buckets numbered 0 to 9 for i ranging from 0 to 2020?
Q.7 A hash function h defined h(key)=key mod 7, with linear probing, is used to
insert the keys 44, 45, 79, 55, 91, 18, 63 into a table indexed from 0 to 6. What will
be the location of key 18?
(A) 3 (B) 4 (C) 5 (D) 6
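The insertion sequence in Q.7 can be traced with a minimal linear-probing sketch. The function name and the use of None for an empty slot are assumptions; the hash function and key order come from the question.

```python
def insert_linear_probing(table, key):
    """Insert key with h(key) = key mod len(table); return the slot used."""
    m = len(table)
    home = key % m
    for i in range(m):
        slot = (home + i) % m            # probe home, home+1, ... wrapping around
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table = [None] * 7                       # slots 0..6
placement = {key: insert_linear_probing(table, key)
             for key in (44, 45, 79, 55, 91, 18, 63)}
print(table)
print(placement[18])
```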
Q.8 An advantage of chained hash table (external hashing) over the open
addressing scheme is
(A) Worst case complexity of search operations is less
(B) Space used is less
(C) Deletion is easier
(D) None of the above
Q.9 Assume ord(A) = 1, ord(B) = 2, …, ord(E) = 5, etc. Insert the characters of the
string K R P C S N Y T J M into a hash table of size 10. Use the hash function
h(x) = ( ord(x) – ord("A") + 1 ) mod 10
If linear probing is used to resolve collisions, which of the following insertions causes a
collision?
(A) Y (B) C (C) M (D) P
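The insertions in Q.9 can be traced with the sketch below. It assumes the hash function reduces to ord(x) mod 10 under the question's numbering (ord(A) = 1, so the subtracted term cancels the added 1); the helper names are hypothetical.

```python
def ord_letter(ch):
    """Letter position as defined in the question: A -> 1, B -> 2, ..."""
    return ord(ch.upper()) - ord("A") + 1


def insert_all(chars, m=10):
    """Insert chars with linear probing; return the table and the colliding chars."""
    table = [None] * m
    collided = []                        # characters whose home slot was taken
    for ch in chars:
        home = ord_letter(ch) % m        # assumed reading of h(x)
        if table[home] is not None:
            collided.append(ch)
        slot = home
        while table[slot] is not None:   # linear probing to the next free slot
            slot = (slot + 1) % m
        table[slot] = ch
    return table, collided


table, collided = insert_all("KRPCSNYTJM")
print(table)
print(collided)
```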
Q.10 A hash table with ten buckets, with one slot per bucket, is shown in Table 2. The
symbols S1 to S7 were initially entered using a hashing function with linear probing. The
maximum number of comparisons needed in searching for an item that is not present is
Index  Symbol
0      S7
1      S1
2      (empty)
3      S4
4      S2
5      (empty)
6      S5
7      (empty)
8      S6
9      S3
Table 2
(A) 4 (B) 5 (C) 6 (D) 3
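The unsuccessful-search cost in Q.10 can be computed by trying every possible home slot. The sketch assumes that the probe of the empty slot that ends the search counts as a comparison; the function name is hypothetical.

```python
# Table 2, with None marking an empty bucket.
table = ["S7", "S1", None, "S4", "S2", None, "S5", None, "S6", "S3"]


def comparisons_for_miss(table, home):
    """Slots examined when searching for an absent key that hashes to `home`.

    The probe of the empty slot that terminates the search is counted
    (a convention assumed here).
    """
    m = len(table)
    count = 0
    slot = home
    while count < m:
        count += 1
        if table[slot] is None:
            break
        slot = (slot + 1) % m            # linear probing with wrap-around
    return count


worst = max(comparisons_for_miss(table, h) for h in range(len(table)))
print(worst)
```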
Answers:
1–C
2–B
3–C
4–B
5–C
6–B
7–C
8–C
9–C
10–B
Any Questions?