Data Structure Seminar
• REHASHING
• EXTENDIBLE HASHING
Prepared by
Ashaaz Ahmed A
2nd Year CSE-A
REHASHING
• Rehashing in data structures is the process of resizing and
reorganizing a hash table when it becomes too full or inefficient
to handle the current set of keys.
Hash Function
H(key) = key mod tablesize
Example:
Insert the elements 37, 90, 55, 22, 17, 49, 87.
Table size is 10.
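The home slot of each element under H(key) = key mod tablesize can be checked with a short sketch. The function name h and the dictionary home_slots are illustrative names, not part of the slides.

```python
def h(key, table_size):
    """Division-method hash from the slide: H(key) = key mod tablesize."""
    return key % table_size

keys = [37, 90, 55, 22, 17, 49, 87]

# Map every key to the slot it would hash to in a table of size 10.
home_slots = {key: h(key, 10) for key in keys}

for key, slot in home_slots.items():
    print(key, "->", slot)
```

Three of the seven keys (37, 17 and 87) share home slot 7, so collisions already occur at this table size.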
NOTE:
The table is almost full. If we try to insert more elements, collisions
will occur and eventually further insertions will fail. Hence we rehash
by doubling the table size. The old table size is 10, so doubling gives 20.
But 20 is not a prime number, so we prefer the next prime, 23, as the new
table size. All existing keys are then re-inserted using
H(key) = key mod 23.
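The rehashing step can be sketched as follows. This is a minimal illustration, not the only way to do it: linear probing is assumed for collisions because the slides do not name a collision strategy, and the helper names (is_prime, next_prime, insert, rehash) are hypothetical.

```python
def is_prime(n):
    """Trial-division primality check; fine for small table sizes."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def next_prime(n):
    """Smallest prime greater than or equal to n."""
    while not is_prime(n):
        n += 1
    return n

def insert(table, key):
    """Insert with linear probing (an assumption, not stated in the slides)."""
    size = len(table)
    for step in range(size):
        slot = (key + step) % size
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("table is full")

def rehash(old_table):
    """Double the size, round up to a prime, and re-insert every key."""
    new_table = [None] * next_prime(2 * len(old_table))
    for key in old_table:
        if key is not None:
            insert(new_table, key)
    return new_table

old_table = [None] * 10
for key in [37, 90, 55, 22, 17, 49, 87]:
    insert(old_table, key)

new_table = rehash(old_table)
print(len(new_table))
```

In the new table of size 23 each of the seven keys lands in its own home slot (key mod 23), so no probing is needed.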
Advantages:
• It gives the programmer the flexibility to enlarge
the table size when required.
EXTENDIBLE HASHING
• Data is placed in the hash table by extracting a certain number of bits from the key.
• The extendible hashing scheme contains a directory held in main memory and one or more sub-directories (buckets)
stored on disk.
• The hash table size is always 2^d, where d is called the global depth. Each table entry points to one bucket.
• Example binary representations: 10 = 1010, 7 = 0111.
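Bit extraction can be illustrated with a small sketch. It assumes the least significant ("last") bits of the key are used, which matches the worked example in these slides. The names directory_index and bits are illustrative.

```python
def bits(key, width):
    """Binary representation of key, zero-padded to the given width."""
    return format(key, f"0{width}b")

def directory_index(key, global_depth):
    """Directory slot chosen by the last global_depth bits of the key."""
    return key & ((1 << global_depth) - 1)

for key in (10, 7):
    print(key, bits(key, 4), "-> slot", bits(directory_index(key, 2), 2))
```

With global depth d = 2 the directory has 2^2 = 4 entries; key 10 (1010) maps to entry 10 and key 7 (0111) maps to entry 11.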
EXTENDIBLE HASHING Insertion - Working
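Step 1: If the target bucket does not overflow, simply insert the data in the
appropriate bucket based on the extracted bits.
Step 2: If the bucket overflows and its local depth equals the global depth,
double the size of the directory, increment the global depth, and split the bucket.
Step 3: If the bucket overflows and its local depth is less than the global depth,
there is no need to double the size of the directory; only the bucket is split.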
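The insertion steps above can be sketched as a small in-memory class. This is a simplified illustration under stated assumptions: keys are distinct non-negative integers, the last bits of the key select the directory entry, buckets live in memory rather than on disk, and the names Bucket, ExtendibleHash and bucket_capacity are hypothetical.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.keys = []

class ExtendibleHash:
    def __init__(self, bucket_capacity=2, global_depth=1):
        self.capacity = bucket_capacity
        self.global_depth = global_depth
        # One distinct bucket per directory entry to start with.
        self.directory = [Bucket(global_depth) for _ in range(1 << global_depth)]

    def _index(self, key):
        # Last global_depth bits of the key.
        return key & ((1 << self.global_depth) - 1)

    def insert(self, key):
        # Assumes distinct keys; identical keys could never be separated.
        while True:
            bucket = self.directory[self._index(key)]
            if len(bucket.keys) < self.capacity:
                bucket.keys.append(key)          # Step 1: room available
                return
            if bucket.local_depth == self.global_depth:
                # Step 2: double the directory; entry i + 2^d mirrors entry i.
                self.directory = self.directory + self.directory
                self.global_depth += 1
            self._split(bucket)                  # Steps 2 and 3: split bucket

    def _split(self, bucket):
        bucket.local_depth += 1
        image = Bucket(bucket.local_depth)       # the split image
        new_bit = 1 << (bucket.local_depth - 1)
        # Entries whose newly considered bit is 1 now point to the image.
        for i, b in enumerate(self.directory):
            if b is bucket and i & new_bit:
                self.directory[i] = image
        # Rehash the old contents using one more bit.
        old_keys, bucket.keys = bucket.keys, []
        for k in old_keys:
            (image if k & new_bit else bucket).keys.append(k)

table = ExtendibleHash(bucket_capacity=2, global_depth=1)
for key in [0, 2, 4, 1, 3, 5]:
    table.insert(key)
print(table.global_depth, len(table.directory))
```

In this run, inserting 4 triggers Step 2 (the directory doubles from 2 to 4 entries), while inserting 5 triggers Step 3 (the bucket splits without doubling).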
EXAMPLE
Consider the following hash table and insert 14,20,24,18,22
Step 1: Insert 14
The binary representation of 14 is 1110. Its last two bits are 10, so we
insert 14 in Bucket #3. Bucket #3 is not full, so the insertion
of 14 is straightforward.
Step 2: Insert 20
The binary representation of 20 is 10100. Its last two bits are 00, so we
insert it in Bucket #1.
Step 3: Insert 24
The binary representation of 24 is 11000, so it belongs in Bucket #1, but inserting it there causes an
overflow. Since the local depth d' of Bucket #1 equals the global depth d = 2, we need to double the size
of the main directory and increment the global depth to 3. Then rehash Bucket #1 to place its elements.
Note that while rehashing the elements of Bucket #1, the last 3 bits are considered.
Update the local depth of Bucket #1 and of the
bucket that is created as the split image of
Bucket #1.
Step 4: Insert 18
The binary representation of 18 is 10010. We consider the last two bits (10) and
insert 18 in Bucket #3, because the local depth of this bucket is 2.
Step 5: Insert 22
The binary representation of 22 is 10110, which means 22 belongs in Bucket #3. But as Bucket #3 is full,
we need to create a split image of Bucket #3, called Bucket #3'. There is no need to double the size of
the directory, because the global depth (3) > local depth (2). To rehash the values in Bucket #3 and
Bucket #3' we consider the last three bits. The local depth of Bucket #3 = Bucket #3' = 3. Thus 22 gets
inserted appropriately.
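The effect of moving from two bits to three bits in Step 5 can be checked with a short sketch. It uses only the keys that this example inserts into Bucket #3 (14, 18 and 22); any keys the original figure already held in that bucket are not known here. The function name split_by_last_bits is illustrative.

```python
def split_by_last_bits(keys, depth):
    """Group keys by their last `depth` bits, shown as a bit string."""
    groups = {}
    for key in keys:
        suffix = format(key & ((1 << depth) - 1), f"0{depth}b")
        groups.setdefault(suffix, []).append(key)
    return groups

bucket3_keys = [14, 18, 22]
two_bits = split_by_last_bits(bucket3_keys, 2)    # before the split
three_bits = split_by_last_bits(bucket3_keys, 3)  # after the split
print(two_bits)
print(three_bits)
```

With two bits all three keys share the suffix 10; with three bits, 18 (010) separates from 14 and 22 (110), which is what lets the split relieve the overflow.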
Points to Remember:
1. The maximum number of bits the directory needs to tell which bucket an entry
belongs to is called the global depth.
2. The number of bits used to determine whether an entry belongs to a particular bucket is
called the local depth.
3. The directory gets doubled only when a bucket is full and its local depth = global depth
before insertion.
EXTENDIBLE HASHING Deletion - Working
Step 1: The element is removed from its corresponding bucket, and if the bucket
becomes empty it can be merged with its split image bucket.
Step 2: If each directory element points to the same bucket as its split image,
just halve the directory.
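The check in Step 2 of deletion can be sketched as follows. It assumes the directory is a list indexed by the last bits of the key, so the split image of entry i is entry i + (size / 2); buckets are represented by plain lists, and the names can_halve and halve are hypothetical.

```python
def can_halve(directory):
    """True if every entry points to the same bucket as its split image."""
    half = len(directory) // 2
    if half < 1:
        return False
    return all(directory[i] is directory[i + half] for i in range(half))

def halve(directory):
    """Drop the upper half of the directory (global depth decreases by 1)."""
    return directory[: len(directory) // 2]

bucket_a, bucket_b = [4, 8], [5]
directory = [bucket_a, bucket_b, bucket_a, bucket_b]

if can_halve(directory):
    directory = halve(directory)
print(len(directory))
```

Identity comparison (is) is used on purpose: two entries must share the very same bucket, not merely hold equal contents.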
SUMMARY