
UNIT-II

DICTIONARIES:
A dictionary is a collection of key-value pairs in which every value is associated
with a corresponding key.
Basic operations that can be performed on a dictionary are:
1. Insertion of a value into the dictionary
2. Deletion of a particular value from the dictionary
3. Searching for a specific value with the help of its key
4. Displaying all key-value pairs in the dictionary

Linear List Representation


The dictionary can be represented as a linear list, i.e., a collection of key-value pairs.
There are two methods of representing a linear list:
1. Sorted array - an array data structure is used to implement the dictionary.
2. Sorted chain - a linked-list data structure is used to implement the dictionary.

Structure of linear list for dictionary:

struct node
{
    int key;
    int value;
    struct node *next;
} *head;

void dictionary();
void insert_d();
void delete_d();
void display_d();
void search();

Insertion of new node in the dictionary:


Consider that initially the dictionary is empty, so head = NULL.
We create a new node with some key and value contained in it.
As head is NULL, this new node becomes head, and the dictionary contains only one record.
This node acts as both 'curr' and 'prev'. The 'curr' pointer always points to the node
currently being visited, and 'prev' always points to the node just before 'curr'. As there
is only one node in the list, mark the 'curr' node as the 'prev' node as well.

New/head/curr/prev

1 10 NULL

Insert a record, key=4 and value=20,

New

4 20 NULL

Compare the key value of the 'curr' node and the 'New' node. If New->key > curr->key, then
attach the New node to the 'curr' node (curr->next = New) and advance prev (prev = curr):

prev/head        New

1 10    4 20 NULL

Add a new node <7,80> then

head/prev curr New


1 10 4 20 7 80 NULL

If we insert <3,15> then we have to search for its proper position by comparing

key values. Here (curr->key < New->key) is false, hence the else part gets executed.

1 10 4 20
7 80 NULL

3 15
void insert_d()
{
    int k;
    int data;
    node *p, *curr, *prev;
    printf("Enter a key and value to be inserted: ");
    scanf("%d", &k);
    scanf("%d", &data);
    p = new node;
    p->key = k;
    p->value = data;
    p->next = NULL;
    if(head == NULL)
    {
        head = p;
    }
    else
    {
        curr = head;
        prev = NULL;
        while((curr->key < p->key) && (curr->next != NULL))
        {
            prev = curr;
            curr = curr->next;
        }
        if((curr->next == NULL) && (curr->key < p->key))
        {
            /* new key is the largest: append after the last node */
            curr->next = p;
        }
        else if(prev == NULL)
        {
            /* new key is the smallest: insert before the head node */
            p->next = head;
            head = p;
        }
        else
        {
            /* insert between prev and curr */
            p->next = prev->next;
            prev->next = p;
        }
    }
    printf("\nInserted into dictionary successfully\n");
}

The delete operation:

Case 1: Initially assign the 'head' node as the 'curr' node. Then ask for the key value of the
node which is to be deleted. Starting from the head node, the key value of each node is checked
and compared with the desired node's key value. The node to be deleted is obtained in the
variable 'curr', and the variable 'prev' keeps track of the node previous to the 'curr' node.
For example, to delete the node with key value 4:

                        curr

1 10    3 15    4 20    7 80 NULL

Case 2:

If the node to be deleted is the head node, i.e. if(curr == head),

then simply make the next node the 'head' node and delete 'curr'.

curr/head

1 10    3 15    4 20    7 80 NULL

Hence the list becomes

head

3 15    4 20    7 80 NULL
void delete_d()
{
    node *curr, *prev;
    int k;
    printf("Enter key value that you want to delete: ");
    scanf("%d", &k);
    if(head == NULL)
    {
        printf("\nDictionary is underflow\n");
        return;
    }
    curr = head;
    prev = NULL;
    while(curr != NULL)
    {
        if(curr->key == k)
            break;
        prev = curr;
        curr = curr->next;
    }
    if(curr == NULL)
        printf("Node not found...\n");
    else
    {
        if(curr == head)
            head = curr->next;
        else
            prev->next = curr->next;
        delete curr;
        printf("Item deleted from dictionary...\n");
    }
}

The search operation:

void search()
{
    node *curr;
    int k;
    printf("Enter key : ");
    scanf("%d", &k);
    curr = head;
    if(curr == NULL)
    {
        printf("The list is empty\n");
        return;
    }
    while(curr != NULL)
    {
        if(curr->key == k)
        {
            printf("\n%d : %d", curr->key, curr->value);
            break;
        }
        curr = curr->next;
    }
    if(curr == NULL)
        printf("\nNOT found");
}

The display operation:

void display()
{
    node *curr;
    curr = head;
    if(curr == NULL)
    {
        printf("The list is empty\n");
    }
    while(curr != NULL)
    {
        printf("%d : %d\t", curr->key, curr->value);
        curr = curr->next;
    }
}
Skip list representation

A skip list is a probabilistic data structure. It stores a sorted list of elements using a hierarchy of linked lists
and allows the elements to be processed efficiently. In a single step it can skip several elements of the entire
list, which is why it is known as a skip list.

The skip list is an extended version of the linked list. It allows the user to search, remove, and insert
elements very quickly. It consists of a base list that includes the complete set of elements, over which the
higher levels maintain a link hierarchy to subsets of those elements.

Skip list structure

It is built in two kinds of layers: the lowest layer and the top layers.

The lowest layer of the skip list is an ordinary sorted linked list, and the top layers of the skip list act like an
"express line" where elements are skipped.

Complexity table of the Skip list

S. No   Complexity            Average case   Worst case

1.      Access complexity     O(log n)       O(n)
2.      Search complexity     O(log n)       O(n)
3.      Delete complexity     O(log n)       O(n)
4.      Insert complexity     O(log n)       O(n)
5.      Space complexity      -              O(n log n)

Working of the Skip list

Let's take an example to understand the working of the skip list. In this example, we have 14 nodes, such
that these nodes are divided into two layers, as shown in the diagram.

The lower layer is a common line that links all nodes, and the top layer is an express line that links only the
main nodes, as you can see in the diagram.
Suppose you want to find 47 in this example. You start the search from the first node of the express
line and keep moving along the express line until you find a node whose key is equal to or greater than 47.

You can see in the example that 47 does not exist in the express line, so you stop at the last node with a key
less than 47, which is 40. From 40, you drop down to the normal line and search for 47 there, as shown in the
diagram.

Skip List Basic Operations

There are the following types of operations in the skip list.

Insertion operation: It is used to add a new node at its appropriate position in the skip list.

Deletion operation: It is used to delete a given node from the skip list.

Search operation: It is used to search for a particular node in the skip list.

Algorithm of the insertion operation

Insertion (L, Key, Value)

local update[0...Max_Level + 1]
a = L → header
for i = L → level down to 0 do
    while a → forward[i] → key < Key do
        a = a → forward[i]
    update[i] = a
a = a → forward[0]
lvl = random_Level()
if lvl > L → level then
    for i = L → level + 1 to lvl do
        update[i] = L → header
    L → level = lvl
a = makeNode(lvl, Key, Value)
for i = 0 to lvl do
    a → forward[i] = update[i] → forward[i]
    update[i] → forward[i] = a
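
To make the pseudocode concrete, here is a minimal C sketch of the node structure, the coin-flip level generator, and the insertion routine it assumes. The names skip_node, skip_list, SKIPLIST_MAX_LEVEL and random_level are illustrative choices, not part of the algorithm, and the header node is assumed to hold a key smaller than any real key.

#include <stdlib.h>

#define SKIPLIST_MAX_LEVEL 16       /* illustrative cap on the number of levels */

struct skip_node {
    int key;
    int value;
    struct skip_node *forward[SKIPLIST_MAX_LEVEL];   /* one link per level */
};

struct skip_list {
    int level;                      /* highest level currently in use */
    struct skip_node *header;       /* sentinel whose key is below every real key */
};

/* Coin-flip generator: each additional level is kept with probability 1/2. */
static int random_level(void) {
    int lvl = 0;
    while ((rand() & 1) && lvl < SKIPLIST_MAX_LEVEL - 1)
        lvl++;
    return lvl;
}

/* Insert (key, value), following the pseudocode above. */
void skip_insert(struct skip_list *L, int key, int value) {
    struct skip_node *update[SKIPLIST_MAX_LEVEL];
    struct skip_node *a = L->header;
    int i, lvl;

    for (i = L->level; i >= 0; i--) {
        while (a->forward[i] != NULL && a->forward[i]->key < key)
            a = a->forward[i];
        update[i] = a;              /* rightmost node before key on level i */
    }

    lvl = random_level();
    if (lvl > L->level) {
        for (i = L->level + 1; i <= lvl; i++)
            update[i] = L->header;
        L->level = lvl;
    }

    a = calloc(1, sizeof *a);       /* forward pointers start out as NULL */
    a->key = key;
    a->value = value;
    for (i = 0; i <= lvl; i++) {    /* splice the node into levels 0..lvl */
        a->forward[i] = update[i]->forward[i];
        update[i]->forward[i] = a;
    }
}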

Algorithm of deletion operation

Deletion (L, Key)

local update[0...Max_Level + 1]
a = L → header
for i = L → level down to 0 do
    while a → forward[i] → key < Key do
        a = a → forward[i]
    update[i] = a
a = a → forward[0]
if a → key = Key then
    for i = 0 to L → level do
        if update[i] → forward[i] ≠ a then break
        update[i] → forward[i] = a → forward[i]
    free(a)
    while L → level > 0 and L → header → forward[L → level] = NIL do
        L → level = L → level - 1
Algorithm of searching operation

Searching (L, SKey)

a = L → header
for i = L → level down to 0 do
    while a → forward[i] → key < SKey do
        a = a → forward[i]
a = a → forward[0]
if a → key = SKey then return a → value
else return failure
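
Continuing the same illustrative C structures from the insertion sketch, the search walks along the top level as far as possible, drops down a level whenever the next key would overshoot, and finally checks the candidate on the bottom level:

/* Returns a pointer to the stored value for key, or NULL if key is absent. */
int *skip_search(struct skip_list *L, int key) {
    struct skip_node *a = L->header;
    int i;
    for (i = L->level; i >= 0; i--)
        while (a->forward[i] != NULL && a->forward[i]->key < key)
            a = a->forward[i];
    a = a->forward[0];              /* candidate node on the bottom level */
    if (a != NULL && a->key == key)
        return &a->value;
    return NULL;
}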

Example 1: Create a skip list by inserting the following keys into an empty skip list.

1. 6 with level 1.
2. 29 with level 1.
3. 22 with level 4.
4. 9 with level 3.
5. 17 with level 1.
6. 4 with level 2.

Ans:

Step 1: Insert 6 with level 1

Step 2: Insert 29 with level 1

Step 3: Insert 22 with level 4


Step 4: Insert 9 with level 3

Step 5: Insert 17 with level 1

Step 6: Insert 4 with level 2


Example 2: Consider this example where we want to search for key 17.

Ans:

Advantages of the Skip list


1. If you want to insert a new node in the skip list, then it will insert the node very fast because there
are no rotations in the skip list.
2. The skip list is simple to implement as compared to the hash table and the binary search tree.
3. It is very simple to find a node in the list because it stores the nodes in sorted form.
4. The skip list algorithm can be extended very easily into more specialized structures, such as indexable
skip lists, trees, or priority queues.
5. The skip list is a robust and reliable list.

Disadvantages of the Skip list

1. It requires more memory than the balanced tree.


2. Reverse searching is not allowed.
3. In the worst case, searching in a skip list can be as slow as searching an ordinary linked list, i.e., O(n).

Applications of the Skip list

1. It is used in distributed applications, where the nodes of the skip list represent pointers to nodes of the
distributed system.
2. It is used to implement a dynamic elastic concurrent queue with low lock contention.
3. It is also used with the QMap template class.
4. The indexing of the skip list is used in running-median problems.
5. The skip list is used for the delta-encoded posting lists in the Lucene search engine.

Hash Table

A hash table is one of the most important data structures. It uses a special function, known as a hash
function, that maps a given key to an index so that the elements can be accessed faster.

A hash table is a data structure that stores information having basically two main components, i.e., key and
value. The hash table can be implemented with the help of an associative array. The efficiency of mapping
depends upon the efficiency of the hash function used for mapping.

For example, suppose the key is the name John and the value is his phone number; when we pass the key to
the hash function, it returns the index at which the phone number is stored.
Drawback of Hash function

Ideally, a hash function assigns each key a unique index. In practice, the hash table often uses an imperfect
hash function that causes a collision, because the hash function generates the same index for two different keys.

Hashing

Hashing is a searching technique that works in constant time; its average time complexity is O(1). So far, we
have studied two searching techniques, i.e., linear search and binary search. The worst-case time complexity
is O(n) in linear search and O(log n) in binary search. In both searching techniques, the search time depends
on the number of elements, but we want a technique that takes constant time. Hashing provides this.

In Hashing technique, the hash table and hash function are used. Using the hash function, we can calculate
the address at which the value can be stored.

The main idea behind the hashing is to create the (key/value) pairs. If the key is given, then the algorithm
computes the index at which the value would be stored. It can be written as:

Index = hash(key)

There are three ways of calculating the hash function:

o Division method
o Folding method
o Mid square method

In the division method, the hash function can be defined as:


h(ki) = ki % m; where m is the size of the hash table.

For example, if the key value is 6 and the size of the hash table is 10. When we apply the hash function to
key 6 then the index would be:

h(6) = 6%10 = 6

The index is 6 at which the value is stored.
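
As a small illustration, the division method is a one-line function in C; the table size m = 10 is simply the value used in this example, and the name hash_division is an illustrative choice.

/* Division-method hash: maps a key into the range 0 .. m-1. */
unsigned int hash_division(unsigned int key, unsigned int m) {
    return key % m;     /* for key = 6 and m = 10 this returns index 6 */
}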

Collision

When two different keys hash to the same index, the clash between the two entries is known as a collision.
In the above example, the value with key 6 is stored at index 6. If the key value is 26, then the
index would be:

h(26) = 26%10 = 6

Therefore, two keys map to the same index, i.e., 6, and this leads to the collision problem. To
resolve such collisions, we have techniques known as collision resolution techniques.

The following are the collision resolution techniques:


o Open Hashing: It is also known as closed addressing.
o Closed Hashing: It is also known as open addressing.

Open Hashing

In Open Hashing, one of the methods used to resolve the collision is known as a chaining method.
Let's first understand the chaining to resolve the collision.

Suppose we have a list of key values

A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and h(k) = 2k+3

In this case, we cannot directly use the division method h(k) = k % m, because the hash function is given as h(k) = 2k+3 (applied modulo m)


o The index of key value 3 is:
index = h(3) = (2(3)+3)%10 = 9
The value 3 would be stored at the index 9.
o The index of key value 2 is:
index = h(2) = (2(2)+3)%10 = 7
The value 2 would be stored at the index 7.
o The index of key value 9 is:
index = h(9) = (2(9)+3)%10 = 1
The value 9 would be stored at the index 1.
o The index of key value 6 is:
index = h(6) = (2(6)+3)%10 = 5
The value 6 would be stored at the index 5.
o The index of key value 11 is:
index = h(11) = (2(11)+3)%10 = 5

The value 11 would be stored at the index 5. Now, we have two values (6, 11) stored at the same index, i.e.,
5. This leads to the collision problem, so we will use the chaining method to avoid the collision. We will
create one more list and add the value 11 to this list. After the creation of the new list, the newly created list
will be linked to the list having value 6.

o The index of key value 13 is:

index = h(13) = (2(13)+3)%10 = 9

The value 13 would be stored at index 9. Now, we have two values (3, 13) stored at the same index, i.e., 9.
This leads to the collision problem, so we will use the chaining method to avoid the collision. We will
create one more list and add the value 13 to this list. After the creation of the new list, the newly created list
will be linked to the list having value 3.

o The index of key value 7 is:

index = h(7) = (2(7)+3)%10 = 7

The value 7 would be stored at index 7. Now, we have two values (2, 7) stored at the same index, i.e., 7.
This leads to the collision problem, so we will use the chaining method to avoid the collision. We will
create one more list and add the value 7 to this list. After the creation of the new list, the newly created list
will be linked to the list having value 2.

o The index of key value 12 is:

index = h(12) = (2(12)+3)%10 = 7

According to the above calculation, the value 12 must be stored at index 7, but the value 2 exists at index 7.
So, we will create a new list and add 12 to the list. The newly created list will be linked to the list having a
value 7.

The calculated index value associated with each key value is shown in the table below:

key     Location (u)

3       ((2*3)+3)%10 = 9
2       ((2*2)+3)%10 = 7
9       ((2*9)+3)%10 = 1
6       ((2*6)+3)%10 = 5
11      ((2*11)+3)%10 = 5
13      ((2*13)+3)%10 = 9
7       ((2*7)+3)%10 = 7
12      ((2*12)+3)%10 = 7
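
A minimal C sketch of separate chaining with this example's hash function h(k) = (2k + 3) % 10 is shown below. The names chain_node, table and chain_insert are illustrative, and for simplicity each colliding key is linked in at the front of its slot's list.

#include <stdlib.h>

#define M 10                            /* table size used in the example */

struct chain_node {
    int key;
    struct chain_node *next;
};

static struct chain_node *table[M];     /* one list head per slot, initially NULL */

static int h(int k) { return (2 * k + 3) % M; }

/* A collision is resolved by linking the new key into the list at its slot. */
void chain_insert(int key) {
    int idx = h(key);
    struct chain_node *n = malloc(sizeof *n);
    n->key = key;
    n->next = table[idx];
    table[idx] = n;
}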

Closed Hashing

In Closed hashing, three techniques are used to resolve the collision:

1. Linear probing
2. Quadratic probing
3. Double Hashing technique

Linear Probing

Linear probing is one of the forms of open addressing. Each cell in the hash table contains a key-value pair,
so when a collision occurs because a new key maps to a cell already occupied by another key, the linear
probing technique searches for the closest free location and places the new key in that empty cell. The
search is performed sequentially, starting from the position where the collision occurs, until an empty cell
is found.

Let's understand the linear probing through an example.

Consider the above example for the linear probing:

A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and h(k) = 2k+3


The key values 3, 2, 9, 6 are stored at the indexes 9, 7, 1, 5 respectively. The calculated index value of 11 is
5 which is already occupied by another key value, i.e., 6. When linear probing is applied, the nearest empty
cell to the index 5 is 6; therefore, the value 11 will be added at the index 6.

The next key value is 13. The index value associated with this key value is 9 when hash function is applied.
The cell is already filled at index 9. When linear probing is applied, the nearest empty cell to the index 9 is
0; therefore, the value 13 will be added at the index 0.

The next key value is 7. The index value associated with the key value is 7 when hash function is applied.
The cell is already filled at index 7. When linear probing is applied, the nearest empty cell to the index 7 is
8; therefore, the value 7 will be added at the index 8.

The next key value is 12. The index value associated with the key value is 7 when hash function is applied.
The cell is already filled at index 7. When linear probing is applied, the nearest empty cell to the index 7 is
2; therefore, the value 12 will be added at the index 2.
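
The walk-through above can be captured in a short C sketch. The names table, EMPTY, table_init and linear_insert are illustrative; the probe simply tries (u + i) % m for i = 0, 1, 2, ... until a free cell turns up.

#define M 10
#define EMPTY (-1)                      /* marker for an unused cell */

static int table[M];

void table_init(void) {                 /* mark every cell as free */
    for (int i = 0; i < M; i++)
        table[i] = EMPTY;
}

static int h(int k) { return (2 * k + 3) % M; }

/* Probe u, u+1, u+2, ... (mod M) until a free cell is found.
   Returns the index used, or -1 when the table is full. */
int linear_insert(int key) {
    int u = h(key);
    for (int i = 0; i < M; i++) {
        int idx = (u + i) % M;
        if (table[idx] == EMPTY) {
            table[idx] = key;
            return idx;
        }
    }
    return -1;
}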

Quadratic Probing

In linear probing, the search for a free cell is performed linearly. In contrast, quadratic probing is an open
addressing technique that uses a quadratic polynomial to search until an empty slot is found.

It can also be defined as follows: insert ki at the first free location from (u + i^2) % m,

where i = 0 to m-1.

Let's understand the quadratic probing through an example.

Consider the same example which we discussed in the linear probing.

A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and h(k) = 2k+3

The key values 3, 2, 9, 6 are stored at the indexes 9, 7, 1, 5, respectively. We do not need to apply the
quadratic probing technique on these key values as there is no occurrence of the collision.

The index value of 11 is 5, but this location is already occupied by the 6. So, we apply the quadratic
probing technique.
When i = 0
Index = (5 + 0^2) % 10 = 5
When i = 1
Index = (5 + 1^2) % 10 = 6

Since location 6 is empty, so the value 11 will be added at the index 6.

The next element is 13. When the hash function is applied on 13, then the index value comes out to be 9,
which we already discussed in the chaining method. At index 9, the cell is occupied by another value, i.e.,
3. So, we will apply the quadratic probing technique to calculate the free location.

When i = 0
Index = (9 + 0^2) % 10 = 9
When i = 1
Index = (9 + 1^2) % 10 = 0

Since location 0 is empty, so the value 13 will be added at the index 0.

The next element is 7. When the hash function is applied on 7, then the index value comes out to be 7,
which we already discussed in the chaining method. At index 7, the cell is occupied by another value, i.e.,
7. So, we will apply the quadratic probing technique to calculate the free location.

When i = 0
Index = (7 + 0^2) % 10 = 7
When i = 1
Index = (7 + 1^2) % 10 = 8
Since location 8 is empty, so the value 7 will be added at the index 8.
The next element is 12. When the hash function is applied on 12, then the index value comes out to be 7.
When we observe the hash table then we will get to know that the cell at index 7 is already occupied by the
value 2. So, we apply the Quadratic probing technique on 12 to determine the free location.
When i = 0
Index = (7 + 0^2) % 10 = 7
When i = 1
Index = (7 + 1^2) % 10 = 8
When i = 2
Index = (7 + 2^2) % 10 = 1
When i = 3
Index = (7 + 3^2) % 10 = 6
When i = 4
Index = (7 + 4^2) % 10 = 3
Since the location 3 is empty, so the value 12 would be stored at the index 3.
The final hash table would be:

Therefore, the order of the elements is 13, 9, _, 12, _, 6, 11, 2, 7, 3.
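
Reusing the illustrative table, EMPTY and h definitions from the linear-probing sketch, quadratic probing only changes the probing step from i to i^2:

/* Probe (u + i*i) % M for i = 0, 1, 2, ...; returns the index used, or -1. */
int quadratic_insert(int key) {
    int u = h(key);
    for (int i = 0; i < M; i++) {
        int idx = (u + i * i) % M;
        if (table[idx] == EMPTY) {
            table[idx] = key;
            return idx;
        }
    }
    return -1;      /* no free slot reached within M probes */
}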

Double Hashing

Double hashing is an open addressing technique which is used to avoid the collisions. When the collision
occurs then this technique uses the secondary hash of the key. It uses one hash value as an index to move
forward until the empty location is found.

In double hashing, two hash functions are used. Suppose h1(k) is one of the hash functions used to calculate
the locations whereas h2(k) is another hash function. It can be defined as "insert ki at first free place
from (u+v*i)%m where i=(0 to m-1)". In this case, u is the location computed using the hash function and v
is equal to (h2(k)%m).

Consider the same example that we use in quadratic probing.

A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and

h1(k) = 2k+3

h2(k) = 3k+1
key     Location (u)            v                      Probes

3       ((2*3)+3)%10 = 9        -                      1
2       ((2*2)+3)%10 = 7        -                      1
9       ((2*9)+3)%10 = 1        -                      1
6       ((2*6)+3)%10 = 5        -                      1
11      ((2*11)+3)%10 = 5       (3(11)+1)%10 = 4       3
13      ((2*13)+3)%10 = 9       (3(13)+1)%10 = 0       insertion fails
7       ((2*7)+3)%10 = 7        (3(7)+1)%10 = 2        insertion fails
12      ((2*12)+3)%10 = 7       (3(12)+1)%10 = 7       2

As we know that no collision would occur while inserting the keys (3, 2, 9, 6), so we will not apply double
hashing on these key values.

On inserting the key 11 into the hash table, a collision occurs because the calculated index value of 11 is 5,
which is already occupied by another value. Therefore, we will apply the double hashing technique
on key 11. When the key value is 11, the value of v is 4.

Now, substituting the values of u and v in (u+v*i)%m

When i=0
Index = (5+4*0)%10 =5
When i=1
Index = (5+4*1)%10 = 9
When i=2
Index = (5+4*2)%10 = 3

Since the location 3 is empty in a hash table; therefore, the key 11 is added at the index 3.

The next element is 13. The calculated index value of 13 is 9, which is already occupied by another
key value. So, we will use the double hashing technique to find a free location. The value of v is 0.

Now, substituting the values of u and v in (u+v*i)%m

When i=0
Index = (9+0*0)%10 = 9

We will get index 9 in every iteration from 0 to m-1 because the value of v is zero. Therefore, we cannot insert
13 into the hash table.

The next element is 7. The calculated index value of 7 is 7, which is already occupied by another key
value. So, we will use the double hashing technique to find a free location. The value of v is 2.

Now, substituting the values of u and v in (u+v*i)%m

When i=0
Index = (7 + 2*0)%10 = 7
When i=1
Index = (7+2*1)%10 = 9
When i=2
Index = (7+2*2)%10 = 1
When i=3
Index = (7+2*3)%10 = 3
When i=4
Index = (7+2*4)%10 = 5
When i=5
Index = (7+2*5)%10 = 7
When i=6
Index = (7+2*6)%10 = 9
When i=7
Index = (7+2*7)%10 = 1
When i=8
Index = (7+2*8)%10 = 3
When i=9
Index = (7+2*9)%10 = 5

Since we checked all the cases of i (from 0 to 9) and did not find a suitable place, key 7 cannot be inserted
into the hash table.

The next element is 12. The calculated index value of 12 is 7, which is already occupied by another
key value. So, we will use the double hashing technique to find a free location. The value of v is 7.
Now, substituting the values of u and v in (u+v*i)%m
When i=0
Index = (7+7*0)%10 = 7
When i=1
Index = (7+7*1)%10 = 4

Since the location 4 is empty; therefore, the key 12 is inserted at the index 4.

The final hash table would be:

The order of the elements is _, 9, _, 11, 12, 6, _, 2, _, 3.
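
A matching sketch for double hashing, again reusing the illustrative table and EMPTY definitions from the earlier sketches, with this example's functions h1(k) = (2k + 3) % 10 and h2(k) = (3k + 1) % 10. As the walk-through shows, a step v of 0 (key 13) or a probe sequence that only revisits occupied cells (key 7) makes the insertion fail.

static int h1(int k) { return (2 * k + 3) % M; }
static int h2(int k) { return (3 * k + 1) % M; }

/* Probe (u + v*i) % M for i = 0 .. M-1; returns the index used, or -1 on failure. */
int double_insert(int key) {
    int u = h1(key), v = h2(key);
    for (int i = 0; i < M; i++) {
        int idx = (u + v * i) % M;
        if (table[idx] == EMPTY) {
            table[idx] = key;
            return idx;
        }
    }
    return -1;
}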

REHASHING

Rehashing is a technique in which the table is resized, i.e., the size of the table is doubled by
creating a new table. It is preferable that the new table size is a prime number. Rehashing is
required in situations such as:

When the table is completely full.
With quadratic probing, when the table is half filled.
When insertions fail due to overflow.

In such situations, we have to transfer the entries from the old table to the new table by
recomputing their positions using the new hash function.

Consider that we have to insert the elements 37, 90, 55, 22, 17, 49, and 87. The table size is 10
and the hash function is

H(key) = key mod tablesize

37 % 10 = 7
90 % 10 = 0
55 % 10 = 5
22 % 10 = 2
17 % 10 = 7   (collision, resolved by linear probing)
49 % 10 = 9

Now this table is almost full, and if we try to insert more elements, collisions will occur and
eventually further insertions will fail. Hence we rehash by doubling the table size. The old
table size is 10, so doubling it gives 20. But 20 is not a prime number, so we prefer to make
the new table size 23. The new hash function will be

H(key) = key mod 23

37 % 23 = 14
90 % 23 = 21
55 % 23 = 9
22 % 23 = 22
17 % 23 = 17
49 % 23 = 3
87 % 23 = 18

The new table of size 23:

Index   Key
0
1
2
3       49
4
5
6
7
8
9       55
10
11
12
13
14      37
15
16
17      17
18      87
19
20
21      90
22      22
Now the hash table is sufficiently large to accommodate new insertions.
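
The transfer step of rehashing can be sketched as follows: allocate a larger table (23 slots in this example) and re-insert every entry of the old table using the new modulus. The function name rehash, the EMPTY marker and the use of linear probing for re-insertion are assumptions made for illustration.

#include <stdlib.h>

#define EMPTY (-1)

/* Re-insert every key of old_table (old_size slots) into a freshly allocated
   table of new_size slots, using H(key) = key % new_size with linear probing. */
int *rehash(const int *old_table, int old_size, int new_size) {
    int *new_table = malloc(new_size * sizeof *new_table);
    for (int i = 0; i < new_size; i++)
        new_table[i] = EMPTY;
    for (int i = 0; i < old_size; i++) {
        if (old_table[i] == EMPTY)
            continue;
        int u = old_table[i] % new_size;          /* position recomputed */
        for (int j = 0; j < new_size; j++) {
            int idx = (u + j) % new_size;
            if (new_table[idx] == EMPTY) {
                new_table[idx] = old_table[i];
                break;
            }
        }
    }
    return new_table;
}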

Advantages:

1. This technique provides the programmer a flexibility to enlarge the table size if
required.
2. Only the space gets doubled; with a simple hash function, the occurrence of collisions is reduced.

Extendible Hashing
Extendible Hashing is a dynamic hashing method wherein directories, and buckets are used to
hash data. It is an aggressively flexible method in which the hash function also experiences
dynamic changes.
Main features of Extendible Hashing: The main features in this hashing technique are:
 Directories: The directories store addresses of the buckets in pointers. An id is assigned to
each directory which may change each time when Directory Expansion takes place.
 Buckets: The buckets are used to hash the actual data.
Basic Structure of Extendible Hashing:

 Directories: These containers store pointers to buckets. Each directory is given a unique id
which may change each time when expansion takes place. The hash function returns this
directory id which is used to navigate to the appropriate bucket. Number of Directories =
2^Global Depth.
 Buckets: They store the hashed keys. Directories point to buckets. More than one directory
entry may point to the same bucket if its local depth is less than the global depth.
 Global Depth: It is associated with the Directories. They denote the number of bits which
are used by the hash function to categorize the keys. Global Depth = Number of bits in
directory id.
 Local Depth: It is the same as the Global Depth, except that Local Depth is associated with
the buckets and not with the directories. The local depth, in relation to the global depth, is
used to decide the action to be performed when an overflow occurs. Local Depth is always
less than or equal to the Global Depth.
 Bucket Splitting: When the number of elements in a bucket exceeds a particular size, then
the bucket is split into two parts.
 Directory Expansion: Directory Expansion Takes place when a bucket overflows.
Directory Expansion is performed when the local depth of the overflowing bucket is equal
to the global depth.
Basic working
Example based on Extendible Hashing: Now, let us consider a prominent example of
hashing the following elements: 16,4,6,22,24,10,31,7,9,20,26.
Bucket Size: 3 (Assume)
Hash Function: Suppose the global depth is X. Then the Hash Function returns X LSBs.

Solution: First, calculate the binary forms of each of the given numbers.
16- 10000
4- 00100
6- 00110
22- 10110
24- 11000
10- 01010
31- 11111
7- 00111
9- 01001
20- 10100
26- 11010
 Initially, the global depth and local depth are always 1. Thus, the hashing frame looks like
this:

 Inserting 16:
The binary format of 16 is 10000 and global-depth is 1. The hash function returns 1 LSB of
10000 which is 0. Hence, 16 is mapped to the directory with id=0.

 Inserting 4 and 6:
Both 4 (100) and 6 (110) have 0 in their LSB. Hence, they are hashed as follows:

 Inserting 22: The binary form of 22 is 10110. Its LSB is 0. The bucket pointed by
directory 0 is already full. Hence, Over Flow occurs.
 As directed by Step 7-Case 1, since Local Depth = Global Depth, the bucket splits and
directory expansion takes place. Also, rehashing of the numbers present in the overflowing
bucket takes place after the split. And, since the global depth is incremented by 1, the
global depth is now 2. Hence, 16, 4, 6, 22 are now rehashed w.r.t. 2 LSBs
[16(10000), 4(00100), 6(00110), 22(10110)].

 Inserting 24 and 10: 24(11000) and 10 (1010) can be hashed based on directories with id
00 and 10. Here, we encounter no overflow condition.
 Inserting 31,7,9: All of these elements[ 31(11111), 7(111), 9(1001) ] have either 01 or 11
in their LSBs. Hence, they are mapped on the bucket pointed out by 01 and 11. We do not
encounter any overflow condition here.

 Inserting 20: Insertion of data element 20 (10100) will again cause the overflow problem.

 20 is inserted in bucket pointed out by 00. As directed by Step 7-Case 1, since the local
depth of the bucket = global-depth, directory expansion (doubling) takes place along
with bucket splitting. Elements present in overflowing bucket are rehashed with the new
global depth. Now, the new Hash table looks like this:

 Inserting 26: Global depth is 3. Hence, 3 LSBs of 26(11010) are considered. Therefore 26
best fits in the bucket pointed out by directory 010.

 The bucket overflows, and, as directed by Step 7-Case 2, since the local depth of the bucket
< global depth (2 < 3), directories are not doubled; only the bucket is split and its elements
are rehashed.
Finally, the output of hashing the given list of numbers is obtained.
 Hashing of 11 Numbers is Thus Completed.
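
The directory lookup described above boils down to masking off the global-depth least significant bits of the key. A minimal C sketch under assumed names (bucket, directory, dir_index) is shown below; the bucket size of 3 matches the assumption made in this example.

struct bucket {
    int local_depth;
    int count;                  /* number of keys currently stored */
    int keys[3];                /* bucket size 3, as assumed in the example */
};

struct directory {
    int global_depth;
    struct bucket **entries;    /* 2^global_depth pointers, possibly shared */
};

/* The directory index of a key is its global_depth least significant bits. */
static int dir_index(const struct directory *d, int key) {
    return key & ((1 << d->global_depth) - 1);
}

/* Example: with global depth 3, key 26 (binary 11010) maps to 010, i.e. entry 2. */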
