
II CSE III Sem Advanced Data Structures & Algorithm

Unit-III
Trees Part-II:
Red-Black Trees, Splay Trees, Applications.
Hash Tables: Introduction, Hash Structures, Hash Functions, Linear Open Addressing, Chaining, and Applications.

Red-Black Tree:
A red-black tree is a type of binary search tree. It is self-balancing like the AVL tree, though it uses different properties to maintain the invariant of being balanced. Balanced binary search trees are much more efficient at search than unbalanced binary search trees, so the complexity needed to maintain balance is often worth it. They are called red-black trees because each node in the tree is labeled as red or black.
A red-black tree is a category of self-balancing binary search tree. It was created in 1972 by Rudolf Bayer, who termed these structures "symmetric binary B-trees."
A red-black tree is a binary search tree in which each node has a color, either red or black, as an extra attribute. By constraining the node colors on any simple path from the root to a leaf, red-black trees ensure that no such path is more than twice as long as any other, so that the tree is approximately balanced.

Properties of Red Black Tree:


1. A Red-Black Tree must be a Binary Search Tree.
2. The ROOT node must be colored BLACK.
3. The children of a RED colored node must be colored BLACK. (There must not be two consecutive RED nodes.)
4. Every path from the root to a leaf must contain the same number of BLACK colored nodes.
5. Every new node must be inserted with RED color.
6. Every leaf (i.e., NIL node) must be colored BLACK.
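These properties can be checked mechanically. The following is a minimal Python sketch (not from the notes) that verifies properties 2, 3, 4 and 6 on a toy representation, where each node is a tuple (color, left, right) and None stands for a NIL leaf:

# Minimal red-black property checker (illustrative only).
def black_height(node):
    # Returns the black-height of a valid subtree, or raises ValueError.
    if node is None:
        return 1                                  # NIL leaves count as BLACK
    color, left, right = node
    if color == "RED":
        for child in (left, right):
            if child is not None and child[0] == "RED":
                raise ValueError("two consecutive RED nodes")
    lh, rh = black_height(left), black_height(right)
    if lh != rh:
        raise ValueError("paths with unequal numbers of BLACK nodes")
    return lh + (1 if color == "BLACK" else 0)

def is_red_black(root):
    if root is not None and root[0] != "BLACK":
        return False                              # the ROOT must be BLACK
    try:
        black_height(root)
        return True
    except ValueError:
        return False

# A BLACK root with two RED children satisfies all the properties:
print(is_red_black(("BLACK", ("RED", None, None), ("RED", None, None))))  # True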
Example:


Operations on Red-Black Trees:


The search-tree operations TREE-INSERT and TREE-DELETE, when run on a red-black tree with n keys, take O(log n) time. Because they modify the tree, the result may violate the red-black properties. To restore these properties, we must change the colors of some of the nodes in the tree and also change the pointer structure.
1. Rotation:
Restructuring operations on red-black trees can generally be expressed more clearly in terms of the rotation operation.

Clearly, the in-order sequence (A, x, B, y, C) is preserved by the rotation operation. Therefore, if we start with a BST and only restructure using rotations, we still have a BST, i.e., rotations do not break the BST property.
Algorithm:
LEFT-ROTATE (T, x)
1. y ← right[x]
2. right[x] ← left[y]
3. p[left[y]] ← x
4. p[y] ← p[x]
5. if p[x] = nil[T]
       then root[T] ← y
   else if x = left[p[x]]
       then left[p[x]] ← y
   else right[p[x]] ← y
6. left[y] ← x
7. p[x] ← y
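As a concrete illustration, here is a minimal Python sketch of LEFT-ROTATE; the Node and Tree classes are assumptions for the example, and RIGHT-ROTATE is the mirror image with "left" and "right" exchanged:

class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

class Tree:
    def __init__(self):
        self.root = None

def left_rotate(tree, x):
    # y takes x's place; x becomes y's left child; y's old left subtree
    # (keys between x and y) becomes x's right subtree.
    y = x.right
    x.right = y.left
    if y.left is not None:
        y.left.parent = x
    y.parent = x.parent
    if x.parent is None:
        tree.root = y
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x
    x.parent = y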
Example: Draw the complete binary tree of height 3 on the keys {1, 2, 3... 15}. Add the NIL leaves and
color the nodes in three different ways such that the black heights of the resulting trees are: 2, 3 and 4.
Solution:

Tree with black-height-2


Tree with black-height-3

Tree with black-height-4

2. Insertion into RED-BLACK Tree


In a Red-Black Tree, every new node must be inserted with the color RED. The insertion operation in a Red-Black Tree is similar to the insertion operation in a Binary Search Tree, but the node is inserted with a color property. After every insertion operation, we need to check all the properties of the Red-Black Tree. If all the properties are satisfied, we go to the next operation; otherwise, we perform one of the following operations to make it a Red-Black Tree again (a Python sketch of the fix-up follows the pseudocode below):
 1. Recolor
 2. Rotation
 3. Rotation followed by Recolor
A violation can result from a parent and a child both having a red color. How the violation is resolved is determined by the location of the node with respect to its grandparent and by the color of the parent's sibling (the node's "uncle").
Algorithm:
RB-INSERT (T, z)
1. y ← nil[T]
2. x ← root[T]
3. while x ≠ nil[T]
4.     do y ← x
5.        if key[z] < key[x]
6.            then x ← left[x]
7.            else x ← right[x]
8. p[z] ← y
9. if y = nil[T]
10.    then root[T] ← z
11. else if key[z] < key[y]
12.    then left[y] ← z
13.    else right[y] ← z
14. left[z] ← nil[T]
15. right[z] ← nil[T]
16. color[z] ← RED
17. RB-INSERT-FIXUP (T, z)

RB-INSERT-FIXUP (T, z)
1. while color[p[z]] = RED
2.     do if p[z] = left[p[p[z]]]
3.        then y ← right[p[p[z]]]
4.            if color[y] = RED
5.                then color[p[z]] ← BLACK          //Case 1
6.                    color[y] ← BLACK              //Case 1
7.                    color[p[p[z]]] ← RED          //Case 1
8.                    z ← p[p[z]]                   //Case 1
9.                else if z = right[p[z]]
10.                   then z ← p[z]                 //Case 2
11.                       LEFT-ROTATE (T, z)        //Case 2
12.                   color[p[z]] ← BLACK           //Case 3
13.                   color[p[p[z]]] ← RED          //Case 3
14.                   RIGHT-ROTATE (T, p[p[z]])     //Case 3
15.        else (same as the then clause with "right" and "left" exchanged)
16. color[root[T]] ← BLACK

After inserting a new node, coloring the new node black might violate the black-height condition, whereas coloring it red might violate the coloring conditions (the root is black and a red node has no red children). Black-height violations are hard to fix, so we color the new node red. After this, if there is any color violation, we correct it with the RB-INSERT-FIXUP procedure.
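The fix-up procedure can be sketched in Python as follows. This is a rough sketch, assuming the Node and Tree classes and the left_rotate/right_rotate helpers from the rotation example, with None playing the role of nil[T] and each node carrying a color field:

RED, BLACK = "RED", "BLACK"

def rb_insert_fixup(tree, z):
    # Restore the red-black properties after inserting the RED node z.
    while z.parent is not None and z.parent.color == RED:
        grandparent = z.parent.parent
        if z.parent is grandparent.left:
            y = grandparent.right                  # the uncle of z
            if y is not None and y.color == RED:
                z.parent.color = BLACK             # Case 1: recolor
                y.color = BLACK
                grandparent.color = RED
                z = grandparent
            else:
                if z is z.parent.right:            # Case 2: rotate the parent
                    z = z.parent
                    left_rotate(tree, z)
                z.parent.color = BLACK             # Case 3: rotate grandparent
                z.parent.parent.color = RED
                right_rotate(tree, z.parent.parent)
        else:                                      # mirror image of the above
            y = grandparent.left
            if y is not None and y.color == RED:
                z.parent.color = BLACK
                y.color = BLACK
                grandparent.color = RED
                z = grandparent
            else:
                if z is z.parent.left:
                    z = z.parent
                    right_rotate(tree, z)
                z.parent.color = BLACK
                z.parent.parent.color = RED
                left_rotate(tree, z.parent.parent)
    tree.root.color = BLACK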
Example: Show the red-black trees that result after successively inserting the keys 41,38,31,12,19,8 into
an initially empty red-black tree.

1. Insert 41


Insert 19

Thus the final tree is

3. Deletion:
First, search for the element to be deleted.
o If the element to be deleted is in a node with only a left child, swap this node with the one containing the largest element in the left subtree. (That node has no right child.)
o If the element to be deleted is in a node with only a right child, swap this node with the one containing the smallest element in the right subtree. (That node has no left child.)
o If the element to be deleted is in a node with both a left child and a right child, swap in either of the above two ways. While swapping, swap only the keys, not the colors.


o The node to be deleted now has only a left child or only a right child. Replace this node with its sole child. This may violate the red constraint or the black constraint. Violations of the red constraint can be fixed easily.
o If the deleted node is black, the black constraint is violated. The elimination of a black node y causes any path that contained y to have one fewer black node.
o Two cases arise:
o The replacing node is red, in which case we merely color it black to make up for the loss of one black node.
o The replacing node is black.
The procedure RB-DELETE is a minor modification of the TREE-DELETE procedure. After splicing out a node, it calls an auxiliary procedure RB-DELETE-FIXUP that changes colors and performs rotations to restore the red-black properties.
RB-DELETE (T, z)
1. if left[z] = nil[T] or right[z] = nil[T]
2.     then y ← z
3.     else y ← TREE-SUCCESSOR (z)
4. if left[y] ≠ nil[T]
5.     then x ← left[y]
6.     else x ← right[y]
7. p[x] ← p[y]
8. if p[y] = nil[T]
9.     then root[T] ← x
10. else if y = left[p[y]]
11.    then left[p[y]] ← x
12.    else right[p[y]] ← x
13. if y ≠ z
14.    then key[z] ← key[y]
15.        copy y's satellite data into z
16. if color[y] = BLACK
17.    then RB-DELETE-FIXUP (T, x)
18. return y

RB-DELETE-FIXUP (T, x)
1. while x ≠ root[T] and color[x] = BLACK
2.     do if x = left[p[x]]
3.        then w ← right[p[x]]
4.            if color[w] = RED
5.                then color[w] ← BLACK             //Case 1
6.                    color[p[x]] ← RED             //Case 1
7.                    LEFT-ROTATE (T, p[x])         //Case 1
8.                    w ← right[p[x]]               //Case 1
9.            if color[left[w]] = BLACK and color[right[w]] = BLACK
10.               then color[w] ← RED               //Case 2
11.                   x ← p[x]                      //Case 2
12.               else if color[right[w]] = BLACK
13.                   then color[left[w]] ← BLACK   //Case 3
14.                       color[w] ← RED            //Case 3
15.                       RIGHT-ROTATE (T, w)       //Case 3
16.                       w ← right[p[x]]           //Case 3
17.                   color[w] ← color[p[x]]        //Case 4
18.                   color[p[x]] ← BLACK           //Case 4
19.                   color[right[w]] ← BLACK       //Case 4
20.                   LEFT-ROTATE (T, p[x])         //Case 4
21.                   x ← root[T]                   //Case 4
22.        else (same as the then clause with "right" and "left" exchanged)
23. color[x] ← BLACK


Example: In a previous example, we found the red-black tree that results from successively inserting the keys 41, 38, 31, 12, 19, 8 into an initially empty red-black tree. Now show the red-black trees that result from the successive deletion of the keys in the order 8, 12, 19, 31, 38, 41.

Delete 38

Delete 41
No Tree.


Splay Trees:

 Splay trees are self-balancing or self-adjusting binary search trees. In other words, we can say that splay trees are variants of binary search trees. The prerequisite for splay trees is that we should know about binary search trees.
 Splay trees are not strictly balanced trees, but they are roughly balanced trees.
 A splay tree supports the same operations as a binary search tree, i.e., insertion, deletion and searching, but it also supports one more operation, i.e., splaying.
 Splaying an element is the process of bringing it to the root position by performing suitable rotation operations.
 In a splay tree, splaying an element rearranges the elements in the tree so that the splayed element is placed at the root of the tree.
 By splaying elements we bring more frequently used elements closer to the root of the tree so that any operation on those elements is performed quickly. That means the splaying operation automatically brings more frequently used elements closer to the root of the tree.
Advantages of Splay tree:
o In a splay tree, we do not need to store extra information. In contrast, in AVL trees we need to store the balance factor of each node, which requires extra space, and Red-Black trees need to store one extra bit of information per node that denotes the color of the node, either Red or Black.
o It is among the fastest types of Binary Search tree for various practical applications. It is used in Windows NT and in GCC.
o It provides better performance because the frequently accessed nodes move nearer to the root node, due to which the elements can be accessed quickly in splay trees. It is used in cache implementations, as the recently accessed data is stored in the cache, so we do not need to go to memory to access the data, and it takes less time.
Drawback of Splay tree:
The major drawback of the splay tree is that splay trees are not strictly balanced, i.e., they are only roughly balanced. Sometimes a splay tree degenerates into a linear chain, so an operation can take O(n) time.

In a splay tree, to splay an element we use the following rotation operations.
Rotations in Splay Tree:
1. Zig Rotation
2. Zag Rotation
3. Zig - Zig Rotation
4. Zag - Zag Rotation
5. Zig - Zag Rotation
6. Zag - Zig Rotation

Zig Rotation:
The Zig Rotation in a splay tree is similar to the single right rotation in AVL tree rotations. In a zig rotation, every node moves one position to the right from its current position. Consider the following example.

Zag Rotation:
The Zag Rotation in a splay tree is similar to the single left rotation in AVL tree rotations. In a zag rotation, every node moves one position to the left from its current position. Consider the following example.

Zig-Zig Rotation:
The Zig-Zig Rotation in a splay tree is a double zig rotation. In a zig-zig rotation, every node moves two positions to the right from its current position. Consider the following example.

Zag-Zag Rotation:
The Zag-Zag Rotation in a splay tree is a double zag rotation. In a zag-zag rotation, every node moves two positions to the left from its current position. Consider the following example.

Zig-Zag Rotation:
The Zig-Zag Rotation in a splay tree is a zig rotation followed by a zag rotation. In a zig-zag rotation, every node moves one position to the right followed by one position to the left from its current position. Consider the following example.

Zag-Zig Rotation:
The Zag-Zig Rotation in a splay tree is a zag rotation followed by a zig rotation. In a zag-zig rotation, every node moves one position to the left followed by one position to the right from its current position. Consider the following example.
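The two single rotations can be written compactly in Python; the compound rotations are compositions of them. A rough sketch, using dictionaries as nodes (an assumption for the example):

def zig(node):
    # Single right rotation: the left child moves up into node's position.
    left = node["left"]
    node["left"] = left["right"]
    left["right"] = node
    return left                      # new root of this subtree

def zag(node):
    # Single left rotation: the right child moves up into node's position.
    right = node["right"]
    node["right"] = right["left"]
    right["left"] = node
    return right                     # new root of this subtree

# Zig-zig applies zig twice starting at the grandparent; zig-zag applies a
# zig at the parent followed by a zag at the grandparent (and symmetrically).
root = {"key": 15, "left": {"key": 10, "left": None, "right": None}, "right": None}
root = zig(root)                     # splays 10 to the root
print(root["key"])                   # 10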

Splay tree operations:


 Insertion
 Deletion
 Searching
 Splaying


Insertion operation in Splay tree:

In the insertion operation, we first insert the element in the tree and then perform the splaying operation on the inserted element. As an example, insert the keys 15, 10, 17, 7 into an initially empty splay tree.

Step 1: First, we insert node 15 in the tree. After insertion, we need to perform splaying. As 15 is the root node, we do not need to perform splaying.

Step 2: The next element is 10. As 10 is less than 15, node 10 becomes the left child of node 15, as shown below:

Now, we perform splaying. To make 10 the root node, we perform a right rotation, as shown below:

Step 3: The next element is 17. As 17 is greater than 10 and 15, it becomes the right child of node 15.

Now, we perform splaying. As 17 has a parent as well as a grandparent, we perform a zag-zag rotation (two left rotations).

In the above figure, we can observe that 17 becomes the root node of the tree; therefore, the insertion is completed.

Step 4: The next element is 7. As 7 is less than 10, 15, and 17, node 7 becomes the left child of 10.

Now, we have to splay the tree. As 7 has a parent as well as a grandparent, we perform a zig-zig rotation (two right rotations), as shown below:


Still, node 7 is not the root node; it is the left child of the root node, i.e., 17. So, we need to perform one more right rotation to make node 7 the root node, as shown below:

Algorithm for Insertion operation:


Insert(T, n)
    temp = T_root
    y = NULL
    while (temp != NULL)
        y = temp
        if (n->data < temp->data)
            temp = temp->left
        else
            temp = temp->right
    n->parent = y
    if (y == NULL)
        T_root = n
    else if (n->data < y->data)
        y->left = n
    else
        y->right = n
    Splay(T, n)


Deletion operation in splay tree:


As we know, splay trees are variants of the binary search tree, so the deletion operation in a splay tree is similar to deletion in a BST; the only difference is that in splay trees the delete operation is followed by a splaying operation.
Types of Deletions:
There are two types of deletions in the splay trees:
1. Bottom-up splaying
2. Top-down splaying
Bottom-up splaying:
In bottom-up splaying, first we delete the element from the tree and then we perform splaying on the parent of the deleted node. Let's understand deletion in the splay tree.
Suppose we want to delete 12 and 14 from the tree shown below.
 First, we simply perform the standard BST deletion to delete the element 12. As 12 is a leaf node, we simply remove it from the tree.
 The deletion is still not complete. We need to splay the parent of the deleted node, i.e., 10, so we perform Splay(10) on the tree. As we can observe in the above tree, 10 is to the right of node 7, and node 7 is to the left of node 13. So first we perform a left rotation on node 7 and then a right rotation on node 13, as shown below:

 Still, node 10 is not the root node; node 10 is the left child of the root node. So, we need to perform a right rotation on the root node, i.e., 14, to make node 10 the root node, as shown below:


 Now, we have to delete the element 14 from the tree, as shown below.

As we know, we cannot simply delete an internal node. We replace the value of the node using either its in-order predecessor or its in-order successor. Suppose we use the in-order successor, in which we replace the value with the lowest value that exists in the right subtree. The lowest value in the right subtree of node 14 is 15, so we replace the value 14 with 15. The node that held 15 becomes a leaf node, so we can simply delete it, as shown below:

 Still, the deletion is not complete. We need to perform one more operation, i.e., splaying, in which we make the parent of the deleted node the root node. Before the deletion, the parent of node 14 was the root node, i.e., 10, so we do not need to perform any splaying in this case.

Top-down splaying:
In top-down splaying, we first perform splaying on the node on which the deletion is to be performed and then delete the node from the tree. Once the element is deleted, we perform a join operation. Let's understand top-down splaying through an example.
Suppose we want to delete 16 from the tree shown below:

Step 1: In top-down splaying, first we perform splaying on node 16. Node 16 has both a parent and a grandparent. Node 16 is to the right of its parent, and the parent node is also to the right of its parent, so this is a zag-zag situation. In this case, we first perform a left rotation on node 13 and then on node 14, as shown below:


Step 2: Node 16 is still not the root node; it is the right child of the root node, so we need to perform a left rotation on node 12 to make node 16 the root node.

Step 3: Once node 16 becomes the root node, we delete it and obtain two different trees, i.e., a left subtree and a right subtree, as shown below.
As we know, the values of the left subtree are always less than the values of the right subtree. The root of the left subtree is 12 and the root of the right subtree is 17.

Step 4: The next step is to find the maximum element in the left subtree. In the left subtree, the maximum element is 15, and we need to perform a splaying operation on 15.
As we can observe in the above tree, the element 15 has a parent as well as a grandparent. The node is to the right of its parent, and the parent node is also to the right of its parent, so we need to perform two left rotations to make node 15 the root node, as shown below:

Step 5: After performing the two rotations, node 15 becomes the root node. As we can see, the right child of 15 is NULL, so we attach node 17 as the right child of 15, as shown below; this operation is known as a join operation.


Algorithm of Delete operation:


Delete(root, data)
    if (root == NULL)
        return NULL
    root = Splay(root, data)
    if (data != root->data)
        return root                      // element is not present
    if (root->left == NULL)
        root = root->right               // Case 1
    else
        temp = root
        root = Splay(root->left, data)   // splays the maximum of the left subtree
        root->right = temp->right        // Case 2: join the right subtree
        free(temp)
    return root
In the above algorithm, we first check whether the root is NULL; if the root is NULL, the tree is empty. If the tree is not empty, we perform the splaying operation on the element to be deleted. Once the splaying operation is completed, we compare the root's data with the element to be deleted; if they are not equal, the element is not present in the tree. If they are equal, then the following cases can occur:
Case 1: If the left child of the root is NULL, the right child of the root becomes the root node.
Case 2: If both left and right children exist, we splay the maximum element of the left subtree. When the splaying is completed, the maximum element becomes the root of the left subtree. The right subtree then becomes the right child of this new root, as in the sketch below.
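Putting these cases together, a rough Python sketch of the delete operation (assuming nodes with key/left/right fields and a splay(root, key) helper that splays the node with the given key, or the last node visited, to the root and returns the new root):

def splay_delete(root, key):
    if root is None:
        return None
    root = splay(root, key)
    if root.key != key:
        return root                    # element is not present in the tree
    if root.left is None:
        return root.right              # Case 1: right subtree becomes the root
    # Case 2: key is larger than every key in the left subtree, so splaying
    # the left subtree with it brings the maximum element to its root ...
    new_root = splay(root.left, key)
    new_root.right = root.right        # ... which has no right child: join
    return new_root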
Hash Table:
• A hash table is a data structure that stores information consisting of two main components, i.e., a key and a value.
• A hash table can be implemented with the help of an associative array.
• In a hash table, data is stored in an array format, where each data value has its own unique index value. Access to data becomes very fast if we know the index of the desired data.
• The efficiency of the mapping depends upon the efficiency of the hash function used for mapping.
The hash table data structure stores elements in key-value pairs, where
Key - a unique integer that is used for indexing the values
Value - the data associated with the key.


Hashing:
Hashing is a searching technique that takes constant time: the time complexity of hashing is O(1). Until now, we have read two techniques for searching, i.e., linear search and binary search. The worst-case time complexity is O(n) in linear search and O(log n) in binary search. In both of those techniques, the search time depends on the number of elements, but we want a technique that takes constant time. Hashing is the technique that provides it.
In hashing technique, the hash table and hash function are used. Using the hash function, we can
calculate the address at which the value can be stored.
The main idea behind hashing is to create (key, value) pairs. If the key is given, the algorithm computes the index at which the value is stored. It can be written as:
Hash(key) = index
For example, suppose the key is "Sai" and the value is a phone number. When we pass the key to the hash function, it returns the index:
Hash("Sai") = 3
This example places the entry for Sai at index 3.
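As a toy illustration, the sketch below maps a string key to an index with a character-sum hash; the function and the table size 10 are assumptions for the example, so the resulting index need not be 3:

table_size = 10

def hash_index(key):
    # A simple (collision-prone) hash: sum of character codes mod table size.
    return sum(ord(ch) for ch in key) % table_size

print(hash_index("Sai"))   # the index where Sai's phone number would be stored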
Hash function:
A hash function is any function that can be used to map a data set of an arbitrary size to a
data set of a fixed size, which falls into the hash table. The values returned by a hash function are called
hash values, hash codes, hash sums, or simply hashes.
To achieve a good hashing mechanism, it is important to have a good hash function with the following basic requirements:
1. Easy to compute: It should be easy to compute and must not become an algorithm in itself.
2. Uniform distribution: It should provide a uniform distribution across the hash table and should not result in clustering.
3. Few collisions: Collisions occur when pairs of elements are mapped to the same hash value. These should be avoided.

Hash Collision:
When the hash function generates the same index for multiple keys, there is a conflict (which value should be stored at that index?). This is called a hash collision.
We can resolve the hash collision using one of the following techniques.
 Collision resolution by chaining
 Open Addressing: Linear/Quadratic Probing and Double Hashing


Collision resolution by Chaining: or Separate chaining (open hashing)


Separate chaining is one of the most commonly used collision resolution techniques. It is usually
implemented using linked lists. In separate chaining, each element of the hash table is a linked list. To store
an element in the hash table, you must insert it into a specific linked list. If there is any collision (i.e., two different elements have the same hash value), then both elements are stored in the same linked list.
The cost of a lookup is that of scanning the entries of
the selected linked list for the required key. If the
distribution of the keys is sufficiently uniform, then
the average cost of a lookup depends only on the
average number of keys per linked list. For this reason,
chained hash tables remain effective even when the
number of table entries (N) is much higher than the
number of slots.
For separate chaining, the worst-case scenario is when
all the entries are inserted into the same linked list.
The lookup procedure may have to scan all its entries, so
the worst-case cost is proportional to the number (N) of
entries in the table.
In the following image, CodeMonk and Hashing both hash to the value 2. The slot at index 2 can hold only one entry directly; therefore, the next entry (in this case, Hashing) is linked to the entry of CodeMonk.
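A minimal separate-chaining table in Python might look like the following sketch; the class and method names are illustrative, Python's built-in hash plays the role of the hash function, and ordinary lists play the role of the linked lists:

class ChainedHashTable:
    def __init__(self, size=10):
        self.slots = [[] for _ in range(size)]    # one chain per slot

    def insert(self, key, value):
        chain = self.slots[hash(key) % len(self.slots)]
        for pair in chain:
            if pair[0] == key:                    # update an existing key
                pair[1] = value
                return
        chain.append([key, value])                # collision: same chain

    def lookup(self, key):
        chain = self.slots[hash(key) % len(self.slots)]
        for k, v in chain:
            if k == key:
                return v
        return None

t = ChainedHashTable()
t.insert("CodeMonk", 1)
t.insert("Hashing", 2)      # may land in the same chain as CodeMonk
print(t.lookup("Hashing"))  # 2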

Linear probing (open addressing or closed hashing):


In open addressing, all entry records are stored in the array itself, instead of in linked lists. When a new entry has to be inserted, its hash index is computed and then the array is examined, starting at the hashed index. If the slot at the hashed index is unoccupied, the entry record is inserted there; otherwise the search proceeds in some probe sequence until it finds an unoccupied slot.
The probe sequence is the sequence that is followed while
traversing through entries. In different probe sequences,
you can have different intervals between successive entry
slots or probes.
When searching for an entry, the array is scanned in the
same sequence until either the target element is found or an
unused slot is found. This indicates that there is no such
key in the table. The name "open addressing" refers to the
fact that the location or address of the item is not determined by its hash value.


Linear probing is when the interval between successive probes is fixed (usually at 1). Let's assume that the hashed index for a particular entry is index. The probing sequence for linear probing will be:
index = index % hashTableSize
index = (index + 1) % hashTableSize
index = (index + 2) % hashTableSize
index = (index + 3) % hashTableSize and so on…
Here, the hash collision is resolved by open addressing with linear probing. Since CodeMonk and Hashing hash to the same index, i.e., 2, Hashing is stored at 3, as the interval between successive probes is 1.

Sr.No.   Key   Hash           Array Index   Array Index After Linear Probing
1        1     1 % 20 = 1     1             1
2        2     2 % 20 = 2     2             2
3        42    42 % 20 = 2    2             3
4        4     4 % 20 = 4     4             4
5        12    12 % 20 = 12   12            12
6        14    14 % 20 = 14   14            14
7        17    17 % 20 = 17   17            17
8        13    13 % 20 = 13   13            13
9        37    37 % 20 = 17   17            18
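The table above can be reproduced with a short sketch; the table size of 20, integer keys and the use of None for an empty slot are assumptions for the example:

table_size = 20
table = [None] * table_size

def insert_linear(key):
    index = key % table_size
    for step in range(table_size):
        probe = (index + step) % table_size   # next slot, wrapping around
        if table[probe] is None:              # first unoccupied slot wins
            table[probe] = key
            return probe
    raise RuntimeError("hash table is full")

for k in (1, 2, 42, 4, 12, 14, 17, 13, 37):
    print(k, "->", insert_linear(k))          # 42 lands at 3 and 37 at 18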

Quadratic Probing:

Quadratic probing is similar to linear probing; the only difference is the interval between successive probes or entry slots. Here, when the slot at a hashed index for an entry record is already occupied, you must keep traversing until you find an unoccupied slot. The interval between slots is computed by adding successive values of an arbitrary polynomial in the original hashed index.

Let us assume that the hashed index for an entry is index and that the slot at index is occupied. The probe sequence will be as follows:

index = index % hashTableSize
index = (index + 1²) % hashTableSize
index = (index + 2²) % hashTableSize
index = (index + 3²) % hashTableSize and so on…
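A minimal sketch of this probe sequence (the table size of 10 is an assumption; note that, unlike linear probing, quadratic probing may fail to find a free slot even when one exists):

table_size = 10
table = [None] * table_size

def insert_quadratic(key):
    index = key % table_size
    for step in range(table_size):
        probe = (index + step * step) % table_size   # offsets 0, 1², 2², 3², ...
        if table[probe] is None:                     # first free slot wins
            table[probe] = key
            return probe
    raise RuntimeError("no free slot found in the probe sequence")

print(insert_quadratic(5))    # 5
print(insert_quadratic(15))   # collides at 5, lands at (5 + 1²) % 10 = 6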

Example: A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10 and h(k) = 2k + 3

The key values 3, 2, 9, 6 are stored at the indexes 9, 7, 1, 5 respectively. The calculated index value of 11 is 5, which is already occupied by another key, i.e., 6. When linear probing is applied, the nearest empty cell to index 5 is 6; therefore, the value 11 is added at index 6.

h(3) = 2*3 + 3 = 9
h(2) = 2*2 + 3 = 7
h(9) = (2*9 + 3) % 10 = 1
h(6) = (2*6 + 3) % 10 = 5
h(11) = (2*11 + 3) % 10 = 5 → occupied, linear probing gives 6
h(13) = (2*13 + 3) % 10 = 9 → occupied, linear probing gives 0
h(7) = (2*7 + 3) % 10 = 7 → occupied, linear probing gives 8
h(12) = (2*12 + 3) % 10 = 7 → occupied, linear probing gives 2 (slots 8, 9, 0 and 1 are already taken)
Double hashing:

Double hashing is similar to linear probing; the only difference is the interval between successive probes. Here, the interval between probes is computed by using two hash functions.

Let us say that the hashed index for an entry record is index, computed by one hash function, and that the slot at that index is already occupied. You must traverse in a specific probing sequence to look for an unoccupied slot. The probing sequence will be (indexH being the value of the second hash function):

index = (index + 1 * indexH) % hashTableSize
index = (index + 2 * indexH) % hashTableSize and so on…

Double hashing is an open addressing technique which is used to avoid collisions. When a collision occurs, this technique uses a secondary hash of the key. It uses one hash value as an index and moves forward until an empty location is found.

In double hashing, two hash functions are used. Suppose h1(k) is one of the hash functions, used to calculate the location, and h2(k) is another hash function. The scheme can be defined as "insert ki at the first free place from (u + v*i) % m where i = 0 to m-1". In this case, u is the location computed using the first hash function and v is equal to h2(k) % m.
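A sketch of this scheme in Python, using the two hash functions from the example that follows (m = 10, h1(k) = (2k+3) % m, h2(k) = (3k+1) % m):

m = 10
table = [None] * m

def h1(k):
    return (2 * k + 3) % m

def h2(k):
    return (3 * k + 1) % m

def insert_double(key):
    u, v = h1(key), h2(key)
    for i in range(m):
        probe = (u + v * i) % m         # i-th probe: (u + v*i) % m
        if table[probe] is None:
            table[probe] = key
            return probe
    return None                         # no free slot reachable (e.g. v = 0)

for k in (3, 2, 9, 6, 11, 13, 7, 12):
    print(k, "->", insert_double(k))    # 13 and 7 fail, as worked out below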

Consider the same example that we use in quadratic probing.


A = 3, 2, 9, 6, 11, 13, 7, 12 where m = 10, and
h1(k) = 2k+3
h2(k) = 3k+1
key   Location (u)          v                   probe
3     ((2*3)+3) % 10 = 9    -                   1
2     ((2*2)+3) % 10 = 7    -                   1
9     ((2*9)+3) % 10 = 1    -                   1
6     ((2*6)+3) % 10 = 5    -                   1
11    ((2*11)+3) % 10 = 5   (3*11+1) % 10 = 4   3
13    ((2*13)+3) % 10 = 9   (3*13+1) % 10 = 0   - (cannot be inserted)
7     ((2*7)+3) % 10 = 7    (3*7+1) % 10 = 2    - (cannot be inserted)
12    ((2*12)+3) % 10 = 7   (3*12+1) % 10 = 7   2
As we know, no collision occurs while inserting the keys 3, 2, 9 and 6, so we do not apply double hashing to these key values.
On inserting the key 11 in the hash table, a collision occurs because the calculated index value of 11 is 5, which is already occupied by another value. Therefore, we apply the double hashing technique to key 11. When the key value is 11, the value of v is 4.
Now, substituting the values of u and v in (u+v*i)%m
When i=0
Index = (5+4*0)%10 =5
When i=1
Index = (5+4*1)%10 = 9
When i=2
Index = (5+4*2)%10 = 3
Since location 3 is empty in the hash table, the key 11 is added at index 3.
The next element is 13. The calculated index value of 13 is 9, which is already occupied by another key value. So, we use the double hashing technique to find a free location. The value of v is 0.
Now, substituting the values of u and v in (u+v*i)%m
When i=0
Index = (9+0*0)%10 = 9
We get the index 9 in every iteration from 0 to m-1, as the value of v is zero. Therefore, we cannot insert 13 into the hash table.
The next element is 7. The calculated index value of 7 is 7, which is already occupied by another key value. So, we use the double hashing technique to find a free location. The value of v is 2.
Now, substituting the values of u and v in (u+v*i)%m
When i=0
Index = (7 + 2*0)%10 = 7
When i=1
Index = (7+2*1)%10 = 9
When i=2
Index = (7+2*2)%10 = 1
When i=3
Index = (7+2*3)%10 = 3
When i=4
Index = (7+2*4)%10 = 5


When i=5
Index = (7+2*5)%10 = 7
When i=6
Index = (7+2*6)%10 = 9
When i=7
Index = (7+2*7)%10 = 1
When i=8
Index = (7+2*8)%10 = 3
When i=9
Index = (7+2*9)%10 = 5
Since we checked all the cases of i (from 0 to 9) and did not find a suitable place to insert 7, the key 7 cannot be inserted into the hash table.
The next element is 12. The calculated index value of 12 is 7, which is already occupied by another key value. So, we use the double hashing technique to find a free location. The value of v is 7.
Now, substituting the values of u and v in (u+v*i)%m
When i=0
Index = (7+7*0)%10 = 7
When i=1
Index = (7+7*1)%10 = 4
Since location 4 is empty, the key 12 is inserted at index 4.
The final hash table would be:
Applications

 Associative arrays: Hash tables are commonly used to implement many types of in-memory tables.
They are used to implement associative arrays (arrays whose indices are arbitrary strings or other
complicated objects).

 Database indexing: Hash tables may also be used as disk-based data structures and database indices
(such as in dbm).

 Caches: Hash tables can be used to implement caches i.e. auxiliary data tables that are used to
speed up the access to data, which is primarily stored in slower media.

 Object representation: Several dynamic languages, such as Perl, Python, JavaScript, and Ruby use
hash tables to implement objects.

 Hash functions are used in various algorithms to make their computation faster.

Example program:

# Python program to demonstrate working of HashTable


def checkPrime(n):
    # Returns 1 if n is prime, else 0.
    if n == 1 or n == 0:
        return 0
    for i in range(2, n // 2 + 1):
        if n % i == 0:
            return 0
    return 1

def getPrime(n):
    # Returns the smallest prime >= n (checking odd candidates only).
    if n % 2 == 0:
        n = n + 1
    while not checkPrime(n):
        n += 2
    return n

capacity = getPrime(10)                      # 11, a prime table size
hashTable = [[] for _ in range(capacity)]    # independent empty slots

def hashFunction(key):
    return key % capacity

def insertData(key, data):
    index = hashFunction(key)
    hashTable[index] = [key, data]

def removeData(key):
    index = hashFunction(key)
    hashTable[index] = []

insertData(123, "apple")
insertData(432, "mango")
insertData(213, "banana")
insertData(654, "guava")
print(hashTable)
removeData(123)
print(hashTable)
