SUBJECT CODE : 210252
As per Revised Syllabus of
SAVITRIBAI PHULE PUNE UNIVERSITY
Choice Based Credit System (CBCS)
S.E. (Computer) Semester - IV
DATA STRUCTURES AND ALGORITHMS
Anuradha A. Puntambekar
M.E. (Computer)
Formerly Assistant Professor in
P.E.S. Modern College of Engineering,
Pune
Minal P. Nerkar
M.E. (Computer Science)
Assistant Professor in
AISSMS Institute of Information Technology,
Pune
TECHNICAL PUBLICATIONS
An Up-Thrust for Knowledge
DATA STRUCTURES AND ALGORITHMS
Subject Code : 210252
S.E. (Computer Engineering) Semester - IV
© Copyright with A.A. Puntambekar
All publishing rights (printed and ebook version) reserved with Technical Publications. No part of this book
should be reproduced in any form, Electronic, Mechanical, Photocopy or any information storage and
retrieval system without prior permission in writing, from Technical Publications, Pune.
Published by :
TECHNICAL PUBLICATIONS, Office No.1, 412, Shaniwar Peth,
Pune - 411030, M.S. INDIA, Ph.: +91-020-24495496/97
[email protected] Website: www.technicalpublications.org
Printer :
Yogi Printers & Binders
Sr. No. 10/14,
Ghule Industrial Estate, Nanded Village Road,
Tal. - Haveli, Dist. - Pune - 411041
ISBN 978-93-90450-38-0
PREFACE
The importance of Data Structures and Algorithms is well known in various
engineering fields. Overwhelming response to our books on various subjects inspired us
to write this book. The book is structured to cover the key aspects of the subject
Data Structures and Algorithms.
The book uses plain, lucid language to explain fundamentals of this subject. The
book provides logical method of explaining various complicated concepts and stepwise
methods to explain the important topics. Each chapter is well supported with necessary
illustrations, practical examples and solved problems. All chapters in this book are
arranged in a proper sequence that permits each topic to build upon earlier studies. All
care has been taken to make students comfortable in understanding the basic concepts
of this subject.
Representative questions have been added at the end of each section to help the
students in picking important points from that section.
The book not only covers the entire scope of the subject but also explains the philosophy
of the subject. This makes the subject clearer and
more interesting. The book will be very useful not only to the students but also to the
subject teachers. The students have to omit nothing and possibly have to cover nothing
more.
We wish to express our profound thanks to all those who helped in making this
book a reality. Much needed moral support and encouragement is provided on
numerous occasions by our whole family. We wish to thank the Publisher and the
entire team of Technical Publications who have taken immense pain to get this book
in time with quality printing.
Any suggestion for the improvement of the book will be acknowledged and well
appreciated.
Authors
A. A. Puntambekar
M. P. Nerkar
Dedicated to God
SYLLABUS
Data Structures and Algorithms - 210252
Credit Scheme : 03
Examination Scheme and Marks :
Mid_Semester (TH) : 30 Marks
End_Semester (TH) : 70 Marks
Unit-I Hashing
Hash Table : Concepts - hash table, hash function, basic operations, bucket, collision, probe, synonym,
overflow, open hashing, closed hashing, perfect hash function, load density, full table, load factor,
rehashing, issues in hashing, hash functions - properties of good hash function, division, multiplication,
extraction, mid - square, folding and universal, collision resolution strategies - open addressing and
chaining, hash table overflow - open addressing and chaining, extendible hashing, closed addressing and
separate chaining.
Skip List : Representation, searching and operations - insertion, removal. (Chapter - 1)
Unit-II Trees
Tree : Basic terminology, General tree and its representation, representation using sequential and linked
organization, Binary tree - properties, converting tree to binary tree, binary tree traversals (recursive and
non-recursive) - inorder, preorder, post order, depth first and breadth first, Operations on binary tree.
Huffman Tree (Concept and use), Binary Search Tree (BST), BST operations, Threaded binary search tree
- concepts, threading, insertion and deletion of nodes in inorder threaded binary search tree, in order
traversal of in-order threaded binary search tree. (Chapter - 2)
Unit-III Graphs
Basic Concepts, Storage representation, Adjacency matrix, Adjacency list, Adjacency multi list, Inverse
adjacency list. Traversals - depth first and breadth first, Minimum spanning tree, Greedy algorithms for
computing minimum spanning tree - Prim's and Kruskal's Algorithms, Dijkstra's single source shortest path,
All pairs shortest paths - Floyd - Warshall Algorithm, Topological ordering. (Chapter - 3)
Unit-IV Search Trees
Symbol Table - Representation of Symbol Tables - Static tree table and Dynamic tree table, Weight
balanced tree - Optimal Binary Search Tree (OBST), OBST as an example of Dynamic Programming,
Height Balanced Tree - AVL tree. Red-Black Tree, AA tree, K-dimensional tree, Splay Tree. (Chapter - 4)
Unit-V Indexing and Multiway Trees
Indexing and multiway trees - Indexing, Indexing techniques - Primary, secondary, dense, sparse,
Multiway search trees, B-Tree - Insertion, deletion, B+Tree - Insertion, deletion, use of B+ tree in
Indexing, Trie tree. (Chapter - 5)
Unit-VI File Organization
Files - Concept, need, primitive operations, Sequential file organization - Concept and primitive
operations, Direct Access File - Concepts and primitive operations, Indexed sequential file
organization - Concept, Types of indices, Structure of index sequential file, Linked organization - Multi
list files, Coral rings, inverted files and cellular partitions. (Chapter - 6)
TABLE OF CONTENTS
Chapter-1 Hashing
1.1 Concept
1.1.1 Basic Concepts in Hashing
1.2 Hash Functions
1.2.1 Division Method
1.2.2 Multiplicative Hash Function
1.2.3 Extraction
1.2.4 Mid Square
1.2.5 Folding
1.2.6 Universal Hashing
1.3 Properties of Good Hash Function
1.4 Collision Resolution Strategies
1.4.1 Open and Closed Hashing
1.4.2 Chaining
1.4.3 Open Addressing
1.5 Extensible Hashing
1.6 Applications of Hashing
1.7 Skip List
1.7.1 Operations on Skip List
1.7.2 Searching of a Node
1.7.3 Insertion of a Node
1.7.4 Removal of a Node
1.7.5 Features of Skip Lists
1.7.6 Comparison between Hashing and Skip Lists
1.8 Multiple Choice Questions
Chapter-2 Trees
2.1 Basic Terminology
2.2 Properties of Binary Tree
2.3 Representation of Binary Tree
2.4 General Tree and its Representation
2.5 Converting Tree into Binary Tree
2.6 Binary Tree Traversals
2.6.1 Non Recursive Traversals
2.7 Depth and Level Wise Traversals
2.7.1 Depth First Search (DFS)
2.7.2 Breadth First Search (BFS)
2.8 Operations on Binary Tree
2.9 Huffman's Tree
2.9.1 Algorithm
2.10 Binary Search Tree (BST)
2.11 BST Operations
2.12 BST as ADT
2.13 Threaded Binary Tree
2.13.1 Concept
2.13.2 Insertion and Deletion
2.13.3 Inorder Traversal
2.13.4 Advantages and Disadvantages
2.14 Multiple Choice Questions
Chapter-3 Graphs
3.1 Basic Concept
3.1.1 Comparison between Graph and Tree
3.1.2 Types of Graph
3.1.3 Properties of Graph
3.2 Storage Representation
3.2.1 Adjacency Matrix
3.2.2 Adjacency List
3.2.3 Adjacency Multilist
3.2.4 Inverse Adjacency List
3.3 Graph Operations
3.4 Storage Structure
3.5 Traversals
3.5.1 BFS Traversal of Graph
3.5.2 DFS Traversal of Graph
3.6 Introduction to Greedy Strategy
3.6.1 Applications of Greedy Method
3.7 Minimum Spanning Tree
3.7.1 Difference between Prim's and Kruskal's Algorithm
3.8 Dijkstra's Shortest Path
3.9 All Pair Shortest Path (Warshall and Floyd's Algorithm)
3.9.1 Warshall's Algorithm
3.9.2 Floyd's Algorithm
3.10 Topological Ordering
3.11 Case Study
3.11.1 Webgraph and Google Map
3.12 Multiple Choice Questions
Chapter-4 Search Trees
4.1 Symbol Table - Representation
4.2 Static Tree Table
4.3 Dynamic Tree Table
4.4 Introduction to Dynamic Programming
4.4.1 Problems that can be Solved using Dynamic Programming
4.4.2 Principle of Optimality
4.5 Weight-Balance Tree
4.6 Optimal Binary Search Tree (OBST)
4.6.1 Algorithm
4.7 Height Balance Tree (AVL)
4.7.1 Height of AVL Tree
4.7.2 Representation of AVL Tree
4.7.3 Algorithms and Analysis of AVL Trees
4.7.4 Insertion
4.7.5 Deletion
4.7.6 Searching
4.7.7 AVL Tree Implementation
4.7.8 Comparison between BST, OBST and AVL Tree
4.7.9 Comparison of AVL Tree with Binary Search Tree
4.8 Red Black Tree
4.8.1 Properties of Red Black Tree
4.8.2 Representation
4.8.3 Insertion in Red Black Tree
4.8.4 Deletion Operation
4.9 AA Tree
4.9.1 Insertion Operation
4.9.2 Deletion Operation
4.10 k-dimensional Tree
4.11 Splay Tree
4.11.1 Splay Operations
4.12 Multiple Choice Questions
Chapter-5 Indexing and Multiway Trees
5.1 Indexing
5.2 Indexing Techniques
5.2.1 Primary Indexing Techniques
5.2.2 Secondary Indexing Techniques
5.2.3 Dense and Sparse Indexing
5.3 Types of Search Tree
5.4 Multiway Search Tree
5.5 B Tree
5.5.1 Insertion
5.5.2 Deletion
5.5.3 Searching
5.5.4 Height of B-Trees
5.5.5 Variants of B-Trees
5.5.6 Implementation of B-Trees
5.6 B+ Tree
5.7 Trie Tree
5.8 Heap - Basic Concept
5.8.1 Heaps using Priority Queues
5.9 Multiple Choice Questions
Chapter-6 File Organization
6.1 File Definition and Concept
6.2 File Handling in C++
6.3 File Organization
6.4 Sequential Organization
6.5 Direct Access File
6.6 Index Sequential File Organization
6.6.1 Types of Indices
6.7 Linked Organization
6.8 External Sort
6.8.1 Consequential Processing and Merging Two Lists
6.8.2 Multiway Merge
6.8.3 K way Merge Algorithm
6.9 Multiple Choice Questions
Laboratory Experiments
Solved Model Question Papers
UNIT - I
Hashing
Syllabus
Hash Table : Concepts - hash table, hash function, basic operations, bucket, collision, probe, synonym,
overflow, open hashing, closed hashing, perfect hash function, load density, full table, load factor,
rehashing, issues in hashing, hash functions - properties of good hash function, division,
multiplication, extraction, mid - square, folding and universal, collision resolution strategies - open
addressing and chaining, hash table overflow - open addressing and chaining, extendible hashing,
closed addressing and separate chaining.
Skip List : Representation, searching and operations - insertion, removal.
Contents
1.1 Concept
1.2 Hash Functions . . . May-10, 12, Dec.-12
1.3 Properties of Good Hash Function . . . May-05, 11, 13, Dec.-07, 08, 13
1.4 Collision Resolution Strategies . . . May-07, 08, 09, 14, 17, 18, 19, Dec.-06, 07, 09, 10, 11, 13, 17, 19
1.5 Extensible Hashing
1.6 Applications of Hashing
1.7 Skip List
1.8 Multiple Choice Questions
1.1 Concept
Hashing is an effective way to reduce the number of comparisons. Hashing deals with the idea
of providing the direct address of the record where the record is likely to be stored. To
understand the idea clearly let us take an example -
Suppose a manufacturing company has an inventory file that consists of less than 1000 parts.
Each part has a unique 7 digit number. The number is called the 'key' and the particular keyed
record consists of that part name. If there are less than 1000 parts then a 1000 element array
can be used to store the complete file. Such an array will be indexed from 0 to 999. Since the
key number is 7 digits long, it is converted to 3 digits by taking only the last three digits of
the key. This is shown in Fig. 1.1.1.
Position   Key
0          4967000
1
2          8421002
...
396        4618396
397        4957397
...
995        9846995
...
998        0047998
999        0001999
Fig. 1.1.1 Hashing
Observe in Fig. 1.1.1 that the first key is 4967000 and it is stored at position 0. The second
key is 8421002. Its last three digits indicate position 2 in the array. Let us search for the
element 4957397. Naturally it will be found at position 397. This method of searching is called
hashing. The function that converts the key (7 digit) into an array position is called the hash
function.
Here the hash function is
H(key) = key % 1000
where key % 1000 is the hash function and the value obtained from the hash function is called
the hash key.
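The part-number example can be sketched in C++ as follows. This is only a minimal sketch, not the book's program: the names `hashKey`, `makeTable`, `store` and `search`, and the use of -1 as an empty marker, are assumptions made for illustration. Collisions are ignored here because they are treated later in the chapter.

```cpp
#include <cassert>
#include <vector>

const int TABLE_SIZE = 1000;   // positions 0 to 999, as in the example
const long EMPTY = -1;         // assumed marker for an unused position

// Hash function from the text: keep the last three digits of the key.
int hashKey(long key) {
    assert(key >= 0);
    return static_cast<int>(key % TABLE_SIZE);
}

// Build a table in which every position is empty.
std::vector<long> makeTable() {
    return std::vector<long>(TABLE_SIZE, EMPTY);
}

// Store the key at the position computed by the hash function.
void store(std::vector<long>& table, long key) {
    table[hashKey(key)] = key;
}

// A key is found if it sits at its own hashed position.
bool search(const std::vector<long>& table, long key) {
    return table[hashKey(key)] == key;
}
```

Searching costs one hash computation and one comparison, which is the reduction in comparisons that the section describes.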
1.1.1 Basic Concepts in Hashing
1) Hash Table : Hash table is a data structure used for storing and retrieving data
quickly. Every entry in the hash table is made using Hash function.
2) Hash Function :
* Hash function is a function used to place data in hash table.
* Similarly hash function is used to retrieve data from hash table.
* Thus the use of hash function is to implement hash table.
For example : Consider the hash function key mod 5 and a hash table of size 5.
Step 1 : Insert 33 : 33 mod 5 = 3
Step 2 : Insert 54 : 54 mod 5 = 4
Step 3 : Insert 25 : 25 mod 5 = 0
Index  Data
0      25
1
2
3      33
4      54
Hash Table (each position is a bucket)
3) Bucket : The hash function H(key) is used to map several dictionary entries in the
hash table. Each position of the hash table is called bucket.
4) Collision : Collision is a situation in which the hash function returns the same address
for more than one record.
For example : Suppose we want to insert 55 in the hash table above. 55 mod 5 = 0, but 25 is
already placed at location 0 and now 55 is demanding the same location. Hence we say
"collision occurs".
Fig. 1.1.2
5) Probe : Each calculation of an address and test for success is known as a probe.
6) Synonym : The set of keys that hash to the same location are called synonyms. For
example - In the hash table computation given above, 25 and 55 are synonyms.
7) Overflow : When hash table becomes full and new record needs to be inserted
then it is called overflow.
For example -
Index  Data
0      25
1      31
2      42
3      63
4      49
Fig. 1.1.3 Overflow situation
8) Perfect hash function : The perfect hash function is a function that maps distinct
key elements into the hash table with no collisions.
Advantages of perfect hash function
1) A perfect hash function with limited set of elements can be used for efficient
lookup operation.
2) There is no need to apply collision resolution technique.
9) Load factor and load density : Consider the hash table as given below -
[Fig. 1.1.4 : Hash table with five buckets (0 to 4), each bucket holding two slots (Slot 1 and Slot 2)]
Let,
* n be the total number of elements in the table.
* T be the total number of possible elements.
* s be the number of slots per bucket and b be the number of buckets.
Element density : The element density of hash table is the ratio $\frac{n}{T}$.
Load density or load factor : The load density or load factor of hash table is
$\alpha = \frac{n}{s \cdot b}$
1.2 Hash Functions
There are various types of hash functions or hash methods which are used to place
the elements in hash table.
[Fig. 1.2.1 Types of hash functions : Division method, Multiplication method, Extraction method, Mid square method, Folding and universal method]
1.2.1 Division Method
The hash function depends upon the remainder of division.
Typically the divisor is table length. For example :-
If the records 54, 72, 89, 37 are to be placed in the hash table and if the table size is
10 then
Hash Function
54 % 10 = 4
72 % 10 = 2
89 % 10 = 9
37 % 10 = 7
Hash Table
Index  Data
0
1
2      72
3
4      54
5
6
7      37
8
9      89
1.2.2 Multiplicative Hash Function
The multiplicative hash function works in following steps
1) Multiply the key 'k' by a constant A where A is in the range 0 < A < 1. Then
extract the fractional part of kA.
2) Multiply this fractional part by m and take the floor.
The above steps can be formulated as
$h(k) = \lfloor m \{kA\} \rfloor$
where $\{kA\}$ denotes the fractional part of kA.
Donald Knuth suggested to use A = 0.6180339887
Example :
Let key k = 107, assume m = 50.
A = 0.6180339887
$h(k) = \lfloor 50 \times \{107 \times 0.6180339887\} \rfloor$
107 × 0.6180339887 = 66.1296..., so the fractional part is 0.1296...
$h(k) = \lfloor 50 \times 0.1296 \rfloor = \lfloor 6.48 \rfloor = 6$
That means 107 will be placed at index 6 in hash table.
Advantage : The choice of m is not critical.
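The two steps above can be sketched in C++ as follows. The function name `multiplicativeHash` is an assumption, and the default value of A is taken as 0.6180339887 (approximately (√5 - 1)/2); any constant strictly between 0 and 1 could be passed instead.

```cpp
#include <cassert>
#include <cmath>

// Multiplicative hash: h(k) = floor(m * frac(k * A)), with 0 < A < 1.
int multiplicativeHash(unsigned long k, int m, double A = 0.6180339887) {
    assert(m > 0 && A > 0.0 && A < 1.0);
    double product = static_cast<double>(k) * A;
    double fractional = product - std::floor(product);   // fractional part {kA}
    return static_cast<int>(std::floor(m * fractional)); // value in 0 .. m-1
}
```

For the worked example, `multiplicativeHash(107, 50)` evaluates to 6, matching the index computed above.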
1.2.3 Extraction
In this method some digits are extracted from the key to form the address location in
hash table.
For example : Suppose the first, third and fourth digits from the left are selected for the hash key.
If the selected digits of the key are 4, 7 and 8, the hash key is 478 and the key can be stored
at location 478 in the hash table of size 1000.
1.2.4 Mid Square
This method works in following steps
1) Square the key.
2) Extract the middle part of the result. This will indicate the location of the key element
in the hash table.
Note that if the key element is a string then it has to be preprocessed to produce a
number.
Let key = 3111
(3111)^2 = 9678321
For the hash table of size 1000
H(3111) = 783
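A minimal C++ sketch of the mid square steps is given below. The function name `midSquareHash` and the choice to pick the middle digits from the decimal string of the square are assumptions; the sketch also assumes the square fits in an `unsigned long long`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Mid square hash: square the key and keep `digits` digits from the middle.
int midSquareHash(unsigned long long key, int digits) {
    assert(digits > 0 && digits <= 9);
    std::string sq = std::to_string(key * key);        // decimal digits of key^2
    if (static_cast<int>(sq.size()) <= digits)
        return std::stoi(sq);                          // square is already short enough
    std::size_t start = (sq.size() - digits) / 2;      // left edge of the middle part
    return std::stoi(sq.substr(start, digits));
}
```

With key 3111 and three digits, the square 9678321 yields the middle part 783, as in the example.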
1.2.5 Folding
There are two folding techniques
i) Fold shift ii) Fold boundary
i) Fold shift : In this method the key is divided into separate parts whose size
matches with the size of required address. Then left and right parts are shifted and
added with the middle part.
ii) Fold boundary : In this method the key is divided into separate parts. The digits of the
leftmost and rightmost parts are reversed (folded on the fixed boundary) and added with the middle part.
For example
Key : 345678123, divided into the parts 345, 678 and 123.
(a) Fold shift : 345 + 678 + 123 = 1146. Discarding the carry gives Index = 146.
(b) Fold boundary : the outer parts are digit reversed, 543 + 678 + 321 = 1542. Discarding the carry gives Index = 542.
Fig. 1.2.2 Folding techniques
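Both folding techniques can be sketched in C++ as below. The helper names (`splitKey`, `addressRange`, `foldShift`, `foldBoundary`) are assumptions, and the key is passed as a string of decimal digits so that it can be cut into parts easily.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Cut the digit string into parts of partSize digits, left to right.
std::vector<std::string> splitKey(const std::string& key, std::size_t partSize) {
    assert(partSize > 0 && partSize <= 8);
    std::vector<std::string> parts;
    for (std::size_t i = 0; i < key.size(); i += partSize)
        parts.push_back(key.substr(i, partSize));
    return parts;
}

// 10^partSize: taking the sum modulo this value discards the carry.
int addressRange(std::size_t partSize) {
    int range = 1;
    for (std::size_t i = 0; i < partSize; ++i) range *= 10;
    return range;
}

// Fold shift: add the parts as they are.
int foldShift(const std::string& key, std::size_t partSize) {
    int sum = 0;
    for (const std::string& part : splitKey(key, partSize)) sum += std::stoi(part);
    return sum % addressRange(partSize);
}

// Fold boundary: reverse the digits of the first and last parts, then add.
int foldBoundary(const std::string& key, std::size_t partSize) {
    std::vector<std::string> parts = splitKey(key, partSize);
    if (parts.size() > 1) {
        std::reverse(parts.front().begin(), parts.front().end());
        std::reverse(parts.back().begin(), parts.back().end());
    }
    int sum = 0;
    for (const std::string& part : parts) sum += std::stoi(part);
    return sum % addressRange(partSize);
}
```

For the key 345678123 with three-digit parts, `foldShift` gives 146 and `foldBoundary` gives 542.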
1.2.6 Universal Hashing
Universal hashing is a hashing technique in which a randomized algorithm is used to
select the hash function at random from a family of hash functions that has
certain mathematical properties.
Universal hashing keeps the expected number of collisions low for any set of keys.
Let
U be the universe of keys
H be a finite collection of hash functions mapping U into {0, 1, ...., m-1}. Then H
is called universal if, for x, y ∈ U (x ≠ y),
$|\{h \in H : h(x) = h(y)\}| = \frac{|H|}{m}$
That means, the probability of a collision for two different keys x and y given a hash
function randomly chosen from H is $\frac{1}{m}$.
Theorem : If h is chosen from a universal class of hash functions and is used to hash n
keys into a table of size m, where n
Then $h_a(K) = \left( \sum_{i} a_i K_i \right) \bmod \text{size}$
h(xyz) = (35 × 24 + 100 × 25 + 12 × 26) % 11
h(xyz) = 3652 % 11 = 0
That means key = xyz is stored at index 0 in Hash table.
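The weighted-sum hash used in this example can be sketched in C++ as below. The names `dotProductHash` and `randomCoefficients` are assumptions; the second function illustrates the "chosen at random" step by drawing the coefficients a_i from 0 to size - 1 with a seeded generator, which is one common way to pick a member of such a family rather than necessarily the construction intended here.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// h_a(K) = (sum of a[i] * K[i]) mod size, for non-negative a[i] and K[i].
int dotProductHash(const std::vector<int>& a, const std::vector<int>& K, int size) {
    assert(size > 0 && a.size() == K.size());
    long long sum = 0;
    for (std::size_t i = 0; i < K.size(); ++i)
        sum += static_cast<long long>(a[i]) * K[i];
    return static_cast<int>(sum % size);
}

// Draw one coefficient vector at random; each a[i] lies in 0 .. size-1.
std::vector<int> randomCoefficients(std::size_t count, int size, unsigned seed) {
    assert(size > 0);
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, size - 1);
    std::vector<int> a(count);
    for (int& c : a) c = dist(gen);
    return a;
}
```

With coefficients (35, 100, 12), key parts (24, 25, 26) and size 11, `dotProductHash` returns 0, the index obtained in the example.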
University Questions
1. What is a hashing function ? Explain any 4 types of hashing functions.
2. What is hash function ? Explain the following hash functions
i) Mid square ii) Modulo division iii) Folding method iv) Digit analysis
1.3 Properties of Good Hash Function
Rules for choosing good hash function
1. The hash function should be simple to compute.
2. Number of collisions should be less while placing the record in the hash table.
Ideally no collision should occur. Such a function is called perfect hash function.
3. Hash function should produce such keys which will get distributed uniformly over
an array.
4. The hash function should depend on every bit of the key. Thus the hash function
that simply extracts the portion of a key is not suitable.
University Questions
1. What is hashing function ? What are different ways to design a hash function ? Explain any
one of the methods in detail.
2. What is hashing ? What are the characteristics of good hashing function ? Explain any two
types of hash functions.
3. What is the use of hash tables ? Enlist the characteristics of good hash function.
1.4 Collision Resolution Strategies
SPPU : May-07, 08, 09, 14, 17, 18, 19, Dec.-06, 07, 09, 10, 11, 13, 17, 19, Marks 12
Definition : If collisions occur then they should be handled by applying some
techniques. Such techniques are called collision handling techniques.
Collision handling techniques
1. Open hashing (chaining) : Chaining
2. Closed hashing (open addressing) : Linear probing, Quadratic probing, Double hashing
Fig. 1.4.1 Collision resolution techniques
1.4.1 Open and Closed Hashing
* Open hashing is also called separate chaining and closed hashing is also called
open addressing.
* The difference between open hashing and closed hashing is that in open hashing
the collisions are stored outside the table and in closed hashing the collisions are
stored in the same table at some other slot.
1.4.2 Chaining
1. Chaining without replacement
In collision handling, chaining is a concept which introduces an additional field
with data, i.e. chain. A separate chain table is maintained for colliding data. When collision
occurs we store the second colliding data by linear probing method. The address of this
colliding data can be stored with the first colliding element in the chain table, without
replacement.
For example consider elements,
131, 3, 4, 21, 61, 6, 71, 8, 9
Index  Data  Chain
0      -1    -1
1      131   2
2      21    5
3      3     -1
4      4     -1
5      61    7
6      6     -1
7      71    -1
8      8     -1
9      9     -1
Fig. 1.4.2 Chaining without replacement
From the example, you can see that the chain is maintained for the numbers that
demand location 1. The first number 131 comes and we place it at index 1. Next comes 21
but collision occurs, so by linear probing we place 21 at index 2, and the chain is
maintained by writing 2 in the chain table at index 1. Similarly next comes 61; by linear
probing we can place 61 at index 5 and the chain will be maintained at index 2. Thus any
element which gives hash key as 1 will be stored by linear probing at an empty location
but a chain is maintained so that traversing the hash table will be efficient.
The drawback of this method is in finding the next empty location. The method ignores the
fact that the element which actually belongs to that empty location can then no longer
obtain its location. This means the logic of hash function gets disturbed. Let
us now see a program which implements chaining without replacement.
/*****************************************************************
Program to create hash table and to handle the collision using chaining
without replacement. In this program hash function is (number % 10)
*****************************************************************/
#include <iostream.h>
#include <conio.h>
#include <stdlib.h>
#define MAX 10
class WO_chain
{
private:
 int a[MAX][2];
public:
 WO_chain();
 int create(int);
 void chain(int,int),display();
};
/*
The constructor defined
*/
WO_chain::WO_chain()
{
 int i;
 for(i=0;i<MAX;i++)
 ...
 cin>>num;
 key=h.create(num);//returns hash key
 h.chain(key,num);//collision handled by chaining without replacement
 cout<<"\n Do U Wish To Continue?(y/n)";
 ans=getche();
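As a compact illustration of chaining without replacement, the following self-contained C++ sketch stores the data and chain columns in two vectors. It is a sketch under stated assumptions, not the book's program: the names `ChainTable`, `insertWithoutReplacement` and `searchChain` are hypothetical, the hash function is number % table size, and a new element is always linked to the tail of the chain that starts at its hash key, so chains of different hash keys may merge.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;  // marks an unused data cell or the end of a chain

struct ChainTable {
    std::vector<int> data;   // stored keys
    std::vector<int> chain;  // index of the next element in the chain, or EMPTY
    explicit ChainTable(int size) : data(size, EMPTY), chain(size, EMPTY) {}
    int size() const { return static_cast<int>(data.size()); }
};

// Insert num; returns the index used, or EMPTY when the table is full.
int insertWithoutReplacement(ChainTable& t, int num) {
    assert(num >= 0);
    const int n = t.size();
    const int home = num % n;
    if (t.data[home] == EMPTY) {           // no collision
        t.data[home] = num;
        return home;
    }
    int slot = EMPTY;                      // linear probing for an empty cell
    for (int step = 1; step < n; ++step) {
        int idx = (home + step) % n;
        if (t.data[idx] == EMPTY) { slot = idx; break; }
    }
    if (slot == EMPTY) return EMPTY;       // overflow
    int tail = home;                       // walk to the end of the chain
    while (t.chain[tail] != EMPTY) tail = t.chain[tail];
    t.data[slot] = num;
    t.chain[tail] = slot;                  // link the new element
    return slot;
}

// Search by following the chain that starts at the hash key.
bool searchChain(const ChainTable& t, int num) {
    int idx = num % t.size();
    while (idx != EMPTY && t.data[idx] != EMPTY) {
        if (t.data[idx] == num) return true;
        idx = t.chain[idx];
    }
    return false;
}
```

Inserting 131, 3, 4, 21, 61, 6, 71, 8, 9 into a table of size 10 places 21 at index 2, 61 at index 5 and 71 at index 7, with the chain running 1, 2, 5, 7.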
2. Chaining with replacement
As the previous method has the drawback of losing the meaning of the hash function, to
overcome this drawback the method known as chaining with replacement is introduced.
Let us discuss an example to understand the method. Suppose we have to store the
following elements :
131, 21, 31, 4, 5
Index  Data  Chain
0      -1    -1
1      131   2
2      21    3
3      31    -1
4      4     -1
5      5     -1
6      -1    -1
7      -1    -1
8      -1    -1
9      -1    -1
Now the next element is 2. The hash function gives hash key 2, but at
index 2 we have already stored element 21. We also know that 21 does not belong to the
position at which it is currently placed.
Hence we will replace 21 by 2 and accordingly the chain table will be updated. See the
table :
Index  Data  Chain
0      -1    -1
1      131   6
2      2     -1
3      31    -1
4      4     -1
5      5     -1
6      21    3
7      -1    -1
8      -1    -1
9      -1    -1
The value -1 in the hash table and chain table indicates an empty location.
The advantage of this method is that the meaning of hash function is preserved. But
each time some logic is needed to test the element, whether it is at its proper position.
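The replacement step can be sketched in C++ as follows. This is a minimal sketch under stated assumptions rather than the book's listing: the names `ChainTable`, `findEmpty` and `insertWithReplacement` are hypothetical, the hash function is number % table size, and the displaced element is moved to the first empty cell found by linear probing.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;  // marks an unused data cell or the end of a chain

struct ChainTable {
    std::vector<int> data;   // stored keys
    std::vector<int> chain;  // index of the next synonym, or EMPTY
    explicit ChainTable(int size) : data(size, EMPTY), chain(size, EMPTY) {}
    int size() const { return static_cast<int>(data.size()); }
};

// First empty cell after `home` by linear probing, or EMPTY if none.
int findEmpty(const ChainTable& t, int home) {
    for (int step = 1; step < t.size(); ++step) {
        int idx = (home + step) % t.size();
        if (t.data[idx] == EMPTY) return idx;
    }
    return EMPTY;
}

// Insert num; returns false when the table is full.
bool insertWithReplacement(ChainTable& t, int num) {
    assert(num >= 0);
    const int n = t.size();
    const int home = num % n;
    if (t.data[home] == EMPTY) { t.data[home] = num; return true; }
    const int spare = findEmpty(t, home);
    if (spare == EMPTY) return false;
    const int occupantHome = t.data[home] % n;
    if (occupantHome == home) {
        // Occupant is at its proper position: append num to its chain.
        int tail = home;
        while (t.chain[tail] != EMPTY) tail = t.chain[tail];
        t.data[spare] = num;
        t.chain[tail] = spare;
    } else {
        // Occupant belongs elsewhere: move it and repair its chain.
        int prev = occupantHome;
        while (t.chain[prev] != home) prev = t.chain[prev];
        t.data[spare] = t.data[home];
        t.chain[spare] = t.chain[home];
        t.chain[prev] = spare;
        t.data[home] = num;       // num takes its proper position
        t.chain[home] = EMPTY;
    }
    return true;
}
```

Inserting 131, 21, 31, 4, 5 and then 2 into a table of size 10 moves 21 to index 6, places 2 at index 2 and updates the chain entries at indices 1 and 6 to 6 and 3 respectively, as in the table above.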
/*****************************************************************
Program to create hash table and to handle the collision using chaining
with replacement. In this program hash function is (number % 10)
*****************************************************************/
#include <iostream.h>
#include <conio.h>
#include <stdlib.h>
#define MAX 10
class W_chain
{
private:
 int a[MAX][2];
1.4.3 Open Addressing
Open addressing is a collision handling technique in which the entire hash table is
searched in a systematic way for an empty cell to insert the new item if collision occurs.
Various techniques used in open addressing are
1. Linear probing 2. Quadratic probing 3. Double hashing
1. Linear probing
When collision occurs, i.e. when two records demand the same location in the
hash table, then the collision can be solved by placing the second record linearly down,
wherever the empty location is found.
For example
Fig. 1.4.3 Linear probing
In the hash table given in Fig. 1.4.3 the hash function used is number % 10. If the
first number which is to be placed is 131 then 131 % 10 = 1, i.e. the remainder is 1, so hash
key = 1. That means we are supposed to place the record at index 1. The next number is 21
which gives hash key = 1 as 21 % 10 = 1. But 131 is already placed at index 1. That
means a collision has occurred. We will now apply linear probing. In this method, we
search the place for number 21 starting from the location of 131. In this case we can place 21 at
index 2. Then 31 goes at index 3. Similarly 61 can be stored at index 6 because the numbers 4 and 5 are
stored before 61. Because of this technique, searching stays efficient : the search starts at
the hash key and scans only the consecutive occupied locations that follow it, instead of
the whole table.
/*****************************************************************
Program to create hash table and to handle the collision using linear
probing. In this program hash function is (number % 10)
*****************************************************************/
#include <iostream.h>
#include <conio.h>
#include <stdlib.h>
#define MAX 10
class Hash
{
private:
 int a[MAX];
public:
 Hash();
 int create(int);
 void linear_prob(int,int),display();
};
/*
The constructor defined
*/
Hash::Hash()
{
 int i;
 for(i=0;i<MAX;i++)
 ...
 cin>>num;
 key=h.create(num);//returns hash key
 h.linear_prob(key,num);//collision handled by linear probing
 cout<<"\n Do U Wish To Continue?(y/n)";
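A short self-contained C++ sketch of linear probing is given below. It is not the book's program: the function names `linearProbeInsert` and `linearProbeSearch` and the -1 empty marker are assumptions, and the hash function is number % table size.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;  // marks an unused cell

// Insert num by linear probing; returns the index used, or EMPTY if the table is full.
int linearProbeInsert(std::vector<int>& table, int num) {
    assert(num >= 0 && !table.empty());
    const int n = static_cast<int>(table.size());
    const int home = num % n;
    for (int step = 0; step < n; ++step) {
        int idx = (home + step) % n;       // move linearly down, wrapping around
        if (table[idx] == EMPTY) {
            table[idx] = num;
            return idx;
        }
    }
    return EMPTY;                          // overflow
}

// Search from the hash key until num or an empty cell is met.
int linearProbeSearch(const std::vector<int>& table, int num) {
    assert(num >= 0 && !table.empty());
    const int n = static_cast<int>(table.size());
    const int home = num % n;
    for (int step = 0; step < n; ++step) {
        int idx = (home + step) % n;
        if (table[idx] == EMPTY) return EMPTY;
        if (table[idx] == num) return idx;
    }
    return EMPTY;
}
```

Inserting 131, 21, 31, 4, 5 and 61 into a table of size 10 places them at indices 1, 2, 3, 4, 5 and 6, matching the discussion of Fig. 1.4.3.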
Problem with linear probing
One major problem with linear probing is primary clustering. Primary clustering is a
process in which a block of data is formed in the hash table when collision is resolved.
For example :
19 % 10 = 9
18 % 10 = 8
39 % 10 = 9
29 % 10 = 9
8 % 10 = 8
Index  Data
0      39
1      29
2      8
3
4
5
6
7
8      18
9      19
The occupied locations 8, 9, 0, 1 and 2 form a cluster.
This clustering problem can be solved by quadratic probing.
2. Quadratic probing
Quadratic probing operates by taking the original hash value and adding successive
values of an arbitrary quadratic polynomial to the starting value. This method uses
following formula -
$H_i(key) = (Hash(key) + i^2) \% m$
where m can be a table size or any prime number.
For example : If we have to insert following elements in the hash table with table
size 10 :
37, 90, 55, 22, 11, 17, 49, 87.
We will fill the hash table step by step
37 % 10 = 7
90 % 10 = 0
55 % 10 = 5
22 % 10 = 2
11 % 10 = 1
Index  Data
0      90
1      11
2      22
3
4
5      55
6
7      37
8
9
Now if we want to place 17 a collision will occur as 17 % 10 = 7
and bucket 7 has already an element 37. Hence we will apply
quadratic probing to insert this record in the hash table.
$H_i(key) = (Hash(key) + i^2) \% m$
We will choose value i = 0, 1, 2, ... whichever is applicable.
Consider i = 0, then
(17 + 0^2) % 10 = 7
(17 + 1^2) % 10 = 8, when i = 1
The bucket 8 is empty hence we will place the element at index 8.
Then comes 49 which will be placed at index 9.
49 % 10 = 9
Now to place 87 we will use quadratic probing.
(87 + 0) % 10 = 7 ... already occupied
(87 + 1) % 10 = 8 ... already occupied
(87 + 2^2) % 10 = 1 ... already occupied
(87 + 3^2) % 10 = 6 ... this slot is free. We place 87 at index 6.
It is observed that if we want to place all the necessary elements
in the hash table the size of divisor (m) should be twice as large as
total number of elements.
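The quadratic probing formula can be sketched in C++ as below. The function name `quadraticProbeInsert` and the -1 empty marker are assumptions; the probe sequence is (Hash(key) + i^2) % m for i = 0, 1, 2, ..., with Hash(key) = key % m.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;  // marks an unused cell

// Insert key by quadratic probing; returns the index used,
// or EMPTY if no free cell is met within m probes.
int quadraticProbeInsert(std::vector<int>& table, int key) {
    assert(key >= 0 && !table.empty());
    const int m = static_cast<int>(table.size());
    const int home = key % m;
    for (int i = 0; i < m; ++i) {
        int idx = (home + i * i) % m;      // H_i(key) = (Hash(key) + i^2) % m
        if (table[idx] == EMPTY) {
            table[idx] = key;
            return idx;
        }
    }
    return EMPTY;
}
```

Inserting 37, 90, 55, 22, 11, 17, 49 and 87 into a table of size 10 yields the indices 7, 0, 5, 2, 1, 8, 9 and 6, as worked out above. Unlike linear probing, the loop can fail even when free cells remain, which is why the text recommends keeping the table large relative to the number of elements.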
3. Double hashing
Double hashing is a technique in which a second hash function is
applied to the key when a collision occurs. By applying the second
hash function we will get the number of positions from the point of
collision to insert.
There are two important rules to be followed for the second function :
* It must never evaluate to zero.
* Must make sure that all cells can be probed.
The formula to be used for double hashing is
$H_1(key) = key \bmod tablesize$
$H_2(key) = M - (key \bmod M)$
where M is a prime number smaller than the size of the table.
Consider the following elements to be placed in the hash table of size 10
37, 90, 45, 22, 17, 49, 55
Initially insert the elements using the formula for Hy(key).
Insert 37, 90, 45, 22.
0| 90
H,(37) = 37%10 = 7 1
H,(90) = 90%10 = 0 al 22
H4(45) = 45%10 = 5 3
Hy(22) = 22% 10 = 2 4
Hy(49) = 49% 10 = 9 5|_45
6
7|_37
8
9|_49
Now if 17 is to be inserted then
H1(17) = 17 % 10 = 7 ... collision
H2(key) = M - (key % M)
Here M is a prime number smaller than the size of the table. The prime number smaller than table size 10 is 7.
Hence M = 7
H2(17) = 7 - (17 % 7) = 7 - 3 = 4
That means we have to insert the element 17 at 4 places from 37. In short we have to take 4 jumps. Therefore 17 will be placed at index 1.
Fig. 1.4.6
Now to insert number 55.
H1(55) = 55 % 10 = 5 ... collision
H2(55) = 7 - (55 % 7) = 7 - 6 = 1
That means we have to take one jump from index 5 to place 55, i.e. 55 is placed at index 6. Finally the hash table will be -
Index   Key
0       90
1       17
2       22
3
4
5       45
6       55
7       37
8
9       49
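The double hashing steps above can be sketched in C++ as below. It is a minimal illustration under stated assumptions: integer keys, -1 as the empty marker, and the hypothetical names EMPTY and insertDoubleHash. The step size is H2(key) = M - (key % M), exactly as in the worked example.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;  // assumed marker for an unused bucket

// Inserts key with H1(key) = key % size and jump length H2(key) = M - (key % M).
// Returns the index used, or -1 when the probe sequence finds no free bucket.
int insertDoubleHash(std::vector<int>& table, int key, int M)
{
    int size = static_cast<int>(table.size());
    assert(size > 0 && M > 0 && M < size);
    int index = key % size;
    int step = M - (key % M);            // always between 1 and M, never zero
    for (int probes = 0; probes < size; ++probes) {
        if (table[index] == EMPTY) {
            table[index] = key;
            return index;
        }
        index = (index + step) % size;   // take "step" jumps from the collision
    }
    return -1;
}
```

With a table of size 10 and M = 7, inserting 37, 90, 45, 22, 49, 17 and 55 in this order puts 17 at index 1 and 55 at index 6, which agrees with the final table of the example.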
Comparison of quadratic probing and double hashing
Double hashing requires a second hash function. Its probing efficiency is about the same as that of a scheme which resolves collisions by probing at random.
Double hashing is more complex to implement than quadratic probing. Quadratic probing is a faster technique than double hashing.
4. Rehashing
Rehashing is a technique in which the table is resized, i.e., the size of table is doubled by creating a new table. It is preferable if the total size of table is a prime number.
There are situations in which the rehashing is required -
* When table is completely full.
* With quadratic probing when the table is filled half.
* When insertions fail due to overflow.
In such situations, we have to transfer entries from old table to the new table by recomputing their positions using suitable hash functions.
Fig. 1.4.7 Rehashing (transferring the contents of the old table to the new table)
Consider we have to insert the elements 37, 90, 55, 22, 17, 49 and 87. The table size is 10 and we will use the hash function,
H(key) = key mod tablesize
37 % 10 = 7
90 % 10 = 0
55 % 10 = 5
22 % 10 = 2
17 % 10 = 7 ... collision solved by linear probing, 17 is placed at index 8
49 % 10 = 9
Now this table is almost full and if we try to insert more elements collisions will occur and eventually further insertions will fail. Hence we will rehash by doubling the table size. The old table size is 10, so doubling it gives a new size of 20. But 20 is not a prime number, so we will prefer to make the table size 23.
And the new hash function will be
H(key) = key mod 23
37%23 = 14
90%23 = 21
55%23 = 9
22%23 = 22
17%23 = 17
49%23 = 3
87%23 = 18
Now the hash table is sufficiently large to accommodate new insertions.
Index   Key
0
1
2
3       49
4
5
6
7
8
9       55
10
11
12
13
14      37
15
16
17      17
18      87
19
20
21      90
22      22
Advantages
1. This technique provides the programmer the flexibility to enlarge the table size if required.
2. Only the space gets doubled; the hash function remains simple, and the larger table reduces the occurrence of collisions.
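The transfer of entries from the old table to the new table can be sketched in C++ as below. This is a minimal illustration, assuming integer keys, -1 as the empty marker and linear probing for collisions; the names EMPTY, insertLinear and rehash are hypothetical, and the caller is assumed to pick the new (preferably prime) size.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;  // assumed marker for an unused bucket

// Linear probing insert with H(key) = key % table.size().
int insertLinear(std::vector<int>& table, int key)
{
    int size = static_cast<int>(table.size());
    for (int i = 0; i < size; ++i) {
        int index = (key % size + i) % size;
        if (table[index] == EMPTY) {
            table[index] = key;
            return index;
        }
    }
    return -1;  // table is full
}

// Builds a new table with newSize buckets and recomputes every key's position.
std::vector<int> rehash(const std::vector<int>& oldTable, int newSize)
{
    assert(newSize > static_cast<int>(oldTable.size()));
    std::vector<int> newTable(newSize, EMPTY);
    for (int key : oldTable)
        if (key != EMPTY)
            insertLinear(newTable, key);
    return newTable;
}
```

For the keys 37, 90, 55, 22, 17, 49 and 87 stored in a table of size 10, calling rehash with newSize = 23 places them at indices 14, 21, 9, 22, 17, 3 and 18, the same positions as computed by key mod 23 in the example.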
Give the input {4371, 1323, 6173, 4199, 4344, 9679, 1989} and hash function h(X) = X (mod 10), show the results for the following :
i) Open addressing hash table using linear probing
ii) Open addressing hash table using quadratic probing
iii) Open addressing hash table with second hash function h2(X) = 7 - (X mod 7).
Solution : i) Open addressing hash table using linear probing :
We assume mod function as mod 10.
4371 mod 10 = 1
1323 mod 10 = 3
6173 mod 10 = 3 collision occurs
Hence by linear probing we will place 6173 at the next empty location, that is, at location 4.
4199 mod 10 = 9
4344 mod 10 = 4 but location 4 is not empty.
Hence we will place 4344 at the next empty location, i.e. 5.
9679 mod 10 = 9 collision occurs, so place it at the next empty location, which is 0. The hash table is of size 10, hence we find the next empty location by wrapping around to the start of the table.
1989 mod 10 = 9 collision occurs, so we find the next empty location at index 2.
The hash table will then be
Index   Keys
0       9679
1       4371
2       1989
3       1323
4       6173
5       4344
6
7
8
9       4199
ii) Open addressing hash table using quadratic probing
In quadratic probing we consider the original hash key and then add an arbitrary polynomial. This sum is then considered for hash function. The hash function will be
H(Key) = (Key + i^2) % m
where m can be a table size.
If we assume m = 10, then the numbers can be inserted as follows -
Step 1 :
4371 mod 10 = 1. Place 4371 at index 1.
Step 2 :
1323 mod 10 = 3. Place 1323 at index 3.
Step 3 :
6173 mod 10 = 3
As collision occurs, we will apply quadratic probing.
H(key) = (H(key) + i^2) % m
Consider i = 0
H(6173) = (3 + 0^2) % 10 = 3 ... collision
Hence consider i = 1
H(6173) = (3 + 1^2) % 10 = 4
As index 4 is an empty slot, we will place 6173 at index 4.
Step 4 :
4199 mod 10 = 9
As this slot is empty we will place 4199 at index 9.
Step 5 :
4344 mod 10 = 4
But index 4 shows an occupied slot. Hence collision occurs. Therefore we will place 4344 using quadratic probing,
H(key) = (H(key) + i^2) % m
H(key) = (4 + 0^2) % 10 = 4 ... collision
H(key) = (4 + 1^2) % 10 = 5
The index 5 is an empty slot. Hence we will place 4344 at index 5.
Step 6 :
9679 mod 10 = 9 collision occurs. Hence we will place the element using quadratic probing,
H(key) = (H(key) + i^2) % m
H(key) = 9, i will be 0 or 1 or 2 ..., m = 10
H(key) = (9 + 0^2) % 10 = 9 ... collision (i = 0)
H(key) = (9 + 1^2) % 10 = 0 (i = 1)
Place 9679 at index 0.
Step 7 :
1989 mod 10 = 9 collision occurs. Hence we will place the element using quadratic probing,
H(key) = (H(key) + i^2) % m
H(key) = 9, hence
H(key) = (9 + 0^2) % 10 = 9 ... collision (i = 0)
H(key) = (9 + 1^2) % 10 = 0 ... collision (i = 1)
H(key) = (9 + 2^2) % 10 = 3 ... collision (i = 2)
H(key) = (9 + 3^2) % 10 = 8 (i = 3)
Insert 1989 at index 8.
Index   Keys
0       9679
1       4371
2
3       1323
4       6173
5       4344
6
7
8       1989
9       4199
iii) Open addressing hash table with second hash function
Step 1 :
4371 mod 10 = 1. Place 4371 at index 1.
1323 mod 10 = 3. Place 1323 at index 3.
Step 2 :
6173 mod 10 = 3 collision occurs. Hence we will apply second hash function,
h2(X) = 7 - (X mod 7)
h2(6173) = 7 - (6173 mod 7)
= 7 - 6
= 1
That means we have to take 1 jump from the index 3 (the place at which collision occurs). Hence we will place 6173 at index 4.
Step 3 :
4199 mod 10 = 9. As this slot is empty, we will place 4199 at index 9.
Step 4 :
4344 mod 10 = 4 collision occurs. Hence
h2(4344) = 7 - (4344 mod 7)
= 7 - 4
= 3
i.e. take 3 jumps from index 4, i.e. place 4344 at index 7.
Similarly element 9679 will be placed at index 5 using second hash function.
There is no place for 1989. Here 1989 mod 10 = 9 and h2(1989) = 7 - (1989 mod 7) = 6, so the probes visit indices 9, 5, 1, 7, 3 and then return to 9, and all of these slots are occupied.
Index   Key
0
1       4371
2
3       1323
4       6173
5       9679
6
7       4344
8
9       4199
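Why 1989 finds no place can be checked by listing the slots that double hashing actually visits. The following C++ sketch is only an illustration; the function name probeSequence is hypothetical, and it assumes the same rule as the example: start at key % size and jump by M - (key % M) each time.

```cpp
#include <cassert>
#include <vector>

// Lists the distinct indices visited by double hashing for one key,
// starting at key % size and jumping by M - (key % M) until a slot repeats.
std::vector<int> probeSequence(int key, int size, int M)
{
    assert(size > 0 && M > 0);
    std::vector<int> visited;
    std::vector<bool> seen(size, false);
    int index = key % size;
    int step = M - (key % M);
    while (!seen[index]) {
        seen[index] = true;
        visited.push_back(index);
        index = (index + step) % size;
    }
    return visited;
}
```

For key 1989 with size 10 and M = 7 the jump length is 6, and the sequence contains only the five indices 9, 5, 1, 7 and 3. Because the jump length 6 and the table size 10 share the factor 2, half of the table is never probed, which is one reason a prime table size is preferred.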
Explain linear probing with and without replacement using the following data : 12, 01, 04, 03, 07, 08, 10, 02, 05, 14, 06, 28
Assume buckets from 0 to 9 and each bucket has one slot. Calculate average cost/number of comparisons for both.
Marks 10
Solution : We assume hash function = key mod 10. As buckets are from 0 to 9 and each bucket has one slot, the keys can be inserted in the hash table using our hash function as follows -
12 % 10 = 2
01 % 10 = 1
04 % 10 = 4
03 % 10 = 3
07 % 10 = 7
08 % 10 = 8
10 % 10 = 0
02 % 10 = 2 collision occurs. By linear probing place 02 at index 5.
05 % 10 = 5 collision occurs. Place 05 at index 6.
14 % 10 = 4 collision occurs. Place 14 at index 9.
Index   Key
0       10
1       01
2       12
3       03
4       04
5       02
6       05
7       07
8       08
9       14
Now as the hash table is full we cannot insert the keys 06 and 28 in the hash table. Now if we want to search any key then following analysis is made -
Key                     12  01  04  03  07  08  10  02  05  14  06  28
Number of comparisons    1   1   1   1   1   1   1   4   2   6  10  10
made for each key
Thus total number of comparisons made
= 1 + 1 + 1 + 1 + 1 + 1 + 1 + 4 + 2 + 6 + 10 + 10
= 39 buckets get examined.
That means average number of buckets that get examined per key
= Total number of comparisons / Number of keys
= 39 / 12
= 3.25
We must find the loading density, which is denoted by α.
α = n / b = Number of keys / Total number of buckets = 12 / 10 = 1.2
The average number of key comparisons = Total number of comparisons / Total number of buckets
The average cost can be computed by calculating successful search (denoted by Sn) and unsuccessful search (denoted by Un).
Sn = 1/2 (1 + 1/(1 - α))
   = 1/2 (1 + 1/(1 - 1.2))
   = 1/2 (1 + 1/(-0.2))
Sn = -2
Un = 1/2 (1 + 1/(1 - α)^2)
   = 1/2 (1 + 1/(1 - 1.2)^2)
   = 13
Note that these formulas are meaningful only when α is less than 1. Here α exceeds 1 because two of the twelve keys could not be stored, which is why Sn comes out negative.
Total cost =
n
Total cost
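The comparison counts used in this solution can be reproduced with a short C++ sketch. It assumes the final table of the example (keys stored as integers, -1 for an empty bucket) and counts every bucket examined while searching with linear probing; the names EMPTY and comparisons are hypothetical.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;  // assumed marker for an unused bucket

// Counts the buckets examined while searching for key with linear probing.
// The search stops at the key, at an empty bucket, or after one full cycle.
int comparisons(const std::vector<int>& table, int key)
{
    int size = static_cast<int>(table.size());
    assert(size > 0);
    int count = 0;
    for (int i = 0; i < size; ++i) {
        int index = (key % size + i) % size;
        ++count;
        if (table[index] == key || table[index] == EMPTY)
            break;
    }
    return count;
}
```

On the table 10, 01, 12, 03, 04, 02, 05, 07, 08, 14 (indices 0 to 9), the key 02 needs 4 comparisons, 14 needs 6, and the absent keys 06 and 28 need 10 each, so the twelve searches examine 39 buckets in total.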
What do you understand by collision in hashing ? Represent the following keys in memory using linear probing with or without replacement. Use modulo (10) as your hashing function : (24, 13, 16, 15, 19, 20, 22, 14, 17, 26, 84, 96)
SPPU : De
Solution : Collision in hashing - Refer section 1.4.
i) Linear probing with replacement
We will consider the hash function as modulo (10) for the hash table size 12, that is from 0 to 11. In linear probing with replacement, we first find the probable position of the key element using hash function. If the location which we obtain from hash function is empty then place the corresponding key element at that location. If the location is not empty and the key element which is present at that location belongs to that location only then, move down in search of empty slot. Place the record at the empty slot.
If the location contains a record which does not belong to that location then replace that record by the current key element. Place the replaced record at some empty slot, which can be obtained by moving linearly down.
Insert 24, 13, 16, 15, 19, 20 and 22 at their own locations. Then 14 % 10 = 4. The index 4 contains the element 24. This is a proper record at its place. Hence to place 14 we have to move down in search of an empty slot. We get the empty location at index 7. Hence 14 is placed at index 7.
Index   Key
0       20
1
2       22
3       13
4       24
5       15
6       16
7       14
8
9       19
10
11
17 % 10 = 7. This position was occupied by 14. But 14 is not the proper record at this place. Hence replace 14 by 17. Then place 14 at the next empty slot, i.e. index 8.
26 % 10 = 6. But 16 is a proper record at this place. Hence 26 is placed at the next empty location, i.e. index 10.
Index   Key
0       20
1
2       22
3       13
4       24
5       15
6       16
7       17
8       14
9       19
10      26
11
84 % 10 = 4. But 24 is occupying its own location. Then by moving linearly down we can place 84 at the empty location found, i.e. index 11.
96 % 10 = 6. But index 6 holds a record 16 which is correct for that location. By moving down linearly we get no empty slot. Hence we roll back and get the empty slot at index 1. Hence 96 will be placed at index 1.
Index   Key
0       20
1       96
2       22
3       13
4       24
5       15
6       16
7       17
8       14
9       19
10      26
11      84
ii) Linear probing without replacement
Insert 24, 13, 16, 15, 19, 20 and 22 at their own locations.
14 % 10 = 4. Collision occurs at index 4. Hence 14 is placed by probing at the next empty slot, i.e. index 7.
17 % 10 = 7. At index 7, the 14 is already placed. Hence 17 is placed at the next empty slot, i.e. index 8.
26 % 10 = 6. The collision occurs. Hence 26 is placed at the next empty slot, i.e. index 10.
84 % 10 = 4. But index 4 contains key element 24. Hence by linear probing, the element 84 is placed at the empty slot 11.
96 % 10 = 6. But at location 6, the element 16 is placed. Hence we go linearly down in search of an empty slot. But since the table gets full, we may not get an empty slot. Therefore roll back to search an empty slot. At index 1, we can then place 96.
Index   Key
0       20
1       96
2       22
3       13
4       24
5       15
6       16
7       14
8       17
9       19
10      26
11      84
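The difference between the two strategies lies only in what happens when the home location holds a record that does not belong there. A minimal C++ sketch of linear probing with replacement is given below, assuming integer keys, -1 as the empty marker and a hash of key % mod where mod may be smaller than the table size (as in this example, modulo 10 with 12 slots). The names EMPTY, nextEmpty and insertWithReplacement are hypothetical.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;  // assumed marker for an unused slot

// Finds the first empty slot at or after start, wrapping around; -1 if full.
int nextEmpty(const std::vector<int>& table, int start)
{
    int size = static_cast<int>(table.size());
    for (int i = 0; i < size; ++i) {
        int index = (start + i) % size;
        if (table[index] == EMPTY)
            return index;
    }
    return -1;
}

// Linear probing with replacement, using the hash function key % mod.
bool insertWithReplacement(std::vector<int>& table, int key, int mod)
{
    assert(mod > 0 && mod <= static_cast<int>(table.size()));
    int home = key % mod;
    int slot = nextEmpty(table, home);
    if (slot == -1)
        return false;                    // no room left in the table
    if (slot != home && table[home] % mod != home) {
        table[slot] = table[home];       // move the record that does not belong here
        table[home] = key;               // the new key takes its own location
    } else {
        table[slot] = key;               // home is free, or its occupant belongs there
    }
    return true;
}
```

Inserting 24, 13, 16, 15, 19, 20, 22, 14, 17, 26, 84 and 96 into 12 slots with mod = 10 gives 17 at index 7 and 14 at index 8, as in the table for linear probing with replacement.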
Assume the size of hash table as 8. The hash function to be used to calculate the hash value of the data X is X % 8. Insert the following values in hash table : 10, 12, 20, 18, 15. Use linear probing without replacement for handling collision.
Solution : Table size = 8
Hash function is X % 8
We will handle collision of the elements using linear probing, without replacement.
10 % 8 = 2
12 % 8 = 4
20 % 8 = 4 - collision occurs, place at location 5.
18 % 8 = 2 - collision occurs, place at location 3.
15 % 8 = 7
Index   Key
0
1
2       10
3       18
4       12
5       20
6
7       15
Fig. 1.4.8 Hash table
Construct hash table of size 10 using linear probing with replacement strategy for collision resolution. The hash function is h(x) = x % 10. Calculate total numbers of comparisons required for searching. Consider slot per bucket is 1.
25, 3, 21, 13, 1, 2, 7, 12, 4, 8
Solution :
25 % 10 = 5
3 % 10 = 3
21 % 10 = 1
13 % 10 = 3 Collision. Hence by linear probing we will insert 13 at index 4.
1 % 10 = 1 Collision. Insert 1 at index 2.
2 % 10 = 2. But index 2 is not empty, it is occupied by 1. Replace 1 by 2, place 1 at next empty location i.e. 6.
7 % 10 = 7
12 % 10 = 2 Again collision occurs. Hence place 12 at next empty location i.e. 8.
4 % 10 = 4. But index 4 is occupied by 13. We will replace it by 4 and probe 13 at next empty location i.e. 9.
8 % 10 = 8. But index 8 is already occupied by element 12. Hence replace 12 by 8. Then probe 12 at next empty location i.e. 0.
Index   Key
0       12
1       21
2       2
3       3
4       4
5       25
6       1
7       7
8       8
9       13
For the given set of values : 11, 33, 20, 88, 79, 98, 44, 68, 66, 22.
Create a hash table with size 10 and resolve collision using chaining with replacement and without replacement. Use the modulus Hash function. (key % size).
Solution : Chaining without replacement
Step 1 : Insert 11, 33, 20, 88, 79.
11 % 10 = 1
33 % 10 = 3
20 % 10 = 0
88 % 10 = 8
79 % 10 = 9
Index   Key   Chain
0       20    -1
1       11    -1
2
3       33    -1
4
5
6
7
8       88    -1
9       79    -1
Step 2 : Now insert 98. The hash key for 98 is 98 % 10 = 8. But position 8 is already occupied. So we search for next empty position. The position at index 2 is empty. Hence insert 98 at index 2, adjust chain table of 88 key.
Then insert 44.
Step 3 : Insert 68 at position index 5. Adjust chain table. Insert 66 at index 6. Now insert 22 at index 7. The keys are finally placed as follows :
Index   Key
0       20
1       11
2       98
3       33
4       44
5       68
6       66
7       22
8       88
9       79
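Chaining without replacement can be sketched in C++ as below. This is an illustration under assumptions that the text does not spell out: each slot stores a key and the index of the next record in its chain (-1 for none), a colliding key goes to the next empty slot found by linear probing, and it is linked to the end of the chain that starts at its home slot. The names EMPTY, Slot and insertChained are hypothetical.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;  // assumed marker for an unused slot

struct Slot {
    int key;    // stored key, or EMPTY
    int chain;  // index of the next record in the chain, -1 if none
};

// Chaining without replacement. Returns the index used, or -1 if the table is full.
int insertChained(std::vector<Slot>& table, int key)
{
    int size = static_cast<int>(table.size());
    assert(size > 0);
    int home = key % size;
    if (table[home].key == EMPTY) {
        table[home].key = key;
        return home;
    }
    int slot = -1;                       // next empty slot after the home location
    for (int i = 1; i < size && slot == -1; ++i) {
        int index = (home + i) % size;
        if (table[index].key == EMPTY)
            slot = index;
    }
    if (slot == -1)
        return -1;
    int tail = home;                     // walk to the end of the chain from home
    while (table[tail].chain != -1)
        tail = table[tail].chain;
    table[slot].key = key;
    table[tail].chain = slot;            // adjust the chain table
    return slot;
}
```

For the keys 11, 33, 20, 88, 79, 98, 44, 68, 66 and 22 this places 98 at index 2 (chain of 88 set to 2), 68 at index 5 (chain of 98 set to 5) and 22 at index 7, matching the positions in the solution.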
Chaining with replacement :
In this technique we place the actual element at the position to which it belongs. If the position of the element is occupied by an element which does not belong to that location then we make a replacement.
Step 1 : Insert 11, 33, 20, 88, 79.
Index   Key   Chain
0       20    -1
1       11    -1
2
3       33    -1
4
5
6
7
8       88    -1
9       79    -1
Step 2 : Now to insert 98, the collision occurs. 98 % 10 = 8. At index 8 we have stored element 88 which deserves that place, hence we cannot replace 88 by 98. Hence by linear probing, we can insert 98 at location 2. Then insert 44, 68 and 66. Adjust the chain table accordingly.
Index   Key   Chain
0       20    -1
1       11    -1
2       98    5
3       33    -1
4       44    -1
5       68    -1
6       66    -1
7
8       88    2
9       79    -1
Step 3 : Now to insert 22 we get 22 % 10 = 2. At the index 2 element 98 is stored which does not belong to location 2. Hence replace it by 22 and adjust chain table appropriately.
Index   Key   Chain
0       20    -1
1       11    -1
2       22    -1
3       33    -1
4       44    -1
5       68    -1
6       66    -1
7       98    5
8       88    7
9       79    -1
Construct hash table of size 10 using linear probing without replacement strategy for collision resolution. The hash function is h(x) = x % 10. Consider slot per bucket is 1.
31, 3, 4, 21, 61, 6, 71, 8, 9, 25
Solution : The hash table after inserting elements using linear probing without replacement.
Index   Key
0       25
1       31
2       21
3       3
4       4
5       61
6       6
7       71
8       8
9       9
For the given set of values 35, 36, 25, 47, 2501, 129, 65, 29, 16, 14, 99.
Create a hash table with size 15 and resolve collision using open addressing techniques.
Solution : Table size = 15
Assume hash function is element value % 15. We use open addressing using linear probing.
35 % 15 = 5
36 % 15 = 6
25 % 15 = 10
47 % 15 = 2
2501 % 15 = 11
129 % 15 = 9
65 % 15 = 5 - Collision occurs. Place 65 at index 7 by linear probing.
29 % 15 = 14
16 % 15 = 1
14 % 15 = 14 - Collision occurs. Store 14 at index 0.
99 % 15 = 9 - Collision occurs. Store 99 at index 12.
Index   Key
0       14
1       16
2       47
3
4
5       35
6       36
7       65
8
9       129
10      25
11      2501
12      99
13
14      29
Review Questions
1. What is probing in a hash table ? What is linear probing ? How does it differ from quadratic probing ? Explain with suitable example.
2. What is collision ? What are different collision resolution techniques ? Explain any two methods in detail. SPPU : Dec.-10, 11, M 14, Marks 8
1.5 Extensible Hashing
* Extensible hashing is a technique which handles a large amount of data. The data to be placed in the hash table is selected by extracting a certain number of bits.
* Extensible hashing grows and shrinks similar to B-trees.
* In extensible hashing, the elements are placed in buckets by referring to the size of the directory. The levels are indicated in parentheses.
For example :
Fig. 1.5.1 (a directory whose entries have level (1), each pointing to a bucket that holds the data to be placed)
* The bucket can hold the data of its global depth. If data in bucket is more than global depth then, split the bucket and double the directory.
Insertion Operation :
Consider an example for understanding the insertion operation.
Consider we have to insert 1, 4, 5, 7, 8, 10. Assume each page can hold 2 data entries (2 is the depth).
Step 1 : Insert 1, 4.
1 = 001
4 = 100
We will examine last bit of data and insert the data in bucket.
Fig. 1.5.2
Insert 5. The bucket is full. Hence double the directory.
1 = 001
4 = 100
5 = 101
Based on last bit the data is inserted.
Directory   Level   Bucket
0           (1)     100
1           (1)     001, 101
Fig. 1.5.3
Step 2 : Insert 7, i.e. 111.
The bucket for last bit 1 already holds two entries, so we can not insert 7 here. Then double the directory and split the bucket. After insertion of 7, consider the last two bits.
Directory   Bucket
00          100
01          001, 101
10
11          111
Fig. 1.5.4
Step 3 : Insert 8 i.e. 1000.
Directory   Bucket
00          100, 1000
01          001, 101
10
11          111
Fig. 1.5.5
Step 4 : Insert 10 i.e. 1010.
Directory   Bucket
00          100, 1000
01          001, 101
10          1010
11          111
Fig. 1.5.6
Thus the data is inserted using extensible hashing.
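Choosing the directory entry by "the last bit" or "the last two bits" is a bit-masking operation. The following C++ sketch shows only this step, not bucket splitting or directory doubling; the function name directoryIndex is hypothetical.

```cpp
#include <cassert>

// Returns the directory entry for key when the directory is indexed by the
// last `depth` bits of the key (depth 1 gives entries 0..1, depth 2 gives 0..3).
unsigned directoryIndex(unsigned key, unsigned depth)
{
    assert(depth < 32);
    return key & ((1u << depth) - 1u);   // keep only the lowest `depth` bits
}
```

With depth 1 the keys 4 (100) and 5 (101) go to entries 0 and 1. With depth 2 the keys 7 (111), 8 (1000) and 10 (1010) go to entries 3 (binary 11), 0 (binary 00) and 2 (binary 10), as in Steps 2 to 4.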
Deletion operation
If we want to delete 10 then, simply make the bucket of 10 empty.
Directory   Bucket
00          100, 1000
01          001, 101
10
11          111
Fig. 1.5.7
Delete 7.
Note that the level was increased when we inserted 7. Now on deletion of 7, the level should get decremented.
Fig. 1.5.8
Delete 8. Remove entry from directory 00.
Fig. 1.5.9 (directory with levels (1) and (1); buckets 100 and 001, 101)
1.6 Applications of Hashing
1. In compilers to keep track of declared variables.
2. For online spelling checking the hashing functions are used.
3. Hashing helps in game playing programs to store the moves made.
4. For browser programs, hashing is used while caching the web pages.
1.7 Skip List
* A skip list is a variant of the linked list.
* Skip lists are made up of a series of nodes connected one after the other. Each node contains a key and value pair as well as one or more references, or pointers, to nodes further along in the list.
* The number of references each node contains is determined randomly. The number of references a node contains is called its node level.
* There are two special nodes in the skip list : the head node, which is the starting node of the list, and the tail node, which is the last node of the list.
* The skip list is an efficient implementation of dictionary using sorted chain. This is because in skip list each node consists of forward references of more than one node at a time. Following Fig. 1.7.1 shows the skip list.
Fig. 1.7.1 Skip list (head node, nodes 1, 2, 3 and 6, and tail node NIL)
Consider a sorted chain as given below -
Fig. 1.7.2 (sorted chain from head to tail)
Now to search any node from above given sorted chain we have to search the sorted chain from head node by visiting each node. But this searching time can be reduced if we add one level in every alternate node. This extra level contains the forward pointer of some node. That means in the sorted chain some nodes can hold pointers to more than one node. For example,
Fig. 1.7.3
If we want to search node 50 from above chain then we will require comparatively less time. This search again can be made efficient if we add few more pointers of forward references. For example,
Fig. 1.7.4 Skip list
In the above sorted chain, one more level of chain is added to every second node. For instance node 20 has two next chains, one pointing to 30 and another pointing to node 40. This makes searching more efficient. In skip list the hierarchy of chains is maintained. The level 0 is a sorted chain of all pairs. The level i chain consists of a subset of the pairs in the level i-1 chain.
Node structure of skip list
The head node contains maximum number of level chains of the skip list whereas the tail node contains simply NULL value and no pointer. Each node in the skip list consists of pair of key and value given by element and a next pointer which is basically an array of pointers.
template <class K, class E>
struct skipNode
{
    // pair is assumed to be a structure with members key and value
    typedef pair<K, E> pair_type;
    pair_type element;
    skipNode<K, E> **next;    // array of forward pointers, one per level
    skipNode(const pair_type &New_pair, int MAX) : element(New_pair)
    {
        next = new skipNode<K, E>*[MAX];
    }
};
The individual node will look like this -
Fig. 1.7.5 (a node with the fields element and next)
Operations on Skip List
Various operations that can be performed on skip lists are
1. Searching a node from the skip list.
2. Insertion of a node into the skip list.
3. Deletion of a node from skip list.
Searching of a Node
When we search for the 'key' node then follow these 3 rules -
1) If key = current -> key then the node is found.
2) If key < next -> key, just go down one level.
3) If key > next -> key, go right along the link.
For example : Find 50 from the given skip list.
[Figure : skip list from header to tail]
We will start search from level 3.
1) Key = 50, next -> key = 60, just go down one level.
2) Now we are at level 2. Here next -> key = 40, go right along the link. But this link indicates next -> key = 60. Hence go down one more level.
3) Now we are at level 1. Here next -> key = 60. Hence go down one more level.
4) Now we are at level 0. The next -> key = key. Thus the node 50 is said to be present in the skip list.
The algorithm for the same is as given below -
skipNode<K, E>* search(const K& Key_val)
{
    // start at the header and move down from the highest level
    skipNode<K, E> *Forward_Node = header;
    for (int i = levels; i >= 0; i--)
    {
        // go right while the next key on level i is smaller than Key_val
        while (Forward_Node->next[i]->element.key < Key_val)
            Forward_Node = Forward_Node->next[i];
        last[i] = Forward_Node;    // last node visited on level i
    }
    return Forward_Node->next[0];
}
The searching of a node takes O(log n) expected time.
Insertion of a Node
* While inserting a new node in the skip list, it is necessary to find its appropriate location in the skip list. Note that after inserting a new node in the skip list, the sorted order needs to be maintained.
* The level of the new node is determined randomly.
For example : Consider following skip list in which we want to insert 55.
[Figure : skip list before insertion of 55]
[Figure : skip list after insertion of 55]
The algorithm for the same is as given below -
void insert(const pair<K, E>& New_pair)
{   // Insert New_pair in the skip list
    if (New_pair.key >= tailKey)
    {
        cout << "Key is Too Large";
        return;
    }
    // if a pair with the key value of the new node
    // is already present
    skipNode<K, E>* temp = search(New_pair.key);
    if (temp->element.key == New_pair.key)
    {   // update temp->element.value
        temp->element.value = New_pair.value;
        return;
    }
    // If the Key_val is not already present
    // then determine level for new node randomly
    int New_Level = randomlevel();
    // Adjust New_Level to be <= levels + 1
    if (New_Level > levels)
    {
        New_Level = ++levels;
        last[New_Level] = header;
    }
    // get and insert new node just after node temp
    // allocating memory for newNode
    skipNode<K, E> *newNode = new skipNode<K, E>(New_pair, New_Level + 1);
    for (int i = 0; i <= New_Level; i++)
    {
        // insert into level i chain
        newNode->next[i] = last[i]->next[i];
        last[i]->next[i] = newNode;
    }
    // number of pairs in the dictionary will be incremented by
    // 1, as one node is inserted in the dictionary
    len++;
}
Deletion of a Node
The deletion of a node works in two steps -
1) Search the node to be deleted from the skip list.
2) On obtaining the desired node, remove the node from the list and adjust the pointers.
[Figure : Before : The node 40 is to be deleted]
[Figure : After : The node 40 is deleted]
Fig. 1.7.6 Deletion operation
void delet(const K& Key_val)
{
    // Delete the pair, if any, whose key equals Key_val
    if (Key_val >= tailKey)    // too large
        return;
    // see if matching pair present
    skipNode<K, E>* temp = search(Key_val);
    // temp node is to be deleted
    if (temp->element.key != Key_val)    // node is not present
        return;
    // delete node from skip list
    for (int i = 0; i <= levels; i++)
    {
        if (last[i]->next[i] == temp)
            last[i]->next[i] = temp->next[i];
    }
    // update levels
    while (levels > 0 && header->next[levels] == tail)
        levels--;
    delete temp;    // free memory of the node to be deleted
    len--;          // the pair is removed from the dictionary
}
Features of Skip Lists
1) It is a randomized data structure.
2) This is a kind of linked list which works with levels.
3) It is an ordered linked list which is called a sorted chain.
4) The bottommost list of the skip list contains all the nodes.
5) The expected time complexity for insertion, deletion and search operation is O(log n).
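The search, insertion and deletion operations can be put together in one compact C++ sketch. This is not the class used in the listings of this section; it is a simplified illustration assuming integer keys without values, a null pointer in place of the tail node, and a fixed maximum of 8 levels. The names SkipList, MAX_LEVEL, findPredecessors, contains and erase are all hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>
#include <vector>

class SkipList {
    struct Node {
        int key;
        std::vector<Node*> next;         // next[i] is the successor on level i
        Node(int k, int levels) : key(k), next(levels, nullptr) {}
    };
    static constexpr int MAX_LEVEL = 8;  // assumed upper bound on levels
    Node* head;                          // header node, smaller than every key
    int level;                           // highest level currently in use

    static int randomLevel() {           // each extra level has probability 1/2
        int lvl = 0;
        while (lvl < MAX_LEVEL - 1 && std::rand() % 2 == 0) ++lvl;
        return lvl;
    }
    // Sets last[i] to the rightmost node on level i whose key is smaller than key.
    Node* findPredecessors(int key, std::vector<Node*>& last) const {
        Node* cur = head;
        for (int i = level; i >= 0; --i) {
            while (cur->next[i] && cur->next[i]->key < key) cur = cur->next[i];
            last[i] = cur;
        }
        return cur->next[0];             // candidate node, may be null
    }
public:
    SkipList() : head(new Node(INT_MIN, MAX_LEVEL)), level(0) {}
    ~SkipList() {
        for (Node* n = head; n; ) { Node* nx = n->next[0]; delete n; n = nx; }
    }
    SkipList(const SkipList&) = delete;
    SkipList& operator=(const SkipList&) = delete;

    bool contains(int key) const {
        std::vector<Node*> last(MAX_LEVEL, head);
        Node* n = findPredecessors(key, last);
        return n && n->key == key;
    }
    void insert(int key) {
        std::vector<Node*> last(MAX_LEVEL, head);
        Node* n = findPredecessors(key, last);
        if (n && n->key == key) return;  // key already present
        int lvl = randomLevel();
        assert(lvl < MAX_LEVEL);
        if (lvl > level) level = lvl;    // last[i] is the header on new levels
        Node* node = new Node(key, lvl + 1);
        for (int i = 0; i <= lvl; ++i) { // splice into every level 0..lvl
            node->next[i] = last[i]->next[i];
            last[i]->next[i] = node;
        }
    }
    bool erase(int key) {
        std::vector<Node*> last(MAX_LEVEL, head);
        Node* n = findPredecessors(key, last);
        if (!n || n->key != key) return false;
        for (int i = 0; i <= level && last[i]->next[i] == n; ++i)
            last[i]->next[i] = n->next[i];
        while (level > 0 && head->next[level] == nullptr) --level;
        delete n;
        return true;
    }
};
```

A typical use is to call insert for each key, test membership with contains, and remove a key with erase, which returns false when the key is absent. Because the level of each node is chosen by coin flips, the shape of the list differs from run to run, but the level 0 chain always holds all keys in sorted order.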
Comparison between Hashing and Skip Lists
Hashing
* This method is used to carry out dictionary operations using randomized processes.
* It is based on hash function.
* If the sorted data is given then hashing is not an effective method to implement dictionary.
* The space requirement in hashing is for hash table and a forward pointer is required per node.
* Hashing is a more efficient method than skip lists.
Skip List
* Skip lists are used to implement dictionary operations using randomized processes.
* It does not require hash function.
* The sorted data improves the performance of skip list.
* The forward pointers are required for every level of skip list.
* The skip lists are not that much efficient.
* Skip lists are more versatile than hash table.
* Worst case space requirement is larger for skip list than hashing.
Review Question
1. Explain about skip list with an example. Give applications of skip list.
Multiple Choice Questions
Q.1 In hashing a record is located using ______.
a) key    b) function
c) index    d) none of these
Q.2 One of the most commonly used method in building hash function is ______.
a) addition    b) subtraction
c) division    d) multiplication
Q.3 Collision occurs when the same hash value is obtained from ______.
a) more than one different keys    b) the equal keys
c) different hash functions    d) resizing hash tables
Q.4 Assuming that the hash function for a table works well and the size of the hash table is reasonably large compared to the number of items in the table, the expected (average) time needed to find an item in a hash table containing n items is ______.
a) O(1)    b) O(n)
c) O(log n)    d) O(n log n)
Q.5 Which of the following hashing technique has a potential to generate Θ(m^2) different probing sequences where m is the size of hash table ?
a) Linear probing    b) Double hashing
c) Chaining    d) All of the above
Q.6 Consider a hash table of size seven, with starting index zero and a hash function (3x + 4) mod 7. Assuming the hash table is initially empty, which of the following is the contents of the table when the sequence 1, 3, 8, 10 is inserted into the table using closed hashing ? Note that '_' denotes an empty location in the table.
b) 1, 8, 10, _, _, _, 3
d) 1, 10, 8, _, _, _, 3
Q.7 A hash table with 10 buckets with one slot per bucket is depicted in following diagram. Symbols S1 to S7 are initially entered using a hashing function with linear probing. Maximum number of comparisons needed in searching an item that is not present is ______.
[Diagram : hash table with buckets 0 to 9 holding the symbols S1 to S7]
Q.8 The advantage of chained hash table over open addressing scheme is ______.
a) deletion is easier    b) space used is less
c) worst case complexity of search operation is less
d) none of these
Q.9 One major problem of linear probing is ______.
a) table size    b) primary clustering
c) too many computations    d) none of these
Q.10 In following collision resolution technique the key causing collision is placed at first vacant position ______.
a) chaining    b) quadratic probing
c) double hashing    d) linear probing
Q.11 Rehashing is a technique in which ______.
a) second hash function is applied.
b) chains are used.
c) table size is changed
d) none of these
Q.12 Double hashing is a technique in which ______.
a) second hash function is applied.
b) chains are used.
c) table size is changed
d) none of these
Q.13 Requirement of additional data structure is the drawback of ______.
a) linear probing
b) quadratic probing
c) double hashing
d) chaining
Q.14 Adding the objects to the hash table for the list of elements which are already sorted give following result ______.
a) Placing of the elements in the table becomes time efficient
b) There is no need to resize the table
c) Placing of the elements in the table becomes time efficient
d) No effect
Q.15 A good hashing function must ______.
a) minimize collisions    b) be easy and quick to compute
c) distribute the keys evenly over the hash table
d) all of the above
Q.16 In which of the hashing function the arithmetic or logical function is applied on different field value to calculate hash address ?
a) Division method    b) Folding
c) Open hashing    d) Chaining
Q.17 Which of the following is a collision resolution method ?
a) Open addressing    b) Division method
c) Folding    d) All of the above
Q.18 The advantage of chained hash table over open addressing is ______.
a) space required is less    b) removal or deletion operation is simple
c) worst case complexity of search operation is less
d) none of these
Q.19 Following hashing technique is preferred when huge amount of data is available ______.
a) double hashing    b) rehashing
c) quadratic probing    d) extensible hashing
Q.20 In browser programs for caching the web pages ______ is used.
a) linked list    b) arrays or buffers
c) hashing    d) none of these
Q.21 Which of the following hashing technique makes use of bit prefix ?
a) Chaining    b) Linear probing
c) Quadratic probing    d) Extensible hashing
Q.22 What do you understand by collision resolution by open addressing ?
a) When collision happens we create a new memory location outside the existing table and use a chain to link to the new memory location.
b) When collision occurs we enlarge the hash table.
c) When collision occurs we look for an unoccupied memory location in the existing hash table.
d) We use an extra hash table to collect all the collided data.
Q.23 The time complexity for insertion, deletion, and searching operation of skip list is ______.
a) O(n)    b) O(n^2)
c) O(2^n)    d) O(log n)
Answer Keys for Multiple Choice Questions
Q.1  b    Q.2  c    Q.3  a
Q.4  a    Q.5  b    Q.6  b
Q.7  c    Q.8  a    Q.9  b
Q.10 d    Q.11 c    Q.12 a
Q.13 d    Q.14 d    Q.15 d
Q.16 b    Q.17 a    Q.18 b
Q.19 d    Q.20 c    Q.21 d
Q.22 c    Q.23 d
Explanations for Multiple Choice Questions :
Q.4 All the conditions of good hash functions are specified for the hash table. Hence
the position value returned by the hash function will be used and the desired record
can be retrieved. Thus within one hash function computation the record can be
found. Hence the time complexity is O(1).
Q.6 The hash function is (3x + 4) % 7.
The insertion of 1, 3, 8 and 10 is as follows :
x = 1 : (3 × 1 + 4) % 7 = 0
x = 3 : (3 × 3 + 4) % 7 = 6
x = 8 : (3 × 8 + 4) % 7 = 0
Collision occurs at index 0.
Place 8 at index 1.
x = 10 : (3 × 10 + 4) % 7 = 6
Collision occurs at index 6.
Hence we roll back to the start of the table and search for the next empty location.
Indices 0 and 1 are occupied, so the location is found at index 2.
The resulting hash table is :
Index   Key
0       1
1       8
2       10
3
4
5
6       3
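The insertion steps of Q.6 can be sketched in code. This is only a minimal illustration of linear probing, assuming non-negative keys; the names `EMPTY` and `insertLinear` are hypothetical and do not come from the text.

```cpp
#include <cassert>
#include <vector>

// Assumption: keys are non-negative, so -1 can mark an unused slot.
const int EMPTY = -1;

// Hypothetical helper: places key using h(x) = (3x + 4) % table size and
// linear probing. Returns the index used, or -1 if the table is full.
int insertLinear(std::vector<int>& table, int key) {
    int size = static_cast<int>(table.size());
    int home = (3 * key + 4) % size;          // hash address from Q.6
    for (int step = 0; step < size; ++step) {
        int slot = (home + step) % size;      // wrap around past the last index
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;                                // every slot is occupied
}
```

With a table of 7 empty slots, inserting 1, 3, 8 and 10 in that order places them at indices 0, 6, 1 and 2, which agrees with the table worked out above.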
Q.7 Consider searching of 88, we will search the locations 8 then 9, then 0, then 1
and finally 2.
Q.13 Insert the keys 5, 28, 19, 15, 30, 33, 12, 17, 10 into a hash table with collisions
resolved by chaining. The table has 9 slots. The hash function h(k) = k mod 9.
Q.14 Consider the hash table of size 11. The hash function h(i) = (2i + 5)%11. The
elements to be placed in this table are - 12, 44, 13, 88, 23, 94, 11, 39, 20, 16, 5. The
given hash function will return some index positions.
UNIT - II
Trees
Syllabus
Basic terminology, General tree and its representation, representation using sequential and linked
organization, Binary tree - properties, converting tree to binary tree, binary tree traversals (recursive
and non-recursive) - inorder, preorder, post order, depth first and breadth first, Operations on binary
tree. Huffman Tree (Concept and use), Binary Search Tree (BST). BST operations, Threaded binary
search tree - concepts, threading, insertion and deletion of nodes in inorder threaded binary search
tree, in order traversal of in-order threaded binary search tree.
Contents
2.1 Basic Terminology : May-10, Dec.-10
2.2 Properties of Binary Tree : May-06
2.3 Representation of Binary Tree : May-10, Dec.-11, 12, 13
2.4 General Tree and its Representation
2.5 Converting Tree into Binary Tree : Dec.-11, 13, May-12, 14, 19
2.6 Binary Tree Traversals : May-11, 14, Dec.-12, 17
2.7 Depth and Level Wise Traversals : May-07, Dec.-10
2.8 Operations on Binary Tree
2.9 Huffman's Tree
2.10 Binary Search Tree (BST)
2.11 BST Operations : May-07, 09, 11, 14, Dec.-08, 09, 10
2.12 BST as ADT : May-06, 09, 10, Dec.-06
2.13 Threaded Binary Tree : Dec.-19
2.14 Multiple Choice Questions
2.1 Basic Terminology
Definition of a Tree :
A tree is a finite set of one or more nodes such that -
i) Root node : It is a specially designated node.
ii) There are remaining n nodes which can be partitioned into disjoint sets $T_1, T_2,
T_3, \ldots, T_k$ where $T_1, T_2, T_3, \ldots, T_k$ are called subtrees of the root.
The concept of tree can be represented by following figure -
Fig. 2.1.1 Tree
Let us get introduced with some of the definitions or terms which are normally used.
Fig. 2.1.2 Binary tree
From Fig. 2.1.2,
1. Root
Root is a unique node in the tree to which further subtrees are attached. For above
given tree, node 10 is a root node.
2. Parent node
The node having further sub-branches is called parent node. In Fig. 2.1.3, 20 is parent
node of 40, 50 and 60.
Fig. 2.1.3 Binary tree representing parent node
3. Child nodes
The child nodes in above given tree are marked as shown below,
Fig. 2.1.4 Binary tree representing child node
4. Leaves
These are the terminal nodes of the tree.
For example
Fig. 2.1.5 Binary tree with leaf nodes
5. Degree of the node
The total number of sub-trees attached to that node is called the degree of a node.
For example
Fig. 2.1.6 Binary tree (the leaf nodes are marked as nodes with degree 0)
6. Degree of tree
The maximum degree in the tree is degree of tree.
Fig. 2.1.7 Binary tree (the node with maximum degree, i.e. 3, is marked; hence the degree of the tree is 3)
7. Level of the tree
The root node is always considered at level zero.
The adjacent nodes to root are supposed to be at level 1 and so on.
Node at level 0 : (10)
Nodes at level 1 : (20, 30)
Nodes at level 2 : (40, 50, 60, 70)
Nodes at level 3 : (80, 90)
Fig. 2.1.8 Binary tree in which levels are shown
8. Height of the tree
The maximum level is the height of the tree. In Fig. 2.1.8 the height of tree is 3.
Sometimes height of the tree is also called depth of tree.
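The terms degree and height can be checked with a small sketch. It assumes a linked node with `data`, `left` and `right` fields (the names `Node`, `height` and `degree` are illustrative only) and follows the convention used above that the root is at level 0.

```cpp
#include <algorithm>
#include <cassert>

// Illustrative node type; not taken from the text.
struct Node {
    int data;
    Node* left;
    Node* right;
};

// Height = maximum level, with the root at level 0.
// An empty tree is given height -1 so that a single node has height 0.
int height(const Node* root) {
    if (root == nullptr) return -1;
    return 1 + std::max(height(root->left), height(root->right));
}

// Degree of a node = number of sub-trees attached to it (0, 1 or 2 here).
int degree(const Node* n) {
    return (n->left ? 1 : 0) + (n->right ? 1 : 0);
}
```

For a tree with the levels listed for Fig. 2.1.8 (10; 20, 30; 40, 50, 60, 70; 80, 90), `height` returns 3 wherever the nodes 80 and 90 are attached at level 3, and `degree` returns 0 for every leaf.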
9. Predecessor
While displaying the tree, if some particular node occurs previous to some other
node then that node is called predecessor of the other node.
For example : While displaying the tree in Fig. 2.1.8 if we read node 20 first and then
if we read node 40, then 20 is a predecessor of 40.
10. Successor
Successor is a node which occurs next to some node.
For example : While displaying tree in Fig. 2.1.8 if we read node 60 after reading
node 20 then 60 is called successor of 20.
11. Internal and external nodes
Leaf node means a node having no child node. As leaf nodes are not having further
links, we call leaf nodes external nodes and non leaf nodes are called internal nodes.
Fig. 2.1.9 Representing internal and external nodes
12. Sibling
The nodes with common parent are called siblings or brothers.
For example
The nodes 40, 50 and 60 are siblings of each other.
Fig. 2.1.10
In this chapter we will deal with a special type of trees called binary trees. Let us
understand it.
University Questions
1. Write the concept of skewed binary tree.
2. What is binary tree ? How is it different than a basic tree ? Explain with figures.
2.2 Properties of Binary Tree
In the trees, we can have any number of child nodes to a parent node. But if we
impose a restriction on the number of child nodes, i.e. if we allow at the most two
children nodes, then that type of tree is called binary tree. Let us formally define binary
trees.
Definition of a binary tree : A binary tree is a finite set of nodes which is either empty
or consists of a root and two disjoint binary trees called the left sub-tree and right sub-tree.
The binary tree can be as shown below.
Fig. 2.2.1 Binary tree (showing the root node, a node with one child, and a node with no child i.e. a leaf node)
Fig. 2.2.2 Not a binary tree
Left and Right Skewed Trees
The tree in which each node is attached as a left child of its parent node is called left
skewed tree. The tree in which each node is attached as a right child of its parent node
is called right skewed tree.
Fig. 2.2.3 Skewed trees (left skewed tree and right skewed tree)
Properties of Binary Trees
Various properties of binary trees are -
i) For a binary tree, the maximum number of nodes at level L is $2^L$.
Proof :
Basis of induction : As we know, the root node is the unique node present at level 0.
That means, if L = 0, the maximum number of nodes is $2^L = 2^0 = 1$.
The binary tree is a tree in which each node has at most two children. Hence if level
L = 1 then the maximum number of nodes is $2^L = 2^1 = 2$.
(Figure : when L = 0 the maximum number of nodes is 1; when L = 1 the maximum number of nodes is two, i.e. $n_1$ and $n_2$.)
Inductive step :
Maximum_nodes(0) = 1
Maximum_nodes(1) = 2 × Maximum_nodes(0)
Maximum_nodes(k) = 2 × Maximum_nodes(k - 1)
= 2 × 2 × ... × 2 × Maximum_nodes(0)   (k factors of 2)
= $2^k$
This proves that the maximum number of nodes at level L is $2^L$.
ii) A full binary tree of height h has $2^{h+1} - 1$ nodes.
Proof :
Basis of induction :
Fig. 2.2.4 Full binary tree (levels 0, 1 and 2)
For the binary tree, the total number of nodes at level m is $2^m$. In the above figure, the total
nodes at level 0 are 1, at level 1 are 2 and at level 2 are 4. Thus the total number of nodes
in the given binary tree is
total_nodes = $2^0 + 2^1 + 2^2 = 7$
If height h = 0, then
total_nodes = $2^{0+1} - 1 = 1$
If height h = 1, then
total_nodes = $2^{1+1} - 1 = 3$
Induction hypothesis :
Assume total_nodes $\le 2^{k+1} - 1$ for a full binary tree when h = k.
Inductive step :
Now a tree T has two subtrees $T_L$ and $T_R$. These subtrees are also full binary
trees with the heights $h(T_L)$ and $h(T_R)$.
Hence
T = total_nodes($T_L$) + total_nodes($T_R$) + 1
Here 1 is added for the root node, and T denotes the total number of nodes.
$T \le (2^{h(T_L)+1} - 1) + (2^{h(T_R)+1} - 1) + 1$   (by the induction hypothesis)
$= 2^{h(T_L)+1} + 2^{h(T_R)+1} - 1$
$\le 2 \cdot 2^{\max(h(T_L)+1,\ h(T_R)+1)} - 1$
$= 2 \cdot 2^{h(T)} - 1$   (since $\max(h(T_L)+1,\ h(T_R)+1) = h(T)$)
$T \le 2^{h(T)+1} - 1$
Thus it is proved that the total number of allowed nodes in a full binary tree of height h is $2^{h+1} - 1$.
iii) The total number of external nodes in a binary tree is the number of internal nodes + 1,
i.e. e = i + 1. (Here every internal node is taken to have exactly two children, as in the construction below.)
Proof : We assume
e = External nodes
i = Internal nodes.
We have to prove e = i + 1.
Basis of induction :
If there is only one node i.e. the root node, then i = 0 and
e = i + 1 = 0 + 1
e = 1
i.e. only one external node is present.
Thus, e = i + 1 is true.
Whenever we add nodes to a binary tree we add two external nodes, and then the
previous external node becomes an internal node.
For example : After adding two children to the root, i = 1 and e = 2.
Hence, e = i + 1
i.e. 2 = 2 is true.
Inductive step :
As soon as we add two external nodes, the previous external node
becomes an internal node. One external node is lost and two are gained, so
$e_{new} = e_{prev} + 1$
$= (i_{prev} + 1) + 1$
$e_{new} = i_{new} + 1$   (since $i_{new} = i_{prev} + 1$)
Thus e = i + 1 is true.
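The relation e = i + 1 can be checked on small trees with a short sketch. The types `Node` and `Counts` and the function `countNodes` are illustrative names, and the check is meaningful only for trees in which every internal node has two children.

```cpp
#include <cassert>

// Illustrative node type; not taken from the text.
struct Node {
    int data;
    Node* left;
    Node* right;
};

struct Counts {
    int internal;   // i : nodes with at least one child
    int external;   // e : leaf nodes
};

// Counts internal and external nodes by visiting every node once.
Counts countNodes(const Node* root) {
    if (root == nullptr) return {0, 0};
    if (root->left == nullptr && root->right == nullptr) return {0, 1};
    Counts l = countNodes(root->left);
    Counts r = countNodes(root->right);
    return {l.internal + r.internal + 1, l.external + r.external};
}
```

For a root with two leaf children the function reports i = 1 and e = 2; giving one of those leaves two children of its own turns it into an internal node and yields i = 2 and e = 3, as the inductive step predicts.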
University Questions
1. Prove that the maximum possible number of nodes in a binary tree of height h is $2^h - 1$.
2. Prove that for a binary tree the maximum number of nodes at level L is $2^L$.
2.3 Representation of Binary Tree
There are two ways of representing the binary tree.
1. Sequential representation.  2. Linked representation.
Let us see these representations one by one.
1. Sequential representation of binary trees or array representation :
Each node is sequentially arranged from top to bottom and from left to right. Let us
understand this matter by numbering each node. The numbering will start from the root
node and then the remaining nodes will be given ever increasing numbers in level wise
direction. The nodes on the same level will be numbered from left to right.
The numbering will be as shown in following figure.
Fig. 2.3.1 Binary tree
Now, observe Fig. 2.3.1 carefully. You will notice that a binary tree of depth n
has at most $2^n - 1$ nodes. The tree has depth 4 and the total number of nodes
is 15. Thus remember that in a binary tree of depth n there will be maximum $2^n - 1$
nodes. And so if we know the maximum depth of the tree then we can represent the binary
tree using the array data structure, because we can then predict the maximum size of an
array that can accommodate the tree.
Thus the array size must be at least the maximum number of nodes, i.e. $2^n - 1$. The root will be at index 0. Its left child will be at index
1, its right child will be at index 2 and so on. Another way of placing the elements in
the array is by applying the formula as shown below,
When n = 0 the root node will be placed at the 0th location.
Parent(n) = floor((n - 1)/2), where n > 0
Left(n) = (2n + 1)
Right(n) = (2n + 2)
In the array of Fig. 2.3.2 :
Root = A is at index 0.
Left child of A i.e. B will be at location 2 × 0 + 1 = 1.
Similarly right child of A i.e. C will be at 2 × 0 + 2 = 2nd location.
Node I is at the 8th location, so n = 8 and Parent(8) = floor((8 - 1)/2) = 3.
That means the parent of I is at the 3rd location, i.e. D.
Fig. 2.3.2 Sequential representation of binary tree
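The index formulas translate directly into code. The following is a minimal sketch with illustrative function names; it assumes the root is stored at index 0, as above.

```cpp
#include <cassert>
#include <string>

// Index arithmetic for a binary tree stored level wise in an array.
int parentIndex(int n) { return (n - 1) / 2; }   // valid only for n > 0
int leftIndex(int n)   { return 2 * n + 1; }
int rightIndex(int n)  { return 2 * n + 2; }
```

If the nodes are stored as the string "ABCDEFGHI" (an assumed labelling in which the nodes are named level by level), then the element at `leftIndex(0)` is B, the element at `rightIndex(0)` is C, and the element at `parentIndex(8)` is D.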
Advantages of sequential representation
The only advantage with this type of representation is that direct access to any
node is possible, and finding the parent, left or right children of any particular node
is fast because of the random access.
Disadvantages of sequential representation
1. The major disadvantage with this type of representation is wastage of memory.
For example : In the skewed tree half or more of the array is unutilized. You can easily
understand this point simply by seeing Fig. 2.3.3.
Fig. 2.3.3
2. In this type of representation the maximum depth of the tree has to be fixed
because we have already decided the array size. If we choose an array size much
larger than the tree needs, then memory is wasted. And if we
choose an array size smaller than the tree needs, then we will be unable to
represent some part of the tree.
3. The insertion and deletion of any node in the tree will be costlier as other nodes
have to be adjusted at appropriate positions so that the meaning of the binary tree can
be preserved.
As these drawbacks are there with the sequential type of representation, we will
search for a more flexible representation. So instead of an array we will make use of a linked
list to represent the tree.
2. Linked representation or node representation of binary trees :
In binary tree each node will have left child, right child and data field.
| Left child | Data | Right child |
The left child is nothing but the left link which points to the address of the left
sub-tree, whereas the right child is the right link which points to the address of the right
sub-tree. And the data field gives the information about the node. Let us see the 'C'
structure of the node in a binary tree.
typedef struct node
{
    int data;
    struct node *left;
    struct node *right;
}bin;
The tree with linked representation is as shown below.
Fig. 2.3.4 Linked representation of binary tree (each node is drawn as Left child | Data | Right child)
Advantages of linked representation
1. This representation is superior to the array representation as there is no wastage of
memory. And so there is no need to have prior knowledge of depth of the tree.
Using dynamic memory concept one can create as many nodes as
required. If some nodes are no longer needed, one can delete them by
freeing their memory.
2. Insertions and deletions, which are the most common operations, can be done
without moving the other nodes.
Disadvantages of linked representation
1. This representation does not provide direct access to a node, and special algorithms
are required.
2. This representation needs additional space in each node for storing the links to the left and
right sub-trees.
Example : Represent the following binary tree using array.
Fig. 2.3.5
Solution : The root node will be at the 0th location. We will use the following formula to find
the location of parent node, left child node and right child node.
Parent (n) = floor ((n - 1)/2)
Left (n) = (2n + 1)
Right (n) = (2n + 2)
Array :
Index   Node
0       A
1       B
2       C
3       -
4       -
5       D
6       E
Fig. 2.3.5 (a)
University Questions
1. Explain the concept of representation of a binary tree using an array.
2. Explain binary tree representation with example.
3. Explain the following for a given binary tree :
i) What is array representation of given binary tree ?
ii) What is linked representation of given binary tree ? What are important observations of linked
representation ?
[Figure : A binary tree]
2.4 General Tree and its Representation
General trees are those trees in which there could be any number of nodes that can be
attached as child nodes. The number of subtrees for each node may vary. For
example, Fig. 2.4.1 is a general tree.
Fig. 2.4.1 General tree
2.5 Converting Tree into Binary Tree
We can convert the given general tree into equivalent binary tree as follows -
i) The root of general tree becomes the root of the binary tree.
ii) Find the first child node of the current node and attach it as the left child of the
current node in the binary tree.
iii) The right sibling can be attached as a right child of that node.
Let us convert the given general tree in Fig. 2.5.1 into equivalent binary tree.
Fig. 2.5.1 Binary tree
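The three conversion rules can be sketched as a small routine. This is only an illustration: the types `GNode` and `BNode` and the function `convert` are hypothetical names, the general tree is assumed to keep its children in a `std::vector` in left to right order, and a C++17 compiler is assumed.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// General tree node: any number of children, stored left to right.
struct GNode {
    int data;
    std::vector<GNode> children;
};

// Binary tree node: left = first child, right = next sibling.
struct BNode {
    int data;
    std::unique_ptr<BNode> left;
    std::unique_ptr<BNode> right;
};

std::unique_ptr<BNode> convert(const GNode& g) {
    auto b = std::make_unique<BNode>();
    b->data = g.data;                        // rule i at the top level: root stays root
    BNode* prev = nullptr;                   // most recently attached child
    for (const GNode& child : g.children) {
        auto c = convert(child);
        if (prev == nullptr) {               // rule ii : first child becomes left child
            b->left = std::move(c);
            prev = b->left.get();
        } else {                             // rule iii : sibling becomes right child
            prev->right = std::move(c);
            prev = prev->right.get();
        }
    }
    return b;
}
```

For a general tree whose root 1 has children 2, 3 and 4, the binary tree gets 2 as the left child of 1, 3 as the right child of 2 and 4 as the right child of 3; the root itself never receives a right child because it has no sibling.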
Example : Convert the following generalized tree into a binary tree.
Solution :
Fig. 2.5.2 (a)
Example : Convert the following tree into binary tree.
Fig. 2.5.3
Solution : The rules are
i) The root of general tree becomes root of the binary tree.
ii) The first (leftmost) child of a node becomes its left child in the binary tree.
iii) The right sibling is attached as right child.
The binary tree will be -
Fig. 2.5.3 (a)
Example : Convert the given general tree to its equivalent binary tree.
Fig. 2.5.4
Solution : Binary tree is
[Figure : equivalent binary tree]
University Questions
1. Explain how to convert general trees to binary tree with example.
2. What are the properties of binary trees that distinguish them from general trees ?
2.6 Binary Tree Traversals
Concept : There are three traversal techniques used for binary tree. These are
Inorder, Preorder and Postorder traversals. Let us discuss these traversals in detail.
1. Inorder traversal : In this technique the left node, then the parent and then the right
node is visited. For example
I) Here visit the leftmost node i.e. 7, then its parent node 8, and then the right child 9
will be visited.
II) The root node (actually a parent node of left sub-branch) will be visited i.e. visit 10.
III) Visit 11 then 12 and finally visit 13. Thus the traversal of right sub-branch is
performed.
Fig. 2.6.1 Binary tree
Hence, Inorder sequence is 7, 8, 9, 10, 11, 12, 13
2. Preorder traversal : In this method visit the parent node, then the left and then the right node.
I) Visit 10
II) Visit 8, then 7 and finally 9
III) Visit 12, then 11 and finally 13.
Fig. 2.6.2 Binary tree
Hence, Preorder traversal is
10, 8, 7, 9, 12, 11, 13.
3. Postorder traversal : In this technique visit the left node, then the right node and finally
the parent node.
I) Visit 7, then 9 and then 8.
II) Visit 11, then 13 and then 12.
III) Visit node 10.
The Postorder sequence is
7, 9, 8, 11, 13, 12, 10.
Algorithms and C Functions
1. Recursive inorder traversal
Algorithm :
1. If tree is not empty then
a. traverse the left subtree in inorder
b. visit the root node
c. traverse the right subtree in inorder
C Function
2. Recursive preorder traversal
Algorithm :
1. If tree is not empty then
a. visit the root node
b. traverse the left subtree in preorder
c. traverse the right subtree in preorder
C Function
3. Recursive postorder traversal
Algorithm :
1. If tree is not empty then
a. traverse the left subtree in postorder
b. traverse the right subtree in postorder
c. visit the root node
C function
void postorder(node *temp)
{
    if(temp!=NULL)
    {
        postorder(temp->left);
        postorder(temp->right);
        printf(" %d", temp->data);
    }
}
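The inorder and preorder algorithms given above follow the same pattern as the postorder function; only the position of the "visit the root node" step changes. The following is a sketch under stated assumptions, not the book's own listing: it assumes a `node` with `data`, `left` and `right` fields, and it collects the visited values in a `std::vector` instead of printing them so that the order can be checked.

```cpp
#include <cassert>
#include <vector>

// Assumed node layout, mirroring the linked representation of section 2.3.
struct node {
    int data;
    node* left;
    node* right;
};

// Left subtree, then root, then right subtree.
void inorder(const node* temp, std::vector<int>& out) {
    if (temp != nullptr) {
        inorder(temp->left, out);
        out.push_back(temp->data);
        inorder(temp->right, out);
    }
}

// Root, then left subtree, then right subtree.
void preorder(const node* temp, std::vector<int>& out) {
    if (temp != nullptr) {
        out.push_back(temp->data);
        preorder(temp->left, out);
        preorder(temp->right, out);
    }
}
```

On the tree used in the traversal examples (root 10, left sub-branch 8 with children 7 and 9, right sub-branch 12 with children 11 and 13) these give 7, 8, 9, 10, 11, 12, 13 and 10, 8, 7, 9, 12, 11, 13 respectively.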
Example : Consider the following tree given in the problem. Show the Postorder, Preorder
and Inorder traversals of the tree.
Fig. 2.6.4
Solution : Preorder Sequence : 50, 17, 12, 9, 14, 23, 19, 72, 54, 67, 76.
Inorder Sequence : 9, 12, 14, 17, 23, 19, 50, 54, 67, 72, 76.
Postorder Sequence : 9, 14, 12, 19, 23, 17, 67, 54, 76, 72, 50.
Example : From the given traversals construct the binary tree.
Pre-order : G, B, Q, A, C, K, F, P, D, E, R, H
In-order : Q, B, K, C, F, A, G, P, E, D, H, R
Solution : Step 1 : The first element in preorder sequence is root or parent node.
We will locate this element in inorder sequence. Here the first pre-order element is
G. Now, in the inorder sequence the list to the left of G forms the left sub-branch and the
sequence to the right of G forms the right sub-branch.
Fig. 2.6.5 (root G; left sub-branch Q, B, K, C, F, A; right sub-branch P, E, D, H, R)
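We will repeat the above procedure for each sub-branch.
Step 2 :
Left sub-branch of G :
Preorder : [B], Q, A, C, K, F
Inorder : Q, [B], K, C, F, A
Right sub-branch of G :
Preorder : [P], D, E, R, H
Inorder : [P], E, D, H, R
Step 3 :
Preorder : [A], C, K, F
Inorder : K, C, F, [A]
Step 4 :
Preorder : [R], H
Inorder : H, [R]
Final binary tree : G is the root. B is the left child of G, with left child Q and right child A.
C is the left child of A, with left child K and right child F. P is the right child of G and D is the
right child of P. E is the left child of D and R is the right child of D; H is the left child of R.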
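The construction procedure can be sketched as a recursive routine. It is a minimal illustration, assuming that all node labels are distinct single characters and a C++14 or later compiler; `TNode`, `build` and `postorder` are hypothetical names. The postorder of the constructed tree is used only as a way to check the result.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

struct TNode {
    char data;
    std::unique_ptr<TNode> left;
    std::unique_ptr<TNode> right;
};

// Builds the subtree whose inorder sequence is in[lo, hi).
// preIdx is the position of the next unused preorder element.
// Assumption: labels are unique, so the root is found inside [lo, hi).
std::unique_ptr<TNode> build(const std::string& pre, std::size_t& preIdx,
                             const std::string& in,
                             std::size_t lo, std::size_t hi) {
    if (lo >= hi) return nullptr;
    char rootVal = pre[preIdx++];               // first preorder element is the root
    std::size_t mid = in.find(rootVal, lo);     // locate it in the inorder sequence
    auto n = std::make_unique<TNode>();
    n->data = rootVal;
    n->left = build(pre, preIdx, in, lo, mid);        // elements left of the root
    n->right = build(pre, preIdx, in, mid + 1, hi);   // elements right of the root
    return n;
}

// Appends the postorder sequence of the tree to out.
void postorder(const TNode* n, std::string& out) {
    if (n == nullptr) return;
    postorder(n->left.get(), out);
    postorder(n->right.get(), out);
    out.push_back(n->data);
}
```

Calling `build` with the preorder string "GBQACKFPDERH", the inorder string "QBKCFAGPEDHR", a start index of 0 and the range 0 to 12 produces a tree rooted at G whose postorder sequence is Q, K, F, C, A, B, E, H, R, D, P, G.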
Non Recursive Traversals
We will consider following tree as an example : the root is 10, its left child is 8 and its
right child is 11; node 8 has left child 7 and right child 9.
Initially we assume current = root = 10. As current != NULL the while
statement from the routine inorder will get executed.
Step   Action                                                      Stack (bottom to top)
I      Push 10, move onto left branch (current = current->left)    10
II     Push 8, move onto left branch (current = current->left)     10, 8
III    Push 7, move onto left branch (current = current->left)     10, 8, 7
Now current = NULL.
Now as current becomes NULL, we will come out of the while statement and the
following code fragment will get executed.
Code :
if (!stempty(top))
{
    pop(&top, s, &current);
    cout << " " << current->data;
    current = current->right;
}
Meaning : We will pop the topmost element from the stack i.e. 7. Then we will
print it. And then we will move to the right branch or right child of 7,
calling it the new current.
After this step : current = NULL, Stack : 10, 8, Output : 7.
Now we will enter the for loop iteratively, but this time the while statement will not
get executed because current = NULL. Hence the same code fragment will get executed again.
Meaning : We will pop the topmost element from the stack i.e. 8. Then we will
print it. And then we will move on to the right branch. That means new
current = right child of 8 = 9.
After this step : current = 9, Stack : 10, Output : 7 8.
Now we will enter the for loop iteratively and the while loop will be executed because
current = 9. Hence 9 will be pushed onto the stack. And now current = current->left.
As there is no left child for the node current, current = NULL. Hence the following code
fragment will be executed.
Code :
if (!stempty(top))
{
    pop(&top, s, &current);
    cout << " " << current->data;
    current = current->right;
}
Meaning : Now 9 which lies on the top of the stack will be popped off. It will be
printed. Then current will be set to the right child of 9 which is NULL.
After this step : current = NULL, Stack : 10, Output : 7 8 9.
Now we will enter the for loop iteratively but this time the while statement will not be
executed. The if statement will be executed. According to this code fragment, 10 will be
popped off, it will be printed as an output and we will move on to the right branch of
10. Hence now, current = 11. As an output we will get 7 8 9 10.
Again we will enter the for loop iteratively. The while statement will be executed
because current = 11. We will push 11 onto the stack and move onto the left branch of
11. But there is no left child to 11. Hence the value of current becomes NULL. Then the if
statement will get executed. According to it, 11 will be popped off, it will be printed as
an output and we will move onto the right branch of 11.
After this step : current = NULL, the stack is empty, Output : 7 8 9 10 11.
Again we will enter the for loop iteratively. But current = NULL, hence the while
statement will not be executed; the stack is empty, hence the if statement will not be executed.
Hence the else will be executed, by which control returns to the main function. Thus we will
get 7 8 9 10 11 as an output of nonrecursive inorder traversal.
void TREE_CLASS::inorder(node *root)
{
    node *current,*s[10];
    int top=-1;
    if(root==NULL)
    {
        cout<<"\n Tree is empty\n";
        return;
    }
    current=root;
    for(;;)
    {
        while(current!=NULL)
        {
            push(current,&top,s);   //remember the node, then go left
            current=current->left;
        }
        if(!stempty(top))
        {
            pop(&top,s,&current);
            cout<<" "<<current->data;
            current=current->right;
        }
        else
            return;
    }
}
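The routine above relies on the helper functions push, pop and stempty, which are defined elsewhere in the class. As an alternative sketch, the same logic can be written with the standard `std::stack`; the names `node` and `inorderIterative` are assumptions, and the values are returned in a vector rather than printed so that the order can be checked.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Assumed node layout, mirroring the linked representation of section 2.3.
struct node {
    int data;
    node* left;
    node* right;
};

// Non-recursive inorder traversal using std::stack in place of the
// array-based stack s[10] and its helper functions.
std::vector<int> inorderIterative(node* root) {
    std::vector<int> out;
    std::stack<node*> s;
    node* current = root;
    while (current != nullptr || !s.empty()) {
        while (current != nullptr) {      // push and keep moving left
            s.push(current);
            current = current->left;
        }
        current = s.top();                // leftmost unvisited node
        s.pop();
        out.push_back(current->data);     // visit it
        current = current->right;         // then traverse its right branch
    }
    return out;
}
```

For the example tree (10 with children 8 and 11, and 8 with children 7 and 9) the function returns 7, 8, 9, 10, 11, the same output as the trace above. Using `std::stack` also removes the fixed limit of 10 entries.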
Non-recursive Preorder Traversal
The logic for preorder traversal is similar to the inorder traversal. But the only
difference is that we will print the value of each visited current node before pushing it
onto the stack. Hence
Stage   Action                          Output    Stack (bottom to top)
I       Print 10, push 10, move left    10        10
II      Print 8, push 8, move left      10 8      10, 8
III     Print 7, push 7, move left      10 8 7    10, 8, 7
Visiting each node, calling it current, printing it as an output, pushing the current
node onto the stack and moving on to the left child : this is the sequence of operations
which we must follow at every stage. Above is the scenario for stage I, II and III.
Now if we move on to the left branch of 7 we will get current = NULL. We will pop 7
and check if it has any right child. As 7 has no right child, we will pop the next element i.e. 8. As
8 has a right child i.e. 9, we will print 9 as an output and push it onto the stack. The
stack will now be
Stack (bottom to top) : 10, 9
Output : 10 8 7 9
Then 9 will be popped off, but 9 has no right child so we will pop 10. The 10 has
right child i.e. 11. Hence print 11 as an output and push it onto the stack.
Stack : 11
Output : 10 8 7 9 11
Finally 11 will be popped off. The node 11 has no left or right child so nothing will
be pushed. As a result we get 10 8 7 9 11 as an output for nonrecursive preorder
traversal.
void TREE_CLASS::preorder(node *root)
{
    node *current,*s[10];
    int top=-1;
    if(root==NULL)
    {
        cout<<"\n The Tree is empty\n";
        return;
    }
    current=root;
    for(;;)
    {
        while(current!=NULL)
        {
            cout<<" "<<current->data;   //print before pushing
            push(current,&top,s);
            current=current->left;
        }
        if(!stempty(top))
        {
            pop(&top,s,&current);
            current=current->right;
        }
        else
            return;
    }
}
Nonrecursive Postorder Traversal
In the nonrecursive postorder traversal we have to add an extra field in the stack
structure. This field keeps a check on whether the node is visited once or not. If we visit
that node for the first time then we set that check to 1; if we visit that node again then
we reset that check to 0. We have to add this extra logic because we traverse the left node
first, then the right node and then the parent node.
void TREE_CLASS::postorder(node *root)
{
    struct stack
    {
        node *element; //Here placing the node containing value
        int check;     //check 1 means visiting left subtree
                       //check 0 means visiting right subtree
    }st[10];
    int top=-1;
    node *current;
    if(root==NULL)
    {
        cout<<"\n The Tree is empty\n";
        return;
    }
    current=root;
    for(;;)
    {
        while(current!=NULL)
        {
            top++;
            st[top].element=current; //pushing the element onto the stack
            st[top].check=1; //visiting the left subbranch
            current=current->left;
        }
        while(st[top].check==0)
        {
            current=st[top].element;
            top--;
            cout<<" "<<current->data;
            if(stempty(top))
                return;
        }
        current=st[top].element; //node on top of the stack
        current=current->right;
        st[top].check=0; //visiting right subtree
    }
}
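The same idea can be sketched with `std::stack`, keeping a flag beside every stacked node in place of the check field. This is an illustrative variant under stated assumptions, not the book's listing: `node` and `postorderIterative` are assumed names, the flag `true` plays the role of check = 1, and the result is returned in a vector so that the order can be verified.

```cpp
#include <cassert>
#include <stack>
#include <utility>
#include <vector>

// Assumed node layout, mirroring the linked representation of section 2.3.
struct node {
    int data;
    node* left;
    node* right;
};

std::vector<int> postorderIterative(node* root) {
    std::vector<int> out;
    // second = true  : left subtree in progress, right subtree still pending
    // second = false : right subtree started, node is printed when seen again
    std::stack<std::pair<node*, bool>> st;
    node* current = root;
    while (current != nullptr || !st.empty()) {
        while (current != nullptr) {          // go left, marking each node
            st.push({current, true});
            current = current->left;
        }
        while (!st.empty() && !st.top().second) {
            out.push_back(st.top().first->data);   // both subtrees are done
            st.pop();
        }
        if (st.empty()) break;
        st.top().second = false;              // now visiting the right subtree
        current = st.top().first->right;
    }
    return out;
}
```

For the tree with root 10, left sub-branch 8 (children 7 and 9) and right sub-branch 12 (children 11 and 13), the function returns 7, 9, 8, 11, 13, 12, 10, which matches the postorder sequence obtained earlier.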
University Questions
1. Write pseudocode for printing the elements of a binary search tree in ascending order
non-recursively.
2. Write non-recursive algorithm for traversal of binary tree.
2.7 Depth and Level Wise Traversals
The tree can be traversed along its depth as well as along its breadth. Let
us discuss about it.
Depth First Search (DFS)
Concept : In this traversal technique the tree is traversed according to its depth and
the vertices visited in this depthwise traversal are printed.
[Figure : a tree and its DFS sequence]
For displaying the tree in depth first search manner, we have to start from the root
node. From that node, moving along the edges, we have to move towards the leaf
node and display the data. The depth first traversal is same as preorder traversal.
To perform this traversal we need an additional data structure called stack.
Algorithm :
Step 1 : Visit the root node. Push it onto the stack.
Step 2 : Pop the node and display it as output.
Step 3 : If its right child is not NULL, push it onto the stack.
Step 4 : If its left child is not NULL, push it onto the stack.
Step 5 : Repeat Step 2 to Step 4 until the stack is empty.
Pseudo Code :
Algorithm DFS(root)
{
    temp=root;
    if(temp!=NULL)
    {
        display 'temp->data';
        DFS(temp->left);
        DFS(temp->right);
    }
}
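The pseudo code above is the recursive form. The stack based steps 1 to 5 of the algorithm can be sketched as follows; `node` and `dfs` are assumed names, and the visited values are returned in a vector instead of being displayed.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Assumed node layout, mirroring the linked representation of section 2.3.
struct node {
    int data;
    node* left;
    node* right;
};

// Depth first traversal with an explicit stack.
std::vector<int> dfs(node* root) {
    std::vector<int> out;
    if (root == nullptr) return out;
    std::stack<node*> s;
    s.push(root);                             // Step 1
    while (!s.empty()) {                      // Step 5
        node* n = s.top();                    // Step 2 : pop and display
        s.pop();
        out.push_back(n->data);
        if (n->right != nullptr) s.push(n->right);   // Step 3
        if (n->left != nullptr) s.push(n->left);     // Step 4
    }
    return out;
}
```

The right child is pushed before the left child so that the left child is popped first; this is why the output is the same as the preorder sequence.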
Example : Consider the following tree and obtain its DFS sequence.
Fig. 2.7.4
Solution : In each step the stack is shown from bottom to top, after the pop.
Step 1 : We will visit 10 and push it onto the stack. Pop the node and display it as output.
Stack : (empty)    Output : 10
Step 2 : We will visit the right child of the popped node, i.e. 12, and push it. Then we will visit the left child of 10, i.e. 4, and push it onto the stack. Pop the node and display it as output.
Stack : 12    Output : 10, 4
Step 3 : We will visit the right child of node 4, i.e. 8, and push it onto the stack. Now we will visit the left child of node 4, i.e. 3, and push it onto the stack. Pop the node and display it as output.
Stack : 12, 8    Output : 10, 4, 3
Step 4 : As 3 has no left or right child, nothing will be pushed. Pop the top node from the stack and display it as output.
Stack : 12    Output : 10, 4, 3, 8
Step 5 : We will visit the right child of 8, i.e. 9, and push it onto the stack. Then we will visit the left child of 8, i.e. 6, and push it onto the stack. Pop the top node from the stack and display it as output.
Stack : 12, 9    Output : 10, 4, 3, 8, 6
Step 6 : Visit the right child of 6 i.e. 7, push it onto the stack. Visit the left child of 6 i.e. 5, push it onto the stack. Pop the top node from the stack and display it as output.
Stack : 12, 9, 7    Output : 10, 4, 3, 8, 6, 5
Step 7 : As 5 has no left or right child nothing will be pushed onto the stack. Pop the top element of the stack and display it as output.
Stack : 12, 9    Output : 10, 4, 3, 8, 6, 5, 7
Step 8 : As 7 has no left or right child nothing will be pushed onto the stack. Pop 9. Display it.
Stack : 12    Output : 10, 4, 3, 8, 6, 5, 7, 9
Step 9 : Again 9 is a leaf node, so nothing will be pushed. Pop 12. Print it.
Stack : (empty)    Output : 10, 4, 3, 8, 6, 5, 7, 9, 12
Step 10 : There is no right child of 12. But 12 has a left child. So push 11 onto the stack. Pop 11 and display it.
Stack : (empty)    Output : 10, 4, 3, 8, 6, 5, 7, 9, 12, 11
Now all the nodes are visited. The stack is empty. Hence we get DFS sequence as
10, 4, 3, 8, 6, 5, 7, 9, 12, 11.
C++ Program
/*************************************************************
Program to create a binary tree and display it using Depth First Traversal.
Both the recursive and non recursive versions of DFS are considered in
this program
*************************************************************/
#include
#include
#define size 50
/* Declare a node for binary tree */
class tree
{
private:
    typedef struct node
    {
        int data;
        struct node *left;
        struct node *right;