DS Unit-IV
Graph Terminology
i. Vertex: An individual data element of a graph is called a vertex. A vertex is also known
as a node. In the above example, A, B, C, D and E are vertices.
ii. Edge: An edge is a connecting link between two vertices. An edge is also known
as an arc. In the above example, the edges are (A,B), (A,C), (A,D), (B,D), (C,D), (B,E) and (D,E).
iii. Path: A path is a sequence of alternating vertices and edges that starts at one vertex and
ends at another vertex, such that each edge is incident to its predecessor and successor
vertex.
iv. Undirected Graph: A graph with only undirected edges is said to be an undirected
graph.
v. Directed Graph: A graph with only directed edges is said to be a directed graph.
vi. Mixed Graph: A graph with both undirected and directed edges is said to be a mixed
graph.
vii. Degree: The total number of edges connected to a vertex is said to be the degree of that
vertex.
ix. Out-degree: The total number of outgoing edges connected to a vertex is said to be the
out-degree of that vertex.
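The degree definitions above can be checked with a short Python sketch. It uses the edge list given under "Edge"; the function names are illustrative, and treating each pair (u, v) as a directed edge u -> v for the out-degree is an assumption made only for this example.

```python
# Edge list copied from the "Edge" definition above.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
         ("C", "D"), ("B", "E"), ("D", "E")]

def degrees(edges):
    """Return {vertex: degree} for an undirected graph given as an edge list."""
    deg = {}
    for u, v in edges:
        # An undirected edge contributes 1 to the degree of both endpoints.
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def out_degrees(edges):
    """Return {vertex: out-degree}, reading each (u, v) as a directed edge u -> v.

    Assumption: the direction is taken from the order of each pair.
    """
    out = {}
    for u, v in edges:
        out[u] = out.get(u, 0) + 1
        out.setdefault(v, 0)  # make sure vertices with no outgoing edge appear
    return out

undirected = degrees(EDGES)
directed_out = out_degrees(EDGES)
print(undirected)    # degree of every vertex
print(directed_out)  # out-degree of every vertex under the assumed directions
```

In an undirected graph the degrees always add up to twice the number of edges, which is a quick sanity check on the result.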
Graph traversal is a technique used for searching for a vertex in a graph. Graph
traversal also decides the order in which the vertices are visited during the search. A graph
traversal finds the edges to be used in the search process without creating loops. That means
that, using graph traversal, we visit all the vertices of the graph without getting into a looping path.
There are two graph traversal techniques and they are as follows.
BFS traversal of a graph produces a spanning tree as its final result. A spanning tree is a
graph without loops. We use a Queue data structure, with a maximum size equal to the total
number of vertices in the graph, to implement BFS traversal. We use the following steps to
implement BFS traversal.
Step2 - Select any vertex as the starting point for traversal. Visit that vertex and insert it into
the Queue.
Step3 - Visit all the non-visited adjacent vertices of the vertex which is at the front of the
Queue and insert them into the Queue.
Step4 - When there is no new vertex to be visited from the vertex which is at the front of the
Queue, delete that vertex from the Queue.
Step6 - When the queue becomes empty, produce the final spanning tree by removing
the unused edges from the graph.
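A minimal Python sketch of the queue-based BFS procedure described above. It reuses the five-vertex example graph from the terminology section; the function name and the adjacency-list layout are assumptions of this sketch, not part of the notes.

```python
from collections import deque

EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
         ("C", "D"), ("B", "E"), ("D", "E")]

def build_adjacency(edges):
    """Build an undirected adjacency list {vertex: [neighbours]}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj

def bfs_spanning_tree(adj, start):
    """Return (visit order, tree edges) of a BFS starting at `start`."""
    visited = {start}
    order = [start]
    tree_edges = []            # edges actually used to reach new vertices
    queue = deque([start])     # visit the start vertex and enqueue it
    while queue:
        front = queue[0]
        # Visit every non-visited neighbour of the front vertex and enqueue it.
        for nbr in adj[front]:
            if nbr not in visited:
                visited.add(nbr)
                order.append(nbr)
                tree_edges.append((front, nbr))
                queue.append(nbr)
        # Nothing new can be reached from the front vertex: remove it.
        queue.popleft()
    return order, tree_edges

order, tree_edges = bfs_spanning_tree(build_adjacency(EDGES), "A")
print(order)       # ['A', 'B', 'C', 'D', 'E']
print(tree_edges)  # [('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'E')]
```

The edges that never appear in `tree_edges` are the unused edges, so what remains is a spanning tree with one edge fewer than the number of vertices.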
Spanning Trees:
A spanning tree is a subset of graph G which covers all the vertices with the
minimum possible number of edges. Hence, a spanning tree does not have cycles and it
cannot be disconnected.
A complete undirected graph can have a
maximum of n^(n-2) spanning trees,
where n is the number of nodes. In the above
addressed example, 3^(3-2) = 3 spanning trees are
possible.
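The n^(n-2) count can be checked by brute force for small n. The sketch below enumerates every set of n-1 edges of the complete graph on n vertices and keeps those that contain no cycle. The function name is illustrative and the approach is only practical for very small n.

```python
from itertools import combinations

def count_spanning_trees_complete(n):
    """Count spanning trees of the complete graph on n vertices by brute force."""
    all_edges = list(combinations(range(n), 2))
    count = 0
    for subset in combinations(all_edges, n - 1):
        parent = list(range(n))  # fresh union-find structure for this subset

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        acyclic = True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:         # both ends already connected: a cycle
                acyclic = False
                break
            parent[ru] = rv
        # n-1 edges without a cycle on n vertices form a spanning tree.
        if acyclic:
            count += 1
    return count

for n in (3, 4, 5):
    print(n, count_spanning_trees_complete(n), n ** (n - 2))
```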
A minimum cost spanning tree is a spanning tree of least cost. The cost of a spanning
tree of a weighted undirected graph is the sum of the costs (weights) of the edges in the
spanning tree. Three different algorithms can be used to obtain a minimum cost spanning tree
of a connected undirected graph. All three use an algorithm design strategy called the greedy
method. Those famous algorithms are:
i. Kruskal's Algorithm
Kruskal's Algorithm
Kruskal's Algorithm builds the spanning tree by adding edges one by one into a
growing spanning tree. Kruskal's algorithm follows the greedy approach: at each step it finds
the edge which has the least weight and adds it to the growing spanning tree. The following
steps are used to construct the MST.
Step1: Remove all loops and Parallel Edges
Step2: Sort the graph edges with respect to their weights.
Step3: Start adding edges to the MST from the edge with the smallest weight until the
Step4: Only add edges which don't form a cycle, i.e. edges which connect only
disconnected components.
In case of parallel edges, keep the one which has the least cost associated and
remove all others.
Next cost in the table is 4, and we observe that adding it will create a circuit in the
graph.
We ignore it. In the process we shall ignore/avoid all edges that create a circuit.
We observe that edges with cost 5 and 6 also create circuits. We ignore them and move
on.
Now we are left with only one node to be added. Between the two least cost edges
available 7 and 8, we shall add the edge with cost 7.
By adding edge (S, A), we have included all the nodes of the graph and we now have a
minimum cost spanning tree.
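The Kruskal steps above can be sketched in Python with a simple union-find structure to detect circuits. The original figure is not reproduced in these notes, so the graph below is an assumption: its vertices and edge costs were chosen to be consistent with the costs mentioned in the walkthrough (2, 3, 4, 5, 6, 7 and 8). The function name is illustrative.

```python
# Assumed example graph as (cost, u, v) triples; not taken from a figure.
EDGES = [(7, "S", "A"), (8, "S", "C"), (6, "A", "B"), (3, "A", "C"),
         (4, "C", "B"), (3, "C", "D"), (2, "B", "D"), (5, "B", "T"),
         (2, "D", "T")]

def kruskal(edges):
    """Return the list of (u, v, cost) edges of a minimum cost spanning tree."""
    parent = {}
    for _, u, v in edges:
        parent.setdefault(u, u)
        parent.setdefault(v, v)

    def find(x):
        # Follow parent links to the representative of x's component.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for cost, u, v in sorted(edges):   # sort edges by cost
        ru, rv = find(u), find(v)
        if ru != rv:                   # joins two disconnected components
            parent[ru] = rv
            mst.append((u, v, cost))
        # otherwise the edge would create a circuit and is ignored
    return mst

mst = kruskal(EDGES)
total = sum(cost for _, _, cost in mst)
print(mst)
print(total)  # 17
```

On this assumed graph the edges of cost 4, 5, 6 and 8 are rejected because they would close a circuit, and the edge (S, A) of cost 7 is the last one added.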
In case of parallel edges, keep the one which has the least cost associated and remove
all others.
After this step, S-7-A-3-C tree is formed. Now we'll again treat it as a node and will
check all the edges again. However, we will choose only the least cost edge. In this case,
C-3-D is the new edge, which is less than other edges' cost 8, 6, 4, etc.
After adding node D to the spanning tree, we now have two edges going out of it having
the same cost, i.e. D-2-T and D-2-B. Thus, we can add either one. But the next step will
again yield edge 2 as the least cost. Hence, we are showing a spanning tree with both
edges included.
We may find that the output spanning tree of the same graph using two different
algorithms is the same.
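The second walkthrough grows a single tree from a start node and always picks the least cost edge leaving the tree, which is the strategy commonly known as Prim's algorithm. Below is a minimal Python sketch using a priority queue. The graph is an assumption chosen to be consistent with the costs quoted in the walkthrough (S-7-A, A-3-C, C-3-D, D-2-T, D-2-B), and the function names are illustrative.

```python
import heapq

# Assumed example graph as (u, v, cost) triples; not taken from a figure.
EDGES = [("S", "A", 7), ("S", "C", 8), ("A", "B", 6), ("A", "C", 3),
         ("C", "B", 4), ("C", "D", 3), ("B", "D", 2), ("B", "T", 5),
         ("D", "T", 2)]

def build_adjacency(edges):
    adj = {}
    for u, v, cost in edges:
        adj.setdefault(u, []).append((v, cost))
        adj.setdefault(v, []).append((u, cost))
    return adj

def prim(adj, start):
    """Return the (u, v, cost) edges of a minimum cost spanning tree."""
    in_tree = {start}
    # Heap of candidate edges (cost, from, to) leaving the current tree.
    heap = [(cost, start, v) for v, cost in adj[start]]
    heapq.heapify(heap)
    mst = []
    while heap and len(in_tree) < len(adj):
        cost, u, v = heapq.heappop(heap)   # least cost candidate edge
        if v in in_tree:
            continue                       # would create a circuit
        in_tree.add(v)
        mst.append((u, v, cost))
        for nxt, nxt_cost in adj[v]:
            if nxt not in in_tree:
                heapq.heappush(heap, (nxt_cost, v, nxt))
    return mst

mst = prim(build_adjacency(EDGES), "S")
total = sum(cost for _, _, cost in mst)
print(mst)
print(total)  # 17
```

On this assumed graph the total cost is 17, the same as the tree found by the edge-sorting approach.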
Applications of Minimum spanning tree:
i. Design of networks
ii. Cluster Analysis
iii. Handwriting recognition
iv. Image segmentation
v. Travelling salesman problem
vi. Multi-terminal minimum cut problem
vii. Minimum-cost weighted perfect matching
➢ Hashing is the process of mapping a large number of data items to a smaller table
with the help of a hashing function.
➢ Hashing is also known as Hashing Algorithm or Message Digest Function.
➢ It is a technique to convert a range of key values into a range of indexes of an
array.
➢ It provides faster searching than
linear or binary search.
➢ Hashing is used with a database to enable items to be retrieved more quickly.
➢ It is used in the creation and verification of digital signatures.
➢ Hash table or hash map is a data structure used to store key-value pairs.
➢ It is a collection of items stored to make it easy to find them later.
➢ It uses a hash function to compute an index into an array of buckets or slots from
which the desired value can be found.
➢ It is an array of lists, where each list is known as a bucket.
➢ It contains values based on the key.
➢ A hash table contains only unique keys (some implementations, such as Java's
Hashtable class, are also synchronized).
➢ The above figure shows a hash table of size n = 10. Each position of
the hash table is called a slot. In the above hash table, there are n slots,
named {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}: slot 0, slot 1, slot 2 and so on. The hash table
contains no items, so every slot is empty.
➢ The mapping between an item and the slot where the item belongs in
the hash table is called the hash function. The hash function takes any item in the
collection and returns an integer in the range of slot names, between 0 and n-1.
➢ A hash function is simply a mathematical formula that maps the key to some
location in the hash table. The location will be calculated from the key value
itself. This one-to-one correspondence between a key value and an index in the
hash table is known as hashing or address calculation indexing. Thus, you can
Key k : 9452
Key square k^2 : 89340304
Hash function h(k) : 3403 (the middle four digits of k^2)
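The example above appears to follow the mid-square method: square the key and take the middle digits as the hash value. A minimal Python sketch under that assumption is given below; the function name and the `digits` parameter are hypothetical.

```python
def mid_square_hash(key, digits=4):
    """Square the key and return the middle `digits` digits as an integer."""
    squared = str(key * key)
    # Start index chosen so that the extracted window is centred.
    start = (len(squared) - digits) // 2
    return int(squared[start:start + digits])

h = mid_square_hash(9452)
print(9452 * 9452)  # 89340304
print(h)            # 3403
```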
Types of Hashing:
Chaining
Linear Probing
Quadratic Probing
Double Hashing
➢ This technique creates a linked list at the slot for which a collision occurs.
Using the hash function ‘key mod 7’, insert the following sequence of keys into the
hash table: 50, 700, 76, 85, 92, 73 and 101. Use the separate chaining technique
for collision resolution.
Solution:
The given sequence of keys will be inserted in the hash table as-
Step1
Step2
Step3
Step5
Step6
Step8
➢ Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
➢ So, key 101 will be inserted in bucket-3 of the hash table.
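The separate chaining exercise can be reproduced with a short Python sketch. Python lists stand in for the linked lists here, which is a simplification made for this sketch, and the function name is illustrative.

```python
def insert_chaining(keys, m=7):
    """Insert keys into a table of m buckets using 'key mod m' and chaining."""
    table = [[] for _ in range(m)]   # one (initially empty) chain per bucket
    for key in keys:
        # Colliding keys are simply appended to the chain of their bucket.
        table[key % m].append(key)
    return table

table = insert_chaining([50, 700, 76, 85, 92, 73, 101])
for bucket, chain in enumerate(table):
    print(bucket, chain)
# 0 [700]
# 1 [50, 85, 92]
# 2 []
# 3 [73, 101]
# 4 []
# 5 []
# 6 [76]
```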
In open addressing, unlike separate chaining, all the keys are stored inside the
hash table. No key is stored outside the hash table.
➢ Linear Probing
➢ Quadratic Probing
➢ Double Hashing
Operations in Open Addressing-
➢ Search Operation
➢ Insert Operation
➢ Deletion Operation
Open Addressing Techniques-
1. Linear Probing
In linear probing,
2. Quadratic Probing
In quadratic probing,
➢ Insert ki at the first free location from (u + i^2) % m, where i = 0 to (m-1) and u is
the location (index or address) to which the hash function maps the key.
➢ We keep probing until an empty bucket is found.
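A minimal Python sketch of the quadratic probing rule (u + i^2) % m. The key set is a small hypothetical example, u is assumed to be key mod m, and the function name is illustrative.

```python
def insert_quadratic_probing(keys, m=7):
    """Insert keys into a table of size m using quadratic probing."""
    table = [None] * m
    for key in keys:
        u = key % m                      # assumed home location
        for i in range(m):
            slot = (u + i * i) % m       # probe sequence u, u+1, u+4, u+9, ...
            if table[slot] is None:
                table[slot] = key
                break
        else:
            # Quadratic probing may fail to reach a free slot within m probes.
            raise OverflowError(f"no free slot found for key {key}")
    return table

table = insert_quadratic_probing([10, 17, 24, 3, 5])
print(table)  # [24, None, None, 10, 17, 3, 5]
```

Keys 10, 17, 24 and 3 all map to location 3, so they end up in slots 3, 4, 0 and 5 (offsets 0, 1, 4 and 9).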
3. Double Hashing
In double hashing,
➢ Let us consider two hash functions, h1(k) and h2(k).
➢ Insert ki at the first free location from (u + v * i) % m, where i = 0 to (m-1), u is the
location (index or address) given by h1(k) and v = h2(k) % m.
➢ It requires more computation time, as two hash functions need to be computed.
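A minimal Python sketch of the probe rule (u + v * i) % m. Both hash functions are assumptions chosen for illustration: h1(k) = k mod m, and h2(k) = 5 - (k mod 5), which is never zero, so the step v always moves the probe. The key set and the function names are hypothetical.

```python
def h1(key, m):
    return key % m            # assumed first hash function

def h2(key):
    return 5 - (key % 5)      # assumed second hash function, always in 1..5

def insert_double_hashing(keys, m=7):
    """Insert keys into a table of size m using double hashing."""
    table = [None] * m
    for key in keys:
        u = h1(key, m)
        v = h2(key) % m       # step size; must be non-zero to make progress
        for i in range(m):
            slot = (u + v * i) % m
            if table[slot] is None:
                table[slot] = key
                break
        else:
            raise OverflowError(f"no free slot found for key {key}")
    return table

table = insert_double_hashing([10, 17, 24, 3, 5])
print(table)  # [None, 5, None, 10, 24, 3, 17]
```

Keys 10, 17, 24 and 3 share the home location 3 but use different step sizes (3, 1 and 2 for the last three), so they spread out instead of clustering.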
Using the hash function ‘key mod 7’, insert the following sequence of keys into the
hash table: 50, 700, 76, 85, 92, 73 and 101. Use the linear probing technique for
collision resolution.
Step1
Step2
Step3
➢ Bucket of the hash table to which key 700 maps = 700 mod 7 = 0.
➢ So, key 700 will be inserted in bucket-0 of the hash table.
Step5
➢ The next key to be inserted in the hash table = 85.
Step6
➢ The next key to be inserted in the hash table = 92.
Step7
➢ The next key to be inserted in the hash table = 73.
Step8
➢ The next key to be inserted in the hash table = 101.
➢ Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
➢ Since bucket-3 is already occupied, a collision occurs.
➢ To handle the collision, the linear probing technique keeps
probing linearly until an empty bucket is found.
➢ The first empty bucket is bucket-5.
➢ So, key 101 will be inserted in bucket-5 of the hash table.
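The linear probing exercise can be reproduced with a short Python sketch. The function name is illustrative, and an empty bucket is represented by None.

```python
def insert_linear_probing(keys, m=7):
    """Insert keys into a table of size m using 'key mod m' and linear probing."""
    table = [None] * m
    for key in keys:
        home = key % m
        for i in range(m):
            slot = (home + i) % m        # try home, home+1, home+2, ... (wrapping)
            if table[slot] is None:
                table[slot] = key
                break
        else:
            raise OverflowError("hash table is full")
    return table

table = insert_linear_probing([50, 700, 76, 85, 92, 73, 101])
print(table)  # [700, 50, 85, 92, 73, 101, 76]
```

Key 101 maps to bucket 3, finds buckets 3 and 4 occupied, and lands in bucket 5, matching the result worked out above.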
Assignment
1. Using the hash function ‘key mod 7’, insert the following sequence of keys into the
hash table: 50, 700, 76, 85, 92, 73 and 101. Use the quadratic probing technique
for collision resolution.
2. Using the hash function ‘key mod 7’, insert the following sequence of keys into the
hash table: 50, 700, 76, 85, 92, 73 and 101. Use the double hashing technique for
collision resolution.
3. Using the hash function h(k)=2k+3 and m=10, insert the following sequence of
keys into the hash table: 3, 2, 9, 6, 11, 13, 7 and 12. Use the closed addressing
technique for collision resolution.
4. Using the hash function h(k)=2k+3 and m=10, insert the following sequence of
keys into the hash table: 3, 2, 9, 6, 11, 13, 7 and 12. Use the linear probing technique
for collision resolution.