Data Structures (1) - 73-94
Performance of Open Addressing:
Like chaining, the performance of hashing can be evaluated under the assumption that each key is
equally likely to be hashed to any slot of the table (simple uniform hashing).
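As a rough illustration, the standard textbook bounds on the expected number of probes in open addressing depend only on the load factor alpha = n/m (stored keys divided by table slots). The sketch below evaluates those bounds; it assumes uniform hashing (every probe sequence equally likely), and the function names are illustrative rather than taken from these notes.

```python
import math

def expected_probes_unsuccessful(alpha: float) -> float:
    """Upper bound on expected probes for an unsuccessful search: 1 / (1 - alpha)."""
    if not 0 <= alpha < 1:
        raise ValueError("load factor must satisfy 0 <= alpha < 1")
    return 1.0 / (1.0 - alpha)

def expected_probes_successful(alpha: float) -> float:
    """Upper bound for a successful search: (1 / alpha) * ln(1 / (1 - alpha))."""
    if not 0 < alpha < 1:
        raise ValueError("load factor must satisfy 0 < alpha < 1")
    return (1.0 / alpha) * math.log(1.0 / (1.0 - alpha))

# Compare a half-full table with a nearly full one.
for alpha in (0.5, 0.9):
    print(alpha,
          round(expected_probes_unsuccessful(alpha), 2),
          round(expected_probes_successful(alpha), 2))
```

The bounds grow quickly as alpha approaches 1, which is why open-addressing tables are usually resized before they become full.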
Applications of hashing:
1. Database indexing: Hashing is used to index and retrieve data efficiently in databases
and other data storage systems.
2. Password storage: Hashing is used to store passwords securely by applying a hash
function to the password and storing the hashed result, rather than the plain text
password.
3. Data compression: Hash tables are used inside dictionary-based compression algorithms,
such as LZ77-style compressors, to find repeated strings quickly.
4. Search structures: Hashing underlies data structures such as hash
tables and Bloom filters, which support fast lookups and membership queries.
5. Cryptography: Hashing is used in cryptography to generate digital signatures,
message authentication codes (MACs), and key derivation functions.
6. Load balancing: Hashing is used in load-balancing algorithms, such as consistent
hashing, to distribute requests to servers in a network.
7. Blockchain: Hashing is used in blockchain technology, such as the proof-of-work
algorithm, to secure the integrity and consensus of the blockchain.
8. Image processing: Hashing is used in image processing applications, such as
perceptual hashing, to detect and prevent image duplicates and modifications.
9. File comparison: Hash functions such as MD5 and SHA-1 are used to compute file
checksums, so that files can be compared and their integrity verified by comparing hash values.
10. Fraud detection: Hashing is used in fraud detection and cybersecurity applications,
such as intrusion detection and antivirus software, to detect and prevent malicious
activities.
Hashing provides constant-time search, insert and delete operations on average. This is
why hash tables are among the most widely used data structures. Typical problems include finding distinct
elements, counting the frequencies of items, and finding duplicates.
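The three example problems can be sketched with Python's built-in hash-based containers (set, dict and collections.Counter). The helper names and the sample data are made up for illustration.

```python
from collections import Counter

def distinct_elements(items):
    """Return items in first-seen order with repeats dropped (uses a hash set)."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:      # average O(1) membership test
            seen.add(item)
            result.append(item)
    return result

def find_duplicates(items):
    """Return the values that occur more than once, in first-seen order."""
    counts = Counter(items)       # hash table mapping value -> frequency
    return [value for value in counts if counts[value] > 1]

data = [3, 1, 3, 7, 1, 9]
print(distinct_elements(data))    # [3, 1, 7, 9]
print(Counter(data)[3])           # 2
print(find_duplicates(data))      # [3, 1]
```

Each element is processed with a constant number of hash-table operations on average, so all three tasks run in expected linear time.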
There are many other applications of hashing, including modern cryptographic hash
functions. Some of these applications are listed below:
Message Digest
Password Verification
Data Structures (Programming Languages)
Compiler Operation
Rabin-Karp Algorithm
Linking File name and path together
Game Boards
Graphics
UNIT 5
TREES AND GRAPHS
INTRODUCTION
In a linear data structure, data is organized in sequential order; in a non-linear data structure, data is
not arranged sequentially, and an element may be connected to several other elements. A tree is a very popular
non-linear data structure used in a wide range of applications. A tree organizes data in a hierarchical
structure, and it can be defined recursively: a tree is a root node together with zero or more subtrees,
each of which is itself a tree.
DEFINITION OF TREE:
A tree is a collection of nodes (or vertices) and the edges (or links) between them. In a tree data structure, every
individual element is called a node. A node stores the actual data of
that particular element and the links to the next elements in the hierarchical structure.
Note: 1. A tree with N nodes has exactly N-1
links or edges.
2. A tree has no cycles.
TREE TERMINOLOGIES:
1. Root Node: In a tree, the first (topmost) node is called the root node. Every non-empty tree has
exactly one root node, which is the origin of the tree. A tree never has multiple root nodes.
2. Edge: In a tree, the connecting link between any two nodes is called an edge. A tree with
N nodes has exactly N-1 edges.
3. Parent Node: In a tree, the node which is the immediate predecessor of a node is called its parent
node. In simple words, a node which has a branch from it to any other node is a parent
node, that is, a node which has a child or children. In the example tree, A is the parent
of B and C, B is the parent of D, E and F, and so on.
4. Child Node: In a tree, a node which is the immediate descendant of another node is called a
child node. In simple words, a node which has a link from its parent node is a child
node. A parent node can have any number of child nodes, and all the nodes
except the root are child nodes.
5. Siblings: In a tree, nodes which belong to the same parent are called siblings.
6. Leaf Node: In a tree, a node which does not have a child is called a leaf node.
Leaf nodes are also called external nodes or terminal nodes.
7. Internal Nodes: In a tree, a node which has at least one child is called an
internal node. In other words, every node other than a leaf node is an internal node. The root node is
also an internal node if the tree has more than one node. Internal nodes are also called
non-terminal nodes.
8. Degree: In a tree, the total number of children of a node is called the degree of
that node. The highest degree among all the nodes in a tree is called the degree of the tree.
In the example tree, the degree of the tree is 3.
9. Level: In a tree, the root node is said to be at Level 0, the children of the root
node are at Level 1, the children of the nodes at Level 1 are at Level 2, and so on.
In simple words, each step from top to bottom is called a level. The level count
starts at 0 and is incremented by one at each step.
10. Height: In a tree, the total number of edges on the longest path from a particular node down to a leaf node
is called the height of that node. The height of the root node is the
height of the tree. The height of every leaf node is 0.
11. Depth: In a tree, the total number of edges from the root node to a particular node is
called the depth of that node. The total number of edges from the root node to a leaf node on
the longest path is the depth of the tree. In simple words, the highest depth of any leaf node
is the depth of the tree. The depth of the root node is 0.
12. Path: In a tree, the sequence of nodes and edges from one node to another node
is called the path between those two nodes. Here, the length of a path is the total number of nodes in that path.
In the example tree, the path A - B - E - J has length 4. (Many texts count edges instead, which would give 3.)
13. Sub Tree: In a tree, each child of a node forms a subtree recursively.
Every child node is the root of a subtree of its parent node.
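The terms above can be checked on a small tree. The sketch below assumes a hypothetical tree that agrees with the facts stated in the definitions (A is the parent of B and C, B is the parent of D, E and F, and A - B - E - J is a path). Node G and the overall shape are assumptions, since the original figure is not part of this text.

```python
# Hypothetical example tree, stored as parent -> list of children.
children = {
    "A": ["B", "C"],
    "B": ["D", "E", "F"],
    "C": ["G"],          # G is an assumed node
    "E": ["J"],
}

def kids(node):
    """Children of a node (empty list for a leaf)."""
    return children.get(node, [])

def degree(node):
    """Degree of a node = number of its children."""
    return len(kids(node))

def height(node):
    """Edges on the longest downward path from node to a leaf."""
    return 0 if not kids(node) else 1 + max(height(c) for c in kids(node))

def depths(root):
    """Map every node to its depth (edges from the root)."""
    result = {root: 0}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in kids(node):
            result[child] = result[node] + 1
            stack.append(child)
    return result

depth = depths("A")
nodes = sorted(depth)
edge_count = sum(degree(n) for n in nodes)

print(len(nodes), edge_count)                    # 8 7  (N nodes, N-1 edges)
print(max(degree(n) for n in nodes))             # 3    (degree of the tree)
print(height("A"), depth["J"])                   # 3 3  (height of tree, depth of J)
print([n for n in nodes if degree(n) == 0])      # ['D', 'F', 'G', 'J']  (leaf nodes)
```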
TREE REPRESENTATIONS:
A tree data structure can be represented in two methods. Those methods are as follows...
1. List Representation
2. Left Child - Right Sibling Representation
1. List Representation
In this representation, we use two types of nodes: a 'data node', which represents a node with data, and a
'reference node', which holds only references. We start with a 'data node'
for the root node of the tree. It is linked to an internal node through a 'reference node', which is
in turn linked directly to any other node. This process repeats for all the nodes in the tree.
The above example tree can be represented using List representation as follows...
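The second method, Left Child - Right Sibling, is only named above. As a minimal sketch of the idea, each node keeps just two references: one to its first child and one to its next sibling. The class and function names here are illustrative assumptions, and only the part of the example tree described in the text (A with children B and C, B with children D, E and F) is built.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left_child = None      # first (leftmost) child
        self.right_sibling = None   # next sibling to the right

def add_child(parent, child):
    """Append child as the last child of parent and return it."""
    if parent.left_child is None:
        parent.left_child = child
    else:
        last = parent.left_child
        while last.right_sibling is not None:
            last = last.right_sibling
        last.right_sibling = child
    return child

def children_of(node):
    """Collect the data of all children by walking the sibling chain."""
    result = []
    child = node.left_child
    while child is not None:
        result.append(child.data)
        child = child.right_sibling
    return result

a = Node("A")
b = add_child(a, Node("B"))
add_child(a, Node("C"))
for name in "DEF":
    add_child(b, Node(name))

print(children_of(a))   # ['B', 'C']
print(children_of(b))   # ['D', 'E', 'F']
```

With this layout every node has a fixed size regardless of how many children it has, at the cost of walking the sibling chain to reach a particular child.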
To enhance the performance of a binary tree, we use a special type of binary tree known as
a Binary Search Tree. A binary search tree mainly focuses on the search operation in a binary tree.
A binary search tree can be defined as follows...
A Binary Search Tree is a binary tree in which every node contains only smaller
values in its left subtree and only larger values in its right subtree.
In a binary search tree, all the nodes in the left subtree of any node contain smaller values and
all the nodes in the right subtree of any node contain larger values, as shown in the following
figure...
Example
The following tree is a Binary Search Tree. In this tree, the left subtree of every node contains
nodes with smaller values and the right subtree of every node contains nodes with larger values.
Every binary search tree is a binary tree, but not every binary tree is a
binary search tree.
1. Searching becomes very efficient in a binary search tree since we get a hint at each
step about which sub-tree contains the desired element.
2. The binary search tree is considered an efficient data structure in comparison to arrays
and linked lists. When the tree is balanced, the search discards about half of the remaining nodes at every
step, so searching for an element takes O(log2 n) time. In the worst case (a skewed tree), the time it
takes to search for an element is O(n).
3. It also speeds up the insertion and deletion operations as compared to arrays and
linked lists.
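The gap between the O(log2 n) and O(n) cases comes from the shape of the tree, which depends on the order in which keys arrive. The sketch below is an illustration with made-up keys, in which duplicate keys are ignored by assumption. It inserts the same seven keys in two different orders and compares the resulting heights.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Iterative BST insert; returns the root. Duplicate keys are ignored."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                break
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = Node(key)
                break
            node = node.right
        else:
            break
    return root

def height(root):
    """Height in edges; an empty tree has height -1."""
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))

def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root

print(height(build([4, 2, 6, 1, 3, 5, 7])))   # 2  (balanced: about log2 n levels)
print(height(build([1, 2, 3, 4, 5, 6, 7])))   # 6  (skewed: behaves like a linked list)
```

A search follows one root-to-leaf path, so its cost is bounded by the height: 2 comparisons beyond the root in the first tree, up to 6 in the second.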
Example 1:
Create the binary search tree using the following data elements.
43, 10, 79, 90, 12, 54, 11, 9, 50
1. Insert 43 into the tree as the root of the tree.
2. Read the next element. If it is less than the root node element, insert it as the root
of the left sub-tree; otherwise, insert it as the root of the right sub-tree. Repeat the same comparison
within the chosen sub-tree until an empty position is found.
The process of creating the BST from the given elements is shown in the image
below.
Example 2:
10, 12, 5, 4, 20, 8, 7, 15 and 13
Searching means finding or locating some specific element or node within a data structure.
Searching for a specific node in a binary search tree is easy because
the elements in a BST are stored in a particular order.
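A minimal sketch of BST search follows, assuming a node with key, left and right fields (these names are illustrative). The tree is built by hand from some of the keys of Example 1, placed where insertion in the given order would put them.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def search(root, key):
    """Return the node holding key, or None if the key is absent."""
    node = root
    while node is not None and node.key != key:
        # The ordering tells us which sub-tree can contain the key.
        node = node.left if key < node.key else node.right
    return node

# Hand-built BST: 43 at the root, smaller keys to the left, larger to the right.
root = Node(43,
            Node(10, Node(9), Node(12)),
            Node(79, Node(54), Node(90)))

print(search(root, 54).key)        # 54
print(search(root, 11) is None)    # True
```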
The insert function is used to add a new element to a binary search tree at the appropriate location.
It must be designed in such a way that it does not violate the property of the binary
search tree at any node.
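One way to write such an insert function is sketched below, using the keys of Example 1. The node layout and the decision to ignore duplicate keys are assumptions made for this illustration. An in-order traversal is used to check that the BST property holds after all insertions.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key into the BST rooted at root and return the root."""
    if root is None:
        return Node(key)                     # empty spot found: place the key here
    if key < root.key:
        root.left = insert(root.left, key)   # smaller keys go left
    elif key > root.key:
        root.right = insert(root.right, key) # larger keys go right
    return root                              # equal keys are ignored (assumption)

def inorder(root):
    """Left, node, right: yields the keys of a BST in sorted order."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)

root = None
for key in [43, 10, 79, 90, 12, 54, 11, 9, 50]:
    root = insert(root, key)

print(inorder(root))                              # [9, 10, 11, 12, 43, 50, 54, 79, 90]
print(root.key, root.left.key, root.right.key)    # 43 10 79
```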
The delete function is used to delete the specified node from a binary search tree. However, we
must delete the node in such a way that the property of the binary
search tree is not violated.
There are three situations when deleting a node from a binary search tree.
1. The node to be deleted is a leaf node: simply replace it with NULL and free the allocated space.
2. The node to be deleted has only one child: replace the node with its child and delete the child node,
which now contains the value which is to be deleted.
In the following image, the node 12 is to be deleted. It has only one child. The node will be
replaced with its child node and the replaced node 12 (which is now a leaf node) will simply be
deleted.
3. The node to be deleted has two children: replace the node with its in-order successor (or predecessor)
and then delete that node from its original position.
In the following image, the node 50 is to be deleted, which is the root node of the tree. From the
in-order traversal of the tree, we
replace 50 with its in-order successor 52. Now, 50 will be moved to the leaf of the tree, which will
simply be deleted.
Algorithm Delete (TREE, ITEM)
Step 1: IF TREE = NULL
            Write "item not found in the tree"
        ELSE IF ITEM < TREE -> DATA
            Delete(TREE -> LEFT, ITEM)
        ELSE IF ITEM > TREE -> DATA
            Delete(TREE -> RIGHT, ITEM)
        ELSE IF TREE -> LEFT AND TREE -> RIGHT
            SET TEMP = findLargestNode(TREE -> LEFT)
            SET TREE -> DATA = TEMP -> DATA
            Delete(TREE -> LEFT, TEMP -> DATA)
        ELSE
            SET TEMP = TREE
            IF TREE -> LEFT = NULL AND TREE -> RIGHT = NULL
                SET TREE = NULL
            ELSE IF TREE -> LEFT != NULL
                SET TREE = TREE -> LEFT
            ELSE
                SET TREE = TREE -> RIGHT
            [END OF IF]
            FREE TEMP
        [END OF IF]
Step 2: END
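The algorithm above can be sketched in Python as follows. Like the pseudocode, it replaces a node that has two children with the largest node of its left sub-tree (the in-order predecessor). Using the in-order successor instead works equally well. The function returns the new sub-tree root instead of assigning to TREE in place, and memory is reclaimed by the garbage collector, so there is no FREE step. The names and the test keys (from Example 1) are assumptions for illustration.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def inorder(root):
    return [] if root is None else inorder(root.left) + [root.key] + inorder(root.right)

def find_largest(node):
    """Rightmost node of a non-empty sub-tree."""
    while node.right is not None:
        node = node.right
    return node

def delete(root, item):
    """Return the root of the tree after removing item (unchanged if absent)."""
    if root is None:
        return None                                   # item not found in the tree
    if item < root.key:
        root.left = delete(root.left, item)
    elif item > root.key:
        root.right = delete(root.right, item)
    elif root.left is not None and root.right is not None:
        temp = find_largest(root.left)                # in-order predecessor
        root.key = temp.key
        root.left = delete(root.left, temp.key)
    else:
        # Leaf or single child: link the parent to the child (or to None).
        root = root.left if root.left is not None else root.right
    return root

root = None
for key in [43, 10, 79, 90, 12, 54, 11, 9, 50]:
    root = insert(root, key)

root = delete(root, 9)     # leaf node
root = delete(root, 12)    # node with one child (11)
root = delete(root, 43)    # node with two children (the root)
print(inorder(root))       # [10, 11, 50, 54, 79, 90]
print(root.key)            # 11
```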
GRAPH TERMINOLOGY
Graph: A graph is a non-linear data structure comprising a finite set of nodes and edges. The
nodes are the elements, and the edges are pairs of nodes that are connected (ordered pairs in a directed
graph, unordered pairs in an undirected graph). Generally,
a graph is represented as a pair of sets (V, E), where V is the set of vertices or nodes and E is the set of
edges. Simple definition of a graph: a graph G can be defined as G = (V, E),
where, for example, V = {A, B, C, D, E} and E = {(A,B), (A,C), (A,D), (B,D), (C,D), (B,E), (E,D)}.
Graph Terminology:-
1) Vertex: An individual data element of a graph is called a vertex. A vertex is also known as a node.
In the above example graph, A, B, C, D and E are the vertices.
2) Edge: An edge is a connecting link between two
vertices. Edges are of three types.
1. Undirected Edge - An undirected edge is a bidirectional edge. If there is an undirected
edge between vertices A and B, then edge (A, B) is equal to edge (B, A).
2. Directed Edge - A directed edge is a unidirectional edge. If there is a directed
edge between vertices A and B, then edge (A, B) is not equal to edge (B, A).
3. Weighted Edge - A weighted edge is an edge with a value (cost) on it.
3) Undirected Graph: A graph with only undirected edges is said to be an undirected graph.
4) Directed Graph: A graph with only directed edges is said to be a directed graph.
5) Mixed Graph: A graph with both undirected and directed edges is said to be a mixed graph.
6) End vertices or Endpoints: The two vertices joined by an edge are called the end vertices (or
endpoints) of that edge.
7) Origin: If an edge is directed, its first endpoint is said to be its origin.
8) Destination: If an edge is directed, its second
endpoint is said to be the destination of that edge.
9) Adjacent: Vertices A and B are said to be
adjacent if there is an edge between
them.
10) Incident: An edge is said to be incident on a vertex if the vertex is one of the endpoints of that
edge.
11) Outgoing Edge: A directed edge is said to be an outgoing edge of its origin vertex.
12) Incoming Edge: A directed edge is said to be an incoming edge of its destination vertex.
13) Degree: The total number of edges connected to a vertex is said to be the degree of that vertex.
14) Indegree: The total number of incoming edges connected to a vertex is said to be the indegree of
that vertex.
15) Outdegree: The total number of outgoing edges connected to a vertex is said to be the outdegree of
that vertex.
16) Parallel edges or Multiple edges: If two undirected edges have the same end vertices,
or two directed edges have the same origin and destination, such edges are called parallel edges or
multiple edges.
17) Self-loop: An edge (undirected or directed) is a self-loop if its two endpoints coincide with
each other.
18) Simple Graph: A graph is said to be simple if it has no parallel edges and no self-loops.
19) Path: A path is a sequence of alternating vertices and edges that starts at a vertex and ends at
a vertex, such that each edge is incident to its predecessor and successor vertices.
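Several of these terms can be computed directly from the edge set of the example graph G = (V, E) given earlier. The sketch below reads E once as undirected edges (for degree) and once as (origin, destination) pairs (for indegree and outdegree). The function names are illustrative.

```python
V = ["A", "B", "C", "D", "E"]
E = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
     ("C", "D"), ("B", "E"), ("E", "D")]

def degrees(vertices, edges):
    """Degree of each vertex when the edges are read as undirected."""
    deg = {v: 0 for v in vertices}
    for u, w in edges:
        deg[u] += 1
        deg[w] += 1
    return deg

def in_out_degrees(vertices, edges):
    """Indegree and outdegree when each edge is read as (origin, destination)."""
    indeg = {v: 0 for v in vertices}
    outdeg = {v: 0 for v in vertices}
    for origin, destination in edges:
        outdeg[origin] += 1
        indeg[destination] += 1
    return indeg, outdeg

def is_simple(edges):
    """True if there are no self-loops and no parallel edges (undirected reading)."""
    seen = set()
    for u, w in edges:
        key = frozenset((u, w))
        if u == w or key in seen:
            return False
        seen.add(key)
    return True

deg = degrees(V, E)
indeg, outdeg = in_out_degrees(V, E)
print(deg)                        # {'A': 3, 'B': 3, 'C': 2, 'D': 4, 'E': 2}
print(indeg["D"], outdeg["D"])    # 4 0
print(is_simple(E))               # True
```

The degrees add up to 14, twice the number of edges, because every edge is counted once at each of its two endpoints.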
GRAPH REPRESENTATION
A graph data structure is commonly represented using the following representations...
1. Adjacency Matrix
2. Incidence Matrix
3. Adjacency List
Adjacency Matrix: In this representation, the graph is represented using a matrix of size (total
number of vertices) by (total number of vertices). That means a graph with 4 vertices is
represented using a matrix of size 4x4. In this matrix, both rows and columns represent vertices.
This matrix is filled with either 1 or 0. Here, 1 represents that there is an edge from the row vertex to
the column vertex, and 0 represents that there is no edge from the row vertex to the column vertex.
For example, consider the following undirected graph representation...
Directed graph representation...
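A minimal sketch of building an adjacency matrix, assuming the example vertex and edge sets V and E from the graph definition earlier. The function name and the directed flag are illustrative.

```python
V = ["A", "B", "C", "D", "E"]
E = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
     ("C", "D"), ("B", "E"), ("E", "D")]

def adjacency_matrix(vertices, edges, directed=False):
    """Return a |V| x |V| matrix of 0/1 entries (row vertex -> column vertex)."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    matrix = [[0] * n for _ in range(n)]
    for u, w in edges:
        matrix[index[u]][index[w]] = 1
        if not directed:
            matrix[index[w]][index[u]] = 1   # undirected edges are symmetric
    return matrix

undirected = adjacency_matrix(V, E)
directed = adjacency_matrix(V, E, directed=True)

print(undirected[0])   # row A: [0, 1, 1, 1, 0]
print(undirected[3])   # row D: [1, 1, 1, 0, 1]
print(directed[3])     # row D: [0, 0, 0, 0, 0]  (D has no outgoing edges)
```

For an undirected graph the matrix is symmetric, while for a directed graph a row lists only the outgoing edges of its vertex.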
Incidence Matrix:
In this representation, the graph is represented using a matrix of size (total number of vertices) by
(total number of edges). That means a graph with 4 vertices and 6 edges is represented using a
matrix of size 4x6. In this matrix, rows represent vertices and columns represent edges. This
matrix is filled with 0, 1 or -1. Here, 0 represents that the column edge is not connected to the row
vertex, 1 represents that the column edge is connected as an outgoing edge to the row vertex, and -1
represents that the column edge is connected as an incoming edge to the row vertex.
For example, consider the following directed graph representation...
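A minimal sketch of building an incidence matrix with the 0 / 1 / -1 convention described above, assuming the example edge set E from the graph definition is read as directed (origin, destination) pairs and contains no self-loops. The function name is illustrative.

```python
V = ["A", "B", "C", "D", "E"]
E = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
     ("C", "D"), ("B", "E"), ("E", "D")]

def incidence_matrix(vertices, edges):
    """Rows are vertices, columns are edges: 1 = outgoing, -1 = incoming, 0 = not incident."""
    index = {v: i for i, v in enumerate(vertices)}
    matrix = [[0] * len(edges) for _ in vertices]
    for col, (origin, destination) in enumerate(edges):
        matrix[index[origin]][col] = 1
        matrix[index[destination]][col] = -1
    return matrix

M = incidence_matrix(V, E)
print(len(M), len(M[0]))   # 5 7  (vertices x edges)
print(M[0])                # row A: [1, 1, 1, 0, 0, 0, 0]
print(M[3])                # row D: [0, 0, -1, -1, -1, 0, -1]
```

Every column contains exactly one 1 and one -1, so each column sums to zero.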
Adjacency List:
In this representation, every vertex of the graph stores a list of its adjacent vertices.
For example, consider the following directed graph representation implemented
using linked lists...
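A minimal sketch of an adjacency list follows, with Python lists standing in for the linked lists mentioned above. It assumes the example edge set E from the graph definition, read as directed edges. The function name and the directed flag are illustrative.

```python
V = ["A", "B", "C", "D", "E"]
E = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
     ("C", "D"), ("B", "E"), ("E", "D")]

def adjacency_list(vertices, edges, directed=True):
    """Map each vertex to the list of vertices adjacent to it."""
    adj = {v: [] for v in vertices}
    for u, w in edges:
        adj[u].append(w)
        if not directed:
            adj[w].append(u)
    return adj

adj = adjacency_list(V, E)
for vertex, neighbours in adj.items():
    print(vertex, "->", neighbours)
# A -> ['B', 'C', 'D']
# B -> ['D', 'E']
# C -> ['D']
# D -> []
# E -> ['D']
```

The storage used is proportional to the number of vertices plus the number of edges, which is smaller than the matrix forms when the graph has few edges.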
Step 7: From A we have D as an unvisited adjacent node. We mark it as visited and
enqueue it.
At this stage, we are left with no unmarked (unvisited) nodes. But as per the algorithm,
we keep on dequeuing in order to check each remaining vertex for unvisited adjacent nodes. When the queue
becomes empty, the traversal is over.
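The queue-based traversal described in these steps (breadth-first search) can be sketched as follows. The graph used here is an assumption, since the original figure is not part of this text: a start vertex S adjacent to A, B and C, each of which is adjacent to D. This is consistent with the step in which D is reached from A.

```python
from collections import deque

# Assumed undirected example graph as an adjacency list.
graph = {
    "S": ["A", "B", "C"],
    "A": ["S", "D"],
    "B": ["S", "D"],
    "C": ["S", "D"],
    "D": ["A", "B", "C"],
}

def bfs(adj, start):
    """Return the vertices in the order breadth-first search visits them."""
    order = [start]
    visited = {start}
    queue = deque([start])
    while queue:                      # keep dequeuing until the queue is empty
        vertex = queue.popleft()
        for neighbour in adj[vertex]:
            if neighbour not in visited:
                visited.add(neighbour)    # mark as visited
                order.append(neighbour)
                queue.append(neighbour)   # enqueue it
    return order

print(bfs(graph, "S"))   # ['S', 'A', 'B', 'C', 'D']
```

After D is enqueued every vertex is already marked, yet the loop still dequeues B, C and D before it stops, which matches the last step described above.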
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push
it onto a stack.
Rule 2 − If no adjacent unvisited vertex is found, pop a vertex from the stack. (This will
pop all the vertices from the stack which do not have unvisited adjacent vertices.)
Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.
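The three rules (depth-first search with an explicit stack) can be sketched as follows. The graph is an assumption, since the original figure is not part of this text: a start vertex S adjacent to A, B and C, each of which is adjacent to D, with neighbours tried in alphabetical order.

```python
# Assumed undirected example graph as an adjacency list.
graph = {
    "S": ["A", "B", "C"],
    "A": ["S", "D"],
    "B": ["S", "D"],
    "C": ["S", "D"],
    "D": ["A", "B", "C"],
}

def dfs(adj, start):
    """Return the vertices in the order depth-first search visits them."""
    order = [start]
    visited = {start}
    stack = [start]
    while stack:                                   # Rule 3: repeat until the stack is empty
        top = stack[-1]
        unvisited = next((v for v in adj[top] if v not in visited), None)
        if unvisited is not None:
            visited.add(unvisited)                 # Rule 1: visit, mark, push
            order.append(unvisited)
            stack.append(unvisited)
        else:
            stack.pop()                            # Rule 2: nothing left to visit, pop
    return order

print(dfs(graph, "S"))   # ['S', 'A', 'D', 'B', 'C']
```

On this graph B is pushed and then popped because it has no unvisited neighbour, which leaves D on top of the stack. C is then reached from D.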
Step 5: We choose B, mark it as visited and put it onto the stack. Here B does not
have any unvisited adjacent node. So, we pop B from the stack.
Step 6: We check the stack top to return to the previous node and check if it has any
unvisited nodes. Here, we find D to be on the top of the stack.
As C does not have any unvisited adjacent node, we keep popping the stack until
we find a node that has an unvisited adjacent node. In this case, there is none, and
we keep popping until the stack is empty.