0% found this document useful (0 votes)
3 views

ds-unit-4

Uploaded by

Santhosh Kumar
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views

ds-unit-4

Uploaded by

Santhosh Kumar
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 28

UNIT – IV

TREES:
Tree represents the nodes connected by edges. We will discuss binary tree or binary
search tree specifically.
Binary Tree is a special datastructure used for data storage purposes. A binary tree has
a special condition that each node can have a maximum of two children. A binary tree
has the benefits of both an ordered array and a linked list as search is as quick as in a
sorted array and insertion or deletion operation are as fast as in linked list.

Important Terms

Following are the important terms with respect to tree.


 Path − Path refers to the sequence of nodes along the edges of a tree.
 Root − The node at the top of the tree is called root. There is only one root per
tree and one path from the root node to any node.
 Parent − Any node except the root node has one edge upward to a node called
parent.
 Child − The node below a given node connected by its edge downward is
called its child node.
 Leaf − The node which does not have any child node is called the leaf node.
 Subtree − Subtree represents the descendants of a node.
 Visiting − Visiting refers to checking the value of a node when control is on
the node.
 Traversing − Traversing means passing through nodes in a specific order.
 Levels − Level of a node represents the generation of a node. If the root node is
at level 0, then its next child node is at level 1, its grandchild is at level 2, and
so on.
 keys − Key represents a value of a node based on which a search operation is to
be carried out for a node.

Binary Search Tree Representation

Binary Search tree exhibits a special behavior. A node's left child must have a value
less than its parent's value and the node's right child must have a value greater than its
parent value.

We're going to implement tree using node object and connecting them through
references.

Tree Node
The code to write a tree node would be similar to what is given below. It has a data
part and references to its left and right child nodes.
struct node {
int data;
struct node *leftChild;
struct node *rightChild;
};
In a tree, all nodes share common construct.

BST Basic Operations

The basic operations that can be performed on a binary search tree data structure, are
the following −
 Insert − Inserts an element in a tree/create a tree.
 Search − Searches an element in a tree.
 Preorder Traversal − Traverses a tree in a pre-order manner.
 Inorder Traversal − Traverses a tree in an in-order manner.
 Postorder Traversal − Traverses a tree in a post-order manner.
We shall learn creating (inserting into) a tree structure and searching a data item in a
tree in this chapter. We shall learn about tree traversing methods in the coming
chapter.

Insert Operation

The very first insertion creates the tree. Afterwards, whenever an element is to be
inserted, first locate its proper location. Start searching from the root node, then if the
data is less than the key value, search for the empty location in the left subtree and
insert the data. Otherwise, search for the empty location in the right subtree and insert
the data.
Algorithm

If root is NULL
then create root node
return

If root exists then


compare the data with node.data

while until insertion position is located

If data is greater than node.data


goto right subtree
else
goto left subtree

endwhile

insert data

end If
Implementation
The implementation of insert function should look like this −
void insert(int data) {
struct node *tempNode = (struct node*) malloc(sizeof(struct node));
struct node *current;
struct node *parent;

tempNode->data = data;
tempNode->leftChild = NULL;
tempNode->rightChild = NULL;

//if tree is empty, create root node


if(root == NULL) {
root = tempNode;
} else {
current = root;
parent = NULL;

while(1) {
parent = current;

//go to left of the tree


if(data < parent->data) {
current = current->leftChild;

//insert to the left


if(current == NULL) {
parent->leftChild = tempNode;
return;
}
}

//go to right of the tree


else {
current = current->rightChild;

//insert to the right


if(current == NULL) {
parent->rightChild = tempNode;
return;
}
}
}
}
}

Search Operation
Whenever an element is to be searched, start searching from the root node, then if the
data is less than the key value, search for the element in the left subtree. Otherwise,
search for the element in the right subtree. Follow the same algorithm for each node.
Algorithm
If root.data is equal to search.data
return root
else
while data not found

If data is greater than node.data


goto right subtree
else
goto left subtree

If data found
return node
endwhile

return data not found

end if
The implementation of this algorithm should look like this.
struct node* search(int data) {
struct node *current = root;
printf("Visiting elements: ");

while(current->data != data) {
if(current != NULL)
printf("%d ",current->data);

//go to left tree

if(current->data > data) {


current = current->leftChild;
}
//else go to right tree
else {
current = current->rightChild;
}

//not found
if(current == NULL) {
return NULL;
}
}
return current;
}

Tree Traversal
Traversal is a process to visit all the nodes of a tree and may print their values too. Because, all
nodes are connected via edges (links) we always start from the root (head) node. That is, we
cannot randomly access a node in a tree. There are three ways which we use to traverse a tree −

 In-order Traversal
 Pre-order Traversal
 Post-order Traversal
Generally, we traverse a tree to search or locate a given item or key in the tree or to print all the
values it contains.

In-order Traversal

In this traversal method, the left subtree is visited first, then the root and later the right sub-tree.
We should always remember that every node may represent a subtree itself.
If a binary tree is traversed in-order, the output will produce sorted key values in an ascending
order.

We start from A, and following in-order traversal, we move to its left subtree B. B is also
traversed in-order. The process goes on until all the nodes are visited. The output of inorder
traversal of this tree will be −
D→B→E→A→F→C→G
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Visit root node.
Step 3 − Recursively traverse right subtree.

Pre-order Traversal
In this traversal method, the root node is visited first, then the left subtree and finally the right
subtree.

We start from A, and following pre-order traversal, we first visit A itself and then move to its left
subtree B. B is also traversed pre-order. The process goes on until all the nodes are visited. The
output of pre-order traversal of this tree will be −
A→B→D→E→C→F→G
Algorithm
Until all nodes are traversed −
Step 1 − Visit root node.
Step 2 − Recursively traverse left subtree.
Step 3 − Recursively traverse right subtree.

Post-order Traversal
In this traversal method, the root node is visited last, hence the name. First we traverse the left
subtree, then the right subtree and finally the root node.
We start from A, and following Post-order traversal, we first visit the left subtree B. B is also
traversed post-order. The process goes on until all the nodes are visited. The output of post-order
traversal of this tree will be −
D→E→B→F→G→C→A

Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Recursively traverse right subtree.
Step 3 − Visit root node.

Binary Search Tree


A Binary Search Tree (BST) is a tree in which all the nodes follow the below-mentioned
properties −
 The value of the key of the left sub-tree is less than the value of its parent (root) node's
key.
 The value of the key of the right sub-tree is greater than or equal to the value of its parent
(root) node's key.
Thus, BST divides all its sub-trees into two segments; the left sub-tree and the right sub-tree and
can be defined as −
left_subtree (keys) < node (key) ≤ right_subtree (keys)

Representation

BST is a collection of nodes arranged in a way where they maintain BST properties. Each node
has a key and an associated value. While searching, the desired key is compared to the keys in
BST and if found, the associated value is retrieved.
Following is a pictorial representation of BST −
We observe that the root node key (27) has all less-valued keys on the left sub-tree and the
higher valued keys on the right sub-tree.

Basic Operations

Following are the basic operations of a tree −


 Search − Searches an element in a tree.
 Insert − Inserts an element in a tree.
 Pre-order Traversal − Traverses a tree in a pre-order manner.
 In-order Traversal − Traverses a tree in an in-order manner.
 Post-order Traversal − Traverses a tree in a post-order manner.

Node
Define a node having some data, references to its left and right child nodes.
struct node {
int data;
struct node *leftChild;
struct node *rightChild;
};

Search Operation

Whenever an element is to be searched, start searching from the root node. Then if the data is
less than the key value, search for the element in the left subtree. Otherwise, search for the
element in the right subtree. Follow the same algorithm for each node.

Algorithm
struct node* search(int data){
struct node *current = root;
printf("Visiting elements: ");

while(current->data != data){

if(current != NULL) {
printf("%d ",current->data);

//go to left tree


if(current->data > data){
current = current->leftChild;
} //else go to right tree
else {
current = current->rightChild;
}

//not found
if(current == NULL){
return NULL;
}
}
}

return current;
}

Insert Operation
Whenever an element is to be inserted, first locate its proper location. Start searching from the
root node, then if the data is less than the key value, search for the empty location in the left
subtree and insert the data. Otherwise, search for the empty location in the right subtree and
insert the data.

Algorithm
void insert(int data) {
struct node *tempNode = (struct node*) malloc(sizeof(struct node));
struct node *current;
struct node *parent;

tempNode->data = data;
tempNode->leftChild = NULL;
tempNode->rightChild = NULL;

//if tree is empty


if(root == NULL) {
root = tempNode;
} else {
current = root;
parent = NULL;

while(1) {
parent = current;
//go to left of the tree
if(data < parent->data) {
current = current->leftChild;
//insert to the left

if(current == NULL) {
parent->leftChild = tempNode;
return;
}
} //go to right of the tree
else {
current = current->rightChild;

//insert to the right


if(current == NULL) {
parent->rightChild = tempNode;
return;
}
}
}
}
}

Heap Data Structures


Heap is a special case of balanced binary tree data structure where the root-
node key is compared with its children and arranged accordingly. If α has
child node β then −
key(α) ≥ key(β)
As the value of parent is greater than that of child, this property
generates Max Heap. Based on this criteria, a heap can be of two types −
For Input → 35 33 42 10 14 19 27 44 26 31
Min-Heap − Where the value of the root node is less than or equal to either of
its children.
Max-Heap − Where the value of the root node is greater than or equal to
either of its children.

Both trees are constructed using the same input and order of arrival.

Max Heap Construction Algorithm


We shall use the same example to demonstrate how a Max Heap is created. The
procedure to create Min Heap is similar but we go for min values instead of max
values.
We are going to derive an algorithm for max heap by inserting one element at a time.
At any point of time, heap must maintain its property. While insertion, we also
assume that we are inserting a node in an already heapified tree.
Step 1 − Create a new node at the end of heap.
Step 2 − Assign new value to the node.
Step 3 − Compare the value of this child node with its parent.
Step 4 − If value of parent is less than child, then swap them.
Step 5 − Repeat step 3 & 4 until Heap property holds.
Note − In Min Heap construction algorithm, we expect the value of the parent node to
be less than that of the child node.
Let's understand Max Heap construction by an animated illustration. We consider the
same input sample that we used earlier.

Max Heap Deletion Algorithm


Let us derive an algorithm to delete from max heap. Deletion in Max (or Min)
Heap always happens at the root to remove the Maximum (or minimum)
value.
Step 1 − Remove root node.
Step 2 − Move the last element of last level to root.
Step 3 − Compare the value of this child node with its parent.
Step 4 − If value of parent is less than child, then swap them.
Step 5 − Repeat step 3 & 4 until Heap property holds.

Graph Data Structure


A graph is a pictorial representation of a set of objects where some pairs of objects are connected
by links. The interconnected objects are represented by points termed as vertices, and the links
that connect the vertices are called edges.
Formally, a graph is a pair of sets (V, E), where V is the set of vertices and E is the set of edges,
connecting the pairs of vertices. Take a look at the following graph −

In the above graph,


V = {a, b, c, d, e}
E = {ab, ac, bd, cd, de}

Graph Data Structure


Mathematical graphs can be represented in data structure. We can represent a graph using an
array of vertices and a two-dimensional array of edges. Before we proceed further, let's
familiarize ourselves with some important terms −
 Vertex − Each node of the graph is represented as a vertex. In the following example, the
labeled circle represents vertices. Thus, A to G are vertices. We can represent them using
an array as shown in the following image. Here A can be identified by index 0. B can be
identified using index 1 and so on.
 Edge − Edge represents a path between two vertices or a line between two vertices. In the
following example, the lines from A to B, B to C, and so on represents edges. We can
use a two-dimensional array to represent an array as shown in the following image. Here
AB can be represented as 1 at row 0, column 1, BC as 1 at row 1, column 2 and so on,
keeping other combinations as 0.
 Adjacency − Two node or vertices are adjacent if they are connected to each other
through an edge. In the following example, B is adjacent to A, C is adjacent to B, and so
on.
 Path − Path represents a sequence of edges between the two vertices. In the following
example, ABCD represents a path from A to D.
Basic Operations
Following are basic primary operations of a Graph −
 Add Vertex − Adds a vertex to the graph.
 Add Edge − Adds an edge between the two vertices of the graph.
 Display Vertex − Displays a vertex of the graph.

Depth First Traversal


Depth First Search (DFS) algorithm traverses a graph in a depthward motion and uses a stack to
remember to get the next vertex to start a search, when a dead end occurs in any iteration.
As in the example given above, DFS algorithm traverses from S to A to D to G to E to B first,
then to F and lastly to C. It employs the following rules.
 Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push it in a
stack.
 Rule 2 − If no adjacent vertex is found, pop up a vertex from the stack. (It will pop up all
the vertices from the stack, which do not have adjacent vertices.)
 Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.

Step Traversal Description

Initialize the stack.

2
Mark S as visited and put it
onto the stack. Explore any
unvisited adjacent node
from S. We have three nodes
and we can pick any of them.
For this example, we shall take
the node in an alphabetical
order.

Mark A as visited and put it


onto the stack. Explore any
unvisited adjacent node from
A. Both S and D are adjacent
to A but we are concerned for
unvisited nodes only.
4

Visit D and mark it as visited


and put onto the stack. Here,
we have B and C nodes, which
are adjacent to D and both are
unvisited. However, we shall
again choose in an alphabetical
order.

We choose B, mark it as
visited and put onto the stack.
Here B does not have any
unvisited adjacent node. So, we
pop B from the stack.

We check the stack top for


return to the previous node and
check if it has any unvisited
nodes. Here, we find D to be
on the top of the stack.

Only unvisited adjacent node is


from D is C now. So we
visit C, mark it as visited and
put it onto the stack.
As C does not have any unvisited adjacent node so we keep popping the stack until we find a
node that has an unvisited adjacent node. In this case, there's none and we keep popping until the
stack is empty.

Breadth First Traversal

Breadth First Search (BFS) algorithm traverses a graph in a breadthward motion and uses a
queue to remember to get the next vertex to start a search, when a dead end occurs in any
iteration.

As in the example given above, BFS algorithm traverses from A to B to E to F first then to C and
G lastly to D. It employs the following rules.
 Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Insert it in a
queue.
 Rule 2 − If no adjacent vertex is found, remove the first vertex from the queue.
 Rule 3 − Repeat Rule 1 and Rule 2 until the queue is empty.

Step Traversal Description


1

Initialize the queue.

We start from
visiting S (starting node), and
mark it as visited.

We then see an unvisited


adjacent node from S. In this
example, we have three nodes
but alphabetically we
choose A, mark it as visited
and enqueue it.

Next, the unvisited adjacent


node from S is B. We mark it
as visited and enqueue it.
5

Next, the unvisited adjacent


node from S is C. We mark it
as visited and enqueue it.

Now, S is left with no


unvisited adjacent nodes. So,
we dequeue and find A.

From A we have D as unvisited


adjacent node. We mark it as
visited and enqueue it.

At this stage, we are left with no unmarked (unvisited) nodes. But as per the algorithm we keep
on dequeuing in order to get all unvisited nodes. When the queue gets emptied, the program is
over.

Spanning Tree
A spanning tree is a subset of Graph G, which has all the vertices covered with minimum
possible number of edges. Hence, a spanning tree does not have cycles and it cannot be
disconnected..
By this definition, we can draw a conclusion that every connected and undirected Graph G has at
least one spanning tree. A disconnected graph does not have any spanning tree, as it cannot be
spanned to all its vertices.
We found three spanning trees off one complete graph. A complete undirected graph can have
maximum nn-2 number of spanning trees, where n is the number of nodes. In the above addressed
example, n is 3, hence 33−2 = 3 spanning trees are possible.

General Properties of Spanning Tree

We now understand that one graph can have more than one spanning tree. Following are a few
properties of the spanning tree connected to graph G −
 A connected graph G can have more than one spanning tree.
 All possible spanning trees of graph G, have the same number of edges and vertices.
 The spanning tree does not have any cycle (loops).
 Removing one edge from the spanning tree will make the graph disconnected, i.e. the
spanning tree is minimally connected.
 Adding one edge to the spanning tree will create a circuit or loop, i.e. the spanning tree
is maximally acyclic.

Mathematical Properties of Spanning Tree

 Spanning tree has n-1 edges, where n is the number of nodes (vertices).
 From a complete graph, by removing maximum e - n + 1 edges, we can construct a
spanning tree.
n-2
 A complete graph can have maximum n number of spanning trees.
Thus, we can conclude that spanning trees are a subset of connected Graph G and disconnected
graphs do not have spanning tree.

Application of Spanning Tree

Spanning tree is basically used to find a minimum path to connect all nodes in a graph. Common
application of spanning trees are −
 Civil Network Planning
 Computer Network Routing Protocol
 Cluster Analysis
Let us understand this through a small example. Consider, city network as a huge graph and now
plans to deploy telephone lines in such a way that in minimum lines we can connect to all city
nodes. This is where the spanning tree comes into picture.

Minimum Spanning Tree (MST)

In a weighted graph, a minimum spanning tree is a spanning tree that has minimum weight than
all other spanning trees of the same graph. In real-world situations, this weight can be measured
as distance, congestion, traffic load or any arbitrary value denoted to the edges.

Minimum Spanning-Tree Algorithm


We shall learn about two most important spanning tree algorithms here −
 Kruskal's Algorithm
 Prim's Algorithm

Kruskal's algorithm:
Kruskal's algorithm to find the minimum cost spanning tree uses the greedy approach. This
algorithm treats the graph as a forest and every node it has as an individual tree. A tree connects
to another only and only if, it has the least cost among all available options and does not violate
MST properties.
To understand Kruskal's algorithm let us consider the following example −

Step 1 - Remove all loops and Parallel Edges

Remove all loops and parallel edges from the given graph.
In case of parallel edges, keep the one which has the least cost associated and remove all others.

Step 2 - Arrange all edges in their increasing order of weight

The next step is to create a set of edges and weight, and arrange them in an ascending order of
weightage (cost).

Step 3 - Add the edge which has the least weightage

Now we start adding edges to the graph beginning from the one which has the least weight.
Throughout, we shall keep checking that the spanning properties remain intact. In case, by
adding one edge, the spanning tree property does not hold then we shall consider not to include
the edge in the graph.
The least cost is 2 and edges involved are B,D and D,T. We add them. Adding them does not
violate spanning tree properties, so we continue to our next edge selection.
Next cost is 3, and associated edges are A,C and C,D. We add them again −

Next cost in the table is 4, and we observe that adding it will create a circuit in the graph. −

We ignore it. In the process we shall ignore/avoid all edges that create a circuit.

We observe that edges with cost 5 and 6 also create circuits. We ignore them and move on.
Now we are left with only one node to be added. Between the two least cost edges available 7
and 8, we shall add the edge with cost 7.

By adding edge S,A we have included all the nodes of the graph and we now have minimum cost
spanning tree

Prim's algorithm
Prim's algorithm to find minimum cost spanning tree (as Kruskal's algorithm) uses the greedy
approach. Prim's algorithm shares a similarity with the shortest path first algorithms.
Prim's algorithm, in contrast with Kruskal's algorithm, treats the nodes as a single tree and keeps
on adding new nodes to the spanning tree from the given graph.
To contrast with Kruskal's algorithm and to understand Prim's algorithm better, we shall use the
same example −
Step 1 - Remove all loops and parallel edges

Remove all loops and parallel edges from the given graph. In case of parallel edges, keep the one
which has the least cost associated and remove all others.

Step 2 - Choose any arbitrary node as root node

In this case, we choose S node as the root node of Prim's spanning tree. This node is arbitrarily
chosen, so any node can be the root node. One may wonder why any video can be a root node.
So the answer is, in the spanning tree all the nodes of a graph are included and because it is
connected then there must be at least one edge, which will join it to the rest of the tree.

Step 3 - Check outgoing edges and select the one with less cost

After choosing the root node S, we see that S,A and S,C are two edges with weight 7 and 8,
respectively. We choose the edge S,A as it is lesser than the other.

Now, the tree S-7-A is treated as one node and we check for all edges going out from it. We
select the one which has the lowest cost and include it in the tree.

After this step, S-7-A-3-C tree is formed. Now we'll again treat it as a node and will check all the
edges again. However, we will choose only the least cost edge. In this case, C-3-D is the new
edge, which is less than other edges' cost 8, 6, 4, etc.

After adding node D to the spanning tree, we now have two edges going out of it having the
same cost, i.e. D-2-T and D-2-B. Thus, we can add either one. But the next step will again yield
edge 2 as the least cost. Hence, we are showing a spanning tree with both edges included.
We may find that the output spanning tree of the same graph using two different algorithms is
same.

You might also like