DS - Unit-3 (Amiraj) (VisionPapers - In)
A node is an entity that contains a key or value and pointers to its child nodes.
The last nodes of each path are called leaf nodes or external nodes that do not
contain a link/pointer to child nodes.
Edge
Height of a Node
The height of a node is the number of edges from the node to the deepest leaf (i.e., the longest path from the node to a
leaf node).
Depth of a Node
The depth of a node is the number of edges from the root to the node.
TREE TERMINOLOGIES
Height of a Tree
The height of a Tree is the height of the root node or the depth of the
deepest node.
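The height definition above can be sketched recursively in C. This is a minimal illustration, not code from the slides: the field names left and right and the convention that an empty tree has height -1 (so that a single node has height 0, counted in edges) are assumptions.

```c
#include <stddef.h>

/* Binary tree node; the field names are assumed for this sketch. */
struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Height counted in edges: a leaf has height 0.
   An empty tree is given height -1 so the recursion works out. */
int height(const struct node *root)
{
    if (root == NULL)
        return -1;
    int lh = height(root->left);
    int rh = height(root->right);
    return 1 + (lh > rh ? lh : rh);
}
```

Calling height on the root gives the height of the whole tree, which matches the depth of the deepest node.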
Degree of a Node
Forest
A full binary tree is a special type of binary tree in which every parent node/internal node has either two
or no children.
A perfect binary tree is a type of binary tree in which every internal node has exactly two child nodes and all
the leaf nodes are at the same level.
TYPES OF BINARY TREE
Complete Binary Tree
A degenerate or pathological tree is a tree in which every parent node has a single child, either left or right.
A skewed binary tree is a pathological/degenerate tree in which the tree is either dominated by the left nodes
or the right nodes. Thus, there are two types of skewed binary tree: left-skewed and right-skewed.
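struct node {
    int data;
    struct node *left;   /* left child, NULL if absent */
    struct node *right;  /* right child, NULL if absent */
};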
BINARY TREE TRAVERSAL
Traversing a tree means visiting every node in the tree. You might, for
instance, want to add all the values in the tree or find the largest one. For all
these operations, you will need to visit each node of the tree.
Linear data structures like arrays, stacks, queues, and linked lists have only
one way to read the data. But a hierarchical data structure like a tree can be
traversed in different ways.
Instead, we use traversal methods that take into account the basic structure of a tree i.e.
struct node {
    int data;
    struct node *left;
    struct node *right;
};
The struct node pointed to by left and right might have other left and right children so we should think of
● Two subtrees
INORDER TRAVERSAL
inorder(root->left)
display(root->data)
inorder(root->right)
PREORDER TRAVERSAL
1. Visit root node
2. Visit all the nodes in the left subtree
3. Visit all the nodes in the right subtree
display(root->data)
preorder(root->left)
preorder(root->right)
POSTORDER TRAVERSAL
Postorder traversal
postorder(root->left)
postorder(root->right)
display(root->data)
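The three traversal orders above can be written as runnable C. This is a minimal sketch: instead of the display() call used in the pseudocode, each function appends the visited value to a caller-supplied array out[] and returns the new count, which is an assumption made here so the result can be inspected. The caller must make out[] large enough for every node.

```c
#include <stddef.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Left subtree, then root, then right subtree. */
int inorder(const struct node *root, int out[], int n)
{
    if (root == NULL)
        return n;
    n = inorder(root->left, out, n);
    out[n++] = root->data;
    return inorder(root->right, out, n);
}

/* Root, then left subtree, then right subtree. */
int preorder(const struct node *root, int out[], int n)
{
    if (root == NULL)
        return n;
    out[n++] = root->data;
    n = preorder(root->left, out, n);
    return preorder(root->right, out, n);
}

/* Left subtree, then right subtree, then root. */
int postorder(const struct node *root, int out[], int n)
{
    if (root == NULL)
        return n;
    n = postorder(root->left, out, n);
    n = postorder(root->right, out, n);
    out[n++] = root->data;
    return n;
}
```

For a tree with root 1, left child 2 (with children 4 and 5) and right child 3, the inorder sequence is 4 2 5 1 3, the preorder sequence is 1 2 4 5 3, and the postorder sequence is 4 5 2 3 1.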
CONSTRUCT A BINARY TREE FROM TRAVERSAL
THREADED BINARY TREE
A binary tree can be represented using an array representation or a linked list representation. When a binary tree is
represented using a linked list, the reference field of a node that doesn't have a child is filled
with a NULL pointer. In any binary tree linked list representation, there are more NULL pointers than
actual pointers. In general, a binary tree with N nodes has 2N reference fields,
of which N+1 are filled with NULL (N+1 are NULL out of 2N). These NULL pointers
play no role except indicating that there is no link (no child).
A. J. Perlis and C. Thornton proposed a new binary tree called the "Threaded Binary Tree", which makes use of
NULL pointers to improve its traversal process. In a threaded binary tree, NULL pointers are replaced by
references to other nodes in the tree. These extra references are called threads.
A threaded binary tree is a binary tree in which every left child pointer that is NULL (in the linked list
representation) points to the node's in-order predecessor, and every right child pointer that is NULL (in the linked
list representation) points to the node's in-order successor.
To convert the above example binary tree into a threaded binary tree, first find the in-order
traversal of that tree...
In-order traversal of above binary tree...
H-D-I-B-E-A-F-J-C-G
When we represent the above binary tree using linked list representation, the left child pointers of nodes
H, I, E, F, J and G are NULL. Each of these NULLs is replaced by the address of the node's in-order predecessor
(I to D, E to B, F to A, J to F and G to C), but node H does not have an in-order
predecessor, so it points to the root node A. The right child pointers of nodes H, I, E, J and G
are also NULL. These NULL pointers are replaced by the address of the node's in-order successor (H
to D, I to B, E to A, and J to C), but node G does not have an in-order successor, so it
points to the root node A.
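How threads remove the need for a stack during in-order traversal can be sketched in C. This is an illustration under stated assumptions, not the slides' implementation: the struct tnode layout and the flag names lthread and rthread are hypothetical, and the two end threads (the first node's left and the last node's right) are set to NULL here for simplicity, whereas the example above points them at the root node A.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threaded node: a flag records whether each pointer
   is a real child link (false) or a thread (true). */
struct tnode {
    char data;
    struct tnode *left;
    struct tnode *right;
    bool lthread;  /* true: left points to the in-order predecessor */
    bool rthread;  /* true: right points to the in-order successor */
};

/* Follow real left links down to the first node in in-order. */
static struct tnode *leftmost(struct tnode *n)
{
    while (n != NULL && !n->lthread && n->left != NULL)
        n = n->left;
    return n;
}

/* In-order successor without a stack or recursion. */
struct tnode *inorder_successor(struct tnode *n)
{
    if (n->rthread)
        return n->right;        /* the thread leads straight to it */
    return leftmost(n->right);  /* otherwise: leftmost node on the right */
}

/* Writes up to cap node labels into out[]; returns how many were written. */
int threaded_inorder(struct tnode *root, char out[], int cap)
{
    int n = 0;
    struct tnode *cur = leftmost(root);
    while (cur != NULL && n < cap) {
        out[n++] = cur->data;
        cur = inorder_successor(cur);
    }
    return n;
}
```

The loop visits each node by following either a thread or a real link, so no auxiliary stack is needed.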
THREADED BINARY TREE
ADVANTAGES
❖ Inorder traversal is faster than the unthreaded version, as a stack is not required.
❖ Effectively determines the predecessor and successor for inorder traversal; for an
unthreaded tree this task is more difficult.
❖ A stack is required to provide upward pointing information in a binary tree, which
threading provides without a stack.
❖ It is possible to generate the successor or predecessor of any node without the
overhead of a stack, with the help of threading.
DISADVANTAGES
❖ Threaded trees are unable to share common subtrees.
❖ If negative addressing is not permitted in the programming language, two additional
fields are required.
❖ Insertion into and deletion from a threaded binary tree are more time consuming
because both thread and structural links must be maintained.
BINARY SEARCH TREE
A binary search tree is a data structure that allows us to quickly maintain a sorted list of numbers.
● It is called a binary tree because each tree node has a maximum of two children.
● It is called a search tree because it can be used to search for the presence of a number in O(log(n))
time when the tree is balanced (a badly skewed tree degrades to O(n)).
● The properties that separates a binary search tree from a regular binary tree is
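A search over such a tree can be sketched in C. The sketch assumes the usual binary search tree ordering, which is stated here as an assumption: every key in a node's left subtree is smaller than the node's key and every key in its right subtree is larger. The function name bst_search is illustrative.

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Returns true if key is present. Each comparison discards one whole
   subtree, which is where the O(log n) cost on a balanced tree comes from. */
bool bst_search(const struct node *root, int key)
{
    while (root != NULL) {
        if (key == root->data)
            return true;
        root = (key < root->data) ? root->left : root->right;
    }
    return false;
}
```

The number of loop iterations is bounded by the height of the tree, so a skewed tree makes the search linear.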
Balance factor of a node in an AVL tree is the difference between the height of the left subtree and
that of the right subtree of that node.
Balance Factor = (Height of Left Subtree - Height of Right Subtree) or (Height of Right Subtree -
Height of Left Subtree)
The self-balancing property of an AVL tree is maintained by the balance factor. The value of the balance
factor should always be -1, 0 or +1.
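The balance factor can be computed in a few lines of C. This is a sketch under assumptions: each node caches its own height in a height field (a common choice, not something the slides specify), heights are counted in edges so an empty subtree has height -1, and the left-minus-right form of the formula is used.

```c
#include <stddef.h>

/* Hypothetical AVL node that caches the height of its subtree. */
struct avl_node {
    int key;
    int height;  /* in edges: a leaf has height 0 */
    struct avl_node *left;
    struct avl_node *right;
};

/* Height of a possibly empty subtree; empty counts as -1. */
static int subtree_height(const struct avl_node *n)
{
    return n != NULL ? n->height : -1;
}

/* Balance factor = height of left subtree - height of right subtree. */
int balance_factor(const struct avl_node *n)
{
    if (n == NULL)
        return 0;
    return subtree_height(n->left) - subtree_height(n->right);
}
```

A result outside the range -1 to +1 means the node violates the AVL property and a rotation is needed.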
LEFT ROTATION ON AVL TREE
In left-rotation, the arrangement of the nodes on the right is transformed into the arrangements on the left
node.
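A left rotation can be sketched in C as follows. The names avl_node, rotate_left and update_height are illustrative, heights are again assumed to be cached per node and counted in edges, and the function assumes the node being rotated has a right child.

```c
#include <stddef.h>

struct avl_node {
    int key;
    int height;  /* in edges: a leaf has height 0 */
    struct avl_node *left;
    struct avl_node *right;
};

static int subtree_height(const struct avl_node *n)
{
    return n != NULL ? n->height : -1;
}

static void update_height(struct avl_node *n)
{
    int lh = subtree_height(n->left);
    int rh = subtree_height(n->right);
    n->height = 1 + (lh > rh ? lh : rh);
}

/* Rotate left around x (x->right must not be NULL).
   Returns the new root of this subtree. */
struct avl_node *rotate_left(struct avl_node *x)
{
    struct avl_node *y = x->right;
    x->right = y->left;  /* y's left subtree moves under x */
    y->left = x;         /* x becomes y's left child */
    update_height(x);    /* x is now lower, so update it first */
    update_height(y);
    return y;
}
```

The caller is expected to store the returned pointer back into the parent's child field (or the tree's root pointer).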
Algorithm
1. If newKey < rootKey, call insertion algorithm on the left subtree of the current node until the
leaf node is reached.
2. Else if newKey > rootKey, call insertion algorithm on the right subtree of current node until the
leaf node is reached.
1. If balanceFactor > 1, it means the height of the left subtree is greater than that of the right
1. Locate nodeToBeDeleted (recursion is used to find nodeToBeDeleted in the code used below).
❖ If nodeToBeDeleted has one child, then substitute the contents of nodeToBeDeleted with that of the child.
Let's try to understand this through an example. On Facebook, everything is a node. That includes
User, Photo, Album, Event, Group, Page, Comment, Story, Video, Link, Note... anything that has
data is a node.
Every relationship is an edge from one node to another. Whether you post a photo, join a group,
like a page, etc., a new edge is created for that relationship.
Example of graph data structure
All of Facebook is then a collection of
these nodes and edges. This is because
vertices (u,v)
GRAPH TERMINOLOGY
❖ Adjacency: A vertex is said to be adjacent to another vertex if there is an edge connecting
them. Vertices 2 and 3 are not adjacent because there is no edge between them.
❖ Path: A sequence of edges that allows you to go from vertex A to vertex B is called a path.
0-1, 1-2 and 0-2 are paths from vertex 0 to vertex 2.
❖ Directed Graph: A graph in which an edge (u,v) doesn't necessarily mean that there is an
edge (v, u) as well. The edges in such a graph are represented by arrows to show the direction
of the edge.
GRAPH REPRESENTATION
1. Adjacency Matrix
❖ An adjacency matrix is a 2D array of size V x V, where V is the number of vertices. Each row and column represents a vertex.
❖ If the value of any element a[i][j] is 1, it represents that there is an edge connecting vertex i and vertex j.
❖ Since it is an undirected graph, for edge (0,2), we also need to mark edge (2,0), making the adjacency
matrix symmetric about the diagonal.
❖ Edge lookup (checking if an edge exists between vertex A and vertex B) is extremely fast in the adjacency
matrix representation, but we have to reserve space for every possible link between all vertices (V x V), so
it requires more space.
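An adjacency matrix for an undirected graph can be sketched in C as below. The names struct graph, graph_init, graph_add_edge and graph_has_edge, and the fixed bound MAX_V, are assumptions made for this illustration.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_V 8  /* hypothetical upper bound on the number of vertices */

struct graph {
    int v;                            /* number of vertices in use */
    unsigned char adj[MAX_V][MAX_V];  /* adj[i][j] == 1 means edge i-j */
};

void graph_init(struct graph *g, int v)
{
    g->v = (v > MAX_V) ? MAX_V : v;
    memset(g->adj, 0, sizeof g->adj);
}

/* Undirected edge: mark both (i,j) and (j,i) so the matrix stays symmetric. */
bool graph_add_edge(struct graph *g, int i, int j)
{
    if (i < 0 || j < 0 || i >= g->v || j >= g->v)
        return false;
    g->adj[i][j] = 1;
    g->adj[j][i] = 1;
    return true;
}

/* Constant-time lookup: a single array access. */
bool graph_has_edge(const struct graph *g, int i, int j)
{
    if (i < 0 || j < 0 || i >= g->v || j >= g->v)
        return false;
    return g->adj[i][j] == 1;
}
```

The lookup cost is a single array access, while the storage is always MAX_V x MAX_V cells regardless of how many edges exist.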
❖ The algorithm uses a greedy approach in the sense that we find the next best solution hoping that the end result is
the best solution for the whole problem.
MINIMUM SPANNING TREE
The cost of the spanning tree is the sum of the weights of all the edges in the tree. There can be
many spanning trees. Minimum spanning tree is the spanning tree where the cost is minimum
among all the spanning trees. There also can be many minimum spanning trees.
Minimum spanning tree has direct application in the design of networks. It is used in algorithms
approximating the travelling salesman problem, multi-terminal minimum cut problem and
minimum-cost weighted perfect matching. Other practical applications are:
1. Cluster Analysis
2. Handwriting recognition
3. Image segmentation
KRUSKAL’S ALGORITHM
Kruskal's Algorithm builds the spanning tree by adding edges one by one into a growing spanning tree.
Kruskal's algorithm follows a greedy approach: in each iteration it finds the edge with the least weight
and adds it to the growing spanning tree.
Algorithm Steps:
This could be done using DFS, which starts from the first vertex and then checks whether the second vertex is visited
or not. But DFS will make the time complexity large, as it has an order of O(V+E), where V is the number of
vertices and E is the number of edges.
Disjoint sets are sets whose intersection is the empty set so it means that they don't have any element in
common.
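A disjoint-set (union-find) structure can be sketched in C as follows. The names struct dsu, dsu_init, dsu_find and dsu_union and the bound MAX_N are hypothetical. The sketch uses path halving inside dsu_find as an optimization, which goes beyond what the text describes.

```c
#define MAX_N 16  /* hypothetical upper bound on the number of elements */

struct dsu {
    int parent[MAX_N];  /* parent[x] == x means x is a set representative */
    int n;
};

/* Start with every element in its own set. Assumes 0 <= n <= MAX_N. */
void dsu_init(struct dsu *d, int n)
{
    d->n = n;
    for (int i = 0; i < n; i++)
        d->parent[i] = i;
}

/* Find the representative of x's set, shortening the path as we go. */
int dsu_find(struct dsu *d, int x)
{
    while (d->parent[x] != x) {
        d->parent[x] = d->parent[d->parent[x]];  /* path halving */
        x = d->parent[x];
    }
    return x;
}

/* Merge the sets of a and b. Returns 0 if they were already in the
   same set, which is how a cycle would be detected for an edge a-b. */
int dsu_union(struct dsu *d, int a, int b)
{
    int ra = dsu_find(d, a);
    int rb = dsu_find(d, b);
    if (ra == rb)
        return 0;
    d->parent[ra] = rb;
    return 1;
}
```

Two vertices are in the same connected piece exactly when dsu_find returns the same representative for both.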
In Kruskal's algorithm, at each iteration we select the edge with the lowest weight. So,
we start with the lowest weighted edge first, i.e., the edge with weight 1. After that we
select the second lowest weighted edge, i.e., the edge with weight 2. Notice these two edges
are totally disjoint. The next edge is the third lowest weighted edge, i.e., the edge with
weight 3, which connects the two disjoint pieces of the graph. Now, we are not allowed to
pick the edge with weight 4, because that would create a cycle and we can't have any cycles. So we
select the fifth lowest weighted edge, i.e., the edge with weight 5. The other two edges would
create cycles, so we ignore them. In the end, we end up with a minimum spanning tree
with total cost 11 ( = 1 + 2 + 3 + 5).
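The walkthrough above can be sketched in C. This is an illustration, not the slides' code: struct edge, kruskal and find_root are hypothetical names, cycle detection uses a small parent-array union-find, and the function assumes v >= 1 and that every endpoint lies in the range 0 to v-1.

```c
#include <stdlib.h>

struct edge {
    int u, v;  /* endpoints, numbered 0 .. v-1 */
    int w;     /* weight */
};

static int cmp_edge(const void *a, const void *b)
{
    const struct edge *x = a;
    const struct edge *y = b;
    return (x->w > y->w) - (x->w < y->w);
}

static int find_root(int parent[], int x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];  /* path halving */
        x = parent[x];
    }
    return x;
}

/* Sorts edges[] in place by weight and returns the total weight of the
   edges picked, or -1 if memory allocation fails. */
int kruskal(struct edge edges[], int e, int v)
{
    int *parent = malloc((size_t)v * sizeof *parent);
    if (parent == NULL)
        return -1;
    for (int i = 0; i < v; i++)
        parent[i] = i;

    qsort(edges, (size_t)e, sizeof edges[0], cmp_edge);

    int total = 0;
    int used = 0;
    for (int i = 0; i < e && used < v - 1; i++) {
        int ru = find_root(parent, edges[i].u);
        int rv = find_root(parent, edges[i].v);
        if (ru == rv)
            continue;     /* both ends already connected: would form a cycle */
        parent[ru] = rv;  /* merge the two pieces */
        total += edges[i].w;
        used++;
    }
    free(parent);
    return total;
}
```

Since the figure for the walkthrough is not reproduced here, one hypothetical graph consistent with it has five vertices and the weighted edges 0-1 (1), 2-3 (2), 1-2 (3), 0-2 (4), 3-4 (5), 1-3 (6) and 2-4 (7). On that input the sketch skips the edge of weight 4 and stops after picking weights 1, 2, 3 and 5.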
PRIM’S ALGORITHM
Prim's Algorithm also uses a greedy approach to find the minimum spanning tree. In Prim's Algorithm we grow
the spanning tree from a starting position. Whereas Kruskal's adds an edge at each step, Prim's adds a vertex
to the growing spanning tree.
Algorithm Steps:
● Maintain two disjoint sets of vertices: one containing the vertices that are in the growing spanning tree and
the other containing the vertices that are not.
● Select the cheapest vertex that is connected to the growing spanning tree and is not yet in it,
and add it to the growing spanning tree. This can be done using a priority queue. Insert the
vertices that are connected to the growing spanning tree into the priority queue.
● Check for cycles. To do that, mark the nodes which have already been selected and insert only those nodes
into the priority queue that are not marked.
In Prim’s Algorithm, we will start with an arbitrary node (it doesn’t matter which one) and
mark it. In each iteration we will mark a new vertex that is adjacent to the one that we have
already marked. As a greedy algorithm, Prim’s algorithm will select the cheapest edge and
mark the vertex. So we will simply choose the edge with weight 1. In the next iteration we
have three options, edges with weight 2, 3 and 4. So, we will select the edge with weight 2
and mark the vertex. Now again we have three options, edges with weight 3, 4 and 5. But we
can't choose the edge with weight 3, as it would create a cycle. So we select the edge with
weight 4 and we end up with the minimum spanning tree of total cost 7 ( = 1 + 2 + 4).
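The Prim walkthrough can be sketched in C. Note the swapped-in technique: for brevity this sketch finds the cheapest unmarked vertex with a linear scan over an array rather than the priority queue mentioned in the steps, giving O(V^2) time. The name prim_mst_cost, the bound MAX_V, and the convention that a weight of 0 means "no edge" are assumptions.

```c
#include <limits.h>
#include <stdbool.h>

#define MAX_V 8  /* hypothetical upper bound on the number of vertices */

/* w[i][j] is the weight of edge i-j, or 0 if there is no edge.
   Returns the total MST cost, or -1 on bad input or a disconnected graph. */
int prim_mst_cost(int n, int w[MAX_V][MAX_V], int start)
{
    if (n <= 0 || n > MAX_V || start < 0 || start >= n)
        return -1;

    bool marked[MAX_V] = { false };
    int key[MAX_V];  /* cheapest known edge linking each vertex to the tree */
    for (int i = 0; i < n; i++)
        key[i] = INT_MAX;
    key[start] = 0;

    int total = 0;
    for (int step = 0; step < n; step++) {
        /* Pick the cheapest reachable vertex that is not yet marked. */
        int u = -1;
        for (int i = 0; i < n; i++)
            if (!marked[i] && key[i] != INT_MAX && (u == -1 || key[i] < key[u]))
                u = i;
        if (u == -1)
            return -1;  /* some vertex cannot be reached */

        marked[u] = true;  /* marked vertices are never picked again: no cycles */
        total += key[u];

        /* Update the cheapest connecting edge for u's unmarked neighbours. */
        for (int j = 0; j < n; j++)
            if (w[u][j] != 0 && !marked[j] && w[u][j] < key[j])
                key[j] = w[u][j];
    }
    return total;
}
```

One hypothetical four-vertex graph consistent with the walkthrough has edges 0-1 (1), 1-2 (2), 0-2 (3), 0-3 (4) and 2-3 (5). Starting from vertex 0, the sketch picks the edges of weight 1, 2 and 4.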