Trees One
Tree
Dr Deepak Gupta
Assistant Professor, SMIEEE
CSED, MNNIT Allahabad, Prayagraj
Email: [email protected]
Non-linear Data Structures
A data structure is said to be non-linear if its elements form a hierarchical
relationship where data items appear at various levels.
Trees and Graphs are widely used non-linear data structures. Tree and
graph structures represent hierarchical relationships between individual
data elements.
Trees can be defined recursively as:
A tree is a finite set of nodes such that:
1. There is a distinguished node called root node
2. The remaining nodes are partitioned into n ≥ 0 disjoint sets T1, T2, …, Tn,
where each of these sets is a tree. The sets T1, T2, …, Tn are the
subtrees of the root.
Proof:
If we sum up the maximum number of nodes possible on each level then we
can get the maximum number of nodes possible in the binary tree. First level
is 0 and last level is h-1. By using property 1, the total number of nodes
possible in a binary tree of height h is given by:
$$n = \sum_{i=0}^{h-1} 2^i$$
$$n = 1 + 2^1 + 2^2 + \dots + 2^{h-1}$$
$$n = \frac{2^{(h-1)+1} - 1}{2 - 1}$$
$$n = 2^h - 1$$
Prepared by: Dr Deepak Gupta, CSED, MNNIT Allahabad, India
Property 3: The minimum number of nodes possible in a binary tree of
height h is equal to h.
Property 7: A strictly binary tree with n non-leaf nodes has n+1 leaf nodes.
Property 8: A strictly binary tree with n leaf nodes always has 2n-1 nodes.
Remark: Number of leaf (external) nodes = Number of internal nodes + 1
Property 10: If the height of a complete binary tree is h, h ≥ 1, then the
minimum number of nodes possible is $2^{h-1}$ and the maximum number of nodes
possible is $2^h - 1$.
Proof: The number of nodes will be maximum when the last level also
contains maximum nodes, i.e. all levels are full, and so total nodes will be
$2^h - 1$ from property 2.
The number of nodes will be minimal when the last level has only one node. In
this case the total nodes will be
Total nodes in a full binary tree of height (h-1) + one node
i.e. $(2^{h-1} - 1) + 1 = 2^{h-1}$.
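The two bounds of Property 10 can be checked numerically. The following is a minimal sketch; the function names min_nodes_complete and max_nodes_complete are illustrative assumptions and do not come from the slides.

```c
#include <assert.h>

/* Minimum number of nodes in a complete binary tree of height h: 2^(h-1). */
unsigned long min_nodes_complete(unsigned h)
{
    assert(h >= 1 && h < 8 * sizeof(unsigned long));
    return 1UL << (h - 1);
}

/* Maximum number of nodes in a complete binary tree of height h: 2^h - 1. */
unsigned long max_nodes_complete(unsigned h)
{
    assert(h >= 1 && h < 8 * sizeof(unsigned long));
    return (1UL << h) - 1;
}
```

For h = 3 the functions give 4 and 7: a full tree of height 2 (3 nodes) plus one node on the last level, and a full tree of height 3.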
5. Balanced Binary Tree
A balanced binary tree is a binary tree
in which the height of the left and the
right sub-trees of every node may
differ by at most 1.
Remark: AVL Tree and Red-Black Tree are
well-known self-balancing binary search trees.
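The definition above can be tested directly on a tree. The sketch below assumes a hypothetical node type struct tnode and takes the height of an empty tree as 0; it computes heights bottom-up and stops as soon as some node violates the "differ by at most 1" condition.

```c
#include <assert.h>
#include <stdlib.h>

struct tnode { int info; struct tnode *left, *right; };

/* Returns the height of t (number of levels), or -1 if some node in t
   has left and right sub-tree heights that differ by more than 1. */
static int checked_height(const struct tnode *t)
{
    if (t == NULL) return 0;
    int hl = checked_height(t->left);
    if (hl < 0) return -1;
    int hr = checked_height(t->right);
    if (hr < 0) return -1;
    if (abs(hl - hr) > 1) return -1;
    return 1 + (hl > hr ? hl : hr);
}

/* Non-zero if every node of the tree satisfies the balance condition. */
int is_balanced(const struct tnode *root)
{
    return checked_height(root) >= 0;
}
```

Each node is visited once, so the check takes time proportional to the number of nodes.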
The process of visiting each node of the tree exactly once is called tree
traversal. Traversing a binary tree involves three basic activities:
1. Visit the root
2. Traverse the left sub-tree
3. Traverse the right sub-tree
Performing these three activities in different orders gives the following
traversals:
1. Pre-order Traversal (NLR)
a. Visit the root (N)
b. Traverse the left subtree of root in
preorder(L)
c. Traverse the right subtree of root in
preorder(R)
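The three pre-order steps map directly onto a recursive function. This is a minimal sketch, assuming a hypothetical node type struct tnode with single-character labels; instead of printing, it appends each visited label to a caller-supplied buffer.

```c
#include <assert.h>
#include <stddef.h>

struct tnode { char info; struct tnode *left, *right; };

/* Appends the labels of t to out[] in NLR order, starting at index n.
   Returns the index just past the last label written. */
size_t preorder(const struct tnode *t, char *out, size_t n)
{
    if (t == NULL) return n;
    assert(out != NULL);
    out[n++] = t->info;                 /* N: visit the root          */
    n = preorder(t->left, out, n);      /* L: left subtree, pre-order  */
    return preorder(t->right, out, n);  /* R: right subtree, pre-order */
}
```

The caller passes a buffer large enough for all nodes and starts with n = 0.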
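[Figure: creation of a binary tree from its pre-order and in-order traversals]
In-order, split around the root: D H B E | (root) | I F C J G K
Left subtree of the root:   Pre-order : B D H E        In-order : D H B E
Right subtree of the root:  Pre-order : C F I G J K    In-order : I F C J G K
Left subtree of B:          Pre-order : D H            In-order : D H
Left subtree of C:          Pre-order : F I            In-order : I F
Right subtree of C:         Pre-order : G J K          In-order : J G K
Resulting tree below the root, level by level:
B C
D E F G
H I J K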
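The splitting shown above can be sketched as a recursive function: the first pre-order label is the root, and its position in the in-order sequence separates the left subtree from the right one. The names struct tnode and build_pre_in are illustrative assumptions, and the sketch assumes all labels are distinct.

```c
#include <assert.h>
#include <stdlib.h>

struct tnode { char info; struct tnode *left, *right; };

/* Builds the tree whose pre-order is pre[0..n-1] and whose in-order is
   in[0..n-1]. Labels are assumed to be distinct. */
struct tnode *build_pre_in(const char *pre, const char *in, int n)
{
    if (n <= 0) return NULL;
    int k = 0;
    while (k < n && in[k] != pre[0]) k++;   /* find the root in in-order */
    assert(k < n);                          /* both must describe one tree */
    struct tnode *t = malloc(sizeof *t);
    assert(t != NULL);
    t->info = pre[0];
    /* k nodes lie to the left of the root, n-k-1 to the right */
    t->left = build_pre_in(pre + 1, in, k);
    t->right = build_pre_in(pre + 1 + k, in + k + 1, n - k - 1);
    return t;
}
```

Calling build_pre_in("BDHE", "DHBE", 4) rebuilds the subtree rooted at B from the example: D becomes the left child (with H as its right child) and E the right child.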
Creation of Binary Tree from In-order and Post-order traversals
In-order, split around the root: D H B E | (root) | I F C J G K
Left subtree of the root:   Post-order : H D E B        In-order : D H B E
Right subtree of the root:  Post-order : I F J K G C    In-order : I F C J G K
Left subtree of B:          Post-order : H D            In-order : D H
Left subtree of C:          Post-order : I F            In-order : I F
Right subtree of C:         Post-order : J K G          In-order : J G K
Resulting tree below the root, level by level:
B C
D E F G
H I J K
Creation of Binary Tree from Pre-order and Post-order traversals
[Figure: the root with B = x1 as its left child and C = x2 as its right child]
Under B:  Pre-order : D E H I    Post-order : D H I E
Under C:  Pre-order : F G        Post-order : F G
Taking B as the root node:
• Find the node succeeding the root node in pre-order, say x1 (i.e. D), and the node
preceding the root node in post-order, say x2 (i.e. E).
• If x1 != x2, then x1 is taken as the left child (i.e. D) and x2 is taken as the right child
(i.e. E) of the root node.
• Find the position of x2, i.e. 'E', in pre-order and the position of x1, i.e. 'D', in post-order.
• Consider two sets of pre-order and post-order traversals, for the left and right sub-trees of
the root.
• The first set consists of the nodes that are present after x1 and before x2 in pre-order
traversal, i.e. NULL, and the nodes present before x1 in post-order, i.e. NULL.
• The second set consists of the nodes that are present after x2 in pre-order traversal, i.e. H I,
and the nodes present after x1 and before x2 in post-order traversal, i.e. H I.
[Figure: D = x1 and E = x2 as the left and right children of B]
Under E:  Pre-order : H I    Post-order : H I
Under C:  Pre-order : F G    Post-order : F G
arithmetic expression: 4 / ( 2 - ( - 8 * 3 ) )
To build the expression tree, scan the postfix form of the expression from left to
right and check each symbol in turn:
1. If the scanned symbol is an operand, create a node for it and push it onto the
stack.
2. If the scanned symbol is an operator, pop two nodes from the stack and make them
its children (the first node popped becomes the right child, the second the left
child), then push the new operator node back onto the stack.
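The two stack rules can be sketched as follows. This is a simplified illustration, not the slides' own code: it assumes the expression is already in postfix form, operands are single digits, only the binary operators + - * / occur, and there are no spaces. The names struct enode, build_expr and eval_expr are hypothetical. Since unary minus is not handled, -8 is written here as 0 - 8.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

struct enode { char sym; struct enode *left, *right; };

static struct enode *new_enode(char sym, struct enode *l, struct enode *r)
{
    struct enode *e = malloc(sizeof *e);
    assert(e != NULL);
    e->sym = sym; e->left = l; e->right = r;
    return e;
}

/* Builds an expression tree from a postfix string such as "23*4+". */
struct enode *build_expr(const char *postfix)
{
    struct enode *stack[64];
    int top = 0;
    for (const char *p = postfix; *p != '\0'; p++) {
        if (isdigit((unsigned char)*p)) {        /* rule 1: operand */
            assert(top < 64);
            stack[top++] = new_enode(*p, NULL, NULL);
        } else {                                 /* rule 2: operator */
            assert(top >= 2);
            struct enode *r = stack[--top];      /* popped first: right child */
            struct enode *l = stack[--top];      /* popped second: left child */
            stack[top++] = new_enode(*p, l, r);
        }
    }
    assert(top == 1);                            /* exactly one tree remains */
    return stack[0];
}

/* Evaluates the tree with integer arithmetic. */
int eval_expr(const struct enode *e)
{
    if (e->left == NULL) return e->sym - '0';    /* leaf: single digit */
    int a = eval_expr(e->left), b = eval_expr(e->right);
    switch (e->sym) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    default:
        assert(e->sym == '/' && b != 0);
        return a / b;
    }
}
```

Under these assumptions the expression above becomes the postfix string "4208-3*-/", whose tree has '/' at the root, the operand 4 as the left child and the '-' node as the right child.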
In-order: 10 , 15 , 20 , 23 , 25 , 30 , 35 , 39 , 42
Level order: 30, 20, 39, 10, 25, 35, 42, 15, 23
Remark: Note that the in-order traversal of a binary search tree gives us all keys of
that tree in ascending order.
Suppose the data elements are - 45, 15, 79, 90, 10, 55, 12, 20, 50
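Inserting these elements one by one can be sketched as below. The type struct bst and the functions bst_insert and bst_inorder are illustrative names, and ignoring duplicate keys is an assumption of this sketch. The in-order helper demonstrates the remark that an in-order traversal of a BST yields the keys in ascending order.

```c
#include <assert.h>
#include <stdlib.h>

struct bst { int key; struct bst *left, *right; };

/* Inserts key into the BST rooted at t and returns the (possibly new) root.
   Smaller keys go left, larger keys go right, duplicates are ignored. */
struct bst *bst_insert(struct bst *t, int key)
{
    if (t == NULL) {
        t = malloc(sizeof *t);
        assert(t != NULL);
        t->key = key;
        t->left = t->right = NULL;
    } else if (key < t->key) {
        t->left = bst_insert(t->left, key);
    } else if (key > t->key) {
        t->right = bst_insert(t->right, key);
    }
    return t;
}

/* Writes the keys in in-order to out[], starting at index n.
   Returns the index just past the last key written. */
int bst_inorder(const struct bst *t, int *out, int n)
{
    if (t == NULL) return n;
    n = bst_inorder(t->left, out, n);
    out[n++] = t->key;
    return bst_inorder(t->right, out, n);
}
```

With the elements listed above, 45 becomes the root, 15 its left child and 79 its right child.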
Given a BST, the task is to delete a node in this BST, which can be broken
down into 3 cases:
Case 1. Delete a Leaf Node in BST
A binary tree with n nodes has 2n pointers, out of which n+1 are always
NULL, so about half the space allocated for pointers is wasted. We can
use this wasted space to hold some useful information.
A left NULL pointer can be used to store the address of inorder predecessor
of the node and a right NULL pointer can be used to store the address of
inorder successor of the node.
These pointers are called threads and a binary tree which implements these
pointers is called a threaded binary tree.
Types of Threaded Binary Tree
Depending on the type of threading, there are two types of threaded binary
tree:
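#include <stdbool.h>   /* provides bool; C has no built-in "boolean" type */

struct node
{
    struct node *left;   /* left child, or thread to the in-order predecessor */
    bool lthread;        /* true if left is a thread */
    int info;
    bool rthread;        /* true if right is a thread */
    struct node *right;  /* right child, or thread to the in-order successor */
};
Node structure of a fully in-threaded binary tree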
If the left pointer of the node is a thread, then thread will point to inorder
predecessor. If the left pointer is not a thread i.e. node has a left child then
to find the inorder predecessor, we move to this left child and keep on
moving right till we find a node with no right child.
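The rule for finding the in-order predecessor can be sketched with the node structure of the threaded tree. The function name in_pred is an illustrative assumption, bool from <stdbool.h> stands in for the slides' boolean, and the sketch assumes that the left thread of the first node in in-order is NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *left;
    bool lthread;   /* true: left is a thread to the in-order predecessor */
    int info;
    bool rthread;   /* true: right is a thread to the in-order successor */
    struct node *right;
};

/* Returns the in-order predecessor of p, or NULL if p is the first node. */
struct node *in_pred(const struct node *p)
{
    assert(p != NULL);
    if (p->lthread)
        return p->left;            /* the thread points at the predecessor */
    struct node *q = p->left;      /* otherwise go to the left child ...   */
    while (!q->rthread)            /* ... and keep moving right until a    */
        q = q->right;              /* node with no right child is found    */
    return q;
}
```

No stack or recursion is needed, which is the main benefit of the threads.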
The technique for balancing a binary search tree was introduced by the Soviet
mathematicians G. M. Adelson-Velsky and E. M. Landis in 1962.
AVL tree is a self-balancing binary search tree in which the heights of the
two sub-trees of a node may differ by at most one. Because of this property,
AVL tree is also known as a height-balanced tree.
The key advantage of using an AVL tree is that it takes O(logn) time to
perform search, insertion and deletion operations in average case as well as
worst case (because the height of the tree is limited to O(logn)).
The structure of an AVL tree is same as that of a binary search tree but with
a little difference. In its structure, it stores an additional variable called the
BalanceFactor.
The balance factor of a node is calculated by subtracting the height of its
right sub-tree from the height of its left sub-tree.
Balance factor = Height (left sub-tree) – Height (right sub-tree)
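The formula can be written out as a small function. This sketch recomputes heights on demand for clarity, whereas the text describes storing the balance factor in each node; the names struct avl, height and balance_factor are illustrative, and the height of an empty sub-tree is taken as 0.

```c
#include <assert.h>
#include <stddef.h>

struct avl { int key; struct avl *left, *right; };

/* Height of a tree in levels; an empty tree has height 0. */
int height(const struct avl *t)
{
    if (t == NULL) return 0;
    int hl = height(t->left), hr = height(t->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Balance factor = Height(left sub-tree) - Height(right sub-tree). */
int balance_factor(const struct avl *t)
{
    assert(t != NULL);
    return height(t->left) - height(t->right);
}
```

A chain of three nodes linked only through left pointers gives a balance factor of 2 at the top node, which is outside the range allowed in an AVL tree.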
A binary search tree in which every node
has a balance factor of -1, 0 or 1 is said to be
height balanced. A node with any other
balance factor is considered to be
unbalanced and requires rebalancing.
bf = hl − hr ∈ {-1, 0, 1}
If the balance factor of a node is 1, then the left sub-tree of the node
is one level higher than its right sub-tree. Such a tree is called a
left-heavy tree.
If the balance factor of a node is 0, then the height of the left
sub-tree is equal to the height of its right sub-tree.
If the balance factor of a node is -1, then the left sub-tree of the node
is one level lower than its right sub-tree. Such a tree is called a
right-heavy tree.
Types of Rotations
1. LL Rotation: Inserted node is in the left subtree of left subtree of C
3. Insert A
On inserting E, the BST becomes unbalanced because the balance factor of I becomes 2. Travelling
from E to I, we find that E was inserted in the right subtree of the left subtree of I, so we
perform an LR rotation on node I. LR rotation = RR rotation followed by LL rotation.
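The pointer changes behind these rotations can be sketched as follows. The functions are named by the direction in which the nodes move, which is an assumption of this sketch: rotate_right is the single rotation applied in the LL case, rotate_left the one applied in the RR case, and the LR rotation composes the two. The type struct avl is hypothetical, and updating stored heights or balance factors is left out.

```c
#include <assert.h>
#include <stddef.h>

struct avl { char key; struct avl *left, *right; };

/* Single right rotation (used in the LL case): the left child of a
   becomes the root of this sub-tree. Returns the new sub-tree root. */
struct avl *rotate_right(struct avl *a)
{
    assert(a != NULL && a->left != NULL);
    struct avl *b = a->left;
    a->left = b->right;
    b->right = a;
    return b;
}

/* Single left rotation (used in the RR case): mirror image of the above. */
struct avl *rotate_left(struct avl *a)
{
    assert(a != NULL && a->right != NULL);
    struct avl *b = a->right;
    a->right = b->left;
    b->left = a;
    return b;
}

/* LR case: rotate the left child to the left, then rotate a to the right. */
struct avl *rotate_left_right(struct avl *a)
{
    assert(a != NULL && a->left != NULL);
    a->left = rotate_left(a->left);
    return rotate_right(a);
}
```

For a node I whose left child (called C here purely for illustration) has E as its right child, rotate_left_right makes E the new sub-tree root, with C on its left and I on its right.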
Deleting a node from an AVL tree is similar to that in a binary search tree.
Deletion may disturb the balance factor of an AVL tree and therefore the tree
needs to be rebalanced in order to maintain the AVLness.
For this purpose, we need to perform rotations. The two types of rotations
are L rotation and R rotation. Here, we will discuss R rotations. L rotations
are the mirror images of them.
If the node to be deleted is in the left sub-tree of the critical node, then
an L rotation is applied; if it is in the right sub-tree of the critical
node, an R rotation is applied.
Let us consider that, A is the critical node and B is the root node of its left
sub-tree. If node X, present in the right sub-tree of A, is to be deleted, then
there can be three different situations:
1. R0 Rotation (Node B has balance factor 0)
If node B has balance factor 0 and the balance factor of node A is
disturbed upon deleting node X, then the tree is rebalanced by
applying an R0 rotation.
Example: Delete node 30 from the AVL tree shown in the following image.
2. R1 Rotation (Node B has balance factor 1)
Example: Delete node 55 from the AVL tree shown in the following image.
A multiway search tree of order m is a search tree in which any node can
have at the most m children.
The properties of a non-empty m-way search tree of order m are:
1. Each node can hold at most m-1 keys and can have at most m children.
2. A node with m children has m-1 key values. Some of the children can be
NULL (empty subtrees).
3. The keys in a node are in ascending order.
4. Each key in a non-leaf node separates its left and right subtrees: all keys
in the left subtree are less than that key, and all keys in the right subtree
are greater than it.
B-Tree
For example, suppose we search for item 49 in the following B tree. The process
is as follows:
• Compare item 49 with the root node key 78. Since 49 < 78, move to its left sub-tree.
• Since 40 < 49 < 56, traverse the right sub-tree of 40.
• Since 49 > 45, move right and compare with 49.
• Match found; return.
Remark: Searching in a B tree depends on the height of the tree. The search algorithm takes
O(log n) time to search for any element in a B tree.
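The search steps above can be sketched in code. The node layout below is an assumption made for illustration (struct bnode, MAX_KEYS and btree_search are hypothetical names, and a capacity of 4 keys per node is chosen arbitrarily); within a node the keys are scanned linearly, although a binary search would also work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_KEYS 4   /* assumed node capacity for this sketch */

struct bnode {
    int nkeys;                          /* number of keys in use     */
    int keys[MAX_KEYS];                 /* keys in ascending order   */
    struct bnode *child[MAX_KEYS + 1];  /* child[i] holds keys below keys[i];
                                           all NULL in a leaf node   */
};

/* Returns true if key is present in the tree rooted at t. */
bool btree_search(const struct bnode *t, int key)
{
    while (t != NULL) {
        assert(t->nkeys >= 0 && t->nkeys <= MAX_KEYS);
        int i = 0;
        while (i < t->nkeys && key > t->keys[i])
            i++;                        /* skip keys smaller than key   */
        if (i < t->nkeys && key == t->keys[i])
            return true;                /* match found in this node     */
        t = t->child[i];                /* descend between keys i-1 and i */
    }
    return false;                       /* fell off a leaf: not present */
}
```

Each loop iteration moves one level down, so the number of nodes examined is bounded by the height of the tree.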
1. Insert 10
[10]
2. Insert 40
[10 40]
3. Insert 30
[10 30 40]
4. Insert 35
[10 30 35 40]
5. Insert 20
Root:   [30]
Leaves: [10 20] [35 40]
6. Insert 15, 50, 28
Root:   [30]
Leaves: [10 15 20 28] [35 40 50]
7. Insert 25
Root:   [20 30]
Leaves: [10 15] [25 28] [35 40 50]
8. Insert 5, 60, 19
Root:   [20 30]
Leaves: [5 10 15 19] [25 28] [35 40 50 60]
9. Insert 12
Root:   [12 20 30]
Leaves: [5 10] [15 19] [25 28] [35 40 50 60]
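10. Insert 38
Root:   [12 20 30 40]
Leaves: [5 10] [15 19] [25 28] [35 38] [50 60]
11. Insert 27, 45, 90
Root:   [12 20 30 40]
Leaves: [5 10] [15 19] [25 27 28] [35 38] [45 50 60 90]
12. Insert 48
Root:     [30]
Internal: [12 20] [40 50]
Leaves:   [5 10] [15 19] [25 27 28] [35 38] [45 48] [60 90]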
Delete 7, 52
Given Tree →
Root:     [30]
Internal: [12 20] [40 55]
Leaves:   [3 7 9 11] [15 19] [25 28] [35 38] [45 47 52] [65 78]
After deleting 7 and 52 →
Root:     [30]
Internal: [12 20] [40 55]
Leaves:   [3 9 11] [15 19] [25 28] [35 38] [45 47] [65 78]
Given Tree →
Root:     [30]
Internal: [12 20] [40 55]
Leaves:   [3 7 9 11] [15 19] [22 25 28] [35 38] [45 47] [65 78 80]
Here key 15 is to be deleted from node [15, 19]. Since this node has only MIN keys, we
try to borrow from its left sibling [3, 7, 9, 11], which has more than MIN keys. The
parent of these nodes is node [12, 20] and the separator key is 12. So the last key of the
left sibling (11) is moved to the place of the separator key, and the separator key is
moved to the underflow node. The resulting tree after deletion of 15 will be:
After deleting 15 →
Root:     [30]
Internal: [11 20] [40 55]
Leaves:   [3 7 9] [12 19] [22 25 28] [35 38] [45 47] [65 78 80]
Given Tree →
Root:     [30]
Internal: [9 20] [40 55]
Leaves:   [3 7] [11 12] [22 25 28] [35 38] [45 47] [65 78 80]
The left sibling of [45, 47] is [35, 38], which has only MIN keys, so we can't borrow from it;
hence we try to borrow from the right sibling [65, 78, 80]. The first key of the right
sibling (65) is moved to the parent node and the separator key from the parent node (55) is
moved to the underflow node. In the underflow node, 47 is shifted left to make room for
55. In the right sibling, 78 and 80 are moved to fill the gap created by removal of 65.
After deleting 45 →
Root:     [30]
Internal: [9 20] [40 65]
Leaves:   [3 7] [11 12] [22 25 28] [35 38] [47 55] [78 80]
Given Tree →
Root:     [55]
Internal: [15 24 36 45] [64 73 89]
Leaves:   [5 10] [18 22] [28 31] [39 43] [47 53] [58 62] [67 71] [76 86] [92 95]
We can see that the node [28, 31] has only MIN keys, so we try to borrow from its left
sibling [18, 22], but it also has MIN keys, so we look at the right sibling [39, 43], which
also has only MIN keys. So after deletion of 28 we combine the underflow node with its
left sibling. For combining these two nodes, the separator key (24) from the parent node
moves down into the combined node.
After deleting 28 →
Root:     [55]
Internal: [15 36 45] [64 73 89]
Leaves:   [5 10] [18 22 24 31] [39 43] [47 53] [58 62] [67 71] [76 86] [92 95]
Here key 62 is to be deleted from [58, 62], which is the leftmost child of its parent and hence
has no left sibling. So we look at the right sibling for borrowing a key, but the
right sibling has only MIN keys, so we delete 62 and combine the underflow node with
the right sibling.
After deleting 62 →
Root:     [55]
Internal: [15 36 45] [73 89]
Leaves:   [5 10] [18 22 24 31] [39 43] [47 53] [58 64 67 71] [76 86] [92 95]
Given Tree →
Root:     [45]
Internal: [15 36] [55 73]
Leaves:   [5 10] [18 31] [39 43] [47 53] [58 64 67 71] [76 86 89 95]
After deleting 31 and combining the underflow leaf with its left sibling [5, 10] and the
separator key 15:
Root:     [45]
Internal: [36] [55 73]
Leaves:   [5 10 15 18] [39 43] [47 53] [58 64 67 71] [76 86 89 95]
Now the parent node [36] has become underflow, so we try to borrow a key from its
right sibling (since it is the leftmost node and has no left sibling), but the right sibling has
MIN keys, so we combine the underflow node [36] with its right sibling [55, 73]. The
separator key (45) comes down into the combined node, and since it was the only key in the
root node, the root node becomes empty, the combined node becomes the new root
of the tree, and the height of the tree decreases by one.
After deleting 31 →
Root:   [36 45 55 73]
Leaves: [5 10 15 18] [39 43] [47 53] [58 64 67 71] [76 86 89 95]
Given Tree →
Root:     [30]
Internal: [12 20] [40 50 60]
Leaves:   [5 10] [15 19] [25 27 28] [35 38] [45 48] [53 57] [69 78]
The successor key of 12 is 15, so we copy 15 to the place of 12, and now our task
reduces to deletion of 15 from the leaf node. This deletion is performed by borrowing a
key from the right sibling.
After copying 15 to the place of 12:
Root:     [30]
Internal: [15 20] [40 50 60]
Leaves:   [5 10] [15 19] [25 27 28] [35 38] [45 48] [53 57] [69 78]
After deleting 12 →
Root:     [30]
Internal: [15 25] [40 50 60]
Leaves:   [5 10] [19 20] [27 28] [35 38] [45 48] [53 57] [69 78]
The successor key of 30 is 35, so it is copied to the place of 30, and now 35 is deleted
from the leaf node.
After copying 35 to the place of 30:
Root:     [35]
Internal: [15 25] [40 50 60]
Leaves:   [5 10] [19 20] [27 28] [35 38] [45 48] [53 57] [69 78]
After deleting 30 →
Root:     [35]
Internal: [15 25] [50 60]
Leaves:   [5 10] [19 20] [27 28] [38 40 45 48] [53 57] [69 78]
In a B tree, both keys and records can be stored in the internal as well as the leaf
nodes, whereas in a B+ tree, records (data) can only be stored in the leaf
nodes, while internal nodes store only key values.
The leaf nodes of a B+ tree are linked together in the form of a singly
linked list to make search queries more efficient.
B+ trees are used to store large amounts of data that cannot fit in
main memory. Because the size of main memory is always limited, the
internal nodes (keys to access records) of the B+ tree are
stored in main memory, whereas the leaf nodes are stored in secondary
memory.
Root:     [30]
Internal: [12 20] [40 50]
          records: D12 D20 D40 D50
Leaves:   [5 10] [15 19] [25 27 28] [35 38] [45 48] [60 90]
          records: D5 D10 D15 D19 D25 D27 D28 D35 D38 D45 D48 D60 D90
Fig: The B tree data structure
Root:     [30]
Internal: [12 20] [40 50]
Leaves:   [5 10] → [12 15 19] → [20 25 27 28] → [30 35 38] → [40 45 48] → [50 60 90]
          records: D5 D10 D12 D15 D19 D20 D25 D27 D28 D30 D35 D38 D40 D45 D48 D50 D60 D90
Fig: The B+ tree data structure