Unit III Trees1
TREES
A tree is a nonlinear, hierarchical data structure that consists of nodes connected
by edges.
Other data structures such as arrays, linked lists, stacks, and queues are linear
data structures that store data sequentially. In a linear data structure, the time
needed for operations such as searching grows in proportion to the size of the
data, which becomes unacceptable for large data sets.
Tree data structures allow quicker and easier access to the data because they are
non-linear.
One of the nodes is designated as the “Root node”; all remaining nodes are
descendants of the root (its children, their children, and so on).
In general, each node can have any number of children but only one parent node;
the root has no parent.
Two nodes of a tree can be at the same level or in a parent-child relationship.
Nodes with the same parent are called sibling nodes.
Tree Terminologies
Node : A node is an entity that contains a key or value and pointers to its child
nodes. The last nodes of each path are called leaf nodes or external nodes; they do
not contain a link/pointer to child nodes. A node having at least one child node is
called an internal node.
Edge : It is the link between any two nodes.
Parent node: Any node that has at least one child node is the parent of that child.
Every node except the root has exactly one parent, reached by the edge leading
upward from it.
Ancestor Node: It is any predecessor node on a path from the root to that node.
Note that the root does not have any ancestors. In the above diagram, A and B are
the ancestors of E.
Key: It represents the value of a node.
Level: Represents the generation of a node. With the root at level 1, the level of a
node is one more than the number of connections (edges) between the node and the
root. Child nodes of the root are at level 2, grandchildren of the root are at
level 3, and so on. In general, each node is one level higher than its parent.
Path: A path is a sequence of consecutive edges. In the above diagram, the path to
E is A->B->E.
Types of Trees
1. Binary Tree
2. Binary Search Tree
3. AVL Tree
4. B-Tree
Why Use the Tree Data Structure?
1. One reason to use trees is to store information that naturally forms a
hierarchy. For example, the file system on a computer:
File System
2. Trees (with some ordering e.g., BST) provide moderate access/search (quicker
than Linked List and slower than arrays).
3. Trees provide moderate insertion/deletion (quicker than Arrays and slower than
Unordered Linked Lists).
4. Like Linked Lists and unlike Arrays, Trees don’t have an upper limit on the
number of nodes as nodes are linked using pointers.
BINARY TREES:
A binary tree is a hierarchical data structure in which every node has at most 2
children, known as the left child and the right child. Because each node has at
most 2 children, the tree is called "Binary".
Root node is the topmost node of the tree.
So a typical binary tree will have the following components:
A left subtree
A root node
A right subtree
A pictorial representation of a binary tree is shown below:
In a given binary tree, the maximum number of nodes at any level is $2^{l-1}$, where
‘l’ is the level number.
Thus in case of the root node at level 1, the max number of nodes =
$2^{1-1} = 2^0 = 1$.
As every node in a binary tree has at most two children, the maximum number of
nodes at the next level will be $2 \cdot 2^{l-1} = 2^l$.
Given a binary tree of depth or height h, the maximum number of nodes in
a binary tree of height h = $2^h - 1$.
Hence in a binary tree of height 3 (shown above), the maximum number of
nodes = $2^3 - 1 = 7$.
TREE NODE IMPLEMENTATION
class Node
{
public:
int data;
Node* left;
Node* right;
};
Basic Operations On Binary Tree:
1. Inserting an element.
2. Removing (deleting) an element.
3. Searching for an element.
4. Traversing the tree.
Binary Tree Representation
A binary tree is allocated memory in two ways.
#1) Sequential Representation
This is the simplest technique to store a tree data structure. An array is used to store
the tree nodes. The number of node positions in the tree defines the size of the array.
With locations numbered from 1, the root node of the tree is stored at location 1.
In general, if a node is stored at the ith location then its left and right children are
stored at locations 2i and 2i+1 respectively.
Consider the following Binary Tree.
In the above representation, we see that the left and right child of each node is
stored at locations 2*(node_location) and 2*(node_location)+1 respectively.
For example, the location of node 3 in the array is 3. So its left child will be placed
at 2*3 = 6. Its right child will be at the location 2*3 + 1 = 7. As we can see in the
array, the children of 3, which are 6 and 7, are placed at locations 6 and 7 in the array.
The sequential representation is inefficient when the tree is not complete: the array
must reserve a slot for every possible position, including missing children, so a
sparse or skewed tree wastes a lot of memory. As the tree grows, this
representation also becomes difficult to manage.
This drawback is overcome by storing the tree nodes in a linked list. Note that if the
tree is empty, then the first location storing the root node will be set to 0.
#2) Linked-list Representation
In this type of representation, a linked list is used to store the tree nodes. Several
nodes are scattered in the memory in non-contiguous locations and the nodes are
connected using the parent-child relationship like a tree.
As shown in the above representation, each linked list node has three components:
Left pointer
Data part
Right pointer
The left pointer has a pointer to the left child of the node; the right pointer has a
pointer to the right child of the node whereas the data part contains the actual data
of the node. If there are no children for a given node (leaf node), then the left and
right pointers for that node are set to null as shown in the above figure.
Traversing a tree means visiting every node in the tree. You might, for
instance, want to add all the values in the tree or find the largest one. For all these
operations, you will need to visit each node of the tree.
Linear data structures like arrays, stacks, queues, and linked list have only one way
to read the data. But a hierarchical data structure like a tree can be traversed in
different ways.
Let's think about how we can read the elements of the tree in the image shown
above.
inorder(root->left)
display(root->data)
inorder(root->right)
We traverse the left subtree first. We also need to remember to visit the root
node and the right subtree when this tree is done.
Stack
Left subtree -> root -> right subtree
Final Stack
Since the node "5" doesn't have any subtrees, we print it directly. After that we
print its parent "12" and then the right child "6".
Putting everything on a stack was helpful because now that the left-subtree of
the root node has been traversed, we can print it and go to the right subtree.
After going through all the elements, we get the inorder traversal as
We don't have to create the stack ourselves because recursion maintains the correct
order for us.
Example of inorder traversal
We start the recursive call from 30 (the root), then move to 20. 20 also has a subtree,
so we apply inorder traversal to it and move on to 15 and then 5.
5 has no child, so print 5. Then move to its parent node 15, print it, and
move to 15's right node, which is 18.
18 has no child, so print 18 and move back to 20. Print 20, then move to its right
node, which is 25. 25 has no subtree, so print 25.
Print the root node 30.
Now recursively traverse the right subtree of the root node, so move to 40. 40 has a
subtree, so traverse the left subtree of 40.
The left subtree of 40 has only one node, which is 35. 35 has no further subtree, so
print 35. Move back to 40 and print 40.
Traverse the right subtree of 40, so move to 50. 50 has a subtree, so traverse the left
subtree of 50. Move to 45; 45 has no further subtree, so print 45.
Move back to 50 and print 50. Now traverse the right subtree of 50, so move to 60
and print 60.
Our final output is {5 , 15 , 18 , 20 , 25 , 30 , 35 , 40 , 45 , 50 , 60}
Application of inorder traversal
In-order traversal is used to retrieve the data of a binary search tree in sorted order.
PREORDER TRAVERSAL
display(root->data)
preorder(root->left)
preorder(root->right)
In this traversal method, the root node is visited first, then the left subtree, and
finally the right subtree.
Start with the root node 30. Print 30 and recursively traverse the left subtree.
The next node is 20. 20 has a subtree, so print 20 and traverse the left subtree of 20.
The next node is 15. 15 has a subtree, so print 15 and traverse the left subtree of 15.
5 is the next node. 5 has no subtree, so print 5 and traverse the right subtree of 15.
The next node is 18. 18 has no child, so print 18 and traverse the right subtree of 20.
25 is the right subtree of 20. 25 has no child, so print 25 and start traversing the
right subtree of 30.
The next node is 40. 40 has a subtree, so print 40 and then traverse the left subtree
of 40.
The next node is 35. 35 has no subtree, so print 35 and then traverse the right subtree
of 40.
The next node is 50. 50 has a subtree, so print 50 and traverse the left subtree of 50.
The next node is 45. 45 has no subtree, so print 45 and then print 60, the right
subtree of 50.
Our final output is {30 , 20 , 15 , 5 , 18 , 25 , 40 , 35 , 50 , 45 , 60}
Application of preorder traversal
Preorder traversal is used to create a copy of the tree.
Preorder traversal is also used to get prefix expression of an expression tree.
POSTORDER TRAVERSAL
1. Visit all the nodes in the left subtree
2. Visit all the nodes in the right subtree
3. Visit the root node
postorder(root->left)
postorder(root->right)
display(root->data)
The root node is visited last in this traversal method, hence the name. First, we
traverse the left subtree, then the right subtree, and finally the root node.
We start from 30 and, following postorder traversal, first visit the left subtree
rooted at 20. 20 is also traversed in postorder.
15 is the left subtree of 20. 15 is also traversed in postorder.
5 is the left subtree of 15. 5 has no subtree, so print 5 and traverse the right
subtree of 15.
18 is the right subtree of 15. 18 has no subtree, so print 18 and then print 15. The
postorder traversal for 15 is finished.
Next, move to the right subtree of 20.
25 is the right subtree of 20. 25 has no subtree, so print 25 and then print 20. The
postorder traversal for 20 is finished.
Next, visit the right subtree of 30, which is 40. 40 is also traversed in postorder
(40 has a subtree).
35 is the left subtree of 40. 35 has no further subtree, so print 35 and traverse the
right subtree of 40.
50 is the right subtree of 40. 50 should also be traversed in postorder.
45 is the left subtree of 50. 45 has no further subtree, so print 45 and then print 60,
which is the right subtree of 50.
Next, print 50. The postorder traversal for 50 is finished.
Now print 40; the postorder traversal for 40 is finished.
Print 30. The postorder traversal for 30 is finished.
Our final output is {5 , 18 , 15 , 25 , 20 , 35 , 45 , 60 , 50 , 40 , 30}
Application of postorder traversal
Postorder traversal is used to delete the tree.
Postorder traversal is also used to get the postfix expression of an expression tree.
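C++ program to implement Binary Tree
#include <iostream>
using namespace std;
struct Node {
    int data;
    Node* left;
    Node* right;
    Node(int value) {
        data = value;
        left = NULL;   // Left child is initialized to NULL
        right = NULL;  // Right child is initialized to NULL
    }
};
// We will use inorder traversal to print the tree
void Printtree(Node* root) {
    if (root == NULL)
        return;
    Printtree(root->left);
    cout << root->data << " ";
    Printtree(root->right);
}
int main() {
    // Example values (assumed for illustration): 1 is the root, 2 and 3 its children
    Node* root = new Node(1);
    root->left = new Node(2);
    root->right = new Node(3);
    Printtree(root);  // function call to print the tree
    return 0;
}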
#include <iostream>
using namespace std;
/* A binary tree node has data stored as value, pointer to left child
and right child */
struct Node
{
int value;
struct Node *left, *right;
};
//function for creating new tree node
Node* createNode(int value)
{
Node* t = new Node;
t->value = value;
t->left = t->right = NULL;
return t;
}
/* printing postorder traversal of binary tree */
void postorder(struct Node* root)
{
if (root == NULL)
return;
postorder(root->left); // first traverse left subtree
postorder(root->right); // then traverse right subtree
cout << root->value << " "; // now visit root node
}
/* inorder traversal of the binary tree*/
void inorder(struct Node* root)
{
if (root == NULL)
return;
inorder(root->left); /* first visit left child */
cout << root->value << " "; /* print root data */
inorder(root->right); /* at last recur over right subtree */
}
Mirror Tree
The idea is to traverse recursively and swap the right and left subtrees after
traversing the subtrees.
Follow the steps below to solve the problem:
Call Mirror for left-subtree i.e., Mirror(left-subtree)
Call Mirror for right-subtree i.e., Mirror(right-subtree)
Swap left and right subtrees.
temp = left-subtree
left-subtree = right-subtree
right-subtree = temp
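C++ program to convert a binary tree to its mirror
#include <iostream>
using namespace std;
/* A binary tree node has data, pointer to left child and a pointer to right child */
struct Node {
    int data;
    struct Node* left;
    struct Node* right;
};
/* Helper function that allocates a new node with the given data and NULL left and
right pointers. */
struct Node* newNode(int data) {
    struct Node* node = new Node;
    node->data = data;
    node->left = NULL;
    node->right = NULL;
    return (node);
}
/* Change a tree so that the roles of the left and right pointers are swapped at every
node.*/
void mirror(struct Node* node) {
    if (node == NULL)
        return;
    else {
        struct Node* temp;
        /* mirror both subtrees first */
        mirror(node->left);
        mirror(node->right);
        /* then swap the pointers in this node */
        temp = node->left;
        node->left = node->right;
        node->right = temp;
    }
}
/* Helper function to print the inorder traversal */
void inOrder(struct Node* node) {
    if (node == NULL)
        return;
    inOrder(node->left);
    cout << node->data << " ";
    inOrder(node->right);
}
int main() {
    struct Node* root = newNode(1);  // root value 1 is assumed; it is not shown in the source
    root->left = newNode(2);
    root->right = newNode(3);
    root->left->left = newNode(4);
    root->left->right = newNode(5);
    cout << "Inorder traversal of the constructed" << " tree is" << endl;
    inOrder(root);
    mirror(root);
    cout << "\nInorder traversal of the mirror tree is" << endl;
    inOrder(root);
    return 0;
}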
Deletion of a Tree
To delete a tree, we must traverse all the nodes of the tree and delete them one
by one. So, which traversal should we use: inorder, preorder, or postorder?
The answer is simple. We should use postorder traversal because before
deleting the parent node, we should delete its child nodes first.
We can also delete the tree with the other traversals, but they need extra space.
Postorder traversal does the work in the same time complexity without storing
anything extra, so there is no reason to prefer the others.
For the following tree, nodes are deleted in the order – 4, 5, 2, 3, 1.
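C++ program to Delete a Tree
#include <iostream>
using namespace std;
/* A binary tree node has data, pointer to left child and a pointer to right child */
class node {
public:
    int data;
    node* left;
    node* right;
    /* Constructor that allocates a new node with the given data and NULL
    left and right pointers. */
    node(int data) {
        this->data = data;
        this->left = NULL;
        this->right = NULL;
    }
};
/* This function traverses tree in post order to delete each and every node of the
tree */
void deleteTree(node* current) {
    if (current == NULL)
        return;
    /* first delete both subtrees */
    deleteTree(current->left);
    deleteTree(current->right);
    /* then delete the node itself */
    cout << "Deleting node: " << current->data << endl;
    delete current;
}
int main() {
    /* Tree from the text above: nodes are deleted in the order 4, 5, 2, 3, 1 */
    node* root = new node(1);
    root->left = new node(2);
    root->right = new node(3);
    root->left->left = new node(4);
    root->left->right = new node(5);
    deleteTree(root);
    root = NULL;
    return 0;
}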
Output:
Level 1 nodes: 1
Level 2 nodes: 2, 3
Level 3 nodes: 4, 5
Level 4 nodes: 6, 7
Input:
Output:
Level 1 nodes: 50
Level 2 nodes: 35, 57
Level 3 nodes: 30, 40, 52, 58
Level 4 nodes: 11
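C++ PROGRAM FOR LEVEL ORDER BINARY TREE
#include <iostream>
using namespace std;
class node {
public:
    int data;
    node* left;
    node* right;
};
/* Allocates a new node with the given data and NULL left and right pointers */
node* newNode(int data) {
    node* Node = new node();
    Node->data = data;
    Node->left = NULL;
    Node->right = NULL;
    return (Node);
}
/* Height of the tree: number of nodes on the longest path from the root to a leaf */
int height(node* root) {
    if (root == NULL)
        return 0;
    else {
        int lheight = height(root->left);
        int rheight = height(root->right);
        if (lheight > rheight) {
            return (lheight + 1);
        }
        else {
            return (rheight + 1);
        }
    }
}
/* Print the nodes at the given level (the root is at level 1) */
void CurrentLevel(node* root, int level) {
    if (root == NULL)
        return;
    if (level == 1)
        cout << root->data << " ";
    else if (level > 1) {
        CurrentLevel(root->left, level-1);
        CurrentLevel(root->right, level-1);
    }
}
/* Print the tree level by level */
void LevelOrder(node* root) {
    int h = height(root);
    int i;
    for (i = 1; i <= h; i++)
        CurrentLevel(root, i);
}
int main() {
    node* root = newNode(1);  // root value 1 is assumed; it is not shown in the source
    root->left = newNode(2);
    root->right = newNode(3);
    root->left->left = newNode(4);
    root->left->right = newNode(5);
    LevelOrder(root);
    return 0;
}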
Given a binary tree, efficiently create a copy of it. The idea is very simple: recursively
traverse the binary tree in a preorder fashion, and for each encountered node,
create a new node with the same data. Then recursively clone its left and right
children and attach the copies to the new node.
#include <iostream>
using namespace std;
// A Binary Tree Node
class Node
{
public:
int data;
Node* left, *right;
Node(int data)
{
this->data = data;
this->right = this->left = nullptr;
}
};
// Function to print the inorder traversal on a given binary tree
void inorder(Node* root)
{
if (root == nullptr)
{
return;
}
inorder(root->left); // recur for the left subtree
cout << root->data << " "; // print the current node's data
inorder(root->right); // recur for the right subtree
}
// Recursive function to clone a binary tree
Node* cloneBinaryTree(Node* root)
{
// base case
if (root == nullptr) {
return nullptr;
}
// create a new node with the same data as the root node
Node* root_copy = new Node(root->data);
root_copy->left = cloneBinaryTree(root->left); // clone the left &right subtree
root_copy->right = cloneBinaryTree(root->right);
return root_copy; // return cloned root node
}
int main()
{
Node* root = new Node(1);
root->left = new Node(2);
root->right = new Node(3);
root->left->left = new Node(4);
root->left->right = new Node(5);
root->right->left = new Node(6);
root->right->right = new Node(7);
Node* clone = cloneBinaryTree(root);
cout << "Inorder traversal of the cloned tree: ";
inorder(clone);
return 0;
}
In a binary tree, every node can have a maximum of two children, but there is no
need to maintain any order of nodes based on their values. In a binary tree, the
elements are arranged in the order they arrive at the tree, from top to bottom and
left to right.
A binary tree has the following time complexities...
To enhance the performance of binary tree, we use a special type of binary tree
known as Binary Search Tree. Binary search tree mainly focuses on the search
operation in a binary tree. Binary search tree can be defined as follows...
Binary search tree is a data structure that quickly allows us to maintain a sorted list
of numbers.
It is called a binary tree because each tree node has a maximum of two children.
It is called a search tree because it can be used to search for the presence of a
number in O(log(n)) time, provided the tree is reasonably balanced.
The properties that separate a binary search tree from a regular binary tree are
1. All nodes of left subtree are less than the root node
2. All nodes of right subtree are more than the root node
3. Both subtrees of each node are also BSTs i.e. they have the above two properties
Binary Search Tree is a binary tree in which every node contains only smaller values
in its left subtree and only larger values in its right subtree.
In a binary search tree, all the nodes in the left subtree of any node contains smaller
values and all the nodes in the right subtree of any node contains larger values as
shown in the following figure...
Example
The following tree is a Binary Search Tree. In this tree, left subtree of every node
contains nodes with smaller values and right subtree of every node contains larger
values.
Every binary search tree is a binary tree, but not every binary tree is a binary
search tree.
1. Search
2. Insertion
3. Deletion
In a binary search tree, the search operation takes time proportional to the height
of the tree: O(log n) when the tree is balanced, and O(n) in the worst case of a
skewed tree. The search operation is performed as follows...
If the value is below the root, we can say for sure that the value is not in the
right subtree; we need to only search in the left subtree and if the value is
above the root, we can say for sure that the value is not in the left subtree; we
need to only search in the right subtree.
Algorithm:
If root == NULL
return NULL;
If number == root->data
return root->data;
If number < root->data
return search(root->left, number)
If number > root->data
return search(root->right, number)
Insertion Operation in BST
In a binary search tree, the insertion operation is performed with O(log n) time
complexity when the tree is balanced (O(n) in the worst case). In a binary search
tree, a new node is always inserted as a leaf node. The insertion operation is
performed as follows...
Step 1 - Create a newNode with given value and set its left and right to NULL.
Step 2 - Check whether tree is Empty.
Step 3 - If the tree is Empty, then set root to newNode.
Step 4 - If the tree is Not Empty, then check whether the value of newNode
is smaller or larger than the node (here it is root node).
Step 5 - If newNode is smaller than or equal to the node then move to
its left child. If newNode is larger than the node then move to its right child.
Step 6 - Repeat the above steps until we reach a leaf node (i.e., until the next
child link is NULL).
Step 7 - After reaching the leaf node, insert the newNode as its left child if the
newNode is smaller than or equal to that leaf node, or else insert it as its right child.
If node == NULL
return createNode(data)
if (data < node->data)
node->left = insert(node->left, data);
else if (data > node->data)
node->right = insert(node->right, data);
return node;
Deletion Operation in BST
In a binary search tree, the deletion operation is performed with O(log n) time
complexity when the tree is balanced (O(n) in the worst case). Deleting a node
from a binary search tree involves the following three cases...
Case 1: Deleting a leaf node: In the first case, the node to be deleted is the leaf
node. In such a case, simply delete the node from the tree
A. This is the first case of deletion in which you delete a node that has no
children. As you can see in the diagram that 19, 10 and 5 have no children. But
we will delete 19.
B. Delete the value 19 and remove the link from the node.
C. View the new structure of the BST without 19
A. This is the second case of deletion in which you delete a node that has 1 child,
as you can see in the diagram that 9 has one child.
B. Delete the node 9 and replace it with its child 10 and add a link from 7 to 10
C. View the new structure of the BST without 9
We use the following steps to delete a node with two children from BST...
A. Here you will be deleting the node 12 that has two children
B. The deletion of the node will occur based upon the in order predecessor rule,
which means that the largest element on the left subtree of 12 will replace it.
C. Delete the node 12 and replace it with 10 as it is the largest value on the left
subtree
D. View the new structure of the BST after deleting 12
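C++ PROGRAM TO IMPLEMENT BINARY SEARCH TREE OPERATIONS
#include<iostream>
using namespace std;
class BST {
    struct node {
        int data;
        node* left;
        node* right;
    };
    node* root;
    // Insert x into the subtree rooted at t and return the subtree root
    node* insert(int x, node* t) {
        if(t == NULL) {
            t = new node;
            t->data = x;
            t->left = t->right = NULL;
        }
        else if(x < t->data)
            t->left = insert(x, t->left);
        else if(x > t->data)
            t->right = insert(x, t->right);
        return t;
    }
    // Smallest node of the subtree: keep going left
    node* findMin(node* t) {
        if(t == NULL)
            return NULL;
        else if(t->left == NULL)
            return t;
        else
            return findMin(t->left);
    }
    // Largest node of the subtree: keep going right
    node* findMax(node* t) {
        if(t == NULL)
            return NULL;
        else if(t->right == NULL)
            return t;
        else
            return findMax(t->right);
    }
    // Remove x from the subtree rooted at t and return the subtree root
    node* remove(int x, node* t) {
        node* temp;
        if(t == NULL)
            return NULL;
        else if(x < t->data)
            t->left = remove(x, t->left);
        else if(x > t->data)
            t->right = remove(x, t->right);
        else if(t->left && t->right) {
            // two children: copy the inorder successor, then remove it
            temp = findMin(t->right);
            t->data = temp->data;
            t->right = remove(t->data, t->right);
        }
        else {
            // leaf or one child: the child (if any) replaces the node
            temp = t;
            if(t->left == NULL)
                t = t->right;
            else if(t->right == NULL)
                t = t->left;
            delete temp;
        }
        return t;
    }
    void inorder(node* t) {
        if(t == NULL)
            return;
        inorder(t->left);
        cout << t->data << " ";
        inorder(t->right);
    }
    node* find(node* t, int x) {
        if(t == NULL)
            return NULL;
        else if(x < t->data)
            return find(t->left, x);
        else if(x > t->data)
            return find(t->right, x);
        else
            return t;
    }
public:
    BST() {
        root = NULL;
    }
    void insert(int x) {
        root = insert(x, root);
    }
    void remove(int x) {
        root = remove(x, root);
    }
    void display() {
        inorder(root);
        cout << endl;
    }
    // The body of search is not shown in the source; reporting the result is an assumption
    void search(int x) {
        if(find(root, x) != NULL)
            cout << x << " is found" << endl;
        else
            cout << x << " is not found" << endl;
    }
};
int main() {
    BST t;
    t.insert(20);
    t.insert(25);
    t.insert(15);
    t.insert(10);
    t.insert(30);
    t.display();
    t.remove(20);
    t.display();
    t.remove(25);
    t.display();
    t.remove(30);
    t.display();
    return 0;
}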
BALANCED BINARY SEARCH TREES:
The disadvantage of a binary search tree is that its height can be as large as N-1. This
means that the time needed to perform insertion, deletion, and many other
operations can be O(N) in the worst case. We want a tree with small height. A
binary tree with N nodes has height at least about log N. Thus, our goal is to keep the
height of a binary search tree O(log N). Such trees are called balanced binary search
trees. Examples are the AVL tree and the red-black tree.
AVL TREES
The first (and simplest) data structure to be discovered for which this could be
achieved is the AVL tree. It takes longer (on average) to insert and delete in an AVL
tree, since the tree must remain balanced, but it is faster (on average) to retrieve.
A binary tree is said to be balanced if the difference between the heights of the left
and right subtrees of every node in the tree is either -1, 0 or +1. In other words,
the heights of the left and right subtrees of every node differ by at most 1.
An AVL tree must have the following properties:
• It is a binary search tree.
• For each node in the tree, the height of the left subtree and the height of the right
subtree differ by at most one (the balance property).
The height of each node is stored in the node to facilitate determining whether this
is the case. The height of an AVL tree is logarithmic in the number of nodes. This
allows insert/delete/retrieve to all be performed in O(log n) time.
Balance factor of a node is the difference between the heights of the left and right
subtrees of that node. The balance factor of a node is calculated either height of left
subtree - height of right subtree (OR) height of right subtree - height of left subtree.
In the following explanation, we calculate as follows...
Balance factor = heightOfLeftSubtree - heightOfRightSubtree
Example of AVL Tree
The above tree is a binary search tree and every node is satisfying balance factor
condition. So this tree is said to be an AVL tree.
Every AVL Tree is a binary search tree but every Binary Search Tree need not be AVL
tree.
The key to an AVL tree is keeping it balanced when an insert or delete operation is
performed.
In AVL tree, after performing operations like insertion and deletion we need to
check the balance factor of every node in the tree. If every node satisfies the balance
factor condition then we conclude the operation otherwise we must make it
balanced. Whenever the tree becomes imbalanced due to any operation we
use rotation operations to make the tree balanced.
Rotation is the process of moving nodes either to left or to right to make the tree
balanced.
There are four rotations and they are classified into two types.
Single Left Rotation (LL Rotation)
In LL Rotation, every node moves one position to left from the current position. To
understand LL Rotation, let us consider the following insertion operation in AVL
Tree...
Right Left Rotation (RL Rotation)
The RL Rotation is sequence of single right rotation followed by single left rotation.
In RL Rotation, at first every node moves one position to right and one position to
left from the current position. To understand RL Rotation, let us consider the
following insertion operation in AVL Tree...
1. Search
2. Insertion
3. Deletion
In an AVL tree, the search operation is performed with O(log n) time complexity.
The search operation in the AVL tree is similar to the search operation in a Binary
search tree. We use the following steps to search an element in AVL tree...
Step 6 - If search element is larger, then continue the search process in right
subtree.
Step 7 - Repeat the same until we find the exact element or until the search
element is compared with the leaf node.
Step 8 - If we reach to the node having the value equal to the search value,
then display "Element is found" and terminate the function.
Step 9 - If we reach to the leaf node and if it is also not matched with the
search element, then display "Element is not found" and terminate the
function.
In an AVL tree, the insertion operation is performed with O(log n) time complexity.
In AVL Tree, a new node is always inserted as a leaf node. The insertion operation is
performed as follows...
Step 1 - Insert the new element into the tree using Binary Search Tree insertion
logic.
Step 2 - After insertion, check the Balance Factor of every node.
Step 3 - If the Balance Factor of every node is 0 or 1 or -1 then go for next
operation.
Step 4 - If the Balance Factor of any node is other than 0 or 1 or -1 then that
tree is said to be imbalanced. In this case, perform suitable Rotation to make it
balanced and go for next operation.
The deletion operation in AVL Tree is similar to deletion operation in BST. But after
every deletion operation, we need to check with the Balance Factor condition. If the
tree is balanced after deletion go for next operation otherwise perform suitable
rotation to make the tree Balanced.
Example: Construct an AVL Tree by inserting numbers from 1 to 8.
B - TREES
In search trees like binary search tree, AVL Tree, Red-Black tree, etc., every node
contains only one value (key) and a maximum of two children. But there is a special
type of search tree called B-Tree in which a node contains more than one value
(key) and more than two children. B-Tree was developed in the year 1972 by Bayer
and McCreight with the name Height Balanced m-way Search Tree. Later it was
named as B-Tree.
B-Tree is a self-balanced search tree in which every node contains multiple keys and
has more than two children.
B-trees arose from the need to reduce the time spent accessing physical storage
media such as hard disks. Secondary storage devices have a larger capacity but are
slower than main memory, so data structures that minimize the number of disk
accesses were needed.
Other data structures such as a binary search tree, avl tree, red-black tree, etc can
store only one key in one node. If you have to store a large number of keys, then
the height of such trees becomes very large and the access time increases.
However, B-tree can store many keys in a single node and can have multiple child
nodes. This decreases the height significantly allowing faster disk accesses.
Here, the number of keys in a node and number of children for a node depends on
the order of B-Tree. Every B-Tree has an order.
B-Tree of Order m has the following properties...
Operations on a B-tree
1.insertion
2.deletion
3.searching
1. Starting from the root node, compare k with the first key of the node.
If k = the first key of the node, return the node and the index.
2. If the node is a leaf (leaf = true) and k is not among its keys, return NULL (i.e. not found).
3. If k < the first key of the node, search the left child of this key recursively.
4. If there is more than one key in the current node and k > the first key, compare k
with the next key in the node.
If k < next key, search the left child of this key (i.e. k lies in between the first and the
second keys).
Else, search the right child of the key.
5. Repeat steps 1 to 4 until the leaf is reached.
BtreeSearch(x, k)
i = 1
while i ≤ n[x] and k > keyi[x] // n[x] means number of keys in x node
do i = i + 1
if i ≤ n[x] and k = keyi[x]
then return (x, i)
if leaf [x]
then return NIL
else
return BtreeSearch(ci[x], k)
Searching Example
[Figure: B-tree]
2. k is not found in the root so, compare it with the root key.
[Figure: k is not found on the root node]
[Figure: k lies in between 16 and 18]
6. k is found.
[Figure: k is found]
Searching Complexity on B Tree
Insertion Operation
1. If the tree is empty, allocate a root node and insert the key.
6. Now, there are elements greater than its limit. So, split at the median.
7. Push the median key upwards and make the left keys as a left child and the right
keys as a right child.
Insertion Example
Deletion from a B-tree
Deleting an element on a B-tree consists of three main events: searching the node
where the key to be deleted exists, deleting the key and balancing the tree if
required.
While deleting a key, a condition called underflow may occur. Underflow occurs
when a node contains fewer than the minimum number of keys it should hold.
The terms to be understood before studying deletion operation are:
1. Inorder Predecessor
The largest key on the left child of a node is called its inorder predecessor.
2. Inorder Successor
The smallest key on the right child of a node is called its inorder successor.
Deletion Operation
Before going through the steps below, one must know these facts about a B tree of
degree m.
1. A node can have a maximum of m children. (i.e. 3)
Case I
The key to be deleted lies in the leaf. There are two cases for it.
1. The deletion of the key does not violate the property of the minimum number of
keys a node should hold.
In the tree below, deleting 32 does not violate the above properties.
2. The deletion of the key violates the property of the minimum number of keys a
node should hold. In this case, we borrow a key from its immediate neighboring
sibling node in the order of left to right.
First, visit the immediate left sibling. If the left sibling node has more than a
minimum number of keys, then borrow a key from this node.
Else, check to borrow from the immediate right sibling node.
In the tree below, deleting 31 results in the above condition. Let us borrow a key
from the left sibling node.
Deleting 30 results in the above case.
1. The internal node, which is deleted, is replaced by an inorder predecessor if the left
child has more than the minimum number of keys.
2. The internal node, which is deleted, is replaced by an inorder successor if the right
child has more than the minimum number of keys.
3. If either child has exactly a minimum number of keys then, merge the left and the
right children.
1. In this case, the height of the tree shrinks. If the target key lies in an
internal node, and the deletion of the key leads to a fewer number of
keys in the node (i.e. less than the minimum required), then look for the
inorder predecessor and the inorder successor. If both the children
contain a minimum number of keys then, borrowing cannot take place.
This leads to Case II(3) i.e. merging the children.
2. Again, look for the sibling to borrow a key. But, if the sibling also has
only a minimum number of keys then, merge the node with the sibling
along with the parent. Arrange the children accordingly (increasing
order).
6. Deleting an internal node (10)
Deletion Complexity
o Best case Time complexity: Θ(log n)
o Average case Space complexity: Θ(n)
o Worst case Space complexity: Θ(n)
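C++ Program for Implementation of a B-tree
#include <iostream>
using namespace std;
class TreeNode {
    int *keys;       // array of keys
    int t;           // minimum degree
    TreeNode **C;    // array of child pointers
    int n;           // current number of keys
    bool leaf;       // true if the node is a leaf
public:
    TreeNode(int temp, bool bool_leaf);
    void insertNonFull(int k);
    void splitChild(int i, TreeNode *y);
    void traverse();
    TreeNode *search(int k);
    friend class BTree;
};
class BTree {
    TreeNode *root;
    int t;
public:
    BTree(int temp) {
        root = NULL;
        t = temp;
    }
    void traverse() {
        if (root != NULL)
            root->traverse();
    }
    TreeNode *search(int k) {
        return (root == NULL) ? NULL : root->search(k);
    }
    void insert(int k);
};
TreeNode::TreeNode(int t1, bool leaf1) {
    t = t1;
    leaf = leaf1;
    keys = new int[2 * t - 1];
    C = new TreeNode *[2 * t];
    n = 0;
}
// Print all keys of the subtree in increasing order
void TreeNode::traverse() {
    int i;
    for (i = 0; i < n; i++) {
        if (leaf == false)
            C[i]->traverse();
        cout << " " << keys[i];
    }
    if (leaf == false)
        C[i]->traverse();
}
TreeNode *TreeNode::search(int k) {
    int i = 0;
    while (i < n && k > keys[i])
        i++;
    // the check i < n avoids reading past the last key
    if (i < n && keys[i] == k)
        return this;
    if (leaf == true)
        return NULL;
    return C[i]->search(k);
}
void BTree::insert(int k) {
    if (root == NULL) {
        root = new TreeNode(t, true);
        root->keys[0] = k;
        root->n = 1;
    }
    else {
        if (root->n == 2 * t - 1) {
            // the root is full: the tree grows in height
            TreeNode *s = new TreeNode(t, false);
            s->C[0] = root;
            s->splitChild(0, root);
            int i = 0;
            if (s->keys[0] < k)
                i++;
            s->C[i]->insertNonFull(k);
            root = s;
        }
        else
            root->insertNonFull(k);
    }
}
void TreeNode::insertNonFull(int k) {
    int i = n - 1;
    if (leaf == true) {
        // shift larger keys one place to the right and insert k
        while (i >= 0 && keys[i] > k) {
            keys[i + 1] = keys[i];
            i--;
        }
        keys[i + 1] = k;
        n = n + 1;
    }
    else {
        // find the child that should receive k
        while (i >= 0 && keys[i] > k)
            i--;
        if (C[i + 1]->n == 2 * t - 1) {
            splitChild(i + 1, C[i + 1]);
            if (keys[i + 1] < k)
                i++;
        }
        C[i + 1]->insertNonFull(k);
    }
}
// Split the full child y (at position i) around its median key
void TreeNode::splitChild(int i, TreeNode *y) {
    TreeNode *z = new TreeNode(y->t, y->leaf);
    z->n = t - 1;
    for (int j = 0; j < t - 1; j++)
        z->keys[j] = y->keys[j + t];
    if (y->leaf == false) {
        for (int j = 0; j < t; j++)
            z->C[j] = y->C[j + t];
    }
    y->n = t - 1;
    for (int j = n; j >= i + 1; j--)
        C[j + 1] = C[j];
    C[i + 1] = z;
    for (int j = n - 1; j >= i; j--)
        keys[j + 1] = keys[j];
    keys[i] = y->keys[t - 1];   // the median key moves up into this node
    n = n + 1;
}
int main() {
    BTree t(3);
    t.insert(8);
    t.insert(9);
    t.insert(10);
    t.insert(11);
    t.insert(15);
    t.insert(16);
    t.insert(17);
    t.insert(18);
    t.insert(20);
    t.insert(23);
    cout << "The B-tree is: ";
    t.traverse();
    int k = 10;
    (t.search(k) != NULL) ? cout << endl << k << " is found" : cout << endl
    << k << " is not Found";
    k = 2;
    (t.search(k) != NULL) ? cout << endl << k << " is found" : cout << endl
    << k << " is not Found\n";
    return 0;
}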
B Tree Applications
3. multilevel indexing