Dsa Unit 3
TREE ADT:
A tree is a non-linear data structure that stores data in a
hierarchical manner, which lets the computer organize and access it
more effectively. It consists of a root node, internal nodes, and
child nodes connected by edges: every node except the root has exactly
one parent, and a node may have any number of children. We can also
say that a tree has a root, branches, and leaves.
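#include <iostream>
using namespace std;

// Binary tree node (same layout as the Node class shown under BINARY TREE).
struct Node {
    int data;
    Node *left, *right;
    Node(int key) : data(key), left(nullptr), right(nullptr) {}
};

// Inorder traversal: left subtree, then the node, then the right subtree.
void inorderTraversal(Node* root) {
    if (root == nullptr)
        return;
    inorderTraversal(root->left);
    cout << root->data << " ";
    inorderTraversal(root->right);
}

int main() {
    Node* root = new Node(1);
    root->left = new Node(2);
    root->right = new Node(3);
    root->left->left = new Node(4);
    root->left->right = new Node(5);
    inorderTraversal(root);
    return 0;
}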
Output
4 2 5 1 3
Preorder Traversal:
Preorder traversal visits the node in the order: Root -> Left -> Right
Algorithm for Preorder Traversal:
Preorder(tree)
• Visit the root.
• Traverse the left subtree, i.e., call Preorder(left->subtree)
• Traverse the right subtree, i.e., call Preorder(right->subtree)
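The three steps above can be sketched in C++ as follows. This is a minimal illustration, not the original listing: the `Node` struct and the function name `preorder` are assumptions modeled on the inorder example, and the values are collected in a vector instead of being printed so the visiting order is easy to check.

```cpp
#include <vector>

// Illustrative node type (assumption: same shape as the Node used elsewhere in these notes).
struct Node {
    int data;
    Node* left;
    Node* right;
    explicit Node(int key) : data(key), left(nullptr), right(nullptr) {}
};

// Preorder: visit the root, then the left subtree, then the right subtree.
void preorder(const Node* root, std::vector<int>& out) {
    if (root == nullptr) return;
    out.push_back(root->data);   // 1. visit the root
    preorder(root->left, out);   // 2. traverse the left subtree
    preorder(root->right, out);  // 3. traverse the right subtree
}
```

For the tree built in the inorder example (1 at the root, 2 and 3 as its children, 4 and 5 under 2), the vector ends up holding 1, 2, 4, 5, 3.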
Output
1 2 4 5 3
Postorder Traversal:
Postorder traversal visits the node in the order: Left -> Right -> Root
Algorithm for Postorder Traversal:
Algorithm Postorder(tree)
• Traverse the left subtree, i.e., call Postorder(left->subtree)
• Traverse the right subtree, i.e., call Postorder(right->subtree)
• Visit the root
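A minimal C++ sketch of the postorder steps above. The `Node` struct and the name `postorder` are illustrative assumptions, and the function fills a vector rather than printing.

```cpp
#include <vector>

// Illustrative node type (assumption: same shape as the Node used elsewhere in these notes).
struct Node {
    int data;
    Node* left;
    Node* right;
    explicit Node(int key) : data(key), left(nullptr), right(nullptr) {}
};

// Postorder: traverse the left subtree, then the right subtree, then visit the root.
void postorder(const Node* root, std::vector<int>& out) {
    if (root == nullptr) return;
    postorder(root->left, out);   // 1. traverse the left subtree
    postorder(root->right, out);  // 2. traverse the right subtree
    out.push_back(root->data);    // 3. visit the root
}
```

Because a node is visited only after both of its subtrees, the same recursion pattern is a natural fit for deleting a tree: children are released before their parent.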
Output
4 5 2 3 1
Output
123456
BINARY TREE:
A binary tree is a non-linear data structure in which each node has at
most two children, referred to as the left child and the right child.
This section covers the basics of binary trees: their representation,
the operations on them, and their implementation, along with their
advantages and disadvantages.
Syntax:
class Node {
public:
int data;
Node* left, * right;
Node(int key) {
data = key;
left = nullptr;
right = nullptr;
}
};
Example:
struct Node{
int data;
Node *left, *right;
Node(int d){
data = d;
left = NULL;
right = NULL;
}
};
int main(){
Node* firstNode = new Node(2);
Node* secondNode = new Node(3);
Node* thirdNode = new Node(4);
Node* fourthNode = new Node(5);
firstNode->left = secondNode;
firstNode->right = thirdNode;
secondNode->left = fourthNode;
return 0;
}
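#include <iostream>
#include <queue>
using namespace std;

// Uses the Node struct defined in the example above.

// In-order DFS: left subtree, node, right subtree.
void inOrderDFS(Node* node) {
    if (node == nullptr) return;
    inOrderDFS(node->left);
    cout << node->data << " ";
    inOrderDFS(node->right);
}

// Pre-order DFS: node, left subtree, right subtree.
void preOrderDFS(Node* node) {
    if (node == nullptr) return;
    cout << node->data << " ";
    preOrderDFS(node->left);
    preOrderDFS(node->right);
}

// Post-order DFS: left subtree, right subtree, node.
void postOrderDFS(Node* node) {
    if (node == nullptr) return;
    postOrderDFS(node->left);
    postOrderDFS(node->right);
    cout << node->data << " ";
}

// Level-order traversal (BFS): a queue holds the nodes still to be visited.
void BFS(Node* root) {
    if (root == nullptr) return;
    queue<Node*> q;
    q.push(root);
    while (!q.empty()) {
        Node* node = q.front();
        q.pop();
        cout << node->data << " ";
        if (node->left != nullptr) q.push(node->left);
        if (node->right != nullptr) q.push(node->right);
    }
}

int main() {
    Node* root = new Node(2);
    root->left = new Node(3);
    root->right = new Node(4);
    root->left->left = new Node(5);
    cout << "In-order DFS: ";
    inOrderDFS(root);
    cout << "\nPre-order DFS: ";
    preOrderDFS(root);
    cout << "\nPost-order DFS: ";
    postOrderDFS(root);
    cout << "\nLevel order: ";
    BFS(root);
    return 0;
}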
Output
In-order DFS: 5 3 2 4
Pre-order DFS: 2 3 5 4
Post-order DFS: 5 3 4 2
Level order: 2 3 4 5
Output
6 is found in the binary tree
Output
Original tree (in-order): 5 3 6 2 4
Expression Tree
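An expression tree is a binary tree in which each internal node
corresponds to an operator and each leaf node corresponds to an
operand. For example, the expression tree for 3 + ((5+9) *2) has '+'
at the root, with the leaf 3 as its left child and '*' as its right
child; the '*' node has the subtree for 5+9 on its left and the leaf 2
on its right.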
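To evaluate an expression tree rooted at t:
solve(t)
  If t is a leaf (an operand)
    Return t.value
  A = solve(t.left)
  B = solve(t.right)
  Return the result of applying the operator t.value to A and B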
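The recursive evaluation can be sketched in C++ as below. This is an illustration under stated assumptions rather than the original program: the `ExprNode` type is hypothetical, operands are non-negative integers stored as strings, and only the four basic operators are handled.

```cpp
#include <string>
#include <utility>

// Illustrative node: a leaf holds an operand, an internal node holds an operator.
struct ExprNode {
    std::string value;
    ExprNode* left;
    ExprNode* right;
    ExprNode(std::string v, ExprNode* l = nullptr, ExprNode* r = nullptr)
        : value(std::move(v)), left(l), right(r) {}
};

// Operands return their own value; operators combine the results of their subtrees.
int solve(const ExprNode* t) {
    if (t == nullptr) return 0;                      // assumption: an empty tree evaluates to 0
    if (t->left == nullptr && t->right == nullptr)   // leaf: operand
        return std::stoi(t->value);
    int a = solve(t->left);
    int b = solve(t->right);
    if (t->value == "+") return a + b;
    if (t->value == "-") return a - b;
    if (t->value == "*") return a * b;
    return a / b;                                    // "/" (assumption: b is not zero)
}
```

For the tree of 3 + ((5+9) *2), the subtree 5+9 evaluates to 14, the '*' node to 28, and the root to 31.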
Examples:
Input: A B C*+ D/
Output: A + B * C / D
(The output is the in-order traversal of the tree, printed without
parentheses; the tree itself represents (A + (B * C)) / D.)
The first three symbols are operands, so create a tree node for each
and push pointers to them onto a stack.
In the next step, the operator ‘*’ is read, so two pointers to trees
are popped, a new tree is formed with ‘*’ as its root, and a pointer to
it is pushed onto the stack.
In the next step, the operator ‘+’ is read, so two pointers to trees
are popped, a new tree is formed with ‘+’ as its root, and a pointer to
it is pushed onto the stack.
Next, ‘D’ is read and pushed onto the stack as in the cases above. In
the last step, ‘/’ is read: the topmost element (‘D’) is popped and
becomes the right subtree of the root ‘/’, and the remaining tree is
popped and becomes its left subtree.
Example:
#include <iostream>
#include <string>
using namespace std;

class node {
public:
char value;
node* left;
node* right;
node* next = NULL;
node(char c)
{
this->value = c;
left = NULL;
right = NULL;
}
node()
{
left = NULL;
right = NULL;
}
friend class Stack;
friend class expression_tree;
};
class Stack {
node* head = NULL;
public:
void push(node*);
node* pop();
friend class expression_tree;
};
class expression_tree {
public:
void inorder(node* x)
{
// cout<<"Tree in InOrder Traversal is: "<<endl;
if (x == NULL)
return;
else {
inorder(x->left);
cout << x->value << " ";
inorder(x->right);
}
}
};
void Stack::push(node* x)
{
if (head == NULL) {
head = x;
}
else {
x->next = head;
head = x;
}
}
node* Stack::pop()
{
// Return NULL instead of dereferencing an empty stack.
if (head == NULL)
return NULL;
node* p = head;
head = head->next;
return p;
}
int main()
{
string s = "ABC*+D/";
// If you wish take input from user:
//cout << "Insert Postorder Expression: " << endl;
//cin >> s;
Stack e;
expression_tree a;
node *x, *y, *z;
int l = s.length();
for (int i = 0; i < l; i++) {
if (s[i] == '+' || s[i] == '-' || s[i] == '*'
|| s[i] == '/' || s[i] == '^') {
z = new node(s[i]);
x = e.pop();
y = e.pop();
z->left = y;
z->right = x;
e.push(z);
}
else {
z = new node(s[i]);
e.push(z);
}
}
cout << " The Inorder Traversal of Expression Tree: ";
a.inorder(z);
return 0;
}
Output
The Inorder Traversal of Expression Tree: A + B * C / D
Output:100
Applications of trees:
1. Store hierarchical data, like folder structure, organization
structure, XML/HTML data.
2. Binary Search Tree is a tree that allows fast search, insert, and
delete on sorted data. It also allows finding the closest item.
3. Heap is a tree data structure which is implemented using
arrays and used to implement priority queues.
4. B-Tree and B+ Tree : They are used to implement indexing
in databases.
5. Syntax Tree: Scanning, parsing, generation of code and
evaluation of arithmetic expressions in Compiler design.
6. K-D Tree: A space partitioning tree used to organize points in
K dimensional space.
7. Trie : Used to implement dictionaries with prefix lookup.
8. Suffix Tree : For quick pattern searching in a fixed text.
9. Spanning trees and shortest-path trees are used in bridges
and routers, respectively, in computer networks.
Applications of BST:
• Self-balancing binary search tree: Self-balancing data
structures such as AVL tree and Red-black tree are the most
useful variations of BSTs. In these variations, we maintain the
height as O(Log n) so that all operations are bounded by
O(Log n). TreeSet and TreeMap in Java (or set and map in
C++) are library implementations of self balancing BSTs.
• Sorted Stream of Data: If we wish to maintain a sorted
stream of data where we wish to have operations like insert,
search, delete and traversal in sorted order, BST is the most
suitable data structure for this case.
• Doubly Ended Priority Queues: With Self Balancing
BSTs, we can extract both maximum and minimum in O (Log
n) time, so when we need a data structure with both operations
supported efficiently, we use self balancing BSTs.
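As a sketch of the double-ended priority queue idea, the helpers below use `std::set`, which the notes above cite as a library self-balancing BST (it is commonly implemented as a red-black tree). The function names are illustrative assumptions; note that `std::set` drops duplicates, so `std::multiset` would be the choice when equal keys must be kept.

```cpp
#include <iterator>
#include <set>

// Smallest element: first in sorted order (assumes s is non-empty).
int minOf(const std::set<int>& s) { return *s.begin(); }

// Largest element: last in sorted order (assumes s is non-empty).
int maxOf(const std::set<int>& s) { return *s.rbegin(); }

// Remove and return the smallest element (assumes s is non-empty).
int extractMin(std::set<int>& s) {
    int v = *s.begin();
    s.erase(s.begin());
    return v;
}

// Remove and return the largest element (assumes s is non-empty).
int extractMax(std::set<int>& s) {
    int v = *s.rbegin();
    s.erase(std::prev(s.end()));
    return v;
}
```

Both ends are reachable because the set keeps its keys in sorted order; a binary heap, by contrast, gives cheap access to only one of the two extremes.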
Advantages:
• Fast search: Searching for a specific value in a BST has an
average time complexity of O (log n), where n is the number of
nodes in the tree. This is much faster than searching for an
element in an unsorted array or a linked list, which takes O(n)
time in the worst case.
• In-order traversal: BSTs can be traversed in-order, which
visits the left subtree, the root, and the right subtree. This can
be used to sort a dataset.
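The two advantages above can be seen in a small unbalanced BST sketch. The names `BstNode`, `insert`, `contains`, and `inorder` are illustrative assumptions, and duplicates are simply ignored.

```cpp
#include <vector>

struct BstNode {
    int key;
    BstNode* left;
    BstNode* right;
    explicit BstNode(int k) : key(k), left(nullptr), right(nullptr) {}
};

// Insert key and return the (possibly new) subtree root. Duplicates are ignored.
BstNode* insert(BstNode* root, int key) {
    if (root == nullptr) return new BstNode(key);
    if (key < root->key) root->left = insert(root->left, key);
    else if (key > root->key) root->right = insert(root->right, key);
    return root;
}

// Iterative search: each comparison discards one whole subtree.
bool contains(const BstNode* root, int key) {
    while (root != nullptr) {
        if (key == root->key) return true;
        root = (key < root->key) ? root->left : root->right;
    }
    return false;
}

// In-order traversal appends the keys in ascending order.
void inorder(const BstNode* root, std::vector<int>& out) {
    if (root == nullptr) return;
    inorder(root->left, out);
    out.push_back(root->key);
    inorder(root->right, out);
}
```

Inserting 50, 30, 70, 20, 40, 60, 80 and then calling `inorder` yields the keys in sorted order, whatever order they were inserted in.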
Disadvantages:
• Skewed trees: If a tree becomes skewed, the time
complexity of search, insertion, and deletion operations will be
O(n) instead of O (log n), which can make the tree inefficient.
• Additional time required: Self-balancing trees require
additional time to maintain balance during insertion and
deletion operations.
• Efficiency: If only search, insert, and/or delete operations
are needed, hashing is usually preferred over BSTs because it
performs them in O(1) time on average. However, if we need to
maintain sorted data along with these operations, we use a BST.
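Threads are also useful for quickly accessing the ancestors of a node.
In a single-threaded binary tree, a right pointer that would otherwise
be NULL points to the node's in-order successor instead (in diagrams,
threads are drawn as dotted lines). A node of such a tree can be
declared as: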
struct node {
int data;
struct node* left;
struct node* right;
bool rightThread;
};
or
class Node {
public:
int data;
Node* left;
Node* right;
bool rightThread;
Node(int val){
data = val;
left = NULL;
right = NULL;
rightThread = false;
}
};
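With a `rightThread` flag like the one in the declarations above, an in-order traversal needs neither recursion nor a stack. The sketch below is an illustration under assumptions: `TNode`, `leftmost`, and `inorder` are hypothetical names, and the threads are expected to have been set up already when the tree was built.

```cpp
#include <vector>

struct TNode {
    int data;
    TNode* left;
    TNode* right;      // right child, or in-order successor when rightThread is true
    bool rightThread;
    explicit TNode(int v) : data(v), left(nullptr), right(nullptr), rightThread(false) {}
};

// Follow left links down to the first in-order node of this subtree.
TNode* leftmost(TNode* n) {
    while (n != nullptr && n->left != nullptr) n = n->left;
    return n;
}

// In-order traversal that uses threads instead of a stack.
std::vector<int> inorder(TNode* root) {
    std::vector<int> out;
    TNode* cur = leftmost(root);
    while (cur != nullptr) {
        out.push_back(cur->data);
        if (cur->rightThread) cur = cur->right;   // thread: jump straight to the successor
        else cur = leftmost(cur->right);          // real child: descend into the right subtree
    }
    return out;
}
```

The traversal ends when it reaches the last in-order node, whose right pointer is a plain null pointer because it has no successor.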
Advantages of Threaded Binary Tree
• It enables linear traversal of the elements.
• It eliminates the need for a stack during traversal, which
saves memory.
• It makes it possible to find the parent of a node without an
explicit parent pointer.
• A threaded tree supports forward and backward traversal of the
nodes in in-order fashion.
• Nodes contain pointers to their in-order predecessor and successor.
Disadvantages of Threaded Binary Tree
• Every node in a threaded binary tree needs extra information
(extra memory) to indicate whether its left or right pointer
refers to a child node or to its in-order predecessor or
successor, so each node consumes more memory.
• Insertion and deletion are more complex and time-consuming
than in an ordinary binary tree, since both threads and
ordinary links need to be maintained.
• Implementing threads for every possible node is complicated.
• Increased complexity: Implementing a threaded binary tree
requires more complex algorithms and data structures than a
regular binary tree. This can make the code harder to read and
debug.
• Extra memory usage: Threads reuse pointers that would
otherwise be NULL, but each node also carries one or two
thread flags, so a threaded tree uses somewhat more memory
than a regular binary tree with the same nodes.
Applications of threaded binary tree
• Expression evaluation: Threaded binary trees can be
used to evaluate arithmetic expressions in a way that avoids
recursion or a stack. The tree can be constructed from the
input expression, and then traversed in-order or pre-order to
perform the evaluation.
• Database indexing: In a database, threaded binary trees
can be used to index data based on a specific field (e.g. last
name). The tree can be constructed with the indexed values as
keys, and then traversed in-order to retrieve the data in sorted
order.
• Symbol table management: In a compiler or interpreter,
threaded binary trees can be used to store and manage
symbol tables for variables and functions. The tree can be
constructed with the symbols as keys, and then traversed in-
order or pre-order to perform various operations on the symbol
table.
• Disk-based data structures: Threaded binary trees can
be used in disk-based data structures (e.g. B-trees) to improve
performance. By threading the tree, it can be traversed in a
way that minimizes disk seeks and improves locality of
reference.
• Navigation of hierarchical data: In certain applications,
threaded binary trees can be used to navigate hierarchical
data structures, such as file systems or web site directories.
The tree can be constructed from the hierarchical data, and
then traversed in-order or pre-order to efficiently access the
data in a specific order.
AVL Tree Data Structure
An AVL tree is defined as a self-balancing Binary Search Tree (BST)
in which the difference between the heights of the left and right
subtrees of any node cannot be more than one.
The difference between the heights of the left subtree and the right
subtree for any node is known as the balance factor of the node.
The AVL tree is named after its inventors, Georgy Adelson-Velsky and
Evgenii Landis, who published it in their 1962 paper “An algorithm for
the organization of information”.
Example of AVL Trees:
The above tree is AVL because the differences between the heights of
left and right subtrees for every node are less than or equal to 1.
Operations on an AVL Tree:
• Insertion
• Deletion
• Searching
Rotating the subtrees in an AVL Tree:
An AVL tree uses one of the following four rotations to keep itself
balanced:
Left Rotation:
When a node is added to the right subtree of the right subtree and the
tree becomes unbalanced, we perform a single left rotation.
Right Rotation:
When a node is added to the left subtree of the left subtree and the
tree becomes unbalanced, we perform a single right rotation.
Left-Right Rotation:
A left-right rotation is a combination in which a left rotation is
performed first, followed by a right rotation. It is needed when a node
is added to the right subtree of the left subtree.
Right-Left Rotation:
A right-left rotation is a combination in which a right rotation is
performed first, followed by a left rotation. It is needed when a node
is added to the left subtree of the right subtree.
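The four cases can be sketched as an insertion routine that rebalances on the way back up the recursion. This is a minimal illustration, not a complete AVL implementation (there is no deletion); `AvlNode` and the helper names are assumptions, the balance factor is taken as left height minus right height, and duplicate keys are ignored.

```cpp
#include <algorithm>

struct AvlNode {
    int key;
    int height;   // height of the subtree rooted here (a leaf has height 1)
    AvlNode* left;
    AvlNode* right;
    explicit AvlNode(int k) : key(k), height(1), left(nullptr), right(nullptr) {}
};

int height(const AvlNode* n) { return n ? n->height : 0; }
int balanceFactor(const AvlNode* n) { return n ? height(n->left) - height(n->right) : 0; }
void update(AvlNode* n) { n->height = 1 + std::max(height(n->left), height(n->right)); }

// Right rotation: the left child x becomes the new subtree root.
AvlNode* rotateRight(AvlNode* y) {
    AvlNode* x = y->left;
    y->left = x->right;
    x->right = y;
    update(y);
    update(x);
    return x;
}

// Left rotation: the right child y becomes the new subtree root.
AvlNode* rotateLeft(AvlNode* x) {
    AvlNode* y = x->right;
    x->right = y->left;
    y->left = x;
    update(x);
    update(y);
    return y;
}

// Ordinary BST insert followed by rebalancing of every node on the path.
AvlNode* insert(AvlNode* node, int key) {
    if (node == nullptr) return new AvlNode(key);
    if (key < node->key) node->left = insert(node->left, key);
    else if (key > node->key) node->right = insert(node->right, key);
    else return node;                                                  // duplicate: nothing to do

    update(node);
    int bf = balanceFactor(node);
    if (bf > 1 && key < node->left->key) return rotateRight(node);    // left-left case
    if (bf < -1 && key > node->right->key) return rotateLeft(node);   // right-right case
    if (bf > 1 && key > node->left->key) {                            // left-right case
        node->left = rotateLeft(node->left);
        return rotateRight(node);
    }
    if (bf < -1 && key < node->right->key) {                          // right-left case
        node->right = rotateRight(node->right);
        return rotateLeft(node);
    }
    return node;
}
```

Inserting 10, 20, 30, 40, 50, 25 in that order triggers two left rotations and one right-left rotation, and leaves 30 at the root with 20 and 40 as its children.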
Advantages of AVL Tree:
1. AVL trees are self-balancing and therefore provide O (Log n)
time complexity for search, insert and delete.
2. An AVL tree is a BST (with balancing), so items can be
traversed in sorted order.
3. Since the balancing rules are stricter than those of a Red Black
Tree, AVL trees generally have a smaller height, and hence
search is faster.
4. An AVL tree is relatively less complex to understand and
implement than a Red Black Tree.
B-Tree
A B-Tree is a self-balancing search tree designed to store and search
large amounts of data. When the data does not fit in main memory,
traditional binary search trees become impractical: each node holds a
single key, so the tree is tall and a search may need many disk
accesses. A B-Tree node holds many keys, which keeps the tree short and
overcomes this limitation.
Time complexity of B-Tree operations:
1. Search O(log n)
2. Insert O(log n)
3. Delete O(log n)
Properties of B-Tree:
• All leaves are at the same level.
• A B-Tree is defined by its minimum degree ‘t’. The value
of ‘t’ depends upon the disk block size.
• Every node except the root must contain at least t-1 keys. The
root may contain a minimum of 1 key.
• All nodes (including root) may contain at most (2*t – 1) keys.
• Number of children of a node is equal to the number of keys in
it plus 1.
• All keys of a node are sorted in increasing order. The child
between two keys k1 and k2 contains all keys in the range
between k1 and k2.
• B-Tree grows and shrinks from the root which is unlike Binary
Search Tree. Binary Search Trees grow downward and also
shrink from downward.
• Like other balanced Binary Search Trees, the time complexity
to search, insert, and delete is O (log n).
• Insertion of a Node in B-Tree happens only at Leaf Node.
Following is an example of a B-Tree of minimum order 5
Traversal in B-Tree:
Traversal is also similar to Inorder traversal of Binary Tree. We start
from the leftmost child, recursively print the leftmost child, then repeat
the same process for the remaining children and keys. In the end,
recursively print the rightmost child.
Search Operation in B-Tree:
Search is similar to search in a Binary Search Tree. Let the key to be
searched be k.
• Start from the root and recursively traverse down.
• For every visited non-leaf node,
o If the node has the key, we simply return the
node.
o Otherwise, we recur down to the appropriate
child (The child which is just before the first
greater key) of the node.
• If we reach a leaf node and don’t find k in the leaf node, then
return NULL.
Searching a B-Tree is similar to searching a binary search tree, and
the algorithm is naturally recursive. At each level, the keys of the
current node narrow the search: if the key is not in the node, it can
only lie in the child whose range contains it, so all other branches
are skipped. Because these keys limit the search, they are also known
as limiting values or separation values. If we reach a leaf node and
do not find the desired key, the search returns NULL.
Algorithm for Searching an Element in a B-Tree: -
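// Illustrative capacity (assumption: the notes do not give a value).
const int MAX_KEYS = 5;
const int MAX_CHILDREN = MAX_KEYS + 1;

struct Node {
    int n;                       // number of keys currently stored
    int key[MAX_KEYS];           // keys in increasing order
    Node* child[MAX_CHILDREN];   // child[i] holds the keys between key[i-1] and key[i]
    bool leaf;                   // true if the node has no children
};

Node* BtreeSearch(Node* x, int k) {
    if (x == nullptr) return nullptr;
    int i = 0;
    // Skip every key smaller than k.
    while (i < x->n && k > x->key[i]) {
        i++;
    }
    if (i < x->n && k == x->key[i]) {
        return x;                // found in this node
    }
    if (x->leaf) {
        return nullptr;          // nowhere left to look
    }
    return BtreeSearch(x->child[i], k);
}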
Examples:
Input: Search 120 in the given B-Tree.
Solution:
In this example, the search was shortened by restricting it, at each
step, to the only subtree that could contain the key. Similarly, to
look for 180 in the same tree, the search stops at step 2 because 180
is found in the current node. To look for 90, the search goes to the
left subtree because 90 < 100, and then proceeds as in the example
above.
Program:
#include <iostream>
using namespace std;

class BTreeNode {
int* keys;
int t;
BTreeNode** C;
int n;
bool leaf;
public:
BTreeNode(int _t, bool _leaf);
void traverse();
BTreeNode* search(int k);
friend class BTree;
};
class BTree {
BTreeNode* root;
int t;
public:
BTree(int _t)
{
root = NULL;
t = _t;
}
void traverse()
{
if (root != NULL)
root->traverse();
}
BTreeNode* search(int k)
{
return (root == NULL) ? NULL : root->search(k);
}
};
BTreeNode::BTreeNode(int _t, bool _leaf)
{
t = _t;
leaf = _leaf;
keys = new int[2 * t - 1];
C = new BTreeNode*[2 * t];
n = 0;
}
void BTreeNode::traverse()
{
int i;
for (i = 0; i < n; i++) {
if (leaf == false)
C[i]->traverse();
cout << " " << keys[i];
}
if (leaf == false)
C[i]->traverse();
}
BTreeNode* BTreeNode::search(int k)
{
int i = 0;
while (i < n && k > keys[i])
i++;
// Check i < n first so keys[n] is never read.
if (i < n && keys[i] == k)
return this;
if (leaf == true)
return NULL;
return C[i]->search(k);
}
B+ Tree
A B+ tree is a variation of the B-tree data structure. In a B+ tree,
data pointers are stored only at the leaf nodes, so the structure of a
leaf node differs from the structure of an internal node. The leaf
nodes have an entry for every value of the search field, along with a
data pointer to the record (or to the block that contains the record).
The leaf nodes are linked together to provide ordered access to the
records by search field. Internal nodes are used to guide the search,
and some search field values from the leaf nodes are repeated in them.
Features of B+ Trees
• Balanced: B+ Trees are self-balancing, which means that as
data is added or removed from the tree, it automatically adjusts
itself to maintain a balanced structure. This ensures that the
search time grows only logarithmically with the size of the
tree.
• Multi-level: B+ Trees are multi-level data structures, with a
root node at the top and one or more levels of internal nodes
below it. The leaf nodes at the bottom level contain the actual
data.
• Ordered: B+ Trees maintain the order of the keys in the tree,
which makes it easy to perform range queries and other
operations that require sorted data.
• Fan-out: B+ Trees have a high fan-out, which means that
each node can have many child nodes. This reduces the
height of the tree and increases the efficiency of searching and
indexing operations.
• Cache-friendly: B+ Trees are designed to be cache-friendly,
which means that they can take advantage of the caching
mechanisms in modern computer architectures to improve
performance.
• Disk-oriented: B+ Trees are often used for disk-based
storage systems because they are efficient at storing and
retrieving data from disk.
Implementation of B+ Tree
To implement dynamic multilevel indexing, B-trees and B+ trees are
generally employed. The drawback of the B-tree for indexing, however,
is that it stores the data pointer (a pointer to the disk file block
containing the key value) for each key value alongside that key in the
node. This greatly reduces the number of entries that can be packed
into a node, which increases the number of levels in the B-tree and
hence the search time for a record. The B+ tree eliminates this
drawback by storing data pointers only at the leaf nodes. Thus, the
structure of the leaf nodes of a B+ tree is quite different from the
structure of its internal nodes. Since data pointers are present only
at the leaf nodes, the leaf nodes must store all the key values along
with their corresponding data pointers to the disk file blocks.
Moreover, the leaf nodes are linked to provide ordered access to the
records. The leaf nodes therefore form the first level of the index,
with the internal nodes forming the other levels of a multilevel index.
Some of the key values of the leaf nodes also appear in the internal
nodes, simply to guide the search for a record. It follows that a B+
tree, unlike a B-tree, has two orders, ‘a’ and ‘b’: one for the
internal nodes and the other for the external (or leaf) nodes.
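The role of the linked leaves can be illustrated with a deliberately small sketch: one internal node that only guides the search, and a chain of leaves that hold the keys. Everything here is an assumption made for illustration (the `Leaf` and `Internal` types, a single internal level, integer keys with no data pointers); a real B+ tree has several internal levels and handles splits and merges.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Leaf node: holds the keys and a link to the next leaf (the Pnext pointer).
struct Leaf {
    std::vector<int> keys;
    Leaf* next = nullptr;
};

// Internal node: separator keys only guide the search.
// children.size() == keys.size() + 1; child i holds the keys smaller than keys[i].
struct Internal {
    std::vector<int> keys;
    std::vector<Leaf*> children;
};

// Pick the leaf that would contain k by skipping every separator <= k.
Leaf* findLeaf(const Internal& root, int k) {
    std::size_t i = std::upper_bound(root.keys.begin(), root.keys.end(), k) - root.keys.begin();
    return root.children[i];
}

// Range query [lo, hi]: descend once, then walk the linked leaves.
std::vector<int> rangeScan(const Internal& root, int lo, int hi) {
    std::vector<int> out;
    for (Leaf* leaf = findLeaf(root, lo); leaf != nullptr; leaf = leaf->next) {
        for (int k : leaf->keys) {
            if (k > hi) return out;       // past the end of the range
            if (k >= lo) out.push_back(k);
        }
    }
    return out;
}
```

With leaves [5, 10], [20, 30], [40, 50] and separators 20 and 40, a scan of the range 10 to 40 descends to the first leaf once and then follows the leaf links, returning 10, 20, 30, 40 without going back to the internal node.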
Structure of B+ Trees
Insertion in B+ Trees
Insertion in B+ Trees is done via the following steps.
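• Every element in the tree has to be inserted into a leaf node.
Therefore, it is necessary to first find the proper leaf node.
• Insert the key into the leaf node in increasing order if there is
no overflow.
• If the leaf overflows, split it into two leaves and copy the
smallest key of the new right leaf into the parent; if the
parent overflows in turn, split it as well and continue upward.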
Deletion in B+Trees
Deletion in a B+ tree is not just deletion: it is a combined process
of searching, deletion, and balancing. The last step, balancing, is
mandatory; without it the tree no longer satisfies the properties of a
B+ tree.
Advantages of B+Trees
• A B+ tree with ‘l’ levels can store more entries in its internal
nodes than a B-tree with the same ‘l’ levels, which
significantly improves the search time for any given key.
Having fewer levels, together with the Pnext pointers that link
the leaves, makes B+ trees quick and efficient at accessing
records on disk.
• Data stored in a B+ tree can be accessed both sequentially
and directly.
• Fetching any record takes the same number of disk accesses,
because all data pointers are in the leaves and all leaves are
at the same level.
• B+ trees store some search keys redundantly (in an internal
node as well as in a leaf), which is not possible in a B-tree.
Disadvantages of B+ Trees
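• Because some search keys are stored twice (once in an internal
node and once in a leaf), a B+ tree needs more space than a
B-tree holding the same keys. In exchange, it removes the major
drawback of the B-tree, the difficulty of traversing the keys
sequentially: the B+ tree retains the rapid random-access
property of the B-tree while also allowing rapid sequential
access.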
Application of B+ Trees
• Multilevel Indexing
• Faster operations on the tree (insertion, deletion, search)
• Database indexing
Min-Heap:
In a min-heap, the value of the root node must be the smallest among
all the nodes below it, and the same must hold recursively for its left
and right sub-trees.
Properties of Heap:
Heap has the following Properties:
• Complete Binary Tree: A heap tree is a complete binary
tree, meaning all levels of the tree are fully filled except
possibly the last level, which is filled from left to right. This
property ensures that the tree is efficiently represented using
an array.
• Heap Property: This property ensures that the minimum (or
maximum) element is always at the root of the tree according
to the heap type.
• Parent-Child Relationship: The relationship between a
parent node at index ‘i’ and its children is given by the
formulas: left child at index 2i+1 and right child at
index 2i+2 for 0-based indexing of node numbers.
• Efficient Insertion and Removal: Insertion and removal
operations in heap trees are efficient. A new element is
inserted at the next available position on the last level (the
leftmost free slot), and the heap property is restored by
comparing the element with its parent and swapping if
necessary. Removal of the root element involves replacing it
with the last element and heapifying down.
• Efficient Access to Extremal Elements: The minimum
or maximum element is always at the root of the heap, allowing
constant-time access.
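The index formulas and the insertion procedure described above can be sketched for a min-heap stored in a `std::vector`. The function names are illustrative assumptions; for a max-heap the comparison in the loop is reversed.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Index arithmetic for a 0-based array heap (parent is only meaningful for i > 0).
std::size_t parent(std::size_t i) { return (i - 1) / 2; }
std::size_t leftChild(std::size_t i) { return 2 * i + 1; }
std::size_t rightChild(std::size_t i) { return 2 * i + 2; }

// Insert into a min-heap: append at the next free slot, then swap upward
// while the new value is smaller than its parent.
void heapInsert(std::vector<int>& heap, int value) {
    heap.push_back(value);
    std::size_t i = heap.size() - 1;
    while (i > 0 && heap[i] < heap[parent(i)]) {
        std::swap(heap[i], heap[parent(i)]);
        i = parent(i);
    }
}
```

Inserting 5, 3, 8, 1 into an empty heap produces the array 1, 3, 8, 5: the last value, 1, is swapped first with 5 and then with 3 on its way to the root.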
Heapify:
Heapify is the process of rearranging the elements to maintain the
heap property. It is performed when an operation on a node leaves that
node violating the heap property. Heapifying a single node takes
O(log N) time.
• For max-heap, it balances in such a way that the maximum
element is the root of that binary tree and
• For min-heap, it balances in such a way that the minimum
element is the root of that binary tree.
Insertion:
Now if we delete 15 from the heap, the root is temporarily replaced by
the last leaf node of the tree (here 3):
    3
   / \
  5   7
 /
2
removeMin or removeMax:
This operation returns and deletes the maximum element and minimum
element from the max-heap and min-heap respectively. In short, it
deletes the root element of the heap binary tree.
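A minimal sketch of removeMin for an array-based min-heap follows; removeMax on a max-heap is the same with the comparisons reversed. The names `heapifyDown` and `removeMin` and the use of `std::vector<int>` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Restore the min-heap property below index i by repeatedly
// swapping the node with its smaller child.
void heapifyDown(std::vector<int>& heap, std::size_t i) {
    const std::size_t n = heap.size();
    while (true) {
        std::size_t smallest = i;
        std::size_t l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && heap[l] < heap[smallest]) smallest = l;
        if (r < n && heap[r] < heap[smallest]) smallest = r;
        if (smallest == i) return;        // both children are larger: done
        std::swap(heap[i], heap[smallest]);
        i = smallest;
    }
}

// Remove and return the root of a min-heap (the heap must not be empty).
int removeMin(std::vector<int>& heap) {
    assert(!heap.empty());
    int minValue = heap.front();
    heap.front() = heap.back();   // move the last leaf to the root
    heap.pop_back();
    heapifyDown(heap, 0);
    return minValue;
}
```

Calling `removeMin` repeatedly returns the elements in ascending order, which is the idea behind heap sort.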