Unit 4 - Trees
Unit 4 - Trees
UNIT-4: TREES
Basic terminology, binary trees, binary tree traversal, Threaded
Trees, Binary Search Tree, AVL tree, B-tree.
Similar to tree in general world
Finite set of elements
Non-linear data structure
Used to represent hierarchical structures
Eg. Syntax tree, Binary Search Tree, Animal Kingdom
Application: Organization Charts, File system
Tree can also be defined in itself as a node and list of child
trees.
In Computer Science, Trees grow down!
My Computer
C E
D
B D
C
E F
G H
I J K
Root: top node in tree or node with no parent (A is root)
Leaf: node with no child (E, I, J etc. are leaf node)
Interior node: All non-leaf node(B, F, A etc.)
B D
C
E F
G H
I J K
Ancestors: parent, parent of parent and so on. (F, B, A are
ancestors of K)
Descendants: child, child of child and so on. (E, F, I, J, K are
descendants of B)
Subtree: tree consisting of a node and
its descendants
A
B D
C
E F
G H
I J K
Degree of a node: number of its child (degree(A)=3,
degree(B)=2)
Degree of tree: maximum of degrees of all nodes
Level: 1 + number of connection between node and root
(level(F)=3)
B D
C
E F
G H
I J K
Height of node: number of edges between node and farthest
leaf (Height(B)=2)
Height of tree: height of root node (height of tree is 3)
Depth of node: number of edges between root and node (depth
of K is 3)
B D
C
E F
G H
I J K
Every node contains:
Key/data
Children nodes
Parent node(optional)
Data
A
Data
B C D
Left Right
Child Sibling
E F G H I
J K L
Each node has at most two children
The children of a node are ordered pair (a left and a right
child)
Each node contains:
Key
Left A
Right
Parent(optional)
B C
D E F G
H I
Each node has at most K children
Binary tree is a special case with K=2
Eg. 3-ary tree
B J C
D E F G K
H I
Tree traversal (also known as tree search) is the
process of visiting each node in a tree data structure
exactly once in a systematic manner.
There are two types of traversal:
breadth-first search (BFS)
depth-first search (DFS).
Output: A A
Queue(Q): B, C
Output: A B B C
Queue(Q): C, D, E
Output: A B C D E F G
Queue(Q): D, E, F, G
Output: A B C D H I
Queue(Q): E, F, G
Output: A B C D E
Queue(Q): F, G, H, I
Output: A B C D E F A
Queue(Q): G, H, I
Output: A B C D E F G B C
Queue(Q): H, I
Output: A B C D E F G H D E F G
Queue(Q): I
Output: A B C D E F G H I H I
Queue(Q):
Travel All the nodes of one sub-tree of binary search before
travelling other sub-tree
DFS is recursively implemented on a tree to visit nodes
Note: Many of the tree algorithms are recursively
implemented for the reason that tree itself is implemented
recursively
DFS on binary tree can be implemented in 3 ways
PreOrder: Root-Left-Right
InOrder: Left-Root-Right
PostOrder: Left-Right-Root
Visit the root node first
Visit left sub-tree in PreOrder
Visit right sub-tree in PreOrder
PreOrderTraversal(Tree) {
if (isEmpty(Tree)) return;
else {
print (tree->data);
PreOrderTraversal(tree->left);
PreOrderTraversal(tree->right);
}
}
Visit the left sub-tree in InOrder
Visit root node
Visit right sub-tree in InOrder
InOrderTraversal(Tree) {
if (isEmpty(Tree)) return;
else {
InOrderTraversal(tree->left);
print (tree->data);
InOrderTraversal(tree->right);
}
}
Visit the left sub-tree in PostOrder
Visit right sub-tree in PostOrder
Visit root node
PostOrderTraversal(Tree) {
if (isEmpty(Tree)) return;
else {
PostOrderTraversal(tree->left);
PostOrderTraversal(tree->right);
print (tree->data);
}
}
Pre-order: In-order: Post-order:
F, B, A, D, C, E, G, I, H A, B, C, D, E, F, G, H, I A, C, E, D, B, H, I, G, F
https://en.wikipedia.org/wiki/Tree_traversal 21
What will be the order of BFS, PreOrder, InOrder and PostOrder
traversals on below tree?
20
17 53
11 19 23 64
6 16 18 44 61 66
BFS: 20 17 53 11 19 23 64 6 16 18 44 61 66
PreOrder: 20 17 11 6 16 19 18 53 23 44 64 61 66
InOrder: 6 11 16 17 18 19 20 23 44 53 61 64 66
PostOrder: 6 16 11 18 19 17 44 23 61 66 64 53 20
20
17 53
11 19 23 64
6 16 18 44 61 66
Either a node has 2 children or no child in a binary tree
17 17 17
11 19 11 19 11 19
6 16 Full Binary 6 16 18
Tree
Full Binary Not Full Binary Tree
Tree
All internal nodes has two children and all leaves are at same
level/depth
17 17 17
11 19 11 19 11 19
6 16 Full; perfect 18
17 17 17
11 19 11 19 11 19
6 16 Full; 18
Perfect;
Complete
Full; Not full;
Not perfect; Not perfect;
Complete Not Complete
An arithmetic expression can be represented as binary tree
where internal nodes are operators and leaves are operands
Eg. ((6+16) * 19) + (23 + (61-66))
* +
+ 19 23 -
6 16 61 66
Perfect tree is a special case of complete tree with last level
completely filled.
In some literature, Perfect binary tree is referred as Complete
binary tree. In that case, Complete binary tree is referred
Almost Complete binary tree.
Any binary tree can be thought of as a tree obtained by
pruning some nodes of a perfect binary tree
B C
D E F G
J K H I L M N O
Any binary tree can be thought of as a tree obtained by
pruning some nodes of a perfect binary tree
B C
D E F G
J K H I L M N O
Any binary tree can be thought of as a tree obtained by
pruning some nodes of a perfect binary tree
B C
D F G
J L M O
If a binary tree has n nodes and height h, then
Level i has at most 2i-1 nodes
n <= 2h+1 – 1
Hence, h >= log2(n+1)/2 i.e. minimum height of a tree with n nodes
is O(log2n)
Maximum height of a tree with n nodes is n-1 which is obtained
when every non-leaf node has exact one child
Binary tree can also be represented using arrays
Store root node at index 0
Store left child of a parent node is at 2*i+1 where i is the index
of parent node in array
Store right child of a parent node is at 2*i+2 where i is the
index of parent node in array
Parent of a node at index i can be found at (i-1)/2 except for
root node
A B C D E F G H I J
0 1 2 3 4 5 6 7 8 9 10 11 12
B C
D E F G
H I J
If a node doesn’t have a left or/and right child, indices for left
or/and right child are empty
If index of a child is greater than size of array, child of that node
does not exist
BFS or DFS
Worst case time complexity = O(n) where n is the number of
nodes in binary tree
Can we improve searching?
Root is greater than left child and less than right child
Equal keys can be kept in left or right sub-tree
Same strategy should be followed
for whole tree for repeated keys
insertKey(tree, key) {
if tree = NULL
create new node (key, NULL, NULL);
else tree->data < key
tree->right = insertKey(tree->right, key);
else
tree->left = insertKey(tree->left, key);
}
Insert 11
typedef struct Node { //structure of a node
int data;
struct Node left;
struct Node right;
} * Nptr;
Nptr newNode(int data, Nptr left, Nptr right) { //create a new node
Nptr nNode = (Nptr) malloc(sizeof(struct Node));
nNode->data = data;
nNode->left = left;
nNode->right =right;
return nNode;
}
(CONT.)
Nptr insertKey(Nptr tree, int data) {
if (tree == NULL) {
return newNode(data, NULL, NULL);
}
else if (tree->data < data) {
tree->right = insertKey(tree->right, data);
}
else {
tree->left = insertKey(tree->left, data);
}
return tree;
}
searchBST(tree, key) {
if tree = NULL {
print(“Not found”);
return;
}
else if tree->data = key
print(“Key found”);
else if tree->data >key
searchBST(tree->left, key);
else
searchBST(tree->right, key);
}
Nptr searchBST(Nptr tree, int key) { //Search key in tree
if (tree == NULL) {
printf("\nkey not found"); //print not found if key not present
return NULL;
}
else if (tree->data == key) {
printf("\n Key found"); //print found if key not present
return tree;
}
else if (tree->data < key)
return searchBST(tree->right, key); //Search right sub-tree
else
return searchBST(tree->left, key); //Search left sub-tree
}
Search in BST depends on the height of the tree
In worst case,
Height of a binary tree with n nodes = O(n)
Thus, Time complexity of search = O(n)
In general,
Height of a binary tree with n nodes = O(log2 n)
Thus, time complexity of search = O(log2 n)
Simply delete the node
Make parent of node to be deleted to point its child node
Set node to be deleted free (optional)
Find predecessor (i.e. rightmost node in the left sub-tree) or
successor (i.e. leftmost node in the right sub-tree) and replace
data
Delete predecessor node or successor node as chosen above
Create a Binary Search Tree for the following data
1. 50, 60, 25, 40, 30, 70, 35, 10, 55, 65, 5
2. 6, 8, 9, 3, 1, 4, 2, 0, 5, 7
3. 30 , 40, 24, 58, 48, 26, 11, 13
Does the order of inserting
elements into tree matters?
Yes, certain orders might
produce very unbalanced
trees!
Unbalanced trees are not
desirable because search
time increases!
Solution are balanced BST
like AVL tree.
Most of the applications (eg. Search, insert, delete, minimum,
maximum etc.) are dependent on height of BST
Order of insertion of element in BST could generate
unbalanced BST
Therefore, in worst case time complexity can be O(n)
1 8 -1 8 -1 12 1 12
0 6 NO NO 0 9 NO 14 0 0 11 NO
Steps to insert in AVL:
1. Insert in BST (i.e. find proper location and create node)
2. Starting from new node inserted, check upwards for the first
node which is imbalance
3. Re-balance tree by performing rotation(s) based on the type
of node arrangement causing imbalance
1. Left-Left case
2. Right-Right case
3. Left-Right case
4. Right-Left case
Height of node: number of edges between node and farthest
leaf
We can calculate height of any node recursively
height(x) = 1 + max(height(left), height(right))
Height of leaf nodes are 0.
int height(tree) {
if (tree == NULL) {
return -1; //For NULL tree, eases computation
}
else {
return (1 + max(heightBST(tree->left),
heightBST(tree->right)));
}
}
“ht” of leaf nodes are 0. Thus every new node has ht=0
//Algorithm
Nptr insertAVL(Nptr tree, int data) {
1. insert similar to BST
2. This insertion might have caused change in height of
ancestors of new node, thus update “height” of ancestors
3. Check balance factor of every ancestor as we move up in the
AVL Tree
4. if balance factor is greater than 1, test for left-left or left-right
case and fix accordingly
5. if balance factor is less than -1, test for right-right or right-
left case and fix accordingly
}
int getHt(Nptr tree) {
if (tree == NULL)
return -1;
else
return (tree->ht);
}
tree->right = Y;
X->left = tree;
tree->left = Y;
X->right = tree;
10 10 10
8 12 8 12 8 12
6 6 6
4 imbalance 7
node
Therefore, either a leaf or parent of leaf node is deleted
Let w is the node deleted.
Traverse bottom up in AVL tree up to root
42 51 62
1 2 4 7 8 13 15 18 25
27 29 45 46 48 53 55 60 64 70 90
1
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
1 12
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
1 8 12
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
1 2 8 12
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
- Node can not have 5 keys, hence split the node and create new root
with middle key 8
8
1 2 12 25
1 2 5 12 25
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
1 2 5 12 14 25
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
1 2 5 12 14 25 28
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
- Inserting 17 will result in node split. The middle key 17 is promoted to
parent
8 17
1 2 5 12 14 25 28
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
8 17
1 2 5 7 12 14 25 28
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
8 17
1 2 5 7 12 14 25 28 52
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
8 17
1 2 5 7 12 14 16 25 28 52
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
8 17
1 2 5 7 12 14 16 25 28 48 52
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
- Node will be split and middle key 48 will be promoted to parent
8 17 48
1 2 5 7 12 14 16 25 28 52 68
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
- Node will be split and middle key will be promoted to parent
3 8 17 48
1 2 5 7 12 14 16 25 28 52 68
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
3 8 17 48
1 2 5 7 12 14 16 25 26 28 52 68
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
3 8 17 48
1 2 5 7 12 14 16 25 26 28 29 52 68
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
3 8 17 48
1 2 5 7 12 14 16 25 26 28 29 52 53 68
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
3 8 17 48
1 2 5 7 12 14 16 25 26 28 29 52 53 55 68
keys: 1 12 8 2 25 5 14 28 17 7 52 16 48 68 3 26 29 53 55 45
- As 45 can not be accommodated in leaf node, we will split the node
and promote middle key 28 to parent
- Further parent node also needs to be split
- As the parent is also root node, will increase the height of B-tree
17
3 8 28 48
1 2 5 7 12 14 16 25 26 29 45 52 53 55 68
1. Attempt to insert the new key into a leaf
2. If this could result in too many keys in leaf node,
split the leaf node and promote the key upward to
parent node
3. Repeat step 2 all the way to the top
4. If necessary, the root is split in two and the middle
key is promoted to a new root, making the tree one
level higher
Finding the correct location:
In worst case, we might require to compare input “key” with all key
present in one node i.e. m-1 comparison. But m is a constant, thus
time required is O(1)
Total number of levels to be traversed is O(h) or O(log n)
13
5 22
3 18 25 28
8 10
14 15 20 21
1 2 4 6 7 9 11 12 23 24 26 29 30
Delete key in given order: 21, 25, 20, 23, 18
When deletion is done from leaf node, if node satisfies
minimum key requirement, do nothing
13
5 22
3 18 25 28
8 10
14 15 20
1 2 4 6 7 9 11 12 23 24 26 29 30
(CONT’D)
Delete key in given order: 21, 25, 20, 23, 18
When deletion is done from internal node, swap it with
predecessor and delete predecessor
13
5 22
3 18 24 28
8 10
14 15 20
1 2 4 6 7 9 11 12 23 26 29 30
(CONT’D)
Delete key in given order: 21, 25, 20, 23, 18
If deleting a key violates minimum key requirement, borrow
key from adjacent sibling(s)
13
5 22
3 15 24 28
8 10
14 18
1 2 4 6 7 9 11 12 23 26 29 30
(CONT’D)
Delete key in given order: 21, 25, 20, 23, 18
If adjacent sibling(s) can not lend key, merge siblings and
bring down the key in parent node separating them
13
5 22
3 15 28
8 10
14 18
1 2 4 6 7 9 11 12 24 26 29 30
(CONT’D)
Delete key in given order: 21, 25, 20, 23, 18
Moving down key from parent node might result in parent node
violating minimum key requirement. Perform deletion algo at
parent.
13
5 22
3 28
8 10
14 15
1 2 4 6 7 9 11 12 24 26 29 30
(CONT’D)
Delete key in given order: 21, 25, 20, 23, 18
Moving down key from parent node might result in parent node
violating minimum key requirement. Perform deletion algo at
parent.
13
3 22 28
8 10
14 15
1 2 4 6 7 9 11 12 24 26 29 30
(CONT’D)
Delete key in given order: 21, 25, 20, 23, 18
Moving down key from parent node might result in parent node
violating minimum key requirement. Perform deletion algo at
parent.
5 13
3 8 10 22 28
1 2 4 6 7 9 11 12 14 15
24 26 29 30
1. If deletion is from leaf node, delete key and check if
minimum key requirement is satisfied or not. If not then,
1. Try to borrow key from left sibling if left sibling has extra key
2. Try to borrow key from right sibling if right sibling has extra key
3. If not possible, merge left(right if left is null) sibling and key in
parent node separating them
2. If deletion is to be done from internal node, replace with
predecessor and delete predecessor key if possible. Else
replace with successor and delete successor key. Then
repeat step 1 for deletion from leaf
3. Keep repeating steps for ancestors of node where deletion
happened.
Deleting the key from B-tree : O(log n)
Borrowing key from adjacent sibling : O(1)
Merging of siblings and bringing it down : O(1)
In worst case, borrowing and merging might be done for all the
nodes up to root. Therefore, might take O(log n)
Thus, time complexity of deletion : O(log n)
Given 5-way B-tree created by these data (last exercise):
3, 7, 9, 23, 45, 1, 5, 14, 25, 24, 13, 11, 8, 19, 4, 31, 35, 56
5 4 6 9 1
insert(1)
Minimum and delete minimum can be performed in O(1) time.
However, insertion will take O(n) time
1 4 5 6 9
insert(2)
A complete binary tree where every node is greater than or
equal to its respective parent.
Because it is complete binary tree, all leaves at last level are
filled from left most side.
The minimum element is at root
Binary heap is also called min-heap(minimum element at root) .
Binary heap with maximum element at root is called max-heap.
Efficiently represented using arrays.
Definition of a Max Heap
Definition:
A max (min) tree is a tree in which the key value in each
node is no smaller (larger) than the key values in its
children (if any)
A max heap is a complete binary tree
that is also a max tree
h = floor(log2 n)
11 17 13 18 21 19 17 43 23 26 29 31
0 1 2 3 4 5 6 7 8 9 10 11
11
17 13
18 21 19 17
43 23 26 29 31
11
17 13
18 21 19 17
43 23 26 29 31
Insert 12
11
17 13
18 21 19 17
43 23 26 29 31 12
Insert 12
Compare 12 with its parent 19. As 19>12, swap values
11
17 13
18 21 19 17
43 23 26 29 31 12
Insert 12
Compare 12 with its parent 19. As 19>12, swap values
11
17 13
18 21 12 17
43 23 26 29 31 19
Insert 12
Compare 12 with its parent 19. As 19>12, swap values
Compare 12 with its parent 13. As 13>12, swap values
11
17 13
18 21 12 17
43 23 26 29 31 19
Insert 12
Compare 12 with its parent 19. As 19>12, swap values
Compare 12 with its parent 13. As 13>12, swap values
11
17 12
18 21 13 17
43 23 26 29 31 19
Insert 12
Compare 12 with its parent 19. As 19>12, swap values
Compare 12 with its parent 13. As 13>12, swap values
As 11<12, stop.
11
17 12
18 21 13 17
43 23 26 29 31 19
insert 12
insert 8
11
17 12
18 21 13 17
43 23 26 29 31 19
insert 12
insert 8
11
17 12
18 21 13 17
43 23 26 29 31 19 8
insert 12
insert 8
11
17 12
18 21 13 8
43 23 26 29 31 19 17
insert 12
insert 8
11
17 8
18 21 13 12
43 23 26 29 31 19 17
insert 12
insert 8
17 11
18 21 13 12
43 23 26 29 31 19 17
1. Insert node at last level at leftmost available position.
2. Subsequently compare newly inserted node with its parent. If
parent is greater than new node, swap two values.
3. Repeat step 2 for the parent upto root.
Adding a node at right place: O(1)
Swapping two values: O(1)
Number of swap operation: O(log2 n) (height of heap)
Total time complexity = O(log2 n)
As minimum element sits at root, finding minimum will take
constant O(1) time.
To delete minimum, we must reorganize tree after deleting the
root.
To reorganize, we “Heapify” the heap.
Precondition to Heapify:
Let i be the index in array A
Left and right sub-tree of tree rooted at A[i] are heaps
A[i] might be violating the heap property i.e. A[i] might be greater
than its children
Method Heapify make binary tree rooted at index i a heap by
moving A[i] down in the heap
Heapify(A, i) { //i is index where heapification to be done
while(true) {
if (A[i] > left(i) or A[i] > right(i)) {
then swap A[i] with min(left(i), right(i))
if swapped with left
then i= index of left
else if swapped with right
then i= index of right
}
else
break
}
}
Index i=0 is not heap. Left and right sub-tree of i are heap
17
17 11
18 21 13 12
43 23 26 29 31 19
Index i=0 is not heap. Left and right sub-tree of i are heap
Swap min(17, 11) with A[0]
17
17 11
18 21 13 12
43 23 26 29 31 19
Index i=0 is not heap. Left and right sub-tree of i are heap
Swap A[0] with min(17, 11)
11
17 17
18 21 13 12
43 23 26 29 31 19
Index i=0 is not heap. Left and right sub-tree of i are heap
Swap A[0] with min(17, 11)
Index i=2 is not heap. Swap A[2] with min(13, 12)
11
17 17
18 21 13 12
43 23 26 29 31 19
Index i=0 is not heap. Left and right sub-tree of i are heap
Swap A[0] with min(17, 11)
Index i=2 is not heap. Swap A[2] with min(13, 12)
Index i=6 is heap. Break
11
17 12
18 21 13 17
43 23 26 29 31 19
Index i=0 is not heap. Left and right sub-tree of i are heap
Swap A[0] with min(17, 11)
Index i=2 is not heap. Swap A[2] with min(13, 12)
11
17 12
18 21 13 17
43 23 26 29 31 19
1. Delete the minimum node from root
2. Replace root with rightmost node at last level
3. Heapify root i.e. Heapify(a, 0)
Deleting the root and replacing with rightmost node at last
level: O(1)
Performing heapification once: O(1)
In worst case, heapify will run equal to height of heap times.
Thus, O(log2 n) times
Thus, total time complexity O(log2 n)
If we are given an array of n elements, we can build heap in
O(n).
Build-Heap(A,n)
for i=floor(n/2)-1 to 0
run Heapify(A,i)
Heapify (A,6)
23
43 26
10 21 13 31
16 12 8 29 11 19 17
Heapify(A,6)
Heapify(A,5)
23
43 26
10 21 13 17
16 12 8 29 11 19 31
Heapify(A,6)
Heapify(A,5)
Heapify(A,4)
23
43 26
10 21 11 17
16 12 8 29 13 19 31
Heapify(A,6)
Heapify(A,5)
Heapify(A,4)
Heapify(A,3)
23
43 26
10 8 11 17
16 12 21 29 13 19 31
Heapify(A,6)
Heapify(A,5)
Heapify(A,4)
Heapify(A,3)
Heapify(A,2) 23
43 26
10 8 11 17
16 12 21 29 13 19 31
Heapify(A,6)
Heapify(A,5)
Heapify(A,4)
Heapify(A,3)
Heapify(A,2) 23
Heapify(A,1)
43 11
10 8 13 17
16 12 21 29 26 19 31
Heapify(A,6)
Heapify(A,5)
Heapify(A,4)
Heapify(A,3)
Heapify(A,2) 23
Heapify(A,1)
Heapify(A,0) 8 11
10 21 13 17
16 12 43 29 26 19 31
Heapify(A,6)
Heapify(A,5)
Heapify(A,4)
Heapify(A,3)
Heapify(A,2) 8
Heapify(A,1)
Heapify(A,0) 10 11
12 21 13 17
16 23 43 29 26 19 31
For (n+1)/2 nodes, heapify() requires at most 1 swap
For (n+1)/4 nodes, heapify() requires at most 2 swap
For (n+1)/8 nodes, heapify() requires at most 3 swap
For (n+1)/2i nodes, heapify() requires at most i swap