Data Structure Lecture 7 Tree
Tree
A tree is mainly used to represent data containing a hierarchical relationship between elements, e.g. records and family trees.
A node that has a child is called the child's parent node (or ancestor node, or superior). A node has at most one parent. The topmost node in a tree is called the root node.
A node that has no children is called a leaf. Every node at the bottommost level of the tree is a leaf, although leaves may also occur at higher levels.
The height of a node is the length of the longest path to a leaf from that node. The height of the root is the height of the tree.
Counting the nodes along the path, an empty tree has a height of zero and a tree with a single node has a height of 1.
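The height definition can be sketched in Python as follows. This is a minimal illustration, not code from the lecture: the names TreeNode and height are assumptions, and the function counts nodes along the longest path so that an empty tree has height 0 and a one-node tree has height 1, matching the convention stated above.

```python
class TreeNode:
    """A general tree node (illustrative name); a node may have any number of children."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []


def height(node):
    """Number of nodes on the longest path from node down to a leaf."""
    if node is None:
        return 0  # empty tree
    # A leaf has no children, so max() falls back to 0 and the result is 1.
    return 1 + max((height(child) for child in node.children), default=0)


root = TreeNode("a", [TreeNode("b", [TreeNode("d")]), TreeNode("c")])
print(height(None), height(TreeNode("x")), height(root))  # 0 1 3
```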
According to the definition of trees, a node can have any number of children.
Binary tree
A binary tree is composed of zero or more nodes. Each node contains:
A value (some sort of data item)
A reference or pointer to a left child (may be null)
A reference or pointer to a right child (may be null)
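As a minimal sketch, such a node could be written in Python like this. The class and field names are assumptions chosen for illustration; None plays the role of the null reference.

```python
class Node:
    """A binary tree node: a value plus optional left and right children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left    # None means "no left child"
        self.right = right  # None means "no right child"


leaf = Node("b")
root = Node("a", left=leaf)  # root has a left child only
print(root.value, root.left.value, root.right)  # a b None
```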
A binary tree may be empty (contain no nodes). If not empty, a binary tree has a root node.
Every node in the binary tree is reachable from the root node by a unique path
A node with neither a left child nor a right child is called a leaf
In some binary trees, only the leaves contain a value
[Figure: example binary tree with nodes a through l, rooted at a]
a is at depth zero
e is at depth 2
A complete binary tree is one in which every level is full except possibly the last, and the last level is filled from left to right.
A full binary tree is one where if a node has a child, then it has two children.
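The two definitions can be checked mechanically. The sketch below is an illustration only; Node, is_full, and is_complete are assumed names. The completeness test uses a level-order scan: once a missing child has been seen, no further node may appear.

```python
from collections import deque


class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def is_full(node):
    """True if every node has either zero or two children."""
    if node is None:
        return True
    if (node.left is None) != (node.right is None):
        return False  # exactly one child
    return is_full(node.left) and is_full(node.right)


def is_complete(root):
    """True if all levels are full except possibly the last, filled left to right."""
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        elif seen_gap:
            return False  # a node appears after a gap in level order
        else:
            queue.append(node.left)
            queue.append(node.right)
    return True


t1 = Node(1, Node(2, Node(4), Node(5)), Node(3))
t2 = Node(1, Node(2), Node(3, Node(6), Node(7)))
t3 = Node(1, Node(2, Node(4)), Node(3))
print(is_full(t1), is_complete(t1))  # True True
print(is_full(t2), is_complete(t2))  # True False
print(is_full(t3), is_complete(t3))  # False True
```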
Balance
[Figure: two example binary trees illustrating balance]
A binary tree is balanced if every level above the lowest is full (level n contains 2^n nodes, with the root at level 0). In most applications, a reasonably balanced binary tree is desirable.
Inorder Traversal:
1. Traverse left subtree
2. Visit the root
3. Traverse right subtree
Postorder Traversal:
1. Traverse left subtree
2. Traverse right subtree
3. Visit the root
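The two traversal orders above translate almost line for line into recursive functions. This is a minimal sketch; the Node class and the small example tree are assumptions made for illustration.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder(node):
    """Left subtree, then the root, then the right subtree."""
    if node is None:
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)


def postorder(node):
    """Left subtree, then the right subtree, then the root."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.value]


root = Node("a", Node("b", Node("d"), Node("e")), Node("c"))
print(inorder(root))    # ['d', 'b', 'e', 'a', 'c']
print(postorder(root))  # ['d', 'e', 'b', 'c', 'a']
```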
Example
[a+(b-c)]*[(d-e)/(f+g-h)]
An inorder traversal (left, root, right) of an expression tree yields the infix expression, e.g. a+b*c+d*e+f*g
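To illustrate, the sketch below builds an expression tree for [a+(b-c)]*[(d-e)/(f+g-h)] and reads it back with an inorder traversal. The tree shape is an assumption (the original figure is not available); it groups f+g-h as (f+g)-h. A plain inorder traversal loses the grouping, so a second function adds parentheses around each operator node.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder(node):
    """Plain inorder traversal: left, root, right."""
    if node is None:
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)


def infix(node):
    """Inorder traversal that parenthesizes every operator node."""
    if node.left is None and node.right is None:
        return node.value  # operand (leaf)
    return "(" + infix(node.left) + node.value + infix(node.right) + ")"


# Assumed tree for [a+(b-c)]*[(d-e)/(f+g-h)]
left = Node("+", Node("a"), Node("-", Node("b"), Node("c")))
right = Node("/",
             Node("-", Node("d"), Node("e")),
             Node("-", Node("+", Node("f"), Node("g")), Node("h")))
expr = Node("*", left, right)

print("".join(inorder(expr)))  # a+b-c*d-e/f+g-h
print(infix(expr))             # ((a+(b-c))*((d-e)/((f+g)-h)))
```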
The basic idea behind a binary search tree (BST) is to provide a storage structure that supports efficient sorting, searching, and retrieval of data.
In a BST whose root holds the key 10, all nodes in the left subtree have keys < 10, while all nodes in the right subtree have keys > 10. Because both the left and right subtrees of a BST are again search trees, this definition applies recursively to all internal nodes.
Searching BST
Searching in a BST always starts at the root. We compare the key stored at the root with the key we are searching for.
For example, suppose the root holds 15. If we are searching for 15, then we are done. If we are searching for a key < 15, then we search in the left subtree. If we are searching for a key > 15, then we search in the right subtree.
If the node does not contain the key, we proceed to either the left or the right child depending on the comparison: if the search key is smaller than the node's key we go to the left child, otherwise to the right child. The recursive structure of a BST yields a recursive algorithm.
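A minimal recursive sketch of this search is shown below. The Node class, the function name, and every key in the example tree other than the root 15 are assumptions for illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def search(node, key):
    """Return the node holding key, or None if key is not in the tree."""
    if node is None:
        return None                     # fell off the tree: not found
    if key == node.key:
        return node                     # found
    if key < node.key:
        return search(node.left, key)   # smaller keys live on the left
    return search(node.right, key)      # larger keys live on the right


root = Node(15, Node(10, Node(5), Node(12)), Node(20))
print(search(root, 12).key)  # 12
print(search(root, 7))       # None
```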
Exercise.
Given the sequence of numbers 11, 6, 8, 19, 4, 10, 5, 17, 43, 49, 31, draw a binary search tree by inserting the numbers from left to right.
Insertion
The insertion procedure is quite similar to searching. We start at the root and recursively go down the tree searching for a location in a BST to insert a new node. If the element to be inserted is already in the tree, we are done (we do not insert duplicates). The new node will always replace a NULL reference.
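The insertion procedure can be sketched as follows. The names are illustrative assumptions. The example feeds in the sequence from the exercise and prints only the inorder traversal, which comes out sorted for any BST; it does not draw the tree.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(node, key):
    """Insert key into the subtree rooted at node and return the subtree's root."""
    if node is None:
        return Node(key)                 # the new node replaces a null reference
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    # key == node.key: already present, duplicates are not inserted
    return node


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


root = None
for k in [11, 6, 8, 19, 4, 10, 5, 17, 43, 49, 31]:
    root = insert(root, k)
root = insert(root, 8)  # duplicate: the tree is unchanged

print(inorder(root))  # [4, 5, 6, 8, 10, 11, 17, 19, 31, 43, 49]
```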
[Figure: BST insertion example, inserting 7]
Deletion
Deletion is somewhat trickier than insertion. There are several cases to consider. The node to be deleted (let us call it Delete) may not be in the tree, may be a leaf, may have only one child, or may have two children. If Delete is not in the tree, there is nothing to delete. If Delete is a leaf, it is simply removed. If Delete has only one child, the procedure is identical to deleting a node from a linked list: the child takes the place of the deleted node.
[Figure: BST deletion example, deleting 9]
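If the node to delete has two children, find its inorder successor (the smallest key in its right subtree), copy that key into the node, and then delete the successor from the right subtree.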
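The deletion cases can be sketched in one recursive function. This is an illustration under assumptions: the names and the example keys are hypothetical, and the two-child case uses the inorder successor as described above.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def delete(node, key):
    """Delete key from the subtree rooted at node and return the subtree's root."""
    if node is None:
        return None                          # key is not in the tree
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    else:
        if node.left is None:
            return node.right                # leaf, or only a right child
        if node.right is None:
            return node.left                 # only a left child
        succ = node.right                    # two children: find inorder successor
        while succ.left is not None:
            succ = succ.left
        node.key = succ.key                  # copy the successor's key up
        node.right = delete(node.right, succ.key)
    return node


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


root = Node(9, Node(4, Node(2), Node(7)), Node(15, Node(12), Node(20)))
root = delete(root, 9)    # two children: replaced by successor 12
print(root.key, inorder(root))  # 12 [2, 4, 7, 12, 15, 20]
root = delete(root, 2)    # leaf
root = delete(root, 4)    # one child (7) takes its place
root = delete(root, 99)   # not in the tree: nothing happens
print(inorder(root))      # [7, 12, 15, 20]
```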
B Tree
A node of a B-tree may contain many records or keys, together with pointers to its children. A B-tree is also known as a balanced sort tree. To reduce disk accesses, several conditions must hold. The height of the tree must be kept to a minimum.
DEFINITION: A B-tree of order m is an m-way search tree in which the root has at most m children, but may have as few as 2 if it is not a leaf, or none if the tree consists of the root alone.
There must be no empty subtrees above the leaves of the tree. All leaves are on the same level. All nodes except the leaves must have at least some minimum number of children. A B-tree of order m has the following properties: each node has a maximum of m children and a minimum of ⌈m/2⌉ children, except the root, which may have any number from 2 to the maximum.
Each node has one fewer key than it has children, with a maximum of m-1 keys. Keys are arranged in a defined order within the node. When a new key is to be inserted into a full node, the node splits into 2 nodes and the key with the median value is inserted into the parent node. If the node that splits is the root, a new root is created.
Operations
B-Tree of order 4
Each node has at most 4 pointers and 3 keys, and at least 2 pointers and 1 key.
Insert 5, 3, 21
a: *5*
a: *3*5*
a: *3*5*21*
Insert 9
a: *9*
b: *3*5*   c: *21*
Insert 1, 13
a: *9*
b: *1*3*5*   c: *13*21*
Insert 2
a: *3*9*
b: *1*2*   d: *5*   c: *13*21*
Insert 7, 10
a: *3*9*
b: *1*2*   d: *5*7*   c: *10*13*21*
Insert 12
a: *3*9*13*
b: *1*2*   d: *5*7*   c: *10*12*   e: *21*
Insert 4
a: *3*9*13*
b: *1*2*   d: *4*5*7*   c: *10*12*   e: *21*
Insert 8
f: *9*
a: *3*7*   g: *13*
b: *1*2*   d: *4*5*   h: *8*   c: *10*12*   e: *21*
Node d must split into 2 nodes. This causes node a to split into 2 nodes and the tree grows a level.
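The insertions above can be replayed with a short sketch of B-tree insertion for order 4. This is an illustration under assumptions, not the lecture's code: all names are hypothetical, a key is first placed in its leaf and an overfull node is split afterwards, and the promoted key is taken at index len(keys) // 2, which is the choice that matches the splits shown in the walkthrough. Deletion is not covered.

```python
import bisect

ORDER = 4  # at most 4 children and 3 keys per node


class BNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty list for a leaf


def _split(node):
    """Split an overfull node; return (median key, new right sibling)."""
    mid = len(node.keys) // 2
    median = node.keys[mid]
    right = BNode(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right


def _insert(node, key):
    """Insert key below node; return (median, right) if node had to split."""
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return None                       # assumption: duplicates are ignored
    if not node.children:                 # leaf: place the key in order
        node.keys.insert(i, key)
    else:
        promoted = _insert(node.children[i], key)
        if promoted is not None:          # child split: absorb its median
            median, right = promoted
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
    if len(node.keys) > ORDER - 1:
        return _split(node)
    return None


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    promoted = _insert(root, key)
    if promoted is not None:              # the root split: the tree grows a level
        median, right = promoted
        return BNode([median], [root, right])
    return root


def levels(root):
    """Return the keys of every node, level by level, for inspection."""
    out, level = [], [root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level for c in n.children]
    return out


root = BNode()
for k in [5, 3, 21, 9, 1, 13, 2, 7, 10, 12, 4, 8]:
    root = insert(root, k)

for row in levels(root):
    print(row)
# [[9]]
# [[3, 7], [13]]
# [[1, 2], [4, 5], [8], [10, 12], [21]]
```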
Delete 2
f: *9*
a: *3*7*   g: *13*
b: *1*   d: *4*5*   h: *8*   c: *10*12*   e: *21*
Delete 21
f: *9*
a: *3*7*   g: *12*
b: *1*   d: *4*5*   h: *8*   c: *10*   e: *13*
Deleting 21 causes node e to underflow, so elements are redistributed between nodes c, g, and e.
Delete 10
a: *3*7*9*
b: *1*   d: *4*5*   h: *8*   e: *12*13*
Deleting 10 causes node c to underflow. This causes its parent, node g, to recombine with nodes f and a, so the tree shrinks by one level.
Delete 3
a: *4*7*9*
b: *1*   d: *5*   h: *8*   e: *12*13*
Because 3 is a key in an internal node, with subtrees below it, deleting 3 requires keys to be redistributed between nodes a and d.
Delete 4
a: *7*9*
b: *1*5*   h: *8*   e: *12*13*
Deleting 4 requires a redistribution of the keys in the subtrees of 4; however, nodes b and d do not have enough keys to redistribute without causing an underflow. Thus, nodes b and d must be combined.