
Tree Part2

The document discusses M-way search trees, particularly focusing on B-trees and B+ trees, which are specialized data structures designed for efficient data retrieval and manipulation in external memory. It outlines the properties, benefits, and applications of these trees, along with the concept of threaded binary trees that enhance traversal efficiency. Additionally, it covers the complexities and advantages of using threaded binary trees in various applications.

Tree-part2

Why we need M-way search tree


Data stored in external memory such as a hard disk needs a special data structure for fast retrieval and manipulation.
In an M-way search tree, file accesses are minimized because the height of the tree is restricted, so it is essential to keep the height of the tree as low as possible.
Because each node holds several keys, an M-way search tree is shorter than a binary tree over the same keys and requires less traversal.
M-way search tree (Multi-way )
Properties of M-way search tree of order m:

An M-way search tree is a generalized version of a BST. A tree in which each node has at most m-1 keys and m children is known as an m-way search tree.
A binary search tree is a 2-way search tree, i.e. each node has (m-1) = (2-1) = 1 key and at most 2 children.
A binary search tree is also known as an m-way search tree of order 2.
An m-way search tree of height h needs O(h) node accesses for an insert, delete or retrieval operation.
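As an illustration of the O(h) access cost, here is a minimal Python sketch of searching an m-way search tree. The class name `MWayNode`, the function `search` and the sample keys are assumptions made for this example only; they do not come from the slides.

```python
from bisect import bisect_left


class MWayNode:
    """Hypothetical node: k sorted keys and k + 1 child slots (k <= m - 1)."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or [None] * (len(keys) + 1)


def search(node, key):
    """Return (found, nodes_visited); one node is visited per level."""
    visited = 0
    while node is not None:
        visited += 1
        i = bisect_left(node.keys, key)      # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited
        node = node.children[i]              # descend into the i-th subtree
    return False, visited


# A small 3-way search tree: at most 2 keys and 3 children per node.
root = MWayNode([20, 40], [
    MWayNode([5, 10]),
    MWayNode([30]),
    MWayNode([50, 60]),
])

print(search(root, 30))  # (True, 2)
print(search(root, 45))  # (False, 2)
```

Each loop iteration touches exactly one node, so the number of node accesses never exceeds the height of the tree.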

3-way search tree


M-way tree
Benefits:
• Fast information retrieval
• Fast update

Problems:
• The tree is not balanced
• Leaf nodes are on different levels
• Poor space usage, since the tree can become skewed.
B tree
• A B-tree is a specialized m-way search tree that is widely used for disk access. A node of a B-tree of order m can have at most m-1 keys and m children. One of the main reasons for using a B-tree is its ability to store a large number of keys in a single node, which keeps the height of the tree relatively small.
• A B-tree of order m has all the properties of an m-way search tree. In addition, it has the following properties.
• Every node in a B-tree has at most m children.
• Every node except the root and the leaves has at least ⌈m/2⌉ children.
• The root has at least 2 children unless it is a leaf.
• All leaf nodes are at the same level.
• Nodes need not all have the same number of children, but every internal node other than the root must have at least ⌈m/2⌉ children.
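The limits implied by these properties can be tabulated for any order. The sketch below assumes the common convention that the minimum number of children is m/2 rounded up; the function name `btree_limits` is made up for this example.

```python
from math import ceil


def btree_limits(m):
    """Per-node limits for a B-tree of order m.

    The minimums apply to non-root internal nodes; rounding m/2 up
    is an assumed (but widely used) convention.
    """
    return {
        "max_children": m,
        "max_keys": m - 1,
        "min_children": ceil(m / 2),
        "min_keys": ceil(m / 2) - 1,
    }


print(btree_limits(4))
# {'max_children': 4, 'max_keys': 3, 'min_children': 2, 'min_keys': 1}
print(btree_limits(5))
# {'max_children': 5, 'max_keys': 4, 'min_children': 3, 'min_keys': 2}
```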
B tree of order 4 is shown in the
following image
B-Tree (Balance m way search tree)
Example
Construct B-tree of order 5
10, 20, 50, 60, 40, 80, 100, 70, 130, 90, 30, 120, 140, 25, 35, 160, 180

Remember: in a tree of order m, a node has at most m-1 keys and m children.
Examples
• Construct a B-tree for each key sequence below, using the order m given in parentheses
1. Z, U, A, I, W, L, P, X, C, J, D, M, T, B, Q, E, H, S, K (m=4)
2. M, Q, A, N, P, W, X, T, G, E, J (m=3)
3. 20, 15, 10, 5, 8, 30, 1, 40 (m=3)
4. 2, 5, 10, 1, 6, 9, 4, 3, 12, 18, 20, 25 (m=4)
5. 78, 21, 14, 11, 97, 85, 74, 63, 45, 42, 57, 20, 16, 19, 32, 30, 31 (m=5)
B Tree - Application
• A B-tree is used to index data and provides fast access to the actual data stored on disk, since accessing a value in a large database stored on disk is a very time-consuming process.
• Searching an un-indexed and unsorted database containing n key values needs O(n) running time in the worst case. However, if we use a B-tree to index this database, it can be searched in O(log n) time in the worst case.
Extended Binary tree
• An extended binary tree is formed by replacing every null subtree of the original tree with a special node.
• An empty circle represents an internal node and a filled circle represents an external node.
• The nodes from the original tree are internal nodes and the special nodes are external nodes.
• Every internal node in the extended binary tree has exactly two children and every external node is a leaf. The result is a full (strictly) binary tree.
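A short Python sketch can make the construction concrete by counting the nodes of the extended tree without building it: every null subtree counts as one external node. The names `Node` and `count_nodes` and the sample tree are assumptions for this example.

```python
class Node:
    """Plain binary tree node (hypothetical helper for this sketch)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def count_nodes(node):
    """Return (internal, external) node counts of the extended tree."""
    if node is None:
        return 0, 1                      # a null subtree becomes one external node
    left_int, left_ext = count_nodes(node.left)
    right_int, right_ext = count_nodes(node.right)
    return left_int + right_int + 1, left_ext + right_ext


#       1
#      / \
#     2   3
#    /
#   4
tree = Node(1, Node(2, Node(4)), Node(3))
print(count_nodes(tree))  # (4, 5)
```

A tree with n original nodes always gets n + 1 external nodes, one per null link.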
Extended binary tree
Huffman coding
• Huffman coding is a lossless data compression algorithm.
• Huffman coding assigns codes to characters such that the length of each code depends on the relative frequency or weight of the corresponding character. Huffman codes are variable-length and prefix-free (that is, no code is a prefix of any other). Any prefix-free binary code can be visualized as a binary tree with the encoded characters stored at the leaves.

• A Huffman tree, or Huffman coding tree, is defined as a full binary tree in which each leaf corresponds to a letter in the given alphabet.

• The Huffman tree is the binary tree with minimum external path weight, that is, the one with the minimum sum of weighted path lengths for the given set of leaves. So the goal is to construct a tree with the minimum external path weight.
An example is given below.
• Letter frequency table

Letter      z    k    m    c    u    d    l    e
Frequency   2    7    24   32   37   42   42   120

Huffman code

Letter   Freq   Code     Bits
e        120    0        1
d        42     101      3
l        42     110      3
u        37     100      3
c        32     1110     4
m        24     11111    5
k        7      111101   6
z        2      111100   6
Algorithm for creating the Huffman Tree
• Step 1- Create a leaf node for each character and build a min heap using all the nodes (the frequency value is used to compare two nodes in the min heap)

• Step 2- Repeat Steps 3 to 5 while the heap has more than one node

• Step 3- Extract the two nodes with minimum frequency, say x and y, from the heap

• Step 4- Create a new internal node z with x as its left child and y as its right child. Also frequency(z) = frequency(x) + frequency(y)

• Step 5- Add z to the min heap

• Step 6- The last node remaining in the heap is the root of the Huffman tree
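The steps above can be sketched in Python with the standard `heapq` module. This is a minimal illustration rather than a reference implementation: the function name `huffman_codes`, the tuple representation of internal nodes and the tie-breaking counter are assumptions of this sketch. When two frequencies are equal the exact bit patterns can differ from a hand-built table, but the code lengths stay the same.

```python
import heapq
from itertools import count


def huffman_codes(freqs):
    """Build a Huffman tree from {symbol: frequency} and return {symbol: code}."""
    order = count()  # tie-breaker so the heap never has to compare tree nodes
    # Step 1: one leaf per symbol, all placed in a min heap keyed on frequency.
    heap = [(f, next(order), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    # Steps 2-5: merge the two lightest nodes until one node is left.
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)
        fy, _, y = heapq.heappop(heap)
        heapq.heappush(heap, (fx + fy, next(order), (x, y)))  # (left, right)
    # Step 6: the remaining node is the root.
    _, _, root = heap[0]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: 0 = left, 1 = right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: a symbol
            codes[node] = prefix or "0"      # single-symbol alphabet gets "0"

    walk(root, "")
    return codes


freqs = {"z": 2, "k": 7, "m": 24, "c": 32, "u": 37, "d": 42, "l": 42, "e": 120}
codes = huffman_codes(freqs)
lengths = {sym: len(code) for sym, code in codes.items()}
total_bits = sum(freqs[sym] * lengths[sym] for sym in freqs)
print(total_bits)  # 785
```

For the letter frequency table above, the resulting code lengths are 1 bit for e, 3 bits for d, l and u, 4 for c, 5 for m and 6 for k and z, which matches the Bits column.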


B+ Tree
• A B+ tree is an m-way search tree in which internal nodes act as an index and leaf nodes hold the data.
• Properties of a B+ tree of order m:
1. Each node has at most m children.
2. Each node has at most m-1 keys.
3. An internal node with k children has k-1 keys (2 <= k <= m).
4. All leaf nodes are at the same level.
5. Every node except the root contains at least ⌊(m-1)/2⌋ keys.
6. The root may have fewer than ⌊(m-1)/2⌋ keys, but at least one key.
7. The root has at least two children (if it is not a leaf).
8. Keys within a node are arranged in non-descending order (k1 <= k2 <= k3 <= … <= k(m-1)).
9. Leaf nodes are linked to each other.
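Difference between B-Tree and B+Tree
• In a B-tree, data is stored in the internal nodes as well as the leaf nodes. In a B+ tree, data is stored in the leaf nodes only.
• Searching can be slower in a B-tree because data is spread across all levels; in a B+ tree every search ends at a leaf, and the linked leaves make sequential and range searches fast.
• Deletion is complex in a B-tree because a key may have to be removed from an internal node; in a B+ tree deletion is simpler because data is removed only from the leaves.
• In a B+ tree all leaf nodes are connected together like a linked list.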
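The linked leaves are what make range queries cheap. The Python sketch below models only the leaf level of a B+ tree; the class `Leaf`, the helpers `link_leaves` and `range_scan`, and the way the keys are split across leaves are all assumptions for illustration, not the layout a real insertion sequence would produce. A real B+ tree would first descend the index to find the starting leaf; here the scan simply starts at the leftmost leaf.

```python
class Leaf:
    """Hypothetical B+ tree leaf: sorted keys plus a link to the next leaf."""

    def __init__(self, keys):
        self.keys = keys
        self.next = None


def link_leaves(leaves):
    """Chain the leaves left to right and return the first one."""
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    return leaves[0]


def range_scan(first_leaf, lo, hi):
    """Collect all keys k with lo <= k <= hi by walking the leaf chain."""
    result = []
    leaf = first_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return result            # keys are sorted, so we can stop early
            if k >= lo:
                result.append(k)
        leaf = leaf.next
    return result


first = link_leaves([Leaf([1, 2]), Leaf([3, 4]), Leaf([5, 10]),
                     Leaf([12, 13]), Leaf([14])])
print(range_scan(first, 3, 12))  # [3, 4, 5, 10, 12]
```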
Insertion in B+Tree
• Let us take an example where we insert the following keys in a B+ tree of order 3: 5, 10, 12, 14, 13, 1, 2, 3, 4

• Minimum children will be 2
• Minimum keys will be 1
• Maximum children will be 3
• Maximum keys will be 2
• Step 1 Insert 5: since key 5 is the first value to be inserted, it can simply be placed into a node.
• Step 2 Insert 10
• Step 3 Insert 13
• Step 4 Insert 12
Header nodes
• The header node does not contain any data. Its left link field points to the root node and its right link field points to itself. If a header node is included in a two-way threaded binary tree, it becomes the inorder predecessor of the first node and the inorder successor of the last node.
What do you mean by Threaded Binary Tree?
• In the linked representation of binary trees, more than half of the link fields contain NULL values, which wastes storage space. If a binary tree has n nodes, then n+1 of its 2n link fields contain NULL values. To use this space effectively, Perlis and Thornton devised a method in which the NULL links are replaced with special links known as threads. Binary trees with threads are known as threaded binary trees. Each link field of a node in a threaded binary tree contains either a link to a child node or a thread to another node in the tree.
Threaded Binary tree
• A threaded binary tree is an ordinary binary tree in which the null pointers of a node are set to point to its inorder predecessor or inorder successor.
• The main idea behind such a structure is to make inorder and preorder traversal of the tree faster without using any additional data structure (e.g. an auxiliary stack) or memory for the traversal.
Types of Threaded Binary Tree
There are two types of threaded binary tree:
• Single Threaded Binary Tree
• Double Threaded Binary Tree
Single Threaded Binary Tree: only the right NULL pointers are made to point to the inorder successor.
Double Threaded Binary Tree: the right NULL pointers are made to point to the inorder successor and the left NULL pointers to the inorder predecessor. (The left threads are helpful in reverse inorder traversal of the tree.)
Double threaded tree
Two way (Double threading) with Header Node
Structure of node in threaded binary tree
• A node in a threaded binary tree has two additional attributes:
• rightThread
• leftThread
• Both new attributes are of type boolean.
Significance of bool variable (leftThread and rightThread) in structure
• When a pointer field of a node holds an address, we use the leftThread and rightThread bool variables to differentiate whether that address is of a child node or of an ancestor node.
• The leftThread and rightThread bool variables record whether the left and right pointers point to a child node or to some ancestor node. If the bool variable is set to true, the pointer is pointing to a child node; if it is set to false, the pointer is a thread to an ancestor node.
• For example:
• Suppose the right pointer of some node points to another node. If rightThread is set to true, the pointer is pointing to a child; if rightThread is set to false, it is pointing to an ancestor (the inorder successor) and not to a child.
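To show how such a flag is used, here is a minimal Python sketch of stack-free inorder traversal over a single (right) threaded tree. The names `TNode`, `leftmost` and `inorder` are assumptions for this example. Note that the sketch uses a flag called `right_is_thread` that is True when the right pointer is a thread, which is the opposite sense of the rightThread flag described above; the threads are also wired by hand for brevity.

```python
class TNode:
    """Hypothetical node of a right-threaded binary tree."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.right_is_thread = False  # True: `right` points to the inorder successor


def leftmost(node):
    """Follow left child links as far as possible."""
    while node is not None and node.left is not None:
        node = node.left
    return node


def inorder(root):
    """Inorder traversal with no stack and no recursion."""
    out = []
    node = leftmost(root)
    while node is not None:
        out.append(node.key)
        if node.right_is_thread:
            node = node.right            # thread: jump straight to the successor
        else:
            node = leftmost(node.right)  # real child: successor is its leftmost node
    return out


#        4
#      /   \
#     2     6
#    / \   /
#   1   3 5
n = {k: TNode(k) for k in range(1, 7)}
n[4].left, n[4].right = n[2], n[6]
n[2].left, n[2].right = n[1], n[3]
n[6].left = n[5]
# Replace the NULL right links of 1, 3 and 5 with threads to their successors.
for node_key, successor_key in [(1, 2), (3, 4), (5, 6)]:
    n[node_key].right = n[successor_key]
    n[node_key].right_is_thread = True

print(inorder(n[4]))  # [1, 2, 3, 4, 5, 6]
```

Node 6 is the rightmost node, so its right link stays empty here; with a header or dummy node it would point there instead.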
What happens with rightmost and leftmost null nodes?
• When we create a threaded binary tree, the left pointer of the leftmost node has no inorder predecessor and the right pointer of the rightmost node has no inorder successor, so they are made to point to a dummy node, as you can see in the image, and the leftThread of the leftmost node and the rightThread of the rightmost node are set to false.
Applications of Threaded Binary Tree
• The idea of threaded binary trees is to make inorder traversal of the binary tree faster and to do it without using any extra space, so in small systems where hardware is very limited, a threaded binary tree can make the software more efficient within the limited memory.
Advantages of Threaded Binary Tree
• It enables linear traversal of the elements.
• It eliminates the use of a stack during traversal, which saves memory.
• It makes it possible to find the parent node without an explicit parent pointer.
• A threaded tree allows forward and backward traversal of the nodes in in-order fashion.
• Nodes contain pointers to their in-order predecessor and successor.
• For a given node, we can easily find the inorder predecessor and successor, so searching is much easier.
• In a threaded binary tree there are no NULL pointers, so the space occupied by NULL links is not wasted.
• The threads point to successor and predecessor nodes, which lets us obtain the predecessor and successor of any node quickly.
• There is no need for a stack while traversing the tree, because the thread links let us return to previously visited nodes.
Disadvantages of Threaded Binary Tree
• Every node in a threaded binary tree needs extra information (extra memory) to indicate whether its left or right pointer refers to a child node or to its inorder predecessor or successor, so each node consumes more memory.
• Insertion and deletion are more complex and time-consuming than in an ordinary binary tree, since both threads and ordinary links need to be maintained.
• Implementing threads for every possible node is complicated.
• Increased complexity: implementing a threaded binary tree requires more complex algorithms than a regular binary tree. This can make the code harder to read and debug.
• Extra memory usage: the thread flags stored in every node make a threaded tree use somewhat more memory than a regular binary tree with the same number of nodes.
• Limited flexibility: threaded binary trees are specialized data structures optimized for specific types of traversal. While they can be more efficient than regular binary trees for these operations, they may not be as useful in other scenarios. For example, inserting or deleting nodes requires extra work to keep the threading intact.
Applications of threaded binary tree
• Expression evaluation: Threaded binary trees can be used to evaluate
arithmetic expressions in a way that avoids recursion or a stack. The tree can
be constructed from the input expression, and then traversed in-order or
pre-order to perform the evaluation.
• Database indexing: In a database, threaded binary trees can be used to index
data based on a specific field (e.g. last name). The tree can be constructed
with the indexed values as keys, and then traversed in-order to retrieve the
data in sorted order.
• Disk-based data structures: Threaded binary trees can be used in disk-based
data structures (e.g. B-trees) to improve performance. By threading the tree,
it can be traversed in a way that minimizes disk seeks and improves locality of
reference.
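Time and space complexity for operations
• Time complexity for a balanced tree with n nodes (O(h) in general, where h is the height of the tree):
• for insertion: O(log n)
• for deletion: O(log n)
• for searching: O(log n)

• The time required to find the inorder predecessor or successor of a given node is O(1) when the corresponding pointer is a thread, provided we are already on that node; otherwise it takes at most O(h) steps down the subtree.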
