24-Multi-Level Indexing, Dynamic Multilevel Indexing, B-Tree-11-09-2024

Uploaded by

Hemesh R

B and B+ Trees

What is an M-ary Search Tree?
• Maximum branching factor of M
• A complete tree has depth = log_M N
• Each internal node in a complete tree has M - 1 keys
• A binary search tree is an M-ary search tree where M is 2
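As a rough check on the depth claim above, the sketch below counts how many levels a complete M-ary search tree needs to hold N keys, assuming every node holds M - 1 keys. The function name `min_levels` and the sample sizes are illustrative assumptions, not part of the original slides.

```python
def min_levels(n_keys: int, m: int) -> int:
    """Smallest number of levels h of a complete m-ary search tree
    (m - 1 keys per node) that can hold n_keys keys.

    A complete tree with h levels has (m**h - 1) / (m - 1) nodes,
    so it holds m**h - 1 keys in total.
    """
    levels = 0
    capacity = 0
    while capacity < n_keys:
        levels += 1
        capacity = m ** levels - 1
    return levels


# One million keys: a binary tree (M = 2) versus a wide node (M = 128).
print(min_levels(1_000_000, 2))    # 20
print(min_levels(1_000_000, 128))  # 3
```

The larger the branching factor, the fewer levels (and therefore node visits) a lookup needs.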
B-Trees
• A B-tree is a self-balancing search tree: an inorder traversal visits its keys in sorted order.
• In a B-tree, a node can have more than two children.
• A B-tree has a height of O(log_M N), where M is the order of the tree and N is the number of keys. The height is adjusted automatically at each update.
• Within each node the keys are kept sorted, with the lowest value on the left and the highest value on the right.
• Inserting a key into a B-tree is more complicated than inserting into a binary search tree.
• A B-tree must satisfy the following conditions:
– All leaf nodes must be at the same level.
– Above the leaf nodes there must be no empty sub-trees.
– The height of the tree should be kept as low as possible.
B Tree
• B-Trees are specialized M-ary search trees
• Each node has many keys
• The subtree between two keys x and y contains values v such that x ≤ v < y
• Binary search within a node finds the correct subtree
• Each node takes one full {page, block, line} of memory (disk)

[Figure: a node with keys 3, 7, 12, 21 and five subtrees covering x<3, 3<x<7, 7<x<12, 12<x<21, 21<x]

B-Tree Properties
• Properties
– maximum branching factor of M
– the root has between 2 and M children (unless it is a leaf), and so at most M-1 keys
– all other nodes have between ⌈M/2⌉ and M children
– nodes store keys together with their data

• Result
– the tree is O(log_M N) deep for N keys
– all operations visit O(log_M N) nodes
– operations pull in about M items at a time
Searching
• Searching in a B-tree is similar to searching in a binary search tree. For example, to search for the item 49 in a B-tree whose root key is 78, the process is as follows:
• Compare 49 with the root key 78. Since 49 < 78, move to its left sub-tree.
• Since 40 < 49 < 56, follow the sub-tree between 40 and 56 (the right sub-tree of 40).
• 49 > 45, so move right and compare with 49.
• Match found; return.
• The cost of a search depends on the height of the tree.
• The search algorithm takes O(log n) time to find any element in a B-tree.
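The search walk-through above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the `Node` class, the `search` function, and every key other than 78, 40, 56, 45 and 49 are assumptions made so the example is runnable.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list                                     # sorted keys in this node
    children: list = field(default_factory=list)   # empty for a leaf


def search(node, key):
    """Return the node that holds key, or None if the key is absent."""
    while node is not None:
        i = bisect_left(node.keys, key)            # binary search inside the node
        if i < len(node.keys) and node.keys[i] == key:
            return node
        if not node.children:                      # reached a leaf without a match
            return None
        node = node.children[i]                    # descend into the i-th subtree
    return None


# Hypothetical tree shaped like the walk-through: root 78, left child (40, 56).
left = Node([40, 56], [Node([35]), Node([45, 49]), Node([60])])
right = Node([85], [Node([80]), Node([90])])
root = Node([78], [left, right])

print(search(root, 49))   # the leaf holding 45 and 49
print(search(root, 50))   # None
```

Each loop iteration visits one node, so the number of node visits is bounded by the height of the tree.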
Inserting
• Insertions are done at the leaf level. To insert an item into a B-tree of order m, follow this algorithm:
• Traverse the B-tree to find the appropriate leaf node for the new key.
• If the leaf node contains fewer than m-1 keys, insert the key in increasing order.
• Otherwise, if the leaf node already contains m-1 keys, follow these steps:
– Insert the new key in increasing order.
– Split the node into two nodes at the median.
– Push the median key up to the parent node.
– If the parent node also contains m-1 keys, split it too by following the same steps.
Example: insert the key 8 into a B-tree of order 5.
• After the insertion the leaf contains 5 keys, which is more than the maximum of 5 - 1 = 4 keys. Therefore split the node at the median, 8, and push the median up to the parent node.
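The leaf-level step of the insertion algorithm can be sketched as below. It is only a sketch under stated assumptions: `insert_into_leaf` is a hypothetical helper, it handles a single leaf rather than the whole tree, and the leaf contents are made up so that 8 ends up as the median, as in the order-5 example.

```python
from bisect import insort


def insert_into_leaf(keys, key, m):
    """Insert key into a sorted leaf of a B-tree of order m.

    Returns (left, median, right). median and right are None when the
    leaf still fits within m - 1 keys and no split is needed.
    """
    keys = list(keys)            # work on a copy
    insort(keys, key)            # keep the keys in increasing order
    if len(keys) <= m - 1:
        return keys, None, None
    mid = len(keys) // 2         # overflow: split at the median
    return keys[:mid], keys[mid], keys[mid + 1:]


# Hypothetical full leaf of an order-5 tree (4 keys); inserting 8 overflows it.
left, median, right = insert_into_leaf([5, 6, 9, 10], 8, m=5)
print(left, median, right)   # [5, 6] 8 [9, 10]
```

The caller would then insert `median` into the parent and, if the parent overflows as well, repeat the same split one level up.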
Deletion
• Deletion is also performed at the leaf nodes. The key to be deleted can be in either a leaf node or an internal node. To delete a key from a B-tree of order m, where every non-root node must keep at least ⌈m/2⌉ - 1 keys, follow this algorithm:
• Locate the leaf node.
• If the leaf node has more than the minimum number of keys, delete the desired key from the node.
• If the leaf node would fall below the minimum, refill it by taking a key from the right or left sibling:
– If the left sibling has more than the minimum number of keys, push its largest key up to the parent and move the intervening key down to the node where the key was deleted.
– If the right sibling has more than the minimum number of keys, push its smallest key up to the parent and move the intervening key down to the node where the key was deleted.
• If neither sibling has more than the minimum, create a new leaf node by joining the two leaf nodes and the intervening key of the parent node.
• If the parent is left with fewer than the minimum number of keys, apply the same process to the parent.
• If the key to be deleted is in an internal node, replace it with its in-order successor or predecessor. Since the successor or predecessor always lies in a leaf node, the rest of the process is the same as deleting a key from a leaf.
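The borrow-or-merge step described above can be sketched for a single parent and its leaf children. This is a simplified illustration: `min_keys` and `fix_underflow` are hypothetical names, leaves are plain Python lists, the sample keys other than 49 and 57 are invented, and propagating an underflow further up the tree is left out.

```python
import math


def min_keys(m):
    """Minimum number of keys in a non-root node of a B-tree of order m."""
    return math.ceil(m / 2) - 1


def fix_underflow(parent_keys, children, i, m):
    """Repair leaf children[i] in place after a deletion left it underfull."""
    lo = min_keys(m)
    node = children[i]
    if len(node) >= lo:
        return "ok"
    if i > 0 and len(children[i - 1]) > lo:
        left = children[i - 1]
        node.insert(0, parent_keys[i - 1])     # intervening key moves down
        parent_keys[i - 1] = left.pop()        # left's largest key moves up
        return "borrowed from left"
    if i + 1 < len(children) and len(children[i + 1]) > lo:
        right = children[i + 1]
        node.append(parent_keys[i])            # intervening key moves down
        parent_keys[i] = right.pop(0)          # right's smallest key moves up
        return "borrowed from right"
    j = i - 1 if i > 0 else i                  # merge children[j] and children[j + 1]
    children[j:j + 2] = [children[j] + [parent_keys[j]] + children[j + 1]]
    del parent_keys[j]
    return "merged"


# Order 5: after a deletion the middle leaf holds only 57 (minimum is 2 keys).
parent_keys = [49, 63]
children = [[23, 34], [57], [66, 72]]
print(fix_underflow(parent_keys, children, 1, m=5))  # merged
print(parent_keys, children)   # [63] [[23, 34, 49, 57], [66, 72]]
```

Because a merge removes one key from the parent, a full implementation would re-run the same check on the parent afterwards.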
Example
Delete the key 53 from a B-tree of order 5.
• 53 is present in the right child of element 49. Delete it.
• Now 57 is the only key left in the node, but the minimum number of keys in a node of a B-tree of order 5 is 2. The node is below that minimum, and its left and right siblings have no keys to spare, so merge it with the left sibling and the intervening element of the parent, i.e. 49.
Definition of a B+Tree
• The B+ tree is a balanced multiway search tree. It follows a multi-level index format.
• In the B+ tree, the leaf nodes hold the pointers to the actual data. The B+ tree ensures that all leaf nodes remain at the same height.
• In the B+ tree, the leaf nodes are linked using a linked list. Therefore, a B+ tree can support random access as well as sequential access.
Structure of B+ Tree
• In the B+ tree, every leaf node is at an equal distance from the root node. The B+ tree is of order n, where n is fixed for every B+ tree.
• It contains internal nodes and leaf nodes.
Internal node
– An internal node of the B+ tree, except the root node, contains at least ⌈n/2⌉ pointers.
– At most, an internal node of the tree contains n pointers.
Leaf node
– A leaf node of the B+ tree contains at least ⌈n/2⌉ record pointers and ⌈n/2⌉ key values.
– At most, a leaf node contains n record pointers and n key values.
– Every leaf node of the B+ tree contains one block pointer P that points to the next leaf node.
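The linked leaves are what make sequential access cheap. The sketch below shows one possible in-memory layout of a leaf and a range scan that follows the next-leaf pointer P; the `Leaf` class, the `scan` function, and the record labels such as "r5" are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    keys: list                      # sorted key values
    records: list                   # one record pointer per key
    next: Optional["Leaf"] = None   # block pointer P to the next leaf


def scan(leaf, lo, hi):
    """Yield (key, record) pairs with lo <= key <= hi, following leaf links."""
    while leaf is not None:
        for k, r in zip(leaf.keys, leaf.records):
            if k > hi:
                return              # keys are sorted, so nothing further matches
            if k >= lo:
                yield k, r
        leaf = leaf.next            # sequential access: hop to the next leaf


def make_leaf(keys):
    # Hypothetical record pointers, named after their keys.
    return Leaf(keys, [f"r{k}" for k in keys])


first, second, third = make_leaf([1, 2]), make_leaf([3, 5, 6, 9]), make_leaf([10, 11, 12])
first.next, second.next = second, third

print(list(scan(first, 5, 10)))
```

In a real tree the scan would start at the leaf found by a root-to-leaf search for `lo` rather than at the first leaf.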
Example
B+ Tree with M = 4. Often, leaf nodes are linked together.

Root:      [10 | 40]
Internal:  [3]                [15 | 20 | 30]                                 [50]
Leaves:    [1 2] [3 5 6 9]    [10 11 12] [15 17] [20 25 26] [30 32 33 36]    [40 42] [50 60 70]
Advantages of B+ tree usage for databases
• Keeps keys in sorted order for sequential traversal
• Uses a hierarchical index to minimize the number of disk reads
• Uses partially full blocks to speed up insertions and deletions
• Keeps the index balanced with a recursive algorithm
• In addition, a B+ tree minimizes waste by making sure the interior nodes are at least half full. A B+ tree can handle an arbitrary number of insertions and deletions.
Searching
• Compare the search key with the keys in the tree, following pointers from the root down to a leaf, then return the result.
• For example: find the values 45 and 15 in the tree.
• Result:
– 1. For the value 45: not found.
– 2. For the value 15: return the position where its pointer is located.
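A two-level lookup of this kind can be sketched as follows. The tree layout is an assumption, since the figure is not reproduced in the text: one index page with keys 25, 50, 75 over four leaves. The function name `bplus_search` is likewise hypothetical.

```python
from bisect import bisect_left, bisect_right


def bplus_search(index_keys, leaves, key):
    """Two-level B+ tree lookup: pick a leaf via the index, then search it.

    Returns (leaf_number, position) or None when the key is absent.
    """
    leaf_no = bisect_right(index_keys, key)   # keys >= a separator go to its right
    leaf = leaves[leaf_no]
    pos = bisect_left(leaf, key)              # binary search inside the leaf
    if pos < len(leaf) and leaf[pos] == key:
        return leaf_no, pos
    return None


# Assumed example tree: index page [25, 50, 75] and four leaf pages.
index_keys = [25, 50, 75]
leaves = [[5, 10, 15, 20], [25, 30], [50, 55, 60, 65], [75, 80, 85, 90]]

print(bplus_search(index_keys, leaves, 45))   # None: not found
print(bplus_search(index_keys, leaves, 15))   # (0, 2)
```

Unlike the B-tree search, the lookup always ends in a leaf, because only leaves hold record pointers.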
Insertion
• Inserting a value into a B+ tree may unbalance the tree, so rearrange the tree if needed.
• Example #1: insert 28 into the tree.
– The target leaf becomes 25 28 30, so the key fits inside the leaf.
• Example #2: insert 70 into the tree.
– The target leaf 50 55 60 65 is full, so 50 55 60 65 70 does not fit inside the leaf.
– Process: split the leaf and propagate the middle key up the tree.
– Result: choose the middle key 60 and place it in the index page between 50 and 75.
Insertion
The insert algorithm for B+ Tree

Leaf node full   Index node full   Action
NO               NO                Place the record in sorted position in the appropriate leaf page.
YES              NO                1. Split the leaf node.
                                   2. Place the middle key in the index node in sorted order.
                                   3. The left leaf node contains records with keys below the middle key.
                                   4. The right leaf node contains records with keys equal to or greater than the middle key.
YES              YES               1. Split the leaf node.
                                   2. Records with keys < middle key go to the left leaf node.
                                   3. Records with keys >= middle key go to the right leaf node.
                                   4. Split the index node.
                                   5. Keys < middle key go to the left index node.
                                   6. Keys > middle key go to the right index node.
                                   7. The middle key goes to the next (higher level) index node.
                                   If the next level index node is full, continue splitting the index nodes.
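The two kinds of split in the table differ in what happens to the middle key: a leaf split keeps it in the right leaf and copies it up, while an index split moves it up and keeps it in neither half. A minimal sketch, with the hypothetical helper names `split_leaf` and `split_index` and key lists taken from the examples in this deck:

```python
def split_leaf(keys):
    """Split an overfull leaf: keys >= middle key go right, middle key is copied up."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid:]


def split_index(keys):
    """Split an overfull index node: the middle key moves up to the next level."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


# Leaf 50 55 60 65 after inserting 70.
print(split_leaf([50, 55, 60, 65, 70]))    # ([50, 55], 60, [60, 65, 70])

# Overfull index page 25 50 60 75 85.
print(split_index([25, 50, 60, 75, 85]))   # ([25, 50], 60, [75, 85])
```

The returned middle key is what the caller inserts into the parent index node, which may in turn overflow and be split the same way.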
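Insertion
• Exercise: add the key value 95 to the tree.
– The target leaf is full: with 95 it would hold 75 80 85 90 95, so split the leaf.
– The index page would then hold 25 50 60 75 85, which does not fit either.
• Result: split the index page as well, again putting the middle key, 60, into the next higher index page, and rearrange the tree.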
Deletion
• As with insertion, the tree has to be rearranged if the result of a deletion violates the rules of the B+ tree.
• Example #1: delete 70 from the tree.
– The leaf is left with 60 65. This is OK: the node is still >= 50% full.
• Example #2: delete 25 from the tree, where 25 also appears in the index page.
– The leaf is left with 28 30. This is OK.
– Result: replace 25 with 28 in the index page.
• Example #3: delete 60 from the tree.
– The leaf is left with only 65, which is less than 50% full, so it is combined with its sibling to give 50 55 65.
– Result: delete 60 from the index page and combine the rest of the index pages.
Deletion
Delete algorithm for B+ trees

Data page below fill factor   Index page below fill factor   Action
NO                            NO                             Delete the record from the leaf page. Arrange keys in ascending order to fill the void. If the key of the deleted record appears in the index page, use the next key to replace it.
YES                           NO                             Combine the leaf page and its sibling. Change the index page to reflect the change.
YES                           YES                            1. Combine the leaf page and its sibling.
                                                             2. Adjust the index page to reflect the change.
                                                             3. Combine the index page with its sibling.
                                                             Continue combining index pages until you reach a page with the correct fill factor or you reach the root page.
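The fill-factor test that drives the table can be sketched on plain lists. The helper names `delete_from_leaf` and `replace_in_index`, the leaf capacity of 4, and the 50% fill factor are assumptions chosen to match the deletion examples in this deck; combining index pages is not covered.

```python
def delete_from_leaf(leaf, key, capacity, fill_factor=0.5):
    """Remove key from a leaf page and report whether it fell below the fill factor."""
    remaining = [k for k in leaf if k != key]
    underfull = len(remaining) < capacity * fill_factor
    return remaining, underfull


def replace_in_index(index_keys, deleted, leaf):
    """If the deleted key appears in the index page, use the leaf's next key instead."""
    return [leaf[0] if k == deleted else k for k in index_keys]


# Example #1: deleting 70 leaves 60 65, still at least 50% full.
print(delete_from_leaf([60, 65, 70], 70, capacity=4))   # ([60, 65], False)

# Example #2: deleting 25 leaves 28 30; the index key 25 becomes 28.
leaf, _ = delete_from_leaf([25, 28, 30], 25, capacity=4)
print(replace_in_index([25, 50, 75], 25, leaf))         # [28, 50, 75]

# Example #3: deleting 60 leaves only 65, so the page is combined with its sibling.
leaf, underfull = delete_from_leaf([60, 65], 60, capacity=4)
if underfull:
    print(sorted([50, 55] + leaf))                      # [50, 55, 65]
```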
Conclusion
• For a B+ Tree:
– It is "easy" to maintain its balance
– Insertion/deletion complexity is O(log_{M/2} N)
– Searching is faster than in most other types of trees because the branching factor is high
B+Trees and DBMS
– Used to index primary keys
– Can access records in O(log_{M/2} N) traversals (height of the tree)
– Interior nodes contain keys only
– Set node sizes so that the M-1 keys and M pointers fit inside a single block on disk
– E.g., block size 4096 B, keys 10 B, pointers 8 B
– 8 + (10 + 8) * (M - 1) ≤ 4096
– M = 228; 228^4 ≈ 2.7 billion nodes reachable in 4 levels
– One block read per node visited
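The block-size arithmetic above can be checked with a few lines. This is a sketch of the same calculation; the function name `max_order` is an assumption, and the sizes are the ones given in the example.

```python
def max_order(block_size, key_size, pointer_size):
    """Largest M such that M - 1 keys and M pointers fit in one block.

    (M - 1) * key_size + M * pointer_size <= block_size
    """
    return (block_size + key_size) // (key_size + pointer_size)


m = max_order(block_size=4096, key_size=10, pointer_size=8)
print(m)        # 228
print(m ** 4)   # 2702336256, i.e. about 2.7 billion
```

With M = 228 a node uses 227 * 10 + 228 * 8 = 4094 bytes, just under the 4096-byte block.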
