r20 Unit 4 Ads Notes
Syllabus:
M-Way Search Trees, Definition and Properties- Searching an M-Way Search Tree, B-Trees,
Definition and Properties- Number of Elements in a B-tree- Insertion into B-Tree- Deletion
from a B-Tree- B+-Tree Definition- Searching a B+-Tree- Insertion into B+-tree- Deletion
from a B+-Tree
Definition: An m-way tree is a search tree in which each node can have from 0 to m subtrees,
where m is defined as the order of the tree. A nonempty multiway tree has the following properties.
Properties:
3. The key values in the first subtree are all less than the key value in the first
entry; the key values in the other subtrees are all greater than or equal to
the key value in their parent entry.
4. The keys of the data entries are ordered key1 ≤ key2 ≤ … ≤ keyk.
1. The first thing to note is that it has the same structure as the binary search tree: subtrees to
the left of an entry contain data with keys that are less than the key of the entry, and subtrees
to the right of an entry contain data with keys that are greater than or equal to the entry’s key.
This ordering is easiest to see in the first and last subtrees.
2. In the second subtree we can observe that its keys are greater than or equal to k1 and less than
k2. It serves as the right subtree for k1 and at the same time the left subtree for k2. In other
words, whether a subtree is a left or a right subtree depends on which node entry you are
viewing.
3. Also note that there is one more subtree than there are entries in the node; there is a
separate subtree at the beginning of each node. This first subtree identifies all of the subtrees
that contain keys less than the first entry in the node.
4. Because each node has a variable number of entries, we need some way to keep track of
how many entries are currently in the node. This is done with an entry count.
A structure is to be used for the entries. Because the number of entries varies up to a specified
maximum, the best structure in which to store them is an array. Each entry needs to hold the
key of the data, the data itself (or a pointer to the data if stored elsewhere), and a pointer to its
right subtree. The figure below shows the node structure:
The node structure contains the first pointer to the subtree with entries less than the key of the
first entry, a count of the number of entries currently in the node, and the array of entries. The
array must have room for m − 1 entries.
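The node layout described above can be sketched in Python as follows. This is only an illustrative sketch: the names Entry, Node, first, and is_full are assumptions made for the example, and the entry count is derived from the list length instead of being stored in a separate field.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Entry:
    key: int
    data: Any = None                   # the data itself, or a reference to it
    right: Optional["Node"] = None     # subtree with keys >= this entry's key


@dataclass
class Node:
    order: int                         # m: the maximum number of subtrees
    first: Optional["Node"] = None     # subtree with keys < entries[0].key
    entries: List[Entry] = field(default_factory=list)

    @property
    def count(self) -> int:
        """Number of entries currently stored in the node."""
        return len(self.entries)

    def is_full(self) -> bool:
        """A node of order m has room for at most m - 1 entries."""
        return self.count == self.order - 1
```

A node of order 4, for instance, reports is_full() once three entries have been added.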
An M-way search tree is a more constrained m-way tree, and these constraints mainly apply to
the key fields and the values in them. The constraints on an M-way tree that make it an M-way
search tree are:
Each node in the tree can have at most m children and m-1 key fields.
The keys in any node of the tree are arranged in sorted (ascending) order.
The keys in the first K children are less than the Kth key of this node.
The keys in the last (m-K) children are greater than the Kth key.
M-way search trees have an advantage over plain M-way trees: the ordering of the keys makes
search and update operations much more efficient. However, they can become unbalanced,
which leaves us with the same problem of searching for a key in a skewed tree, where much of
that advantage is lost.
Suppose we want to search for a value X in an M-way search tree and we are currently at a node
that contains the key values Y1, Y2, Y3, ..., Yk. In total, four cases are possible:
Case 1: If X < Y1, then we need to recursively traverse the left subtree of Y1.
Case 2: If X > Yk, then we need to recursively traverse the right subtree of Yk.
Case 3: If X = Yi for some i, then we are done and can return.
Case 4: The only remaining case is that Yi < X < Y(i+1) for some i. In this case we need to
recursively traverse the subtree that lies between Yi and Y(i+1).
For example, consider the 3-way search tree shown above, and say we want to search for a
node having key (X) equal to 60. Considering the above cases, for the root node the
second condition applies (60 > 40), and hence we move one level down to the right
subtree of 40. Now only the last condition applies, hence we traverse the subtree which lies
between 55 and 70. Finally, while traversing down, we find the value that we were
looking for.
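The four search cases can be sketched in Python as shown below. The class MNode, the function search, and the tree built in the usage note are assumptions made for illustration; the tree is a hypothetical one chosen only to be consistent with the keys 40, 55, 60 and 70 mentioned in the example.

```python
class MNode:
    def __init__(self, keys, children=None):
        self.keys = keys                                  # sorted keys Y1..Yk
        # A node with k keys has k + 1 child slots (None means "no subtree").
        self.children = children or [None] * (len(keys) + 1)


def search(node, x):
    """Return True if x occurs in the m-way search tree rooted at node."""
    while node is not None:
        i = 0
        # Skip every key smaller than x.
        while i < len(node.keys) and x > node.keys[i]:
            i += 1
        if i < len(node.keys) and x == node.keys[i]:
            return True                                   # Case 3: X = Yi
        # Case 1 (i == 0), Case 2 (i == k) or Case 4 (in between):
        # descend into the subtree that lies just before key i.
        node = node.children[i]
    return False
```

Usage, on the hypothetical tree:

```python
root = MNode([20, 40], [
    MNode([10]),
    MNode([30]),
    MNode([55, 70], [MNode([45]), MNode([60]), MNode([80])]),
])
print(search(root, 60), search(root, 65))
```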
B Trees:
Definition: B-tree is a perfectly balanced m-way tree in which each node, with the possible
exception of the root, is at least half full.
Properties:
A B tree is an extension of an M-way search tree. Besides having all the properties of an M-
way search tree, it has some properties of its own, these mainly are:
1. All the leaf nodes in a B tree are at the same level.
2. All internal nodes except the root must have at least ceil(M/2) children.
3. If the root node is a non-leaf node, then it must have at least two children.
4. All nodes except the root node must have at least ceil(M/2)-1 keys and at most M-1 keys.
Number of Elements in a B-tree:
The table below defines the minimum and maximum numbers of subtrees (elements) in a nonroot
node for B-trees of different orders.
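The limits for a nonroot node follow directly from the B-tree properties: a node of order m has between ceil(m/2) and m subtrees, and one fewer entry than subtrees. The short Python sketch below computes these limits; the function name nonroot_limits and the range of orders printed are assumptions chosen for illustration.

```python
def nonroot_limits(m):
    """Min/max subtrees and entries of a nonroot node in a B-tree of order m."""
    min_subtrees = (m + 1) // 2          # integer form of ceil(m / 2)
    return {
        "min_subtrees": min_subtrees,
        "max_subtrees": m,
        "min_entries": min_subtrees - 1,
        "max_entries": m - 1,
    }


for order in range(3, 8):
    print(order, nonroot_limits(order))
```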
Searching in a B Tree:
Searching for a key in a B Tree is exactly like searching in an M-way search tree, which we
have seen just above. Consider the pictorial representation shown below of a B tree, and say we
want to search for the key 49 in it. We proceed as follows:
Step 1. Compare item 49 with root node 75. Since 49 < 75, move to its left sub-tree.
Inserting in a B Tree:
Inserting in a B tree is done at the leaf node level. We follow the given steps to make sure
that after the insertion the B tree is valid, these are:
First, we traverse the B tree to find the appropriate leaf node where the key to be inserted
will fit.
If that node contains fewer than M-1 keys, then we insert the key in increasing order.
If that node contains exactly M-1 keys, then we insert the new element in
increasing order, split the node into two nodes at the median, push the median
element up to its parent node, and finally, if the parent node also contained M-1 keys,
repeat these steps on the parent.
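The "insert in order, then split at the median" step can be sketched on a single node's key list. The function name insert_and_split is hypothetical, and taking the upper median when the overflowing node has an even number of keys is an assumption (either median gives a valid split).

```python
import bisect


def insert_and_split(keys, key, m):
    """Insert key into the sorted keys of a node of order m.

    Returns (keys, None, None) if the node still holds at most m - 1 keys,
    otherwise (left_keys, median, right_keys), where the median is the key
    that must be pushed up to the parent.
    """
    keys = list(keys)               # work on a copy
    bisect.insort(keys, key)        # keep the keys in increasing order
    if len(keys) <= m - 1:
        return keys, None, None
    mid = len(keys) // 2            # index of the median key
    return keys[:mid], keys[mid], keys[mid + 1:]


# A node of order 3 holding the (hypothetical) keys 5 and 8 overflows when 9 arrives.
print(insert_and_split([5, 8], 9, 3))
```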
Now, consider that we want to insert a key 9 into the above shown B tree, the tree after
inserting the key 9 will look something like this:
Since a violation has occurred, we need to push the median key to the parent node and then
split the node into two parts, and hence the final look of the B tree is:
Insertion can also be done in a single downward pass by splitting full nodes in advance. To insert a key k:
1) Initialize x as root.
2) While x is not a leaf, do the following:
a) Find the child of x that is going to be traversed next. Let the child be y.
b) If y is not full, change x to point to y.
c) If y is full, split it and change x to point to one of the two parts of y. If k is smaller than the
mid key in y, then set x as the first part of y, else as the second part of y. When we split y, we move
a key from y to its parent x.
3) The loop in step 2 stops when x is a leaf. x must have space for 1 extra key as we have been
splitting all nodes in advance. So simply insert k into x.
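A minimal Python sketch of this split-in-advance insertion is given below. It assumes a minimum degree t, so that a "full" node holds 2t-1 keys; the names BNode, BTree, _split_child and inorder are illustrative, and a full root is split before the loop starts so that the tree can grow by one level.

```python
import bisect


class BNode:
    def __init__(self, leaf=True):
        self.keys = []
        self.children = []
        self.leaf = leaf


class BTree:
    def __init__(self, t=2):
        self.t = t                       # assumed minimum degree; full = 2t - 1 keys
        self.root = BNode()

    def _split_child(self, x, i):
        """Split the full child y = x.children[i]; its middle key moves up into x."""
        t, y = self.t, x.children[i]
        z = BNode(leaf=y.leaf)
        mid = y.keys[t - 1]
        z.keys, y.keys = y.keys[t:], y.keys[:t - 1]
        if not y.leaf:
            z.children, y.children = y.children[t:], y.children[:t]
        x.keys.insert(i, mid)
        x.children.insert(i + 1, z)

    def insert(self, k):
        if len(self.root.keys) == 2 * self.t - 1:    # full root: add a new level
            old_root = self.root
            self.root = BNode(leaf=False)
            self.root.children.append(old_root)
            self._split_child(self.root, 0)
        x = self.root                                # step 1
        while not x.leaf:                            # step 2
            i = bisect.bisect_right(x.keys, k)       # child to traverse next
            if len(x.children[i].keys) == 2 * self.t - 1:
                self._split_child(x, i)              # child is full: split first
                if k > x.keys[i]:                    # pick one of the two parts
                    i += 1
            x = x.children[i]
        bisect.insort(x.keys, k)                     # step 3: leaf has room


def inorder(node):
    """Return all keys of the subtree in sorted order."""
    if node.leaf:
        return list(node.keys)
    out = []
    for i, key in enumerate(node.keys):
        out += inorder(node.children[i]) + [key]
    return out + inorder(node.children[-1])
```

Because every full node met on the way down is split before it is entered, the leaf reached at the end always has room, and no second pass back up the tree is needed.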
Deletion in a B Tree:
Deletion of Key from a leaf node:
If we want to delete a key that is present in a leaf node of a B tree, then we have two cases
possible, these are:
If the node that contains the key that we want to delete, in turn contains more than
the minimum number of keys required for the valid B tree, then we can simply
delete that key.
Say we want to delete the key 64, and the node in which 64 is present has more than the
minimum number of keys required by the B tree, which is 2. So, we can simply delete this
key.
If the node that contains the key that we want to delete, in turn contains the
minimum number of keys required for the valid B tree, then three cases are
possible:
o In order to delete this key from the B Tree, we can borrow a key from the
immediate left node (left sibling). The process is that we move the highest
value key from the left sibling up to the parent, and move the parent key that
separates the two nodes down to the node from which we just deleted our key.
o In another case, we might have to borrow a key from the immediate right
node (right sibling). The process is that we move the lowest value key from the
right sibling up to the parent node, and move the parent key that separates the
two nodes down to the node from which we just deleted our key.
o The last case is that neither the left sibling nor the right sibling is in a state
to give the current node a key, so in this case we merge the node with
either one of them. The merge also includes the separating key from the parent, and
then we can delete our key from the node.
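The three repair operations for a leaf can be sketched as follows. This is a simplified illustration for leaf nodes only (no child pointers are moved); the class Node, the function names, and the sibling contents in the usage note are assumptions, with only the keys 16, 20 and 23 taken from the example in these notes.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def borrow_from_left(parent, i):
    """Child i is one key short: rotate a key in from its left sibling."""
    child, left = parent.children[i], parent.children[i - 1]
    child.keys.insert(0, parent.keys[i - 1])   # separating key moves down
    parent.keys[i - 1] = left.keys.pop()       # left sibling's largest key moves up


def borrow_from_right(parent, i):
    """Child i is one key short: rotate a key in from its right sibling."""
    child, right = parent.children[i], parent.children[i + 1]
    child.keys.append(parent.keys[i])          # separating key moves down
    parent.keys[i] = right.keys.pop(0)         # right sibling's smallest key moves up


def merge_with_right(parent, i):
    """Merge child i, the separating parent key, and child i + 1 into child i."""
    child, right = parent.children[i], parent.children.pop(i + 1)
    child.keys += [parent.keys.pop(i)] + right.keys
```

Usage, assuming a minimum of 2 keys per node:

```python
parent = Node([20], [Node([12, 14, 16]), Node([23, 27])])
parent.children[1].keys.remove(23)   # the node now has too few keys
borrow_from_left(parent, 1)          # 16 moves up, 20 moves down
```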
After we delete 23, we borrow from the left sibling: 16 moves up to the parent node and 20 is
pushed downwards, and the resultant B tree is:
Case 2 pictorial representation:
After we delete 72, we borrow from the right sibling: 77 moves up to the parent node and 75 is
pushed downwards, and the resultant B tree is:
After deleting 65 from the leaf node, we will have the final B tree as:
If we want to delete a key that is present in an internal node, then we can either replace it with
its inorder predecessor or, if taking the inorder
predecessor violates the B tree property, with its inorder successor.
In the inorder predecessor approach, we extract the highest value in the left child
subtree of the key that we want to delete.
In the inorder successor approach, we extract the lowest value in the right child
subtree of the key that we want to delete.
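Finding the inorder predecessor and successor of a key in an internal node can be sketched as below. The class Node and both function names are assumptions, and the tree in the usage note is a small hypothetical B tree of order 3.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []     # empty list means the node is a leaf


def inorder_predecessor(node, i):
    """Largest key in the subtree to the left of node.keys[i]."""
    cur = node.children[i]
    while cur.children:                    # keep following the rightmost child
        cur = cur.children[-1]
    return cur.keys[-1]


def inorder_successor(node, i):
    """Smallest key in the subtree to the right of node.keys[i]."""
    cur = node.children[i + 1]
    while cur.children:                    # keep following the leftmost child
        cur = cur.children[0]
    return cur.keys[0]
```

Usage:

```python
root = Node([50], [
    Node([20, 30], [Node([10]), Node([25]), Node([40, 45])]),
    Node([70], [Node([60]), Node([80])]),
])
print(inorder_predecessor(root, 0), inorder_successor(root, 0))
```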
Pictorial Representation of the above cases:
After deletion of 95, our tree will look like this:
The time complexity for search, insert and delete operations in a B tree is O(log n).
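A more detailed deletion procedure works in a single downward pass. Here t is the minimum
degree of the B tree: every node other than the root has at least t-1 keys, and every node has
at most 2t-1 keys. To delete a key k from the subtree rooted at node x:
1. If the key k is in node x and x is a leaf, delete the key k from x.
2. If the key k is in node x and x is an internal node, do the following:
a) If the child y that precedes k in node x has at least t keys, then find the
predecessor k0 of k in the sub-tree rooted at y. Recursively delete k0, and replace k by
k0 in x. (We can find k0 and delete it in a single downward pass.)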
b) If y has fewer than t keys, then, symmetrically, examine the child z that follows
k in node x. If z has at least t keys, then find the successor k0 of k in the subtree
rooted at z. Recursively delete k0, and replace k by k0 in x. (We can find k0 and
delete it in a single downward pass.)
c) Otherwise, if both y and z have only t-1 keys, merge k and all of z into y, so that x
loses both k and the pointer to z, and y now contains 2t-1 keys. Then free z and
recursively delete k from y.
3. If the key k is not present in internal node x, determine the root x.c(i) of the
appropriate subtree that must contain k, if k is in the tree at all. If x.c(i) has only t-1
keys, execute step 3a or 3b as necessary to guarantee that we descend to a node
containing at least t keys. Then finish by recursing on the appropriate child of x.
a) If x.c(i) has only t-1 keys but has an immediate sibling with at least t keys, give
x.c(i) an extra key by moving a key from x down into x.c(i), moving a key from x.c(i)'s
immediate left or right sibling up into x, and moving the appropriate child pointer
from the sibling into x.c(i).
b) If x.c(i) and both of x.c(i)’s immediate siblings have t-1 keys, merge x.c(i) with
one sibling, which involves moving a key from x down into the new merged node to
become the median key for that node.
1. Keeps keys in sorted order for sequential traversal.
5. In addition, a B-tree minimizes waste by making sure the interior nodes are at least
half full. A B-tree can handle an arbitrary number of insertions and deletions.
B+ Tree:
A B+ tree is an extension of a B tree which makes the search, insert and delete operations
more efficient. B trees store both data pointers and key values in
internal nodes as well as leaf nodes. This is a drawback, because
fewer entries fit in each node, which increases the number of levels in the tree.
B+ trees reduce this drawback by storing the data
pointers only at the leaf node level and storing only key values in the internal nodes. It should
also be noted that the nodes at the leaf level are linked with each other, which makes the
traversal of the data pointers easy and more efficient.
B+ trees come in handy when we want to store a large amount of data that cannot fit in main memory.
Since the size of main memory is limited, the internal nodes of the B+ tree, which store the keys
used to access the records, are kept in main
memory, whereas the leaf nodes, which contain the data pointers, are stored in
secondary memory.
Why B+ trees?
All records are reached through the leaf nodes, so every record can be fetched in an equal number of disk
accesses.
Having fewer levels makes accessing records faster.
As the leaf nodes are connected with each other like a linked list, we can easily
search elements in a sequential manner.
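The benefit of linked leaves for sequential access can be illustrated with a small range scan. This sketch models only the leaf level: the class Leaf and the helpers link and range_scan are assumptions, and the scan starts at the first leaf, whereas a full B+ tree would first descend from the root to the leaf containing the lower bound.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys        # sorted keys stored in this leaf
        self.next = None        # link to the next leaf on the right


def link(leaves):
    """Chain the leaves left to right and return the first one."""
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    return leaves[0]


def range_scan(first_leaf, lo, hi):
    """Collect every key k with lo <= k <= hi by walking the leaf chain."""
    out = []
    leaf = first_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return out      # keys are sorted, so nothing further can match
            if k >= lo:
                out.append(k)
        leaf = leaf.next
    return out


head = link([Leaf([10, 20]), Leaf([30, 40]), Leaf([50, 60])])
print(range_scan(head, 20, 50))
```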
Inserting in B+ tree
If the bucket is not full (adding the key does not violate the B+ tree property), then add the key
into this bucket.
Otherwise, split the node into two nodes, push the middle key (the median key, to
be precise) to the parent node, and then insert the new key.
Repeat the above steps on the parent node if it becomes
full as well.
Consider the pictorial representations shown below to understand the Insertion operation in
the B+ tree:
Let us try to insert 57 into the B+ tree shown above. The resultant B+ tree will look like
this:
We know that the bucket (node) where we inserted the key with value 57 is now violating the
property of the B+ tree; hence we need to split this node as mentioned in the steps above.
After splitting, we push the median key to the parent node, and the resulting B+ tree will
look like this:
Searching in B+ tree:
Searching in a B+ tree is similar to searching in a BST. If the search key is less than the
current key, then traverse the left subtree; if it is greater, first scan the current
bucket (node) and then check which pointer leads to the ideal location.
Now we know that 59 < 69, hence we traverse the left subtree.
Now we have found the internal pointer that will point us to our required search value.
Finally, we traverse this bucket in a linear fashion to get to our required search value.
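The search procedure can be sketched in Python as follows. The classes Leaf and Internal, the function search, and the hand-built tree in the usage note are assumptions; the tree is hypothetical and only reuses the keys 59 and 69 from the example. The sketch also assumes that a key equal to a separator is found in the subtree to its right.

```python
import bisect


class Leaf:
    def __init__(self, keys, values):
        self.keys, self.values = keys, values     # data pointers live only here


class Internal:
    def __init__(self, keys, children):
        self.keys, self.children = keys, children  # keys only guide the search


def search(node, key):
    """Return the value stored under key, or None if the key is absent."""
    while isinstance(node, Internal):
        # Keys smaller than keys[i] are found to the left of separator i.
        node = node.children[bisect.bisect_right(node.keys, key)]
    for k, v in zip(node.keys, node.values):       # linear scan of the bucket
        if k == key:
            return v
    return None
```

Usage:

```python
left = Internal([50], [Leaf([40, 45], ["r40", "r45"]), Leaf([50, 59, 62], ["r50", "r59", "r62"])])
right = Internal([80], [Leaf([69, 75], ["r69", "r75"]), Leaf([80, 90], ["r80", "r90"])])
root = Internal([69], [left, right])
print(search(root, 59))   # 59 < 69, so the search goes into the left subtree
```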
1. If the node has an empty space, insert the key/reference pair into the node.
2. If the node is already full, split it into two nodes, distributing the keys evenly between the
two nodes. If the node is a leaf, take a copy of the minimum value in the second of these two
nodes and repeat this insertion algorithm to insert it into the parent node. If the node is a non-
leaf, exclude the middle value during the split and repeat this insertion algorithm to insert this
excluded value into the parent node.
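The two insertion steps above can be sketched as follows. This is a sketch under assumptions: the parameter capacity (maximum keys per node) and the names Node, BPlusTree, _insert and leaf_keys are illustrative, only keys are stored (no data pointers), and a node is split after it overflows, which has the same effect as splitting a full node before inserting.

```python
import bisect


class Node:
    def __init__(self, leaf=True):
        self.leaf = leaf
        self.keys = []
        self.children = []      # used by internal nodes only
        self.next = None        # used by leaves only: link to the next leaf


class BPlusTree:
    def __init__(self, capacity=3):
        self.capacity = capacity            # assumed maximum number of keys per node
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split:                           # the root was split: grow by one level
            sep, right = split
            new_root = Node(leaf=False)
            new_root.keys = [sep]
            new_root.children = [self.root, right]
            self.root = new_root

    def _insert(self, node, key):
        """Insert key below node; return (separator, new_right_node) if node split."""
        if node.leaf:
            bisect.insort(node.keys, key)
        else:
            i = bisect.bisect_right(node.keys, key)
            split = self._insert(node.children[i], key)
            if split:
                sep, right = split
                node.keys.insert(i, sep)
                node.children.insert(i + 1, right)
        if len(node.keys) <= self.capacity:
            return None
        mid = len(node.keys) // 2
        right = Node(leaf=node.leaf)
        if node.leaf:
            right.keys, node.keys = node.keys[mid:], node.keys[:mid]
            right.next, node.next = node.next, right
            return right.keys[0], right     # leaf: copy the minimum of the right half up
        sep = node.keys[mid]                # non-leaf: the middle key is excluded and moves up
        right.keys, node.keys = node.keys[mid + 1:], node.keys[:mid]
        right.children, node.children = node.children[mid + 1:], node.children[:mid + 1]
        return sep, right

    def leaf_keys(self):
        """Walk the linked leaves from left to right and return all keys."""
        node = self.root
        while not node.leaf:
            node = node.children[0]
        out = []
        while node is not None:
            out += node.keys
            node = node.next
        return out
```

Note the difference between the two kinds of split: a leaf keeps a copy of the separator, so every key remains present at the leaf level, while an internal node gives its middle key away to the parent.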
Deletion in B+ tree:
If the key is present only in a leaf node (it does not also appear in an internal node), then we
can simply delete it: we first do the search operation and then delete the key from the leaf.
After the deletion of 79, we are left with the following B+ tree.
If the key also appears in an internal node, then after we locate the key in the leaf we must
also delete the internal entry that refers to it, and finally move the next key of the leaf up to
take its place in the parent node.
1. Remove the required key and associated reference from the node.
2. If the node still has enough keys and references to satisfy the invariants, stop.
3. If the node has too few keys to satisfy the invariants, but its next oldest or next youngest
sibling at the same level has more than necessary, distribute the keys between this node and
the neighbor. Repair the keys in the level above to represent that these nodes now have a
different “split point” between them; this involves simply changing a key in the levels above,
without deletion or insertion.
4. If the node has too few keys to satisfy the invariant, and the next oldest or next youngest
sibling is at the minimum for the invariant, then merge the node with its sibling; if the node is
a non-leaf, we will need to incorporate the “split key” from the parent into our merging. In
either case, we will need to repeat the removal algorithm on the parent node to remove the
“split key” that previously separated these merged nodes — unless the parent is the root and
we are removing the final key from the root, in which case the merged node becomes the new
root (and the tree has become one level shorter than before).
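The four deletion steps can be sketched for the leaf level as follows. This simplified sketch looks at one parent only: leaves are plain Python lists of keys, parent_keys holds the split keys between them, and the name delete_from_leaf and the parameter min_keys are assumptions. It assumes the parent has at least two leaves, and it does not repeat the removal on the parent when the parent itself ends up with too few keys.

```python
def delete_from_leaf(parent_keys, leaves, i, key, min_keys):
    """Delete key from leaves[i] and repair the leaf level under one parent.

    Returns "ok", "redistributed" or "merged" to show which step applied.
    """
    leaf = leaves[i]
    leaf.remove(key)                                    # step 1
    if len(leaf) >= min_keys:                           # step 2
        return "ok"
    for j in (i - 1, i + 1):                            # step 3: try each adjacent sibling
        if 0 <= j < len(leaves) and len(leaves[j]) > min_keys:
            lo = min(i, j)
            merged = leaves[lo] + leaves[lo + 1]
            mid = (len(merged) + 1) // 2
            leaves[lo], leaves[lo + 1] = merged[:mid], merged[mid:]
            parent_keys[lo] = leaves[lo + 1][0]         # repair the split point
            return "redistributed"
    lo = i - 1 if i > 0 else i                          # step 4: merge with a sibling
    leaves[lo] += leaves.pop(lo + 1)
    parent_keys.pop(lo)                                 # the parent loses the split key
    return "merged"
```

Usage, with hypothetical keys and a minimum of 2 keys per leaf:

```python
parent_keys = [30, 60]
leaves = [[10, 20], [30, 40, 50], [60, 79, 90]]
print(delete_from_leaf(parent_keys, leaves, 2, 79, 2))
print(delete_from_leaf(parent_keys, leaves, 0, 10, 2))
print(delete_from_leaf(parent_keys, leaves, 0, 20, 2))
```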
S.No | B tree | B+ tree
1. | All internal and leaf nodes have data pointers. | Only leaf nodes have data pointers.
2. | Since all keys are not available at leaf, search often takes more time. | All keys are at leaf nodes, hence search is faster and accurate.
4. | Insertion takes more time and it is not predictable sometimes. | Insertion is easier and the results are always the same.
6. | Leaf nodes are not stored as structural linked list. | Leaf nodes are stored as structural linked list.
Advantages of B+ Tree:
5. Faster search queries as the data is stored only on the leaf nodes.
Time Complexity:
The average case time complexity of insertion, deletion and search operations in a B+ tree is
O(log n).