
Chapter4 Part6 Btree Delete

The document discusses the process of deletion in B-Trees, highlighting the challenges and methods for deleting keys from both leaf and non-leaf nodes. It outlines two main cases for deletion: from a leaf node and from a non-leaf node, detailing the necessary steps to maintain B-tree properties such as node fullness. Additionally, it introduces B+-Trees, which connect leaf nodes in a linked list and discusses their practical applications in various file systems and database management systems.

Uploaded by

mudwah877
COS 212

B-Trees: Deletion
B-Trees: Delete
▪ The hardest of the processes to apply
▪ A reversal of insertion: Rather than splitting nodes, we will
merge nodes as necessary so that B-tree properties still hold
▪ Ensure that all non-root nodes remain at least half full
▪ Two main cases to be considered
1. Deletion of a key from a leaf node
2. Deletion of a key from a non-leaf node
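The "at least half full" requirement can be stated numerically. The following is a minimal sketch, assuming the usual convention that a B-tree of order m holds at most m − 1 keys per node and at least ⌈m/2⌉ − 1 keys per non-root node. The function names are illustrative and do not come from the slides.

```python
import math

def max_keys(m: int) -> int:
    """Maximum number of keys in a node of a B-tree of order m."""
    return m - 1

def min_keys(m: int) -> int:
    """Minimum number of keys in a non-root node (the 'half full' bound)."""
    return math.ceil(m / 2) - 1

def underflows(keys: list, m: int) -> bool:
    """True if a non-root node holds fewer keys than the minimum."""
    return len(keys) < min_keys(m)

print(min_keys(5), max_keys(5))  # 2 4
```

The worked examples in these slides appear to use order 5 (this is inferred from the figures, not stated), so every non-root node holds between 2 and 4 keys.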
B-Trees: Deletion from a leaf
▪ We’ll start with the simpler case: deleting from a leaf
▪ In all cases we begin by searching for the value we want to
delete, and removing that value
▪ It may then be necessary to move values between nodes, and
possibly merge nodes
▪ There are two sub-cases to consider when deleting from a leaf
A. The leaf is at least half full after deleting the value
B. The leaf is less than half full after deleting the value
B-Trees: Deletion from a leaf
A. If the leaf is at least half full after deleting the desired value
▪ All remaining keys in the leaf with larger values shift left one space
▪ This fills the gap made by the deletion

Example: Delete 6
Root: [16]
Internal nodes: [3 8] [22 25]
Leaves: [1 2] [5 6 7] [13 14 15] [18 20] [23 24] [27 37]
After deleting 6 the leaf holds [5 7]: 2 keys, so the leaf is still at least half full
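Sub-case A can be sketched in a few lines. This is only an illustration, assuming a leaf is modelled as a sorted Python list and the minimum is 2 keys. The name `delete_from_leaf` and its parameters are hypothetical.

```python
def delete_from_leaf(leaf: list, key: int, min_keys: int = 2) -> bool:
    """Remove key from a sorted leaf in place.

    Returns True if the leaf is still at least half full (sub-case A),
    False if it has underflowed (sub-case B).
    """
    i = leaf.index(key)              # raises ValueError if key is absent
    # Shift every larger key one slot to the left, overwriting the gap.
    for j in range(i, len(leaf) - 1):
        leaf[j] = leaf[j + 1]
    leaf.pop()                       # drop the now-duplicated last slot
    return len(leaf) >= min_keys

leaf = [5, 6, 7]
still_ok = delete_from_leaf(leaf, 6)
print(leaf, still_ok)  # [5, 7] True
```

The explicit loop mirrors the "shift left one space" step; in everyday Python, `leaf.remove(key)` has the same effect.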
B-Trees: Deletion from a leaf
▪ The two sub-cases to consider when deleting from a leaf
A. The leaf is at least half full after deleting the value
▪ We’ve just looked at this one
B. The leaf is less than half full after deleting the value (“underflow”)
▪ There are two possible situations to take care of here
1. The leaf has a left or right sibling with a number of keys exceeding the
minimum required (i.e. at least one sibling is more than half full)
2. Both siblings of the leaf have a number of keys that do not exceed the
minimum required (i.e. both siblings are at most half full)
B-Trees: Deletion from a leaf
B. The leaf is less than half full after deleting the value
1) The node has a left or right sibling with a number of keys
exceeding the minimum required
▪ Choose one of the siblings satisfying this requirement (if only one sibling
satisfies the requirement, that sibling must be chosen)
▪ Combine all keys from the leaf containing the deleted value, the leaf’s
chosen sibling, and the leaf’s parent key into a combined list
▪ Replace the leaf’s parent key with the middle key in the combined list
▪ Redistribute all remaining keys in the combined list between the leaf
and its chosen sibling

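Example: Delete 7
Before: root [16]; internal nodes [3 8] [22 25]; leaves [1 2] [5 7] [13 14 15] [18 20] [23 24] [27 37]
After deleting 7 the leaf holds [5]: 1 key, so the leaf is less than half full
Only the right sibling [13 14 15] is more than half full
Combined list: 5, 8, 13, 14, 15; the middle key is at position 𝑖 = ⌈𝑛/2⌉ = 3, i.e. 13
After: root [16]; internal nodes [3 13] [22 25]; leaves [1 2] [5 8] [14 15] [18 20] [23 24] [27 37]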
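The redistribution steps above can be sketched as a single function. This is an illustrative sketch, assuming leaves are sorted lists and the middle key is taken at position ⌈n/2⌉ of the combined list. The name `redistribute` is hypothetical.

```python
def redistribute(left: list, parent_key: int, right: list):
    """Rebalance two adjacent sibling leaves through their parent key.

    Returns (new_left, new_parent_key, new_right).
    """
    combined = left + [parent_key] + right   # already in sorted order
    mid = (len(combined) - 1) // 2           # 0-based index of key ceil(n/2)
    return combined[:mid], combined[mid], combined[mid + 1:]

# Delete 7 example: leaf [5], parent key 8, right sibling [13 14 15]
print(redistribute([5], 8, [13, 14, 15]))  # ([5, 8], 13, [14, 15])
```

The same function works when the chosen sibling is on the left: pass the sibling as `left` and the underflowing leaf as `right`.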
B-Trees: Deletion from a leaf
B. The leaf is less than half full after deleting the value
2) All siblings of the leaf have a number of keys that do not
exceed the minimum required
▪ Choose either one of the siblings (if the leaf is the leftmost or rightmost
child of its parent, only one sibling can be chosen)
▪ Take all keys from the leaf containing the deleted value, the leaf’s
chosen sibling, and the leaf’s parent key, and redistribute them into
either the leaf or its sibling (whichever is the leftmost node)
▪ Discard the leaf or its sibling that is empty after the redistribution, and
shift the keys in the parent to fill the gap left by the leaf’s parent key
▪ This may cause the parent to underflow.
If so, treat the parent as a leaf and repeat the deletion algorithm on it

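Example: Delete 8
Before: root [16]; internal nodes [3 13] [22 25]; leaves [1 2] [5 8] [14 15] [18 20] [23 24] [27 37]
After deleting 8 the leaf holds [5]: 1 key, so the leaf is less than half full
Neither sibling ([1 2] or [14 15]) is more than half full; we’ll choose the right sibling for this example
Combined list: 5, 13, 14, 15, stored in the leftmost of the two nodes
After the merge: root [16]; internal nodes [3] [22 25]; leaves [1 2] [5 13 14 15] [18 20] [23 24] [27 37]
The internal node [3] now holds only 1 key: Underflow!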
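The merge step can be sketched for a parent whose children are leaves. This is a simplified illustration, assuming the parent's keys and its leaf children are kept in two parallel Python lists. The name `merge_children` is hypothetical.

```python
def merge_children(parent_keys: list, children: list, i: int) -> None:
    """Merge children[i] and children[i + 1] into the left node, in place.

    parent_keys[i] is the key separating the two children. It moves down
    into the merged node, and the remaining parent keys shift left.
    """
    separator = parent_keys.pop(i)       # later parent keys shift left
    right = children.pop(i + 1)          # discard the emptied right node
    children[i] = children[i] + [separator] + right

parent = [3, 13]
leaves = [[1, 2], [5], [14, 15]]         # 8 has already been removed
merge_children(parent, leaves, 1)
print(parent, leaves)  # [3] [[1, 2], [5, 13, 14, 15]]
```

After the call the parent holds a single key, which is exactly the underflow the example goes on to handle.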
B-Trees: Deletion from a leaf

We must now deal with the underflow
▪ Treat the underflowing parent [3] as a leaf and repeat the deletion algorithm
▪ It has 1 key, so the node is less than half full
▪ Its sibling [22 25] is not more than half full, so we must merge into a single node
▪ Combined list: 3, 16, 22, 25
▪ We must be sure to also move the pointers with their corresponding keys
▪ Note that the root and sibling nodes are now empty, so they must be discarded
Result: root [3 16 22 25]; leaves [1 2] [5 13 14 15] [18 20] [23 24] [27 37]
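When the underflowing node is not a leaf, the child pointers have to travel with the keys, and an emptied root is replaced by its only child. The sketch below illustrates this under assumed names (`Node`, `merge_into_left`, `shrink_root`) that are not from the slides.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)  # empty for a leaf

def merge_into_left(parent: Node, i: int) -> None:
    """Merge parent.children[i + 1] into parent.children[i]."""
    left, right = parent.children[i], parent.children[i + 1]
    left.keys += [parent.keys.pop(i)] + right.keys   # separator moves down
    left.children += right.children      # pointers move with their keys
    del parent.children[i + 1]           # discard the emptied sibling

def shrink_root(root: Node) -> Node:
    """If the root has no keys left, its single child becomes the root."""
    if not root.keys and root.children:
        return root.children[0]
    return root

leaves = [Node(k) for k in ([1, 2], [5, 13, 14, 15], [18, 20], [23, 24], [27, 37])]
root = Node([16], [Node([3], leaves[:2]), Node([22, 25], leaves[2:])])
merge_into_left(root, 0)
root = shrink_root(root)
print(root.keys)                         # [3, 16, 22, 25]
print([c.keys for c in root.children])
```

The tree loses one level here, which is the only way a B-tree shrinks in height under this scheme.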
B-Trees: Deletion from a leaf
▪ An exercise for you
▪ Start with the original tree from the previous example (below)
▪ Once again delete 8
▪ But this time merge the leaf node that the 8 is deleted from with its
left sibling, instead of its right sibling

Delete 8
Root: [16]
Internal nodes: [3 13] [22 25]
Leaves: [1 2] [5 8] [14 15] [18 20] [23 24] [27 37]
B-Trees: Deletion from a non-leaf
▪ To avoid problems with B-tree balancing, non-leaf deletion is
reduced to leaf deletion
▪ Replace deleted key with the value of one of the following
▪ Deleted key’s immediate predecessor: Largest key smaller than the deleted key
▪ Deleted key’s immediate successor: Smallest key larger than the deleted key
▪ How should we find the immediate predecessor and immediate successor?
▪ Immediate predecessor and successor will always be in a leaf node
▪ The immediate predecessor (or successor) is then deleted
▪ Because these are always in leaf nodes, this is a normal leaf deletion that follows
the procedure we’ve already discussed

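Example: Delete 16
Before: root [3 16 22 25]; leaves [1 2] [5 13 14 15] [18 20] [23 24] [27 37]
We’ll choose the immediate predecessor for this example: 15 replaces 16, giving root [3 15 22 25]
Now we have to delete the predecessor (leaf deletion): the leaf becomes [5 13 14]
3 keys after deletion, so the leaf is at least half full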
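One way to answer the question of how to find the immediate predecessor and successor: descend into the subtree to the left of the key and keep going right, or descend into the subtree to its right and keep going left. This is a minimal sketch, and the `Node` class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)  # empty for a leaf

def predecessor(node: Node, i: int) -> int:
    """Largest key smaller than node.keys[i]; node must be a non-leaf."""
    cur = node.children[i]               # subtree to the left of the key
    while cur.children:
        cur = cur.children[-1]           # keep following the last pointer
    return cur.keys[-1]

def successor(node: Node, i: int) -> int:
    """Smallest key larger than node.keys[i]; node must be a non-leaf."""
    cur = node.children[i + 1]           # subtree to the right of the key
    while cur.children:
        cur = cur.children[0]            # keep following the first pointer
    return cur.keys[0]

root = Node([3, 16, 22, 25],
            [Node([1, 2]), Node([5, 13, 14, 15]), Node([18, 20]),
             Node([23, 24]), Node([27, 37])])
i = root.keys.index(16)
print(predecessor(root, i), successor(root, i))  # 15 18
```

Both loops stop only at a node without children, which is why the replacement key always comes from a leaf.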
B-Trees: Deletion from a non-leaf
▪ The deletion can behave very differently depending on
whether the immediate predecessor or successor is chosen
▪ To illustrate this
▪ We’ll redo the previous example
▪ But choose the immediate successor, instead of the immediate predecessor

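Example: Delete 16
Before: root [3 16 22 25]; leaves [1 2] [5 13 14 15] [18 20] [23 24] [27 37]
We’ll choose the immediate successor for this example: 18 replaces 16, giving root [3 18 22 25]
Now we have to delete the successor (leaf deletion): the leaf becomes [20]
1 key after deletion, so the leaf is less than half full
Only the left sibling [5 13 14 15] is more than half full
Combined list: 5, 13, 14, 15, 18, 20; the middle key is at position 𝑖 = ⌈𝑛/2⌉ = 3, i.e. 14
After: root [3 14 22 25]; leaves [1 2] [5 13] [15 18 20] [23 24] [27 37]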
B+-Trees
▪ What if each leaf stores the address of the next leaf?
▪ In other words, the leaf nodes are connected into a linked list

Root: [50]
Internal nodes: [10 15 20] [70 80]
Leaves (linked left to right): [6 8] → [11 12] → [16 18] → [21 25 27] → [54 56] → [71 76] → [81 89]

▪ We can now print all the leaf data in order!


▪ But this only partly solves the problem
▪ What about the keys in the parents?
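The linked-leaf idea can be sketched as follows, assuming each leaf is a small object holding its keys and a reference to the next leaf. The names `Leaf`, `link` and `scan` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Leaf:
    keys: list
    next: Optional["Leaf"] = None        # address of the next leaf

def link(leaves: list) -> Leaf:
    """Chain the leaves left to right and return the first one."""
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    return leaves[0]

def scan(first: Leaf):
    """Yield every key stored in the leaves, in order."""
    leaf = first
    while leaf is not None:
        yield from leaf.keys
        leaf = leaf.next

first = link([Leaf(k) for k in ([6, 8], [11, 12], [16, 18], [21, 25, 27],
                                [54, 56], [71, 76], [81, 89])])
in_order = list(scan(first))
print(in_order)
```

The scan never touches the internal nodes, so the keys 10, 15, 20, 50, 70 and 80 are missing from the output. That is the gap the last two bullets point at.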
B+-Trees
▪ What if the parent keys are also stored in the leaves?

Root: [50]
Internal nodes: [10 15 20] [70 80]
Leaves (linked left to right): [6 8] → [10 11 12] → [15 16 18] → [20 21 25 27] → [50 54 56] → [70 71 76] → [80 81 89]

▪ We can now print all the data in order


▪ Is there another problem?
▪ Redundancy!
▪ A lot of memory can potentially be wasted, especially for big trees
B+-Trees
▪ What if the leaf nodes had a different size than non-leaf nodes?
▪ Size L instead of m − 1
▪ Min size: ⌈L/2⌉ instead of ⌈m/2⌉ − 1
Root: [50]
Internal nodes: [10 15 20] [70 80]
Leaves (linked left to right): [6 8 9] → [10 11 12] → [15 16 18] → [20 21 25 27 35] → [50 54 56] → [70 71 76] → [80 81 89 90]

▪ We can now size leaf and non-leaf nodes independently, e.g.
▪ Match non-leaf node size to keys per memory block
▪ Match leaf node size to records per memory block
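As a rough illustration of matching node sizes to a memory block, the sketch below computes m and L from byte sizes. Every number is an assumption chosen for the example (4096-byte blocks, 8-byte keys and pointers, 128-byte records), and the layout (m pointers plus m − 1 keys in a non-leaf node, L records plus one next-leaf pointer in a leaf) is a simplification.

```python
def internal_order(block_size: int, key_size: int, pointer_size: int) -> int:
    """Largest m such that m pointers and m - 1 keys fit in one block."""
    # m * pointer_size + (m - 1) * key_size <= block_size
    return (block_size + key_size) // (key_size + pointer_size)

def leaf_capacity(block_size: int, record_size: int, pointer_size: int) -> int:
    """Largest L such that L records plus one next-leaf pointer fit."""
    return (block_size - pointer_size) // record_size

m = internal_order(4096, key_size=8, pointer_size=8)
L = leaf_capacity(4096, record_size=128, pointer_size=8)
print(m, L)  # 256 31
```

Under these assumed sizes a non-leaf node fans out 256 ways while a leaf holds 31 records, which shows why sizing the two kinds of node independently is useful.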
B+-Tree Usage
▪ Are B+-trees actually used in practice?
▪ The ReiserFS, NSS, XFS, JFS, ReFS, and BFS filesystems
▪ Use B+-trees for metadata indexing
▪ APFS, BFS and NTFS
▪ Use B+-trees for directory indexing
▪ Relational database management systems like IBM DB2, Informix, Microsoft SQL Server, Oracle 8, Sybase ASE, and SQLite
▪ Use B+-trees for table indices
