CSE201-DATA STRUCTURES
B-Trees
MOTIVATION FOR B-TREES
Two technologies provide memory capacity in a
computer system:
Primary (main) memory (silicon chips): Random Access
Memory (RAM) or Read Only Memory (ROM)
Secondary storage: hard disk drives (HDDs, magnetic
disks), solid-state drives (SSDs), USB flash drives
or other devices.
Primary memory (main memory) is
about 5 orders of magnitude (i.e., about $10^5$
times) faster,
about 2 orders of magnitude (about 100 times)
more expensive, and
at least 2 orders of magnitude smaller in
capacity
than secondary storage. The speed gap is due to the
mechanical operations involved in
magnetic disks.
During one disk read or disk write (4-8.5 ms for 7200
RPM disks), main memory (MM) can be accessed about $10^5$
times (100 ns per access).
Paging
Paging is a memory-management function by which a
computer transfers data between secondary storage and
primary storage in fixed-size blocks called pages.
Paging is used to reduce the time spent on disk accesses.
The B-tree data structure
is designed for storing these
equal-sized pages in
MM.
B-trees, with their
equal-sized leaves
(as big as a page),
are suitable data
structures for storing
paged data and
performing regular
operations on it.
B-TREES
A B-tree is a rooted tree with the following
properties:
Every node x has the following fields:
n[x], the number of keys currently stored in x.
the n[x] keys themselves, in non-decreasing
order, so that
$key_1[x] \le key_2[x] \le \dots \le key_{n[x]}[x]$,
leaf[x], a boolean value, true if x is a leaf.
Each internal node has n[x]+1 pointers, $c_1[x], \dots,
c_{n[x]+1}[x]$, to its children. Leaf nodes have no
children, hence no pointers!
The keys separate the ranges of keys stored in
each subtree: if $k_i$ is any key stored in the subtree
with root $c_i[x]$, then
$k_1 \le key_1[x] \le k_2 \le key_2[x] \le \dots \le key_{n[x]}[x] \le k_{n[x]+1}$.
All leaves have the same depth, h, equal to the
tree’s height.
There are lower and upper bounds on the number
of keys a node may contain. These bounds can be
expressed in terms of a fixed integer t ≥ 2 called
the minimum degree of the B-Tree.
Lower limits
All nodes but the root have at least t-1 keys.
Every internal node but the root has at least t children.
A non-empty tree’s root must have at least one key.
Upper limits
Every node can contain at most 2t-1 keys.
Every internal node can have at most 2t children.
A node is defined to be full if it has exactly 2t-1 keys.
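The properties listed so far can be checked mechanically. The following is a minimal Python sketch, not a definitive implementation. It assumes a hypothetical `Node` class holding a `keys` list and a `children` list (empty for a leaf) and a non-empty tree; `Node` and `check` are illustrative names, not part of any library.

```python
class Node:
    """Hypothetical B-tree node: sorted keys plus child pointers (none for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])

    @property
    def leaf(self):
        return not self.children


def check(x, t, lo=None, hi=None, is_root=True):
    """Return the height of the subtree rooted at x, or raise ValueError
    if a B-tree property is violated (assumes a non-empty tree)."""
    n = len(x.keys)
    if n > 2 * t - 1:                       # upper limit on keys
        raise ValueError("more than 2t-1 keys")
    if n < (1 if is_root else t - 1):       # lower limit on keys
        raise ValueError("too few keys")
    if x.keys != sorted(x.keys):
        raise ValueError("keys not in non-decreasing order")
    if any((lo is not None and k < lo) or (hi is not None and k > hi)
           for k in x.keys):
        raise ValueError("key outside the range allowed by the parent")
    if x.leaf:
        return 0
    if len(x.children) != n + 1:
        raise ValueError("internal node must have n[x]+1 children")
    bounds = [lo] + x.keys + [hi]           # key range of each child subtree
    heights = {check(c, t, bounds[i], bounds[i + 1], False)
               for i, c in enumerate(x.children)}
    if len(heights) != 1:                   # all leaves must share one depth
        raise ValueError("leaves at different depths")
    return heights.pop() + 1
```

For example, with t = 2 the tree `Node([40, 80], [Node([10, 20]), Node([50]), Node([90, 95, 99])])` passes the check, and `check` returns its height, 1.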
For a B-tree of minimum degree t ≥ 2 containing n ≥ 1
keys, the height h satisfies
$h \le \log_t \frac{n+1}{2}$
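The height bound $h \le \log_t \frac{n+1}{2}$ follows from counting the fewest keys a tree of height h can hold. A sketch of the standard argument, taking n as the number of keys: the root holds at least one key, there are at least $2t^{i-1}$ nodes at depth i ≥ 1, and each of those holds at least t-1 keys.

```latex
\begin{align*}
n &\ge 1 + (t-1)\sum_{i=1}^{h} 2t^{\,i-1}
   = 1 + 2(t-1)\,\frac{t^{h}-1}{t-1}
   = 2t^{h} - 1, \\
t^{h} &\le \frac{n+1}{2}
\quad\Longrightarrow\quad
h \le \log_t \frac{n+1}{2}.
\end{align*}
```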
BASIC OPERATIONS ON B-TREES
B-tree search
B-tree insert
B-tree removal
SEARCH IN B-TREES
Similar to search in BSTs, except that at each
node x a multi-way ((n[x]+1)-way) decision is
made instead of a binary one.
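The multi-way decision can be sketched in Python as follows. This is an illustrative sketch only: the `Node` class and the `search` function are hypothetical names, and a binary search (`bisect_left`) within each node is one possible way to pick the branch; a linear scan over the keys would work as well.

```python
from bisect import bisect_left


class Node:
    """Hypothetical B-tree node: sorted keys plus child pointers (none for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])


def search(x, k):
    """Return (node, index) such that node.keys[index] == k, or None if k is absent."""
    while True:
        # Index of the first key >= k; this is also the child to descend into.
        i = bisect_left(x.keys, k)
        if i < len(x.keys) and x.keys[i] == k:
            return x, i
        if not x.children:          # reached a leaf without finding k
            return None
        x = x.children[i]           # one of n[x]+1 possible branches
```

On the tree `Node([20, 90], [Node([10]), Node([40]), Node([95, 99])])`, searching for 40 descends into the middle child, while searching for 50 reaches the same leaf and returns `None`.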
INSERTION IN B-TREES
Insertion into a B-tree is more complicated than insertion into
a BST, since simply creating a new node to hold the new
key may violate the B-tree properties.
Instead, the key is put into a leaf node x if x is not full.
If x is full, a split is applied, which splits the full node
(with 2t-1 keys) at its median key, $key_t[x]$, into two nodes
with t-1 keys each.
$key_t[x]$ moves up into the parent of x and marks the
split point between the two new nodes.
A single-pass insertion starts at the root and
traverses down to the leaf into which the key is
to be inserted.
On the way down, every full node encountered is split,
including a full leaf. Splitting on the way down guarantees
that whenever a full node is split, its parent has a free
position for the median key.
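The single-pass insertion with splitting can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: `Node`, `split_child` and `insert` are hypothetical names, nodes live in memory rather than on disk pages, and list indices are 0-based, so the median $key_t$ is `keys[t - 1]`.

```python
from bisect import bisect_right, insort


class Node:
    """Hypothetical B-tree node (names are illustrative)."""
    def __init__(self, leaf=True):
        self.keys = []
        self.children = []
        self.leaf = leaf


def split_child(x, i, t):
    """Split the full child x.children[i] (2t-1 keys) at its median key."""
    y = x.children[i]
    z = Node(leaf=y.leaf)
    median = y.keys[t - 1]                  # key_t[y] in 1-based notation
    z.keys, y.keys = y.keys[t:], y.keys[:t - 1]
    if not y.leaf:
        z.children, y.children = y.children[t:], y.children[:t]
    x.keys.insert(i, median)                # median moves up into the parent
    x.children.insert(i + 1, z)


def insert(root, k, t):
    """Insert k in one downward pass; return the (possibly new) root."""
    if len(root.keys) == 2 * t - 1:         # full root: the tree grows in height
        new_root = Node(leaf=False)
        new_root.children.append(root)
        split_child(new_root, 0, t)
        root = new_root
    x = root
    while not x.leaf:
        i = bisect_right(x.keys, k)
        if len(x.children[i].keys) == 2 * t - 1:
            split_child(x, i, t)            # parent x is known not to be full
            if k > x.keys[i]:               # k belongs to the right half
                i += 1
        x = x.children[i]
    insort(x.keys, k)                       # the leaf reached is never full
    return root
```

With t = 2, inserting 1, 2, 3, 4 into an empty tree (`root = Node()`, then `root = insert(root, k, 2)` for each key) splits the full root [1, 2, 3] at its median 2, leaving a root [2] with leaves [1] and [3, 4].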
B-TREES
Check the animation at this link:
https://people.ksp.sk/~kuko/gnarley-trees/Btree.html
INSERTION IN B-TREES
The order: here, the maximum number of keys that
a node may hold, 2t-1. (Other texts define the order as the
maximum number of children of an internal node, 2t.)
The order of the example tree is 3.
A full node has 2t-1 keys.
The number of keys in a full node is the order of the tree.
A full node has 2t-1 keys: 2t-1 = 3 ⇒ t = 2.
Remember
All nodes but the root have at least t-1 keys (1 in this
case).
Every internal node but the root has at least t children
(2 in this case).
[Figures not reproduced: worked example inserting 55, 90, 999, 400 and 200, in that order.]
REMOVING A KEY FROM A B-TREE
Removal from a B-tree differs from insertion
in that a key may be removed from any node,
not just from a leaf.
Just as the insertion algorithm splits any full node
on the path down to the leaf into which the key is to be
inserted, a recursive removal algorithm can be
written to ensure that, for any call to removal on a
node x, the number of keys in x is at least the
minimum degree t (one more than the minimum t-1, so a
key can be removed without violating the lower limit).
VARIOUS CASES OF REMOVING A KEY FROM A B-TREE
1. If the key k is in node x and x is a leaf, remove
the key k from x.
2. If the key k is in node x and x is an internal
node, then
a. If the child y that precedes k in node x has at least t
keys, then find the predecessor k’ of k in the subtree
rooted at y. Recursively delete k’, and replace k by
k’ in x. Finding k’ and deleting it can be performed
in a single downward pass.
b. Symmetrically, if the child z that follows k in
node x has at least t keys, then find the successor
k’ of k in the subtree rooted at z. Recursively
delete k’, and replace k by k’ in x. Finding k’ and
deleting it can be performed in a single
downward pass.
c. Otherwise, if both y and z have only t-1 keys,
merge k and all of z into y so that x loses both k
and the pointer to z and y now contains 2t-1 keys.
Free z and recursively delete k from y.
3. If k is not present in internal node x,
determine the root $c_i[x]$ of the subtree that must
contain k, if k exists in the tree. If $c_i[x]$ has
only t-1 keys, execute step 3a or 3b as
necessary to guarantee that we descend to a
node containing at least t keys. Then finish
by recursing on the appropriate child of x.
a. If $c_i[x]$ has only t-1 keys but has an immediate
sibling with at least t keys, give $c_i[x]$ an extra key
by moving a key from x down into $c_i[x]$, moving a
key from $c_i[x]$’s immediate left or right sibling up
into x, and moving the appropriate child pointer
from the sibling into $c_i[x]$.
b. If $c_i[x]$ and both of $c_i[x]$’s immediate siblings have
t-1 keys, merge $c_i[x]$ with one sibling, which
involves moving a key from x down into the new
merged node to become the median key for that
node.
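Cases 1 to 3 can be sketched together in Python as follows. This is a sketch under stated assumptions, not a definitive implementation: keys are assumed distinct, nodes live in memory, and `Node`, `merge`, `delete` and `remove` are hypothetical names. The wrapper `remove` also handles the one situation in which the tree shrinks in height, when a merge leaves the root with no keys.

```python
from bisect import bisect_left


class Node:
    """Hypothetical B-tree node: sorted keys plus child pointers (none for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])

    @property
    def leaf(self):
        return not self.children


def merge(x, i):
    """Merge x.keys[i] and all of x.children[i+1] into x.children[i]."""
    y, z = x.children[i], x.children.pop(i + 1)
    y.keys += [x.keys.pop(i)] + z.keys
    y.children += z.children


def delete(x, k, t):
    """Remove k from the subtree rooted at x (x has >= t keys unless it is the root)."""
    i = bisect_left(x.keys, k)
    if i < len(x.keys) and x.keys[i] == k:
        if x.leaf:                                   # case 1
            x.keys.pop(i)
            return
        y, z = x.children[i], x.children[i + 1]
        if len(y.keys) >= t:                         # case 2a: use predecessor
            p = y
            while not p.leaf:
                p = p.children[-1]
            x.keys[i] = p.keys[-1]
            delete(y, x.keys[i], t)
        elif len(z.keys) >= t:                       # case 2b: use successor
            s = z
            while not s.leaf:
                s = s.children[0]
            x.keys[i] = s.keys[0]
            delete(z, x.keys[i], t)
        else:                                        # case 2c: merge, then recurse
            merge(x, i)
            delete(y, k, t)
        return
    if x.leaf:
        return                                       # k is not in the tree
    c = x.children[i]                                # case 3
    if len(c.keys) == t - 1:
        left = x.children[i - 1] if i > 0 else None
        right = x.children[i + 1] if i < len(x.keys) else None
        if left is not None and len(left.keys) >= t:     # 3a: borrow from left
            c.keys.insert(0, x.keys[i - 1])
            x.keys[i - 1] = left.keys.pop()
            if not left.leaf:
                c.children.insert(0, left.children.pop())
        elif right is not None and len(right.keys) >= t: # 3a: borrow from right
            c.keys.append(x.keys[i])
            x.keys[i] = right.keys.pop(0)
            if not right.leaf:
                c.children.append(right.children.pop(0))
        elif right is not None:                          # 3b: merge with right
            merge(x, i)
        else:                                            # 3b: merge with left
            merge(x, i - 1)
            c = left
    delete(c, k, t)


def remove(root, k, t):
    """Delete k and return the (possibly new) root."""
    delete(root, k, t)
    if not root.keys and root.children:              # root emptied by a merge
        return root.children[0]
    return root
```

With t = 2 and the tree `Node([40, 80], [Node([10, 20]), Node([50]), Node([90, 95, 99])])`, removing 50 exercises case 3a (the leaf [50] first borrows through the parent from its left sibling), and then removing 80 exercises case 2b (80 is replaced by its successor 90).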
REMOVAL EXAMPLE
[Figures not reproduced: worked example removing 999, 50 and 90, 47, 159, 80, 55 and 400, in that order.]
INSERTION IN B-TREES
The order of the following tree is 5.
A full node has 2t-1 keys: 2t-1 = 5 ⇒ t = 3.
With t = 3:
All nodes but the root have at least t-1 = 2 keys.
Every internal node but the root has at least t = 3
children.
A node is defined to be full if it has exactly 2t-1 = 5
keys.
[Figures not reproduced: worked example on this tree, inserting 300 and then deleting 865 and 505.]
RECOMMENDATION
B-tree visualization
https://people.ksp.sk/~kuko/gnarley-trees/Btree.html