B-Trees: Based On Materials by D. Frey and T. Anastasio
Large Trees
An Alternative to BSTs
Figure 1 - A BST with data stored in the leaves
Observations
pointers
◦ All search paths have the same length: ⌈lg n⌉
M-Way Trees
An M-Way Tree of Order 3
Figure 2 (below) shows the same data as Figure 1,
stored in an M-way tree of order 3. In this example
M = 3 and h = 2, so the tree can support 9 leaves,
although it contains only 8.
One way to look at the reduced path length with
increasing M is that the number of nodes to be visited
in searching for a leaf is smaller for large M.
We’ll see that when data is stored on the disk, each
node visited requires a disk access, so reducing the
nodes visited is essential.
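
The capacity and path-length claims above can be checked with a few lines of integer arithmetic. This is a minimal sketch: the class and method names (`MwayCapacity`, `maxLeaves`, `minHeight`) are illustrative and do not come from the slides, and it assumes a full M-way tree of height h has M^h leaves.

```java
// Illustrative helper; names are assumptions, not part of the original slides.
final class MwayCapacity {
    // Maximum number of leaves in an M-way tree of height h: M^h.
    static long maxLeaves(int m, int h) {
        long leaves = 1;
        for (int i = 0; i < h; i++) {
            leaves *= m;
        }
        return leaves;
    }

    // Smallest height h such that M^h >= leaves.
    // Uses integer arithmetic to avoid floating-point logarithm rounding.
    static int minHeight(int m, long leaves) {
        if (m < 2) {
            throw new IllegalArgumentException("m must be at least 2");
        }
        int h = 0;
        long capacity = 1;
        while (capacity < leaves) {
            capacity *= m;
            h++;
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(maxLeaves(3, 2)); // order 3, height 2: prints 9
        System.out.println(minHeight(3, 8)); // 8 leaves, order 3: prints 2
        System.out.println(minHeight(2, 8)); // same 8 leaves, binary tree: prints 3
    }
}
```

For the 8 leaves in this example, the order-3 tree needs a path of length 2 where a binary tree needs 3, and the gap widens as M grows.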
Figure 2 - An M-Way tree of order 3
Searching in an M-Way Tree
Searching an M-Way Tree
Search (MWayNode v, DataType element, boolean foundIt)
    if (v == NULL) return failure;
    if (v is a leaf)
        search the list of values looking for element
        if found, return success; otherwise return failure
    else (v is an interior node)
        search the keys to find which subtree element is in
        recursively search the subtree
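
The pseudocode above could be turned into Java roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: the node layout (`SearchNode` with `keys`, `children`, `values`), the use of `int` elements, and the boolean return value in place of success/failure are all hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node layout; the slides do not show the node's fields.
final class SearchNode {
    final List<Integer> keys = new ArrayList<>();        // interior node: sorted separator keys
    final List<SearchNode> children = new ArrayList<>(); // interior node: keys.size() + 1 subtrees
    final List<Integer> values = new ArrayList<>();      // leaf: stored elements

    boolean isLeaf() {
        return children.isEmpty();
    }

    static SearchNode leaf(int... elements) {
        SearchNode n = new SearchNode();
        for (int e : elements) {
            n.values.add(e);
        }
        return n;
    }

    static SearchNode interior(int[] keys, SearchNode... subtrees) {
        SearchNode n = new SearchNode();
        for (int k : keys) {
            n.keys.add(k);
        }
        for (SearchNode s : subtrees) {
            n.children.add(s);
        }
        return n;
    }
}

final class MwaySearch {
    // Returns true if element is stored in a leaf of the tree rooted at v.
    static boolean search(SearchNode v, int element) {
        if (v == null) {
            return false;
        }
        if (v.isLeaf()) {
            return v.values.contains(element);
        }
        // Skip every key <= element; the link at index i is the one to the
        // left of the first key greater than element (or the rightmost link).
        int i = 0;
        while (i < v.keys.size() && element >= v.keys.get(i)) {
            i++;
        }
        return search(v.children.get(i), element);
    }
}
```

With a small made-up tree, `SearchNode.interior(new int[]{10, 20}, SearchNode.leaf(1, 5), SearchNode.leaf(10, 15), SearchNode.leaf(20, 25))`, searching for 15 or 20 succeeds and searching for 7 fails.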
Search Algorithm: Traversing the M-way Tree
[Figure: an M-way tree with root keys 18 and 32; a callout on one link reads "Everything in this subtree is smaller than this key".]
In any interior node, find the first key > search item, and traverse the link to the left of that key. Search for any
item >= the last key in the subtree pointed to by the rightmost link. Continue until search reaches a leaf.
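
For large M, the "first key > search item" step can use binary search instead of a left-to-right scan. The sketch below is one possible way to do that, assuming the keys of a node are kept in a sorted `int` array with no duplicates; `ChildSelector` and `childIndex` are hypothetical names. `Arrays.binarySearch` returns the index of a matching key, or `-(insertionPoint) - 1` when the item is absent.

```java
import java.util.Arrays;

// Illustrative helper; assumes keys are sorted and distinct.
final class ChildSelector {
    // Returns the index of the link to follow for item:
    // the link to the left of the first key greater than item,
    // or the rightmost link if item >= the last key.
    static int childIndex(int[] keys, int item) {
        int pos = Arrays.binarySearch(keys, item);
        if (pos >= 0) {
            return pos + 1;  // item equals keys[pos]: follow the link to its right
        }
        return -pos - 1;     // insertion point = index of the first key > item
    }

    public static void main(String[] args) {
        int[] rootKeys = {18, 32}; // root keys from the figure above
        System.out.println(childIndex(rootKeys, 13)); // prints 0 (leftmost link)
        System.out.println(childIndex(rootKeys, 18)); // prints 1 (middle link)
        System.out.println(childIndex(rootKeys, 44)); // prints 2 (rightmost link)
    }
}
```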
Is it worth it?
An example
• Consider storing 10^7 items in a balanced
  BST and in an M-way tree of order 10.
• The height of the BST will be lg(10^7) ~ 24.
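
For comparison, the two heights can be worked out side by side. The BST value is the one stated above. The M-way value does not appear in the text as extracted here; it is derived on the assumption that search paths in an M-way tree of order M over n items have length about ⌈log_M n⌉.

```latex
\begin{align*}
\text{balanced BST:} \quad
  & \lceil \lg 10^7 \rceil = \lceil 7 \lg 10 \rceil = \lceil 23.25\ldots \rceil = 24 \\
\text{M-way tree, } M = 10\text{:} \quad
  & \lceil \log_{10} 10^7 \rceil = 7
\end{align*}
```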
A Generic M-Way Tree Node
public class MwayNode<Ktype, Dtype>
{
    // code for public interface here
    // constructors, accessors, mutators
}
B-Tree Definition
A B-Tree example
Root:      [22 36 48]
Interior:  [6 12 18]  [26 32]  [42]  [54]
Leaves:    [2 4] [6 8 10] [14 16] [18 19 20] | [22 24] [26 28 30] [32 34] | [38 40] [42 44 46] | [48 50 52] [54 56]
Designing a B-Tree
Student Record Example
Calculating L
Calculating M
Performance of our B-Tree
With M = 342, the height of our tree for N students
will be ⌈log_342 ⌈N/L⌉⌉.
For example, with N = 100,000 (about 10 times the
size of the UMBC student population), the height of
the tree with M = 342 would be no more than 2,
because ⌈log_342(25000)⌉ = 2.
So any student record can be found in 3 disk
accesses. If the root of the B-Tree is stored in
memory, then only 2 disk accesses are needed.
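
The arithmetic above can be reproduced with a short sketch that follows the same formula. The names `BTreeHeight`, `leavesNeeded`, `height`, and `diskAccesses` are hypothetical, and L = 4 is an assumption inferred from the figure of 25000 leaves for N = 100,000, since the slide that calculates L is not part of the text here.

```java
// Illustrative calculator for the height formula ceil(log_M(ceil(N / L))).
final class BTreeHeight {
    // Leaves needed for n records with at most l records per leaf: ceil(n / l).
    static long leavesNeeded(long n, int l) {
        return (n + l - 1) / l;
    }

    // Smallest h with m^h >= leaves, i.e. ceil(log_m(leaves)), using integers only.
    static int height(long leaves, int m) {
        if (m < 2) {
            throw new IllegalArgumentException("m must be at least 2");
        }
        int h = 0;
        long reach = 1;
        while (reach < leaves) {
            reach *= m;
            h++;
        }
        return h;
    }

    // One access per level on the path (height + 1 levels), minus the root if it is cached.
    static int diskAccesses(int height, boolean rootInMemory) {
        return rootInMemory ? height : height + 1;
    }

    public static void main(String[] args) {
        long leaves = leavesNeeded(100_000, 4);          // assumed L = 4
        int h = height(leaves, 342);
        System.out.println(leaves);                      // prints 25000
        System.out.println(h);                           // prints 2
        System.out.println(diskAccesses(h, false));      // prints 3
        System.out.println(diskAccesses(h, true));       // prints 2
    }
}
```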
Insertion of X in a B-Tree
• Search to find the leaf into which X should be
  inserted.
• If the leaf has room (fewer than L elements), insert X
  and write the leaf back to the disk.
• If the leaf is full, split it into two leaves, each with half
  of the elements. Insert X into the appropriate new leaf
  and write the new leaves back to the disk.
  ◦ Update the keys in the parent.
  ◦ If the parent node is already full, split it in the same manner.
  ◦ Splits may propagate all the way to the root, in which case
    the root is split (this is how the tree grows in height).
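
The leaf-level part of these steps can be sketched as follows. This is a simplified illustration under stated assumptions, not a full B-tree insert: a leaf is modelled as a sorted `List<Integer>`, disk writes and parent updates are left out, and the names `LeafInsert`, `Result`, and `insert` are hypothetical. The sketch inserts X first and then splits an overflowing leaf, which produces the same two leaves as splitting first and inserting X into the appropriate half.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative leaf-level insertion; parent updates and disk I/O are omitted.
final class LeafInsert {
    // Outcome of an insertion: newRight and separator are null when no split happened.
    static final class Result {
        final List<Integer> newRight; // upper half moved to a new leaf
        final Integer separator;      // smallest element of newRight, the key for the parent

        Result(List<Integer> newRight, Integer separator) {
            this.newRight = newRight;
            this.separator = separator;
        }
    }

    // Inserts x into the sorted leaf, which may hold at most maxElems (L) elements.
    static Result insert(List<Integer> leaf, int x, int maxElems) {
        int pos = Collections.binarySearch(leaf, x);
        if (pos < 0) {
            pos = -pos - 1; // insertion point when x is not already present
        }
        leaf.add(pos, x);
        if (leaf.size() <= maxElems) {
            return new Result(null, null); // the leaf had room: no split
        }
        // Overflow: keep the lower half in place, move the upper half to a new leaf.
        int mid = leaf.size() / 2;
        List<Integer> right = new ArrayList<>(leaf.subList(mid, leaf.size()));
        leaf.subList(mid, leaf.size()).clear();
        return new Result(right, right.get(0));
    }
}
```

With L = 3, inserting 33 into a leaf holding 32 and 34 fits without a split, while inserting 35 afterwards splits the leaf into (32, 33) and (34, 35) and reports 34 as the new key for the parent.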
Insert 33 into this B-Tree
Root:      [22 36 48]
Interior:  [6 12 18]  [26 32]  [42]  [54]
Leaves:    [2 4] [6 8 10] [12 14 16] [18 19 20] | [22 24] [26 28 30] [32 34] | [36 38 40] [42 44 46] | [48 50 52] [54 56]
Inserting 33
After inserting 33
Root:      [22 36 48]
Interior:  [6 12 18]  [26 32]  [42]  [54]
Leaves:    [2 4] [6 8 10] [12 14 16] [18 19 20] | [22 24] [26 28 30] [32 33 34] | [36 38 40] [42 44 46] | [48 50 52] [54 56]
Now insert 35
After inserting 35
Root:      [22 36 48]
Interior:  [6 12 18]  [26 32 34]  [42]  [54]
Leaves:    [2 4] [6 8 10] [12 14 16] [18 19 20] | [22 24] [26 28 30] [32 33] [34 35] | [36 38 40] [42 44 46] | [48 50 52] [54 56]
Now insert 21
• 21 belongs in the 4th leaf of the 1st subtree
  (the leaf containing 18, 19, 20).
• Since the leaf is full, we split it and update the keys
  in the parent.
• However, the parent is also full, so it must be split
  and its parent (the root) updated.
• But this would give the root 5 subtrees, which is not
  allowed, so the root must also be split.
• This is the only way the tree grows in height.
After inserting 21
Root:      [36]
Level 2:   [18 22]  [48]
Level 3:   [6 12] [20] [26 32 34]  |  [42] [54]
Leaves:    [2 4] [6 8 10] [12 14 16] | [18 19] [20 21] | [22 24] [26 28 30] [32 33] [34 35] | [36 38 40] [42 44 46] | [48 50 52] [54 56]