Ch18 - B-Trees
B-tree
Defined by one parameter: t
Balanced n-ary tree
Each node contains between t-1 and 2t-1 keys/data
values (i.e. multiple data values per tree node)
keys/data are stored in sorted order
one exception: root can have < t-1 keys
Each internal node contains between t and 2t
children
the keys of a parent delimit the key ranges of its children
For example, if key[i] = 15 and key[i+1] = 25, then child[i+1]
must have keys between 15 and 25
all leaves have the same depth
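The properties above can be sketched as a small checker. This is only an illustration: the names BTreeNode and check are hypothetical (not from the slides), keys are assumed distinct, and a leaf is assumed to be a node with an empty child list.

```python
from dataclasses import dataclass, field


@dataclass
class BTreeNode:
    keys: list = field(default_factory=list)      # sorted keys
    children: list = field(default_factory=list)  # empty for a leaf


def check(node, t, lo=None, hi=None, is_root=True):
    """Return the depth of the leaves below node; raise AssertionError on a violation."""
    n = len(node.keys)
    assert n <= 2 * t - 1, "more than 2t - 1 keys"
    assert is_root or n >= t - 1, "fewer than t - 1 keys in a non-root node"
    assert node.keys == sorted(node.keys), "keys not in sorted order"
    assert all(lo is None or lo < k for k in node.keys), "key below parent's bound"
    assert all(hi is None or k < hi for k in node.keys), "key above parent's bound"
    if not node.children:
        return 0
    assert len(node.children) == n + 1, "internal node needs one more child than keys"
    bounds = [lo] + node.keys + [hi]  # child i must lie between bounds[i] and bounds[i + 1]
    depths = {check(c, t, bounds[i], bounds[i + 1], False)
              for i, c in enumerate(node.children)}
    assert len(depths) == 1, "leaves at different depths"
    return depths.pop() + 1


def leaves(*words):
    return [BTreeNode(list(w)) for w in words]


# A t = 2 tree with root keys G, N, T
root = BTreeNode(list("GNT"), [
    BTreeNode(["C"], leaves("A", "DEF")),
    BTreeNode(["K"], leaves("H", "LM")),
    BTreeNode(["Q"], leaves("P", "RS")),
    BTreeNode(["X"], leaves("W", "YZ")),
])
print(check(root, t=2))  # 2
```

Running the same check with t = 3 fails, because the one-key internal nodes fall below the t - 1 minimum.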
Example B-tree: t = 2
[G N T]
[C] [K] [Q] [X]
[A] [DEF] [H] [LM] [P] [RS] [W] [YZ]
Height of a B-tree
B-trees have a similar feeling to BSTs
In general?
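Minimum number of keys/values
depth 0 (root): min. 1 key per node, min. 1 node
depth i >= 1: min. t - 1 keys per node, min. 2t^(i-1) nodes
For a B-tree of height h holding n keys:
$$n \ge 1 + (t-1)\sum_{i=1}^{h} 2t^{i-1} = 1 + 2(t-1)\,\frac{t^{h}-1}{t-1} = 2t^{h} - 1$$
so,
$$t^{h} \le \frac{n+1}{2}, \qquad h \le \log_t \frac{n+1}{2}$$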
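As a quick numeric check of the bound, here is a minimal sketch. The function names min_keys and max_height are made up for illustration, and n >= 1 is assumed.

```python
def min_keys(t, h):
    """Fewest keys in a B-tree of height h: 1 + (t - 1) * sum_{i=1..h} 2 * t^(i-1)."""
    return 1 + (t - 1) * sum(2 * t ** (i - 1) for i in range(1, h + 1))


def max_height(t, n):
    """Largest h with 2 * t^h - 1 <= n, i.e. floor(log_t((n + 1) / 2)), in integer arithmetic."""
    h = 0
    while 2 * t ** (h + 1) - 1 <= n:
        h += 1
    return h


print(min_keys(2, 3))             # 15, which equals 2 * 2**3 - 1
print(max_height(2, 20))          # 3
print(max_height(1000, 10 ** 9))  # 2
```

With t = 1000, a billion keys fit in a tree of height at most 2, which is why a large t keeps the number of levels (and disk reads) small.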
Searching B-Trees
Find value k in B-Tree
[G N T]
[C] [K] [Q] [X]
[A] [DEF] [H] [LM] [P] [RS] [W] [YZ]
Searching B-Trees
Find value k in B-Tree node x
node x provides:
the number of keys it holds
key[i]: the i-th key
child[i]: the i-th child
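A minimal Python sketch of the search, under some assumptions: the Node class and the name btree_search are illustrative rather than the slides' own pseudocode, indices are 0-based, and a node with no children is treated as a leaf.

```python
class Node:
    def __init__(self, keys, children=()):
        self.keys = list(keys)          # sorted keys
        self.children = list(children)  # empty for a leaf


def btree_search(x, k):
    """Return (node, index) where k is stored, or None if k is not in the subtree of x."""
    i = 0
    while i < len(x.keys) and k > x.keys[i]:  # linear scan over the keys of x
        i += 1
    if i < len(x.keys) and k == x.keys[i]:
        return x, i
    if not x.children:                        # leaf reached: k is absent
        return None
    return btree_search(x.children[i], k)     # in a disk-based tree this is one disk read


root = Node("GNT", [
    Node("C", [Node("A"), Node("DEF")]),
    Node("K", [Node("H"), Node("LM")]),
    Node("Q", [Node("P"), Node("RS")]),
    Node("X", [Node("W"), Node("YZ")]),
])
node, i = btree_search(root, "R")
print(node.keys, i)              # ['R', 'S'] 0
print(btree_search(root, "B"))   # None
```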
Search example: R
[G N T]
[C] [K] [Q] [X]
[A] [DEF] [H] [LM] [P] [RS] [W] [YZ]
Root G N T: N < R < T; this is not a leaf node, so descend into the child between N and T
Node Q: R > Q; not a leaf node, so descend into the child to the right of Q
Leaf R S: R found
Search running time
How many calls to BTreeSearch?
O(height of the tree)
O(log_t n)
Disk accesses?
One for each call: O(log_t n)
Computational time?
O(t) keys per node
linear search within each node
O(t log_t n)
B-Tree insert
Starting at root, follow the search path down the tree
If the node is full (contains 2t - 1 keys)
split the keys into two nodes around the median value
Observations
Insertions always happen in the leaves
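The procedure above can be sketched in Python as follows. This is only a sketch under stated assumptions: Node, split_child, insert and levels are illustrative names, keys are distinct, every full node met on the way down is split before descending, and the median moves up into the parent.

```python
class Node:
    def __init__(self, keys=(), children=()):
        self.keys = list(keys)
        self.children = list(children)  # empty for a leaf


def split_child(parent, i, t):
    """Split the full node parent.children[i] around its median key."""
    full = parent.children[i]
    median = full.keys[t - 1]
    right = Node(full.keys[t:], full.children[t:])  # upper t - 1 keys
    full.keys = full.keys[:t - 1]                   # lower t - 1 keys stay in place
    full.children = full.children[:t]
    parent.keys.insert(i, median)                   # median moves up
    parent.children.insert(i + 1, right)


def insert(root, k, t):
    """Insert k and return the (possibly new) root."""
    if len(root.keys) == 2 * t - 1:                 # full root: the tree grows by one level
        root = Node(children=[root])
        split_child(root, 0, t)
    x = root
    while x.children:                               # follow the search path down
        i = sum(key < k for key in x.keys)
        if len(x.children[i].keys) == 2 * t - 1:
            split_child(x, i, t)
            if k > x.keys[i]:                       # k belongs right of the new median
                i += 1
        x = x.children[i]
    x.keys.insert(sum(key < k for key in x.keys), k)  # x is a leaf with room
    return root


def levels(root):
    """Keys of every node, grouped level by level (for inspection)."""
    out, frontier = [], [root]
    while frontier:
        out.append(["".join(n.keys) for n in frontier])
        frontier = [c for n in frontier for c in n.children]
    return out


root = Node()
for k in "GCNAHEKQMFW":
    root = insert(root, k, t=2)
print(levels(root))  # [['G'], ['C', 'KN'], ['A', 'EF', 'H', 'M', 'QW']]
```

Splitting full nodes on the way down means the parent always has room for the median, so the insert never has to walk back up the tree.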
Insertion: t = 2
GCNAHEKQMFWLTZDPRXYS
Tree after each insertion (nodes in brackets, levels separated by " / "):
G: [G]
C: [CG]
N: [CGN]
A: root [CGN] is full, split around G: [G] / [C] [N]
   then insert A: [G] / [AC] [N]
H: [G] / [AC] [HN]
E: [G] / [ACE] [HN]
K: [G] / [ACE] [HKN]
Q: leaf [HKN] is full, split around K: [GK] / [ACE] [H] [N]
   then insert Q: [GK] / [ACE] [H] [NQ]
M: [GK] / [ACE] [H] [MNQ]
F: leaf [ACE] is full, split around C: [CGK] / [A] [E] [H] [MNQ]
   then insert F: [CGK] / [A] [EF] [H] [MNQ]
W: root [CGK] is full, split around G: [G] / [C] [K] / [A] [EF] [H] [MNQ]
   leaf [MNQ] is full, split around N: [G] / [C] [KN] / [A] [EF] [H] [M] [Q]
   then insert W: [G] / [C] [KN] / [A] [EF] [H] [M] [QW]
Insertion: t = 3
Correctness of insert
Starting at root, follow search path down the tree
If the node is full (contains 2t - 1 keys), split the keys
around the median value into two nodes and add the
median value to the parent node
If the node is a leaf, insert it into the correct spot
When a node is split
How many disk accesses?
3 disk write operations
2 for the new nodes created by the split (one is reused, but
must be updated)
1 for the parent node to add median value
Runtime to split a node?
O(t) – iterating through the elements a few times since
they’re already in sorted order
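The write count can be made concrete with a small sketch. It assumes dictionary-based nodes and a caller-supplied disk_write callback; both are hypothetical stand-ins for real disk I/O, not part of the slides.

```python
def split_full_child(parent, i, t, disk_write):
    """Split parent['children'][i], which holds 2t - 1 keys; O(t) list work."""
    left = parent["children"][i]                       # reused node, keeps the lower keys
    right = {"keys": left["keys"][t:], "children": left["children"][t:]}
    median = left["keys"][t - 1]
    left["keys"] = left["keys"][:t - 1]
    left["children"] = left["children"][:t]
    parent["keys"].insert(i, median)                   # median moves into the parent
    parent["children"].insert(i + 1, right)
    for node in (left, right, parent):                 # two split halves plus the parent
        disk_write(node)


def leaf(keys):
    return {"keys": list(keys), "children": []}


parent = {"keys": ["G", "K"], "children": [leaf("ACE"), leaf("H"), leaf("MNQ")]}
writes = []
split_full_child(parent, 0, 2, writes.append)
print(parent["keys"], len(writes))  # ['C', 'G', 'K'] 3
```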
Review of Deletions
Deletion: t = 3
Running time of Deletion
O(log_t n) disk accesses