
Week.

B-trees are a data structure designed for efficient storage and retrieval of data in paged memory systems, balancing the need for fast access and reduced disk operations. They maintain properties such as equal-sized leaves, a defined minimum degree, and specific rules for insertion and removal of keys. B-trees support basic operations like search, insert, and delete, ensuring that all leaves are at the same depth and nodes adhere to key limits.

Uploaded by

brkykybl
Copyright
© All Rights Reserved

CSE201-DATA STRUCTURES

B-Trees

1
MOTIVATION FOR B-TREES

• Two technologies provide memory capacity in a computer system:
  ◦ Primary (main) memory (silicon chips): Random Access Memory (RAM) or Read-Only Memory (ROM).
  ◦ Secondary storage: hard disk drives (HDDs, i.e. magnetic disks), solid-state drives (SSDs), USB flash drives, or other devices.

2
MOTIVATION FOR B-TREES

• Primary memory (main memory) is
  ◦ about 5 orders of magnitude (i.e., about $10^5$ times) faster,
  ◦ about 2 orders of magnitude (about 100 times) more expensive, and
  ◦ at least 2 orders of magnitude smaller in capacity
  than secondary storage. The speed gap is due to the mechanical operations involved in magnetic disks.

3
MOTIVATION FOR B-TREES

• During one disk read or disk write (4-8.5 ms for 7200 RPM disks), main memory (MM) can be accessed about $10^5$ times (about 100 ns per access).

Paging
• Paging is a memory-management function in which a computer stores data to and retrieves data from secondary storage, in fixed-size blocks called pages, for use in primary memory.
• Paging is used to reduce the time spent on disk accesses, since each access transfers a whole page.

4
MOTIVATION FOR B-TREES

• The B-tree data structure is designed for storing these equal-sized pages in MM.
• B-trees, with their equal-sized leaves (each as big as a page), are suitable data structures for storing paged data and performing regular operations on it.

5
B-TREES
• A B-tree is a rooted tree with the following properties:
• Every node x has the following fields:
  ◦ n[x], the number of keys currently stored in x;
  ◦ the n[x] keys themselves, in non-decreasing order, so that
    $key_1[x] \le key_2[x] \le \dots \le key_{n[x]}[x]$;
  ◦ leaf[x], a boolean value that is true if x is a leaf.

6
B-TREES

• Each internal node x has n[x]+1 pointers, $c_1[x], \dots, c_{n[x]+1}[x]$, to its children. Leaf nodes have no children, hence no pointers.
• The keys separate the ranges of keys stored in each subtree: if $k_i$ is any key stored in the subtree with root $c_i[x]$, then
  $k_1 \le key_1[x] \le k_2 \le key_2[x] \le \dots \le key_{n[x]}[x] \le k_{n[x]+1}$.
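As a minimal sketch of the node layout described above, the fields can be modelled in Python as follows. The names `BTreeNode`, `all_keys` and `keys_separate_subtrees` are illustrative and not from the slides. Two assumptions are made for brevity: indices are 0-based (the slides count from 1), and `leaf` is derived from the absence of children instead of being stored as a separate field.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BTreeNode:
    # keys[0..n-1] play the role of key_1[x] .. key_n[x][x]
    keys: List[int] = field(default_factory=list)
    # children[0..n] play the role of c_1[x] .. c_{n[x]+1}[x]; empty for a leaf
    children: List["BTreeNode"] = field(default_factory=list)

    @property
    def n(self) -> int:
        return len(self.keys)

    @property
    def leaf(self) -> bool:
        # Assumption: a node is a leaf exactly when it has no child pointers.
        return not self.children


def all_keys(x: BTreeNode) -> List[int]:
    """Return every key stored in the subtree rooted at x, in in-order sequence."""
    if x.leaf:
        return list(x.keys)
    out: List[int] = []
    for i, child in enumerate(x.children):
        out.extend(all_keys(child))
        if i < x.n:
            out.append(x.keys[i])
    return out


def keys_separate_subtrees(x: BTreeNode) -> bool:
    """Check k_1 <= key_1[x] <= k_2 <= ... <= key_n[x] <= k_{n+1} recursively."""
    if any(a > b for a, b in zip(x.keys, x.keys[1:])):
        return False  # keys are not in non-decreasing order
    if x.leaf:
        return True
    if len(x.children) != x.n + 1:
        return False  # an internal node needs exactly n[x]+1 children
    for i, child in enumerate(x.children):
        sub = all_keys(child)
        if i > 0 and any(k < x.keys[i - 1] for k in sub):
            return False
        if i < x.n and any(k > x.keys[i] for k in sub):
            return False
        if not keys_separate_subtrees(child):
            return False
    return True
```

This check recomputes subtree contents at every level, so it is only meant for small illustrative trees, not as an efficient validator.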

7
B-TREES

• All leaves have the same depth, h, equal to the tree's height.
• There are lower and upper bounds on the number of keys a node may contain. These bounds can be expressed in terms of a fixed integer t ≥ 2 called the minimum degree of the B-tree.

8
B-TREES

• Lower limits
  ◦ All nodes but the root have at least t-1 keys.
  ◦ Every internal node but the root has at least t children.
  ◦ A non-empty tree's root must have at least one key.

9
B-TREES

• Upper limits
  ◦ Every node can contain at most 2t-1 keys.
  ◦ Every internal node can have at most 2t children.
  ◦ A node is defined to be full if it has exactly 2t-1 keys.

• For a B-tree of minimum degree t ≥ 2 that stores n ≥ 1 keys, the height h satisfies
  $h \le \log_t \frac{n+1}{2}$
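The height bound can be checked numerically. The sketch below relies on the standard counting argument, which the slides do not spell out: a tree of height h has at least 1 key in the root and at least $2t^{i-1}$ nodes at depth i, each with at least t-1 keys, which gives at least $2t^h - 1$ keys in total. The function names are illustrative.

```python
def min_keys_for_height(t: int, h: int) -> int:
    """Fewest keys a B-tree of minimum degree t and height h can hold: 2*t^h - 1."""
    return 2 * t ** h - 1


def max_height(n: int, t: int) -> int:
    """Largest h with 2*t^h - 1 <= n, i.e. floor(log_t((n + 1) / 2)).

    Uses integer arithmetic only, to avoid floating-point rounding in log().
    """
    if n < 1 or t < 2:
        raise ValueError("need n >= 1 keys and minimum degree t >= 2")
    h = 0
    while min_keys_for_height(t, h + 1) <= n:
        h += 1
    return h
```

For instance, with t = 100 a tree holding one million keys has height at most 2 under this bound, so a search touches at most three nodes (root plus two levels).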

10
BASIC OPERATIONS ON B-TREES

• B-tree search
• B-tree insert
• B-tree removal

11
SEARCH IN B-TREES

• Similar to search in BSTs, except that at each node x a multi-way ((n[x]+1)-way) decision is made instead of a binary one.
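A minimal sketch of this multi-way search, assuming a simple illustrative `BTreeNode` class with 0-based `keys` and `children` lists (these names are not from the slides). Within a node, the branch is chosen here with a binary search from Python's `bisect` module; a linear scan over the keys would work equally well.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys if keys is not None else []
        self.children = children if children is not None else []

    @property
    def leaf(self):
        # Assumption: a node without children is a leaf.
        return not self.children


def btree_search(x, k):
    """Return (node, index) such that node.keys[index] == k, or None if k is absent."""
    i = bisect_left(x.keys, k)  # first position whose key is >= k
    if i < len(x.keys) and x.keys[i] == k:
        return x, i
    if x.leaf:
        return None
    # All keys in children[i] lie between keys[i-1] and keys[i], so descend there.
    return btree_search(x.children[i], k)
```

Usage: on a root with keys `[50, 100]` and three leaf children, searching for a key in the middle leaf makes one three-way decision at the root and then scans that leaf.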

12
INSERTION IN B-TREES

• Insertion into a B-tree is more complicated than insertion into a BST, since creating a new node to hold the new key may violate the B-tree properties.
• Instead, the key is put into an existing leaf node x if x is not full.
• If x is full, a split is applied: the full node (with 2t-1 keys) is split at its median key, $key_t[x]$, into two nodes with t-1 keys each.
• $key_t[x]$ moves up into the parent of x and marks the split point between the two new nodes.
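The split step can be sketched as follows. This is an illustrative version that assumes the parent is not full and uses 0-based lists, so the median $key_t[x]$ of the slides is `keys[t - 1]` here. The names `BTreeNode` and `split_child` are hypothetical.

```python
class BTreeNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys if keys is not None else []
        self.children = children if children is not None else []


def split_child(parent, i, t):
    """Split the full child parent.children[i] around its median key."""
    y = parent.children[i]
    if len(y.keys) != 2 * t - 1:
        raise ValueError("only a full node (2t-1 keys) is split")
    median = y.keys[t - 1]
    # The new right sibling takes the t-1 largest keys (and the last t children).
    z = BTreeNode(keys=y.keys[t:], children=y.children[t:])
    # The original node keeps the t-1 smallest keys (and the first t children).
    y.keys = y.keys[: t - 1]
    y.children = y.children[:t]
    # The median moves up and separates y from z in the parent.
    parent.keys.insert(i, median)
    parent.children.insert(i + 1, z)
```

For a leaf, both `children` slices are empty, so the same code covers leaves and internal nodes.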

13
INSERTION IN B-TREES

• A single-pass insertion starts at the root and traverses down to the leaf into which the key is to be inserted.
• On the way down, every full node encountered is split, including a full leaf. This guarantees that whenever a node is split, its parent has an available position for the median key that moves up.
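A minimal sketch of single-pass insertion under the rules above, assuming 0-based lists and distinct integer keys. The class and function names (`BTreeNode`, `BTree`, `split_child`, `inorder`) are illustrative and not taken from the slides.

```python
from bisect import bisect_right, insort


class BTreeNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys if keys is not None else []
        self.children = children if children is not None else []

    @property
    def leaf(self):
        return not self.children


def split_child(parent, i, t):
    """Split the full child parent.children[i]; its median moves up into parent."""
    y = parent.children[i]
    z = BTreeNode(keys=y.keys[t:], children=y.children[t:])
    parent.keys.insert(i, y.keys[t - 1])
    parent.children.insert(i + 1, z)
    y.keys = y.keys[: t - 1]
    y.children = y.children[:t]


class BTree:
    def __init__(self, t):
        self.t = t
        self.root = BTreeNode()

    def insert(self, k):
        t = self.t
        if len(self.root.keys) == 2 * t - 1:
            # A full root is split first; this is the only way the tree grows taller.
            self.root = BTreeNode(children=[self.root])
            split_child(self.root, 0, t)
        x = self.root
        while not x.leaf:
            i = bisect_right(x.keys, k)
            if len(x.children[i].keys) == 2 * t - 1:
                split_child(x, i, t)  # x is known to be non-full, so this fits
                if k > x.keys[i]:
                    i += 1  # k belongs to the new right half
            x = x.children[i]
        insort(x.keys, k)  # x is a non-full leaf at this point


def inorder(x):
    """Return all keys of the subtree rooted at x in sorted order."""
    if x.leaf:
        return list(x.keys)
    out = []
    for i, child in enumerate(x.children):
        out += inorder(child)
        if i < len(x.keys):
            out.append(x.keys[i])
    return out
```

Usage: with t = 2, inserting the keys 1 through 10 in increasing order yields a root holding the single key 4, with internal children holding `[2]` and `[6, 8]`.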

14
B-TREES

• Check the animation at this link:
  https://people.ksp.sk/~kuko/gnarley-trees/Btree.html

15
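INSERTION IN B-TREES

• The order: in these examples, the maximum number of keys that a node may hold, i.e. the key count of a full node. (Some texts instead define the order as the maximum number of children an internal node may have.)
• The order of the following tree is 3.
• A full node has 2t-1 keys.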

16
INSERTION IN B-TREES
• The number of keys in a full node is the order of the tree.
• A full node has 2t-1 keys: 2t-1 = 3, so t = 2.
• Remember:
  ◦ All nodes but the root have at least t-1 keys: 1 in this case.
  ◦ Every internal node but the root has at least t children: 2 in this case.

17
INSERTION IN B-TREES

• Insert 55

18
INSERTION IN B-TREES

19
INSERTION IN B-TREES

• Insert 90

20
INSERTION IN B-TREES

• Insert 90

21
INSERTION IN B-TREES

22
INSERTION IN B-TREES

• Insert 999

23
INSERTION IN B-TREES

• Insert 999

24
INSERTION IN B-TREES

25
INSERTION IN B-TREES

• Insert 400 and 200

26
INSERTION IN B-TREES

• Insert 200

27
INSERTION IN B-TREES

28
REMOVING A KEY FROM A B-TREE

• Removal from a B-tree differs from insertion in that a key may be removed from any node, not just from a leaf.
• Just as the insertion algorithm splits every full node on the path down to the target leaf, a recursive removal algorithm can be written to ensure that, for any recursive call to removal on a node x, the number of keys in x is at least the minimum degree t. This is one more than the lower limit of t-1, so a key can be removed from x without violating that limit.

29
VARIOUS CASES OF REMOVING A KEY FROM A B-TREE

1. If the key k is in node x and x is a leaf, remove the key k from x.
2. If the key k is in node x and x is an internal node, then
   a. If the child y that precedes k in node x has at least t keys, then find the predecessor k' of k in the subtree rooted at y. Recursively delete k', and replace k by k' in x. Finding k' and deleting it can be performed in a single downward pass.

30
VARIOUS CASES OF REMOVING A KEY FROM A B-TREE

   b. Symmetrically, if the child z that follows k in node x has at least t keys, then find the successor k' of k in the subtree rooted at z. Recursively delete k', and replace k by k' in x. Finding k' and deleting it can be performed in a single downward pass.
   c. Otherwise, if both y and z have only t-1 keys, merge k and all of z into y, so that x loses both k and the pointer to z, and y now contains 2t-1 keys. Free z and recursively delete k from y.

31
VARIOUS CASES OF REMOVING A KEY FROM A B-TREE

3. If k is not present in internal node x, determine the root $c_i[x]$ of the subtree that must contain k, if k exists in the tree. If $c_i[x]$ has only t-1 keys, execute step 3a or 3b as necessary to guarantee that we descend to a node containing at least t keys. Then finish by recursing on the appropriate child of x.

32
VARIOUS CASES OF REMOVING A KEY FROM A B-TREE

   a. If $c_i[x]$ has only t-1 keys but has an immediate sibling with at least t keys, give $c_i[x]$ an extra key by moving a key from x down into $c_i[x]$, moving a key from $c_i[x]$'s immediate left or right sibling up into x, and moving the appropriate child pointer from the sibling into $c_i[x]$.
   b. If $c_i[x]$ and both of $c_i[x]$'s immediate siblings have t-1 keys, merge $c_i[x]$ with one sibling, which involves moving a key from x down into the new merged node to become the median key for that node.
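Cases 1, 2a-2c and 3a-3b can be put together in one recursive routine. The following is a minimal sketch under stated assumptions: keys are distinct integers, lists are 0-based, and the names `Node`, `_merge`, `_fill`, `_delete` and `btree_delete` are illustrative, not from the slides. The caller must use the returned root, because the tree shrinks in height when a merge empties the old root.

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys if keys is not None else []
        self.children = children if children is not None else []

    @property
    def leaf(self):
        return not self.children


def _merge(x, i):
    """Merge x.keys[i] and all of child i+1 into child i (cases 2c and 3b)."""
    y, z = x.children[i], x.children.pop(i + 1)
    y.keys += [x.keys.pop(i)] + z.keys
    y.children += z.children


def _fill(x, i, t):
    """Make sure child i of x has at least t keys; return the index to descend into."""
    c = x.children[i]
    if len(c.keys) >= t:
        return i
    if i > 0 and len(x.children[i - 1].keys) >= t:  # 3a: borrow via left sibling
        left = x.children[i - 1]
        c.keys.insert(0, x.keys[i - 1])
        x.keys[i - 1] = left.keys.pop()
        if not left.leaf:
            c.children.insert(0, left.children.pop())
        return i
    if i + 1 < len(x.children) and len(x.children[i + 1].keys) >= t:  # 3a: right sibling
        right = x.children[i + 1]
        c.keys.append(x.keys[i])
        x.keys[i] = right.keys.pop(0)
        if not right.leaf:
            c.children.append(right.children.pop(0))
        return i
    if i > 0:  # 3b: merge with the left sibling
        _merge(x, i - 1)
        return i - 1
    _merge(x, i)  # 3b: merge with the right sibling
    return i


def _delete(x, k, t):
    i = bisect_left(x.keys, k)
    found = i < len(x.keys) and x.keys[i] == k
    if x.leaf:
        if found:
            x.keys.pop(i)  # case 1
        return
    if found:
        y, z = x.children[i], x.children[i + 1]
        if len(y.keys) >= t:  # case 2a: replace k by its predecessor
            p = y
            while not p.leaf:
                p = p.children[-1]
            x.keys[i] = p.keys[-1]
            _delete(y, x.keys[i], t)
        elif len(z.keys) >= t:  # case 2b: replace k by its successor
            s = z
            while not s.leaf:
                s = s.children[0]
            x.keys[i] = s.keys[0]
            _delete(z, x.keys[i], t)
        else:  # case 2c: merge k and z into y, then delete k from y
            _merge(x, i)
            _delete(y, k, t)
    else:  # case 3: k, if present, is in the subtree of child i
        j = _fill(x, i, t)
        _delete(x.children[j], k, t)


def btree_delete(root, k, t):
    """Delete k from the tree and return the (possibly new) root."""
    _delete(root, k, t)
    if not root.keys and root.children:
        return root.children[0]  # a merge emptied the root: height shrinks by one
    return root
```

For simplicity this sketch looks up the predecessor or successor first and then deletes it with a second descent, whereas the slides note that both can be done in a single downward pass.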

33
REMOVAL EXAMPLE

• Remove 999

34
REMOVAL EXAMPLE

• Remove 999

35
REMOVAL EXAMPLE

• Remove 999

36
REMOVAL EXAMPLE

• Remove 50 and 90

37
REMOVAL EXAMPLE

• Remove 50 and 90

38
REMOVAL EXAMPLE

• Remove 47

39
REMOVAL EXAMPLE

• Remove 47

40
REMOVAL EXAMPLE

• Remove 47

41
REMOVAL EXAMPLE

• Remove 159

42
REMOVAL EXAMPLE

• Remove 159

43
REMOVAL EXAMPLE

• Remove 159

44
REMOVAL EXAMPLE

• Remove 80

45
REMOVAL EXAMPLE

• Remove 80

46
REMOVAL EXAMPLE

• Remove 55

47
REMOVAL EXAMPLE

• Remove 55

48
REMOVAL EXAMPLE

• Remove 400

49
REMOVAL EXAMPLE

• Remove 400

50
REMOVAL EXAMPLE

• Remove 400

51
INSERTION IN B-TREES

• The order of the following tree is 5.
• A full node has 2t-1 keys: 2t-1 = 5, so t = 3.

52
INSERTION IN B-TREES
• A full node has 2t-1 keys: 2t-1 = 5, so t = 3.
• All nodes but the root have at least t-1 = 2 keys.
• Every internal node but the root has at least t = 3 children.
• A node is defined to be full if it has exactly 2t-1 = 5 keys.
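The limits quoted for t = 3 here, and for t = 2 in the earlier example, all follow from the minimum degree. A small sketch with illustrative helper names (not from the slides) makes the arithmetic explicit:

```python
def btree_bounds(t):
    """Key and child limits of a B-tree node for minimum degree t."""
    if t < 2:
        raise ValueError("minimum degree t must be at least 2")
    return {
        "min_keys": t - 1,      # every node except the root
        "max_keys": 2 * t - 1,  # a node with this many keys is full
        "min_children": t,      # every internal node except the root
        "max_children": 2 * t,
    }


def key_count_ok(num_keys, t, is_root=False):
    """Check a node's key count; the root of a non-empty tree needs only one key."""
    bounds = btree_bounds(t)
    low = 1 if is_root else bounds["min_keys"]
    return low <= num_keys <= bounds["max_keys"]
```

With t = 3 this gives between 2 and 5 keys and between 3 and 6 children per non-root internal node, matching the values listed above.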

53
INSERTION IN B-TREES
• Insert 300

54
INSERTION IN B-TREES
• Insert 300

55
REMOVAL EXAMPLE
• Delete 865

56
REMOVAL EXAMPLE
• Delete 865

57
REMOVAL EXAMPLE
• Delete 505

58
REMOVAL EXAMPLE
• Delete 505

59
RECOMMENDATION

• B-tree visualization:
  https://people.ksp.sk/~kuko/gnarley-trees/Btree.html

60
