
Department of Electrical and Electronics Engineering
U20EST356
DATA STRUCTURES

Handled by
Mr. J. Muruganandham
Assistant Professor
Department of EEE
UNIT IV TREES

Trees: Basic Tree Terminologies. Different types of Trees:


Binary Tree – Threaded Binary Tree – Binary Search Tree –
Binary Tree Traversals – AVL Tree. Introduction to B-Tree
and B+ Tree.
Nature View of a Tree
leaves
branches
root
Computer Scientist’s View

root

leaves

branches
nodes
What is a Tree
• A tree is a finite nonempty set of elements.
• It is an abstract model of a hierarchical structure.
• It consists of nodes with a parent-child relation.
• Applications:
– Organization charts
– File systems
– Programming environments

Example organization chart: Computers”R”Us has children Sales,
Manufacturing and R&D; Sales has children US and International
(Europe, Asia, Canada); Manufacturing has children Laptops and
Desktops.
Tree Terminology
• Root: node without parent (A)
• Siblings: nodes that share the same parent
• Internal node: node with at least one child (A, B, C, F)
• External node (leaf): node without children (E, I, J, K, G, H, D)
• Ancestors of a node: parent, grandparent, great-grandparent, etc.
• Descendants of a node: child, grandchild, great-grandchild, etc.
• Depth of a node: number of ancestors
• Height of a tree: maximum depth of any node (3)
• Degree of a node: the number of its children
• Degree of a tree: the maximum degree among its nodes
• Subtree: tree consisting of a node and its descendants

Example tree: A has children B, C, D; B has children E, F;
C has children G, H; F has children I, J, K.
The marked subtree is the one rooted at F.
Forest
Tree Properties
Example tree: A has children B, C; B has children D, E; C has child F;
E has child G; F has children H, I.

Property                Value
Number of nodes
Height
Root node
Leaves
Interior nodes
Ancestors of H
Descendants of B
Siblings of E
Right subtree of A
Degree of this tree
Binary and non binary tree

Non-Binary tree
Binary tree
• A full binary tree (sometimes proper binary
tree or 2-tree) is a tree in which every node
other than the leaves has two children.

A complete binary tree is a binary tree in which
every level, except possibly the last, is
completely filled, and all nodes are as far left
as possible.
Binary tree representation
• Array representation
• Linked list representation
Array representation:
• The elements are stored in an array.
• For the element in position i (1-based indexing):
– Left child is in position 2i
– Right child is in position 2i + 1
– Parent is in position ⌊i/2⌋
Linked representation of binary tree

• Each element is represented by a node containing:
– Pointer to left subtree
– Data field
– Pointer to right subtree
Binary Tree Traversal Methods

• In a traversal of a binary tree, each element of


the binary tree is visited exactly once.
• During the visit of an element, all actions (make
a clone, display, evaluate the operator, etc.)
with respect to this element are taken.
Binary Tree Traversal Methods
• Preorder
• Inorder
• Postorder
• Level order
Preorder Traversal
void preOrder(treePointer ptr)
{
if (ptr != NULL)
{
visit(ptr);
preOrder(ptr->leftChild);
preOrder(ptr->rightChild);
}
}
Preorder Example (Visit = print)
a

b c

a b c
Preorder Example (Visit = print)
a

b c
f
d e
g h i j

a b d g h e i c f j
Preorder Of Expression Tree
/

* +

e f
+ -
a b c d

/ * + a b - c d + e f

Gives prefix form of expression!


Inorder Traversal

void inOrder(treePointer ptr)


{
if (ptr != NULL)
{
inOrder(ptr->leftChild);
visit(ptr);
inOrder(ptr->rightChild);
}
}
Inorder Example (Visit = print)
a

b c

b a c
Inorder Example (Visit = print)
a

b c
f
d e
g h i j

g d h b e i a f j c
Inorder By Projection (Squishing)
a

b c
f
d e
g h i j

g d h b e i a f j c
Inorder Of Expression Tree
/

* +

e f
+ -

a b c d

a + b * c - d / e + f

Gives infix form of expression (sans parentheses)!


Postorder Traversal

void postOrder(treePointer ptr)


{
if (ptr != NULL)
{
postOrder(ptr->leftChild);
postOrder(ptr->rightChild);
visit(ptr);
}
}
Postorder Example (Visit = print)
a

b c

b c a
Postorder Example (Visit = print)
a

b c
f
d e
g h i j

g h d i e b j f c a
Postorder Of Expression Tree
/

* +

e f
+ -

a b c d

a b + c d - * e f + /

Gives postfix form of expression!


Traversal Applications
a

b c
f
d e
g h i j

• Make a clone.
• Determine height.
• Determine number of nodes.
Level Order
Let ptr be a pointer to the tree root.
while (ptr != NULL)
{
visit node pointed at by ptr and put its children
on a FIFO queue;
if FIFO queue is empty, set ptr = NULL;
otherwise, delete a node from the FIFO queue
and call it ptr;
}
Level-Order Example (Visit = print)
a

b c
f
d e
g h i j

a b c d e f g h i j
Conversion of General Tree in to Binary Tree
Threaded Binary Trees
Given a binary tree with n nodes:
 the total number of links in the tree is 2n.
 Each node (except the root) has exactly one incoming arc, so
only n - 1 links point to nodes;
 the remaining n + 1 links are null.
One can use these null links to simplify some traversal processes.
A threaded binary search tree is a BST with unused links employed to
point to other tree nodes.
 Traversals (as well as other operations, such as backtracking) are
made more efficient.
A BST can be threaded with respect to inorder, preorder or postorder
successors.
Threaded Binary Trees
Given the following BST, thread it to facilitate inorder traversal:
H

E K

B F J L

A D G I M

The first node visited in an inorder traversal is the leftmost leaf, node A.
Since A has no right child, the next node visited in this traversal is its
parent, node B.

Use the right pointer of node A as a thread to parent B to make backtracking


easy.
Threaded Binary Trees
H

E K

B F J L

A D G I M

The thread from A to B is shown as the arrow in above diagram.


Threaded Binary Trees
The next node visited is C, and since its right pointer is null, it also can be
used as a thread to its parent D:

E K

B F J L

A D G I M

C
Threaded Binary Trees
Node D has a null right pointer which can be replaced with a pointer to D’s
inorder successor, node E:

E K

B F J L

A D G I M

C
Threaded Binary Trees
The next node visited with a null right pointer is G. Replace the null pointer
with the address of G’s inorder successor: H

E K

B F J L

A D G I M

C
Threaded Binary Trees
Finally, we replace:
first, the null right pointer of I with a pointer to its parent
and then, likewise, the null right pointer of J with a pointer to its parent

E K

B F J L

A D G I M

C
Threaded Tree Example
6
3 8
1 5 7 11
9 13
Threaded Tree Traversal
• We start at the leftmost node in the tree, print
it, and follow its right thread
• If we follow a thread to the right, we output
the node and continue to its right
• If we follow a link to the right, we go to the
leftmost node, print it, and continue
Threaded Tree Traversal
Output
6 1

3 8
1 5 7 11
9 13
Start at leftmost node, print it
Threaded Tree Traversal
Output
6 1
3

3 8
1 5 7 11
9 13
Follow thread to right, print node
Threaded Tree Traversal
Output
6 1
3

3 8 5

1 5 7 11
9 13
Follow link to right, go to leftmost
node and print
Threaded Tree Traversal
Output
6 1
3

3 8 5
6

1 5 7 11
9 13
Follow thread to right, print node
Threaded Tree Traversal
Output
6 1
3

3 8 5
6
7
1 5 7 11
9 13
Follow link to right, go to
leftmost node and print
Threaded Tree Traversal
Output
6 1
3

3 8 5
6
7
1 5 7 11 8

9 13
Follow thread to right, print node
Threaded Tree Traversal
Output
6 1
3

3 8 5
6
7
1 5 7 11 8
9

9 13
Follow link to right, go to
leftmost node and print
Threaded Tree Traversal
Output
6 1
3

3 8 5
6
7
1 5 7 11 8
9
11
9 13
Follow thread to right, print node
Threaded Tree Traversal
Output
6 1
3

3 8 5
6
7
1 5 7 11 8
9
11
9 13 13

Follow link to right, go to


leftmost node and print
Threaded Tree Modification
• We’re still wasting pointers, since half of our
leaves’ pointers are still null
• We can add threads to the previous node in an
inorder traversal as well, which we can use to
traverse the tree backwards or even to do
postorder traversals
Threaded Tree Modification
6
3 8
1 5 7 11
9 13
Binary Search Tree - Best Time
• All BST operations are O(d), where d is tree
depth
• The minimum d is ⌊log2 N⌋ for a binary tree with
N nodes
› What is the best case tree?
› What is the worst case tree?
• So, best case running time of BST operations
is O(log N)
Binary Search Tree - Worst Time
• Worst case running time is O(N)
› What happens when you Insert elements in
ascending order?
• Insert: 2, 4, 6, 8, 10, 12 into an empty BST
› Problem: Lack of “balance”:
• compare depths of left and right subtree
› Unbalanced degenerate tree
Balanced and unbalanced BST
[Figure: three example BSTs. Left: a degenerate chain 1, 2, 3, 4 built by
inserting keys in ascending order. Middle: 4 (2 (1, 3), 5 (_, 6 (_, 7))) –
is this “balanced”? Right: the perfectly balanced tree 4 (2 (1, 3), 6 (5, 7)).]
Approaches to balancing trees
• Don't balance
› May end up with some nodes very deep
• Strict balance
› The tree must always be balanced perfectly
• Pretty good balance
› Only allow a little out of balance
• Adjust on access
› Self-adjusting
Balancing Binary Search Trees
• Many algorithms exist for keeping binary
search trees balanced
› Adelson-Velskii and Landis (AVL) trees (height-
balanced trees)
› Splay trees and other self-adjusting trees
› B-trees and other multiway search trees
Perfect Balance
• Want a complete tree after every operation
› tree is full except possibly in the lower right
• This is expensive
› For example, insert 2 in the tree on the left and
then rebuild as a complete tree
Before: 6 (4 (1, 5), 9 (8, _))
Insert 2 & rebuild as a complete tree:
After: 5 (2 (1, 4), 8 (6, 9))
AVL - Good but not Perfect
Balance
• AVL trees are height-balanced binary search
trees
• Balance factor of a node
› height(left subtree) - height(right subtree)
• An AVL tree has balance factor calculated at
every node
› For every node, heights of left and right subtree
can differ by no more than 1
› Store current heights in each node
Height of an AVL Tree
• N(h) = minimum number of nodes in an AVL
tree of height h.
• Basis
› N(0) = 1, N(1) = 2
• Induction
› N(h) = N(h-1) + N(h-2) + 1
(a minimal AVL tree of height h has subtrees of heights h-1 and h-2)
• Solution (recall Fibonacci analysis)
› N(h) > φ^h (φ ≈ 1.62)
Height of an AVL Tree
• N(h) > φ^h (φ ≈ 1.62)
• Suppose we have n nodes in an AVL tree of height h.
› n ≥ N(h) (because N(h) was the minimum)
› n > φ^h, hence log_φ n > h (relatively well balanced tree!!)
› h < 1.44 log2 n (i.e., Find takes O(log n))
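The two bullet points above can be connected explicitly; a sketch of the derivation in LaTeX, writing φ for the golden ratio:

```latex
% Minimum node counts: N(0) = 1,\; N(1) = 2,\; N(h) = N(h-1) + N(h-2) + 1.
% Comparing with Fibonacci numbers gives N(h) = F_{h+3} - 1, and for h \ge 1
N(h) > \phi^{h}, \qquad \phi = \frac{1 + \sqrt{5}}{2} \approx 1.62 .
% An AVL tree with n nodes and height h has n \ge N(h) > \phi^{h}, so
h < \log_{\phi} n = \frac{\log_2 n}{\log_2 \phi} \approx 1.44 \, \log_2 n .
```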
Node Heights
Tree A (AVL): 6 (4 (1, 5), 9) – height 2, root balance factor = 1 - 0 = 1
Tree B (AVL): 6 (4 (1, 5), 9 (8, _)) – height 2

height of node = h
balance factor = hleft - hright
empty height = -1
Node Heights after Insert 7
Tree A (AVL): 6 (4 (1, 5), 9 (7, _)) – all balance factors in {-1, 0, 1}
Tree B (not AVL): 6 (4 (1, 5), 9 (8 (7, _), _)) –
balance factor at node 9 is 1 - (-1) = 2

height of node = h
balance factor = hleft - hright
empty height = -1
Insert and Rotation in AVL Trees
• Insert operation may cause balance factor to
become 2 or –2 for some node
› only nodes on the path from insertion point to
root node have possibly changed in height
› So after the Insert, go back up to the root node by
node, updating heights
› If a new balance factor (the difference hleft-hright) is
2 or –2, adjust tree by rotation around the node
Single Rotation in an AVL Tree
Before: 6 (4 (1, 5), 9 (8 (7, _), _)) – node 9 has balance factor 2
After a single rotation at 9: 6 (4 (1, 5), 8 (7, 9)) – balanced again
Insertions in AVL Trees
Let the node that needs rebalancing be α.
There are 4 cases:
Outside cases (require single rotation):
1. Insertion into left subtree of left child of α.
2. Insertion into right subtree of right child of α.
Inside cases (require double rotation):
3. Insertion into right subtree of left child of α.
4. Insertion into left subtree of right child of α.
The rebalancing is performed through four
separate rotation algorithms.
AVL Insertion: Outside Case
Consider a valid
AVL subtree j

k h

h
h
Z
X Y
AVL Insertion: Outside Case
j Inserting into X
destroys the AVL
property at node j

k h

h+1 h Z
Y
X
AVL Insertion: Outside Case
j Do a “right rotation”

k h

h+1 h Z
Y
X
Single right rotation
j Do a “right rotation”

k h

h+1 h Z
Y
X
Outside Case Completed
“Right rotation” done!

k (“Left rotation” is mirror


symmetric)

h+1
j
h h

X Y Z
AVL property has been restored!
AVL Insertion: Inside Case
Consider a valid
AVL subtree j

k h

h h Z
X Y
AVL Insertion: Inside Case
Inserting into Y
destroys the
AVL property
j Does “right rotation”
restore balance?

at node j

k h

h h+1 Z
X
Y
AVL Insertion: Inside Case

k
“Right rotation”
does not restore
balance… now k is

j
out of balance
h

X h+1
h

Z
Y
AVL Insertion: Inside Case
Consider the structure
of subtree Y… j
k h

h h+1 Z
X
Y
AVL Insertion: Inside Case
Y = node i and
subtrees V and W j
k h

h
i h+1 Z
X h or h-1

V W
AVL Insertion: Inside Case
j We will do a left-right
“double rotation” . . .

k
i Z
X
V W
Double rotation : first rotation
j left rotation complete

i
k Z
W
X V
Double rotation : second
rotation
j Now do a right rotation

i
k Z
W
X V
Double rotation : second
rotation
right rotation complete

Balance has been

i restored

k j
h h
h or h-1

X V W Z
Implementation
Each node stores: balance (1, 0, -1), key, left, right.

No need to keep the height; just the difference in height,
i.e. the balance factor; this has to be modified on the path of
insertion even if you don’t perform rotations.
Once you have performed a rotation (single or double) you won’t
need to go back up the tree.
Single Rotation
RotateFromRight(n : reference node pointer) {
p : node pointer;
p := n.right;
n.right := p.left;
p.left := n;
n := p
}

You also need to modify the heights
or balance factors of n and p.
Double Rotation
• Implement Double Rotation in two lines.

DoubleRotateFromRight(n : reference node pointer) {
????
}
Insertion in AVL Trees
• Insert at the leaf (as for all BST)
› only nodes on the path from insertion point to
root node have possibly changed in height
› So after the Insert, go back up to the root node by
node, updating heights
› If a new balance factor (the difference hleft-hright) is
2 or –2, adjust tree by rotation around the node
Insert in BST
Insert(T : reference tree pointer, x : element) : integer {
if T = null then
{ T := new tree; T.data := x; return 1; } // the links to children are null
case
T.data = x : return 0; // duplicate: do nothing
T.data > x : return Insert(T.left, x);
T.data < x : return Insert(T.right, x);
endcase
}
Insert in AVL trees
Insert(T : reference tree pointer, x : element) {
if T = null then
{ T := new tree; T.data := x; height := 0; return; }
case
T.data = x : return; // duplicate: do nothing
T.data > x : Insert(T.left, x);
if ((height(T.left) - height(T.right)) = 2) {
if (T.left.data > x) then // outside case
T = RotatefromLeft(T);
else // inside case
T = DoubleRotatefromLeft(T); }
T.data < x : Insert(T.right, x);
// code similar to the left case
endcase
T.height := max(height(T.left), height(T.right)) + 1;
return;
}
Example of Insertions in an AVL
Tree
2
20
Insert 5, 40
0 1
10 30
0 0
25 35
Example of Insertions in an AVL
Tree
2
3
20 20
1 1 1 2
10 30 10 30
0 0 0 1
0 0
5 25 35 5 25 35
0
40
Now Insert 45
Single rotation (outside case)
3
3
20 20
1 2 1 2
10 30 10 30
0 0 2
0 0
5 25 35 5 25 40 1
0 0
35 45
1 40
Imbalance

0 45
Now Insert 34
Double rotation (inside case)
3
3
20 20
1 3 1 2
10 30 10 35
0 0 2
0 1
5 Imbalance 25 40 5 30 40 1
0
1 35 0 25 34 45
45 0

Insertion of 34 0
34
AVL Tree Deletion
• Similar but more complex than insertion
› Rotations and double rotations needed to
rebalance
› Imbalance may propagate upward so that
many rotations may be needed.
Pros and Cons of AVL Trees
Arguments for AVL trees:
1. Search is O(log N) since AVL trees are always balanced.
2. Insertions and deletions are also O(log N).
3. The height balancing adds no more than a constant factor to the
speed of insertion.

Arguments against using AVL trees:
1. Difficult to program & debug; more space for the balance factor.
2. Asymptotically faster, but rebalancing costs time.
3. Most large searches are done in database systems on disk and use
other structures (e.g. B-trees).
4. May be OK to have O(N) for a single operation if the total run time for
many consecutive operations is fast (e.g. Splay trees).
Double Rotation Solution

DoubleRotateFromRight(n : reference node pointer) {


RotateFromLeft(n.right);
n
RotateFromRight(n);
}

V W
B-Trees
Considerations for disk-based storage
systems.
Indexed Sequential Access Method
(ISAM)
m-way search trees
B-trees
Data Layout on Disk
• Track: one ring
• Sector: one pie-shaped piece.
• Block: intersection of a track and a sector.
Disk Block Access Time
Seek time = maximum of
Time for the disk head to move to the correct track.
Time for the beginning of the correct sector to spin round
to the head. (Some authors use “latency” as the term for this component,
or they use latency to refer to all of what we are calling seek time.)

Transfer time =
Time to read or write the data.
(Approximately the time for the sector to spin by the head).

For a 7200 RPM hard disk with 8 millisec seek time,


average access time for a block is about 12 millisec.
(see Albert Drayes and John Treder: http://www.tanstaafl-software.com/seektime.html)
Considerations for Disk Based
Dictionary Structures
Use a disk-based method when the dictionary is too big to
fit in RAM at once.

Minimize the expected or worst-case number of disk


accesses for the essential operations (put, get, remove).

Keep space requirements reasonable -- O(n).

Methods based on binary trees, such as AVL search trees,


are not optimal for disk-based representations. The
number of disk accesses can be greatly reduced by using
m-way search trees.
Indexed Sequential Access Method
(ISAM)

Store m records in each disk block.

Use an index that consists of an array with


one element for each disk block, holding a
copy of the largest key that occurs in that
block.
ISAM (Continued)

1.7 5.1 21.2 26.8 ...


ISAM (Continued)
To perform a get(k) operation:

Look in the index using, say, either a


sequential search or a binary search, to
determine which disk block should hold the
desired record.

Then perform one disk access to read that


block, and extract the desired record, if it
exists.
ISAM Limitations
Problems with ISAM:

What if the index itself is too large to fit entirely in


RAM at the same time?

Insertion and deletion could be very expensive if


all records after the inserted or deleted one have
to shift up or down, crossing block boundaries.
A Solution: B-Trees
Idea 1: Use m-way search trees.
(ISAM uses a root and one level under the root.)
m-way search trees can be as high as we need.

Idea 2: Don’t require that each node always be full.


Empty space will permit insertion without rebalancing.
Allowing empty space after a deletion can also avoid
rebalancing.

Idea 3: Rebalancing will sometimes be necessary: figure


out how to do it in time proportional to the height of the
tree.
B-Tree Example with m = 5

12

2 3 8 13 27

The root has between 2 and m children.
Each non-root internal node has between ⌈m/2⌉ and m children.
All external nodes are at the same level. (External nodes are actually
represented by null pointers in implementations.)
Insert 10

12

2 3 8 10 13 27

We find the location for 10 by following a path from the root using the stored
key values to guide the search.
The search falls out of the tree at the 4th child of the 1st child of the root.
The 1st child of the root has room for the new element, so we store it there.
Insert 11

12

2 3 8 10 11 13 27

We fall out of the tree at the child to the right of key 10.
But there is no more room in the left child of the root to hold 11.
Therefore, we must split this node...
Insert 11 (Continued)

8 12

2 3 10 11 13 27

The m + 1 children are divided evenly between the old and new nodes.
The parent gets one new child. (If the parent becomes overfull, then it, too, will
have to be split.)
Remove 8

8 12

2 3 10 11 13 27

Removing 8 might force us to move another key up from one of the children. It
could either be the 3 from the 1st child or the 10 from the second child.
However, neither child has more than the minimum number of children (3), so
the two nodes will have to be merged. Nothing moves up.
Remove 8 (Continued)

12

2 3 10 11 13 27

The root contains one fewer key, and has one fewer child.
Remove 13

12

2 3 10 11 13 27

Removing 13 would cause the node containing it to become underfull.


To fix this, we try to reassign one key from a sibling that has spares.
Remove 13 (Cont)

11

2 3 10 12 27

The 13 is replaced by the parent’s key 12.


The parent’s key 12 is replaced by the spare key 11 from the left sibling.
The sibling has one fewer element.
Remove 11

11

2 3 10 12 27

11 is in a non-leaf, so replace it by the value immediately preceding: 10.


10 is at leaf, and this node has spares, so just delete it there.
Remove 11 (Cont)

10

2 3 12 27
Remove 2

10

2 3 12 27

Although 2 is at leaf level, removing it leads to an underfull node.


The node has no left sibling. It does have a right sibling, but that node is at its
minimum occupancy already.
Therefore, the node must be merged with its right sibling.
Remove 2 (Cont)

3 10 12 27

The result is illegal, because the root does not have at least 2 children.
Therefore, we must remove the root, making its child the new root.
Remove 2 (Cont)

3 10 12 27

The new B-tree has only one node, the root.


Insert 49

3 10 12 27

Let’s put an element into this B-tree.


Insert 49 (Cont)

3 10 12 27 49

Adding this key makes the node overfull, so it must be split into two.
But this node was the root.
So we must construct a new root, and make these its children.
Insert 49 (Cont)

12

3 10 27 49

The middle key (12) is moved up into the root.


The result is a B-tree with one more level.
B-Tree performance

Let h = height of the B-tree.


get(k): at most h disk accesses. O(h)
put(k): at most 3h + 1 disk accesses. O(h)
remove(k): at most 3h disk accesses. O(h)

h ≤ log_d((n + 1)/2) + 1, where d = ⌈m/2⌉ (Sahni, p. 641).

An important point is that the constant factors are relatively low.
m should be chosen so as to match the maximum node size to the
block size on the disk.
Example: m = 128, d = 64, n ≈ 64³ = 262144 gives h ≤ 4.
2-3 Trees
A B-tree of order m is a kind of m-way search
tree.
A B-Tree of order 3 is called a 2-3 Tree.
In a 2-3 tree, each internal node has either 2
or 3 children.
In practical applications, however, B-Trees of
large order (e.g., m = 128) are more common
than low-order B-Trees such as 2-3 trees.
B+-Trees

• Same structure as B-trees.


• Dictionary pairs are in leaves only. Leaves form a
doubly-linked list.
• Remaining nodes have the following structure:

j | a0 | k1 | a1 | k2 | a2 | … | kj | aj

• j = number of keys in the node.
• ai is a pointer to a subtree.
• ki ≤ smallest key in subtree ai and > largest key
in a(i-1).
Example B+-tree

5
16 30

1 3 5 6 9 16 17 30 40

 index node

 leaf/data node
B+-tree—Search

5
16 30

1 3 5 6 9 16 17 30 40

Examples: exact-match search for key = 5; range search for 6 ≤ key ≤ 20.


B+-tree—Insert

5
16 30

1 5 6 9 16 17 30 40

Insert 10
Insert
9

5
16 30

1 3 5 6 9 16 17 30 40

• Insert a pair with key = 2.


• New pair goes into a 3-node.
Insert Into A 3-node
• Insert new pair so that the keys are in
ascending order.
123

• Split into two nodes.


1 23

• Insert smallest key in new node and pointer


to this new node into parent.
2

1 23
Insert
9

5 2
16 30

2 3
1 5 6 9 16 17 30 40

• Insert an index entry 2 plus a pointer into parent.


Insert
9

2 5 16 30

1 2 3 5 6 9 16 17 30 40

• Now, insert a pair with key = 18.


Insert
9

17

2 5 16 30 17 18

1 2 3 5 6 9 16 30 40

• Now, insert a pair with key = 18.


• Insert an index entry 17 plus a pointer into the parent.
Insert
9 17

2 5 16 30

1 2 3 5 6 9 16 17 18 30 40

• Now, insert a pair with key = 18.


• Insert an index entry 17 plus a pointer into the parent.
Insert
9 17

2 5 16 30

1 2 3 5 6 9 16 17 18 30 40

• Now, insert a pair with key = 7.


Delete
9

2 5 16 30

1 2 3 5 6 9 16 17 30 40

• Delete pair with key = 16.


• Note: delete pair is always in a leaf.
Delete
9

2 5 16 30

1 2 3 5 6 9 17 30 40

• Delete pair with key = 16.


• Note: delete pair is always in a leaf.
Delete
9

2 5 16 30

1 2 3 5 6 9 17 30 40

• Delete pair with key = 1.

• Get >= 1 from adjacent sibling and update parent key.


Delete
9

3 5 16 30

2 3 5 6 9 17 30 40

• Delete pair with key = 1.

• Get >= 1 from sibling and update parent key.


Delete
9

3 5 16 30

2 3 5 6 9 17 30 40

• Delete pair with key = 2.

• Merge with sibling, delete in-between key in parent.


Delete
9

5 16 30

3 5 6 9 17 30 40

• Delete pair with key = 3.

•Get >= 1 from sibling and update parent key.


Delete
9

6 16 30

5 6 9 17 30 40

• Delete pair with key = 9.

• Merge with sibling, delete in-between key in parent.


Delete
9

6 30

17 30 40
5 6
Delete
9

6 16 30

5 6 9 17 30 40

• Delete pair with key = 6.

• Merge with sibling, delete in-between key in parent.


Delete
9

16 30

5 9 17 30 40

• Index node becomes deficient.


•Get >= 1 from sibling, move last one to parent, get
parent key.
Delete
16

9 30

5 9 17 30 40

• Delete 9.
• Merge with sibling, delete in-between key in parent.
Delete
16

30

5 17 30 40

•Index node becomes deficient.


• Merge with sibling and in-between key in parent.
Delete

16 30

5 17 30 40

•Index node becomes deficient.


• It’s the root; discard.
B*-Trees

• Root has between 2 and 2 * floor((2m – 2)/3) + 1


children.
• Remaining nodes have between ceil((2m – 1)/3)
and m children.
• All external/failure nodes are on the same level.
• m=3
• m=4
• Assume m > 3 in following.
Insert
• When insert node is overfull, check adjacent
sibling.
• If adjacent sibling is not full, move a dictionary
pair from overfull node, via parent, to nonfull
adjacent sibling.
• If adjacent sibling is full, split overfull node,
adjacent full node, and in-between pair from
parent to get three nodes with floor((2m –
2)/3), floor((2m – 1)/3), floor(2m/3) pairs plus
two additional pairs for insertion into parent.
Delete
• When combining, must combine 3 adjacent
nodes and 2 in-between pairs from parent.
– Total # pairs involved = 2 * floor((2m-2)/3) +
[floor((2m-2)/3) – 1] + 2.
– Equals 3 * floor((2m-2)/3) + 1.
• Combining yields 2 nodes and a pair that is
to be inserted into the parent.
– m mod 3 = 1 => nodes have m – 1 pairs each.
– m mod 3 = 0 => one node has m – 1 pairs and
the other has m – 2.
– m mod 3 = 2 => nodes have m – 2 pairs each.
