
B+ and Heaps

B+ trees are multi-way search trees that store data in leaf nodes to speed up traversal. They provide faster access than binary search trees for large datasets by requiring fewer disk accesses. Heaps can efficiently implement priority queues by keeping the heap property - each node's key is larger than its children's keys for max heaps. Heaps are often stored in arrays to allow fast access, and maintain the heap structure during insertion and deletion using trickle-down and trickle-up operations in O(logN) time.

Data structure

B+ trees and heaps


Why B+ trees
• Suppose you have 100,000 items in a BST
• Levels: $\log_2 100{,}000 \approx 17$
• Meaning: the disk may need to be accessed ~17 times for one lookup
• Note: a portion of the data may be in memory

• Data Access Times


• RAM: ~50-150 ns
• Hard disk drive (HDD): ~9-15 ms
• An HDD can be roughly 100,000 times slower than RAM!

• Conclusion: BSTs not good enough for BIG data
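
The arithmetic above can be checked with a short sketch. It assumes the worst case, where every level of the tree costs one disk access; the function name `bst_levels` and the representative costs (100 ns and 10 ms, picked from the ranges quoted above) are illustrative choices, not measurements.

```python
import math

def bst_levels(n_items: int) -> int:
    """Levels in a perfectly balanced binary tree holding n_items keys."""
    return math.floor(math.log2(n_items)) + 1

levels = bst_levels(100_000)

# Assumed representative costs, picked from the ranges quoted above.
RAM_NS = 100          # within ~50-150 ns
HDD_NS = 10_000_000   # 10 ms, within ~9-15 ms, expressed in nanoseconds

print(levels)                 # 17 levels to descend in the worst case
print(levels * HDD_NS / 1e6)  # 170.0 ms if every level costs one disk access
print(HDD_NS // RAM_NS)       # 100000: how many times slower the disk is
```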


Multi-Way Search Trees (B+ Tree)
• A multi-way search tree of order M:
• Each node has at most M children and M-1 keys
• Arrangement of keys is analogous to BSTs
• In the figure: a tree of order 4 (M=4)
• 3 keys per node (M-1)
• 4 children per node (M)
• Take the key “60”
• Items to its left are smaller
• Items to its right are bigger
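
As a minimal sketch of how the key arrangement guides a search inside one node, the function below picks which child to descend into. The name `child_index`, the sample keys, and the rule that a target equal to a key goes to the right are assumptions made for illustration; the slides do not specify how equal keys are handled.

```python
from bisect import bisect_right

def child_index(keys: list, target: int) -> int:
    """Return which child subtree to descend into for `target`.

    `keys` is the sorted key list of one node (at most M-1 keys).
    Child i holds items smaller than keys[i]; the last child holds
    the items that are not smaller than the last key.
    """
    return bisect_right(keys, target)

node_keys = [30, 60, 90]            # hypothetical order-4 node: 3 keys, 4 children
print(child_index(node_keys, 45))   # 1: between 30 and 60
print(child_index(node_keys, 95))   # 3: right of every key
```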
Performance of Multi-way Trees

• Let’s take a tree of order M, with N items stored in it

• Levels of a balanced tree ≈ $\log_M N$

• Suppose you have 1,000,000 items.

• Balanced binary tree: $\log_2 10^6 \approx 20$

• Balanced 10th-order tree: $\log_{10} 10^6 = 6$

• So, as with BSTs, we need some balance requirement
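
The level counts above can be reproduced with integer arithmetic. This is a rough sketch that assumes a perfectly balanced, completely full tree; `levels_needed` is a hypothetical helper name.

```python
def levels_needed(n_items: int, order: int) -> int:
    """Smallest number of levels L such that order**L >= n_items."""
    levels, capacity = 0, 1
    while capacity < n_items:
        capacity *= order
        levels += 1
    return levels

for order in (2, 10, 100):
    print(order, levels_needed(1_000_000, order))   # 2 -> 20, 10 -> 6, 100 -> 3
```

Using repeated multiplication instead of a floating-point logarithm avoids rounding surprises when N is an exact power of M.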


B+ Tree Properties
1. All data items are stored at leaves
• This speeds up traversal operation

2. Non-leaf nodes store up to M-1 keys to guide the searching


3. Root has between 2 and M children (unless it’s a leaf)
4. All non-leaf nodes (except the root) have between ceil(M/2) and M children
• In other words: a node must be at least half full.

5. All leaves are at the same depth.


• The last two requirements enforce balance
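
Properties 3 and 4 can be expressed as a small occupancy check. This is only a sketch of the child-count bounds for non-leaf nodes; the function names are hypothetical and leaf capacity is deliberately left out because the slides do not state it.

```python
import math

def children_bounds(order: int, is_root: bool = False) -> tuple:
    """(min, max) number of children for a non-leaf node of a B+ tree."""
    if is_root:
        return (2, order)
    return (math.ceil(order / 2), order)

def is_valid_internal_node(n_children: int, order: int, is_root: bool = False) -> bool:
    """True if the child count respects the bounds for this kind of node."""
    lo, hi = children_bounds(order, is_root)
    return lo <= n_children <= hi

print(children_bounds(5))                          # (3, 5)
print(is_valid_internal_node(2, 5))                # False: less than half full
print(is_valid_internal_node(2, 5, is_root=True))  # True: the root may have 2
```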
B+ Tree Illustrated

Search, insertion, and deletion also bear some similarity to their binary search tree counterparts
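
To illustrate the similarity for search, here is a minimal sketch over a hand-built tree. The classes `Leaf` and `Internal`, the sample keys, and the convention that a key equal to a guide key goes to the right are assumptions for illustration; insertion and deletion (with node splitting and merging) are not shown.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, items):
        self.items = dict(items)      # key -> data; all data lives in leaves

class Internal:
    def __init__(self, keys, children):
        self.keys = keys              # sorted guide keys (up to M-1)
        self.children = children      # len(children) == len(keys) + 1

def search(node, key):
    """Descend from the root to a leaf, then look the key up there."""
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    return node.items.get(key)

# Hypothetical tree of order 4: one root with three leaves at the same depth.
root = Internal(
    [30, 60],
    [Leaf({10: "a", 20: "b"}), Leaf({30: "c", 45: "d"}), Leaf({60: "e", 75: "f"})],
)

print(search(root, 45))   # d
print(search(root, 50))   # None: key not present
```

As in a BST, each step compares the target against the keys of the current node and follows exactly one child, so the number of nodes visited equals the number of levels.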


Heap
• Items are ordered by key value

• Max Heap: Higher key value represents higher priority

• Min Heap: Lower key value represents higher priority

• Binary Heaps can efficiently implement priority queues

• They are complete trees

• Every level is full, except possibly the last

• We fill the last row from left to right
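
As a quick illustration of a binary heap used as a priority queue, the sketch below relies on Python's standard `heapq` module, which maintains a min heap inside a plain list. The task names and priorities are made up for the example.

```python
import heapq

queue = []                                  # heapq keeps this list in heap order
heapq.heappush(queue, (3, "write report"))
heapq.heappush(queue, (1, "fix outage"))
heapq.heappush(queue, (2, "review patch"))

first = heapq.heappop(queue)                # removes the entry with the smallest key
print(first)                                # (1, 'fix outage')
```

Since `heapq` only provides a min heap, a common way to get max-heap behaviour is to push negated keys.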


Heap Condition

• Max Heap: every node’s key is greater than or equal to the keys of its children

• Min Heap: every node’s key is less than or equal to the keys of its children
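
The max-heap condition can be checked recursively. In this sketch a node is represented as a tuple `(key, left, right)`, which is an assumption made for the example, and equal keys between parent and child are treated as allowed. Only the ordering is checked, not whether the tree is complete.

```python
def satisfies_max_heap(node) -> bool:
    """node is None or a tuple (key, left, right)."""
    if node is None:
        return True
    key, left, right = node
    for child in (left, right):
        if child is not None and child[0] > key:
            return False              # a child outranks its parent
    return satisfies_max_heap(left) and satisfies_max_heap(right)

good = (90, (70, (20, None, None), (60, None, None)), (80, None, None))
bad = (90, (70, (95, None, None), None), (80, None, None))

print(satisfies_max_heap(good))   # True
print(satisfies_max_heap(bad))    # False: 95 sits below 70
```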
Heaps as Priority Queues

• Two procedures have to be implemented on a priority queue:
• Enqueue
• Dequeue
Heaps as Priority Queues (Enqueue)
Heaps as Priority Queues (Dequeue)
Organizing Arrays as Heaps

• For a node stored at index i,

• Its parent is at: $\lfloor (i-1)/2 \rfloor$
• Its left child is at: $2i+1$
• Its right child is at: $2i+2$

• Enables fast access of elements
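
The index formulas translate directly into code. This is a small sketch with hypothetical helper names and a made-up five-element max heap; note that the parent formula uses integer (floor) division and is only meaningful for i > 0, since the root has no parent.

```python
def parent(i: int) -> int:
    return (i - 1) // 2      # integer (floor) division

def left(i: int) -> int:
    return 2 * i + 1

def right(i: int) -> int:
    return 2 * i + 2

heap = [90, 70, 80, 20, 60]  # hypothetical max heap stored level by level

print(heap[parent(4)])               # 70: parent of the node at index 4
print(heap[left(0)], heap[right(0)]) # 70 80: children of the root
```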


Heap Tree – Array dequeue

• After dequeue, the tree must satisfy the heap condition and remain complete

• Steps for dequeue (for max heap):

• Remove the root
• Move the last node to the root
• Trickle the last node down until it’s below a larger node and above a smaller one
Heap Tree – Array dequeue
Dequeue()
    if array is empty
        return null
    root = heapArray[0]
    // shrink the heap first so the moved node is not seen as a child
    currentSize = currentSize - 1
    heapArray[0] = heapArray[currentSize]
    trickleDown(0)
    return root
Trickle-Down Pseudocode
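trickleDown(index)
    top = heapArray[index]
    // loop while the node at index has at least a left child
    while(2*index + 1 < currentSize) {
        largerChildIndex = index of the child with the larger key
        largerChild = heapArray[largerChildIndex]
        if(top.key >= largerChild.key) break
        heapArray[index] = largerChild
        index = largerChildIndex
    }
    heapArray[index] = top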
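
A runnable counterpart of the dequeue and trickle-down pseudocode might look like the sketch below. It assumes the heap is a Python list of directly comparable keys rather than nodes with a `key` field, and it uses the list length in place of `currentSize`; both are simplifications for the example.

```python
def trickle_down(heap: list, index: int) -> None:
    """Sift heap[index] down until neither child has a larger key."""
    size = len(heap)
    top = heap[index]
    while 2 * index + 1 < size:                 # at least a left child exists
        larger = 2 * index + 1
        right = larger + 1
        if right < size and heap[right] > heap[larger]:
            larger = right
        if top >= heap[larger]:
            break
        heap[index] = heap[larger]              # move the larger child up
        index = larger
    heap[index] = top

def dequeue(heap: list):
    """Remove and return the largest key, or None if the heap is empty."""
    if not heap:
        return None
    root = heap[0]
    last = heap.pop()                           # shrink the array first
    if heap:
        heap[0] = last
        trickle_down(heap, 0)
    return root

heap = [90, 70, 80, 20, 60]
print(dequeue(heap))   # 90
print(heap)            # [80, 70, 60, 20]
```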
Heap Tree – Array enqueue
insert(key, value)
    if array is full
        return false

    // create a node with the new key and value
    newNode = new Node(key, value)
    // insert the node at the end of the array
    heapArray[currentSize] = newNode
    // trickle up from the index of the new node, then grow the heap
    trickleUp(currentSize)
    currentSize = currentSize + 1
    return true
Trickle-Up Pseudocode
trickleUp(index)
    parent = (index - 1)/2        // integer division
    bottom = heapArray[index]
    // compare against the saved node, not the slot being overwritten
    while(index > 0 and bottom.key > heapArray[parent].key)
    {
        heapArray[index] = heapArray[parent]
        index = parent
        parent = (index - 1)/2
    }
    heapArray[index] = bottom
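
The enqueue and trickle-up pseudocode can be sketched in runnable form as follows. As assumptions for the example, the heap is a Python list of directly comparable keys, and the "array is full" check is omitted because a Python list grows on demand.

```python
def trickle_up(heap: list, index: int) -> None:
    """Sift heap[index] up until its parent's key is not smaller."""
    bottom = heap[index]
    parent = (index - 1) // 2
    while index > 0 and bottom > heap[parent]:
        heap[index] = heap[parent]              # move the parent down
        index = parent
        parent = (index - 1) // 2
    heap[index] = bottom

def insert(heap: list, key) -> None:
    heap.append(key)                            # new node goes at the end
    trickle_up(heap, len(heap) - 1)

heap = []
for key in (20, 60, 80, 70, 90):
    insert(heap, key)

print(heap)   # [90, 80, 60, 20, 70]
```

The loop compares the saved `bottom` key against each parent, so the new key is written only once, at its final position.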
Efficiency
• trickleUp and trickleDown are the most time-consuming operations
• Remember: the number of levels of a complete tree is given by:
• $L = \lfloor \log_2 N \rfloor + 1$
• Trickle up/down, in the worst case, cycle through their loops L-1 times
• Insertion
• Best Case: $O(1)$
• Worst Case: $O(\log_2 N)$
• Deletion
• Worst Case: $O(\log_2 N)$
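
The best and worst cases for insertion can be observed by counting loop iterations. This sketch assumes a max heap of plain integer keys in a Python list; `insert_counting` is a hypothetical helper that inlines the trickle-up loop so it can return the iteration count.

```python
def insert_counting(heap: list, key) -> int:
    """Insert key into a max heap; return how many times the trickle-up loop ran."""
    heap.append(key)
    index = len(heap) - 1
    steps = 0
    while index > 0 and key > heap[(index - 1) // 2]:
        heap[index] = heap[(index - 1) // 2]    # move the parent down
        index = (index - 1) // 2
        steps += 1
    heap[index] = key
    return steps

heap = []
# Ascending keys: every new key is the largest so far and climbs to the root.
worst = [insert_counting(heap, k) for k in range(1, 1025)]
# A key smaller than everything else stays at the bottom.
best = insert_counting(heap, 0)

print(max(worst))   # 10 iterations with N = 1024, i.e. L - 1 for L = 11 levels
print(best)         # 0 iterations
```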
