
Application of B & B+ Tree in Storage Allocation

The document discusses B+ trees, which are commonly used to store large amounts of data on disk. B+ trees keep the tree shallow by allowing internal nodes to have many children. All data is stored only in the leaf nodes, which are linked together for sequential access. This reduces the number of disk accesses needed for operations like searches, inserts and deletes compared to regular B-trees. B+ trees are well-suited for storage in databases and file systems due to their efficiency in external memory.

Uploaded by

Subho Chowdhury

Application of B & B+ Tree in Storage Allocation
Introduction
 As we have already seen, a database consists of tables,
views, indexes, procedures, functions, etc.
 Tables and views are a logical way of viewing the data;
the actual data is stored in physical storage.
 A database holds a very large amount of data, so it resides
on physical storage devices such as magnetic disks.
 The data cannot be stored on these devices as it is; it is
first converted to a binary format.
 Each storage device has many data blocks, each capable of
storing a fixed amount of data.
 The data is mapped onto these blocks when it is stored on
the device.
Overview of Secondary Storage: Magnetic Disk
Structure of Magnetic Disk
The primary medium for long-term online storage of data is
the magnetic disk. Physically, disks are relatively simple.
 Each disk platter has a flat, circular shape.
 Its two surfaces are covered with magnetic material.
 Information is recorded on the surfaces.
 When in use, a drive motor spins the platter at a constant high
speed (for example 90, 120, or 250 revolutions per second).
 A read-write head is positioned just above each surface
of the platter.
 The disk surface is logically divided into tracks.
 Tracks are subdivided into sectors.
 A sector is the smallest unit of information that can be read
from or written to the disk.
Access Data in Magnetic Disk
 A traditional HDD has rotating platters
that store data in tracks.
 When data needs to be read or
written, the actuator arm first
moves the head to the track that
holds the target sector. The time
this takes is the seek time.
 After that, the platter must rotate
until the target sector passes under
the head (rotational latency).
 When we are dealing with a huge
amount of data, this can become a
bottleneck, since the disk has to
keep repositioning to a specific
sector.
 Average seek time varies from about
4 ms for high-end server drives to
9 ms for common server drives.
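The two delays described above can be added up in a rough sketch. It assumes the usual rule of thumb that the average rotational latency is half a revolution, and it ignores transfer time; the function name average_access_ms is made up for this illustration.

```python
def average_access_ms(seek_ms, revolutions_per_second):
    """Rough average time to reach a sector: seek time plus rotational latency.

    Assumption: on average the platter must turn half a revolution
    before the target sector arrives under the head.
    """
    ms_per_revolution = 1000.0 / revolutions_per_second
    rotational_latency_ms = ms_per_revolution / 2
    return seek_ms + rotational_latency_ms


# Seek times and spindle speeds quoted in the slides.
for seek_ms in (4, 9):
    for rps in (90, 120, 250):
        total = average_access_ms(seek_ms, rps)
        print(f"seek {seek_ms} ms, {rps} rev/s -> about {total:.2f} ms per access")
```

Even in the best case this is several milliseconds per random access, which is why the number of disk accesses matters so much in the slides that follow.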
Motivation

 We usually assume that everything in a search tree is kept within
the main memory (including balanced trees such as AVL trees,
red-black trees, splay trees, etc.).
 What if the data items contained in a search tree do not
fit into the main memory?
 Consider searching the UIDAI database (for
AADHAAR details).
 Let us assume there are only 8 bytes of data (say the
AADHAAR ID) per citizen and we have to create a search
tree.
 The population of India: 1,358,856,931 (live count when the slides were prepared).
 The search tree will require more than 20 GB of memory
(including pointers).
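The estimate above can be checked with a few lines of arithmetic. This is only a sketch: the slides do not say how large a pointer is or how many pointers each node holds, so the 8-byte pointers and two child pointers per node below are assumptions.

```python
POPULATION = 1_358_856_931   # figure quoted in the slides
KEY_BYTES = 8                # one AADHAAR ID per citizen, as assumed above
POINTER_BYTES = 8            # assumption: 64-bit pointers
POINTERS_PER_NODE = 2        # assumption: binary search tree node (left, right)


def tree_bytes(n, key_bytes, pointer_bytes, pointers_per_node):
    """Total bytes for n nodes, each holding one key plus its child pointers."""
    return n * (key_bytes + pointers_per_node * pointer_bytes)


keys_only = POPULATION * KEY_BYTES
total = tree_bytes(POPULATION, KEY_BYTES, POINTER_BYTES, POINTERS_PER_NODE)
print(f"keys only:     {keys_only / 10**9:.1f} GB")
print(f"with pointers: {total / 10**9:.1f} GB")
```

Under these assumptions the keys alone take about 10.9 GB and the whole tree about 32.6 GB, consistent with the "more than 20 GB" claim.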
Search Tree on disk

 With a binary search tree stored on disk, a majority of
the tree operations (search, insert, delete, etc.) will
require O(log₂ n) disk accesses, where n is the number of
data items in the search tree.
 The main challenge is to reduce the number of
disk accesses.
 An m-ary search tree allows m-way branching.
 As branching increases, the depth decreases.
 A complete binary tree has a height of about ⌈log₂ n⌉.
 But a complete m-ary tree has a height of about ⌈log_m n⌉.
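To see how much branching helps, the two height formulas can be compared for the population figure quoted earlier. The sketch below computes ⌈log_m n⌉ with integer arithmetic; the branching factors 128 and 256 are illustrative assumptions, not values from the slides.

```python
def min_height(n, m):
    """Smallest h with m**h >= n, i.e. ceil(log_m n), using integers only."""
    height, capacity = 0, 1
    while capacity < n:
        capacity *= m
        height += 1
    return height


n = 1_358_856_931  # population figure quoted in the slides
for m in (2, 128, 256):
    print(f"m = {m:3d}: height about {min_height(n, m)}")
```

With binary branching the height is 31 levels; with 128-way branching it drops to 5, and with 256-way branching to 4. If each level costs one disk access, that is the difference between roughly 31 and 4 or 5 accesses per search.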
Cycles to access different types of storage

Storage Type     Access Type    Number of Cycles
CPU registers    Random         1
L1 cache         Random         2
L2 cache         Random         30
Main Memory      Random         2.5 × 10^2
Hard Disk        Random         3 × 10^7
Hard Disk        Streaming      5 × 10^3


Characteristics of B Tree

 A B-Tree is a low-depth, self-balancing tree.
 The height of a B-Tree is kept low by
putting the maximum possible number of keys in a
B-Tree node.
 Generally, the node size of a B-Tree is
kept equal to the disk block size.
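If a node must fit in one disk block, the block size limits the order m of the tree. The sketch below assumes a simplified node layout of m − 1 keys plus m child pointers and nothing else (no headers or counters); the block, key, and pointer sizes are illustrative assumptions.

```python
def max_order(block_bytes, key_bytes, pointer_bytes):
    """Largest m such that (m - 1) keys and m pointers fit in one block.

    Solves (m - 1) * key_bytes + m * pointer_bytes <= block_bytes for m.
    """
    return (block_bytes + key_bytes) // (key_bytes + pointer_bytes)


# Assumed sizes: 4096-byte block, 8-byte keys, 8-byte child pointers.
m = max_order(4096, 8, 8)
print(f"order m = {m}")  # 255 keys and 256 pointers use 4088 of 4096 bytes
```

A real node also stores bookkeeping data, so the usable order would be somewhat smaller than this upper bound.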
What is B Tree

 Definition:
A B-Tree of order m is an m-ary tree with the
following properties:
 The data items are stored at the leaves.
 The non-leaf nodes store up to m − 1 keys to guide
the search; key i represents the smallest key
in subtree i + 1.
 The root is either a leaf or has between 2 and m
children.
 All non-leaf nodes (except the root) have between
⌈m/2⌉ and m children.
 All leaves are at the same depth and have between
⌈k/2⌉ and k data items, for some k.
Searching
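A search under the definition above can be sketched as follows. Because key i is the smallest key in subtree i + 1, the child to follow is the one just after the last key that is less than or equal to the target. The classes Leaf and Internal and the sample keys are made up for this illustration and are not taken from the original figure.

```python
from bisect import bisect_right


class Leaf:
    def __init__(self, items):
        self.items = sorted(items)   # data items live only in leaves


class Internal:
    def __init__(self, keys, children):
        self.keys = keys             # keys[i] = smallest key in children[i + 1]
        self.children = children


def search(node, target):
    """Return (found, nodes_visited); each visited node models one disk access."""
    visited = 1
    while isinstance(node, Internal):
        # Follow the child after the last key <= target.
        node = node.children[bisect_right(node.keys, target)]
        visited += 1
    return target in node.items, visited


# Hypothetical tree of order 3 with k = 2 items per leaf.
root = Internal([20, 40], [Leaf([5, 12]), Leaf([20, 31]), Leaf([40, 56])])
print(search(root, 31))   # (True, 2)
print(search(root, 13))   # (False, 2)
```

The number of nodes visited equals the depth of the leaf, which is the quantity the B-Tree keeps small.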
Insert 56 into tree
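The leaf-level part of an insertion can be sketched like this. The key goes into the sorted leaf, and if the leaf then holds more than k items it is split into two halves, with the smallest key of the right half passed up as the new guide key. The leaf contents and k = 4 are assumptions chosen for illustration; the values in the original figure are not available. Updating the parent (and splitting it in turn if it overflows) is not shown.

```python
def insert_into_leaf(items, key, k):
    """Insert key into a sorted leaf that may hold at most k items.

    Returns (left, right, separator). If no split is needed, right and
    separator are None and left is the updated leaf.
    """
    items = sorted(items + [key])
    if len(items) <= k:
        return items, None, None
    mid = (len(items) + 1) // 2          # left half gets the extra item
    left, right = items[:mid], items[mid:]
    return left, right, right[0]         # separator = smallest key on the right


# Hypothetical full leaf with k = 4; inserting 56 forces a split.
print(insert_into_leaf([40, 45, 50, 60], 56, 4))
# ([40, 45, 50], [56, 60], 56)
```

After a split of k + 1 items, both halves hold at least ⌈k/2⌉ items, so the leaf occupancy rule from the definition still holds.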
Delete
Delete
B+ Tree

 B+-trees are an important variant of B-trees.


 The performance of a B-tree depends heavily on
the height of the tree.
 The deeper a tree, the more page lookups (on
secondary storage) we need to reach a leaf.
 So what can we do to “flatten” B-trees?
B+ Tree

 If we can increase the branching (number of
pointers) in inner nodes, then the tree will
become “flatter”.
 Instead of storing data in inner nodes, we
store only search keys (these take up less space ⇒ more room
for pointers).
 We also link all the leaf nodes, allowing a fast
sequential search.
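The benefit of linking the leaves can be sketched with a range scan. Once the first relevant leaf is reached, the scan follows the leaf links without going back through the inner nodes. The class LinkedLeaf, the helper link_leaves, and the sample keys are made up for this illustration; a real B+ tree would first descend from the root to find the starting leaf.

```python
class LinkedLeaf:
    def __init__(self, keys):
        self.keys = keys     # sorted keys stored in this leaf
        self.next = None     # link to the next leaf in key order


def link_leaves(leaves):
    """Chain the leaves left to right and return the first one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]


def range_scan(start_leaf, low, high):
    """Collect all keys in [low, high] by walking the leaf chain."""
    result = []
    leaf = start_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result        # keys are sorted, so we can stop early
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


first = link_leaves([LinkedLeaf([5, 12]), LinkedLeaf([20, 31]), LinkedLeaf([40, 56])])
print(range_scan(first, 12, 40))   # [12, 20, 31, 40]
```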
Schema of B+ Tree
B+ Tree

 Definition:
A B+-Tree of order m is an m-ary tree with the
following properties:
 The data items are stored at the leaves.
 The non-leaf nodes store up to m − 1 keys to guide
the search; key i represents the smallest
key in subtree i + 1.
 The root is either a leaf or has between 2 and m
children.
 All leaves are at the same depth and have up to k
data items, for some k.
Example
Advantages
 Since all records are stored only in the leaf nodes, which
form a sorted, sequentially linked list, searching becomes very
easy.
 With a B+ tree, range retrieval and partial retrieval are
supported. Traversing the tree and its linked leaves makes
this easier and quicker.
 As the number of records increases or decreases, the B+ tree
structure grows or shrinks. There is no restriction on B+
tree size, as there is in ISAM.
 Since it is a balanced tree structure, inserts, deletes, and
updates do not degrade its performance.
 Since all the data is stored in the leaf nodes, the
higher branching of internal nodes makes the
tree shorter. This reduces disk I/O. Hence it works well on
secondary storage devices.
Conclusion

 B+ trees are extensively used in both
databases and file systems because of the
efficiency with which they store and retrieve
data in external memory. Thus B+ trees
are a cost-effective way to store data in
bulk.
