LM6 - B+ Tree Index Files - B Tree Index Files
CS3391/OOP/IICSE/IIISEM/KG-KiTE
Course Outcome
Syllabus
UNIT IV-IMPLEMENTATION TECHNIQUES
B+ Tree Index Files – B Tree Index Files
B+ Tree Index Files
B+ Tree:
• The B+ tree is a balanced multi-way search tree (not a binary tree: each
node can have many children). It follows a multi-level index format.
• In the B+ tree, the leaf nodes hold the pointers to the actual data records.
The B+ tree ensures that all leaf nodes remain at the same height.
• In the B+ tree, the leaf nodes are linked using a linked list. Therefore, a B+
tree can support random access as well as sequential access.
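The sequential access that the linked leaves provide can be sketched as a range scan. This is a minimal illustration, not a full B+ tree: the `Leaf` class, the `link` and `range_scan` helpers and the key values are all hypothetical, and the scan starts at the first leaf rather than descending from the root to find its starting point.

```python
class Leaf:
    """Hypothetical leaf node: sorted keys plus a pointer to the next leaf."""
    def __init__(self, keys):
        self.keys = keys
        self.next = None


def link(leaves):
    """Chain the leaves left to right and return the first one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]


def range_scan(first_leaf, lo, hi):
    """Collect all keys in [lo, hi] by walking the leaf chain in order."""
    found = []
    leaf = first_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > hi:          # keys are sorted, so nothing later can match
                return found
            if key >= lo:
                found.append(key)
        leaf = leaf.next
    return found


# Assumed sample data, chosen only for illustration.
head = link([Leaf([10, 20]), Leaf([30, 40]), Leaf([50, 55]), Leaf([60, 75])])
print(range_scan(head, 25, 60))  # [30, 40, 50, 55, 60]
```

Because the leaves are chained, the scan never has to go back up to an internal node once it has reached the first matching leaf.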
Structure of B+ Tree:
• In the B+ tree, every leaf node is at an equal distance from the root
node. The B+ tree is of order n, where n is fixed for a given B+
tree.
• It contains internal nodes and leaf nodes.
i) Internal node
• Every internal node of the B+ tree except the root contains at least ⌈n/2⌉
child pointers.
• At most, an internal node of the tree contains n pointers.
ii) Leaf node
• A leaf node of the B+ tree contains at least ⌈n/2⌉ record pointers and ⌈n/2⌉
key values.
• At most, a leaf node contains n record pointers and n key values.
• Every leaf node of the B+ tree contains one block pointer P that points to the
next leaf node.
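The occupancy rules above can be written as two small checks. This is a sketch under the conventions stated on this slide, taking the lower bound as n/2 rounded up; the function names `internal_ok` and `leaf_ok` are hypothetical, and other textbooks use slightly different bounds.

```python
import math


def internal_ok(num_pointers, n, is_root=False):
    """Internal node of an order-n B+ tree: at most n pointers, and at
    least ceil(n/2) unless the node is the root (exempt from the minimum)."""
    if num_pointers > n:
        return False
    return is_root or num_pointers >= math.ceil(n / 2)


def leaf_ok(num_keys, n):
    """Leaf node: between ceil(n/2) and n key values (one record pointer each)."""
    return math.ceil(n / 2) <= num_keys <= n


print(internal_ok(2, 4), internal_ok(1, 4), leaf_ok(5, 4))  # True False False
```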
Searching a record in B+ Tree:
• Suppose we have to search for 55 in the B+ tree structure below. First, we
look in the intermediary node, which directs us to the leaf node
that can contain a record for 55.
• In the intermediary node, we find the branch between the keys 50 and 75.
Following it takes us to the third leaf node.
There the DBMS performs a sequential search to find 55.
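The two steps of this lookup can be sketched on a two-level tree. The separator keys 25, 50, 75 and the leaf contents are assumed sample values (only 50, 55 and 75 come from the text), and `search` is a hypothetical helper; keys equal to a separator are assumed to live in the leaf to its right.

```python
from bisect import bisect_right

# Assumed sample tree: one intermediary node and four leaves.
root_keys = [25, 50, 75]
leaves = [[5, 15, 20], [25, 30, 35], [50, 55, 65, 70], [75, 80, 85]]


def search(key):
    """Return (leaf_index, position) of key, or None if it is absent."""
    # Step 1: pick the branch in the intermediary node.
    leaf_index = bisect_right(root_keys, key)
    # Step 2: sequential search inside the chosen leaf.
    for position, candidate in enumerate(leaves[leaf_index]):
        if candidate == key:
            return leaf_index, position
    return None


print(search(55))  # (2, 1): third leaf, second entry
```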
Inserting a record in B+ Tree:
• Suppose 60 needs to be inserted into the structure below. It belongs in
the third leaf node, after 55.
• The tree is balanced and that leaf node is already full, so we
cannot simply insert 60 there.
• In this case, we have to split the leaf node so that 60 can be inserted
into the tree without affecting the fill factor, balance and order.
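A leaf split of this kind might look like the sketch below. It assumes a leaf capacity of four keys and a full leaf holding 50, 55, 65, 70 (sample values), and the helper `insert_into_leaf` is hypothetical. It covers only the leaf level: pushing the new separator into the parent, and splitting the parent if it overflows in turn, is left out.

```python
from bisect import insort


def insert_into_leaf(leaf, key, max_keys):
    """Insert key into a sorted leaf.

    Returns (left, right, separator). If no split is needed, right and
    separator are None. On overflow the leaf is split in the middle and
    the first key of the right half is copied up as the new separator.
    """
    keys = list(leaf)
    insort(keys, key)
    if len(keys) <= max_keys:
        return keys, None, None
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]


print(insert_into_leaf([50, 55, 65, 70], 60, max_keys=4))
# ([50, 55], [60, 65, 70], 60)
```

Both halves stay at least half full, which is why the split keeps the fill factor and the balance of the tree intact.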
Deleting a record from B+ Tree:
• Suppose 60 needs to be deleted from the structure below.
• In this case, 60 has to be removed from the intermediate node as well as
from the 4th leaf node.
• If it is simply removed from the intermediate node, the tree will no longer
satisfy the rules of the B+ tree. So the tree needs to be rearranged to keep
it balanced.
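One common way to handle this case is to delete the key from its leaf and replace the matching separator with the leaf's new smallest key. The sketch below shows only that step; the tree contents are assumed sample values, `delete_key` is a hypothetical helper, and the borrowing or merging needed when a leaf drops below its minimum occupancy is not shown.

```python
def delete_key(separators, leaves, key):
    """Remove key from its leaf and repair the separator that matched it.

    Returns the index of the leaf that held the key, or None if absent.
    """
    for index, leaf in enumerate(leaves):
        if key in leaf:
            leaf.remove(key)
            if key in separators and leaf:
                # The separator must still be a valid lower bound for the
                # leaf, so replace it with the leaf's new smallest key.
                separators[separators.index(key)] = leaf[0]
            return index
    return None


# Assumed state of the tree after 60 was inserted.
separators = [25, 50, 60, 75]
leaves = [[5, 15, 20], [25, 30, 35], [50, 55], [60, 65, 70], [75, 80, 85]]
print(delete_key(separators, leaves, 60))  # 3, i.e. the 4th leaf
print(separators)                          # [25, 50, 65, 75]
```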
B Tree Index Files
• A B-tree in a DBMS is a self-balancing m-way tree.
• Due to their balanced structure, such trees are frequently used to
manage and organize enormous databases and facilitate searches.
• In a B-tree of order m, each node can have a maximum of m child nodes.
• In a DBMS, the B-tree is an example of multilevel indexing.
• Both leaf nodes and internal nodes hold record references.
• The B-tree is called a balanced tree because all the leaf nodes are at the
same level.
• Thus B-trees improve the database's performance.
• Example B Tree:
Properties of B-tree:
• The number of keys in a non-leaf node is one less than the number of its
children.
• The root holds between 1 and m-1 keys. Therefore, unless it is a leaf,
the root has a minimum of two and a maximum of m children.
• All other non-leaf nodes hold between ⌈m/2⌉-1 and m-1 keys.
Thus, they have between ⌈m/2⌉ and m children.
• All leaf nodes are at the same level.
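For a concrete order m, these bounds can be computed directly. The helper `btree_bounds` is a hypothetical name used only for this illustration; it returns (minimum, maximum) pairs for the root and for the other non-leaf nodes.

```python
import math


def btree_bounds(m):
    """Key and child limits for a B-tree of order m, as (min, max) pairs."""
    min_children = math.ceil(m / 2)
    return {
        "root_keys": (1, m - 1),
        "root_children": (2, m),
        "node_keys": (min_children - 1, m - 1),
        "node_children": (min_children, m),
    }


print(btree_bounds(5))
# {'root_keys': (1, 4), 'root_children': (2, 5),
#  'node_keys': (2, 4), 'node_children': (3, 5)}
```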
Need of B-tree:
• For optimized searching, we cannot let the tree's height grow.
We want the tree to be as short as possible.
• A B-tree, which has more branches per node and hence a shorter
height, is the solution to this problem. Access time decreases as
branching grows and depth shrinks.
• Hence, a B-tree is used for storing data because searching and
access time are reduced.
• The cost of accessing the disk is high when searching tables. Therefore,
minimizing disk access is our goal.
• So to decrease time and cost, we use a B-tree for storing data, as it
makes the index fast.
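The effect of branching on height can be estimated with a rough calculation. The sketch assumes every node is completely full, so each extra level multiplies the number of reachable entries by the fanout; `levels_needed` is a hypothetical helper and the figures of one million keys and a fanout of 100 are illustrative only.

```python
def levels_needed(num_keys, fanout):
    """Smallest number of levels such that fanout ** levels >= num_keys."""
    levels, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


print(levels_needed(1_000_000, 2))    # 20 levels for a binary tree
print(levels_needed(1_000_000, 100))  # 3 levels with 100 branches per node
```

If each level costs one disk access, the wider tree needs about 3 accesses where the binary tree needs about 20.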
How Database B-Tree Indexing Works:
• When a B-tree is used for database indexing, it becomes a little more
complex because each entry has both a key and a value.
• The value serves as a reference to the particular data record. Together,
the key and value are called the payload.
• To index data by a particular key and value, the database first
constructs a unique random index or a primary key for each of the
supplied records.
• The keys and record byte streams are then all stored in a B+ tree. The
generated index is used for indexing the data.
• This indexing decreases the time needed to search the data.
• In a B+ tree, all the data is stored in the leaf nodes. To access a
particular entry, the database can use binary search on
the leaf nodes, since the data is stored in sorted order.
• If indexing is not used, the database has to read every record to
locate the requested one, which increases the time and cost of
searching. B-tree indexing is therefore very efficient.
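The difference between reading every record and binary-searching sorted keys can be seen on a plain sorted list. This is only an analogy for one node's worth of keys, not a database index: the data and the helpers `linear_lookup` and `binary_lookup` are assumptions made for the example, and `bisect_left` is Python's standard-library binary search.

```python
from bisect import bisect_left


def linear_lookup(keys, target):
    """Scan every key in turn; returns (position or None, keys examined)."""
    examined = 0
    for position, key in enumerate(keys):
        examined += 1
        if key == target:
            return position, examined
    return None, examined


def binary_lookup(keys, target):
    """Binary search over sorted keys; returns the position or None."""
    position = bisect_left(keys, target)
    if position < len(keys) and keys[position] == target:
        return position
    return None


keys = list(range(0, 2000, 2))       # 1000 sorted keys (assumed sample data)
print(linear_lookup(keys, 1998))     # (999, 1000): every key was examined
print(binary_lookup(keys, 1998))     # 999, found in about 10 comparisons
```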
How Searching Happens in an Indexed Database?
• The database searches the B-tree for a given key and returns
the index in O(log(n)) time.
• The record is then obtained by running a second B+ tree search
in O(log(n)) time using the discovered index.
• So overall, the time taken to search for a record in a
B-tree-indexed database is approximately O(log(n)).
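A single B-tree search of the kind counted above can be sketched as follows. The `BTreeNode` class, the `btree_search` function and the sample keys are hypothetical; unlike in a B+ tree, a key may be found in an internal node, so the search can stop before reaching a leaf. Each loop iteration descends one level, which is where the logarithmic cost comes from.

```python
from bisect import bisect_left


class BTreeNode:
    """Hypothetical B-tree node: sorted keys and, for non-leaves, children."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def btree_search(node, key):
    """Return (node, position) where key is stored, or None if absent."""
    while True:
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return node, i
        if not node.children:      # reached a leaf without finding the key
            return None
        node = node.children[i]    # descend into the i-th subtree


# Assumed sample tree with two levels.
root = BTreeNode([20, 40], [BTreeNode([5, 10]),
                            BTreeNode([25, 30]),
                            BTreeNode([45, 50])])
print(btree_search(root, 40)[1])   # 1: found in the root itself
print(btree_search(root, 35))      # None
```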
Example of B-Tree:
• Suppose some numbers need to be stored in a database. If we store
them in a B-tree, they are kept in sorted order so that searching
takes logarithmic time.
• For example:
| Parameters | B+ Tree | B Tree |
|---|---|---|
| Structure | Separate leaf nodes for data storage and internal nodes for indexing | Nodes store both keys and data values |
| Key Duplication | Typically allows key duplication in leaf nodes | Usually does not allow key duplication |
| Disk Access | Better disk access due to sequential reads in a linked list structure | More disk I/O due to non-sequential reads in internal nodes |
| Applications | Database systems, file systems, where range queries are common | In-memory data structures, databases, general-purpose use |
| Performance | Better performance for range queries and bulk data retrieval | Balanced performance for search, insert, and delete operations |