
Dsa Notes Topic Upto Tree

Dsa basic

Uploaded by

dhruvdwivedi761

TECHNOCRATS INSTITUTE OF TECHNOLOGY,BHOPAL

Department of computer science and engineering


SUBJECT : - Data Structure & Algorithms
By: Er.Chandan kumar

➢ What is Data Structure ?


• Data structures are the fundamental building blocks of computer programming. They
define how data is organized, stored, and manipulated within a program.
Understanding data structures is very important for developing efficient and effective
algorithms.
• A data structure is not only used for organizing the data. It is also used for processing,
retrieving, and storing data. There are different basic and advanced types of data
structures that are used in almost every program or software system that has been
developed.
➢ Classification of Data Structure:

According to the data types, the data structure is divided into two categories,
namely:
▪ Primitive Data Structures
▪ Non-Primitive Data Structures
➢ Primitive Data Structures:
This type of data structure stores the value of a particular data type. For example, an
integer data structure can store the value of only an integer. A primitive data
structure cannot be NULL.
❖ The following are the primitive data structures:
▪ Integer: It represents a whole number data type with no decimal places.
For example, int num = 30
▪ Float: It is a data type with decimal precision.
For example, float num = 2346.9875
▪ Character: It represents a single character.
For example, char name = 'M'
▪ Boolean: It holds a true or false value, typically the result of a specified condition.
For example, bool value = true
➢ Non-Primitive Data Structures:
These data structures are capable of storing values of more than one data type. The
size of the non-primitive data structure depends on the type of data it will store. For
example, a list (a non-primitive data structure) can store the values of various data
types. It can contain a NULL value.
➢ The following are some of the non-primitive data structures:
▪ List: A list is a linear type of data structure. This data structure holds an ordered list of
elements.
For example,
list_of_networks = ["Airtel", "Jio", "Idea"]
▪ Arrays: An array is also a linear data structure that stores a collection of data having
the same data type (int, float, boolean, text, and others). It has a fixed size and stores
its elements in contiguous memory locations.
For example,
int[] numbers = {1, 2, 3, 4, 5}
▪ Queue: A Queue is a data structure that arranges items in a specific order and
follows the First-In-First-Out (FIFO) method for accessing elements. It is commonly
employed in building queuing systems and handling threads in multithreading
scenarios.
For example, a queue of names in which "Alice" is at the front and is removed first:
String[] names = {"Alice", "Charlie", "David", "Emily"};
▪ Stack: Based on the principle ‘last in, first out’ or LIFO, a stack is an abstract data type
that is composed of homogenous pieces. It is used for push and pop operations that
are applied on top of the stack. While the pop operation is responsible for removing
an element from the top spot, the push operation adds an element to the stack.
For example (5 is at the top of the stack),
stack = [1,2,3,4,5]
▪ Tree: This is a non-linear and hierarchical data structure. A tree is a hierarchical
structure with nodes connected by edges, with the root being the top node and child
nodes below it.
For example,
tree = Tree(1, Tree(2, 2.1, 2.2), Tree(3, 3.1))

➢ Linear Data Structure: Data structure in which data elements are arranged
sequentially or linearly, where each element is attached to its previous and next
adjacent elements, is called a linear data structure.
Example: Array, Stack, Queue, Linked List, etc.
➢ Static Data Structure: Static data structure has a fixed memory size. It is easier to
access the elements in a static data structure.
Example: array.
➢ Dynamic Data Structure: In a dynamic data structure, the size is not fixed. It can
grow or shrink during runtime, which can make more efficient use of memory
(space complexity) than a fixed-size structure.
Example: Linked List, and Queues or Stacks built on a linked list.
➢ Non-Linear Data Structure: Data structures where data elements are not placed
sequentially or linearly are called non-linear data structures. In a non-linear data
structure, all the elements cannot be traversed in a single sequential run.
Examples: Trees and Graphs.
➢ Need Of Data structure :
1. Data structure modification is easy.
2. It requires less time.
3. It saves memory space.
4. Data representation is easy.
5. Easy access to the large database.
➢ Advantages & Disadvantages of Data Structure:
➢ Advantages of Data Structure:
Data structures offer a wide range of benefits for creating and maintaining code in
programming languages. Some of these advantages are:

1) Efficient Storage
Data structures provide efficient storage by organizing the data effectively, which facilitates
quick retrieval and maintenance of the data in the system. The memory allocation takes
place according to the data types used in the data structure.

2) Easy Data Processing


Various data structures are used for specific purposes like organizing, processing, retrieving,
and storing data. These structures enable users to access and work with data efficiently.
Data structures simplify data processing and enable faster sorting and searching for specific
data within a large data set. They hold raw data in a well-defined format on which
algorithms for data processing can be built.

3) Develop Algorithms
Algorithms for data structures help organize and access information in a structured manner.
These algorithms take into account the format of the data as well as any actions that can be
performed on it. Their goal is to find the most efficient way to store and manipulate data
within the structure while also allowing for easy navigation. By utilizing these algorithms,
complex issues can be resolved with efficiency.

4) Reusability of Data
One of the fundamental advantages of data structure is that it offers data reusability. It
enables the creation of data in specific formats, which can then be stored in libraries,
allowing different clients to utilize and access the data as needed. Therefore, data can be
reused in multiple ways and purposes. This makes it easier to create efficient and dynamic
algorithms that can be used for different applications.

5) Provide Built-in Functions


Different programming languages offer diverse data structures equipped with a variety of
built-in functions. These functions make efficient use of the underlying structure and
enhance data manipulation capabilities. For example, many library data structures provide
built-in functions, such as search, sort, filter, and merge, which enable us to manipulate
data more effectively.
6) Supports Data Abstraction
The abstract data type in data structures helps support data abstraction. Data abstraction is
the process of hiding implementation details and exposing only the relevant operations. An
abstract data type supports the use of complex data structures with complex functions.
It can be tailored to how the data structure will be used and enables code reuse, because
callers invoke its functions instead of writing repetitive code. Examples of abstract data
types include lists, queues, stacks, etc.

7) Saves Programmer’s Time


Data structures streamline the process of organizing and accessing data, which helps save
time. Developers can access data quickly and efficiently without having to manually search
through large amounts of data by selecting the appropriate data structure for their
program.

➢ Disadvantages of Data Structure:


1) Difficult to Handle for Beginners:
Working on simple and complex data structures requires good programming skills and
experience. A new developer may find it difficult to handle complex data structures.
2) Slower Data Structure Access:
Although different data types are available to allocate memory within the system, some of
the more complex data structures’ memory access might become slow and sluggish at
times.
3) Initial Quality Testing Takes Time:
Building algorithms is a necessary step in the process of designing a data structure from
scratch. Initially, it takes a lot of quality testing time, especially if the complexity of the data
structure is high.
4) High Maintenance:
Handling large data sets, especially big data, requires the use of complex data structures
and algorithms along with physical infrastructure. This will require a high cost of
maintenance for the smooth functioning of the programs.
5) Requires Comprehensive Planning:
Implementing and managing data structures without prior preparation and planning is
challenging. You need sophisticated calculations and tremendous efforts to outline the use
of data structures in your program.
➢ WHAT IS DATA TYPE:
A data type is a characteristic of data that defines how a computer system interprets its
value. Data types are used to ensure that data is collected in the correct format and that the
value of each property is as expected.

➢ Some common data types include:


C++ supports the following data types:
1. Primary or Built-in or Fundamental data type
2. Derived data types
3. User-defined data types
➢ Data Types in C++ are Mainly Divided into 3 Types:
1. Primitive Data Types: These data types are built-in or predefined data types and can be
used directly by the user to declare variables. example: int, char, float, bool, etc. Primitive
data types available in C++ are:
• Integer
• Character
• Boolean
• Floating Point
• Double Floating Point
• Valueless or Void
• Wide Character
2. Derived Data Types: Data types that are derived from the primitive or built-in
data types are referred to as derived data types. These can be of four types, namely:
• Function
• Array
• Pointer
• Reference
3. Abstract or User-Defined Data Types: Abstract or User-Defined data types are defined
by the user itself. Like, defining a class in C++ or a structure. C++ provides the following
user-defined datatypes:
• Class
• Structure
• Union
• Enumeration
• Typedef defined Datatype
➢ Primitive Data Types:
• Integer: The keyword used for integer data types is int. Integers typically require 4
bytes of memory space and range from -2147483648 to 2147483647.
• Character: Character data type is used for storing characters. The keyword used for
the character data type is char. Characters typically require 1 byte of memory space
and range from -128 to 127 or 0 to 255.
• Boolean: Boolean data type is used for storing Boolean or logical values. A Boolean
variable can store either true or false. The keyword used for the Boolean data type
is bool.
• Floating Point: Floating Point data type is used for storing single-precision floating-
point values or decimal values. The keyword used for the floating-point data type
is float. Float variables typically require 4 bytes of memory space.
• Double Floating Point: Double Floating Point data type is used for storing double-
precision floating-point values or decimal values. The keyword used for the double
floating-point data type is double. Double variables typically require 8 bytes of
memory space.
• void: Void means without any value. The void data type represents a valueless entity. It
is used as the return type of functions that do not return a value.
• Wide Character: Wide character data type is also a character data type but this data
type has a size greater than the normal 8-bit data type. Represented by wchar_t. It is
generally 2 or 4 bytes long.
• sizeof() operator: sizeof() operator is used to find the number of bytes occupied by a
variable/data type in computer memory.
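As a minimal sketch of the sizeof() operator described above, the hypothetical helpers below (printTypeSizes and elementCount are illustrative names, not part of the notes) report the size of several built-in types and use sizeof to count array elements. The printed values depend on the compiler and platform; only sizeof(char) == 1 is guaranteed by the language.

```cpp
#include <cstddef>
#include <iostream>

// Hypothetical helper: prints the size in bytes of several built-in
// types on the current platform. The numbers are implementation-defined.
void printTypeSizes() {
    std::cout << "char:    " << sizeof(char) << " byte(s)\n";
    std::cout << "int:     " << sizeof(int) << " byte(s)\n";
    std::cout << "float:   " << sizeof(float) << " byte(s)\n";
    std::cout << "double:  " << sizeof(double) << " byte(s)\n";
    std::cout << "wchar_t: " << sizeof(wchar_t) << " byte(s)\n";
}

// sizeof also works on variables: dividing the size of a whole array
// by the size of one element gives the number of elements.
std::size_t elementCount() {
    int numbers[5] = {1, 2, 3, 4, 5};
    return sizeof(numbers) / sizeof(numbers[0]);
}
```

Calling printTypeSizes() on a given machine is a quick way to check which of the "typical" sizes quoted in these notes actually apply there.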
Datatype Modifiers
As the name suggests, datatype modifiers are used with built-in data types to modify the
length of data that a particular data type can hold.

➢ Data type modifiers available in C++ are:


• Signed
• Unsigned
• Short
• Long
The table below summarizes the typical size and range of built-in data types when
combined with the type modifiers. Exact sizes depend on the compiler and platform; for
example, long int is 8 bytes on many 64-bit systems.

Data Type | Size (in bytes) | Range
short int | 2 | -32,768 to 32,767
unsigned short int | 2 | 0 to 65,535
unsigned int | 4 | 0 to 4,294,967,295
int | 4 | -2,147,483,648 to 2,147,483,647
long int | 4 | -2,147,483,648 to 2,147,483,647
unsigned long int | 4 | 0 to 4,294,967,295
long long int | 8 | -(2^63) to (2^63)-1
unsigned long long int | 8 | 0 to 18,446,744,073,709,551,615
signed char | 1 | -128 to 127
unsigned char | 1 | 0 to 255
float | 4 | -3.4×10^38 to 3.4×10^38
double | 8 | -1.7×10^308 to 1.7×10^308
long double | 12 | -1.1×10^4932 to 1.1×10^4932
wchar_t | 2 or 4 | 1 wide character

The following constants, defined in the <climits> header, express the limits of the
integer types:

Name | Expresses
CHAR_MIN | Minimum value for an object of type char
CHAR_MAX | Maximum value for an object of type char
SCHAR_MIN | Minimum value for an object of type signed char
SCHAR_MAX | Maximum value for an object of type signed char
UCHAR_MAX | Maximum value for an object of type unsigned char
CHAR_BIT | Number of bits in a char object
MB_LEN_MAX | Maximum number of bytes in a multi-byte character
SHRT_MIN | Minimum value for an object of type short int
SHRT_MAX | Maximum value for an object of type short int
USHRT_MAX | Maximum value for an object of type unsigned short int
INT_MIN | Minimum value for an object of type int
INT_MAX | Maximum value for an object of type int
UINT_MAX | Maximum value for an object of type unsigned int
LONG_MIN | Minimum value for an object of type long int
LONG_MAX | Maximum value for an object of type long int
ULONG_MAX | Maximum value for an object of type unsigned long int
LLONG_MIN | Minimum value for an object of type long long int
LLONG_MAX | Maximum value for an object of type long long int
ULLONG_MAX | Maximum value for an object of type unsigned long long int
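One way these limit constants are used in practice is to guard arithmetic against overflow. The sketch below assumes a hypothetical helper named additionOverflows (not from the notes) that checks, using INT_MAX and INT_MIN from <climits>, whether adding two int values would leave the representable range.

```cpp
#include <climits>
#include <limits>

// Hypothetical helper: returns true when a + b would fall outside the
// range [INT_MIN, INT_MAX]. The check itself never overflows because it
// compares against the remaining headroom instead of computing a + b.
bool additionOverflows(int a, int b) {
    if (b > 0) {
        return a > INT_MAX - b;  // not enough room above a
    }
    if (b < 0) {
        return a < INT_MIN - b;  // not enough room below a
    }
    return false;                // adding zero never overflows
}
```

The same limits are also available through std::numeric_limits<int>::max() and std::numeric_limits<int>::min() in the <limits> header.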

➢ Differences between data type and data structure:

Data Type | Data Structure
A data type is the form of a variable to which a value can be assigned. It specifies that the variable will hold values of the given data type only. | A data structure is a collection of different kinds of data. That entire data can be represented using an object and can be used throughout the program.
It can hold a value but not data. Therefore, it is dataless. | It can hold multiple types of data within a single object.
The implementation of a data type is known as abstract implementation. | Data structure implementation is known as concrete implementation.
There is no time complexity in the case of data types. | In data structure objects, time complexity plays an important role.
In the case of data types, the value of data is not stored because it only represents the type of data that can be stored. | In the case of data structures, the data and its value occupy space in the computer's main memory. Also, a data structure can hold different kinds and types of data within one single object.
Data type examples are int, float, double, etc. | Data structure examples are stack, queue, tree, etc.

• Linear data structure: Data structure in which data elements are arranged
sequentially or linearly, where each element is attached to its previous and next
adjacent elements, is called a linear data structure.
Examples of linear data structures are array, stack, queue, linked list, etc.
o Static data structure: Static data structure has a fixed memory size. It is
easier to access the elements in a static data structure.
An example of this data structure is an array.
o Dynamic data structure: In a dynamic data structure, the size is not fixed.
It can grow or shrink during runtime, which can make more efficient use of
memory (space complexity) than a fixed-size structure.
Examples of this data structure are linked lists, and queues or stacks built on them.
• Non-linear data structure: Data structures where data elements are not placed
sequentially or linearly are called non-linear data structures. In a non-linear data
structure, all the elements cannot be traversed in a single sequential run.
Examples of non-linear data structures are trees and graphs.
Arrays:
• An array is a linear data structure and it is a collection of items stored at contiguous
memory locations. The idea is to store multiple items of the same type together in
one place. It allows the processing of a large amount of data in a relatively short
period. The first element of the array is indexed by a subscript of 0. There are
different operations possible in an array, like Searching, Sorting, Inserting,
Traversing, Reversing, and Deleting.

➢ Characteristics of an Array:
An array has various characteristics which are as follows:
• Homogeneous Elements: All elements within an array must be of the same data type.
• Contiguous Memory Allocation: In most programming languages, elements in an
array are stored in contiguous (adjacent) memory locations.
• Zero-Based Indexing: In many programming languages, arrays use zero-based
indexing, which means that the first element is accessed with an index of 0, the
second with an index of 1, and so on.
• Random Access: Arrays provide constant-time (O(1)) access to elements. This means
that regardless of the size of the array, it takes the same amount of time to access
any element based on its index.
• Arrays use an index-based data structure which helps to identify each of the elements
in an array easily using the index.
• If a user wants to store multiple values of the same data type, then the array can be
utilized efficiently.
• An array can also handle complex data structures by storing data in a two-dimensional
array.
• An array is also used to implement other data structures like Stacks, Queues, Heaps,
Hash tables, etc.
• The search process in an array can be done very easily.

➢ Operations performed on array:


• Initialization: An array can be initialized with values at the time of declaration or later
using an assignment statement.

• Accessing elements: Elements in an array can be accessed by their index, which starts
from 0 and goes up to the size of the array minus one.
• Searching for elements: Arrays can be searched for a specific element using linear
search or binary search algorithms.
• Sorting elements: Elements in an array can be sorted in ascending or descending
order using algorithms like bubble sort, insertion sort, or quick sort.
• Inserting elements: Elements can be inserted into an array at a specific location, but
this operation can be time-consuming because it requires shifting existing elements in
the array.
• Deleting elements: Elements can be deleted from an array by shifting the elements
that come after it to fill the gap.
• Updating elements: Elements in an array can be updated or modified by assigning a
new value to a specific index.
• Traversing elements: The elements in an array can be traversed in order, visiting each
element once.
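The insertion, deletion, and searching operations listed above can be sketched on a plain fixed-capacity array as follows. The names CAPACITY, insertAt, deleteAt, and linearSearch are illustrative assumptions, not taken from the notes; the point is to show why insertion and deletion need element shifting while access by index does not.

```cpp
#include <cstddef>

const std::size_t CAPACITY = 10;  // assumed fixed capacity for this sketch

// Inserts value at index pos by shifting later elements one slot right.
// Returns false if the array is full or pos is past the end.
bool insertAt(int arr[], std::size_t& size, std::size_t pos, int value) {
    if (size >= CAPACITY || pos > size) return false;
    for (std::size_t i = size; i > pos; --i) {
        arr[i] = arr[i - 1];
    }
    arr[pos] = value;
    ++size;
    return true;
}

// Deletes the element at index pos by shifting later elements one slot left.
bool deleteAt(int arr[], std::size_t& size, std::size_t pos) {
    if (pos >= size) return false;
    for (std::size_t i = pos; i + 1 < size; ++i) {
        arr[i] = arr[i + 1];
    }
    --size;
    return true;
}

// Linear search: returns the index of target, or -1 if it is not present.
int linearSearch(const int arr[], std::size_t size, int target) {
    for (std::size_t i = 0; i < size; ++i) {
        if (arr[i] == target) return static_cast<int>(i);
    }
    return -1;
}
```

Both insertAt and deleteAt take O(n) time in the worst case because of the shifting, whereas reading or updating arr[i] takes O(1).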
➢ Applications of Array:
Different applications of an array are as follows:
• An array is used in solving matrix problems.
• Database records are also implemented by an array.
• It helps in implementing a sorting algorithm.
• It is also used to implement other data structures like Stacks, Queues, Heaps, Hash
tables, etc.
• An array can be used for CPU scheduling.
• Can be applied as a lookup table in computers.
• Arrays can be used in speech processing where every speech signal is an array.
• The screen of the computer is also displayed by an array. Here we use a
multidimensional array.
• The array is used in many management systems like a library, students, parliament,
etc.
• The array is used in the online ticket booking system. Contacts on a cell phone are
displayed by this array.
• In games like online chess, arrays store the player's past moves as well as the
current move, which can be used to hint at a position.
• Arrays are used to store images of a specific dimension on Android, such as 360*1200.
➢ Real-Life Applications of Array:
• An array is frequently used to store data for mathematical computations.
• It is used in image processing.
• It is also used in record management.
• Book pages are also real-life examples of an array.
• It is used in ordering boxes as well.
➢ Types of arrays:
• One-Dimensional Array: This is the simplest form of an array, which consists of a
single row of elements, all of the same data type. Elements in a 1D array are accessed
using a single index.

• Two-Dimensional Array: A two-dimensional array, often referred to as a matrix or 2D
array, is an array of arrays. It consists of rows and columns, forming a grid-like
structure. Elements in a 2D array are accessed using two indices, one for the row and
one for the column.

• Multi-Dimensional Array: Arrays can have more than two dimensions, leading to
multi-dimensional arrays. These are used when data needs to be organized in a multi-
dimensional grid.
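To make the two-index access of a two-dimensional array concrete, here is a small sketch. ROWS, COLS, and sumMatrix are hypothetical names chosen for illustration; the function visits every element by row index and column index.

```cpp
#include <cstddef>

const std::size_t ROWS = 2;  // assumed dimensions for this sketch
const std::size_t COLS = 3;

// Sums all elements of a 2D array, accessing each one as m[row][col].
int sumMatrix(const int m[ROWS][COLS]) {
    int total = 0;
    for (std::size_t row = 0; row < ROWS; ++row) {
        for (std::size_t col = 0; col < COLS; ++col) {
            total += m[row][col];
        }
    }
    return total;
}
```

A one-dimensional array would need a single loop and a single index; each extra dimension adds one more index and, typically, one more nested loop.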

2. Linked List
A Linked List is a linear data structure which looks like a chain of nodes, where each node
contains a data field and a reference(link) to the next node in the list. Unlike Arrays,
Linked List elements are not stored at a contiguous location.
Common Features of Linked List:
• Node: Each element in a linked list is represented by a node, which contains two
components:
o Data: The actual data or value associated with the element.
o Next Pointer(or Link): A reference or pointer to the next node in the linked
list.
• Head: The first node in a linked list is called the “head.” It serves as the starting point
for traversing the list.
• Tail: The last node in a linked list is called the “tail.”
Types of Linked Lists:
• Singly-linked list
• Doubly linked list
• Circular linked list
• Doubly circular linked list

➢ Singly Linked List:


• In this type of linked list, every node stores the address or reference of the next node
in the list and the last node has the next address or reference as NULL. For example:
1->2->3->4->NULL
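A singly linked list like the one above can be sketched with a small node structure. The names Node, pushFront, toString, and freeList are illustrative assumptions rather than anything defined in the notes.

```cpp
#include <string>

// One node of a singly linked list: a value plus a link to the next node.
struct Node {
    int data;
    Node* next;
};

// Adds a new node in front of the current head and returns the new head.
Node* pushFront(Node* head, int value) {
    return new Node{value, head};
}

// Walks the list from head until the NULL link and renders it as text.
std::string toString(const Node* head) {
    std::string out;
    for (const Node* cur = head; cur != nullptr; cur = cur->next) {
        out += std::to_string(cur->data) + "->";
    }
    return out + "NULL";
}

// Releases every node so the sketch does not leak memory.
void freeList(Node* head) {
    while (head != nullptr) {
        Node* next = head->next;
        delete head;
        head = next;
    }
}
```

Pushing 4, 3, 2, 1 onto an empty list in that order and calling toString yields the same picture as the example: the head holds 1 and the last node's link is NULL.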

➢ Doubly Linked Lists:


• In a doubly linked list, each node has two pointers: one pointing to the next node and
one pointing to the previous node. This bidirectional structure allows for efficient
traversal in both directions.
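The two pointers per node can be sketched as below. DNode, append, forward, and backward are hypothetical names used only for this illustration; the sketch shows that the same list can be walked from the head using next or from the tail using prev.

```cpp
#include <vector>

// One node of a doubly linked list: links to both neighbours.
struct DNode {
    int data;
    DNode* prev;
    DNode* next;
};

// Appends a new node after the current tail and returns the new tail.
// Pass nullptr as tail to start a new list.
DNode* append(DNode* tail, int value) {
    DNode* node = new DNode{value, tail, nullptr};
    if (tail != nullptr) {
        tail->next = node;
    }
    return node;
}

// Collects the values from head to tail by following next pointers.
std::vector<int> forward(const DNode* head) {
    std::vector<int> values;
    for (const DNode* cur = head; cur != nullptr; cur = cur->next) {
        values.push_back(cur->data);
    }
    return values;
}

// Collects the values from tail to head by following prev pointers.
std::vector<int> backward(const DNode* tail) {
    std::vector<int> values;
    for (const DNode* cur = tail; cur != nullptr; cur = cur->prev) {
        values.push_back(cur->data);
    }
    return values;
}
```

The price of bidirectional traversal is one extra pointer per node and one extra link to update on every insertion or deletion.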

➢ Circular Linked Lists:


• A circular linked list is a type of linked list in which the first and the last nodes are
also connected to each other to form a circle, there is no NULL at the end.
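Because a circular linked list has no NULL at the end, traversal needs a different stopping rule: stop when the walk returns to its starting node. The sketch below assumes hypothetical names (CNode, insertAtEnd, countNodes) and keeps a pointer to the tail, whose next link is the head.

```cpp
// One node of a circular singly linked list.
struct CNode {
    int data;
    CNode* next;
};

// Inserts a new node after the tail and returns the new tail.
// An empty list is represented by tail == nullptr.
CNode* insertAtEnd(CNode* tail, int value) {
    CNode* node = new CNode{value, nullptr};
    if (tail == nullptr) {
        node->next = node;        // a single node points to itself
        return node;
    }
    node->next = tail->next;      // new node links back to the head
    tail->next = node;            // old tail links to the new node
    return node;
}

// Counts nodes by walking from the head until the head is reached again.
int countNodes(const CNode* tail) {
    if (tail == nullptr) return 0;
    int count = 0;
    const CNode* cur = tail->next;  // start at the head
    do {
        ++count;
        cur = cur->next;
    } while (cur != tail->next);
    return count;
}
```

Keeping the tail pointer instead of the head is a design choice in this sketch: it gives O(1) access to both ends, since the head is always tail->next.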
Types of Linked List operations:
• Accessing Elements: Accessing a specific element in a linked list takes O(n) time since
nodes are stored in non-contiguous locations, so random access is not possible.
• Searching: Searching for a node in a linked list takes O(n) time, as the whole list needs
to be traversed in the worst case.
• Insertion: Insertion takes O(1) time if we are at the position where we have to insert
an element.
• Deletion: Deletion takes O(1) time if we know the position of the element to be
deleted.
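The costs listed above can be seen in a short sketch. The names Node, insertAfter, deleteAfter, and search are assumptions made for illustration: insertion and deletion next to a node we already hold only relink pointers (O(1)), while search has to walk the list (O(n)).

```cpp
struct Node {
    int data;
    Node* next;
};

// O(1): inserts a new node right after prev by relinking two pointers.
void insertAfter(Node* prev, int value) {
    prev->next = new Node{value, prev->next};
}

// O(1): removes the node right after prev, if there is one.
void deleteAfter(Node* prev) {
    Node* victim = prev->next;
    if (victim == nullptr) return;
    prev->next = victim->next;
    delete victim;
}

// O(n): walks the list until target is found or the list ends.
Node* search(Node* head, int target) {
    for (Node* cur = head; cur != nullptr; cur = cur->next) {
        if (cur->data == target) return cur;
    }
    return nullptr;
}
```

In practice the O(1) figure applies only once the position is known; finding that position first is itself an O(n) search.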
➢ Characteristics of a Linked list:
A linked list has various characteristics which are as follows:
• A linked list uses extra memory to store links.
• During the initialization of the linked list, there is no need to know the size of the
elements.
• Linked lists are used to implement stacks, queues, graphs, etc.
• The first node of the linked list is called the Head.
• In a non-circular linked list, the next pointer of the last node points to NULL.
• In a linked list, insertion and deletion are possible easily.
• Each node of the linked list consists of a pointer/link which is the address of the next
node.
• Linked lists can shrink or grow at any point in time easily.
➢ Applications of the Linked list:
Different applications of linked lists are as follows:
• Linked lists are used to implement stacks, queues, graphs, etc.
• Linked lists are used to perform arithmetic operations on long integers.
• It is used for the representation of sparse matrices.
• It is used in the linked allocation of files.
• It helps in memory management.
• It is used in the representation of Polynomial Manipulation where each polynomial
term represents a node in the linked list.
• Linked lists are used to display image containers. Users can visit past, current, and
next images.
• They are used to store the history of the visited page.
• They are used to perform undo operations.
• Linked lists are used in software development tools, where they help check the correct
syntax of a tag.
• Linked lists are used to display social media feeds.

3. Stack Data Structure:

A stack is a linear data structure that follows the Last-In-First-Out (LIFO) principle,
meaning that the last element added to the stack is the first one to be removed.
➢ Types of Stacks:
• Fixed Size Stack: As the name suggests, a fixed size stack has a fixed size and cannot
grow or shrink dynamically. If the stack is full and an attempt is made to add an
element to it, an overflow error occurs. If the stack is empty and an attempt is made
to remove an element from it, an underflow error occurs.
• Dynamic Size Stack: A dynamic size stack can grow or shrink dynamically. When the
stack is full, it automatically increases its size to accommodate the new element, and
when the stack is empty, it decreases its size. This type of stack is implemented using
a linked list, as it allows for easy resizing of the stack.
➢ Stack Operations:
• push(): When this operation is performed, an element is inserted into the stack.
• pop(): When this operation is performed, an element is removed from the top of the
stack and is returned.
• top(): This operation will return the last inserted element that is at the top without
removing it.
• size(): This operation will return the size of the stack i.e. the total number of elements
present in the stack.
• isEmpty(): This operation indicates whether the stack is empty or not.
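The five operations above can be sketched as a fixed size stack backed by an array. The class name FixedStack, the capacity of 100, and the choice to throw exceptions on overflow and underflow are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <stdexcept>

// A fixed size stack of ints stored in an array; the top is items_[count_ - 1].
class FixedStack {
public:
    // Inserts an element on top; throws if the stack is full (overflow).
    void push(int value) {
        if (count_ == kCapacity) throw std::overflow_error("stack overflow");
        items_[count_++] = value;
    }

    // Removes and returns the top element; throws if empty (underflow).
    int pop() {
        if (isEmpty()) throw std::underflow_error("stack underflow");
        return items_[--count_];
    }

    // Returns the top element without removing it.
    int top() const {
        if (isEmpty()) throw std::underflow_error("stack is empty");
        return items_[count_ - 1];
    }

    std::size_t size() const { return count_; }
    bool isEmpty() const { return count_ == 0; }

private:
    static constexpr std::size_t kCapacity = 100;  // assumed capacity
    int items_[kCapacity];
    std::size_t count_ = 0;
};
```

Every operation here runs in O(1) time because all work happens at one end of the array.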
Characteristics of a Stack:
Stack has various different characteristics which are as follows:
• Stack is used in many different algorithms like Tower of Hanoi, tree traversal,
recursion, etc.
• Stack is implemented through an array or linked list.
• It follows the Last In First Out operation i.e., an element that is inserted first will pop
in last and vice versa.
• The insertion and deletion are performed at one end i.e. from the top of the stack.
• In stack, if the allocated space for the stack is full, and still anyone attempts to add
more elements, it will lead to stack overflow.
Applications of Stack:
Different applications of Stack are as follows:
• The stack data structure is used in the evaluation and conversion of arithmetic
expressions.
• It is used for parenthesis checking.
• While reversing a string, the stack is used as well.
• Stack is used in memory management.
• It is also used for processing function calls.
• The stack is used to convert expressions from infix to postfix.
• The stack is used to perform undo as well as redo operations in word processors.
• The stack is used in virtual machines like JVM.
• The stack is used in the media players. Useful to play the next and previous song.
• The stack is used in recursion operations.

4. Queue Data Structure:


A queue is a linear data structure that follows the First-In-First-Out (FIFO) principle. In a
queue, the first element added is the first one to be removed.

➢ Types of Queue:

• Input Restricted Queue: This is a simple queue. In this type of queue, the input can
be taken from only one end but deletion can be done from any of the ends.

• Output Restricted Queue: This is also a simple queue. In this type of queue, the input
can be taken from both ends but deletion can be done from only one end.
• Circular Queue: This is a special type of queue where the last position is connected
back to the first position. Here also the operations are performed in FIFO order.
• Double-Ended Queue (Deque): In a double-ended queue, both the insertion and
deletion operations can be performed at both ends.
• Priority Queue: A priority queue is a special queue where the elements are accessed
based on the priority assigned to them.
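Two of these queue types are available directly in the C++ standard library as std::deque and std::priority_queue. The sketch below uses hypothetical helper names (serveByPriority, buildDeque) and assumes that a larger integer means a higher priority, which is the default behaviour of std::priority_queue<int>.

```cpp
#include <deque>
#include <queue>
#include <vector>

// Puts all values into a max-priority queue and returns them in the
// order they would be served (highest value first).
std::vector<int> serveByPriority(const std::vector<int>& values) {
    std::priority_queue<int> pq(values.begin(), values.end());
    std::vector<int> order;
    while (!pq.empty()) {
        order.push_back(pq.top());
        pq.pop();
    }
    return order;
}

// Shows insertion and deletion at both ends of a double-ended queue.
std::deque<int> buildDeque() {
    std::deque<int> dq;
    dq.push_back(2);   // insert at the rear
    dq.push_back(3);
    dq.push_front(1);  // insert at the front
    dq.pop_back();     // delete from the rear
    return dq;
}
```

An input restricted or output restricted queue can be modelled with the same std::deque by simply not calling the insert or delete operation at one of the ends.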
➢ Operation performed on queue:
• Enqueue(): Adds (or stores) an element at the end of the queue.
• Dequeue(): Removes an element from the front of the queue.
• Peek() or front(): Acquires the data element available at the front node of the queue
without deleting it.
• rear(): This operation returns the element at the rear end without removing it.
• isFull(): Validates if the queue is full.
• isNull(): Checks if the queue is empty.
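The queue operations above can be sketched with a circular queue stored in a fixed array. The class name CircularQueue, the capacity of 4, and the use of exceptions for the full and empty cases are assumptions for illustration; the method names follow the list above.

```cpp
#include <cstddef>
#include <stdexcept>

// A fixed-capacity FIFO queue of ints. Indices wrap around with the
// modulo operator, so freed slots at the start of the array are reused.
class CircularQueue {
public:
    // Adds an element at the rear; throws if the queue is full.
    void enqueue(int value) {
        if (isFull()) throw std::overflow_error("queue is full");
        items_[(front_ + count_) % kCapacity] = value;
        ++count_;
    }

    // Removes and returns the element at the front; throws if empty.
    int dequeue() {
        if (isNull()) throw std::underflow_error("queue is empty");
        int value = items_[front_];
        front_ = (front_ + 1) % kCapacity;
        --count_;
        return value;
    }

    // Returns the front element without removing it.
    int peek() const {
        if (isNull()) throw std::underflow_error("queue is empty");
        return items_[front_];
    }

    // Returns the rear element without removing it.
    int rear() const {
        if (isNull()) throw std::underflow_error("queue is empty");
        return items_[(front_ + count_ - 1) % kCapacity];
    }

    bool isFull() const { return count_ == kCapacity; }
    bool isNull() const { return count_ == 0; }

private:
    static constexpr std::size_t kCapacity = 4;  // assumed capacity
    int items_[kCapacity];
    std::size_t front_ = 0;  // index of the front element
    std::size_t count_ = 0;  // number of stored elements
};
```

Tracking the element count alongside the front index avoids the usual ambiguity between a full and an empty circular queue.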
➢ Characteristics of a Queue:

The queue has various different characteristics which are as follows:

• The queue is a FIFO (First In First Out) structure.


• To remove the most recently added element of the Queue, all the elements inserted
before it must be removed first.
• A queue is an ordered list of elements of similar data types.
➢ Applications of Queue:
Different applications of Queue are as follows:
• Queue is used for handling website traffic.
• It helps to maintain the playlist in media players.
• Queue is used in operating systems for handling interrupts.
• It helps in serving requests on a single shared resource, like a printer, CPU task
scheduling, etc.
• It is used in the asynchronous transfer of data e.g. pipes, file IO, and sockets.
• Queues are used for job scheduling in the operating system.
• In social media, a queue is used to upload multiple photos or videos.
• A queue data structure is used to send e-mails.
• In the Windows operating system, a queue is used to switch between multiple applications.

➢ Advantages of Linear Data Structures:


• Efficient data access: Elements can be easily accessed by their position in the
sequence.
• Dynamic sizing: Linear data structures can dynamically adjust their size as elements
are added or removed.
• Ease of implementation: Linear data structures can be easily implemented using
arrays or linked lists.
• Versatility: Linear data structures can be used in various applications, such as
searching, sorting, and manipulation of data.
• Simple algorithms: Many algorithms used in linear data structures are simple and
straightforward.
➢ Disadvantages of Linear Data Structures:
• Limited data access: Accessing elements not stored at the end or the beginning of the
sequence can be time-consuming.
• Memory overhead: Maintaining the links between elements in linked lists and
pointers in stacks and queues can consume additional memory.
• Complex algorithms: Some algorithms used in linear data structures, such as
searching and sorting, can be complex and time-consuming.
• Inefficient use of memory: Linear data structures can result in inefficient use of
memory if there are gaps in the memory allocation.
• Unsuitable for certain operations: Linear data structures may not be suitable for
operations that require constant random access to elements, such as searching for an
element in a large dataset.
➢ TREE:

• A tree is a non-linear and hierarchical data structure whose elements are arranged
in levels, with each element connected to the elements below it. In a tree, the
topmost node is called the root node. Each node contains some data, and data can be
of any type. A tree consists of a root node, internal nodes, and sub-nodes which are
connected via edges. Because of this hierarchy, many tree data structures allow
quicker and easier access to the data than a linear structure. A tree has various
terminologies like Node, Root, Edge, Height of a tree, Degree of a tree, etc.
➢ There are different types of trees, such as:
1) Binary Tree,
2) Binary Search Tree,
3) AVL Tree,
4) B-Tree, etc.
➢ Characteristics of a Tree:
A tree has the following characteristics:
• A tree is also known as a Recursive data structure, because every subtree is itself a tree.
• In a tree, the Height of the root is the number of edges on the longest path from the
root node to a leaf node.
• In a tree, one can also calculate the depth from the top to any node. The root node
has a depth of 0.
➢ Applications of Tree:
Different applications of Tree are as follows:
• Heap is a tree data structure that is implemented using arrays and used to implement
priority queues.
• B-Tree and B+ Tree are used to implement indexing in databases.
• Syntax Tree helps in scanning, parsing, generation of code, and evaluation of
arithmetic expressions in Compiler design.
• K-D Tree is a space partitioning tree used to organize points in K-dimensional space.
• Spanning trees are used in routers in computer networks.
➢ Operation performed on tree:
A tree is a non-linear data structure that consists of nodes connected by edges. Here
are some common operations performed on trees:
• Insertion: New nodes can be added to the tree to create a new branch or to increase
the height of the tree.
• Deletion: Nodes can be removed from the tree by updating the references of the
parent node to remove the reference to the current node.
• Search: Elements can be searched for in a tree by starting from the root node and
traversing the tree based on the value of the current node until the desired node is
found.
• Traversal: The elements in a tree can be traversed in several different ways, including
in-order, pre-order, and post-order traversal.
• Height: The height of the tree can be determined by counting the number of edges
from the root node to the furthest leaf node.
• Depth: The depth of a node can be determined by counting the number of edges
from the root node to the current node.
• Balancing: The tree can be balanced to ensure that the height of the tree is
minimized and the distribution of nodes is as even as possible.
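The Height and Depth operations above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `TreeNode` class with a list of children; neither the class nor the function names come from these notes.

```python
class TreeNode:
    """Hypothetical general tree node: a value plus a list of child nodes."""

    def __init__(self, data, children=None):
        self.data = data
        self.children = children or []


def height(node):
    """Number of edges on the longest path from node down to a leaf."""
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children)


def depth(root, target, current=0):
    """Number of edges from root to the node holding target, or -1 if absent."""
    if root.data == target:
        return current
    for child in root.children:
        found = depth(child, target, current + 1)
        if found != -1:
            return found
    return -1


# Example tree: A has children B and C; B has child D.
tree = TreeNode("A", [TreeNode("B", [TreeNode("D")]), TreeNode("C")])
print(height(tree))       # 2
print(depth(tree, "D"))   # 2
```

Both functions count edges, matching the definitions in the list above.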

➢ Binary Tree:
• A binary tree is a non-linear tree data structure in which each node can
have at most two children which are referred to as the left child and
the right child.
• The topmost node in a binary tree is called the root, and the bottom-most
nodes are called leaves. A binary tree can be visualized as a hierarchical
structure with the root at the top and the leaves at the bottom.

➢ Representation of Binary Tree


Each node in a Binary Tree has three parts:
• Data
• Pointer to the left child
• Pointer to the right child
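The three parts listed above map directly onto a small class. The sketch below is one possible representation; the class name `Node` and its field names are assumptions made for illustration.

```python
class Node:
    """One binary tree node: data plus links to the left and right children."""

    def __init__(self, data, left=None, right=None):
        self.data = data    # value stored in the node
        self.left = left    # left child, or None if there is none
        self.right = right  # right child, or None if there is none


# A three-node tree: 1 is the root, 2 its left child, 3 its right child.
root = Node(1, Node(2), Node(3))
```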
➢ Terminologies in Binary Tree
• Nodes: The fundamental part of a binary tree, where each node
contains data and links to at most two child nodes.
• Root: The topmost node in a tree is known as the root node. It has no parent and
serves as the starting point for all nodes in the tree.
• Parent Node: A node that has one or more child nodes. In a binary tree, each node
can have at most two children.
• Child Node: A node that is a direct descendant of another node (its parent).
• Leaf Node: A node that does not have any children, i.e. both of its child links are null.
• Internal Node: A node that has at least one child. This includes all nodes except
the leaf nodes (the root is an internal node whenever it has a child).
• Depth of a Node: The number of edges from the root node to a specific node. The
depth of the root node is zero.
• Height of a Binary Tree: The number of edges on the longest path from the root node
to the deepest leaf node. Some texts count nodes instead of edges, which gives a
value one larger.
The diagram below shows all these terms in a binary tree.

➢ Advantages of Binary Tree:


• Efficient Search: Binary Search Trees (a variation of Binary Tree) are efficient when
searching for a specific element. Each comparison discards one of the two subtrees,
so a balanced tree needs far fewer comparisons than a linear scan of a linked list or
an unsorted array.
• Memory Efficient: Binary trees require less memory per node than tree data
structures that allow more children, because each node stores only two child links.
• Easy to implement: Binary trees are relatively easy to implement and understand as
each node has at most two children, the left child and the right child.
➢ Disadvantages of Binary Tree:
• Limited structure: Binary trees are limited to two child nodes per node, which can
limit their usefulness in certain applications. For example, if a tree requires more than
two child nodes per node, a different tree structure may be more suitable.
• Unbalanced trees: Unbalanced binary trees, where one subtree is significantly larger
than the other, can lead to inefficient search operations. This can occur if the tree is
not properly balanced or if data is inserted in a non-random order.
• Space inefficiency: Binary trees can be space inefficient when compared to other data
structures like arrays and linked list. This is because each node requires two child
references or pointers, which can be a significant amount of memory overhead for
large trees.
• Slow performance in worst-case scenarios: In the worst-case scenario, a binary tree
can become degenerate or skewed, meaning that each node has only one child. In this
case, search operations in Binary Search Tree (a variation of Binary Tree) can degrade
to O(n) time complexity, where n is the number of nodes in the tree.
➢ Applications of Binary Tree:
• Binary Tree can be used to represent hierarchical data.
• Huffman Coding trees are used in data compression algorithms.
• A Priority Queue can be implemented with a binary heap (a kind of binary tree),
which gives access to the maximum or minimum element in O(1) time.
• Binary trees are useful for indexing in databases and for storing caches in a system.
• Binary trees can be used to implement decision trees, a type of machine learning
algorithm used for classification and regression analysis.
➢ Operations On Binary Tree:
Following is a list of common operations that can be performed on a binary tree:
1. Traversal in Binary Tree:
• Traversal in Binary Tree involves visiting all the nodes of the binary tree. Tree
Traversal algorithms can be classified broadly into two categories, DFS and BFS:
• Depth-First Search (DFS) algorithms: DFS explores as far down a branch as possible
before backtracking. It is typically implemented using recursion or an explicit stack.
The main traversal methods in DFS for binary trees are:
• Preorder Traversal (current-left-right): Visits the node first, then left subtree,
then right subtree.
• Inorder Traversal (left-current-right): Visits left subtree, then the node, then
the right subtree.
• Postorder Traversal (left-right-current): Visits left subtree, then right subtree, then
the node.
• Breadth-First Search (BFS) algorithms: BFS explores all nodes at the present depth before
moving on to nodes at the next depth level. It is typically implemented using a queue. BFS
in a binary tree is commonly referred to as Level Order Traversal.
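The four traversals described above can be sketched as follows. The `Node` class and the function names are illustrative assumptions, and each function returns the visited values as a list rather than printing them.

```python
from collections import deque


class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def preorder(node):
    """current - left - right"""
    if node is None:
        return []
    return [node.data] + preorder(node.left) + preorder(node.right)


def inorder(node):
    """left - current - right"""
    if node is None:
        return []
    return inorder(node.left) + [node.data] + inorder(node.right)


def postorder(node):
    """left - right - current"""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.data]


def level_order(root):
    """BFS: visit nodes level by level using a queue."""
    result = []
    queue = deque([root]) if root is not None else deque()
    while queue:
        node = queue.popleft()
        result.append(node.data)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return result


# Example tree: 1 has children 2 and 3; 2 has children 4 and 5.
root = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(preorder(root))     # [1, 2, 4, 5, 3]
print(inorder(root))      # [4, 2, 5, 1, 3]
print(postorder(root))    # [4, 5, 2, 3, 1]
print(level_order(root))  # [1, 2, 3, 4, 5]
```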

2. Insertion in Binary Tree:


• Inserting an element means adding a new node to the binary tree. Because a plain
binary tree imposes no ordering on its elements, we do not have to worry about
where the new value belongs relative to the others. If the tree is empty, the new
node becomes the root. Otherwise, we search level by level for the first node
with an empty left or right child and insert the new node there. By convention,
the left child position is filled before the right.
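A minimal sketch of this level-by-level insertion, assuming a hypothetical `Node` class and function name `insert_level_order`:

```python
from collections import deque


class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def insert_level_order(root, data):
    """Insert data at the first empty child slot found in level order."""
    new_node = Node(data)
    if root is None:
        return new_node              # empty tree: new node becomes the root
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node.left is None:        # left slot is tried first
            node.left = new_node
            return root
        queue.append(node.left)
        if node.right is None:
            node.right = new_node
            return root
        queue.append(node.right)
    return root


root = None
for value in [1, 2, 3, 4]:
    root = insert_level_order(root, value)
# 1 is the root, 2 and 3 are its children, and 4 is the left child of 2.
```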

3. Searching in Binary Tree:


• Searching for a value in a binary tree means looking through the tree to find a
node that has that value. Since binary trees do not have a specific order like binary
search trees, we typically use any traversal method to search. The most common
methods are depth-first search (DFS) and breadth-first search (BFS).
• In DFS, we start from the root and go as deep as possible along each branch before backtracking. In BFS, we
explore all the nodes at the present depth level before moving on to the nodes at
the next level. We continue this process until we either find the node with the
desired value or reach the end of the tree. If the tree is empty or the value isn’t
found after exploring all possibilities, we conclude that the value does not exist in
the tree.
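As a sketch of the DFS variant of this search (the `Node` class and the name `contains` are assumptions for illustration), every node may have to be visited because the tree has no ordering:

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def contains(node, target):
    """Depth-first search: True if any node in the tree holds target."""
    if node is None:                 # empty tree or end of a branch
        return False
    if node.data == target:
        return True
    return contains(node.left, target) or contains(node.right, target)


root = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(contains(root, 5))   # True
print(contains(root, 9))   # False
```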
4. Deletion in Binary Tree:
• Deleting a node from a binary tree means removing a specific node while keeping
the tree’s structure intact. First, find the node to delete by traversing the tree
with any traversal method. Then replace that node’s value with the value of the
last node in the tree (the deepest, rightmost node, found with a level-order
traversal), and delete that last node. This way, the rest of the tree structure is
not affected. Remember to check for special cases, such as trying to delete from
an empty tree, to avoid any issues.
Note: There is no single fixed rule for deletion, but the binary tree property must
always be preserved after the deletion.
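One way to sketch this deletion is shown below. It assumes that the "last node" is the deepest, rightmost node in level order; the `Node` class and the function names are hypothetical.

```python
from collections import deque


class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def delete(root, key):
    """Overwrite the node holding key with the last node's value, then drop the last node."""
    if root is None:
        return None                       # special case: empty tree
    if root.left is None and root.right is None:
        return None if root.data == key else root
    target = last = last_parent = None
    queue = deque([(root, None)])
    while queue:                          # level-order walk
        node, parent = queue.popleft()
        if target is None and node.data == key:
            target = node
        last, last_parent = node, parent  # ends as the deepest, rightmost node
        if node.left is not None:
            queue.append((node.left, node))
        if node.right is not None:
            queue.append((node.right, node))
    if target is None:
        return root                       # key not present: nothing to do
    target.data = last.data
    if last_parent.left is last:
        last_parent.left = None
    else:
        last_parent.right = None
    return root


def level_order(root):
    out, queue = [], deque([root] if root is not None else [])
    while queue:
        node = queue.popleft()
        out.append(node.data)
        queue.extend(c for c in (node.left, node.right) if c is not None)
    return out


root = Node(1, Node(2, Node(4), Node(5)), Node(3))
root = delete(root, 2)
print(level_order(root))   # [1, 5, 3, 4]
```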

➢ Binary Search Tree:


A Binary Search Tree (BST) is a special type of binary tree. It follows all the properties
of a binary tree and, in addition, for every node the left subtree contains only values less
than the node’s value and the right subtree contains only values greater than it. This
ordering allows for efficient Searching, Insertion, and Deletion operations on the data
stored in the tree.

➢ Properties of Binary Search Tree:


• The left subtree of a node contains only nodes with keys lesser than the node’s key.
• The right subtree of a node contains only nodes with keys greater than the node’s
key.
• The left and right subtree each must also be a binary search tree.
• There must be no duplicate nodes (some BST variants do allow duplicate values, each
with its own handling approach).
➢ Basic Operations on Binary Search Tree:
1. Searching a node in BST:
The steps of searching a node in Binary Search tree are listed as follows –
1. First, compare the element to be searched with the root element of the tree.
• If root is matched with the target element, then return the node’s location.
• If it is not matched, then check whether the item is less than the root element,
if it is smaller than the root element, then move to the left subtree.
• If it is larger than the root element, then move to the right subtree.
2. Repeat the above procedure recursively until the match is found.
3. If the element is not found or not present in the tree, then return NULL.
Example: Below is given a BST and we have to search for element 6.

❖ Time Complexity : O(h) where h is height of BST.
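The search steps above can be sketched iteratively. The `Node` class, the function name `bst_search`, and the sample tree are assumptions for illustration (the tree is not the one from the missing figure), and `None` plays the role of NULL.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def bst_search(node, key):
    """Return the node holding key, or None if the key is not in the BST."""
    while node is not None and node.key != key:
        # Smaller keys live in the left subtree, larger keys in the right.
        node = node.left if key < node.key else node.right
    return node


# Sample BST with root 8.
root = Node(8,
            Node(3, Node(1), Node(6, Node(4), Node(7))),
            Node(10, None, Node(14)))
print(bst_search(root, 6).key)   # 6
print(bst_search(root, 5))       # None
```

Each loop iteration moves one level down, which is where the O(h) bound comes from.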


2. Insert a node into a BST:
A new key is always inserted at a leaf position. Starting from the root, search for the key,
moving left or right at each node, until an empty child position is reached. The new node
is then added there as a child of the last node visited.
3. Delete a Node of BST:
It is used to delete a node with specific key from the BST and return the new BST.
➢ Different scenarios for deleting the node:
Node to be deleted is the leaf node :
This is the simple case: just set the parent’s link to this node to NULL.

Node to be deleted has one child :


You can just replace the node with the child node.
Node to be deleted has two children :
• Here we have to delete the node in such a way that the resulting tree still follows
the properties of a BST. The trick is to find the inorder successor of the node, which
is the smallest node in its right subtree. Copy the contents of the inorder successor
to the node, and then delete the inorder successor.

➢ Take Care of following things while deleting a node of a BST:


1. Decide which node will replace the node to be deleted.
2. Aim for minimal disruption to the existing tree structure.
3. The replacement node can be taken from the deleted node’s left or right subtree.
4. If taking it from the left subtree, take the largest value in the left subtree.
5. If taking it from the right subtree, take the smallest value in the right
subtree.
4. Traversal (Inorder traversal of BST) :
In case of binary search trees (BST), Inorder traversal gives nodes in non-decreasing
order. We visit the left child first, then the root, and then the right child.
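To illustrate the non-decreasing order, the sketch below builds a BST from an unsorted list and reads it back with an inorder traversal. Sending duplicates to the right subtree is an assumption made for this example, as are all the names.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(node, key):
    """BST insert that places duplicates in the right subtree."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return node


def inorder(node):
    """Left child, then root, then right child."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


def tree_sort(values):
    root = None
    for v in values:
        root = insert(root, v)
    return inorder(root)


print(tree_sort([5, 3, 8, 3, 1]))   # [1, 3, 3, 5, 8]
```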
➢ Applications of BST:
• Self-balancing binary search tree: Self-balancing data structures such as AVL tree and
Red-black tree are the most useful variations of BSTs. In these variations, we maintain
the height as O(Log n) so that all operations are bounded by O(Log n). TreeSet and
TreeMap in Java (or set and map in C++) are library implementations of self balancing
BSTs.
• Sorted Stream of Data : If we wish to maintain a sorted stream of data where we
wish to have operations like insert, search, delete and traversal in sorted order, BST is
the most suitable data structure for this case.
• Doubly Ended Priority Queues: With Self Balancing BSTs, we can extract both
maximum and minimum in O(Log n) time, so when we need a data structure with
both operations supported efficiently, we use self balancing BSTs.
➢ Advantages:
• Fast search: Searching for a specific value in a BST has an average time complexity of
O(log n), where n is the number of nodes in the tree. This is much faster than
searching for an element in an array or linked list, which have a time complexity of
O(n) in the worst case.
• In-order traversal: BSTs can be traversed in-order, which visits the left subtree, the
root, and the right subtree. This can be used to sort a dataset.
➢ Disadvantages:
• Skewed trees: If a tree becomes skewed, the time complexity of search, insertion,
and deletion operations will be O(n) instead of O(log n), which can make the tree
inefficient.
• Additional time required: Self-balancing trees require additional time to maintain
balance during insertion and deletion operations.
• Efficiency: If only search, insert, and/or delete operations are needed, hashing is
usually preferred over BSTs. However, if we also need to keep the data in sorted
order along with these operations, we use a BST.
Complexity Analysis of Binary Tree Operations:
Here’s the complexity analysis for specific binary tree operations:
Operation                  Time Complexity    Auxiliary Space
In-Order Traversal         O(n)               O(n)
Pre-Order Traversal        O(n)               O(n)
Post-Order Traversal       O(n)               O(n)
Insertion (Unbalanced)     O(n)               O(n)
Searching (Unbalanced)     O(n)               O(n)
Deletion (Unbalanced)      O(n)               O(n)

➢ AVL Tree Data Structure:


• An AVL tree is defined as a self-balancing Binary Search Tree (BST) where the
difference between heights of left and right subtrees for any node cannot be more
than one.
Example of AVL Trees:

❖ The above tree is AVL because the differences between the heights of left and right
subtrees for every node are less than or equal to 1.
➢ Operations on an AVL Tree:
• Insertion
• Deletion
• Searching [It is similar to performing a search in BST]
➢ Rotating the subtrees in an AVL Tree:
An AVL tree may rotate in one of the following four ways to keep itself balanced:
➢ Left Rotation:

When a node is added into the right subtree of the right subtree, if the tree gets out of
balance, we do a single left rotation.
➢ Right Rotation:
If a node is added to the left subtree of the left subtree, the AVL tree may get out of
balance, we do a single right rotation.

➢ Left-Right Rotation:

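A left-right rotation is a combination of two rotations, used when a node is added to the
right subtree of the left subtree: first a left rotation is performed on the left child, and
then a right rotation is performed on the unbalanced node.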

➢ Right-Left Rotation:

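A right-left rotation is a combination of two rotations, used when a node is added to the
left subtree of the right subtree: first a right rotation is performed on the right child, and
then a left rotation is performed on the unbalanced node.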
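The four rotation cases can be sketched together in a small AVL insertion routine. This is an illustrative sketch, not a reference implementation: the class `AVLNode`, the helper names, and the convention that an empty subtree has height 0 are all assumptions.

```python
class AVLNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1                 # height counted in nodes; empty subtree = 0


def height(node):
    return node.height if node is not None else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def balance(node):
    return height(node.left) - height(node.right)


def rotate_left(x):
    y = x.right                         # y moves up, x becomes its left child
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


def rotate_right(y):
    x = y.left                          # x moves up, y becomes its right child
    y.left = x.right
    x.right = y
    update(y)
    update(x)
    return x


def rebalance(node):
    update(node)
    if balance(node) > 1:               # left-heavy
        if balance(node.left) < 0:      # left-right case: rotate the child first
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance(node) < -1:              # right-heavy
        if balance(node.right) > 0:     # right-left case: rotate the child first
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def insert(node, key):
    if node is None:
        return AVLNode(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return rebalance(node)


root = None
for k in [10, 20, 30]:                  # right-right case: single left rotation
    root = insert(root, k)
print(root.key)                         # 20
```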
➢ Advantages of AVL Tree:
1. AVL trees are self-balancing and therefore provide O(Log n) time complexity for
search, insert and delete.
2. It is a BST only (with balancing), so items can be traversed in sorted order.
3. Since the balancing rules are strict compared to Red Black Tree, AVL trees in general
have relatively less height and hence the search is faster.
4. AVL tree is relatively less complex to understand and implement compared to Red
Black Trees.
➢ Disadvantages of AVL Tree:
1. It is more difficult to implement than a normal BST, although easier than a Red-Black tree.
2. Less used compared to Red-Black trees.
3. Due to its rather strict balance, AVL trees provide complicated insertion and removal
operations as more rotations are performed.
➢ Applications of AVL Tree:
1. The AVL Tree is often used as the first example of a self-balancing BST in teaching
DSA, as it is easier to understand and implement than a Red-Black tree.
2. Applications where insertions and deletions are less common but data lookups are
frequent, along with other BST operations such as sorted traversal, floor, ceil, min
and max.
3. Note that the Red-Black tree, rather than the AVL tree, is more commonly implemented
in language libraries like map in C++, set in C++, TreeMap in Java and TreeSet in Java.
4. AVL Trees can be used in a real time environment where predictable and consistent
performance is required.
➢ Red-Black Tree:
• A Red-Black Tree is a self-balancing binary search tree where each node has an
additional attribute: a color, which can be either red or black. The primary objective
of these trees is to maintain balance during insertions and deletions, ensuring
efficient data retrieval and manipulation.
• Binary search trees are a fundamental data structure, but their performance can
suffer if the tree becomes unbalanced. Red Black Trees are a type of balanced binary
search tree that use a set of rules to maintain balance, ensuring logarithmic time
complexity for operations like insertion, deletion, and searching, regardless of the
initial shape of the tree. Red Black Trees are self-balancing, using a simple color-
coding scheme to adjust the tree after each modification.
➢ Properties of Red-Black Trees:

A Red-Black Tree has the following properties:


1. Node Color: Each node is either red or black.
2. Root Property: The root of the tree is always black.
3. Red Property: Red nodes cannot have red children (no two consecutive red nodes on
any path).
4. Black Property: Every path from a node to its descendant null nodes (leaves) has the
same number of black nodes.
5. Leaf Property: All leaves (NIL nodes) are black.
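The properties above can be checked mechanically. The sketch below is a hypothetical validator (the names `RBNode`, `black_height`, and `is_red_black` are assumptions); it treats missing children as black NIL leaves and does not check the BST ordering of the keys.

```python
RED, BLACK = "red", "black"


class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left
        self.right = right


def black_height(node):
    """Return the black height of the subtree, or raise ValueError on a violation."""
    if node is None:
        return 1                                  # NIL leaves count as black
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("red node has a red child")
    left = black_height(node.left)
    right = black_height(node.right)
    if left != right:
        raise ValueError("paths have different numbers of black nodes")
    return left + (1 if node.color == BLACK else 0)


def is_red_black(root):
    if root is None:
        return True
    if root.color != BLACK:                       # Root Property
        return False
    try:
        black_height(root)                        # Red and Black Properties
    except ValueError:
        return False
    return True


valid = RBNode(10, BLACK, RBNode(5, RED), RBNode(15, RED))
print(is_red_black(valid))                        # True
print(is_red_black(RBNode(10, RED)))              # False: red root
```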
➢ Time complexity:

Sr. No. Algorithm Time Complexity

1. Search O(log n)

2. Insert O(log n)

3. Delete O(log n)

➢ Interesting points about Red-Black Tree:


• The black height of the red-black tree is the number of black nodes on a path from
the root node to a leaf node. Leaf nodes are also counted as black nodes. So, a red-
black tree of height h has black height >= h/2.
• The height of a red-black tree with n nodes is h <= 2 log2(n + 1).
• All leaves (NIL) are black.
• The black depth of a node is defined as the number of black nodes from the root to
that node, i.e. the number of black ancestors.
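As a rough worked example of the height bound above (the node count of one million is an arbitrary illustration, not a figure from these notes):

```latex
n = 10^{6}: \qquad h \le 2\log_2(n + 1) = 2\log_2(1\,000\,001) \approx 2 \times 19.93 \approx 39.9
```

So a search in such a tree follows a path of roughly 40 nodes at most, whereas a fully skewed BST of the same size could require visiting up to one million nodes.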
➢ Basic Operations on Red-Black Tree:
The basic operations on a Red-Black Tree include:
1. Insertion
2. Search
3. Deletion
4. Rotation
1. Insertion:
Inserting a new node in a Red-Black Tree involves a two-step process: performing a
standard binary search tree (BST) insertion, followed by fixing any violations of Red-Black
properties.
➢ Insertion Steps:
1. BST Insert: Insert the new node like in a standard BST.
2. Fix Violations:
• If the parent of the new node is black, no properties are violated.
• If the parent is red, the tree might violate the Red Property, requiring fixes.
Fixing Violations During Insertion
After inserting the new node as a red node, we might encounter several cases depending on
the colors of the node’s parent and uncle (the sibling of the parent):
• Case 1: Uncle is Red: Recolor the parent and uncle to black, and the grandparent
to red. Then move up the tree to check for further violations.
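• Case 2: Uncle is Black (described for a parent that is a left child; mirror the
directions when the parent is a right child):
o Sub-case 2.1: Node is a right child: Perform a left rotation on the parent, which
turns the situation into Sub-case 2.2.
o Sub-case 2.2: Node is a left child: Perform a right rotation on the grandparent
and swap the colors of the parent and the grandparent.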
2. Searching
Searching for a node in a Red-Black Tree is similar to searching in a standard Binary Search
Tree (BST). The search operation follows a straightforward path from the root to a leaf,
comparing the target value with the current node’s value and moving left or right
accordingly.
➢ Search Steps:
1. Start at the Root: Begin the search at the root node.
2. Traverse the Tree:
• If the target value is equal to the current node’s value, the node is found.
• If the target value is less than the current node’s value, move to the left child.
• If the target value is greater than the current node’s value, move to the right
child.
3. Repeat: Continue this process until the target value is found or a NIL node is reached
(indicating the value is not present in the tree).
3. Deletion:
Deleting a node from a Red-Black Tree also involves a two-step process: performing the BST
deletion, followed by fixing any violations that arise.
➢ Deletion Steps
1. BST Deletion: Remove the node using standard BST rules.
2. Fix Double Black:
• If a black node is deleted, a “double black” condition might arise, which
requires specific fixes.
Fixing Violations During Deletion
When a black node is deleted, we handle the double black issue based on the sibling’s color
and the colors of its children:
• Case 1: Sibling is Red: Rotate the parent and recolor the sibling and parent.
• Case 2: Sibling is Black:
o Sub-case 2.1: Sibling’s children are black: Recolor the sibling and propagate the
double black upwards.
o Sub-case 2.2: At least one of the sibling’s children is red:
o If the sibling’s far child is red: Perform a rotation on the parent and
sibling, and recolor appropriately.
o If the sibling’s near child is red: Rotate the sibling and its child, then
handle as above.
➢ When to Perform Rotations?
Rotations in Red-Black Trees are typically performed during insertions and deletions to
maintain the properties of the tree. Below are the scenarios for rotations:
1. Fixing Violations after Insertion:
When a new node is inserted, it is always colored red. This can create violations of Red-
Black Tree properties, specifically:
• The root must be black.
• Red nodes cannot have red children.
Case Analysis for Fixing Insertions:
• Case 1: Recoloring and Propagating Upwards:
o If the parent and uncle of the new node are both red, recolor the parent and
uncle to black, and the grandparent to red. Then, recursively apply the fix-up to
the grandparent.
• Case 2: Rotation and Recoloring:
o If the new node’s uncle is black and the new node is the right child of a left
child (or vice versa), perform a rotation to move the new node up and align it.
o If the new node’s uncle is black and the new node is the left child of a left child
(or right of a right), perform a rotation and recolor the parent and grandparent
to fix the violation.
2. Fixing Violations after Deletion:
After deletion, the tree might need fixing to restore properties:
• When a black node is removed, or a red node is replaced by a black node, a double-
black situation can arise.
Case Analysis for Fixing Deletions:
• Case 1: Sibling is Red
o Recolor the sibling and the parent, and perform a rotation.
• Case 2: Sibling is Black with Black Children
o Recolor the sibling to red and move the problem up to the parent.
• Case 3: Sibling is Black with at least one Red Child
o Rotate and recolor to fix the double-black issue.

➢ Advantages of Red-Black Trees:


• Balanced: Red-Black Trees are self-balancing, meaning they automatically maintain a
balance between the heights of the left and right subtrees. This ensures that search,
insertion, and deletion operations take O(log n) time in the worst case.
• Efficient search, insertion, and deletion: Due to their balanced structure, Red-Black
Trees offer efficient operations. Search, insertion, and deletion all take O(log n) time
in the worst case.
• Simple to implement: The rules for maintaining the Red-Black Tree properties are
relatively simple and straightforward to implement.
• Widely used: Red-Black Trees are a popular choice for implementing various data
structures, such as maps, sets, and priority queues.

➢ Disadvantages of Red-Black Trees:


• More complex than other balanced trees: Compared to simpler balanced trees like
AVL trees, Red-Black Trees have more complex insertion and deletion rules.
• Constant overhead: Maintaining the Red-Black Tree properties adds a small overhead
to every insertion and deletion operation.
• Not optimal for all use cases: While efficient for most operations, Red-Black Trees
might not be the best choice for applications where frequent insertions and deletions
are required, as the constant overhead can become significant.

➢ Applications of Red-Black Trees:


• Implementing maps and sets: Red-Black Trees are often used to implement maps
and sets, where efficient search, insertion, and deletion are crucial.
• Priority queues: Red-Black Trees can be used to implement priority queues, where
elements are ordered based on their priority.
• File systems: Red-Black Trees are used in some file systems to manage file and
directory structures.
• In-memory databases: Red-Black Trees are sometimes used in in-memory databases
to store and retrieve data efficiently.
• Graphics and game development: Red-Black Trees can be used in graphics and
game development for tasks like collision detection and pathfinding.
➢ B-Tree:
• Traditional binary search trees have limitations when the data set becomes very
large. The B-Tree is a data structure designed to handle massive amounts of data
efficiently.
• When it comes to storing and searching large amounts of data, especially on disk,
traditional binary search trees can become impractical: each node holds a single key,
so the tree is tall and a search has to touch many nodes.
• B-Trees (the name is often read as “Balanced Tree”) are a type of self-balancing
search tree that was specifically designed to overcome these limitations.
• B-Trees maintain balance by ensuring that each node has a minimum number of
keys, so the tree is always balanced. This balance guarantees that the time complexity
for operations such as insertion, deletion, and searching is always O(log n), regardless
of the initial shape of the tree.
➢ Time Complexity of B-Tree:

Sr. No. Algorithm Time Complexity

1. Search O(log n)

2. Insert O(log n)

3. Delete O(log n)

Note: “n” is the total number of elements in the B-tree

➢ Properties of B-Tree:
• All leaves are at the same level.
• B-Tree is defined by the term minimum degree ‘t‘. The value of ‘t‘ depends upon disk
block size.
• Every node except the root must contain at least t-1 keys. The root may contain a
minimum of 1 key.
• All nodes (including root) may contain at most (2*t – 1) keys.
• Number of children of a node is equal to the number of keys in it plus 1.
• All keys of a node are sorted in increasing order. The child between two
keys k1 and k2 contains all keys in the range between k1 and k2.
• A B-Tree grows and shrinks from the root, unlike a Binary Search Tree, which
grows and shrinks at the bottom (leaf) level.
• As with balanced Binary Search Trees, the time complexity to search, insert, and
delete is O(log n).
• Insertion of a Node in B-Tree happens only at Leaf Node.

Following is an example of a B-Tree of minimum order 5


Note that in practical B-Trees, the value of the minimum order is much more than 5.
• We can see in the above diagram that all the leaf nodes are at the same level and all
non-leafs have no empty sub-tree and have keys one less than the number of their
children.
➢ Traversal in B-Tree:
Traversal is also similar to Inorder traversal of Binary Tree. We start from the leftmost
child, recursively print the leftmost child, then repeat the same process for the
remaining children and keys. In the end, recursively print the rightmost child.
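A sketch of this traversal is shown below, assuming a hypothetical `BTreeNode` that stores a sorted list of keys and a list of children (empty for a leaf). The sample tree is made up for illustration and does not enforce the minimum-degree rules.

```python
class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                    # sorted keys held in this node
        self.children = children or []      # len(keys) + 1 children, or none for a leaf


def btree_inorder(node):
    """Visit child 0, key 0, child 1, key 1, ..., and finally the rightmost child."""
    result = []
    for i, key in enumerate(node.keys):
        if node.children:
            result.extend(btree_inorder(node.children[i]))
        result.append(key)
    if node.children:
        result.extend(btree_inorder(node.children[-1]))
    return result


root = BTreeNode([50, 100], [
    BTreeNode([10, 20]),
    BTreeNode([60, 70]),
    BTreeNode([120, 150]),
])
print(btree_inorder(root))   # [10, 20, 50, 60, 70, 100, 120, 150]
```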
➢ Search Operation in B-Tree:
Search is similar to the search in Binary Search Tree. Let the key to be searched be k.
• Start from the root and recursively traverse down.
• For every visited non-leaf node,
o If the node has the key, we simply return the node.
o Otherwise, we recur down to the appropriate child (The child which is just
before the first greater key) of the node.
• If we reach a leaf node and don’t find k in the leaf node, then return NULL.
Searching a B-Tree is similar to searching a binary search tree, and the algorithm is
naturally recursive. At each level, the keys of the current node narrow the search: if
the key is not in the node, it can only lie in the one child whose range contains it, so all
other branches are skipped. Because these keys limit the search, they are also known as
limiting values or separation values. If we reach a leaf node and don’t find the desired
key, the search returns NULL.
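The search procedure can be sketched as follows. The `BTreeNode` class, the function name, and the sample tree are assumptions for illustration (the tree is not the one from the notes' figure), and `None` stands in for NULL.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                    # sorted keys held in this node
        self.children = children or []      # empty list for a leaf


def btree_search(node, k):
    """Return (node, index) where k is stored, or None if k is absent."""
    i = bisect_left(node.keys, k)           # index of the first key >= k
    if i < len(node.keys) and node.keys[i] == k:
        return node, i                      # key found in this node
    if not node.children:
        return None                         # reached a leaf without finding k
    # children[i] is the child just before the first key greater than k.
    return btree_search(node.children[i], k)


root = BTreeNode([50, 100], [
    BTreeNode([10, 20]),
    BTreeNode([60, 70]),
    BTreeNode([120, 150]),
])
node, index = btree_search(root, 120)
print(node.keys, index)          # [120, 150] 0
print(btree_search(root, 65))    # None
```

Using `bisect_left` performs a binary search within the node; a simple linear scan over the keys would work equally well for small nodes.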
Examples:
Input: Search 120 in the given B-Tree.

Solution:
➢ Applications of B-Trees:
• It is used in large databases to access data stored on the disk
• Searching for data in a data set can be achieved in significantly less time using the B-
Tree
• With the indexing feature, multilevel indexing can be achieved.
• Most of the servers also use the B-tree approach.
• B-Trees are used in CAD systems to organize and search geometric data.
• B-Trees are also used in other areas such as natural language processing, computer
networks, and cryptography.

➢ Advantages of B-Trees:
• B-Trees have a guaranteed time complexity of O(log n) for basic operations like
insertion, deletion, and searching, which makes them suitable for large data sets and
real-time applications.
• B-Trees are self-balancing.
• High-concurrency and high-throughput.
• Efficient storage utilization.

➢ Disadvantages of B-Trees:
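• B-Trees are designed for disk-based storage and can have high disk usage.
• They are not the best choice for every use case.
• They can be slower than simpler data structures, such as hash tables or in-memory
balanced BSTs, when the data is small enough to fit in memory.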
➢ B+ Tree:
• The B+ Tree is a variation of the B-tree data structure. In a B+ tree, data pointers are
stored only at the leaf nodes of the tree. In a B+ tree, the structure of a leaf node
differs from the structure of internal nodes. The leaf nodes have an entry for every
value of the search field, along with a data pointer to the record (or to the block
that contains this record).
• The leaf nodes of the B+ tree are linked together to provide ordered access to the
records on the search field. Internal nodes of a B+ tree are used to guide the
search. Some search field values from the leaf nodes are repeated in the internal
nodes of the B+ tree.
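The benefit of linked leaf nodes can be sketched with a range query that simply walks the leaf chain. This is a simplified illustration: the `Leaf` class and `range_scan` function are hypothetical, and the scan starts from the first leaf, whereas a real B+ tree would first descend through the internal nodes to find the starting leaf.

```python
class Leaf:
    """Hypothetical B+ tree leaf: sorted keys, their records, and a link to the next leaf."""

    def __init__(self, keys, records):
        self.keys = keys
        self.records = records
        self.next = None


def range_scan(first_leaf, low, high):
    """Collect (key, record) pairs with low <= key <= high by following leaf links."""
    result = []
    leaf = first_leaf
    while leaf is not None:
        for key, record in zip(leaf.keys, leaf.records):
            if key > high:
                return result           # keys are sorted, so we can stop early
            if key >= low:
                result.append((key, record))
        leaf = leaf.next
    return result


# Three linked leaves holding keys 1..19.
a = Leaf([1, 4], ["r1", "r4"])
b = Leaf([7, 10], ["r7", "r10"])
c = Leaf([17, 19], ["r17", "r19"])
a.next, b.next = b, c

print(range_scan(a, 4, 17))
# [(4, 'r4'), (7, 'r7'), (10, 'r10'), (17, 'r17')]
```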

➢ Features of B+ Trees:

• Balanced: B+ Trees are self-balancing, which means that as data is added or removed
from the tree, it automatically adjusts itself to maintain a balanced structure. This
ensures that the search time grows only logarithmically with the size of the
tree.
• Multi-level: B+ Trees are multi-level data structures, with a root node at the top and
one or more levels of internal nodes below it. The leaf nodes at the bottom level
contain the actual data.
• Ordered: B+ Trees maintain the order of the keys in the tree, which makes it easy to
perform range queries and other operations that require sorted data.
• Fan-out: B+ Trees have a high fan-out, which means that each node can have many
child nodes. This reduces the height of the tree and increases the efficiency of
searching and indexing operations.
• Cache-friendly: B+ Trees are designed to be cache-friendly, which means that they
can take advantage of the caching mechanisms in modern computer architectures to
improve performance.
• Disk-oriented: B+ Trees are often used for disk-based storage systems because they
are efficient at storing and retrieving data from disk.
➢ Why Use B+ Tree?
• B+ Trees are well suited to storage systems with slow data access because
they minimize I/O operations while facilitating efficient disk access.
• B+ Trees are a good choice for database systems and applications needing quick data
retrieval because of their balanced structure, which guarantees predictable
performance for a variety of activities and facilitates effective range-based queries.
➢ Difference Between B+ Tree and B Tree:
Some differences between B+ Tree and B Tree are stated below.

Parameters | B+ Tree | B Tree
-----------|---------|-------
Structure | Separate leaf nodes for data storage and internal nodes for indexing | Nodes store both keys and data values
Leaf Nodes | Leaf nodes form a linked list for efficient range-based queries | Leaf nodes do not form a linked list
Order | Higher order (more keys per internal node) | Lower order (fewer keys per node)
Key Duplication | Some keys from the leaf nodes are repeated in the internal nodes | Usually does not allow key duplication
Disk Access | Better disk access due to sequential reads in a linked list structure | More disk I/O due to non-sequential reads in internal nodes
Applications | Database systems, file systems, where range queries are common | In-memory data structures, databases, general-purpose use
Performance | Better performance for range queries and bulk data retrieval | Balanced performance for search, insert, and delete operations
Memory Usage | Requires more memory for internal nodes | Requires less memory as keys and values are stored in the same node

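➢ Structure of B+ Trees:

B+ Trees contain two types of nodes:

• Internal Nodes: Internal nodes hold only keys and tree pointers and are used to guide
the search. Every internal node other than the root has at least ⌈n/2⌉ tree pointers,
where n is the order of the tree.
• Leaf Nodes: Leaf nodes hold at most n pointers: data pointers to the records plus one
pointer to the next leaf node.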

➢ The Structure of the Internal Nodes of a B+ Tree of Order ‘a’ is as Follows:


• Each internal node is of the form: <P1, K1, P2, K2, ..., Pc-1, Kc-1, Pc> where c <= a,
each Pi is a tree pointer (i.e., it points to another node of the tree) and each Ki is a
key value (see Diagram I for reference).
• Every internal node has: K1 < K2 < ... < Kc-1
• For each search field value ‘X’ in the sub-tree pointed at by Pi, the following conditions
hold: X <= Ki for i = 1; Ki-1 < X <= Ki for 1 < i < c; and Ki-1 < X for i = c (see Diagram I
for reference).
• Each internal node has at most ‘a’ tree pointers.
• The root node has at least two tree pointers, while the other internal nodes have at
least ⌈a/2⌉ tree pointers each.
• If an internal node has ‘c’ pointers, c <= a, then it has ‘c – 1’ key values.
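
The ordering condition on the keys determines which tree pointer a search follows. A minimal sketch, assuming the keys of one internal node are held in a sorted Python list and the pointers P1 ... Pc are numbered from 0; the function name child_index is hypothetical.

```python
from bisect import bisect_left


def child_index(keys, x):
    """Return the 0-based index of the tree pointer to follow for value x.

    Index 0 stands for P1, index 1 for P2, and so on. bisect_left counts
    the keys strictly smaller than x, which matches the rule
    K(i-1) < x <= K(i) for the sub-tree under pointer P(i).
    """
    return bisect_left(keys, x)


keys = [10, 20, 30]              # K1, K2, K3 of one internal node (c = 4)
print(child_index(keys, 10))     # 0  -> follow P1, since x <= K1
print(child_index(keys, 15))     # 1  -> follow P2, since K1 < x <= K2
print(child_index(keys, 31))     # 3  -> follow P4, since x > K3
```

Repeating this choice at every level, from the root down, leads to the leaf node that should contain the key.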

➢ The Structure of the Leaf Nodes of a B+ Tree of Order ‘b’ is as Follows:


• Each leaf node is of the form: <<K1, D1>, <K2, D2>, ..., <Kc-1, Dc-1>, Pnext> where c
<= b, each Di is a data pointer (i.e., it points to the actual record on disk whose key
value is Ki, or to a disk file block containing that record), each Ki is a key
value, and Pnext points to the next leaf node in the B+ tree (see Diagram II for
reference).
• Every leaf node has: K1 < K2 < ... < Kc-1, c <= b
• Each leaf node has at least ⌈b/2⌉ values.
• All leaf nodes are at the same level.
• All leaf nodes are at the same level.

Diagram-II

Using the Pnext pointers, it is possible to traverse all the leaf nodes just like a linked
list, thereby achieving ordered access to the records stored on disk.
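
The linked-list traversal of the leaves can be sketched as follows. This is an illustration under stated assumptions, not a full B+ tree: the Leaf class and range_scan function are hypothetical names, the records are plain strings standing in for data pointers, and the scan starts at the first leaf instead of descending from the root to find its starting point.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    keys: list                       # K1 < K2 < ... within the leaf
    records: list                    # stand-ins for data pointers D1, D2, ...
    next: Optional["Leaf"] = None    # plays the role of Pnext


def range_scan(first_leaf, low, high):
    """Collect the records whose keys lie in [low, high] by following Pnext."""
    result = []
    leaf = first_leaf
    while leaf is not None:
        for key, record in zip(leaf.keys, leaf.records):
            if key > high:
                return result        # keys are sorted, so nothing later matches
            if key >= low:
                result.append(record)
        leaf = leaf.next             # move to the next leaf in key order
    return result


# Three leaves chained together in key order.
leaf3 = Leaf([50, 60], ["r50", "r60"])
leaf2 = Leaf([30, 40], ["r30", "r40"], leaf3)
leaf1 = Leaf([10, 20], ["r10", "r20"], leaf2)

print(range_scan(leaf1, 20, 50))     # ['r20', 'r30', 'r40', 'r50']
```

Because the scan never revisits an internal node, a range query costs one descent to the starting leaf plus a sequential walk along the leaf chain.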

➢ Advantages of B+Trees:
• A B+ tree with ‘l’ levels can store more entries in its internal nodes than a B-tree
with the same ‘l’ levels, because its internal nodes hold no data pointers. This
significantly improves the search time for any given key. Having fewer levels, together
with the Pnext pointers, makes B+ trees very quick and efficient at accessing records
from disk.
• Data stored in a B+ tree can be accessed both sequentially and directly.
• Because all leaf nodes are at the same level, fetching any record takes the same
number of disk accesses.
• B+ trees have redundant search keys: some keys from the leaf nodes are repeated in the
internal nodes, whereas in a B-tree storing search keys repeatedly is not possible.

➢ Disadvantages of B+ Trees:
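• Because some search keys are repeated in the internal nodes, a B+ tree needs extra
storage compared with a B-tree.
• In return, it removes the major drawback of the B-tree, which is the difficulty of
traversing the keys sequentially. The B+ tree retains the rapid random access property
of the B-tree while also allowing rapid sequential access.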
➢ Preorder Traversal of Binary Tree:

Preorder traversal is defined as a type of tree traversal that follows the Root-Left-Right
policy where:

• The root node of the subtree is visited first.


• Then the left subtree is traversed.
• At last, the right subtree is traversed.

➢ Complexity Analysis:

Time Complexity: O(N), where N is the total number of nodes, because the traversal visits
every node exactly once.
Auxiliary Space:
• O(1) if no recursion stack space is considered.
• Otherwise, O(h) where h is the height of the tree
• In the worst case, h can be the same as N (when the tree is a skewed tree)
• In the best case, h can be the same as logN (when the tree is a complete tree)
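
The two extremes of h can be checked on small trees. The sketch below assumes a hypothetical Node class and measures the height in nodes, which equals the deepest level of nested calls a recursive traversal reaches.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def height(node):
    """Height counted in nodes: 0 for an empty tree, 1 for a single node."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


# Skewed tree: 7 nodes, each with only a left child.
skewed = None
for value in range(7, 0, -1):
    skewed = Node(value, left=skewed)

# Complete tree: the same 7 nodes arranged in 3 full levels.
complete = Node(1,
                Node(2, Node(4), Node(5)),
                Node(3, Node(6), Node(7)))

print(height(skewed))     # 7, the same as N
print(height(complete))   # 3, about log2(N)
```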

➢ How does Preorder Traversal of Binary Tree work?

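Consider the following tree: node 1 is the root, with left child 2 and right child 3.
Node 2 has left child 4 and right child 5. Node 3 has only a right child, node 6.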

If we perform a preorder traversal in this binary tree, then the traversal will be as follows:
Step 1: At first the root will be visited, i.e. node 1.

Node 1 is visited
Step 2: After this, traverse in the left subtree. Now the root of the left subtree is visited i.e.,
node 2 is visited.

Node 2 is visited

Step 3: Again the left subtree of node 2 is traversed and the root of that subtree i.e., node 4
is visited.

Node 4 is visited

Step 4: Node 4 has no subtrees, so the left subtree of node 2 is now completely visited.
The right subtree of node 2 is traversed next, and the root of that subtree, i.e., node 5,
is visited.

Node 5 is visited
Step 5: The left subtree of node 1 is now completely visited. So the right subtree of
node 1 will be traversed and the root of that subtree, i.e., node 3, is visited.

Node 3 is visited

Step 6: Node 3 has no left subtree. So the right subtree will be traversed and the root of the
subtree i.e., node 6 will be visited. After that there is no node that is not yet traversed. So
the traversal ends.

The complete tree is visited

So the order of traversal of nodes is 1 -> 2 -> 4 -> 5 -> 3 -> 6.
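
The walk-through above can be reproduced with a short recursive sketch. The Node class and the preorder function are hypothetical names used for illustration; the tree is built as the steps describe it (node 3 has only a right child).

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def preorder(node, out=None):
    """Root-Left-Right: record the node, then its left and right subtrees."""
    if out is None:
        out = []
    if node is not None:
        out.append(node.value)        # visit the root of this subtree first
        preorder(node.left, out)      # then traverse the left subtree
        preorder(node.right, out)     # finally traverse the right subtree
    return out


root = Node(1,
            Node(2, Node(4), Node(5)),
            Node(3, None, Node(6)))

print(preorder(root))   # [1, 2, 4, 5, 3, 6]
```

Appending to one shared list keeps the total work at O(N); the recursion depth is what accounts for the O(h) auxiliary space.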


➢ Inorder Traversal of Binary Tree:

Inorder traversal is defined as a type of tree traversal technique which follows the Left-
Root-Right pattern, such that:

• The left subtree is traversed first


• Then the root node of that subtree is visited
• Finally, the right subtree is traversed

➢ Complexity Analysis:

Time Complexity: O(N), where N is the total number of nodes, because the traversal visits
every node exactly once.
Auxiliary Space: O(h), where h is the height of the tree. This space is required for the
recursive calls.

• In the worst case, h can be the same as N (when the tree is a skewed tree)
• In the best case, h can be the same as log N (when the tree is a complete tree)

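➢ How does Inorder Traversal of Binary Tree work?

Let us understand the algorithm with the following example tree: node 1 is the root, with
left child 2 and right child 3. Node 2 has left child 4 and right child 5. Node 3 has only
a right child, node 6.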

If we perform an inorder traversal in this binary tree, then the traversal will be as follows:

Step 1: The traversal will go from 1 to its left subtree i.e., 2, then from 2 to its left subtree
root, i.e., 4. Now 4 has no left subtree, so it will be visited. It also does not have any right
subtree, so there is no further traversal from 4.

Node 4 is visited
Step 2: As the left subtree of 2 is visited completely, node 2 itself is visited before
moving to its right subtree.

Node 2 is visited

Step 3: Now the right subtree of 2 will be traversed i.e., move to node 5. For node 5 there is
no left subtree, so it gets visited and after that, the traversal comes back because there is no
right subtree of node 5.

Node 5 is visited

Step 4: As the left subtree of node 1 is visited completely, the root itself, i.e., node 1, will be visited.

Node 1 is visited
Step 5: The left subtree of node 1 and the node itself are visited. So now the right subtree
of 1 will be traversed, i.e., move to node 3. As node 3 has no left subtree, it gets visited.

Node 3 is visited

Step 6: Node 3 has no left subtree and has itself been visited. So traverse to its right
subtree and visit node 6. Now the traversal ends as all the nodes are traversed.

The complete tree is traversed

So the order of traversal of nodes is 4 -> 2 -> 5 -> 1 -> 3 -> 6.
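
The same steps can be written as a short recursive sketch. The Node class and the inorder function are hypothetical names used for illustration; the tree matches the one walked through above.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder(node, out=None):
    """Left-Root-Right: left subtree, then the node, then the right subtree."""
    if out is None:
        out = []
    if node is not None:
        inorder(node.left, out)       # traverse the left subtree first
        out.append(node.value)        # then visit the root of this subtree
        inorder(node.right, out)      # finally traverse the right subtree
    return out


root = Node(1,
            Node(2, Node(4), Node(5)),
            Node(3, None, Node(6)))

print(inorder(root))   # [4, 2, 5, 1, 3, 6]
```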

➢ Postorder Traversal of Binary Tree:

Postorder traversal is defined as a type of tree traversal which follows the Left-Right-Root
policy such that for each node:

• The left subtree is traversed first


• Then the right subtree is traversed
• Finally, the root node of the subtree is visited

➢ Complexity Analysis:

Time Complexity: O(N), where N is the total number of nodes, because the traversal visits
every node exactly once.
Auxiliary Space: O(1) if no recursion stack space is considered. Otherwise, O(h), where h is
the height of the tree.

• In the worst case, h can be the same as N (when the tree is a skewed tree)

• In the best case, h can be the same as logN (when the tree is a complete tree)

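➢ How does Postorder Traversal of Binary Tree work?

Consider the following tree: node 1 is the root, with left child 2 and right child 3.
Node 2 has left child 4 and right child 5. Node 3 has only a right child, node 6.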

If we perform a postorder traversal in this binary tree, then the traversal will be as follows:

Step 1: The traversal will go from 1 to its left subtree i.e., 2, then from 2 to its left subtree
root, i.e., 4. Now 4 has no subtree, so it will be visited.

Node 4 is visited
Step 2: As the left subtree of 2 is visited completely, now it will traverse the right subtree of
2 i.e., it will move to 5. As there is no subtree of 5, it will be visited.

Node 5 is visited

Step 3: Now both the left and right subtrees of node 2 are visited. So now visit node 2 itself.

Node 2 is visited

Step 4: As the left subtree of node 1 is traversed, it will now move to the right subtree root,
i.e., 3. Node 3 does not have any left subtree, so it will traverse the right subtree i.e., 6. Node
6 has no subtree and so it is visited.

Node 6 is visited
Step 5: All the subtrees of node 3 are traversed. So now node 3 is visited.

Node 3 is visited

Step 6: As all the subtrees of node 1 are traversed, now it is time for node 1 to be visited and
the traversal ends after that as the whole tree is traversed.

The complete tree is visited

So the order of traversal of nodes is 4 -> 5 -> 2 -> 6 -> 3 -> 1.
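
A short recursive sketch reproduces this order. The Node class and the postorder function are hypothetical names used for illustration; the tree matches the one walked through above.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def postorder(node, out=None):
    """Left-Right-Root: both subtrees first, the node itself last."""
    if out is None:
        out = []
    if node is not None:
        postorder(node.left, out)     # traverse the left subtree first
        postorder(node.right, out)    # then traverse the right subtree
        out.append(node.value)        # finally visit the root of this subtree
    return out


root = Node(1,
            Node(2, Node(4), Node(5)),
            Node(3, None, Node(6)))

print(postorder(root))   # [4, 5, 2, 6, 3, 1]
```

Since a node is recorded only after both of its subtrees, postorder is a natural fit for tasks such as deleting a tree, where children must be handled before their parent.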
