Ads Unit 1

The document provides an introduction to algorithm analysis, covering concepts such as algorithms, their specifications, performance analysis, space complexity, and time complexity. It explains how to evaluate algorithms based on their efficiency and resource requirements, using asymptotic notations to express complexity. Additionally, it includes examples of algorithms and their complexities, emphasizing the importance of selecting the best algorithm for a given problem.

Uploaded by

lakshmi prasanna

UNIT -1

Introduction to Algorithm Analysis, Space and Time Complexity analysis, Asymptotic Notations.
AVL Trees – Creation, Insertion, Deletion operations and Applications.
B-Trees – Creation, Insertion, Deletion operations and Applications.
ALGORITHM :
What is an algorithm?
An algorithm is a step by step procedure to solve a problem. In normal language, the algorithm is defined as
a sequence of statements which are used to perform a task. In computer science, an algorithm can be
defined as follows...

An algorithm is a sequence of unambiguous instructions used for solving a problem, which can be
implemented (as a program) on a computer.

Algorithms are used to convert our problem solution into step by step statements. These statements can
be converted into computer programming instructions which form a program. This program is executed by
a computer to produce a solution. Here, the program takes required data as input, processes data according
to the program instructions and finally produces a result as shown in the following picture.

Specifications of Algorithms
Every algorithm must satisfy the following specifications...

1. Input - Every algorithm must take zero or more input values from an external source.
2. Output - Every algorithm must produce at least one output as its result.
3. Definiteness - Every statement/instruction in an algorithm must be clear and unambiguous (only one interpretation).
4. Finiteness - For every case, the algorithm must produce its result within a finite number of steps.
5. Effectiveness - Every instruction must be basic enough to be carried out, and it must also be feasible.

Example for an Algorithm

Let us consider the following problem for finding the largest value in a given list of values.
Problem Statement: Find the largest number in the given list of numbers.
Input: A list of positive integer numbers. (List must contain at least one number).
Output: The largest number in the given list of positive integer numbers.

Consider the given list of numbers as 'L' (input), and the largest number as 'max' (Output).

ADSA UNIT-1 R23 DEPT OF CSE(AI&ML) KITS AKSHAR INSTITUTE OF TECHNOLOGY


Algorithm

Step 1: Define a variable 'max' and initialize it with '0'.
Step 2: Compare the first number (say 'x') in the list 'L' with 'max'. If 'x' is larger than 'max', set 'max' to 'x'.
Step 3: Repeat step 2 for all numbers in the list 'L'.
Step 4: Display the value of 'max' as the result.

Code using C Programming Language


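int findMax(const int L[], int listSize)
{
    /* Starting from 0 is safe because the list holds only positive integers. */
    int max = 0, i;
    for (i = 0; i < listSize; i++)
    {
        if (L[i] > max)
            max = L[i];
    }
    return max;
}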

Recursive Algorithm
In computer science, algorithms are implemented as programming language functions. We can view a function as something that is invoked (called) by another function. It executes its code and then returns control to the calling function. When a function calls itself, or calls another function which in turn calls the first function again, this is known as recursion. A recursive function can be defined as follows...

A function that calls itself is known as a Direct Recursive function (or Recursive function).

A recursive algorithm can also be defined as follows...

A function that calls another function, which in turn calls the original function, is known as an Indirect Recursive function.

Many computer science students think that recursion is a technique useful for only a few special problems like computing factorials, Ackermann's function, etc. This is unfortunate, because a function implemented using assignment, if-else, while or other looping statements can also be implemented using recursive functions. A recursive function is often easier to understand than its iterative counterpart.
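
The two kinds of recursion can be illustrated with a minimal C sketch. The function names factorial, isEven and isOdd are chosen here only for illustration and do not come from the notes.

```c
/* Direct recursion: factorial() calls itself until it reaches the base case. */
unsigned long factorial(unsigned int n)
{
    if (n <= 1)
        return 1UL;                  /* base case stops the recursion */
    return n * factorial(n - 1);     /* recursive case */
}

/* Indirect recursion: isEven() calls isOdd(), which calls isEven() again. */
int isOdd(unsigned int n);           /* forward declaration */

int isEven(unsigned int n)
{
    if (n == 0)
        return 1;
    return isOdd(n - 1);
}

int isOdd(unsigned int n)
{
    if (n == 0)
        return 0;
    return isEven(n - 1);
}
```

Calling factorial(5) gives 120, and isEven(10) returns 1 after bouncing between the two functions ten times.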

Performance Analysis

What is Performance Analysis of an algorithm?


• If we want to go from city "A" to city "B", there can be many ways of doing this. We can go by flight, by bus, by train or by bicycle. Depending on availability and convenience, we choose the one which suits us.
• Similarly, in computer science, there are multiple algorithms to solve a problem. When we have more than one algorithm to solve a problem, we need to select the best one.
• Performance analysis helps us to select the best algorithm from multiple algorithms to solve a problem.
• When there are multiple alternative algorithms to solve a problem, we analyze them and pick the one which is best suited to our requirements. The formal definition is as follows...

Performance of an algorithm is a process of making evaluative judgement about algorithms.

It can also be defined as follows...

Performance of an algorithm means predicting the resources which are required by an algorithm to perform its task.

Generally, the performance of an algorithm depends on the following elements...

1. Whether the algorithm provides the exact solution for the problem.
2. Whether it is easy to understand.
3. Whether it is easy to implement.
4. How much space (memory) it requires to solve the problem.
5. How much time it takes to solve the problem, etc.

When we want to analyze an algorithm, we consider only the space and time required by that
particular algorithm and we ignore all the remaining elements.
Based on this information, performance analysis of an algorithm can also be defined as
follows...

Performance analysis of an algorithm is the process of calculating space and time required by that algorithm.

Performance analysis of an algorithm is performed by using the following measures...

1. Space required to complete the task of that algorithm (Space Complexity). It includes
program space and data space
2. Time required to complete the task of that algorithm (Time Complexity)

Space Complexity
What is Space complexity?
When we design an algorithm to solve a problem, it needs some computer memory to
complete its execution. For any algorithm, memory is required for the following purposes...

1. To store program instructions.
2. To store constant values.
3. To store variable values.
4. And for a few other things like function calls, jumping statements, etc.

Space complexity of an algorithm can be defined as follows...

Total amount of computer memory required by an algorithm to complete its execution is called the space complexity of that algorithm.

Generally, when a program is under execution it uses the computer memory for THREE reasons. They are as follows...

1. Instruction Space: It is the amount of memory used to store the compiled version of the instructions.
2. Environmental Stack: It is the amount of memory used to store information about partially executed functions at the time of a function call.
3. Data Space: It is the amount of memory used to store all the variables and constants.

Note - When we analyze an algorithm based on its space complexity, we consider only the Data Space and ignore the Instruction Space as well as the Environmental Stack. That means we calculate only the memory required to store variables, constants, structures, etc.

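To calculate the space complexity, we must know the memory required to store values of different data types, which depends on the compiler. For example, a 16-bit C compiler (the model used in these notes) requires the following...

1. 2 bytes to store an Integer value.
2. 4 bytes to store a Floating Point value.
3. 1 byte to store a Character value.
4. 6 (OR) 8 bytes to store a double value.

These sizes are not fixed by the C language. On most modern compilers an int takes 4 bytes and a double takes 8 bytes, and the sizeof operator reports the actual size.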

Consider the following piece of code...


Example 1
int square(int a)
{
return a*a;
}

In the above piece of code, it requires 2 bytes of memory to store the variable 'a' and another 2 bytes of memory for the return value.

That means it requires 4 bytes of memory in total to complete its execution, and these 4 bytes are fixed for any input value of 'a'. This space complexity is said to be Constant Space Complexity.

If any algorithm requires a fixed amount of space for all input values then that space complexity is said to be Constant Space Complexity.

Consider the following piece of code...

Example 2
int sum(int A[ ], int n)
{
int sum = 0, i;
for(i = 0; i < n; i++)
sum = sum + A[i];
return sum;
}

In the above piece of code it requires

'n*2' bytes of memory to store the array variable 'A[ ]'
2 bytes of memory for the integer parameter 'n'
4 bytes of memory for the local integer variables 'sum' and 'i' (2 bytes each)
2 bytes of memory for the return value.

That means it requires '2n+8' bytes of memory in total to complete its execution. Here, the total amount of memory required depends on the value of 'n'. As the value of 'n' increases, the space required also increases proportionately. This type of space complexity is said to be Linear Space Complexity.

If the amount of space required by an algorithm increases in proportion to the size of the input, then that space complexity is said to be Linear Space Complexity.
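
The byte count for Example 2 can be reproduced with a small helper. This is only a sketch of the accounting used in these notes: the function name sumDataSpace and its intSize parameter are assumptions introduced here, and the array is counted as part of the function's data space even though a real C compiler passes it as a pointer.

```c
#include <stddef.h>

/* Bytes of data space used by sum(A, n) from Example 2, for a given int size. */
size_t sumDataSpace(size_t n, size_t intSize)
{
    size_t arrayBytes  = n * intSize;   /* the n elements of A[] */
    size_t paramBytes  = intSize;       /* parameter n */
    size_t localBytes  = 2 * intSize;   /* locals sum and i */
    size_t returnBytes = intSize;       /* return value */
    return arrayBytes + paramBytes + localBytes + returnBytes;
}
```

With 2-byte integers, sumDataSpace(10, 2) gives 28, which matches 2n+8 for n = 10. Passing sizeof(int) as the second argument shows the figure for the compiler actually in use.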

What is Time complexity?


Every algorithm requires some amount of computer time to execute its instruction to perform
the task. This computer time required is called time complexity.
The time complexity of an algorithm can be defined as follows...

The time complexity of an algorithm is the total amount of time required by an
algorithm to complete its execution.
Generally, the running time of an algorithm depends upon the following...

1. Whether it is running on a single processor machine or a multi processor machine.
2. Whether it is a 32 bit machine or a 64 bit machine.
3. Read and write speed of the machine.
4. The amount of time required by an algorithm to perform arithmetic operations, logical operations, return value and assignment operations, etc.
5. Input data

Note - When we calculate time complexity of an algorithm, we consider only input data and
ignore the remaining things, as they are machine dependent. We check only, how our program
is behaving for the different input values to perform all the operations like Arithmetic, Logical,
Return value and Assignment etc.,

Calculating the time complexity of an algorithm based on the system configuration is a very difficult task because the configuration changes from one system to another. To solve this problem, we assume a model machine with a specific configuration, so that we can calculate a generalized time complexity according to that model machine.
To calculate the time complexity of an algorithm, we need to define a model machine. Let us assume a machine with the following configuration...

1. It is a Single processor machine
2. It is a 32 bit Operating System machine
3. It performs sequential execution
4. It requires 1 unit of time for Arithmetic and Logical operations
5. It requires 1 unit of time for Assignment and Return value
6. It requires 1 unit of time for Read and Write operations

Now, we calculate the time complexity of following example code by using the above-defined
model machine...
Consider the following piece of code...

Example 1
int sum(int a, int b)
{
return a+b;
}
In the above sample code, it requires 1 unit of time to calculate a+b and 1 unit of time to
return the value. That means, totally it takes 2 units of time to complete its execution. And it
does not change based on the input values of a and b. That means for all input values, it
requires the same amount of time i.e. 2 units.
If any program requires a fixed amount of time for all input values then its time
complexity is said to be Constant Time Complexity.
Consider the following piece of code...

Example 2
int sum(int A[], int n)
{
int sum = 0, i;
for(i = 0; i < n; i++)
sum = sum + A[i];
return sum;
}
For the above code, time complexity can be calculated as follows...

Statement                  Cost         Repetition        Total
int sum = 0, i;            1            1                 1
for(i = 0; i < n; i++)     1 + 1 + 1    1 + (n+1) + n     2n + 2
sum = sum + A[i];          2            n                 2n
return sum;                1            1                 1
                                                          4n + 4

In the above calculation
Cost is the amount of computer time required for a single execution of the operations in each line.
Repetition is the number of times each operation in the line is executed.
Total is the amount of computer time required by each line over all its repetitions.
So the above code requires '4n+4' units of computer time to complete the task. Here the exact time is not fixed; it changes based on the value of n. If we increase the value of n then the time required also increases linearly.

In total it takes '4n+4' units of time to complete its execution, and this is Linear Time Complexity.
If the amount of time required by an algorithm increases in proportion to the size of the input, then that time complexity is said to be Linear Time Complexity.
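
The '4n+4' figure can be checked by instrumenting the loop of Example 2. The sketch below is an illustration only: the function name sumCountingUnits and the units counter are assumptions added here, and each operation is charged 1 unit as on the model machine.

```c
/* Sums A[0..n-1] and counts model-machine time units in *units. */
long sumCountingUnits(const int A[], int n, long *units)
{
    long sum = 0;
    int i;
    *units = 1;                 /* sum = 0 : one assignment */
    *units += 1;                /* i = 0   : one assignment */
    for (i = 0; ; i++) {
        *units += 1;            /* i < n   : one comparison, made n+1 times */
        if (!(i < n))
            break;
        sum = sum + A[i];
        *units += 2;            /* one addition and one assignment */
        *units += 1;            /* i++     : one increment */
    }
    *units += 1;                /* return sum */
    return sum;
}
```

For an array of 3 elements the counter ends at 16, which is 4n+4 for n = 3. For an empty array it ends at 4.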

What is Asymptotic Notation?


Whenever we want to perform analysis of an algorithm, we need to calculate the complexity of
that algorithm. But when we calculate the complexity of an algorithm it does not provide the
exact amount of resource required. So instead of taking the exact amount of resource, we
represent that complexity in a general form (Notation) which produces the basic nature of
that algorithm. We use that general form (Notation) for analysis process.
Asymptotic notation of an algorithm is a mathematical representation of its complexity.
Note - In asymptotic notation, when we want to represent the complexity of an algorithm, we
use only the most significant terms in the complexity of that algorithm and ignore least
significant terms in the complexity of that algorithm (Here complexity can be Space
Complexity or Time Complexity).
For example, consider the following time complexities of two algorithms...

• Algorithm 1 : 5n^2 + 2n + 1
• Algorithm 2 : 10n^2 + 8n + 3

Generally, when we analyze an algorithm, we consider the time complexity for larger values of input data (i.e. 'n' value). In the above two time complexities, for larger values of 'n' the term '2n + 1' in algorithm 1 has less significance than the term '5n^2', and the term '8n + 3' in algorithm 2 has less significance than the term '10n^2'.
Here, for larger values of 'n' the value of the most significant terms ( 5n^2 and 10n^2 ) is much larger than the value of the least significant terms ( 2n + 1 and 8n + 3 ). So for larger values of 'n' we ignore the least significant terms to represent the overall time required by an algorithm. In asymptotic notation, we use only the most significant terms to represent the time complexity of an algorithm.

We mainly use THREE types of Asymptotic Notations and those are as follows...

1. Big - Oh (O)
2. Big - Omega (Ω)
3. Big - Theta (Θ)

Big - Oh Notation (O)


Big - Oh notation is used to define the upper bound of an algorithm in terms of Time Complexity.
That means Big - Oh notation gives a limit that the time required by an algorithm does not exceed (up to a constant factor) for sufficiently large input values. It is most commonly used to describe the worst case of an algorithm's time complexity, although an upper bound can be stated for any case.
Big - Oh Notation can be defined as follows...
Consider function f(n) as the time complexity of an algorithm and g(n) as its most significant term. If there exist constants C > 0 and n0 >= 1 such that f(n) <= C g(n) for all n >= n0, then we can represent f(n) as O(g(n)).
f(n) = O(g(n))
Consider the following graph drawn for the values of f(n) and C g(n), with the input (n) value on the X-Axis and the time required on the Y-Axis

In the above graph, after a particular input value n0, C g(n) is always greater than f(n), which indicates the algorithm's upper bound.

Example
Consider the following f(n) and g(n)...
f(n) = 3n + 2
g(n) = n
If we want to represent f(n) as O(g(n)) then it must satisfy f(n) <= C g(n) for all n >= n0, for some constants C > 0 and n0 >= 1
f(n) <= C g(n)
⇒3n + 2 <= C n
The above condition is TRUE for C = 4 and all n >= 2, because 3n + 2 <= 4n exactly when n >= 2.
By using Big - Oh notation we can represent the time complexity as follows...
3n + 2 = O(n)

Big - Omega Notation (Ω)


Big - Omega notation is used to define the lower bound of an algorithm in terms of Time Complexity.
That means Big - Omega notation gives a limit that the time required by an algorithm does not fall below (up to a constant factor) for sufficiently large input values. It is most commonly used to describe the best case of an algorithm's time complexity, although a lower bound can be stated for any case.
Big - Omega Notation can be defined as follows...
Consider function f(n) as the time complexity of an algorithm and g(n) as its most significant term. If there exist constants C > 0 and n0 >= 1 such that f(n) >= C g(n) for all n >= n0, then we can represent f(n) as Ω(g(n)).
f(n) = Ω(g(n))
Consider the following graph drawn for the values of f(n) and C g(n), with the input (n) value on the X-Axis and the time required on the Y-Axis

In the above graph, after a particular input value n0, C g(n) is always less than f(n), which indicates the algorithm's lower bound.
Example
Consider the following f(n) and g(n)...
f(n) = 3n + 2
g(n) = n
If we want to represent f(n) as Ω(g(n)) then it must satisfy f(n) >= C g(n) for all n >= n0, for some constants C > 0 and n0 >= 1
f(n) >= C g(n)
⇒3n + 2 >= C n
The above condition is TRUE for C = 1 and all n >= 1.
By using Big - Omega notation we can represent the time complexity as follows...
3n + 2 = Ω(n)

Big - Theta Notation (Θ)


Big - Theta notation is used to define the tight bound of an algorithm in terms of Time Complexity, that is, an upper bound and a lower bound at the same time.
That means Big - Theta notation indicates that, for sufficiently large input values, the time required by an algorithm stays between two constant multiples of g(n). Big - Theta describes the exact order of growth; it is not the same thing as the average case, although the two are often mentioned together.
Big - Theta Notation can be defined as follows...
Consider function f(n) as the time complexity of an algorithm and g(n) as its most significant term. If there exist constants C1 > 0, C2 > 0 and n0 >= 1 such that C1 g(n) <= f(n) <= C2 g(n) for all n >= n0, then we can represent f(n) as Θ(g(n)).
f(n) = Θ(g(n))
Consider the following graph drawn for the values of f(n), C1 g(n) and C2 g(n), with the input (n) value on the X-Axis and the time required on the Y-Axis

In the above graph, after a particular input value n0, C1 g(n) is always less than f(n) and C2 g(n) is always greater than f(n), which indicates the algorithm's tight bound.

Example
Consider the following f(n) and g(n)...
f(n) = 3n + 2
g(n) = n
If we want to represent f(n) as Θ(g(n)) then it must satisfy C1 g(n) <= f(n) <= C2 g(n) for all n >= n0, for some constants C1 > 0, C2 > 0 and n0 >= 1
C1 g(n) <= f(n) <= C2 g(n)
⇒C1 n <= 3n + 2 <= C2 n
The above condition is TRUE for C1 = 1, C2 = 4 and all n >= 2.
By using Big - Theta notation we can represent the time complexity as follows...
3n + 2 = Θ(n)
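
The constants used in the three examples can be tried out numerically. The sketch below only tests a finite range of n, so it illustrates the inequalities rather than proving them; the names f, g and boundsHold are assumptions introduced here.

```c
/* f(n) = 3n + 2 and g(n) = n, as in the examples above. */
static long f(long n) { return 3 * n + 2; }
static long g(long n) { return n; }

/* Returns 1 if c1*g(n) <= f(n) <= c2*g(n) for every n in [n0, nMax], else 0. */
int boundsHold(long c1, long c2, long n0, long nMax)
{
    long n;
    for (n = n0; n <= nMax; n++) {
        if (c1 * g(n) > f(n) || f(n) > c2 * g(n))
            return 0;
    }
    return 1;
}
```

boundsHold(1, 4, 2, 100000) returns 1, in line with C1 = 1, C2 = 4 and n0 = 2. Starting from n0 = 1 instead returns 0, because 3(1) + 2 = 5 is greater than 4(1).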

AVL Tree Data structure
• An AVL tree is a height-balanced binary search tree. That means an AVL tree is a binary search tree that is also balanced.
• A binary tree is said to be balanced if the difference between the heights of the left and right subtrees of every node in the tree is either -1, 0 or +1.
• In an AVL tree, every node maintains an extra piece of information known as the balance factor.
• The AVL tree was introduced in the year 1962 by G.M. Adelson-Velsky and E.M. Landis.

An AVL tree is defined as follows...

An AVL tree is a balanced binary search tree. In an AVL tree, the balance factor of every node is either -1, 0 or +1.

Balance factor of a node is the difference between the heights of the left and right subtrees of that node. The balance factor of a node is calculated as either height of left subtree - height of right subtree (OR) height of right subtree - height of left subtree. In the following explanation, we calculate it as follows...

Balance factor = heightOfLeftSubtree - heightOfRightSubtree


Example of AVL Tree

The above tree is a binary search tree and every node satisfies the balance factor condition. So this tree is said to be an AVL tree.

Every AVL Tree is a binary search tree but every Binary Search Tree need not be an AVL tree.
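
The balance factor condition can be sketched in C as follows. The struct layout and the names AvlNode, height, balanceFactor and isBalanced are assumptions made for this illustration; the sketch also assumes each node stores the height of its own subtree, with a leaf at height 1 and an empty subtree at height 0.

```c
#include <stddef.h>

struct AvlNode {
    int key;
    int height;                 /* height of the subtree rooted here; a leaf has height 1 */
    struct AvlNode *left;
    struct AvlNode *right;
};

/* Height of a possibly empty subtree (an empty subtree has height 0). */
int height(const struct AvlNode *node)
{
    return node ? node->height : 0;
}

/* Balance factor = height of left subtree - height of right subtree. */
int balanceFactor(const struct AvlNode *node)
{
    return node ? height(node->left) - height(node->right) : 0;
}

/* Returns 1 if every node in the tree has a balance factor of -1, 0 or +1. */
int isBalanced(const struct AvlNode *node)
{
    int bf;
    if (node == NULL)
        return 1;
    bf = balanceFactor(node);
    return bf >= -1 && bf <= 1 && isBalanced(node->left) && isBalanced(node->right);
}
```

Note that isBalanced checks only the balance factor condition and trusts the stored heights; it does not verify the binary search tree ordering.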

AVL Tree Rotations

In an AVL tree, after performing operations like insertion and deletion we need to check the balance factor of every node in the tree. If every node satisfies the balance factor condition then we conclude the operation; otherwise we must make the tree balanced. Whenever the tree becomes imbalanced due to any operation we use rotation operations to make the tree balanced.

Rotation is the process of moving nodes either to the left or to the right to make the tree balanced.

There are four rotations and they are classified into two types: single rotations (LL Rotation and RR Rotation) and double rotations (LR Rotation and RL Rotation).

Single Left Rotation (LL Rotation)

In LL Rotation, every node moves one position to the left from its current position. To understand LL Rotation, let us consider the following insertion operation in AVL Tree...

Single Right Rotation (RR Rotation)

In RR Rotation, every node moves one position to the right from its current position. To understand RR Rotation, let us consider the following insertion operation in AVL Tree...

Left Right Rotation (LR Rotation)

The LR Rotation is a sequence of a single left rotation followed by a single right rotation. In LR Rotation, every node first moves one position to the left and then one position to the right from its current position. To understand LR Rotation, let us consider the following insertion operation in AVL Tree...

Right Left Rotation (RL Rotation)

The RL Rotation is a sequence of a single right rotation followed by a single left rotation. In RL Rotation, every node first moves one position to the right and then one position to the left from its current position. To understand RL Rotation, let us consider the following insertion operation in AVL Tree...
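
The four rotations can be sketched in C as pointer rearrangements. The struct and all function names below are assumptions made for this illustration. Following the naming used in this section, rotateLeft corresponds to the single left rotation (LL Rotation) and rotateRight to the single right rotation (RR Rotation).

```c
#include <stddef.h>

struct AvlNode {
    int key;
    int height;                 /* a leaf has height 1, an empty subtree 0 */
    struct AvlNode *left;
    struct AvlNode *right;
};

static int height(const struct AvlNode *n) { return n ? n->height : 0; }

static void updateHeight(struct AvlNode *n)
{
    int hl = height(n->left), hr = height(n->right);
    n->height = 1 + (hl > hr ? hl : hr);
}

/* Single left rotation: the right child of x becomes the new subtree root. */
struct AvlNode *rotateLeft(struct AvlNode *x)
{
    struct AvlNode *y = x->right;
    x->right = y->left;         /* y's left subtree moves under x */
    y->left = x;
    updateHeight(x);            /* x is now below y, so update it first */
    updateHeight(y);
    return y;
}

/* Single right rotation: the left child of y becomes the new subtree root. */
struct AvlNode *rotateRight(struct AvlNode *y)
{
    struct AvlNode *x = y->left;
    y->left = x->right;         /* x's right subtree moves under y */
    x->right = y;
    updateHeight(y);
    updateHeight(x);
    return x;
}

/* LR Rotation: rotate the left child to the left, then the node to the right. */
struct AvlNode *rotateLeftRight(struct AvlNode *n)
{
    n->left = rotateLeft(n->left);
    return rotateRight(n);
}

/* RL Rotation: rotate the right child to the right, then the node to the left. */
struct AvlNode *rotateRightLeft(struct AvlNode *n)
{
    n->right = rotateRight(n->right);
    return rotateLeft(n);
}
```

Each function returns the new root of the rotated subtree, so the caller must store the result back into the parent's child pointer (or into the tree's root pointer).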

Operations on an AVL Tree

The following operations are performed on AVL tree...

1. Search

2. Insertion

3. Deletion

Search Operation in AVL Tree

In an AVL tree, the search operation is performed with O(log n) time complexity. The search operation in the AVL tree is similar to the search operation in a Binary search tree. We use the following steps to search an element in AVL tree...

• Step 1 - Read the search element from the user.
• Step 2 - Compare the search element with the value of the root node in the tree.
• Step 3 - If both are matched, then display "Given node is found!!!" and terminate the function.
• Step 4 - If both are not matched, then check whether the search element is smaller or larger than that node value.
• Step 5 - If the search element is smaller, then continue the search process in the left subtree.
• Step 6 - If the search element is larger, then continue the search process in the right subtree.
• Step 7 - Repeat the same until we find the exact element or until the search element is compared with a leaf node.
• Step 8 - If we reach a node having a value equal to the search value, then display "Element is found" and terminate the function.
• Step 9 - If we reach a leaf node and it also does not match the search element, then display "Element is not found" and terminate the function.
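
The search steps above can be sketched as a short loop in C. The struct and the name avlSearch are assumptions for this illustration; instead of printing a message, the sketch returns the matching node or NULL so that the caller can decide what to display.

```c
#include <stddef.h>

struct AvlNode {
    int key;
    struct AvlNode *left;
    struct AvlNode *right;
};

/* Returns the node holding 'key', or NULL if the key is not in the tree. */
const struct AvlNode *avlSearch(const struct AvlNode *root, int key)
{
    const struct AvlNode *node = root;
    while (node != NULL) {
        if (key == node->key)
            return node;                  /* element is found */
        /* a smaller key continues in the left subtree, a larger key in the right */
        node = (key < node->key) ? node->left : node->right;
    }
    return NULL;                          /* element is not found */
}
```

Because the tree is height-balanced, the loop visits at most one node per level, which is where the O(log n) search time comes from.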

Insertion Operation in AVL Tree

In an AVL tree, the insertion operation is performed with O(log n) time complexity. In an AVL Tree, a new node is always inserted as a leaf node. The insertion operation is performed as follows...

• Step 1 - Insert the new element into the tree using Binary Search Tree insertion logic.
• Step 2 - After insertion, check the Balance Factor of every node.
• Step 3 - If the Balance Factor of every node is 0 or 1 or -1 then go for the next operation.
• Step 4 - If the Balance Factor of any node is other than 0 or 1 or -1 then the tree is said to be imbalanced. In this case, perform a suitable Rotation to make it balanced and go for the next operation.
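
A recursive version of these steps might look like the following C sketch. All names (AvlNode, avlInsert and the static helpers) are assumptions for this illustration, duplicate keys are simply ignored, and only the nodes on the path back to the root are rechecked, since those are the only nodes whose balance factor can change.

```c
#include <stdlib.h>

struct AvlNode {
    int key, height;            /* a leaf has height 1, an empty subtree 0 */
    struct AvlNode *left, *right;
};

static int height(const struct AvlNode *n) { return n ? n->height : 0; }
static int balance(const struct AvlNode *n) { return height(n->left) - height(n->right); }

static void updateHeight(struct AvlNode *n)
{
    int hl = height(n->left), hr = height(n->right);
    n->height = 1 + (hl > hr ? hl : hr);
}

static struct AvlNode *rotateLeft(struct AvlNode *x)
{
    struct AvlNode *y = x->right;
    x->right = y->left;
    y->left = x;
    updateHeight(x);
    updateHeight(y);
    return y;
}

static struct AvlNode *rotateRight(struct AvlNode *y)
{
    struct AvlNode *x = y->left;
    y->left = x->right;
    x->right = y;
    updateHeight(y);
    updateHeight(x);
    return x;
}

/* Inserts key and returns the (possibly new) root of the subtree. */
struct AvlNode *avlInsert(struct AvlNode *node, int key)
{
    int bf;
    if (node == NULL) {                        /* Step 1: the new node becomes a leaf */
        node = malloc(sizeof *node);
        if (node == NULL)
            return NULL;                       /* allocation failed */
        node->key = key;
        node->height = 1;
        node->left = node->right = NULL;
        return node;
    }
    if (key < node->key)
        node->left = avlInsert(node->left, key);
    else if (key > node->key)
        node->right = avlInsert(node->right, key);
    else
        return node;                           /* duplicate keys are ignored */

    updateHeight(node);                        /* Step 2: recheck balance on the way up */
    bf = balance(node);
    if (bf > 1) {                              /* Step 4: left subtree too tall */
        if (balance(node->left) < 0)           /* left-right shape needs a double rotation */
            node->left = rotateLeft(node->left);
        return rotateRight(node);
    }
    if (bf < -1) {                             /* Step 4: right subtree too tall */
        if (balance(node->right) > 0)          /* right-left shape needs a double rotation */
            node->right = rotateRight(node->right);
        return rotateLeft(node);
    }
    return node;                               /* Step 3: already balanced */
}
```

Inserting 1, 2 and 3 in that order into an empty tree (root = avlInsert(root, key) each time) triggers one single left rotation and leaves 2 at the root with 1 and 3 as its children.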

Deletion Operation in AVL Tree

The deletion operation in an AVL Tree is similar to the deletion operation in a BST. But after every deletion operation, we need to check the Balance Factor condition. If the tree is balanced after deletion, go for the next operation; otherwise perform a suitable rotation to make the tree balanced.
Deletion in an AVL Tree

• Deletion in an AVL tree is similar to that in a BST.
• Deletion of a node tends to disturb the balance factor. Thus to balance the tree, we again use the Rotation mechanism.
• Deletion in an AVL tree consists of two steps:
o Removal of the node: The given node is removed from the tree structure. The node to be removed can either be a leaf or an internal node.
o Re-balancing of the tree: The elimination of a node from the tree can disturb the balance factor of certain nodes. Thus it is important to re-balance the balance factor (b_fact) of the nodes, since the balance factor is the primary aspect that ensures the tree is an AVL Tree.
Note: There are certain points that must be kept in mind during a deletion process.

• If the node to be deleted is a leaf node, it is simply removed from the tree.
• If the node to be deleted has one child node, the node is simply replaced by its child node.
• If the node to be deleted has two child nodes then,
o Either replace the node with its inorder predecessor, i.e., the largest element of the left subtree.
o Or replace the node with its inorder successor, i.e., the smallest element of the right subtree.
Let us consider the same example as above with some additional elements as shown. We can see in the image the balance factor of each node in the tree.

Step 1:

• The node to be deleted from the tree is 8.
• If we observe, it is the parent node of the nodes 5 and 9.
• Since the node 8 has two children, it can be replaced by either its inorder predecessor or its inorder successor.

Step 2:

• The node 8 is deleted from the tree.
• As the node is deleted, we replace it with its inorder predecessor or its inorder successor.
• Here we replaced the node with the inorder successor, i.e., 9.
• Again we check the balance factor for each node.

Step 3:
• Now the next element to be deleted is 12.
• If we observe, we can see that the node 12 has a left subtree and a right subtree.
• We can again replace the node by either its inorder successor or its inorder predecessor.
• In this case we have replaced it by the inorder successor.
Step 4:
• The node 12 is deleted from the tree.
• Since we have replaced the node with the inorder successor, the tree structure looks as shown in the image.
• After removing and replacing, check the balance factor of each node of the tree.

Step 5:
• The next node to be eliminated is 14.
• It can be seen clearly in the image that 14 is a leaf node.
• Thus it can be eliminated easily from the tree.
Step 6:
• As the node 14 is deleted, we check the balance factor of all the nodes.
• We can see the balance factor of the node 13 is 2.
• This violates the terms of the AVL tree, thus we need to balance it using the rotation mechanism.

Step 7:
• In order to balance the tree, we identify the rotation mechanism to be applied.
• Here we need to use LL Rotation.
• The nodes involved in the rotation are shown as follows.
Step 8:
• The nodes are rotated and the tree satisfies the conditions of an AVL tree.
• The final structure of the tree is shown as follows.
• We can see that every node has a balance factor of '0', '1' or '-1'.

Advantages of AVL Trees
o The height of an AVL tree is always O(log N), where N is the total number of nodes in the tree, since the tree is always height-balanced.
o When compared to plain Binary Search trees, whose height can degrade to O(N), it provides a better worst-case search time complexity.
o AVL trees are self-balancing.

Disadvantages of AVL Trees


o The aforementioned examples show that AVL trees can be challenging to implement.
o Additionally, for particular operations, AVL trees have significant constant factors.
o Red-black trees, as opposed to AVL trees, are used in the majority of STL implementations of the ordered associative containers (sets, multisets, maps, and multimaps). Red-black trees need only a constant number of rotations for an insertion or removal, whereas a removal from an AVL tree may require rotations at several nodes on the path back to the root.

Applications of AVL Trees

o In-memory sets and dictionaries can be stored using AVL trees.
o Database applications, where insertions and deletions are less common but frequent data lookups are necessary, also frequently employ AVL trees.
o In addition to database applications, AVL trees are employed in other applications that call for fast searching.
o Because an AVL tree uses rotations to stay balanced, it suits applications that need guaranteed O(log N) lookups.
o It may be used in games with plotlines as well.
o It is mostly utilized in business sectors where it is necessary to keep records on the employees that work there and their shift changes.

B-Tree
B trees are extended binary search trees that are specialized in m-way searching, where 'm' is the order of the B tree.
• The order of a tree is defined as the maximum number of children a node can accommodate. Because a node can have many children, the height of a B tree is relatively smaller than the height of an AVL tree or RB tree holding the same keys.
• They are a general form of a Binary Search Tree, as a node can hold more than one key and more than two children.
The various properties of B trees include −
• Every node in a B Tree will hold a maximum of m children and (m-1) keys, since the order of the tree is m.
• Every node in a B tree, except the root and the leaves, must hold at least ⌈m/2⌉ children.
• The root node must have no less than two children, unless it is a leaf.
• All the paths in a B tree must end at the same level, i.e. the leaf nodes must be at the same level.
• A B tree always maintains sorted data.

B trees are also widely used for disk-based storage, since the low height of a B tree minimizes the number of disk accesses.

Note − A disk access is an access to the computer disk where the information is stored, and disk access time is the time taken by the system to access the disk.

Basic Operations of B Trees


The operations supported in B trees are Insertion, deletion and searching with the time
complexity of O(log n) for every operation.
Insertion operation

The insertion operation for a B Tree is done similar to the Binary Search Tree but the elements
are inserted into the same node until the maximum keys are reached. The insertion is done
using the following procedure −

Step 1 − Calculate the maximum (m − 1) and minimum (⌈m/2⌉ − 1) number of keys a node
can hold, where m denotes the order of the B Tree.

Step 2 − The data is inserted into the tree using the binary search insertion, and once the keys
in a node exceed the maximum number, the node is split in half: the median key moves up to
the parent node, while the keys to its left and right form two separate child nodes.



Step 3 − All the leaf nodes must be on the same level.

The keys 5, 3, 21, 9, 13 are all added into the node according to the binary search property, but
adding the key 22 violates the maximum key property. Hence, the node is split in half,
the median key is shifted to the parent node and the insertion is then continued.

Another overflow occurs during the insertion of 11, so the node is split and the median is
shifted to the parent.



While inserting 16, splitting the leaf node in two is not enough, because the parent node has
already reached the maximum number of keys and would overflow as well. Hence, the parent
node is split first and its median key becomes the root. Then the leaf node is split in half and
its median is shifted to its parent.

The final B tree after inserting all the elements is achieved.

Operations on a B-Tree
The following operations are performed on a B-Tree...

1. Search
2. Insertion
3. Deletion

Search Operation in B-Tree


 The search operation in B-Tree is similar to the search operation in Binary Search Tree.
 In a Binary search tree, the search process starts from the root node and we make a 2-way
decision every time (we go to either left subtree or right subtree).
 In a B-Tree the search process also starts from the root node, but here we make a multi-way
decision every time, with one possible branch for each child the node has.
 In a B-Tree, the search operation is performed with O(log n) time complexity, where n is the
number of keys stored in the tree.

The search operation is performed as follows...


 Step 1 - Read the search element from the user.
 Step 2 - Compare the search element with first key value of root node in the tree.
 Step 3 - If both are matched, then display "Given node is found!!!" and terminate the function
 Step 4 - If both are not matched, then check whether search element is smaller or larger than
that key value.
 Step 5 - If search element is smaller, then continue the search process in left subtree.
 Step 6 - If search element is larger, then compare the search element with next key value in the
same node and repeat steps 3, 4, 5 and 6 until we find the exact match or until the search
element is compared with last key value in the leaf node.
 Step 7 - If the last key value in the leaf node is also not matched then display "Element is not
found" and terminate the function.
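The search steps above can be sketched in Python as follows. The `BTreeNode` class, the function name `btree_search`, and the sample tree are assumptions made for illustration; the function returns a boolean instead of displaying a message.

```python
class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                    # keys in ascending order
        self.children = children or []      # empty list for a leaf node

def btree_search(node, target):
    """Return True if target is stored in the subtree rooted at node."""
    while node is not None:
        i = 0
        # Compare the target with each key of the current node, left to right.
        while i < len(node.keys) and target > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == target:
            return True                     # exact match found
        if not node.children:
            return False                    # leaf reached without a match
        node = node.children[i]             # continue in the i-th subtree
    return False

# Hypothetical tree of order 3 used only for this example.
root = BTreeNode([10], [
    BTreeNode([2, 4], [BTreeNode([1]), BTreeNode([3]), BTreeNode([7, 8])]),
    BTreeNode([14], [BTreeNode([11]), BTreeNode([20])]),
])
print(btree_search(root, 8), btree_search(root, 5))
```

At every node the loop either finds the key, or picks exactly one child to descend into, so the number of nodes visited is bounded by the height of the tree.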

Insertion Operation in B-Tree


In a B-Tree, a new element must be added only at a leaf node. That means the new key value is
always attached to a leaf node only. The insertion operation is performed as follows...

 Step 1 - Check whether tree is Empty.


 Step 2 - If tree is Empty, then create a new node with new key value and insert it into the tree
as a root node.
 Step 3 - If tree is Not Empty, then find the suitable leaf node to which the new key value is
added using Binary Search Tree logic.
 Step 4 - If that leaf node has empty position, add the new key value to that leaf node in
ascending order of key value within the node.
 Step 5 - If that leaf node is already full, split that leaf node by sending middle value to its parent
node. Repeat the same until the sending value is fixed into a node.
 Step 6 - If the splitting is performed at the root node, then the middle value becomes the new
root node for the tree and the height of the tree is increased by one.
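The insertion steps above can be sketched as a short recursive routine. This is one possible implementation under stated assumptions, not the only way to code it: the class names `BTreeNode` and `BTree` are hypothetical, keys are assumed to be distinct, and when an overflowing node holds an even number of keys the key at index `len(keys) // 2` is chosen as the middle value.

```python
class BTreeNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []      # empty list for a leaf node

class BTree:
    def __init__(self, order):
        self.order = order                  # maximum number of children per node
        self.root = None

    def insert(self, key):
        if self.root is None:               # Steps 1-2: empty tree
            self.root = BTreeNode([key])
            return
        promoted = self._insert(self.root, key)
        if promoted is not None:            # Step 6: the root was split
            median, right = promoted
            self.root = BTreeNode([median], [self.root, right])

    def _insert(self, node, key):
        """Insert key below node; return (median, right_node) if node was split."""
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if not node.children:               # Steps 3-4: add the key to the leaf
            node.keys.insert(i, key)
        else:
            promoted = self._insert(node.children[i], key)
            if promoted is None:
                return None
            median, right = promoted        # Step 5: child sent its middle key up
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
        if len(node.keys) < self.order:     # at most order - 1 keys: no overflow
            return None
        mid = len(node.keys) // 2           # overflow: split around the middle key
        median = node.keys[mid]
        right = BTreeNode(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return median, right

tree = BTree(order=3)
for k in range(1, 11):
    tree.insert(k)
print(tree.root.keys)                       # [4]
```

A node is allowed to overflow by one key temporarily and is split on the way back up the recursion, which mirrors the "split when full, send the middle value to the parent" rule in the steps.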

Example
Construct a B-Tree of Order 3 by inserting numbers from 1 to 10.

Suppose we have to create a B tree of order 3. The elements to be inserted are 4, 2, 20, 10, 1, 14, 7,
11, 3, 8.

Since m=3, the maximum number of keys for a node = m-1 = 2.

 Insert 4
 Since 2<4, insert 2 to the left of 4 in the same node.
 Since 20>4, insert 20 to the right of 4 in the same node. As we know, the maximum number of
keys in a node is 2, so the node now overflows and one of its keys has to move to a node
above to split it. 4, being the middle element, moves up, and 2 and 20 become its left and
right children respectively.
 10>4 and 10<20 and thus, 10 will be inserted as a key in the node that contains 20 as a key.
 Since 1<2, it will be inserted as a key in the node that contains 2 as a key.
 14>10 and 14<20, so 14 is placed between 10 and 20. Since the number of keys in that node
now exceeds the maximum number of keys, the node splits and the middle key moves up to
the node above. Thus, 14 gets added to the right of 4 in the node that contains 4, and 10
and 20 are split into 2 separate nodes.
 Since 7<10, it gets inserted to the left in the node that contains 10 as a key.
 11<14 and 11>10. Thus, 11 should get added to the right of 10 in the node that contains 7 and 10.
However, since the maximum number of keys in a node is 2, a split should take place.
 Thus, the middle element 10 moves to the node above, and 7 and 11 split into separate nodes.
The node above now contains 4, 10 and 14. Since the count of keys exceeds the maximum
key count, there is a split there as well. Now, 10 is the root node with 4 and 14 as its
children.
 Since 3<4 and 3>2, it gets inserted to the right of 2 in the node containing 1 and 2. This node
now exceeds the maximum count of keys in a node, leading to a split. 2 moves up to the upper
node beside 4, and 1 and 3 split into separate nodes.
 Since 8>7 and 8<10, it gets added to the right of 7 in the node that contains 7 as a key.
 In this particular example, the number of comparisons at each step varied. The first value
was directly entered, thereafter every value had to be compared with the nodes present in
the tree.
 The time complexity for insertion in a B Tree depends on the height of the tree, which grows
logarithmically with the number of keys, and is thus O(log n).

The diagram below shows the insertions in order.



Deletion In B-Tree
The deletion of keys in a B-Tree can be broadly classified into two cases:
 deletion at leaf node.
 deletion at internal node.
Let us say the node to be deleted is called the target key. The target key can either be at the leaf
node or an internal node. Let us now consider the various cases that follow:
Example:
Let us consider the given tree. From the given tree we are to delete the following
elements:

A = 20, 53, 89, 90, 85.

Assuming we have order m = 5:
minimum keys = ⌈m/2⌉ – 1 = 2; maximum keys = m – 1 = 4;
minimum children = ⌈m/2⌉ = 3; maximum children = m = 5



1. If the target key is at the leaf node :
If the target key is at the leaf node, we further study the given data to check if any of the
following cases exist:
 Case 1: If the leaf node contains more than the minimum number of keys according to the
given degree/order, then the key is simply deleted from the node.
 Case 2: If the leaf contains only the minimum number of keys, then:
 Case 2a: The node can borrow a key from the immediate left sibling node, if it has more
than the minimum number of keys. The transfer of the keys takes place through the
parent node, i.e., the maximum key of the left sibling moves upwards and replaces the
separating key in the parent, while that parent key moves down to the target node, from
where the target key is simply deleted.
 Case 2b: The node can borrow a key from the immediate right sibling node, if it has
more than the minimum number of keys. The transfer of the keys takes place through
the parent node, i.e., the minimum key of the right sibling moves upwards and replaces
the separating key in the parent, while that parent key moves down to the target node,
from where the target key is simply deleted.
 Case 2c: If neither of the siblings has more than the minimum number of keys,
then merge the target node with either the left or the right sibling, along with
the parent key that separates the two nodes.
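The borrow and merge cases for a leaf can be sketched on plain Python lists of keys. This is a simplified illustration under stated assumptions: the function names are hypothetical, the key values are made up, and only the keys are moved (a full implementation would also remove the merged sibling's child pointer from the parent).

```python
def borrow_from_left(parent_keys, sep, left, target):
    """Case 2a: parent_keys[sep] is the key separating `left` and `target`."""
    target.insert(0, parent_keys[sep])   # separating parent key moves down
    parent_keys[sep] = left.pop()        # largest key of the left sibling moves up

def borrow_from_right(parent_keys, sep, target, right):
    """Case 2b: parent_keys[sep] is the key separating `target` and `right`."""
    target.append(parent_keys[sep])      # separating parent key moves down
    parent_keys[sep] = right.pop(0)      # smallest key of the right sibling moves up

def merge_with_right(parent_keys, sep, target, right):
    """Case 2c: join target, the separating parent key, and the right sibling."""
    return target + [parent_keys.pop(sep)] + right

MIN_KEYS = 2                             # order 5: ceil(5 / 2) - 1
parent = [50, 70]
left, target = [40, 45, 49], [53, 60]
if len(left) > MIN_KEYS:                 # the left sibling can spare a key
    borrow_from_left(parent, 0, left, target)
target.remove(53)                        # now the target key can be deleted
print(parent, left, target)              # [49, 70] [40, 45] [50, 60]
```

Routing the borrowed key through the parent is what keeps the ordering intact: every key in the left sibling stays smaller than the separating key, and every key in the target node stays larger.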
2. If the target key is at an internal node:
If the target key is at an internal node, we further study the given data to check if any of the
following cases exist:

 Case 1: If the left child has more than the minimum number of keys, the target key in
the internal node is replaced by its inorder predecessor, i.e., the largest element of the
left child node.
 Case 2: If the right child has more than the minimum number of keys, the target key in
the internal node is replaced by its inorder successor, i.e., the smallest element of the
right child node.
 Case 3: If both children hold only the minimum number of keys, the two children are
merged into a single node and the target key is removed.
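Finding the inorder predecessor and successor of a key in an internal node can be sketched as below. The `BTreeNode` class, the function names, and the sample keys are assumptions for illustration. The helpers walk down to the rightmost (or leftmost) leaf, so they also work when the child is not itself a leaf.

```python
class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                    # keys in ascending order
        self.children = children or []      # empty list for a leaf node

def inorder_predecessor(node, i):
    """Largest key in the subtree to the left of node.keys[i]."""
    cur = node.children[i]
    while cur.children:                     # keep following the rightmost child
        cur = cur.children[-1]
    return cur.keys[-1]

def inorder_successor(node, i):
    """Smallest key in the subtree to the right of node.keys[i]."""
    cur = node.children[i + 1]
    while cur.children:                     # keep following the leftmost child
        cur = cur.children[0]
    return cur.keys[0]

# Hypothetical internal node holding the key 85 between two leaf children.
node = BTreeNode([85], [BTreeNode([80, 82, 83]), BTreeNode([87, 88, 89])])
print(inorder_predecessor(node, 0), inorder_successor(node, 0))
```

Whichever of the two is used as the replacement, the key that actually disappears comes from a leaf, so the deletion reduces to the leaf cases described earlier.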



Step 1:

 The first element to be deleted from the tree structure is 20.


 We can see the key lies in the leaf node.

Step 2:

 The key 20 exists in a leaf node.


 The node has more than the minimum number of keys required.
 Thus the key is simply deleted from the node.
 The tree after deletion is shown as follows.

Step 3:
 The next element to be deleted from the tree is 53.
 We can see from the image that the key exists in the leaf node.

Step 4:

 Since the node in which the target key 53 exists has just the minimum number of keys,
we cannot delete it directly.
 We check if the target node can borrow a key from its sibling nodes.
 Since the target node doesn’t have any right sibling, it borrows from the left sibling
node.
 As we have studied above how the process of borrow and replace takes place, we apply
it to the given structure.



Step 5:
 The key 49 moves upwards to the parent node.
 The key 50 moves down to the target node.

Step 6:
 Now, since the target node has keys more than the minimum number of keys required,
the key can be deleted directly.
 The tree structure after deletion is shown as follows.

Step 7:
 The next element to be deleted is 89.
 The target key lies within a leaf node as seen from the image.

Step 8:

 Again, the target node holds just the minimum number of keys required and hence the
key cannot be deleted directly.
 The target node now has to borrow a key from either of its siblings.
 We check the left sibling; it also holds just the minimum number of keys required.
 We check the right sibling node; it has one more than the minimum number of keys,
so the target node can borrow a key from it.



Step 9:

 The key 93 moves up to the parent node.


 The parent key 90 moves down to the target node.

Step 10:
 Now, as the target node has sufficient number of keys the target key can directly be
deleted from the target node.
 The tree structure after deletion is shown as follows.

Step 11:
 The next key to be deleted is 90.
 The key exists within a leaf node as shown in the image.



Step 12:

 We can see that the target node has just the minimum number of keys.
 The target node has to borrow a key from either of its siblings.
 Since each of the siblings has just the minimum number of keys, it cannot borrow
a key from them.

Step 13:

 Since the target node cannot borrow from either of the siblings, we merge the target
node, one of its sibling nodes, and the parent key that separates them.
 The process of merging is shown as follows.

Step 14:

 Since the target node now has sufficient number of keys, the target key 90 can be
deleted directly.
 The tree structure after the deletion of the element is shown as follows.



Step 15:

 The next target key is 85.
 Here the key to be deleted lies not in a leaf node but in an internal node.

Step 16:

 When a key in an internal node is to be deleted, we replace the key with its inorder
predecessor or inorder successor.
 We can select either of the child nodes if they have a sufficient number of keys.
 But as we can see, in this case the target internal node can only borrow from its right
child, i.e., its inorder successor.
 The key 85 moves down to the child node; key 87 moves up to the parent node.



Step 17:

 Now, as the target key is moved to the leaf node, it can be simply deleted from the leaf
node.
 The final tree structure after deletion of various nodes and preserving the b-tree
properties is shown as follows.

Application of B-Tree:
B-trees are commonly used in applications where large amounts of data need to be stored
and retrieved efficiently. Some of the specific applications of B-trees include:
 Databases: B-trees are widely used in databases to store indexes that allow for efficient
searching and retrieval of data.
 File systems: B-trees are used in file systems to organize and store files efficiently.
 Operating systems: B-trees are used in operating systems to manage memory
efficiently.
 Network routers: B-trees are used in network routers to efficiently route packets
through the network.
 DNS servers: B-trees are used in Domain Name System (DNS) servers to store and
retrieve information about domain names.
 Compiler symbol tables: B-trees are used in compilers to store symbol tables that allow
for efficient compilation of code.



Advantages of B-Tree:
B-trees have several advantages over other data structures for storing and retrieving large
amounts of data. Some of the key advantages of B-trees include:

 Sequential Traversing: As the keys are kept in sorted order, the tree can be traversed
sequentially.
 Minimize disk reads: It is a hierarchical structure and thus minimizes disk reads.
 Partially full blocks: The B-tree has partially full blocks which speed up insertion and
deletion.
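The sequential traversal advantage can be illustrated with an inorder walk that visits child i, then key i, and finally the last child. The following is a minimal sketch; the `BTreeNode` class, the generator name `inorder`, and the sample tree are assumptions for illustration.

```python
class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                    # keys in ascending order
        self.children = children or []      # empty list for a leaf node

def inorder(node):
    """Yield every key of the subtree rooted at node in ascending order."""
    if not node.children:
        yield from node.keys                # a leaf: its keys are already sorted
        return
    for i, key in enumerate(node.keys):
        yield from inorder(node.children[i])
        yield key
    yield from inorder(node.children[-1])   # subtree to the right of the last key

# Hypothetical tree of order 3 used only for this example.
root = BTreeNode([10], [
    BTreeNode([2, 4], [BTreeNode([1]), BTreeNode([3]), BTreeNode([7, 8])]),
    BTreeNode([14], [BTreeNode([11]), BTreeNode([20])]),
])
print(list(inorder(root)))
```

Because the walk produces the keys in sorted order without any extra sorting step, range queries and ordered scans fall out of the structure naturally.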
Disadvantages of B-Tree:
 Complexity: B-trees can be complex to implement and can require a significant amount
of programming effort to create and maintain.
 Overhead: B-trees can have significant overhead, both in terms of memory usage and
processing time. This is because B-trees require additional metadata to maintain the tree
structure and balance.
 Not optimal for small data sets: B-trees are most effective for storing and retrieving
large amounts of data. For small data sets, other data structures may be more efficient.
 Limited branching factor: The branching factor of a B-tree determines the number of
child nodes that each node can have. B-trees typically have a fixed branching factor,
which can limit their performance for certain types of data.
