Ads Unit 1
An algorithm is a sequence of unambiguous instructions used for solving a problem, which can be
implemented (as a program) on a computer.
Algorithms are used to convert our problem solution into step by step statements. These statements can
be converted into computer programming instructions which form a program. This program is executed by
a computer to produce a solution. Here, the program takes required data as input, processes data according
to the program instructions and finally produces a result as shown in the following picture.
Specifications of Algorithms
Every algorithm must satisfy the following specifications...
1. Input - Every algorithm takes zero or more input values from an external source.
2. Output - Every algorithm must produce at least one output as its result.
3. Definiteness - Every statement/instruction in an algorithm must be clear and unambiguous (only
one interpretation).
4. Finiteness - For all different cases, the algorithm must produce result within a finite number of
steps.
5. Effectiveness - Every instruction must be basic enough to be carried out and it also must be
feasible.
Let us consider the following problem for finding the largest value in a given list of values.
Problem Statement: Find the largest number in the given list of numbers.
Input: A list of positive integer numbers. (List must contain at least one number).
Output: The largest number in the given list of positive integer numbers.
Consider the given list of numbers as 'L' (input), and the largest number as 'max' (Output).
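The problem above can be sketched in C as follows. This is a minimal illustration rather than a
definitive implementation: the function name find_max and the parameter n (the number of values in L)
are assumptions introduced here, and the precondition check mirrors the requirement that the list
contains at least one number.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the largest value in L[0..n-1].
   Precondition (from the problem statement): n >= 1. */
int find_max(const int L[], size_t n)
{
    assert(n >= 1);              /* the list must contain at least one number */
    int max = L[0];              /* start with the first number as the largest */
    for (size_t i = 1; i < n; i++) {
        if (L[i] > max)          /* compare each remaining number with max */
            max = L[i];          /* keep the larger of the two */
    }
    return max;                  /* max now holds the largest number in L */
}
```

For example, calling find_max on the list {3, 9, 2, 7} with n = 4 returns 9. Every instruction is
definite, the loop finishes after n-1 comparisons (finiteness), and one output is produced.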
Recursive Algorithm
In computer science, algorithms are typically implemented as functions in a programming language. We can
view a function as something that is invoked (called) by another function. It executes its code and then
returns control to the calling function. A function may also call itself, or call another function which
in turn calls the first function again. This technique is known as recursion. Recursive functions are
classified as follows...
A function which calls itself is known as a Direct Recursive function (or Recursive function).
A function which calls another function, which in turn calls the first function, is known as an Indirect
Recursive function (or Recursive function).
Many computer science students think that recursion is a technique useful for only a few special
problems like computing factorials, Ackermann's function, etc. This is unfortunate because any function
implemented using assignment, if-else, while or other looping statements can also be implemented using
recursive functions. A recursive function is often easier to understand than its iterative
counterpart.
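The two kinds of recursion can be illustrated with a short sketch. The function names below are
hypothetical and chosen only for this example: factorial_recursive is directly recursive,
factorial_iterative is its loop-based counterpart, and is_even / is_odd call each other, which makes
them indirectly recursive.

```c
/* Direct recursion: the function calls itself. */
unsigned long factorial_recursive(unsigned int n)
{
    if (n <= 1)
        return 1;                              /* base case stops the recursion */
    return n * factorial_recursive(n - 1);     /* recursive case */
}

/* Iterative counterpart of the same computation. */
unsigned long factorial_iterative(unsigned int n)
{
    unsigned long result = 1;
    for (unsigned int i = 2; i <= n; i++)
        result *= i;
    return result;
}

/* Indirect recursion: is_even calls is_odd, which calls is_even again. */
int is_odd(unsigned int n);

int is_even(unsigned int n)
{
    return n == 0 ? 1 : is_odd(n - 1);
}

int is_odd(unsigned int n)
{
    return n == 0 ? 0 : is_even(n - 1);
}
```

Both factorial versions return 120 for n = 5. Every recursive function needs a base case that is
eventually reached, otherwise the finiteness requirement of an algorithm is violated.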
Performance Analysis
Performance of an algorithm means predicting the resources which the algorithm requires to
perform its task. An algorithm can be judged on elements such as the following...
1. Whether the algorithm provides the exact solution for the problem?
2. Whether it is easy to understand?
3. Whether it is easy to implement?
4. How much space (memory) it requires to solve the problem?
5. How much time it takes to solve the problem? etc.
When we want to analyze an algorithm, we consider only the space and time required by that
particular algorithm and we ignore all the remaining elements.
Based on this information, performance analysis of an algorithm can also be defined as
follows...
Performance analysis of an algorithm is the process of calculating space and time required by
that algorithm.
1. Space required to complete the task of that algorithm (Space Complexity). It includes
program space and data space
2. Time required to complete the task of that algorithm (Time Complexity)
Space Complexity
What is Space complexity?
When we design an algorithm to solve a problem, it needs some computer memory to
complete its execution. For any algorithm, memory is required for the following purposes...
The total amount of computer memory required by an algorithm to complete its execution is called the
space complexity of that algorithm.
Generally, when a program is under execution it uses the computer memory for THREE reasons. They are as
follows...
1. Instruction Space: It is the amount of memory used to store compiled version of instructions.
2. Environmental Stack: It is the amount of memory used to store information about partially executed
functions at the time of a function call.
3. Data Space: It is the amount of memory used to store all the variables and constants.
Note - When we want to perform analysis of an algorithm based on its Space complexity, we
consider only Data Space and ignore Instruction Space as well as Environmental Stack.
That means we calculate only the memory required to store variables, constants, structures,
etc.
To calculate the space complexity, we must know the memory required to store different datatype values
(according to the compiler). For example, the C Programming Language compiler requires the following...
In the above piece of code, it requires 2 bytes of memory to store variable 'a' and another 2
bytes of memory, and this amount is fixed for any input value of 'a'. This space complexity is said to
be Constant Space Complexity.
If any algorithm requires a fixed amount of space for all input values then that space complexity is
said to be Constant Space Complexity.
Example 2
int sum(int A[ ], int n)
{
    int sum = 0, i;
    for(i = 0; i < n; i++)
        sum = sum + A[i];
    return sum;
}
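In the above piece of code it requires
'n*2' bytes of memory to store the array parameter 'A[ ]' (2 bytes for each of its n values)
2 bytes of memory for the integer parameter 'n'
4 bytes of memory for local integer variables 'sum' and 'i' (2 bytes each)
2 bytes of memory for the return value.
That means, totally it requires '2n+8' bytes of memory to complete its execution. Here,
the total amount of memory required depends on the value of 'n'. As the value of 'n' increases
the space required also increases proportionately. This type of space complexity is said
to be Linear Space Complexity.
If the amount of space required by an algorithm increases with the size of the input,
then that space complexity is said to be Linear Space Complexity.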
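The '2n+8' figure can be reproduced with a small calculation. The sketch below is only an illustration
under stated assumptions: the function name sum_data_space is hypothetical, and the breakdown (n array
elements, the parameter n, the locals sum and i, and the return value) is the accounting that yields
2n+8 when an int occupies 2 bytes. Many current compilers use 4-byte ints, so the size is passed in as
a parameter.

```c
/* Data space, in bytes, counted for sum(A, n) when every int
   occupies int_size bytes (int_size = 2 gives 2n + 8). */
long sum_data_space(long n, long int_size)
{
    return n * int_size      /* the n values of array A[]     */
         + int_size          /* parameter n                   */
         + 2 * int_size      /* local variables sum and i     */
         + int_size;         /* return value                  */
}
```

With int_size = 2 and n = 10 the result is 28 bytes, which matches 2n+8. With int_size = 4 the same
accounting gives 4n+16, so the constants change but the growth is still linear in n.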
Note - When we calculate the time complexity of an algorithm, we consider only the input data and
ignore the remaining things, as they are machine dependent. We check only how our program
behaves for different input values while performing operations such as arithmetic, logical,
return-value and assignment operations.
Now, we calculate the time complexity of following example code by using the above-defined
model machine...
Consider the following piece of code...
Example 1
int sum(int a, int b)
{
    return a+b;
}
In the above sample code, it requires 1 unit of time to calculate a+b and 1 unit of time to
return the value. That means, totally it takes 2 units of time to complete its execution. And it
does not change based on the input values of a and b. That means for all input values, it
requires the same amount of time i.e. 2 units.
If any program requires a fixed amount of time for all input values then its time
complexity is said to be Constant Time Complexity.
ADSA UNIT-1 R23 DEPT OF CSE(AI&ML) KITS AKSHAR INSTITUTE OF TECHNOLOGY
Consider the following piece of code...
Example 2
int sum(int A[], int n)
{
    int sum = 0, i;
    for(i = 0; i < n; i++)
        sum = sum + A[i];
    return sum;
}
For the above code, time complexity can be calculated as follows...
In the above calculation:
Cost is the amount of computer time required for a single operation in each line.
Repetition is the number of times each operation is repeated.
Total is the amount of computer time required by each operation to execute, i.e. Cost multiplied by
Repetition.
So the above code requires '4n+4' units of computer time to complete the task. Here the exact
time is not fixed; it changes with the value of n. If we increase n, the time
required also increases linearly, so this is Linear Time Complexity.
If the amount of time required by an algorithm increases linearly with the size of the input,
then that time complexity is said to be Linear Time Complexity.
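One way to arrive at '4n+4' is to count units while the loop runs. The sketch below assumes a cost
model that is not spelled out in these notes: 1 unit for the declaration line, 1 unit for i = 0,
1 unit for each test of i < n (executed n+1 times), 1 unit for each i++ (n times), 2 units for each
addition plus assignment in the loop body (n times), and 1 unit for the return. The function name
sum_counting and the units parameter are hypothetical.

```c
/* Sums A[0..n-1] and reports, through *units, how many time units
   the original sum() would take under the assumed cost model. */
long sum_counting(const int A[], int n, long *units)
{
    long sum = 0;
    int i;
    *units = 1;                  /* int sum = 0, i;         1 unit           */
    *units += 1;                 /* i = 0                   1 unit           */
    for (i = 0; ; i++) {
        *units += 1;             /* test i < n              n + 1 times      */
        if (!(i < n))
            break;
        sum = sum + A[i];
        *units += 2;             /* addition + assignment   2 units, n times */
        *units += 1;             /* i++                     n times          */
    }
    *units += 1;                 /* return sum              1 unit           */
    return sum;
}
```

For n = 3 the counter ends at 16, which equals 4n+4. For n = 0 it ends at 4, the constant part of the
expression.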
Algorithm 1 : 5n² + 2n + 1
Algorithm 2 : 10n² + 8n + 3
Mainly, we use THREE types of Asymptotic Notations and those are as follows...
1. Big - Oh (O)
2. Big - Omega (Ω)
3. Big - Theta (Θ)
Big - Oh: in the above graph, after a particular input value n0, C g(n) is always greater than f(n),
which indicates the algorithm's upper bound.
Big - Omega: in the above graph, after a particular input value n0, C g(n) is always less than f(n),
which indicates the algorithm's lower bound.
Example
Consider the following f(n) and g(n)...
f(n) = 3n + 2
g(n) = n
If we want to represent f(n) as Ω(g(n)) then it must satisfy f(n) >= C g(n) for some constant C > 0 and
all n >= n0, where n0 >= 1.
For example, with C = 1 the condition 3n + 2 >= n holds for all n >= 1, so 3n + 2 = Ω(n).
Big - Theta: in the above graph, after a particular input value n0, C1 g(n) is always less than f(n) and
C2 g(n) is always greater than f(n), which indicates the algorithm's tight bound.
Example
Consider the following f(n) and g(n)...
f(n) = 3n + 2
g(n) = n
If we want to represent f(n) as Θ(g(n)) then it must satisfy C1 g(n) <= f(n) <= C2 g(n) for some
constants C1 > 0, C2 > 0 and all n >= n0, where n0 >= 1.
C1 g(n) <= f(n) <= C2 g(n)
⇒ C1 n <= 3n + 2 <= C2 n
The above condition is TRUE for C1 = 1, C2 = 4 and all n >= 2.
By using Big - Theta notation we can represent the time complexity as follows...
3n + 2 = Θ(n)
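The constants in this example can be sanity-checked numerically. The sketch below is not a proof, since
it only tests a finite range of n. It checks that C1 g(n) <= f(n) <= C2 g(n) holds for f(n) = 3n + 2
and g(n) = n over that range. The function name theta_bounds_hold and the parameter n_max are
assumptions introduced for this illustration.

```c
/* Returns 1 if c1*g(n) <= f(n) <= c2*g(n) for every n in [n0, n_max],
   with f(n) = 3n + 2 and g(n) = n; returns 0 otherwise. */
int theta_bounds_hold(long c1, long c2, long n0, long n_max)
{
    for (long n = n0; n <= n_max; n++) {
        long f = 3 * n + 2;      /* f(n) = 3n + 2 */
        long g = n;              /* g(n) = n      */
        if (c1 * g > f || f > c2 * g)
            return 0;            /* one of the two bounds is violated */
    }
    return 1;
}
```

With C1 = 1, C2 = 4 and n0 = 2 the check passes for every n tested. Starting from n0 = 1 it fails,
because f(1) = 5 is greater than 4 g(1) = 4, which is why n0 = 2 is needed for these constants.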
An AVL tree is a balanced binary search tree. In an AVL tree, the balance factor of every node is
either -1, 0 or +1.
The balance factor of a node is the difference between the heights of the left and right subtrees of
that node. It is calculated either as height of left subtree - height of
right subtree (OR) as height of right subtree - height of left subtree, provided the same convention
is used for every node.
The above tree is a binary search tree and every node satisfies the balance factor condition, so
it is an AVL tree.
Every AVL tree is a binary search tree, but not every binary search tree is an AVL
tree.
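The balance factor condition can be checked with a short sketch. The code below assumes the
convention balance factor = height of left subtree - height of right subtree, and takes the height of
an empty subtree as 0 and of a single node as 1. The struct layout and function names are hypothetical.
Only the balance condition is checked, not the binary search tree ordering.

```c
#include <stddef.h>

struct Node {
    int key;
    struct Node *left, *right;
};

/* Height of a subtree: 0 for an empty tree, 1 for a single node. */
int height(const struct Node *t)
{
    if (t == NULL)
        return 0;
    int hl = height(t->left);
    int hr = height(t->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Balance factor = height of left subtree - height of right subtree. */
int balance_factor(const struct Node *t)
{
    return height(t->left) - height(t->right);
}

/* Returns 1 if every node has a balance factor of -1, 0 or +1. */
int is_balanced(const struct Node *t)
{
    if (t == NULL)
        return 1;
    int bf = balance_factor(t);
    return bf >= -1 && bf <= 1
        && is_balanced(t->left) && is_balanced(t->right);
}
```

A chain 1 -> 2 -> 3 built only from right children has a balance factor of -2 at the node 1, so it is a
binary search tree but not an AVL tree. Recomputing heights on every call is simple but slow; practical
implementations usually store the height in each node.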
In AVL tree, after performing operations like insertion and deletion we need to check
the balance factor of every node in the tree. If every node satisfies the balance factor
condition then we conclude the operation otherwise we must make it balanced. Whenever the
tree becomes imbalanced due to any operation we use rotation operations to make the tree
balanced.
Rotation is the process of moving nodes either to left or to right to make the tree
balanced.
There are four rotations and they are classified into two types: single rotations (LL Rotation and RR
Rotation) and double rotations (LR Rotation and RL Rotation).
In LL Rotation, every node moves one position to the left from its current position. To understand
LL Rotation, let us consider the following insertion operation in AVL Tree...
In RR Rotation, every node moves one position to the right from its current position. To
understand RR Rotation, let us consider the following insertion operation in AVL Tree...
The LR Rotation is a sequence of a single left rotation followed by a single right rotation. In LR
Rotation, every node first moves one position to the left and then one position to the right from its
current position. To understand LR Rotation, let us consider the following insertion operation
in AVL Tree...
The RL Rotation is a sequence of a single right rotation followed by a single left rotation. In RL
Rotation, every node first moves one position to the right and then one position to the left from its
current position. To understand RL Rotation, let us consider the following insertion operation
in AVL Tree...
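The four rotations can be sketched as pointer manipulations. The function names below are assumptions
made for this illustration: rotate_left moves the nodes one position to the left (the movement these
notes describe for LL Rotation), rotate_right moves them one position to the right (RR Rotation), and
the two double rotations are built from the single ones. Height bookkeeping is left out to keep the
sketch short.

```c
#include <stddef.h>

struct Node {
    int key;
    struct Node *left, *right;
};

/* Single left rotation: the right child y of x becomes the subtree root. */
struct Node *rotate_left(struct Node *x)
{
    struct Node *y = x->right;
    x->right = y->left;          /* y's left subtree is re-attached under x */
    y->left = x;
    return y;                    /* new root of this subtree */
}

/* Single right rotation: the left child y of x becomes the subtree root. */
struct Node *rotate_right(struct Node *x)
{
    struct Node *y = x->left;
    x->left = y->right;          /* y's right subtree is re-attached under x */
    y->right = x;
    return y;
}

/* LR Rotation: left rotation on the left child, then right rotation on x. */
struct Node *rotate_left_right(struct Node *x)
{
    x->left = rotate_left(x->left);
    return rotate_right(x);
}

/* RL Rotation: right rotation on the right child, then left rotation on x. */
struct Node *rotate_right_left(struct Node *x)
{
    x->right = rotate_right(x->right);
    return rotate_left(x);
}
```

Each function returns the new root of the rotated subtree, so the caller must store the result back
into the parent's child pointer (or into the tree's root pointer).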
Operations on an AVL Tree
The following operations are performed on an AVL tree...
1. Search
2. Insertion
3. Deletion
In an AVL tree, the search operation is performed with O(log n) time complexity. The search
operation in the AVL tree is similar to the search operation in a Binary search tree. We use the
following steps to search for an element in an AVL tree...
Step 2 - Compare the search element with the value of the root node in the tree.
Step 3 - If both match, then display "Given node is found!!!" and terminate the
function.
Step 4 - If they do not match, then check whether the search element is smaller or
larger than that node's value.
Step 5 - If the search element is smaller, then continue the search process in the left subtree.
Step 6 - If the search element is larger, then continue the search process in the right subtree.
Step 7 - Repeat the same until we find the exact element or until the search element is
compared with a leaf node.
Step 9 - If we reach a leaf node and it also does not match the search
element, then display "Element is not found" and terminate the function.
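The search steps can be sketched as a loop that walks down from the root. This is a minimal
illustration with a hypothetical node layout and function name. Instead of displaying a message, it
returns a pointer to the matching node, or NULL when the element is not found.

```c
#include <stddef.h>

struct Node {
    int key;
    struct Node *left, *right;
};

/* Returns the node holding key, or NULL if key is not in the tree. */
const struct Node *search(const struct Node *root, int key)
{
    const struct Node *cur = root;
    while (cur != NULL) {
        if (key == cur->key)
            return cur;                        /* element found */
        /* smaller keys are in the left subtree, larger in the right */
        cur = (key < cur->key) ? cur->left : cur->right;
    }
    return NULL;                               /* walked past a leaf: not found */
}
```

Each iteration moves one level down the tree, so the number of comparisons is bounded by the height of
the tree, which is O(log n) for an AVL tree.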
In an AVL tree, the insertion operation is performed with O(log n) time complexity. In AVL
Tree, a new node is always inserted as a leaf node. The insertion operation is performed as
follows...
Step 1 - Insert the new element into the tree using Binary Search Tree insertion logic.
Step 2 - After insertion, check the Balance Factor of every node.
Step 3 - If the Balance Factor of every node is 0 or 1 or -1 then go for the next operation.
Step 4 - If the Balance Factor of any node is other than 0 or 1 or -1 then the tree is
said to be imbalanced. In this case, perform a suitable Rotation to make it balanced and
go for the next operation.
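The insertion steps can be sketched recursively. The code below is one possible arrangement, not the
only one: it stores a height in every node (1 for a single node), computes the balance factor as
height of left subtree - height of right subtree, ignores duplicate keys, and rebalances on the way
back up from the inserted leaf. All names are hypothetical.

```c
#include <stdlib.h>

struct Node {
    int key, height;                 /* height of a single node is 1 */
    struct Node *left, *right;
};

int height(const struct Node *t) { return t ? t->height : 0; }

void update(struct Node *t)
{
    int hl = height(t->left), hr = height(t->right);
    t->height = 1 + (hl > hr ? hl : hr);
}

int balance(const struct Node *t) { return height(t->left) - height(t->right); }

struct Node *rotate_right(struct Node *x)
{
    struct Node *y = x->left;
    x->left = y->right;
    y->right = x;
    update(x);                       /* x is now below y, so update it first */
    update(y);
    return y;
}

struct Node *rotate_left(struct Node *x)
{
    struct Node *y = x->right;
    x->right = y->left;
    y->left = x;
    update(x);
    update(y);
    return y;
}

/* Inserts key and returns the (possibly new) root of the subtree. */
struct Node *avl_insert(struct Node *t, int key)
{
    if (t == NULL) {                 /* Step 1: BST insertion as a new leaf */
        struct Node *n = malloc(sizeof *n);
        if (n == NULL)
            abort();
        n->key = key;
        n->height = 1;
        n->left = n->right = NULL;
        return n;
    }
    if (key < t->key)
        t->left = avl_insert(t->left, key);
    else if (key > t->key)
        t->right = avl_insert(t->right, key);
    else
        return t;                    /* duplicate key: nothing to insert */

    update(t);                       /* Step 2: check the balance factor */
    int bf = balance(t);
    if (bf > 1) {                    /* left side too tall */
        if (balance(t->left) < 0)
            t->left = rotate_left(t->left);     /* double rotation case */
        return rotate_right(t);
    }
    if (bf < -1) {                   /* right side too tall */
        if (balance(t->right) > 0)
            t->right = rotate_right(t->right);  /* double rotation case */
        return rotate_left(t);
    }
    return t;                        /* Step 3: already balanced */
}
```

Inserting the keys 1 to 7 in increasing order into an empty tree yields a tree of height 3 with 4 at
the root, whereas a plain binary search tree would degenerate into a chain of height 7.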
The deletion operation in an AVL tree is similar to the deletion operation in a BST. But after every
deletion operation, we need to check the Balance Factor condition. If the tree is balanced
after deletion, go for the next operation; otherwise perform a suitable rotation to make the tree
balanced.
Deletion in an AVL Tree
If the node to be deleted is a leaf node, it is simply removed from the tree.
If the node to be deleted has one child node, the node is simply replaced by its
child.
If the node to be deleted has two child nodes then,
o Either replace the node with its inorder predecessor, i.e., the largest element
of the left subtree.
o Or replace the node with its inorder successor, i.e., the smallest element of
the right subtree.
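The three deletion cases can be sketched as follows. This is a simplified illustration under several
assumptions: nodes are allocated with malloc, each node stores its height, the balance factor is left
height minus right height, and a node with two children is replaced by its inorder successor. The
names avl_delete, rebalance and new_node are hypothetical.

```c
#include <stdlib.h>

struct Node {
    int key, height;
    struct Node *left, *right;
};

int height(const struct Node *t) { return t ? t->height : 0; }

void update(struct Node *t)
{
    int hl = height(t->left), hr = height(t->right);
    t->height = 1 + (hl > hr ? hl : hr);
}

int balance(const struct Node *t) { return height(t->left) - height(t->right); }

struct Node *rotate_right(struct Node *x)
{
    struct Node *y = x->left;
    x->left = y->right; y->right = x;
    update(x); update(y);
    return y;
}

struct Node *rotate_left(struct Node *x)
{
    struct Node *y = x->right;
    x->right = y->left; y->left = x;
    update(x); update(y);
    return y;
}

/* Restores the balance factor condition at t after a deletion below it. */
struct Node *rebalance(struct Node *t)
{
    update(t);
    if (balance(t) > 1) {
        if (balance(t->left) < 0) t->left = rotate_left(t->left);
        return rotate_right(t);
    }
    if (balance(t) < -1) {
        if (balance(t->right) > 0) t->right = rotate_right(t->right);
        return rotate_left(t);
    }
    return t;
}

struct Node *avl_delete(struct Node *t, int key)
{
    if (t == NULL) return NULL;                    /* key not present */
    if (key < t->key) {
        t->left = avl_delete(t->left, key);
    } else if (key > t->key) {
        t->right = avl_delete(t->right, key);
    } else if (t->left == NULL || t->right == NULL) {
        /* leaf or one child: the node is replaced by its child (or NULL) */
        struct Node *child = t->left ? t->left : t->right;
        free(t);
        return child;
    } else {
        /* two children: copy the inorder successor, then delete it */
        struct Node *s = t->right;
        while (s->left != NULL) s = s->left;
        t->key = s->key;
        t->right = avl_delete(t->right, s->key);
    }
    return rebalance(t);
}

/* Helper for building small trees by hand. */
struct Node *new_node(int key, struct Node *l, struct Node *r)
{
    struct Node *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->key = key; n->left = l; n->right = r;
    update(n);
    return n;
}
```

As a usage example, take the tree with root 2, left child 1, right child 3 and 4 as the right child
of 3. Deleting 1 leaves node 2 with a balance factor of -2, so a single left rotation makes 3 the new
root with children 2 and 4.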
Let us consider the same example as above with some additional elements as shown. We can
see in the image the balance factor of each node in the tree.
Step 2:
Step 3:
Now the next element to be deleted is 12.
If we observe, we can see that the node 12 has a left subtree and a right subtree.
We can again replace the node by either its inorder successor or inorder
predecessor.
In this case we have replaced it by the inorder successor.
Step 4:
The node 12 is deleted from the tree.
Since we have replaced the node with the inorder successor, the tree structure
looks like shown in the image.
After removal and replacing check for the balance factor of each node of the tree.
Step 7:
In order to balance the tree, we identify the rotation mechanism to be applied.
Here we need to use LL Rotation.
The nodes involved in the rotation are shown as follows.
Step 8:
The nodes are rotated and the tree satisfies the conditions of an AVL tree.
The final structure of the tree is shown as follows.
We can see that every node has a balance factor of 0, 1 or -1.
o Most in-memory sets and dictionaries are stored using AVL trees.
o Database applications, where insertions and deletions are less common but frequent data
lookups are necessary, also frequently employ AVL trees.
o In addition to database applications, it is employed in other applications that call for better
searching.
o A balanced binary search tree called an AVL tree uses rotation to keep things balanced.
o It may be used in games with plotlines as well.
o It is mostly utilized in business sectors where it is necessary to keep records on the employees
that work there and their shift changes.
B-Tree
B trees are extended binary search trees that are specialized in m-way searching, where 'm' is the
order of the B tree.
The order of a tree is defined as the maximum number of children a node can
accommodate. Because each node can have many children, the height of a B tree is relatively smaller
than the height of an AVL tree or RB tree.
They are a general form of a Binary Search Tree, as a node can hold more than one key and more than
two children.
The various properties of B trees include −
Every node in a B tree holds a maximum of m children and (m-1) keys, since the
order of the tree is m.
Every node in a B tree, except the root and the leaves, must have at least ⌈m/2⌉ children.
The root node must have at least two children, unless it is a leaf.
All the paths in a B tree must end at the same level, i.e. the leaf nodes must be at the
same level.
A B tree always maintains sorted data.
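The limits implied by these properties can be worked out for any order m. The helper functions below
are hypothetical and only restate the arithmetic: a node has at most m children and m-1 keys, and a
non-root internal node has at least ⌈m/2⌉ children, hence at least ⌈m/2⌉-1 keys.

```c
/* Maximum number of keys in a node of a B tree of order m. */
int btree_max_keys(int m)
{
    return m - 1;
}

/* Minimum number of children of a non-root internal node: ceil(m / 2). */
int btree_min_children(int m)
{
    return (m + 1) / 2;          /* integer form of ceil(m / 2) for m >= 1 */
}

/* Minimum number of keys in a non-root internal node. */
int btree_min_keys(int m)
{
    return btree_min_children(m) - 1;
}
```

For order m = 5 this gives at most 4 keys, at least 3 children and at least 2 keys per non-root
internal node. For order m = 3 the values are 2, 2 and 1.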
Note − A disk access is the memory access to the computer disk where the information is
stored and disk access time is the time taken by the system to access the disk memory.
The insertion operation for a B tree is similar to that of a binary search tree, but the elements
are inserted into the same node until the maximum number of keys is reached. The insertion is done
using the following procedure −
Step 2 − The data is inserted into the tree using the binary search insertion and once the keys
exceed the maximum number, the node is split in half: the median key moves up into the
parent (internal) node while the keys to its left and right become its children.
The keys, 5, 3, 21, 9, 13 are all added into the node according to the binary search property but
if we add the key 22, it will violate the maximum key property. Hence, the node is split in half,
the median key is shifted to the parent node and the insertion is then continued.
Another hiccup occurs during the insertion of 11, so the node is split and median is shifted to
the parent.
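The split step can be sketched on its own, separately from the rest of the insertion logic. The
function below is a hypothetical helper: it takes the sorted keys of a node that has just overflowed,
copies the keys before the median into left and the keys after it into right, and returns the median
that moves up to the parent. When the number of keys is even, it picks the upper of the two middle
keys, which is an arbitrary choice made for this sketch.

```c
#include <assert.h>

/* Splits n sorted keys around the median.
   left receives keys[0..mid-1], right receives keys[mid+1..n-1],
   and the median keys[mid] is returned to be moved to the parent.
   Both output arrays must have room for at least n / 2 keys. */
int split_keys(const int keys[], int n, int left[], int right[])
{
    assert(n >= 1);
    int mid = n / 2;
    for (int i = 0; i < mid; i++)
        left[i] = keys[i];
    for (int i = mid + 1; i < n; i++)
        right[i - mid - 1] = keys[i];
    return keys[mid];
}
```

For a B tree of order 3, a node overflows when it holds three keys. Splitting the keys 2, 4, 20
returns 4 as the median, with 2 in the left node and 20 in the right node.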
Operations on a B-Tree
The following operations are performed on a B-Tree...
1. Search
2. Insertion
3. Deletion
Example
Construct a B-Tree of Order 3 by inserting the keys 4, 2, 20, 10, 1, 14, 7, 11, 3 and 8 in that order.
Insert 4
Since 2<4, insert 2 to the left of 4 in the same node.
Since 20>4, insert 20 to the right of 4 in the same node. As we know, the maximum number of
keys in a node is 2, so one of these keys has to move to a node above and the node is split. 4,
being the middle element, moves up, and 2 and 20 become its left and right nodes
respectively.
10>4 and 10<20 and thus, 10 will be inserted as a key in the node that contains 20 as a key.
Since 1<2, it will be inserted as a key in the node that contains 2 as a key.
14>10 and 14<20, so 14 is inserted between 10 and 20. Since the number of keys in that node now
exceeds the maximum number of keys, the node splits and the middle key 14 moves up to the node above.
Thus, 14 gets added to the right of 4 in the node that contains 4, and 10 and 20 are split into 2
separate nodes.
Since 7<10, it gets inserted to the left in the node that contains 10 as a key.
11<14 and 11>10. Thus, 11 should get added to the right of 10 in the node that contains 7 and 10.
However, since the maximum number of keys in a node is 2, a split takes place.
Thus, the middle element 10 moves to the node above, and 7 and 11 split into separate nodes.
The node above now contains 4, 10 and 14. Since the count of keys exceeds the maximum
key count, there is a split there as well. Now, 10 is the root node with 4 and 14 as its
children.
Since 3<4 and 3>2, it gets inserted to the right of 2 in the node containing 1 and 2. This node now
exceeds the maximum count of keys in a node, leading to a split. The middle key 2 moves to the upper
node beside 4.
Since 8>7 and 8<10, it gets added to the right of 7 in the node that contains 7 as a key.
In this particular example, the number of comparisons at each step varied. The first value
was directly entered, thereafter every value had to be compared with the nodes present in
the tree.
The time complexity for insertion in a B tree depends on the height of the tree, which grows
logarithmically with the number of keys, and is thus O(log n).
maximum children = m = 5
Case 2c: If neither of the siblings has more than the minimum number of keys
required, then merge the target node with either the left or the right sibling along with
the parent key of the respective node.
2. If the target key is at the internal node:
If the target key is at an internal node, we further study the given data to check if any of the
following cases exist:
Case 1: If the left child has more than the minimum number of keys, the target key in
the internal node is replaced by its inorder predecessor, i.e., the largest element of the
left child node.
Case 2: If the right child has more than the minimum number of keys, the target key in
the internal node is replaced by its inorder successor, i.e., the smallest element of the
right child node.
Step 2:
Step 3:
The next element to be deleted from the tree is 53.
We can see from the image that the key exists in the leaf node.
Step 4:
Since the node in which the target key 53 exists has just the minimum number of keys,
we cannot delete it directly.
We check if the target node can borrow a key from its sibling nodes.
Since the target node doesn't have any right sibling, it borrows from the left sibling
node.
As we have studied above how the process of borrow and replace takes place, we apply
it to the given structure.
Step 6:
Now, since the target node has keys more than the minimum number of keys required,
the key can be deleted directly.
The tree structure after deletion is shown as follows.
Step 7:
The next element to be deleted is 89.
The target key lies within a leaf node as seen from the image.
Step 8:
Again, the target node holds just the minimum number of keys required and hence the
key cannot be deleted directly.
The target node now has to borrow a key from either of its siblings.
We check the left sibling; it also holds just the minimum number of keys required.
We check the right sibling node; it has one more than the minimum number of keys,
so the target node can borrow a key from it.
Step 10:
Now, as the target node has sufficient number of keys the target key can directly be
deleted from the target node.
The tree structure after deletion is shown as follows.
Step 11:
The next key to be deleted is 90.
The key exists within a leaf node as shown in the image.
We can see that the target node has just the minimum number of keys.
The target node has to borrow a key from either of its siblings.
Since each of the siblings has just the minimum number of keys, it cannot borrow
a key directly.
Step 13:
Since the target node cannot borrow from either of the siblings, we merge the target
node, one of its sibling nodes and the corresponding parent key.
The process of merging is shown as follows.
Step 14:
Since the target node now has sufficient number of keys, the target key 90 can be
deleted directly.
The tree structure after the deletion of the element is shown as follows.
Step 16:
When a key in an internal node is to be deleted, we replace the key with its inorder
predecessor or inorder successor.
We can select either of the child nodes if they have a sufficient number of keys.
But as we can see, in this case the target internal node can only borrow from its right
child, i.e., the inorder successor.
The key 85 moves down to the child node; key 87 moves up to the parent node.
Now, as the target key has moved to the leaf node, it can simply be deleted from the leaf
node.
The final tree structure after the deletion of various keys, preserving the B-tree
properties, is shown as follows.
Application of B-Tree:
B-trees are commonly used in applications where large amounts of data need to be stored
and retrieved efficiently. Some of the specific applications of B-trees include:
Databases: B-trees are widely used in databases to store indexes that allow for efficient
searching and retrieval of data.
File systems: B-trees are used in file systems to organize and store files efficiently.
Operating systems: B-trees are used in operating systems to manage memory
efficiently.
Network routers: B-trees are used in network routers to efficiently route packets
through the network.
DNS servers: B-trees are used in Domain Name System (DNS) servers to store and
retrieve information about domain names.
Compiler symbol tables: B-trees are used in compilers to store symbol tables that allow
for efficient compilation of code.
Advantages of B-Tree:
Sequential Traversing: As the keys are kept in sorted order, the tree can be traversed
sequentially.
Minimize disk reads: It is a hierarchical structure and thus minimizes disk reads.
Partially full blocks: The B-tree has partially full blocks which speed up insertion and
deletion.
Disadvantages of B-Tree:
Complexity: B-trees can be complex to implement and can require a significant amount
of programming effort to create and maintain.
Overhead: B-trees can have significant overhead, both in terms of memory usage and
processing time. This is because B-trees require additional metadata to maintain the tree
structure and balance.
Not optimal for small data sets: B-trees are most effective for storing and retrieving
large amounts of data. For small data sets, other data structures may be more efficient.
Limited branching factor: The branching factor of a B-tree determines the number of
child nodes that each node can have. B-trees typically have a fixed branching factor,
which can limit their performance for certain types of data.