Unit-1 DAA
REGULATION: 2023
COMMON TO CSE , IT , CYBER SECURITY & AIDS
23CY4401 - DESIGN AND ANALYSIS OF ALGORITHMS
UNIT-I INTRODUCTION
MATERIAL
Algorithms: Definitions and notations: - Asymptotic notations and its properties – worst case, best case
and average case analysis; big oh, small oh, omega and theta notations; Recursive algorithms and non-
recursive, Mathematical analysis of non-recursive and recursive algorithms, solving recurrence
equations.
Analysis of Sorting and Searching: Heap sort, insertion sort; linear, binary and Interpolation Search,
Algorithm visualization Tool- Sorting.
The word "algorithm" comes from the name of the ninth-century mathematician Mohammed Ibn Musa Al Khowarizmi. An algorithm is simply a set of rules used to perform a computation or solve a problem.
1.1 Definition: An algorithm is a finite set of instructions that accomplishes a particular task.
Another definition: an algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for
obtaining a required output for any legitimate (genuine) input in a finite amount of time.
1.2 In addition, all algorithms must satisfy the following criteria (characteristics).
1. Input: Zero or more quantities are externally supplied. Consider a Fibonacci numbers program whose aim is to display the first ten Fibonacci numbers. No input is required, because the problem itself clearly fixes the count at ten; so zero items are required as input. Another problem is displaying a given count of even numbers: the user must supply how many evens are required, and based on that input the evens are displayed. So this problem requires one data item as input.
2. Output: At least one quantity is produced. In the case of the Fibonacci numbers program, after executing the program the first ten Fibonacci numbers are displayed as output. In the second case, based on the user input, the given count of even numbers is displayed; an input of a negative number is wrong, and the program should display a proper error message as output. So each program produces at least one output: an error message, or the outputs that show the requested numbers.
3. Definiteness: Each instruction is clear and unambiguous, i.e., each step must be easy to understand and must have only one interpretation.
4. Effectiveness: Each instruction must be very basic, so that it can be carried out, in principle, by a person using only pencil and paper. This criterion applies to both the Fibonacci and even-numbers programs. For example, if the user enters a negative value and the program contains an instruction such as
Go to ERROR
where no step labeled ERROR exists, that instruction is wrong; those kinds of instructions should not be there in an algorithm.
5. Finiteness: If we trace out the instructions of an algorithm, then for all cases the algorithm must terminate after a finite number of steps. Both the Fibonacci and even-numbers problems must be solved in some finite number of steps. For example, displaying the Fibonacci series continuously without termination violates finiteness and leads to abnormal termination.
1.3 Notations:
Algorithms can be expressed in various forms, including natural language, pseudocode, flowcharts, and programming languages. The efficiency of an algorithm is evaluated in terms of its time and space complexity, which are typically analyzed using asymptotic notations such as Big O, Omega, and Theta.
1.4 Performance Analysis:
Performance analysis determines the efficiency of an algorithm, i.e., how much computing time and storage the algorithm requires to run (or execute). This analysis helps in judging the value of one algorithm over another. Efficiency is measured in two ways:
1. Space complexity
2. Time complexity.
Space Complexity: The space complexity of an algorithm is the amount of memory it needs to run to completion. The space needed by an algorithm has the following components:
1. Instruction Space.
2. Data Space.
3. Environmental Stack Space.
Instruction Space: Instruction space is the space needed to store the compiled version of the program instructions. The amount of instruction space needed depends on factors such as:
i) The compiler used to compile the program into machine code.
ii) The compiler options in effect at the time of compilation.
iii) The target computer, i.e., the computer on which the algorithm runs.
Note that one compiler may produce less code than another when the same program is compiled by both.
Data Space: Data space is the space needed to store all constant and variable values. It has two components:
i) Space needed by constants and simple variables.
ii) Space needed by dynamically allocated objects such as arrays, structures, and class instances.
Environmental Stack Space: This is the space needed to save information required to resume execution of partially completed functions. Each time a function is invoked, data such as the return address and the values of the local variables and formal parameters are saved on the environmental stack. Environmental stack space is mainly used by recursive functions. Thus, the space requirement S(P) of any program P can be written as
S(P) = c + SP(instance characteristics)
where c is a constant. This equation shows that the total space needed by a program is divided into two parts:
- A fixed part (the constant c) that is independent of the instance characteristics: the instruction space plus the space for simple variables, constants, and outputs.
- A variable part SP that depends on the instance characteristics: this part includes dynamically allocated space and the recursion stack space.
Example 1:
Consider a simple algorithm that computes and returns the sum of three values X, Y, and Z. In this algorithm there are no instance characteristics, and the space needed by X, Y, and Z is constant, so
S(XYZ) = 3 + 0 = 3
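A minimal sketch of such an algorithm in Python (the function and variable names are illustrative, not from the original material):

def sum_xyz(x, y, z):
    # Three simple variables, one word of storage each: the fixed part
    # of the space requirement is 3 and there is no variable part,
    # so S(P) = 3 + 0 = 3, i.e., O(1) space.
    return x + y + z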
Time Complexity
The time complexity of an algorithm is the amount of computer time it needs to run to
completion. We can measure the time complexity of an algorithm using two approaches.
In a priori analysis, we analyze the behavior of the algorithm before it is executed. A priori analysis concentrates on determining the order of execution of statements.
In a posteriori analysis, we measure the execution time while the algorithm runs. A posteriori analysis gives accurate values, but it is very costly.
The time T(P) taken by a program P is the sum of its compile time and its run (execution) time. The compile time does not depend on the size of the input (the instance characteristics), so we confine ourselves to the run time, which does depend on them; this run time is denoted by tP(instance characteristics).
The following equation determines the number of additions, subtractions, multiplications, divisions, compares, loads, stores, and so on that would be made by the code for P:
tP(n) = ca ADD(n) + cs SUB(n) + cm MUL(n) + cd DIV(n) + …
where n denotes the instance characteristics; ca, cs, cm, cd, and so on denote the time needed for an addition, subtraction, multiplication, division, and so on; and ADD, SUB, MUL, DIV, and so on are functions whose values are the numbers of additions, subtractions, multiplications, and divisions performed. In practice, obtaining such an exact formula is an impossible task.
Another method is the step count. Using step counts, we can determine the number of steps needed by a program to solve a particular problem.
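As an illustrative sketch (not from the original material), step counts can be annotated directly on a simple Python routine that sums the n elements of a list:

def array_sum(a):
    s = 0            # 1 step
    for x in a:      # loop control executes n + 1 times
        s = s + x    # n steps
    return s         # 1 step
    # Total step count: 2n + 3, which grows linearly, i.e., O(n).

Counting steps this way is what the asymptotic notations below summarize.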
Asymptotic notations are used to express such step counts compactly. The commonly used notations are:
1. Big oh notation
2. Omega notation
3. Theta notation
2.1 Big oh notation
Big oh notation is denoted by 'O'. It is used to describe the efficiency of an algorithm and represents an upper bound on the algorithm's running time. Using Big O notation, we can give an upper bound on the running time as a function of the input size.
Definition: Let f(n) and g(n) be two non-negative functions. We say that f(n) is
O(g(n)) if and only if there exist positive constants c and n0 such that
f(n) ≤ c·g(n) for all n ≥ n0.
Example: Let f(n) = 2n⁴ + 5n² + 2n + 3. Then
f(n) < (2+5+2+3)n⁴
< 12n⁴, for all n > 1
This implies g(n) = n⁴, with c = 12 and n0 = 1.
Therefore f(n) = O(n⁴).
The above definition states that the function f is at most c times the function g when n is greater than or equal to n0. This notation provides an upper bound for the function f, i.e., the function g(n) is an upper bound on the value of f(n) for all n ≥ n0.
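As a quick numerical illustration (a sketch, not from the original material), the constants c = 12 and n0 = 1 from the example above can be checked directly in Python:

def f(n): return 2*n**4 + 5*n**2 + 2*n + 3
def g(n): return n**4

for n in [1, 2, 5, 10, 100]:
    assert f(n) <= 12 * g(n)   # holds for every n >= n0 = 1
print("f(n) = O(n^4), witnessed by c = 12, n0 = 1")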
2.2 Omega notation
Omega notation is denoted by 'Ω'. It is used to represent a lower bound on an algorithm's running time.
Definition: The function f(n) = Ω(g(n)) (read as "f of n is omega of g of n") if and only if there
exist positive constants c and n0 such that
f(n) ≥ c·g(n) for all n ≥ n0.
Example: Let f(n) = 2n⁴ + 5n² + 2n + 3. Then
f(n) > 2n⁴, for all n > 1
This implies g(n) = n⁴, with c = 2 and n0 = 1.
Therefore f(n) = Ω(n⁴).
2.3 Theta notation
The big theta notation is denoted by 'Θ'. It combines the upper bound and the lower bound of an algorithm's running time.
Definition: Let f(n) and g(n) be two non-negative functions. We say that f(n) is
Θ(g(n)) if and only if there exist positive constants c1, c2, and n0 such that
c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
The above definition states that the function f(n) lies between c1 times the function g(n)
and c2 times the function g(n), where c1 and c2 are positive constants.
This notation provides both lower and upper bounds for the function f(n), i.e., g(n) is both a
lower and an upper bound on the value of f(n) for large n. In other words, theta notation says that
f(n) is both O(g(n)) and Ω(g(n)) for all n ≥ n0. Thus f(n) = Θ(g(n)) if and only if g(n) is both an upper and a lower bound on f(n).
Example: [Figure: f(n) bounded between c1·g(n) below and c2·g(n) above for all n ≥ n0]
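Combining the Big oh and Omega examples above gives a worked Theta bound for the same function (added here for illustration):
2n⁴ ≤ 2n⁴ + 5n² + 2n + 3 ≤ 12n⁴ for all n ≥ 1,
so f(n) = Θ(n⁴) with c1 = 2, c2 = 12, and n0 = 1.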
The most common orders of growth are:
Constant − O(1)
Logarithmic − O(log n)
Linear − O(n)
Quadratic − O(n²)
Cubic − O(n³)
2.4 Properties of Asymptotic Notations
General Properties:
If f(n) is O(g(n)), then a*f(n) is also O(g(n)), where a is a constant.
Example: f(n) = 2n²+5 is O(n²)
then 7*f(n) = 7(2n²+5)
= 14n²+35 is also O(n²)
Similarly, this property is satisfied by both the Θ and Ω notations.
Worst case, best case and average case analysis
To illustrate worst-, best-, and average-case analysis, consider linear search. The linear search algorithm checks each element in a list sequentially until it finds the target element
or reaches the end of the list. The steps for performing a linear search are as follows (a runnable sketch follows the list):
1. Start at the first element of the list.
2. Compare the current element with the target element.
3. If they match, return the current index.
4. Otherwise, move to the next element.
5. Repeat steps 2-4 until either the target is found or all elements have been checked.
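A minimal Python sketch of these steps (illustrative, not from the original material):

def linear_search(a, target):
    for i in range(len(a)):
        if a[i] == target:
            return i   # best case: target at index 0, one comparison
    return -1          # worst case: target absent, all n elements checked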
Definition: The worst case scenario represents the maximum time required for an algorithm to
complete its task. Example for Linear Search:
Scenario: The target element is not present in the array or is located at the last position.
Steps: The algorithm compares the target with every one of the n elements, so the worst-case time is O(n).
Definition: The best case scenario describes the minimum time required for an algorithm to complete
its task. Example for Linear Search:
Scenario: The target element is located at the first position of the array.
Steps: The very first comparison succeeds, so only one comparison is made and the best-case time is O(1).
Definition: The average case scenario evaluates the expected performance of an algorithm across all
possible inputs. Example for Linear Search: on average, the target is found after examining about half of the elements (roughly n/2 comparisons), so the average case is O(n).
Analyzing algorithms through worst case, best case, and average case scenarios provides
valuable insights into their efficiency and reliability. For linear search, while it is straightforward
with clear complexities, understanding these distinctions helps in selecting appropriate algorithms
based on expected performance in various conditions.
The mathematical analysis of a non-recursive algorithm typically reduces to evaluating a sum over its loop iterations, such as the sum of 1 for i = 1 to n − 1. This is an easy sum to compute because it is nothing other than 1 repeated n − 1 times. Thus the sum equals n − 1, and the basic operation count is Θ(n).
Solving Recurrence Equations
In the context of Design and Analysis of Algorithms (DAA), recurrence relations are
often used to analyze the time complexity of algorithms by expressing the time
complexity of a problem of size n in terms of the time complexity of smaller instances of
the same problem.
Substitution Method
Guess the form of the solution.
Use mathematical induction to prove that the guess is correct.
For example, consider the recurrence
T(n) = T(n/2) + 1
Guess T(n) = O(log n) and assume T(n/2) ≤ c log(n/2). Now,
T(n) ≤ c log(n/2) + 1
= c log n − c log 2 + 1
≤ c log n, provided c log 2 ≥ 1
so T(n) = O(log n).
As a second example, consider the recurrence
T(n) = 2T(n/2) + n, n > 1
Solution: Guess T(n) = O(n log n) and assume T(n/2) ≤ c(n/2) log(n/2). Now,
T(n) ≤ 2c(n/2) log(n/2) + n
= cn log(n/2) + n
= cn log n − cn log 2 + n
= cn log n − n(c log 2 − 1)
≤ cn log n, provided c log 2 ≥ 1
Thus the solution obtained by the substitution method is T(n) = O(n log n). The space complexity is often of the same order unless additional space (like the recursion stack) needs to be considered separately.
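A quick numerical sketch (illustrative, not from the original material) that evaluates T(n) = 2T(n/2) + n with T(1) = 1 and checks it against the bound c·n·log n with c = 2:

import math

def T(n):
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n

for n in [2, 4, 16, 256, 4096]:
    bound = 2 * n * math.log2(n)
    assert T(n) <= bound   # consistent with T(n) = O(n log n)
    print(n, T(n), round(bound))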
Recursion Tree Method:
Represent the recurrence relation as a tree.
Analyze the total cost by summing up the costs of all levels of the tree.
In this method, we draw a recurrence tree and calculate the time taken by every level of
the tree.
Finally, we sum the work done at all levels.
To draw the recurrence tree, we start from the given recurrence and keep drawing till
we find a pattern among levels.
The pattern is typically an arithmetic or geometric series. For example, consider the recurrence relation
T(n) = T(n/4) + T(n/2) + cn²
Expanding the root:
                cn²
               /    \
          T(n/4)    T(n/2)
Expanding one more level:
                cn²
               /    \
        c(n²)/16    c(n²)/4
         /    \      /    \
    T(n/16) T(n/8) T(n/8) T(n/4)
Expanding again:
                cn²
               /    \
        c(n²)/16    c(n²)/4
         /    \      /    \
  c(n²)/256 c(n²)/64 c(n²)/64 c(n²)/16
     / \      / \      / \      / \
Summing level by level: the root contributes cn², the next level (1/16 + 1/4)cn² = (5/16)cn², the next (5/16)²cn², and so on. The total cost is the geometric series cn²(1 + 5/16 + (5/16)² + …) ≤ cn²·1/(1 − 5/16) = (16/11)cn², so T(n) = O(n²).
Space Complexity:
The same level-by-level reasoning applies to the recursion stack of QuickSort. The space complexity of QuickSort is influenced by the recursion stack: each level of recursion uses a constant amount of stack space, and there are about log n levels, so
S(n) = number of levels × space per level = log n × O(1) = O(log n)
The dominant term in the space complexity is log n, indicating logarithmic growth in space usage. So, for QuickSort's average case, the time complexity is O(n log n) and the space complexity is O(log n).
Iteration Method:
The iteration method expands the recurrence and expresses it as a summation of terms involving n and the initial condition.
Example 1:
Consider the recurrence
T(n) = 1, if n = 1
T(n) = 2T(n − 1), if n > 1
Solution:
T(n) = 2T(n − 1)
= 2[2T(n − 2)] = 2^2 T(n − 2)
= 2^2 [2T(n − 3)] = 2^3 T(n − 3)
= 2^3 [2T(n − 4)] = 2^4 T(n − 4) ….(Eq. 1)
Repeating the procedure i times gives
T(n) = 2^i T(n − i)
Put n − i = 1, i.e., i = n − 1, in (Eq. 1):
T(n) = 2^(n−1) T(1)
= 2^(n−1) · 1 {T(1) = 1, given}
= 2^(n−1)
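A short check (illustrative, not from the original material) that the closed form 2^(n−1) matches the recurrence:

def T(n):
    return 1 if n == 1 else 2 * T(n - 1)

for n in range(1, 11):
    assert T(n) == 2 ** (n - 1)
print("T(n) = 2^(n-1) verified for n = 1..10")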
Master Theorem:
The Master Theorem is a tool used in the analysis of algorithms, particularly those
that follow a divide-and-conquer approach.
It provides a straightforward way to determine the time complexity of recursive
algorithms by expressing their time recurrence relations in a specific form.
The general form of a recurrence relation that can be solved using the Master Theorem
is: T(n) = aT(n/b) + f(n)
Here:
o T(n) is the time complexity of the algorithm for a problem of size n,
o a is the number of subproblems at each recursion level,
o b is the factor by which the problem size is reduced in each recursive call,
o f(n) is the cost of dividing the problem and combining the results.
The Master Theorem has three cases, each of which provides a solution for the time complexity T(n):
Case 1: If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
Case 2: If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log n).
Case 3: If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n (the regularity condition), then T(n) = Θ(f(n)).
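As a standard worked example (added here for illustration), consider the merge-sort recurrence:
T(n) = 2T(n/2) + n, so a = 2, b = 2, f(n) = n, and n^(log_b a) = n^(log_2 2) = n.
Since f(n) = Θ(n^(log_b a)), Case 2 applies and T(n) = Θ(n log n).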
The Master Theorem provides a convenient way to analyze the time complexity of
many divide-and-conquer algorithms without going through the process of constructing
a recursion tree or using the substitution method. However, it is applicable to a specific
form of recurrence relations, and not all recurrence relations fit the Master Theorem's
framework. In such cases, other methods such as the recursion tree or substitution method must be used.
6.1 Heap Sort
Heap sort is a comparison-based sorting algorithm that uses a binary heap data structure.
6.1.1 Algorithm
Heap sort consists of two main phases: building a heap from the input data and then sorting the
heap.
1. Build a Max Heap: Convert the input array into a max heap. In a max heap, the parent
node is always greater than or equal to its child nodes.
2. Sorting:
Swap the root of the max heap (the largest element) with the last element of the
heap.
Reduce the size of the heap by one and call the max-heapify function on the root
to maintain the heap property. Repeat until only one element remains in the heap.
6.1.2 Pseudocode
HeapSort(A):
    BUILD-MAX-HEAP(A)
    for i = A.length downto 2:
        exchange A[1] with A[i]
        A.heap-size = A.heap-size - 1
        MAX-HEAPIFY(A, 1)
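A runnable Python sketch of the same procedure (illustrative; it uses 0-based indexing rather than the pseudocode's 1-based indexing):

def max_heapify(a, heap_size, i):
    # Sift a[i] down until the subtree rooted at i is a max heap.
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, heap_size, largest)

def heap_sort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # build the max heap
        max_heapify(a, n, i)
    for end in range(n - 1, 0, -1):       # move max to the end, shrink heap
        a[0], a[end] = a[end], a[0]
        max_heapify(a, end, 0)

data = [5, 1, 4, 2, 3]
heap_sort(data)
print(data)  # [1, 2, 3, 4, 5]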
6.1.3 Time Complexity Analysis
The time complexity of heap sort can be analyzed in three scenarios: best case, average case, and worst case. In all three it is O(n log n). Even in the best scenario, building a max heap takes O(n), and each extraction of the maximum element takes O(log n), for n elements. Similarly, the average- and worst-case scenarios do not exceed O(n log n) due to the nature of heap operations.
Space Complexity
Heap sort is an in-place sorting algorithm, which means it requires only a constant amount of
additional space. Therefore, its space complexity is O(1).
Characteristics of Heap Sort:
In-place: Requires only a constant amount of additional storage.
Not stable: The relative order of equal elements may not be preserved.
6.1.4 Applications
Heap sort is widely used in applications where memory usage is critical, such as embedded and real-time systems; the underlying heap structure also supports priority queues and the selection of the k largest or smallest elements of a collection.
6.2 Insertion Sort
Insertion sort builds the sorted array one element at a time by repeatedly inserting the current element into its correct position within the already-sorted portion.
6.2.1 Algorithm
The insertion sort algorithm follows these steps:
1. Start from the second element (index 1) since the first element is trivially sorted.
2. Compare the current element with the elements in the sorted sub-list (elements to its left).
3. Shift all larger elements in the sorted sub-list to the right to make space for the current
element.
4. Insert the current element into its correct position in the sorted sub-list.
5. Repeat until all elements are processed.
6.2.2 Pseudocode
InsertionSort(A):
    for i from 1 to length(A) - 1:
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:
            A[j + 1] = A[j]
            j = j - 1
        A[j + 1] = key
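The pseudocode translates almost directly into runnable Python (an illustrative sketch):

def insertion_sort(a):
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements of the sorted prefix one slot right.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key   # Insert key into its correct position.

data = [12, 11, 13, 5, 6]
insertion_sort(data)
print(data)  # [5, 6, 11, 12, 13]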
6.2.3 Time Complexity Analysis
The time complexity of insertion sort varies based on the arrangement of elements in the input array:
Best Case: O(n)
Occurs when the array is already sorted. The inner loop only runs once for each
element, making the overall time linear.
Average Case: O(n²)
On average, for each of the n elements, about half of them will need to be compared
and shifted, leading to quadratic complexity.
Worst Case: O(n²)
Happens when the array is sorted in reverse order. Every new element has to be
compared with all previously sorted elements, resulting in maximum shifts.
6.2.4 Space Complexity
Space Complexity: O(1)
Insertion sort is an in-place sorting algorithm, meaning it requires a constant amount of additional
memory space regardless of input size.
6.2.5 Characteristics of Insertion Sort
Stable: Maintains the relative order of equal elements.
In-place: Requires minimal additional storage.
Adaptive: Efficient for data that is already partially sorted.
6.2.6 Applications
Insertion sort is commonly used in various scenarios:
Sorting small datasets where its simplicity and efficiency can be advantageous.
As part of more complex algorithms like Timsort (used in Python's sort functions).
Situations where data is frequently added and requires maintaining a sorted order (e.g., online
sorting).
6.3 Linear, Binary and Interpolation Search
6.3.1 Linear Search
//linearSearch Function
int linearSearch(int data[], int length, int val) {
    for (int i = 0; i < length; i++) {
        if (data[i] == val)
            return i; //end if
    } //end for
    return -1; //Value was not in the list
} //end linearSearch Function
Worst Case:
The worst case for linear search occurs when the element to be found is not in the list
at all. The algorithm must then traverse the entire list and return -1 (no match). Thus
the worst-case running time is O(N).
Average Case:
On average, the target element will be found somewhere around the middle of the list,
after about N/2 comparisons. Dividing by a constant factor does not change the order of
growth, so the average case is also O(N).
Best Case:
The best case is reached when the element to be found is the first one in the list. No
traversal beyond the first element is needed, giving constant time complexity: O(1).
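For the binary search part of this topic, a minimal sketch (assuming a sorted array; illustrative, not from the original material) is shown below. Binary search repeatedly halves the search interval, giving O(log N) worst-case time and O(1) best case:

def binary_search(a, target):
    # Return the index of target in sorted list a, or -1 if absent.
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2
        if a[mid] == target:
            return mid
        elif a[mid] < target:
            low = mid + 1    # target lies in the upper half
        else:
            high = mid - 1   # target lies in the lower half
    return -1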
Interpolation Search
Interpolation search is an improvement over binary search for sorted arrays whose values are roughly uniformly distributed. Instead of always probing the middle element, it estimates the probable position of the target using the interpolation formula
pos = low + ((target − A[low]) × (high − low)) / (A[high] − A[low])
This formula allows the algorithm to skip over large sections of the array, making it more efficient
than linear search, especially when dealing with large datasets. The steps involved in the
algorithm are as follows (a runnable sketch follows the list):
1. Initialization: Set low to 0 and high to n − 1 (the last index of the array).
2. Probing: Calculate the probable position using the interpolation formula.
3. Comparison: Check if the value at this position matches the target:
If it does, return the index.
If the target is greater, adjust low to pos + 1.
If the target is smaller, adjust high to pos - 1.
4. Repeat until the target is found or low exceeds high.
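A Python sketch of interpolation search (illustrative, not from the original material):

def interpolation_search(a, target):
    # a must be sorted and, ideally, roughly uniformly distributed.
    low, high = 0, len(a) - 1
    while low <= high and a[low] <= target <= a[high]:
        if a[high] == a[low]:   # avoid division by zero
            return low if a[low] == target else -1
        # Probe position from the interpolation formula.
        pos = low + (target - a[low]) * (high - low) // (a[high] - a[low])
        if a[pos] == target:
            return pos
        elif a[pos] < target:
            low = pos + 1
        else:
            high = pos - 1
    return -1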
Complexity Analysis
Average Case: The average time complexity of interpolation search is O(log log N),
which is significantly better than linear search.
Worst Case: When the elements are not uniformly distributed, performance can
degrade to O(N), particularly if there are large gaps between values.
Advantages and Limitations
Advantages:
Efficiency: Particularly effective for large datasets with uniformly distributed values.
Reduced Comparisons: Can potentially reduce the number of comparisons needed
compared to binary search.
Limitations:
Uniform Distribution Requirement: The algorithm performs poorly on non-uniformly
distributed datasets, where it may even be slower than linear search.
Complexity in Implementation: The position calculation adds complexity compared
to simpler algorithms like binary search.
6.4 Algorithm Visualization Tool – Sorting
Algorithm visualization tools animate the step-by-step behavior of sorting algorithms, which helps in understanding and comparing them. For example:
Error Diagnosis: Visual tools can help in debugging and understanding why certain algorithms
perform poorly with specific data sets.