DAA - Unit 1
Unit 1
Introduction to Algorithms
Algorithms are step-by-step procedures or formulas for solving problems. They are essential
in computer science for developing software and systems. Key aspects of studying algorithms
include their efficiency and the methods used to analyze and improve them.
Fundamentals of Algorithmic Problem Solving
Algorithmic problem solving involves systematically breaking a problem down into smaller,
manageable parts, devising a plan to solve each part, and combining these solutions to solve
the overall problem. This process typically includes understanding the problem, designing an
algorithm, implementing it, and analyzing its correctness and efficiency. Key steps include:
Problem Definition: Clearly defining the problem and its constraints.
Algorithm Design: Creating a step-by-step plan or flowchart to solve the problem.
Implementation: Writing the algorithm in a programming language.
Testing and Debugging: Ensuring the algorithm works correctly for various inputs.
Optimization: Improving the algorithm for better performance.
Important Problem Types
Several common types of problems are frequently addressed by algorithms, including:
Sorting: Arranging data in a particular order (e.g., quicksort, mergesort).
Searching: Finding specific data within a dataset (e.g., binary search).
Graph Problems: Analyzing graphs and networks (e.g., Dijkstra's algorithm for
shortest paths).
Dynamic Programming: Solving complex problems by breaking them down into
simpler subproblems (e.g., Fibonacci sequence).
Greedy Algorithms: Making a series of choices that are locally optimal (e.g.,
Kruskal's algorithm for minimum spanning trees).
Analyzing Algorithms
Algorithm Analysis is the process of determining the computational complexity of an
algorithm. It involves measuring both time and space complexity to evaluate performance.
Complexity Analysis involves evaluating how the runtime or space requirements of an
algorithm grow as the input size increases.
1. Big O Notation (O): Represents the upper bound of the time complexity, providing an
asymptotic analysis.
2. Big Ω Notation (Ω): Represents the lower bound of the time complexity.
3. Big Θ Notation (Θ): Represents the tight bound, providing both the upper and lower
bounds.
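Stated formally (a standard formulation, with an illustrative example): f(n) = O(g(n)) if there
exist constants c > 0 and n₀ > 0 such that f(n) ≤ c·g(n) for all n ≥ n₀. Likewise, f(n) = Ω(g(n))
requires f(n) ≥ c·g(n) for all n ≥ n₀, and f(n) = Θ(g(n)) requires both bounds at once:
c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀. For example, 3n² + 5n = Θ(n²): take c₁ = 3, c₂ = 4,
and n₀ = 5, since 5n ≤ n² for all n ≥ 5.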
Complexity of Algorithms
1. Time Complexity: Describes how the execution time of an algorithm changes with the
size of the input. Common notations include O(n), O(log n), O(n^2), etc.
2. Space Complexity: Measures the amount of memory an algorithm uses relative to the
input size.
Growth of Functions
Growth of Functions helps us understand how algorithms perform as the input size grows.
Common growth rates include constant time (O(1)), logarithmic time (O(log n)), linear time
(O(n)), and polynomial time (O(n^k)).
Example
O(1): Accessing an element in an array.
O(log n): Binary search in a sorted array.
O(n): Iterating through an array.
O(n^2): Nested loops.
Analysis of a Simple For Loop
Consider the following simple for loop:
for (i = 1; i <= n; i++) {
    v[i] = v[i] + 1;
}
Analysis
1. Loop Execution Count
The loop variable i starts at 1 and increments by 1 until it exceeds n. Therefore, the
loop executes exactly n times. Each iteration of the loop performs a constant amount
of work, specifically updating the value of v[i].
2. Time Complexity
o Constant Time Operations: The operations inside the loop (i.e., v[i] = v[i] + 1)
are constant time operations, denoted as O(1), because they do not depend on
the size of n.
o Total Running Time: Since the loop runs n times and each iteration takes
constant time, the total running time of the loop is proportional to n.
Therefore, we express the time complexity of the loop as O(n). This notation
provides an upper bound on the running time, abstracting away constant
factors and lower-order terms.
Big-O Notation
Multiplicative Factor: The actual time taken might be expressed as a multiple of n.
For example, if the loop execution takes 100n instructions or 34n microseconds, the
Big-O notation abstracts away these constants. Thus, O(n) remains the appropriate
notation despite the specific constants.
Additive Factor: The loop may incur a constant startup time, such as an additional 3
microseconds. In Big-O notation, additive constants are also disregarded. So, a
running time of 34n + 3 microseconds still falls under O(n).
The time complexity of the loop is O(n).
The Big-O notation reflects that the loop’s running time grows linearly with n,
ignoring constant multiplicative and additive factors.
O(n) indicates linear growth in the number of operations relative to the input
size n.
The loop's linear running time can be simplified to O(n) in Big-O notation, which describes
its performance accurately for large input sizes.
Solving Recurrences
1. Substitution Method: The substitution method is a technique used to solve recurrence
relations, which are often encountered in analyzing the time complexity of recursive
algorithms. The method involves guessing the form of the solution and then using
mathematical induction to verify that the guess is correct.
Steps for the Substitution Method
1. Guess the Form of the Solution: Make an educated guess about the asymptotic form
of the solution to the recurrence relation. This guess is often based on the form of the
recurrence relation.
2. Prove the Guess by Induction: Use mathematical induction to prove that your guess
is correct. This involves showing that the guess holds for the base case and then
proving that if it holds for a certain value, it also holds for the next value.
Example: Solve the recurrence T(n) = 2T(n/2) + n.
Guess the solution: T(n) = O(n log n), i.e., T(n) ≤ Cn log n for some constant C > 0.
Inductive step: Assume the bound holds for n/2, so T(n/2) ≤ C(n/2) log(n/2). Substituting
into the recurrence:
T(n) ≤ 2 · C(n/2) log(n/2) + n
Simplify:
T(n) ≤ Cn log(n/2) + n
T(n) ≤ Cn(log n − log 2) + n
T(n) ≤ Cn log n − Cn log 2 + n
Taking logarithms base 2 (so log 2 = 1), the last two terms are −Cn + n, which is ≤ 0
whenever C ≥ 1. So for sufficiently large n, we can choose C such that T(n) ≤ Cn log n.
Thus, the guess T(n) = O(n log n) is correct.
The solution to the recurrence relation T(n) = 2T(n/2) + n is T(n) = O(n log n).
2. Iteration Method: Repeatedly expand the recurrence until a pattern emerges, then sum
the work. For example, for T(n) = T(n−1) + 1 with base case T(1) = 1:
T(n) = T(n−1) + 1 = T(n−2) + 2 = T(n−3) + 3 = …
After k expansions: T(n) = T(n−k) + k
Continue this process until you reach the base case T(1): with k = n−1,
T(n) = T(1) + (n−1) = n, so T(n) = O(n).
Merge Sort
Merge Sort is a divide-and-conquer sorting algorithm: it splits the array into two halves,
recursively sorts each half, and merges the two sorted halves into a single sorted array.
Key Properties:
1. Time Complexity: O(n log n) – Merge Sort divides the list into halves at each level,
which takes log n steps, and each level requires O(n) operations for merging.
2. Space Complexity: O(n) – Due to the need for additional storage for the temporary
arrays created during merging.
3. Stability: Merge Sort is a stable sorting algorithm, meaning it preserves the relative
order of elements with equal keys.
4. Recursion: Merge Sort is a recursive algorithm, breaking the array into progressively
smaller sub-arrays.
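The properties above can be made concrete with a short listing. A minimal C sketch of
Merge Sort (the function names mergeSort and merge, and the sample array, are illustrative):

#include <stdio.h>
#include <string.h>

/* Merge the two sorted halves arr[l..m] and arr[m+1..r]. */
void merge(int arr[], int l, int m, int r) {
    int n1 = m - l + 1, n2 = r - m;
    int L[n1], R[n2];                        /* temporary arrays: the O(n) extra space */
    memcpy(L, arr + l, n1 * sizeof(int));
    memcpy(R, arr + m + 1, n2 * sizeof(int));
    int i = 0, j = 0, k = l;
    while (i < n1 && j < n2)                 /* <= keeps the sort stable */
        arr[k++] = (L[i] <= R[j]) ? L[i++] : R[j++];
    while (i < n1) arr[k++] = L[i++];
    while (j < n2) arr[k++] = R[j++];
}

/* Recursively split, sort, and merge arr[l..r]. */
void mergeSort(int arr[], int l, int r) {
    if (l < r) {
        int m = l + (r - l) / 2;
        mergeSort(arr, l, m);
        mergeSort(arr, m + 1, r);
        merge(arr, l, m, r);
    }
}

int main(void) {
    int a[] = {38, 27, 43, 3, 9, 82, 10};
    int n = sizeof(a) / sizeof(a[0]);
    mergeSort(a, 0, n - 1);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 3 9 10 27 38 43 82 */
    printf("\n");
    return 0;
}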
Case Study: Application of Merge Sort
Scenario: Imagine an e-commerce company that processes thousands of customer orders each
day. These orders must be sorted by delivery time to ensure efficient dispatching and
delivery. Sorting the orders rapidly and efficiently is critical to maintaining customer
satisfaction and operational efficiency.
Problem: The company receives bulk orders at the start of each day. Each order contains
several attributes such as order ID, customer name, and most importantly, delivery time. To
optimize the delivery process, the orders need to be sorted by the delivery time before
dispatch.
Solution: The company uses Merge Sort to handle the sorting of orders, as it can efficiently
handle large datasets due to its O(n log n) time complexity.
Steps for Implementation:
1. Dividing the Orders: The list of orders is divided into two halves recursively until each
list contains only one order.
2. Sorting and Merging: The smaller sublists are then merged back together. While
merging, orders are compared based on delivery time, ensuring that the earlier
deliveries are processed first.
3. Result: The final merged list is a sorted list of orders by delivery time, ready for
dispatch.
Why Merge Sort was chosen:
Efficient for large datasets: Merge Sort’s time complexity of O(n log n) is optimal for
sorting large volumes of orders.
Stability: Merge Sort preserves the relative order of orders with the same delivery
time, which is crucial for maintaining consistent service.
Predictable performance: Merge Sort does not degrade to O(n^2) like some other
algorithms, making it a reliable choice for high-demand situations.
Outcome: The e-commerce company successfully improved its order processing efficiency,
leading to faster deliveries and higher customer satisfaction. The use of Merge Sort allowed
the company to sort large volumes of orders quickly and reliably, without compromising on
performance or accuracy.
Quick Sort
Quick Sort is an efficient, in-place, and comparison-based sorting algorithm. It follows the
divide-and-conquer paradigm by dividing the array into sub-arrays around a pivot element
and recursively sorting the sub-arrays. Quick Sort is known for its average-case time
complexity of O(n log n), making it one of the fastest sorting algorithms in practice,
especially for large datasets.
Key Concepts:
Divide-and-Conquer: Quick Sort divides the array into smaller sub-arrays, solves each
sub-array, and then combines the results.
Pivot: A key element around which the partitioning of the array is done. Various
strategies can be used to select the pivot:
o First element
o Last element
o Random element
o Median-of-three
Partitioning: The process of rearranging the array such that elements smaller than the
pivot go to the left of it, and elements larger go to the right.
Quick Sort Algorithm Steps:
1. Choose a Pivot: Select a pivot element from the array.
2. Partitioning: Reorder the array so that all elements with values less than the pivot
come before the pivot, and all elements with values greater come after it. The pivot
element is now in its correct sorted position.
3. Recursion: Recursively apply the above steps to the sub-arrays of elements with
smaller and greater values than the pivot.
4. Base Case: When the size of the sub-array becomes 1 or 0, recursion stops.
Pseudo Code:
def quick_sort(arr, low, high):
    if low < high:
        # pi is partitioning index, arr[pi] is now at the right place
        pi = partition(arr, low, high)
        quick_sort(arr, low, pi - 1)   # recursively sort elements before the pivot
        quick_sort(arr, pi + 1, high)  # recursively sort elements after the pivot
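The pseudocode leaves partition unspecified. Below is a minimal C sketch using the Lomuto
scheme with the last element as pivot (one of the pivot strategies listed above); the names
quickSort and partition mirror the pseudocode but are illustrative:

#include <stdio.h>

void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Lomuto partition: uses arr[high] as the pivot and returns its final index. */
int partition(int arr[], int low, int high) {
    int pivot = arr[high];
    int i = low - 1;                     /* boundary of the "less than pivot" region */
    for (int j = low; j < high; j++) {
        if (arr[j] < pivot)
            swap(&arr[++i], &arr[j]);
    }
    swap(&arr[i + 1], &arr[high]);       /* put the pivot into its sorted position */
    return i + 1;
}

void quickSort(int arr[], int low, int high) {
    if (low < high) {
        int pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);     /* left of pivot */
        quickSort(arr, pi + 1, high);    /* right of pivot */
    }
}

int main(void) {
    int a[] = {10, 7, 8, 9, 1, 5};
    int n = sizeof(a) / sizeof(a[0]);
    quickSort(a, 0, n - 1);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 1 5 7 8 9 10 */
    printf("\n");
    return 0;
}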
Binary Heap
A binary heap is a complete binary tree where:
Max-Heap: The value of each node is greater than or equal to the values of its
children, with the largest element at the root.
Min-Heap: The value of each node is smaller than or equal to the values of its
children, with the smallest element at the root.
Heap Sort uses the max-heap property to sort elements in ascending order (or min-heap for
descending order).
Heap Sort Algorithm Steps
1. Build a Max-Heap: Transform the input array into a max-heap using the heapify
process.
2. Extract Elements: Remove the largest element (root of the heap), swap it with the last
item in the heap, and reduce the size of the heap.
3. Heapify the Root: Rebalance the heap by performing heapify on the root.
4. Repeat: Continue this process until the entire array is sorted.
Heapify Process
Heapify is the process of ensuring that the subtree rooted at a given node maintains the heap
property. In a max-heap, this means making sure that a parent node is larger than both of its
children, swapping nodes if necessary, and recursively applying this process to affected
subtrees.
Heapify Pseudocode:
heapify(arr[], n, i):
    largest = i          // Initialize largest as root
    left = 2*i + 1       // Left child
    right = 2*i + 2      // Right child
    if left < n and arr[left] > arr[largest]: largest = left
    if right < n and arr[right] > arr[largest]: largest = right
    if largest != i:
        swap arr[i] and arr[largest], then heapify(arr, n, largest)
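Combining heapify with the extraction steps listed earlier, the following is a minimal C
sketch of the complete Heap Sort; the name heapSort and the sample array are illustrative:

#include <stdio.h>

void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Sift arr[i] down so the subtree rooted at i satisfies the max-heap property. */
void heapify(int arr[], int n, int i) {
    int largest = i;
    int left = 2 * i + 1, right = 2 * i + 2;
    if (left < n && arr[left] > arr[largest]) largest = left;
    if (right < n && arr[right] > arr[largest]) largest = right;
    if (largest != i) {
        swap(&arr[i], &arr[largest]);
        heapify(arr, n, largest);
    }
}

void heapSort(int arr[], int n) {
    /* Build a max-heap: heapify every internal node, bottom-up. */
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(arr, n, i);
    /* Repeatedly move the maximum (root) to the end and shrink the heap. */
    for (int i = n - 1; i > 0; i--) {
        swap(&arr[0], &arr[i]);
        heapify(arr, i, 0);
    }
}

int main(void) {
    int a[] = {12, 11, 13, 5, 6, 7};
    int n = sizeof(a) / sizeof(a[0]);
    heapSort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 5 6 7 11 12 13 */
    printf("\n");
    return 0;
}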
Space Complexity
Heap Sort is an in-place sorting algorithm, which means it requires a constant space
overhead, resulting in O(1) additional space.
Advantages
In-place sorting (O(1) space complexity).
Consistent O(n log n) time complexity.
No recursive function calls are required (unlike Merge Sort).
Disadvantages
Not a stable sort (relative order of equal elements may change).
Often slower in practice than Quick Sort, especially on smaller datasets, due to
larger constant factors and the poor cache locality of the heapify process.
Case Study: Applying Heap Sort in Task Scheduling
Background
A software development team faces the challenge of efficiently scheduling a set of tasks with
varying priorities. Each task has a priority value, and higher priority tasks must be executed
first. The team aims to minimize idle time while ensuring that tasks are completed in the
correct order.
Problem
The team has a list of tasks, each represented as a tuple (task_name, priority_value). They
need an algorithm that can efficiently order tasks so that the highest-priority task is executed
first. Additionally, the process of updating priorities and reordering tasks should be efficient.
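One approach that fits these requirements is a max-heap used as a priority queue: inserting a
task and removing the highest-priority task both take O(log n). A minimal C sketch (the Task
struct and the names pushTask and popTask are illustrative, not from the original notes):

#include <stdio.h>
#include <string.h>

#define MAX_TASKS 100

typedef struct {
    char name[32];
    int priority;          /* higher value = execute first */
} Task;

Task heap[MAX_TASKS];
int taskCount = 0;

/* Insert a task at the end and sift it up to keep the max-heap property. */
void pushTask(const char *name, int priority) {
    int i = taskCount++;
    strncpy(heap[i].name, name, sizeof(heap[i].name) - 1);
    heap[i].name[sizeof(heap[i].name) - 1] = '\0';
    heap[i].priority = priority;
    while (i > 0 && heap[(i - 1) / 2].priority < heap[i].priority) {
        Task t = heap[i]; heap[i] = heap[(i - 1) / 2]; heap[(i - 1) / 2] = t;
        i = (i - 1) / 2;
    }
}

/* Remove and return the highest-priority task (the root), sifting down to repair the heap. */
Task popTask(void) {
    Task top = heap[0];
    heap[0] = heap[--taskCount];
    int i = 0;
    for (;;) {
        int largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < taskCount && heap[l].priority > heap[largest].priority) largest = l;
        if (r < taskCount && heap[r].priority > heap[largest].priority) largest = r;
        if (largest == i) break;
        Task t = heap[i]; heap[i] = heap[largest]; heap[largest] = t;
        i = largest;
    }
    return top;
}

int main(void) {
    pushTask("write tests", 2);
    pushTask("fix outage", 9);
    pushTask("code review", 5);
    while (taskCount > 0) {
        Task t = popTask();
        printf("%s (priority %d)\n", t.name, t.priority);   /* outage, review, tests */
    }
    return 0;
}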
Shell Sort
Shell Sort is an in-place comparison-based sorting algorithm that generalizes the Insertion
Sort algorithm by allowing the exchange of items that are far apart. It is named after its
inventor, Donald Shell, who introduced it in 1959. Shell Sort improves the efficiency of
Insertion Sort by breaking the array into smaller subarrays and then sorting these subarrays.
Working of Shell Sort:
Gap Sequence: Shell Sort starts by sorting pairs of elements far apart from each other,
then progressively reducing the gap between elements to be compared. A common
gap sequence is to halve the gap size each time, but other sequences like Knuth’s
sequence can also be used.
Subarray Sorting: The main idea is that the elements that are far apart can be
swapped, leading to faster elimination of disorder. Once the gap becomes 1, the
algorithm performs a regular insertion sort on the entire array, but at this point, the
array is already partially sorted, making this pass very efficient.
Algorithm Steps:
1. Initialize a gap sequence: Start with a gap larger than 1, typically half the length of
the list, and then reduce the gap in subsequent passes.
2. Sort the elements using Insertion Sort: For each gap, perform a gapped insertion sort.
3. Repeat until the gap becomes 1: When the gap is reduced to 1, the list is fully sorted.
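A minimal C sketch of these steps, using the halving gap sequence described above (the
name shellSort and the sample array are illustrative):

#include <stdio.h>

/* Shell Sort with the halving gap sequence: n/2, n/4, ..., 1. */
void shellSort(int arr[], int n) {
    for (int gap = n / 2; gap > 0; gap /= 2) {
        /* Gapped insertion sort: keep each gap-separated subsequence sorted. */
        for (int i = gap; i < n; i++) {
            int temp = arr[i];
            int j = i;
            while (j >= gap && arr[j - gap] > temp) {
                arr[j] = arr[j - gap];
                j -= gap;
            }
            arr[j] = temp;
        }
    }
}

int main(void) {
    int a[] = {23, 12, 1, 8, 34, 54, 2, 3};
    int n = sizeof(a) / sizeof(a[0]);
    shellSort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 1 2 3 8 12 23 34 54 */
    printf("\n");
    return 0;
}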
Time Complexity:
The time complexity of Shell Sort depends heavily on the gap sequence used. Good
sequences bring the running time well below O(n²); for example, Pratt's sequence
achieves O(n log² n) and Knuth's sequence O(n^(3/2)). The worst-case time
complexity is O(n²) for the original gap sequence proposed by Shell.
Advantages of Shell Sort:
Adaptive: Performs well when the input is already partially sorted.
In-place: Does not require any additional memory.
Versatile: Works well for small or medium-sized datasets.
Disadvantages of Shell Sort:
The performance of Shell Sort heavily depends on the choice of gap sequence, which
can be tricky to optimize.
Sorting in Linear Time
Sorting in linear time refers to algorithms that sort data with a time complexity of O(n),
where n is the number of elements. These algorithms are efficient for specific types of data
and constraints. Common linear time sorting algorithms include Counting Sort, Radix Sort,
and Bucket Sort.
Counting Sort
Counting Sort is a non-comparison-based sorting algorithm that operates on the principle of
counting the occurrences of each unique value in the input array. It is particularly efficient
when the range of input values is known and relatively small compared to the number of
elements.
Pseudo Code
CountingSort(A, B, k)
Input: Array A of size n, output array B of size n, maximum value k
Output: Array B sorted in non-decreasing order
    let C[0..k] be a new array, initialized to 0
    for j = 1 to n: C[A[j]] = C[A[j]] + 1          // count occurrences of each value
    for i = 1 to k: C[i] = C[i] + C[i-1]           // prefix sums: C[i] = number of elements <= i
    for j = n downto 1: B[C[A[j]]] = A[j]; C[A[j]] = C[A[j]] - 1   // stable placement, scanning backward
Radix Sort
Radix Sort is a non-comparative integer sorting algorithm that sorts numbers by processing
individual digits. It is particularly efficient when dealing with large numbers or datasets with
a fixed range of digits. Radix Sort processes numbers from the least significant digit (LSD) to
the most significant digit (MSD) or vice versa.
Key Concepts
Radix: The base of the number system. For example, base-10 for decimal numbers.
Stable Sort: Radix Sort uses a stable sorting algorithm (often Counting Sort) to sort digits,
which ensures that the relative order of elements with the same digit is preserved.
Steps of Radix Sort
Find Maximum Value: Determine the maximum number to find the number of digits in the
largest number.
Iterate Over Each Digit: Sort the numbers based on each digit, starting from the least
significant digit to the most significant digit (LSD to MSD) or vice versa.
Use a Stable Sorting Algorithm: Apply a stable sort (like Counting Sort) to each digit to
ensure that digits are sorted correctly while maintaining the relative order of numbers with
the same digit.
Example
Let's sort the following array of integers: [170, 45, 75, 90, 802, 24, 2, 66]
Step-by-Step Execution:
Pass 1 (sort by units digit): [170, 90, 802, 2, 24, 45, 75, 66]
Pass 2 (sort by tens digit): [802, 2, 24, 45, 66, 170, 75, 90]
Pass 3 (sort by hundreds digit): [2, 24, 45, 66, 75, 90, 170, 802]
Each pass uses a stable sort, so ties on the current digit preserve the order produced by the
previous pass.
Pseudo Code
RADIX_SORT(A, n)
// A is the array of integers to be sorted
// n is the number of elements in A
max_value = FIND_MAX_VALUE(A, n)
exp = 1 // Initialize the exponent for the LSD (1 for the unit place)
while max_value / exp > 0:
    COUNTING_SORT_BY_DIGIT(A, n, exp)   // stable sort on the digit (A[i] / exp) mod 10
    exp = exp * 10                      // move to the next more significant digit
Time Complexity: O(d * (n + k)), where d is the number of digits, n is the number of
elements, and k is the base of the number system.
Space Complexity: O(n + k), where k is the base of the number system.
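A minimal C sketch tying these pieces together: a stable counting sort on one digit, repeated
for increasing powers of 10 (the names radixSort and countSortByDigit are illustrative):

#include <stdio.h>
#include <string.h>

/* Stable counting sort of arr[] on the digit selected by exp (1, 10, 100, ...). */
void countSortByDigit(int arr[], int n, int exp) {
    int output[n];
    int count[10] = {0};
    for (int i = 0; i < n; i++)                  /* count occurrences of each digit */
        count[(arr[i] / exp) % 10]++;
    for (int d = 1; d < 10; d++)                 /* prefix sums give final positions */
        count[d] += count[d - 1];
    for (int i = n - 1; i >= 0; i--)             /* backward scan keeps the sort stable */
        output[--count[(arr[i] / exp) % 10]] = arr[i];
    memcpy(arr, output, n * sizeof(int));
}

void radixSort(int arr[], int n) {
    int max = arr[0];
    for (int i = 1; i < n; i++)
        if (arr[i] > max) max = arr[i];
    for (int exp = 1; max / exp > 0; exp *= 10)  /* one pass per digit, LSD first */
        countSortByDigit(arr, n, exp);
}

int main(void) {
    int a[] = {170, 45, 75, 90, 802, 24, 2, 66};
    int n = sizeof(a) / sizeof(a[0]);
    radixSort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 2 24 45 66 75 90 170 802 */
    printf("\n");
    return 0;
}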
Bucket Sort
Concept: Bucket Sort distributes elements into buckets based on a range of values, sorts each
bucket individually using another sorting algorithm (often Insertion Sort), and then
concatenates the sorted buckets.
Bucket Sort is a distribution sort algorithm that works by dividing the elements into several
buckets, sorting each bucket individually (using another sorting algorithm), and then
concatenating the results. It's particularly efficient when the input is uniformly distributed
over a range.
Steps of Bucket Sort
1. Initialization: Create k empty buckets. The number of buckets is usually determined
based on the range of input values and their distribution.
2. Distribution: Distribute the input elements into these buckets. Each bucket
corresponds to a specific range of values.
3. Sorting Buckets: Sort each bucket individually. This can be done using any sorting
algorithm, but Insertion Sort is commonly used for its simplicity when the bucket size
is small.
4. Concatenation: Merge the sorted buckets to form the final sorted array.
Pseudo Code
BUCKET_SORT(A, n, k)
    create k empty buckets
    for each x in A: append x to bucket[floor(k * x / (max + 1))]   // max = largest value in A
    sort each bucket individually (e.g., with Insertion Sort)
    concatenate buckets 0 .. k-1 to produce the sorted output
C Code: Counting Sort:
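A minimal C implementation following the CountingSort(A, B, k) pseudocode above; the
sample array, the 0-based loop bounds, and the helper name are illustrative:

#include <stdio.h>
#include <stdlib.h>

/* Counting Sort: A is the input of size n, B the output, k the maximum value. */
void countingSort(int A[], int B[], int n, int k) {
    int *C = (int *)calloc(k + 1, sizeof(int));   /* C[i] = count of value i */
    for (int j = 0; j < n; j++)
        C[A[j]]++;
    for (int i = 1; i <= k; i++)                  /* C[i] = number of elements <= i */
        C[i] += C[i - 1];
    for (int j = n - 1; j >= 0; j--)              /* backward scan keeps the sort stable */
        B[--C[A[j]]] = A[j];
    free(C);
}

int main(void) {
    int A[] = {4, 2, 2, 8, 3, 3, 1};
    int n = sizeof(A) / sizeof(A[0]);
    int B[7];
    countingSort(A, B, n, 8);
    for (int i = 0; i < n; i++) printf("%d ", B[i]);   /* 1 2 2 3 3 4 8 */
    printf("\n");
    return 0;
}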
C Code: Bucket Sort:
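A minimal C sketch of Bucket Sort; the variable names numBuckets, buckets, and
bucketSizes follow the original listing's bucket-allocation lines, while the uniform
bucket-index formula and the per-bucket Insertion Sort are assumptions consistent with the
steps above:

#include <stdio.h>
#include <stdlib.h>

void bucketSort(int arr[], int n, int numBuckets) {
    /* Find the maximum value to map elements onto bucket indices. */
    int max = arr[0];
    for (int i = 1; i < n; i++)
        if (arr[i] > max) max = arr[i];

    // Create buckets
    int **buckets = (int **)malloc(numBuckets * sizeof(int *));
    int *bucketSizes = (int *)calloc(numBuckets, sizeof(int));
    for (int b = 0; b < numBuckets; b++)
        buckets[b] = (int *)malloc(n * sizeof(int));

    /* Distribute: each bucket covers an equal slice of [0, max]. */
    for (int i = 0; i < n; i++) {
        int b = (int)((long long)arr[i] * numBuckets / (max + 1));
        buckets[b][bucketSizes[b]++] = arr[i];
    }

    /* Sort each bucket with Insertion Sort, then concatenate. */
    int k = 0;
    for (int b = 0; b < numBuckets; b++) {
        for (int i = 1; i < bucketSizes[b]; i++) {
            int key = buckets[b][i], j = i - 1;
            while (j >= 0 && buckets[b][j] > key) {
                buckets[b][j + 1] = buckets[b][j];
                j--;
            }
            buckets[b][j + 1] = key;
        }
        for (int i = 0; i < bucketSizes[b]; i++)
            arr[k++] = buckets[b][i];
        free(buckets[b]);
    }
    free(buckets);
    free(bucketSizes);
}

int main(void) {
    int a[] = {29, 25, 3, 49, 9, 37, 21, 43};
    int n = sizeof(a) / sizeof(a[0]);
    bucketSort(a, n, 4);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 3 9 21 25 29 37 43 49 */
    printf("\n");
    return 0;
}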
C Code: Max Heap Operations:
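A sketch of the declarations and supporting functions the menu program below relies on.
The names heapifyUp, heapifyDown, insert, deleteRoot, and printHeap match the
explanation at the end of this section, and the right-child check from the original listing is
folded into heapifyDown; MAX, the global size, and the swap helper are assumptions:

#include <stdio.h>
#include <stdlib.h>

#define MAX 100           // assumed capacity of the heap array
int size = 0;             // current number of elements in the heap

void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

// Restore the max-heap property upward from index i (used after insertion).
void heapifyUp(int arr[], int i) {
    while (i > 0 && arr[(i - 1) / 2] < arr[i]) {
        swap(&arr[(i - 1) / 2], &arr[i]);
        i = (i - 1) / 2;
    }
}

// Restore the max-heap property downward from index i (used after deletion).
void heapifyDown(int arr[], int i) {
    int largest = i;
    int left = 2 * i + 1;
    int right = 2 * i + 2;
    // Check if left child exists and is larger than the root
    if (left < size && arr[left] > arr[largest])
        largest = left;
    // Check if right child exists and is larger than the largest so far
    if (right < size && arr[right] > arr[largest])
        largest = right;
    if (largest != i) {
        swap(&arr[i], &arr[largest]);
        heapifyDown(arr, largest);
    }
}

// Add a new element at the end of the heap, then heapify up.
void insert(int arr[], int value) {
    if (size == MAX) { printf("Heap is full.\n"); return; }
    arr[size] = value;
    heapifyUp(arr, size);
    size++;
}

// Replace the root (maximum) with the last element, then heapify down.
void deleteRoot(int arr[]) {
    if (size == 0) { printf("Heap is empty.\n"); return; }
    arr[0] = arr[size - 1];
    size--;
    heapifyDown(arr, 0);
}

void printHeap(int arr[]) {
    for (int i = 0; i < size; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

int main() {
    int *arr = (int *)malloc(MAX * sizeof(int));
    int choice, value;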
    while (1) {
        printf("\nMax Heap Operations:\n");
        printf("1. Insert\n");
        printf("2. Delete Root\n");
        printf("3. Print Heap\n");
        printf("4. Exit\n");
        printf("Enter your choice: ");
        scanf("%d", &choice);
        switch (choice) {
            case 1:
                printf("Enter value to insert: ");
                scanf("%d", &value);
                insert(arr, value);
                printf("Value inserted.\n");
                break;
            case 2:
                deleteRoot(arr);
                printf("Root deleted.\n");
                break;
            case 3:
                printf("Current Heap: ");
                printHeap(arr);
                break;
            case 4:
                free(arr);
                exit(0);
            default:
                printf("Invalid choice! Please try again.\n");
        }
    }
    return 0;
}
Explanation
Heap Operations:
Insertion: Adds a new element to the heap by placing it at the end of the heap and
then heapifying up to maintain the max-heap property.
Deletion: Removes the root element (maximum value in the heap) by replacing it
with the last element and heapifying down to maintain the max-heap property.
Heapify Functions:
heapifyUp(): Used after insertion to maintain the heap property by comparing the
newly inserted element with its parent and swapping if necessary.
heapifyDown(): Used after deletion to restore the heap property by comparing
the current node with its children and swapping it with the largest child if
necessary.
Menu Options:
The user is prompted to choose from the following operations:
Insert: Allows the user to insert a new value into the heap.
Delete Root: Removes the maximum element (root) from the heap.
Print Heap: Displays the current state of the heap.
Exit: Exits the program.