Chap 5 - Sorting


CSE 221: Algorithms [ARN] Ariyan Hossain

Sorting Algorithms

Merge Sort:

This sort follows a divide-and-conquer approach. It divides the problem in half, recursively sorts the two subproblems, and then merges the results into a complete sorted sequence.

Divide:

Conquer & Combine:


Merging/Combining two sorted arrays A and B into another sorted array C in Θ(n):
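The Θ(n) merge described above can be sketched in Python (an illustrative sketch, not the course's exact code; the names A, B, C follow the text):

```python
def merge(A, B):
    """Merge two already-sorted arrays A and B into a new sorted array C in Theta(n)."""
    C = []
    i = j = 0
    # Repeatedly take the smaller front element; <= keeps the merge stable.
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            C.append(A[i])
            i += 1
        else:
            C.append(B[j])
            j += 1
    # One of the arrays is exhausted; append the leftovers of the other.
    C.extend(A[i:])
    C.extend(B[j:])
    return C
```

Each element of A and B is visited exactly once, which is where the Θ(n) bound comes from.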


Pseudo Code:
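The pseudo-code appears as an image in the original; a minimal Python sketch of the same divide-and-conquer structure, with the merge step inlined so the sketch is self-contained, might look like:

```python
def merge_sort(arr):
    """Recursively split in half, sort each half, then merge: O(n log n)."""
    if len(arr) <= 1:                 # base case: 0 or 1 elements are already sorted
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])      # conquer the left half
    right = merge_sort(arr[mid:])     # conquer the right half
    # combine: merge the two sorted halves
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:       # <= keeps the sort stable
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```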

Advantages:

●	Consistent time complexity O(n log n), making it efficient for large datasets
●	Stable sorting algorithm, preserving the relative order of equal elements

Disadvantages:

●	Requires additional memory space (out-of-place sorting) for the temporary arrays during the merging process, leading to higher space complexity O(n)
●	Slower for small datasets compared to simpler algorithms like insertion sort

Follow-up Question:

Q) What is the Recurrence Equation?

Ans: It depends on the number of subproblems, the size of each subproblem, and the work done at each step. Here, each step creates 2 subproblems, the size gets divided by 2 (n/2), and the work done at each step is n. T(n) = 2T(n/2) + n
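Unrolling this recurrence (a sketch, assuming n is a power of 2) shows where n log n comes from: every level of the recursion does n total work, and there are log₂ n levels:

```latex
T(n) = 2T(n/2) + n
     = 4T(n/4) + 2n
     = 8T(n/8) + 3n
     \;\vdots
     = 2^k\,T(n/2^k) + k\,n
     = n\,T(1) + n\log_2 n = \Theta(n \log n) \quad \text{(at } k = \log_2 n\text{)}
```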


Quick Sort:

This sort also follows a divide-and-conquer approach. It partitions the array into subarrays around a pivot x such that the elements in the lower subarray ≤ x ≤ the elements in the upper subarray, recursively sorts the two subarrays, and then concatenates the lower subarray, the pivot, and the upper subarray.

Divide / Partition:

Conquer & Combine:


Partitioning/Dividing an array A in Θ(n):
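One common way to realize this Θ(n) partition is the Lomuto scheme sketched below (an assumption, since the lecture's figure may use a different scheme such as Hoare's; the contract is the same: everything left of the returned index is ≤ the pivot, everything right of it is ≥):

```python
def partition(A, low, high):
    """Partition A[low..high] around pivot A[high]; return the pivot's final index."""
    pivot = A[high]
    i = low - 1                            # right edge of the "<= pivot" region
    for j in range(low, high):
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]        # grow the lower subarray
    A[i + 1], A[high] = A[high], A[i + 1]  # place the pivot between the two subarrays
    return i + 1
```

The single pass over A[low..high] is what makes the step Θ(n).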


Pseudo Code:
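The pseudo-code is an image in the original; an in-place Python sketch, assuming the last element is chosen as the pivot (the course may pick the first element instead), is:

```python
def quick_sort(A, low=0, high=None):
    """Sort A[low..high] in place by partitioning around a pivot, then recursing."""
    if high is None:
        high = len(A) - 1
    if low < high:
        # Partition around the last element as pivot (Lomuto scheme).
        pivot = A[high]
        i = low - 1
        for j in range(low, high):
            if A[j] <= pivot:
                i += 1
                A[i], A[j] = A[j], A[i]
        A[i + 1], A[high] = A[high], A[i + 1]
        p = i + 1                          # pivot is now in its final position
        quick_sort(A, low, p - 1)          # conquer the lower subarray
        quick_sort(A, p + 1, high)         # conquer the upper subarray
    return A
```

The combine step is free: once the pivot sits between the sorted subarrays, the whole range is sorted.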

Quicksort is unique because its speed depends on the pivot you choose.

Worst Case vs Average/Best Case:

The worst case happens when the pivot is the first or last element of an already sorted (ascending or descending) array. The result is that one of the partitions is always empty.


Worst Case:

There are O(n) levels, i.e., the height is O(n), and each level takes O(n) time. The entire algorithm will take O(n) * O(n) = O(n²) time.

Best Case:


There are O(log n) levels, i.e., the height is O(log n), and each level takes O(n) time. The entire algorithm will take O(n) * O(log n) = O(n log n) time.

The best case is also the average case. If you always choose a random element in the array as the pivot, quicksort will complete in O(n log n) time on average.

Advantages:

●	One of the fastest sorting algorithms in practice, O(n log n) on average, making it efficient for large datasets
●	Does not require additional memory space (in-place sorting)

Disadvantages:

●	O(n²) time complexity in the worst-case scenario
●	Unstable sorting algorithm, not preserving the relative order of equal elements

Follow-up Question:

Q) If quicksort is O(n log n) on average, but merge sort is O(n log n) always, why not use merge sort? Isn't it faster?

Ans: Even though both functions are the same speed in Big O notation, quicksort is faster in practice. When you write Big O notation like O(n), it really means O(c * n), where c is some fixed amount of time that your algorithm takes. Let's see an example:


Quicksort has a smaller constant than merge sort. So if they're both O(n log n) time, quicksort is faster. And quicksort is faster in practice because it hits the average case far more often than the worst case.

Q) What is the Recurrence Equation in the worst case?

Ans: It depends on the number of subproblems, the size of each subproblem, and the work done at each step. Here, each step creates 1 subproblem, the size decreases by 1 (n-1), and the work done at each step is n. T(n) = T(n-1) + n

Q) What is the Recurrence Equation in the best case?

Ans: It depends on the number of subproblems, the size of each subproblem, and the work done at each step. Here, each step creates 2 subproblems, the size gets divided by 2 (n/2), and the work done at each step is n. T(n) = 2T(n/2) + n

Q) How do you make sure your algorithm never reaches the worst case when choosing the 1st element as the pivot?

Ans: Use Randomized Quicksort. It is the same as the usual quicksort, but you swap the pivot with a randomly chosen element within the range before partitioning. This gives O(n log n) time on average, regardless of the input order.
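The randomized variant can be sketched by swapping a uniformly random element into the pivot slot before partitioning (an illustrative sketch; the partition scheme itself is an assumption):

```python
import random

def randomized_quick_sort(A, low=0, high=None):
    """Quicksort with a random pivot: O(n log n) expected time on any input order."""
    if high is None:
        high = len(A) - 1
    if low < high:
        # Swap a uniformly random element into the pivot slot, then partition as usual.
        r = random.randint(low, high)
        A[r], A[high] = A[high], A[r]
        pivot, i = A[high], low - 1
        for j in range(low, high):
            if A[j] <= pivot:
                i += 1
                A[i], A[j] = A[j], A[i]
        A[i + 1], A[high] = A[high], A[i + 1]
        p = i + 1
        randomized_quick_sort(A, low, p - 1)
        randomized_quick_sort(A, p + 1, high)
    return A
```

Because the pivot is random, no fixed input (such as an already sorted array) can force the empty-partition worst case every time.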

Heap Sort:

Heap Sort uses a data structure called the heap.

Heap Data Structure:

Recap:

●	In a Binary Tree, a parent can have a maximum of 2 children and a minimum of 0 children.
●	In a Complete Binary Tree, nodes are added to the tree in a left-to-right manner, not skipping any position. A parent can have 0, 1, or 2 children.


A Heap is an Abstract Data Type (ADT) for storing values. Its underlying data structure is an array.

A Heap has to be a complete binary tree, and it must satisfy the heap property.

Heap property:

●	The value of the parent must be greater than or equal to the values of the children (max-heap).
or
●	The value of the parent must be smaller than or equal to the values of the children (min-heap).

There are two types of heaps. The max-heap is the one mostly used (the default). A heap can be either a max-heap or a min-heap, but it can't be both at the same time.

The heap data structure provides worst-case O(1) time access to the largest (max-heap) or smallest (min-heap) element, and worst-case Θ(log n) time to extract the largest (max-heap) or smallest (min-heap) element.

Note: The tree view is used for easy tracing. While programming, the data structure is a simple array.

The benefit of using an array for a heap rather than a linked list is that arrays give you random access to their elements by index. You can pick any element from the array just by using the corresponding index, so finding a parent and its children is trivial. A linked list, in contrast, is sequential: you have to keep visiting elements until you find the element you are looking for. A linked list does not allow random access as an array does. Also, each linked-list node would need three (3) references to traverse the whole tree (parent, left, right).
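For example, with 0-based array indices (an assumption; 1-based presentations use i/2, 2i, and 2i+1 instead), a node's parent and children are found by simple arithmetic:

```python
def parent(i):
    return (i - 1) // 2   # index of node i's parent

def left(i):
    return 2 * i + 1      # index of node i's left child

def right(i):
    return 2 * i + 2      # index of node i's right child
```

This is why the array gives O(1) navigation that a linked list cannot match without extra pointers.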

Heap Operations:

●	Insert:

Inserts an element at the bottom of the heap. Then we must make sure that the heap property remains unchanged. When inserting an element into the heap, we start from the leftmost available position and move to the right.

Here, the heap property is kept intact. What if we want to insert 102 instead of 3? 102 would be added as a child of 5, but the heap property would be broken. Therefore, we need to put 102 in its correct position.

●	HeapIncreaseKey / Swim:

Let the new node be 'n' (in this case, the node that contains 102). Compare 'n' with its parent. If the parent is smaller than 'n' (n > parent), swap 'n' with the parent. Continue this process until 'n' is in its correct position.
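For a max-heap stored in a 0-indexed Python list, insert-then-swim can be sketched as (names are illustrative, not the course's exact code):

```python
def swim(heap, i):
    """Move heap[i] up toward the root until its parent is >= it (max-heap)."""
    while i > 0 and heap[(i - 1) // 2] < heap[i]:
        p = (i - 1) // 2
        heap[i], heap[p] = heap[p], heap[i]  # swap with the smaller parent
        i = p

def insert(heap, key):
    """Append at the bottom-left-most free slot, then swim: O(log n) worst case."""
    heap.append(key)
    swim(heap, len(heap) - 1)
```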


The best-case time complexity is O(1), when a key lands in its correct position on the first go. The worst case is when the newest node has to climb all the way up to the root:
O(1) [insertion] + O(log n) [swim] = O(log n)

●	Delete:

In a heap, you cannot just randomly delete an item. Deletion is done by replacing the root with the last element. The heap property will be broken, as a small value ends up at the top (root) of the heap. Therefore, we must put it in the right place.


Here, the root 102 will be replaced by the last element 5, and 102 will be removed. The heap property will be broken. Therefore, we need to put 5 in its correct position.

●	MaxHeapify / Sink:

Let the replaced node be 'n' (in this case, the node that contains 5). Compare 'n' with its children. If 'n' is smaller than any child (n < any child), swap 'n' with its largest child. Continue this process until 'n' is in its correct position.

The deleted element will always be the maximum element available in the max-heap. Time complexity is O(1) [deletion] + O(log n) [sink] = O(log n)
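For a max-heap in a 0-indexed Python list, delete-then-sink can be sketched as (names are illustrative):

```python
def sink(heap, i, n):
    """Move heap[i] down, swapping with its largest child, until the max-heap
    property holds again. n is the size of the active heap prefix."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and heap[left] > heap[largest]:
            largest = left
        if right < n and heap[right] > heap[largest]:
            largest = right
        if largest == i:          # both children are <= heap[i]: done
            return
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest

def delete_max(heap):
    """Replace the root with the last element, shrink, sink: O(log n)."""
    heap[0], heap[-1] = heap[-1], heap[0]
    largest = heap.pop()
    sink(heap, 0, len(heap))
    return largest
```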


●	Heap Sort:

Delete + sink all the nodes of the heap and store them in an array. The array will be sorted in descending order. Reversing the array gives a sorted array in ascending order.

Simulation:

Delete + sink takes O(log n), and for 'n' nodes, Heap Sort takes O(n log n).
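The delete + sink loop can be sketched in place: each deleted maximum is parked at the end of the same array, so the result comes out ascending directly instead of descending-then-reversed as in the text (an illustrative sketch that assumes the input is already a valid max-heap):

```python
def heap_sort(heap):
    """Repeatedly delete the max from a max-heap, parking each max at the
    end of the array; the array ends up sorted in ascending order."""
    n = len(heap)
    for end in range(n - 1, 0, -1):
        heap[0], heap[end] = heap[end], heap[0]   # delete: move max past the heap
        i, size = 0, end                          # sink the new root within heap[:end]
        while True:
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < size and heap[left] > heap[largest]:
                largest = left
            if right < size and heap[right] > heap[largest]:
                largest = right
            if largest == i:
                break
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest
    return heap
```

n deletions at O(log n) each give the O(n log n) total from the text, and no second output array is needed.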


●	Build Max Heap:

You are given an arbitrary array and asked to build a heap from it. Doing this by inserting the elements one at a time takes O(n log n). (A bottom-up heapify can build a heap in O(n), but that is a different construction.)
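Building by repeated insertion can be sketched as follows (each insert swims in O(log n), so n inserts cost O(n log n); names are illustrative):

```python
def build_max_heap(arr):
    """Build a max-heap from an arbitrary array by repeated insertion: O(n log n)."""
    heap = []
    for key in arr:
        heap.append(key)               # insert at the bottom-left-most free slot
        i = len(heap) - 1
        while i > 0 and heap[(i - 1) // 2] < heap[i]:
            p = (i - 1) // 2           # swim: swap with a smaller parent
            heap[i], heap[p] = heap[p], heap[i]
            i = p
    return heap
```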

Advantages:

●	Consistent time complexity O(n log n), making it efficient for large datasets
●	Does not require additional memory space (in-place sorting)
●	The underlying heap is often used as a priority queue

Disadvantages:

●	Unstable sorting algorithm, not preserving the relative order of equal elements

Count Sort:

Count sort, also known as counting sort, is a non-comparative integer sorting algorithm. This sorting technique doesn't sort by comparing elements, but by using a frequency array. It is efficient when the range of the input data (i.e., the difference between the maximum and minimum values) is not significantly greater than the number of elements to be sorted.

Step 1: Find out the maximum element from the given array.


Step 2: Initialize a countArray[] of length max+1 with all elements set to 0. This array will be used for storing the occurrences of the elements of the input array.

Step 3: In countArray[], store the count of each unique element of the input array at its respective index.

Step 4: Store the cumulative sums of the elements of countArray[] by doing countArray[i] = countArray[i - 1] + countArray[i].

Step 5: Iterate from the end of the input array (to preserve stability) and update outputArray[ countArray[ inputArray[i] ] - 1 ] = inputArray[i]; also update countArray[ inputArray[i] ] = countArray[ inputArray[i] ] - 1 (so that duplicate values are not overwritten).

For i=7

For i=6

...

For i=0

The outputArray will then hold the sorted array.


Pseudo Code:
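The pseudo-code is an image in the original; the five steps can be sketched in Python as follows (variable names follow the text; the sketch assumes non-negative integer input):

```python
def counting_sort(inputArray):
    """Stable counting sort for non-negative integers: O(n + k)."""
    maxElement = max(inputArray)                      # Step 1: find the maximum
    countArray = [0] * (maxElement + 1)               # Step 2: frequency array of zeros
    for value in inputArray:                          # Step 3: count occurrences
        countArray[value] += 1
    for i in range(1, maxElement + 1):                # Step 4: cumulative sums
        countArray[i] += countArray[i - 1]
    outputArray = [0] * len(inputArray)
    for i in range(len(inputArray) - 1, -1, -1):      # Step 5: right to left for stability
        outputArray[countArray[inputArray[i]] - 1] = inputArray[i]
        countArray[inputArray[i]] -= 1                # avoid overwriting duplicates
    return outputArray
```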

Advantages:

●	Consistent linear time complexity O(n+k)
●	Stable sorting algorithm, preserving the relative order of equal elements

Disadvantages:

●	Counting sort is inefficient if the range of values to be sorted is very large
●	Requires additional memory space (out-of-place sorting) for countArray and outputArray, leading to higher space complexity O(n+k)
●	Counting sort does not work on decimal values

Follow-Up Question:

Q) What is the Time Complexity?

Ans: The first loop takes O(k), the second loop takes O(n), the third loop takes O(k), and the last loop takes O(n). Hence, the time complexity is O(n+k).
