Chap 5 - Sorting


‭CSE 221: Algorithms [ARN] Ariyan Hossain‬

‭Sorting Algorithms‬

‭Merge Sort:‬

This sort follows the divide-and-conquer paradigm. It divides the problem in half, recursively sorts the two subproblems, and then merges the results into a complete sorted sequence.

‭Divide:‬

‭Conquer & Combine:‬


‭Merging/Combining two sorted arrays A and B into another sorted array C in Θ(n):‬


‭Pseudo Code:‬
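As a concrete stand-in for the pseudocode, here is a minimal Python sketch of merge sort (the function and variable names are illustrative, not the course's):

```python
def merge(A, B):
    """Merge two sorted lists A and B into one sorted list C in Theta(n)."""
    C = []
    i = j = 0
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:           # <= keeps the sort stable
            C.append(A[i]); i += 1
        else:
            C.append(B[j]); j += 1
    C.extend(A[i:])                # append whatever remains of A
    C.extend(B[j:])                # append whatever remains of B
    return C

def merge_sort(arr):
    if len(arr) <= 1:              # base case: 0 or 1 elements is already sorted
        return arr
    mid = len(arr) // 2            # divide in half
    return merge(merge_sort(arr[:mid]), merge_sort(arr[mid:]))
```

Note that each level of recursion copies the subarrays, which is exactly the extra O(n) space the disadvantages below refer to.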

‭Advantages‬‭:‬

●	Consistent time complexity O(n log n), making it efficient for large datasets
‭●‬ ‭Stable sorting algorithm, preserving the relative order of equal elements‬

‭Disadvantages:‬

●	Requires additional memory space (out-of-place sorting) for the temporary arrays during the merging process, leading to a higher space complexity of O(n)
‭●‬ ‭Slower for small datasets compared to simpler algorithms like insertion sort‬

‭Follow-up Question:‬

‭Q) What is the Recurrence Equation?‬

Ans: It depends on the number of subproblems, the size of each subproblem, and the work done at each step. Here, each step produces 2 subproblems whose size is divided by 2 (n/2), and the work done at each step is n. T(n) = 2T(n/2) + n


‭Quick Sort:‬

This sort also follows the divide-and-conquer paradigm. It partitions the array into subarrays around a pivot x such that the elements in the lower subarray ≤ x ≤ the elements in the upper subarray, recursively sorts the two subarrays, and then concatenates the lower subarray, the pivot, and the upper subarray.

‭Divide / Partition:‬

‭Conquer & Combine:‬


‭Partitioning/Dividing an array A in Θ(n):‬


‭Pseudo Code:‬
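As a concrete stand-in for the pseudocode, here is a Python sketch using the Lomuto partition scheme with the last element as the pivot (one common choice; the course's exact scheme may differ):

```python
def partition(A, low, high):
    """Lomuto partition: place the pivot A[high] into its final position
    so that everything left of it is <= pivot and everything right is > pivot."""
    pivot = A[high]
    i = low - 1                    # boundary of the "<= pivot" region
    for j in range(low, high):
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[high] = A[high], A[i + 1]   # drop the pivot between the regions
    return i + 1                            # final index of the pivot

def quick_sort(A, low=0, high=None):
    if high is None:
        high = len(A) - 1
    if low < high:
        p = partition(A, low, high)
        quick_sort(A, low, p - 1)   # recursively sort the lower subarray
        quick_sort(A, p + 1, high)  # recursively sort the upper subarray
    return A
```

Because the pivot is swapped into place inside the same array, no concatenation step is needed; the "combine" is free.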

‭Quicksort is unique because its speed depends on the pivot you choose.‬

‭Worst Case vs Average/Best Case:‬

The worst case happens when the pivot is the first or last element of an already sorted (ascending or descending) array. The result is that one of the partitions is always empty.


‭Worst Case:‬

There are O(n) levels (the height is O(n)), and each level takes O(n) time. The entire algorithm will take O(n) * O(n) = O(n²) time.

‭Best Case:‬


There are O(log n) levels (the height is O(log n)), and each level takes O(n) time. The entire algorithm will take O(n) * O(log n) = O(n log n) time.

The best case is also the average case: if you always choose a random element of the array as the pivot, quicksort will complete in O(n log n) time on average.

‭Advantages‬‭:‬

●	Among the fastest sorting algorithms in practice, with O(n log n) average time, making it efficient for large datasets
‭●‬ ‭Does not require additional memory space (In-place sorting)‬

‭Disadvantages:‬

●	O(n²) time complexity in the worst-case scenario
‭●‬ ‭Unstable sorting algorithm, not preserving the relative order of equal elements‬

‭Follow-up Question:‬

Q) If quicksort is O(n log n) on average, but merge sort is O(n log n) always, why not use merge sort? Isn't it faster?

Ans: Even though both are the same speed in Big O notation, quicksort is faster in practice. When you write Big O notation like O(n), it really means O(c * n), where c is some fixed amount of time that your algorithm takes (the constant factor). Let's see an example:


Quicksort has a smaller constant than merge sort, so if they're both O(n log n) time, quicksort is faster. And quicksort is faster in practice because it hits the average case far more often than the worst case.

‭Q) What is the Recurrence Equation when it is worst-case?‬

Ans: It depends on the number of subproblems, the size of each subproblem, and the work done at each step. Here, each step produces 1 subproblem whose size is reduced by 1 (n-1), and the work done at each step is n. T(n) = T(n-1) + n

‭Q) What is the Recurrence Equation when it is best-case?‬

Ans: It depends on the number of subproblems, the size of each subproblem, and the work done at each step. Here, each step produces 2 subproblems whose size is divided by 2 (n/2), and the work done at each step is n. T(n) = 2T(n/2) + n

Q) How do you make sure your algorithm never reaches the worst case when choosing the 1st element as the pivot?

Ans: Use randomized quicksort. It is the same as the usual quicksort, except that you swap the pivot with a randomly chosen element within the range before partitioning. This gives O(n log n) in expectation; the worst case is still possible, but it becomes vanishingly unlikely.
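A Python sketch of the random-pivot variant (assuming the Lomuto partition scheme; the guarantee is in expectation, not absolute):

```python
import random

def randomized_quick_sort(A, low=0, high=None):
    """Quicksort with a random pivot: expected O(n log n) on any input."""
    if high is None:
        high = len(A) - 1
    if low < high:
        r = random.randint(low, high)     # pick a random index in [low, high]
        A[r], A[high] = A[high], A[r]     # swap it into the pivot slot
        pivot, i = A[high], low - 1       # then partition exactly as usual (Lomuto)
        for j in range(low, high):
            if A[j] <= pivot:
                i += 1
                A[i], A[j] = A[j], A[i]
        A[i + 1], A[high] = A[high], A[i + 1]  # pivot lands at index i + 1
        randomized_quick_sort(A, low, i)       # sort the lower subarray
        randomized_quick_sort(A, i + 2, high)  # sort the upper subarray
    return A
```

The only change from plain quicksort is the one random swap before partitioning, which is why the per-call cost stays Θ(n).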

‭Heap Sort:‬

‭Heap Sort uses a data structure called the heap.‬

‭Heap Data Structure:‬

‭Recap:‬

●	In a binary tree, a parent can have a maximum of 2 children and a minimum of 0 children.

●	In a complete binary tree, nodes are added in a left-to-right manner, not skipping any position. A parent can have 0, 1, or 2 children.


A heap is an Abstract Data Type (ADT) for storing values. Its underlying data structure is an array.

A heap has to be a complete binary tree, and it must satisfy the heap property.

Heap property:

●	The value of the parent must be greater than or equal to the values of its children (max heap),
or
●	The value of the parent must be smaller than or equal to the values of its children (min heap).

There are two types of heaps. The max heap is the most commonly used (the default). A heap can be either a max heap or a min heap, but not both at the same time.

The heap data structure provides worst-case O(1) access to the largest (max-heap) or smallest (min-heap) element, and worst-case Θ(log n) time to extract the largest (max-heap) or smallest (min-heap) element.

Note: The tree is used for efficient tracing and visualization. In a program, the data structure is a simple array.

The benefit of using an array for a heap rather than a linked list is that arrays give you random access to elements by index: you can pick any element just by its index, and finding a parent and its children is trivial. A linked list, by contrast, is sequential, so you must keep visiting elements until you find the one you are looking for; it does not allow random access the way an array does. Also, each linked-list node would need three (3) references to traverse the whole tree (parent, left, right).
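With 0-based indexing (an assumption here; some textbooks use 1-based arrays), the parent/child arithmetic described above is just:

```python
def parent(i):
    return (i - 1) // 2   # index of the parent of node i

def left(i):
    return 2 * i + 1      # index of the left child of node i

def right(i):
    return 2 * i + 2      # index of the right child of node i
```

For example, in the heap array [102, 100, 3, 5], the children of 100 (index 1) sit at indices 3 and 4, with no pointers needed.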

‭Heap Operations:‬

‭●‬ ‭Insert:‬

Inserts an element at the bottom of the heap. Then we must make sure that the heap property remains unchanged. When inserting an element into the heap, we start from the leftmost available position and move to the right.

Here, the heap property is kept intact. What if we want to insert 102 instead of 3? 102 would be added as a child of 5, but the heap property would be broken. Therefore, we need to put 102 in its correct position.

‭●‬ ‭HeapIncreaseKey / Swim:‬

Let the new node be 'n' (in this case, the node that contains 102). Compare 'n' with its parent. If the parent is smaller than the node 'n' (n > parent), swap 'n' with the parent. Continue this process until n is in its correct position.
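A Python sketch of insert followed by swim on a 0-based max-heap array (function names are illustrative):

```python
def swim(heap, i):
    """Bubble heap[i] up while it is larger than its parent; O(log n)."""
    while i > 0 and heap[(i - 1) // 2] < heap[i]:
        parent = (i - 1) // 2
        heap[i], heap[parent] = heap[parent], heap[i]  # swap child with parent
        i = parent

def insert(heap, key):
    heap.append(key)           # O(1): place the key at the leftmost free position
    swim(heap, len(heap) - 1)  # O(log n): climb until the heap property holds
```

Inserting 102 into the heap [100, 5, 3] makes 102 climb past 5 and then past 100 to the root, matching the worked example above.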


The best-case time complexity is O(1), when a key lands in its correct position on the first go. The worst case is when the newest node needs to climb all the way up to the root:
O(1) [insertion] + O(log n) [swim] = O(log n)

‭●‬ ‭Delete:‬

In a heap, you cannot just randomly delete an item. Deletion is done by replacing the root with the last element. The heap property will be broken, as a small value will now be at the top (root) of the heap. Therefore, we must put it in the right place.


Here, the root 102 will be replaced by the last element 5, and 102 will be removed. The heap property will be broken. Therefore, we need to put 5 in its correct position.

‭●‬ ‭MaxHeapify / Sink:‬

Let the replaced node be 'n' (in this case, the node that contains 5). Compare 'n' with its children. If the node 'n' is smaller than any child (n < any child), swap 'n' with the largest child. Continue this process until n is in its correct position.
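A Python sketch of delete followed by sink on a 0-based max-heap array (function names are illustrative):

```python
def sink(heap, i):
    """Push heap[i] down, swapping with its largest child,
    until the max-heap property holds; O(log n)."""
    n = len(heap)
    while True:
        largest, left, right = i, 2 * i + 1, 2 * i + 2
        if left < n and heap[left] > heap[largest]:
            largest = left
        if right < n and heap[right] > heap[largest]:
            largest = right
        if largest == i:          # n is in its correct position
            break
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest

def delete_max(heap):
    heap[0], heap[-1] = heap[-1], heap[0]  # replace the root with the last element
    top = heap.pop()                       # remove the old maximum
    if heap:
        sink(heap, 0)                      # restore the heap property
    return top
```

On the heap [102, 100, 3, 5], delete_max returns 102 and sinks 5 below 100, matching the worked example.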

The deleted element will always be the maximum element in a max-heap. Time complexity is O(1) [deletion] + O(log n) [sink] = O(log n).


‭●‬ ‭Heap Sort:‬

Delete + sink all the nodes of the heap and store them in an array. This returns an array sorted in descending order; reversing it gives an array sorted in ascending order.

‭Simulation:‬

‭Delete + Sink takes O(log n) and for ‘n’ nodes, Heap Sort will take O(n log n).‬
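The notes describe copying each deleted maximum into a second array and reversing it; an equivalent, commonly used in-place variant swaps each deleted maximum to the end of the shrinking heap, which yields ascending order directly. A sketch, assuming a 0-based array:

```python
def _sink(a, i, n):
    """Sink a[i] within a[0:n], the portion that is still a live heap."""
    while True:
        largest, left, right = i, 2 * i + 1, 2 * i + 2
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def heap_sort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # build the max-heap bottom-up
        _sink(a, i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]      # "delete" the max into its final slot
        _sink(a, 0, end)                 # restore the heap over the shrunken prefix
    return a
```

Each of the n delete + sink steps costs O(log n), so the whole sort is O(n log n), as stated above.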


‭●‬ ‭Build Max Heap:‬

You are given an arbitrary array and asked to build it into a heap. Inserting the n elements one by one takes O(n log n); a tighter analysis of the bottom-up build (sinking each internal node) shows it is actually O(n).

‭Advantages‬‭:‬

●	Consistent time complexity O(n log n), making it efficient for large datasets
‭●‬ ‭Does not require additional memory space (In-place sorting)‬
●	The underlying heap is often used to implement a priority queue

‭Disadvantages:‬

‭●‬ ‭Unstable sorting algorithm, not preserving the relative order of equal elements‬

‭Count Sort:‬

Count sort, also known as counting sort, is a non-comparative integer sorting algorithm. This technique doesn't sort by comparing elements, but rather by using a frequency array. It is efficient when the range of the input data (i.e., the difference between the maximum and minimum values) is not significantly greater than the number of elements to be sorted.

Step 1: Find the maximum element of the given array.


Step 2: Initialize a countArray[] of length max+1 with all elements set to 0. This array will be used to store the occurrences of the elements of the input array.

Step 3: In countArray[], store the count of each unique element of the input array at its respective index.

Step 4: Store the cumulative sum of the elements of countArray[] by doing countArray[i] = countArray[i - 1] + countArray[i].

Step 5: Iterate from the end of the input array (to preserve stability) and update outputArray[ countArray[ inputArray[i] ] - 1 ] = inputArray[i]; also update countArray[ inputArray[i] ] = countArray[ inputArray[i] ] - 1 (so that duplicate values are not overwritten).

‭For i=7‬


‭For i=6‬

...

‭For i=0‬

The outputArray now contains the sorted array.


‭Pseudo Code:‬
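The five steps above can be sketched in Python as follows (this version decrements the count before writing, which is equivalent to the "- 1" in Step 5; it assumes non-negative integers):

```python
def counting_sort(arr):
    """Stable counting sort for non-negative integers; O(n + k) time."""
    if not arr:
        return []
    k = max(arr)                       # Step 1: the maximum element
    count = [0] * (k + 1)              # Step 2: countArray of length max+1
    for x in arr:                      # Step 3: count the occurrences
        count[x] += 1
    for i in range(1, k + 1):          # Step 4: cumulative sums
        count[i] += count[i - 1]
    output = [0] * len(arr)
    for x in reversed(arr):            # Step 5: back-to-front for stability
        count[x] -= 1                  # decrement so duplicates get distinct slots
        output[count[x]] = x
    return output
```

The two loops over the input take O(n) and the two loops over the counts take O(k), giving the O(n+k) total discussed in the follow-up question below.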

‭Advantages:‬

●	Consistent linear time complexity O(n+k)
‭●‬ ‭Stable sorting algorithm, preserving the relative order of equal elements‬

‭Disadvantages:‬

●	Counting sort is inefficient if the range of values to be sorted is very large
●	Requires additional memory space (out-of-place sorting) for countArray and outputArray, leading to a higher space complexity of O(n+k)
●	Counting sort does not work on decimal (non-integer) values

‭Follow-Up Question:‬

‭Q) What is the Time Complexity?‬

Ans: The first loop takes O(k), the second loop takes O(n), the third loop takes O(k), and the last loop takes O(n). Hence, the time complexity is O(n+k).

