Sorting notes
[Figure: course roadmap covering C++ basics, vectors + grids, stacks + queues, sets + maps, object-oriented programming, arrays, dynamic memory management, linked data structures, real-world algorithms, and the diagnostic]
4. Divide-and-Conquer Sorts
(MergeSort and QuickSort)
Review
[linked list operations]
Common linked lists operations
● Traversal
○ How do we walk through all elements in the linked list?
● Rewiring
○ How do we rearrange the elements in a linked list?
● Insertion
○ How do we add an element to a linked list?
● Deletion
○ How do we remove an element from a linked list?
Linked List Traversal Takeaways
● Temporary pointers into lists are very helpful!
○ When processing linked lists iteratively, it’s common to introduce pointers that point to cells in
multiple spots in the list.
○ This is particularly useful if we’re destroying or rewiring existing lists.
● Using a while loop with a condition that checks to see if the current pointer is
nullptr is the prevailing way to traverse a linked list.
● Iterative traversal offers the most flexible, scalable way to write utility functions
that are able to handle all different sizes of linked lists.
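As a small illustration of the while-loop traversal pattern described above, here is a minimal sketch. The Node struct and the listLength name are assumptions made for this example and may not match the lecture's own definitions.

```cpp
// Hypothetical node type for illustration; the lecture's own Node may differ.
struct Node {
    int data;
    Node* next;
};

// Walks the list with a temporary pointer and counts its cells.
int listLength(Node* head) {
    int count = 0;
    Node* cur = head;            // temporary pointer; the caller's head is untouched
    while (cur != nullptr) {     // keep going until we walk off the end of the list
        count++;
        cur = cur->next;
    }
    return count;
}
```

The same loop shape works for printing, searching, or summing: only the body of the loop changes.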
Pointers by Value
● Unless specified otherwise, function
arguments in C++ are passed by
value – this includes pointers!
[Figure: memory diagram showing two Node* variables, prev and cur, alongside the list's head pointer]
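To make the pass-by-value point concrete, the sketch below contrasts a pointer parameter passed by value with one passed by reference. The Node struct and both function names are hypothetical, chosen only for this illustration.

```cpp
// Hypothetical node type for illustration.
struct Node {
    int data;
    Node* next;
};

// Pointer passed by value: `cur` is a copy, so the caller's pointer does not move.
void advanceByValue(Node* cur) {
    if (cur != nullptr) {
        cur = cur->next;   // changes only the local copy
    }
}

// Pointer passed by reference: the caller's own pointer is updated.
void advanceByReference(Node*& cur) {
    if (cur != nullptr) {
        cur = cur->next;   // changes the caller's pointer
    }
}
```

Both functions can still modify the cell a pointer points at (for example `cur->data`); the difference is only in whether reassigning the pointer itself is visible to the caller.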
Linked list summary
● We saw lots of ways to examine and manipulate linked lists!
○ Traversal
○ Rewiring
○ Insertion (front/back/middle)
○ Deletion (front/back/middle)
● We saw linked lists in classes and outside classes, and pointers passed by
value and passed by reference.
These elements are in the right place now.
● Find the smallest element of what’s left and move it to the second position.
● Find the smallest element of what’s left and move it to the third position.
● Find the smallest element of what’s left and move it to the fourth position.
● (etc.)
/* Prototype so that selectionSort can call the helper defined below it. */
int indexOfSmallest(const Vector<int>& elems, int startPoint);

void selectionSort(Vector<int>& elems) {
    for (int index = 0; index < elems.size(); index++) {
        int smallestIndex = indexOfSmallest(elems, index);
        swap(elems[index], elems[smallestIndex]);
    }
}

/**
 * Given a vector and a starting point, returns the index of the smallest
 * element in that vector at or after the starting point.
 */
int indexOfSmallest(const Vector<int>& elems, int startPoint) {
    int smallestIndex = startPoint;
    for (int i = startPoint + 1; i < elems.size(); i++) {
        if (elems[i] < elems[smallestIndex]) {
            smallestIndex = i;
        }
    }
    return smallestIndex;
}
Analyzing selection sort
● How much work do we do for selection sort?
○ To find the smallest value, we need to look at all n elements.
○ To find the second-smallest value, we need to look at n – 1 elements.
○ To find the third-smallest value, we need to look at n – 2 elements.
○ This process continues until we have found every last "smallest element"
from the original collection.
● Can we do better?
○ Yes!
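For reference, the selection sort counts listed above (n elements, then n – 1, then n – 2, and so on down to 1) can be added up as a standard arithmetic series. This is a sketch of the total, not a formal proof:

```latex
n + (n-1) + (n-2) + \cdots + 2 + 1 \;=\; \sum_{k=1}^{n} k \;=\; \frac{n(n+1)}{2} \;=\; O(n^2)
```

So doubling the input size roughly quadruples the work, which is why it is worth looking for something faster.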
Insertion Sort
(Bonus Content, not covered in live
lecture)
Insertion sort
[Animation captions, in order]
● Considered alone, the blue item is trivially in sorted order because it is only one item.
● The items in gray are in no particular order (unsorted).
● Insert the yellow element into the sequence that includes the blue element.
● The blue elements are sorted!
● Insert the yellow item into the blue sequence, making the sequence one element longer.
Insertion sort
● In the best case (the array is already sorted), insertion sort takes O(n) time because
you only iterate through once to check each element.
○ Selection sort, however, is always O(n^2) because you always have to search the remainder of
the list to guarantee that you’re finding the minimum at each step.
● Fun fact: Insertion sorting an array of random values takes, on average, O(n^2)
time.
○ This is beyond the scope of the class – take CS109 if you’re interested in learning more!
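Since insertion sort was only shown as an animation, here is a minimal sketch of one common way to write it. It uses std::vector rather than the course's Vector type so that it stands alone, and the function name insertionSort is an assumption for this example.

```cpp
#include <cstddef>
#include <vector>

// Sorts elems in ascending order by growing a sorted prefix one element at a time.
void insertionSort(std::vector<int>& elems) {
    for (std::size_t i = 1; i < elems.size(); i++) {
        int current = elems[i];          // the next item to insert into the sorted prefix
        std::size_t j = i;
        // Shift larger items in the sorted prefix one slot to the right.
        while (j > 0 && elems[j - 1] > current) {
            elems[j] = elems[j - 1];
            j--;
        }
        elems[j] = current;              // drop the item into its place
    }
}
```

On an already-sorted input the inner while loop never runs, which is where the O(n) best case comes from.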
How can we design better,
more efficient sorting
algorithms?
Divide-and-Conquer
Motivating Divide-and-Conquer
● So far, we've seen O(N^2) sorting algorithms. How can we start to do better?
● Assume that it takes t seconds to run insertion sort on the following array:
● Poll: Approximately how many seconds will it take to run insertion sort on each
of the following arrays?
General Divide-and-Conquer Approach
● Our general approach when designing a divide-and-conquer algorithm is to
decide how to make the problem smaller and how to unify the results of these
solved, smaller problems.
● Both sorting algorithms we explore today will have both of these components:
○ Divide Step
■ Make the problem smaller by splitting up the input list
○ Join Step
■ Unify the newly sorted sublists to build up the overall sorted result
● Base Case:
○ An empty or single-element list is already sorted.
● Recursive step:
○ Break the list in half and recursively sort each part. (easy divide)
○ Use merge to combine them back into a single sorted list (hard join)
What do we do now? When does the sorting magic happen?
The Key Insight: Merge
● While both lists are nonempty, compare their first elements. Remove the
smaller element and append it to the output.
● Once one list is empty, add all elements from the other list to the output.
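The two merge rules above can be sketched as follows. This version is an illustration only: it uses std::vector, returns a new vector, and reads through the inputs with indices instead of removing elements, so its signature differs from the merge(vec, left, right) helper used in the lecture code. The name mergeSorted is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Merges two already-sorted vectors into one sorted vector.
std::vector<int> mergeSorted(const std::vector<int>& left,
                             const std::vector<int>& right) {
    std::vector<int> out;
    out.reserve(left.size() + right.size());
    std::size_t i = 0;   // index of the "first remaining" element of left
    std::size_t j = 0;   // index of the "first remaining" element of right

    // While both lists are nonempty, take the smaller front element.
    while (i < left.size() && j < right.size()) {
        if (left[i] <= right[j]) {
            out.push_back(left[i++]);
        } else {
            out.push_back(right[j++]);
        }
    }
    // Once one list is used up, append everything left in the other.
    while (i < left.size()) out.push_back(left[i++]);
    while (j < right.size()) out.push_back(right[j++]);
    return out;
}
```

Each element is copied to the output exactly once, so merging a total of n elements takes O(n) time.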
Merge Sort
A recursive sorting algorithm!
● Base Case:
○ An empty or single-element list is already sorted.
● Recursive step:
○ Break the list in half and recursively sort each part. (easy divide)
○ Use merge to combine them back into a single sorted list (hard join)
Merge Sort – Let's
code it!
Announcements
● Revisions for Assignment 4 will be due today at 11:59pm PDT.
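void mergeSort(Vector<int>& vec) {
    /* A list with 0 or 1 elements is already sorted by definition. */
    if (vec.size() <= 1) return;

    /* Split the list into two halves. This is one straightforward way to divide. */
    Vector<int> left, right;
    int half = vec.size() / 2;
    for (int i = 0; i < vec.size(); i++) {
        if (i < half) left.add(vec[i]);
        else right.add(vec[i]);
    }

    /* Recursively sort the two halves. */
    mergeSort(left);
    mergeSort(right);

    /*
     * Empty out the original vector and re-fill it with merged result
     * of the two sorted halves. The merge step does O(n) work.
     */
    vec = {};
    merge(vec, left, right);
}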
[Figure: merge sort recursion tree. Each level of the tree does O(n) total work, and there are O(log n) levels.]
Analyzing Mergesort: Can we do better?
● Mergesort runs in time O(n log n), which is faster than insertion sort’s O(n^2).
○ Can we do better than this?
○ Let's explore one more divide-and-conquer sort!
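As a rough sketch of where the O(n log n) bound comes from, assume the list splits evenly and that splitting plus merging n elements costs about cn for some constant c (both are simplifying assumptions). The running time T(n) then satisfies:

```latex
T(n) = 2\,T\!\left(\tfrac{n}{2}\right) + cn, \qquad T(1) = c

\text{Level } k \text{ has } 2^k \text{ sublists of size } \tfrac{n}{2^k}, \text{ costing } 2^k \cdot c\,\tfrac{n}{2^k} = cn \text{ in total.}

\text{With } \log_2 n + 1 \text{ levels: } \quad T(n) = cn\,(\log_2 n + 1) = O(n \log n)
```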
A Quick Historical Aside
● Mergesort was one of the first algorithms developed for computers as we
know them today.
● It was invented by John von Neumann in 1945 (!) as a way of validating the
design of the first “modern” (stored-program) computer.
● Want to learn more about what he did? Check out this article by Stanford’s very
own Donald Knuth.
Quicksort
Quicksort Algorithm
1. Partition the elements into three categories based on a chosen pivot element:
○ Elements smaller than the pivot
○ Elements equal to the pivot
○ Elements larger than the pivot
2. Recursively sort the two partitions that are not equal to the pivot (smaller and
larger elements).
○ Now our smaller elements are in sorted order, and our larger elements are also in
sorted order!
3. Concatenate the three now-sorted partitions together (smaller, then equal, then larger).
Input of unsorted elements: 14 12 16 13 11 15
● Choose the first element as the pivot (14) and partition: smaller = 12 13 11, equal = 14, larger = 16 15.
● Recursively sort the smaller partition for pivot 14. Choose the first element as the pivot (12): smaller = 11, equal = 12, larger = 13.
○ Recursively sort the smaller partition for pivot 12 (11) and the larger partition for pivot 12 (13). Each has a single element, so each is already sorted.
○ Now we can concatenate smaller than, equal to, and greater than for the pivot 12: 11 12 13
● Recursively sort the larger partition for pivot 14. Choose the first element as the pivot (16): smaller = 15, equal = 16, and nothing larger, giving 15 16.
● Concatenate smaller than, equal to, and greater than for the pivot 14: 11 12 13 14 15 16. Sorted!
/* Pick the pivot and partition the list into three components.
* 1) elements less than the pivot
* 2) elements equal to the pivot
* 3) elements greater than the pivot
*/
Define three empty lists: less, equal, greater
pivot = first element of vector (arbitrary choice)
partition(vec, less, equal, greater, pivot)
● Unlike in merge sort where most of the sorting work happens in the “join” step,
our sorting work occurs primarily at the “divide” step for quicksort (when we
sort elements into partitions).
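Putting the partition, recursive sort, and concatenate steps together, one possible sketch looks like this. It uses std::vector instead of the course's Vector, does the partitioning inline rather than through the partition helper named in the pseudocode, and always picks the first element as the pivot. All of these are choices made for this example.

```cpp
#include <vector>

// Sorts vec in ascending order using a three-way-partition quicksort.
void quicksort(std::vector<int>& vec) {
    // Base case: 0 or 1 elements are already sorted.
    if (vec.size() <= 1) return;

    // Divide: partition around the first element (an arbitrary pivot choice).
    int pivot = vec[0];
    std::vector<int> less, equal, greater;
    for (int value : vec) {
        if (value < pivot) less.push_back(value);
        else if (value > pivot) greater.push_back(value);
        else equal.push_back(value);
    }

    // Recursively sort the two partitions that are not equal to the pivot.
    quicksort(less);
    quicksort(greater);

    // Join: concatenate less, equal, greater back into vec.
    vec.clear();
    vec.insert(vec.end(), less.begin(), less.end());
    vec.insert(vec.end(), equal.begin(), equal.end());
    vec.insert(vec.end(), greater.begin(), greater.end());
}
```

Because the pivot always lands in equal, both recursive calls are on strictly smaller lists, so the recursion is guaranteed to reach the base case.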
Quicksort Efficiency Analysis
● Similar to Merge Sort, Quicksort also has O(N log N) runtime in the average
case.
○ With good choice of pivot, we split the initial list into roughly two equally-sized parts every time.
○ Thus, we reach a depth of about log N split operations before reaching the base case.
○ At each level, we do O(N) work to partition and concatenate.
The Limit Does Exist
● There is a fundamental limit on the efficiency of comparison-based sorting
algorithms.
● You can prove that it is not possible to guarantee a list has been sorted unless
you have done, in the worst case, at least on the order of N log N comparisons (formally, Ω(N log N)).
○ Take CS161 to learn how to write this proof!
● Thus, we can't do better (in Big-O terms at least) than Merge Sort and
Quicksort!
○ Take CS161 to learn about how there are actually clever non-comparison-based sorting
algorithms that are able to break this limit.
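For the curious, the usual argument behind this limit can be sketched in a few lines; this is an outline rather than the full proof. A comparison-based sort must be able to distinguish all N! possible orderings of its input, and each comparison has only two outcomes, so the worst-case number of comparisons is at least log2(N!):

```latex
\log_2(N!) \;=\; \sum_{k=1}^{N} \log_2 k
\;\ge\; \sum_{k=\lceil N/2 \rceil}^{N} \log_2 k
\;\ge\; \frac{N}{2}\,\log_2\frac{N}{2}
\;=\; \Omega(N \log N)
```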
Final Advice
Assignment 6 Tips
● When implementing the sorting algorithm on linked lists, it is strongly
recommended to implement helper functions for the divide/join components of
the algorithm.
○ For quicksort, this means having helper functions for the partition and concatenate operations.
● Write tests for your helper functions first! Then, write end-to-end tests for your
sorting function.
Summary
https://www.toptal.com/developers/sorting-algorithms
Sorting
● Sorting is a powerful tool for organizing data in a meaningful format!