Pedagogical Dimension
Due to its simplicity, Lomuto's partitioning method might be easier to implement. There is a nice
anecdote in Jon Bentley's Programming Pearls on sorting:
Most discussions of Quicksort use a partitioning scheme based on two approaching indices [...]
[i.e. Hoare's]. Although the basic idea of that scheme is straightforward, I have always found the
details tricky - I once spent the better part of two days chasing down a bug hiding in a short
partitioning loop. A reader of a preliminary draft complained that the standard two-index method
is in fact simpler than Lomuto's and sketched some code to make his point; I stopped looking
after I found two bugs.
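To make the comparison concrete, here is a minimal sketch of both schemes in Python. This is my own illustration rather than Bentley's code, and the details (pivot = last element for Lomuto, pivot = first element for Hoare, inclusive bounds) are just one common choice among several.

```python
def lomuto_partition(A, lo, hi):
    """Partition A[lo..hi] (inclusive) around the pivot A[hi].
    Invariant: A[lo..i-1] < pivot <= A[i..j-1]. Returns the pivot's final index."""
    pivot = A[hi]
    i = lo
    for j in range(lo, hi):
        if A[j] < pivot:
            A[i], A[j] = A[j], A[i]
            i += 1
    A[i], A[hi] = A[hi], A[i]   # move the pivot between the two parts
    return i


def hoare_partition(A, lo, hi):
    """Partition A[lo..hi] (inclusive) around the pivot A[lo].
    Returns an index p with A[lo..p] <= pivot <= A[p+1..hi]."""
    pivot = A[lo]
    i, j = lo - 1, hi + 1
    while True:
        i += 1
        while A[i] < pivot:   # scan right for an element that belongs on the right
            i += 1
        j -= 1
        while A[j] > pivot:   # scan left for an element that belongs on the left
            j -= 1
        if i >= j:            # indices have crossed: partitioning is done
            return j
        A[i], A[j] = A[j], A[i]
```

Note that Hoare's scheme does not return the pivot's final position, only a split point, so the recursion has to cover A[lo..p] and A[p+1..hi]; boundary details like this are exactly where the bugs from the anecdote tend to hide.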
Performance Dimension
For practical use, ease of implementation might be sacrificed for the sake of efficiency. On a
theoretical basis, we can determine the number of element comparisons and swaps to compare
performance. Additionally, actual running time will be influenced by other factors, such as
caching performance and branch mispredictions.
As shown below, the algorithms behave very similarly on random permutations, except for the
number of swaps: there, Lomuto needs thrice as many as Hoare!
Number of Comparisons
Both methods can be implemented using $n-1$ comparisons to partition an array of length $n$.
This is essentially optimal, since we need to compare every element to the pivot for deciding
where to put it.
Number of Swaps
The number of swaps is random for both algorithms, depending on the elements in the array. If
we assume random permutations, i.e. all elements are distinct and every permutation of the
elements is equally likely, we can analyze the expected number of swaps.
As only relative order counts, we assume that the elements are the numbers $1, \dots, n$. That makes
the discussion below easier since the rank of an element and its value coincide.

Lomuto's Method
The index variable $j$ scans the whole array, and whenever we find an element $A[j]$ smaller than
the pivot $x$, we do a swap. Among the elements $1, \dots, n$, exactly $x-1$ are smaller than $x$, so
we get $x-1$ swaps if the pivot is $x$.

The overall expectation then results by averaging over all pivots. Each value in $\{1, \dots, n\}$ is
equally likely to become the pivot (namely with probability $\frac{1}{n}$), so we have

$$\frac{1}{n} \sum_{x=1}^{n} (x-1) \;=\; \frac{n}{2} - \frac{1}{2}$$

swaps on average for partitioning an array of size $n$ with Lomuto's method.
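As a quick sanity check (my own sketch, not part of the original analysis), the following Python snippet counts the swaps of a plain Lomuto partition on random permutations; the empirical average should come out close to $n/2 - 1/2$.

```python
import random

def lomuto_swap_count(A):
    """Lomuto-partition A around its last element and return how many swaps
    were done for elements smaller than the pivot (the final swap that moves
    the pivot into place is not counted, matching the analysis above)."""
    pivot, i, swaps = A[-1], 0, 0
    for j in range(len(A) - 1):
        if A[j] < pivot:
            A[i], A[j] = A[j], A[i]
            i += 1
            swaps += 1
    A[i], A[-1] = A[-1], A[i]
    return swaps

n, trials = 100, 10_000
avg = sum(lomuto_swap_count(random.sample(range(n), n)) for _ in range(trials)) / trials
print(avg, n / 2 - 1 / 2)   # both values should be close to 49.5
```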
Hoare's Method
Here, the analysis is slightly more tricky: Even after fixing the pivot $x$, the number of swaps remains
random.
More precisely: The indices $i$ and $j$ run towards each other until they cross, which always
happens at position $x$ (by correctness of Hoare's partitioning algorithm!). This effectively divides the
array into two parts: a left part which is scanned by $i$ and a right part scanned by $j$.
Now, a swap is done exactly for every pair of misplaced elements, i.e. a large element (larger
than $x$, thus belonging in the right partition) which is currently located in the left part, and a
small element located in the right part. Note that this pairing always works out, i.e. the number
of small elements initially in the right part equals the number of large elements initially in the
left part.
One can show that the number of these pairs is hypergeometrically distributed: for the $n-x$
large elements we randomly draw their positions among the $n-1$ non-pivot positions, $x-1$ of which
lie in the left part. Since the mean of a hypergeometric distribution is the number of draws times
the fraction of favourable positions, the expected number of pairs, given that the pivot is $x$, is
$\frac{(n-x)(x-1)}{n-1}$.
Finally, we average again over all pivot values to obtain the overall expected number of swaps
for Hoare's partitioning:

$$\frac{1}{n} \sum_{x=1}^{n} \frac{(n-x)(x-1)}{n-1} \;=\; \frac{1}{n(n-1)} \cdot \frac{n(n-1)(n-2)}{6} \;=\; \frac{n}{6} - \frac{1}{3},$$

which is roughly a third of Lomuto's $\frac{n}{2} - \frac{1}{2}$, as claimed above.
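Again as a sanity check (my own sketch, not from the original argument), the snippet below simulates exactly this idealized model: it draws a random permutation, takes the first element as pivot $x$, and counts the large elements that start out in the left part. It checks the expected value derived above rather than instrumenting a concrete Hoare implementation, whose bookkeeping around the pivot element can add a few extra swaps.

```python
import random

def misplaced_large_elements(n):
    """One sample of the idealized model: random permutation of 1..n, pivot x = A[0],
    indices crossing at position x; count elements larger than x among the first
    x-1 non-pivot positions (the left part in the analysis above)."""
    A = random.sample(range(1, n + 1), n)
    x = A[0]              # pivot value equals its rank, since the elements are 1..n
    return sum(1 for v in A[1:x] if v > x)

n, trials = 100, 10_000
avg = sum(misplaced_large_elements(n) for _ in range(trials)) / trials
print(avg, n / 6 - 1 / 3)   # both values should be close to ~16.3
```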
As for caching: both algorithms use two pointers into the array that scan it sequentially, so both
behave almost optimally with respect to caching.
Conclusion
Lomuto's method is simpler and easier to implement, but because it performs roughly three times as
many swaps, it should not be used for implementing a library sorting method.