Data Structures & Analysis of Algorithms

(COMP4120)
Lecture # 6
Sorting Algorithms
Course Instructor
Dr. Aftab Akram
PhD CS
Assistant Professor
Department of Information Sciences, Division of Science &
Technology
University of Education, Lahore
Sorting Problem
• Given a set of records r_1, r_2, …, r_n with key
values k_1, k_2, …, k_n, the Sorting Problem is to
arrange the records into any order s such that the
records r_{s1}, r_{s2}, …, r_{sn} have keys obeying the
property k_{s1} ≤ k_{s2} ≤ … ≤ k_{sn}.
• In other words, the sorting problem is to arrange a
set of records so that the values of their key fields
are in non-decreasing order.
• A sorting algorithm is said to be stable if it does not
change the relative ordering of records with
identical key values.
Sorting Algorithms
• When comparing two sorting algorithms, the most
straightforward approach would seem to be simply
program both and measure their running times.
• However, such a comparison can be misleading because
the running time for many sorting algorithms depends
on specifics of the input values.
• In particular, the number of records, the size of the keys
and the records, the allowable range of the key values,
and the amount by which the input records are “out of
order” can all greatly affect the relative running times
for sorting algorithms.
Sorting Algorithms
• When analyzing sorting algorithms, it is traditional to
measure the number of comparisons made between
keys.
• This measure is usually closely related to the running
time for the algorithm and has the advantage of being
machine and datatype independent.
• However, in some cases records might be so large that
their physical movement might take a significant
fraction of the total running time.
• If so, it might be appropriate to measure the number of
swap operations performed by the algorithm.
Sorting Algorithms
• In most applications we can assume that all records
and keys are of fixed length, and that a single
comparison or a single swap operation requires a
constant amount of time regardless of which keys
are involved.
• Special cases
• Variable length keys
• Small number of records to be sorted
• Use as little memory as possible
Three Sorting Algorithms
• We will discuss three sorting algorithms:
• Insertion Sort
• Bubble Sort
• Selection Sort
Insertion Sort
• Insertion Sort iterates through a list of records.
• Each record is inserted in turn at the correct
position within a sorted list composed of those
records already processed.
• It is like sorting a pile of telephone bills by date.
• A fairly natural way to do this might be to look at
the first two bills and put them in order.
• Then take the third bill and put it into the right
order with respect to the first two, and so on.
• As you take each bill, you would add it to the sorted
pile that you have already made.
Insertion Sort
Insertion Sort
Insertion Sort
• Two loops are needed to implement Insertion
Sort.
• The outer loop scans through the entire array.
• The inner loop compares array values, and swaps
the values if needed.
• For example, if the array was initially sorted from
highest to lowest and we now want to sort it from
lowest to highest, then in each iteration of the
outer loop the inner loop must do progressively more
swaps: 1 swap for the 1st iteration, 2 for the 2nd, and so
on.
• This represents the worst-case scenario.
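The code for this slide did not survive extraction; the following is a
minimal C++ sketch of the double loop just described (a reconstruction,
not the original lecture code), sorting an int array into ascending order:

    #include <utility>  // std::swap

    // Insertion Sort: the outer loop takes each record in turn; the inner
    // loop swaps it backward until it reaches its place in the sorted prefix.
    void insertionSort(int A[], int n) {
        for (int i = 1; i < n; i++)                    // outer loop over records
            for (int j = i; j > 0 && A[j] < A[j - 1]; j--)
                std::swap(A[j], A[j - 1]);             // swap while out of order
    }

For example, insertionSort on {15, 3, 9, 1} builds the sorted prefixes
{15}, {3, 15}, {3, 9, 15}, and finally {1, 3, 9, 15}.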
Insertion Sort
• The total number of comparisons (and swaps) in this
worst case will be:
1 + 2 + … + (n − 1) = n(n − 1)/2, which is Θ(n²).
• The best case will occur if the array is already sorted from
lowest to highest; then no swaps will be needed.
• The total number of comparisons will be n − 1, which
is the number of times the outer for loop executes.
• Thus, the cost for Insertion Sort in the best case is
Θ(n).
• While the best case is significantly faster than the worst
case, the worst case is usually a more reliable indication
of the “typical” running time.
Bubble Sort
• Bubble Sort consists of a simple double for loop.
• The first iteration of the inner for loop moves through
the record array from bottom to top, comparing
adjacent keys.
• If the lower-indexed key’s value is greater than its
higher-indexed neighbor, then the two values are
swapped.
• Once the smallest value is encountered, this process
will cause it to “bubble” up to the top of the array.
• The second pass through the array repeats this process.
• However, because we know that the smallest value
reached the top of the array on the first pass, there is
no need to compare the top two elements on the
second pass.
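The code slide was lost in extraction; here is a minimal C++ sketch
matching the description above (a reconstruction, not the original
lecture code):

    #include <utility>  // std::swap

    // Bubble Sort: on pass i the inner loop moves from the bottom of the
    // array up to position i + 1, swapping adjacent values that are out
    // of order, so the smallest remaining value bubbles up to A[i].
    void bubbleSort(int A[], int n) {
        for (int i = 0; i < n - 1; i++)         // positions 0..i-1 are settled
            for (int j = n - 1; j > i; j--)     // compare adjacent pairs
                if (A[j] < A[j - 1])
                    std::swap(A[j], A[j - 1]);
    }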
Bubble Sort
• Determining Bubble Sort’s number of comparisons
is easy.
• Regardless of the arrangement of the values in the
array, the number of comparisons made by the
inner for loop is fixed for each pass, leading to a total cost
of:
(n − 1) + (n − 2) + … + 1 = n(n − 1)/2, which is Θ(n²).
• Bubble Sort’s running time is roughly the same in
the best, average, and worst cases.
Selection Sort
• Consider again the problem of sorting a pile of phone bills
for the past year.
• Another intuitive approach might be to look through the
pile until you find the bill for January, and pull that out.
• Then look through the remaining pile until you find the bill
for February, and add that behind January.
• Proceed through the ever-shrinking pile of bills to select the
next one in order until you are done.
• This is the inspiration for our last Θ(n²) sort, called
Selection Sort.
• The i-th pass of Selection Sort “selects” the i-th smallest key
in the array, placing that record into position i.
• In other words, Selection Sort first finds the smallest key in
an unsorted list, then the second smallest, and so on.
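A minimal C++ sketch of this idea (a reconstruction of the lost code
slide, not the instructor's original):

    #include <utility>  // std::swap

    // Selection Sort: pass i scans the unsorted part A[i..n-1] for the
    // smallest key, remembers its position, and does one swap at the end.
    void selectionSort(int A[], int n) {
        for (int i = 0; i < n - 1; i++) {
            int lowest = i;                     // position of smallest so far
            for (int j = i + 1; j < n; j++)
                if (A[j] < A[lowest])
                    lowest = j;                 // remember, don't swap yet
            std::swap(A[i], A[lowest]);         // a single swap per pass
        }
    }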
Selection Sort
• Selection Sort is essentially a Bubble Sort, except that
rather than repeatedly swapping adjacent values to get
the next smallest record into place, we instead
remember the position of the element to be selected
and do one swap at the end.
• Thus, the number of comparisons is still Θ(n²), but the
number of swaps is much less than that required by
Bubble Sort.
• Selection Sort is particularly advantageous when the
cost to do a swap is high, for example, when the
elements are long strings or other large records.
• Selection Sort is more efficient than Bubble Sort (by a
constant factor) in most other situations as well.
Cost of Exchange Sorting
• Swapping adjacent records is called an exchange.
• Thus, these sorts are sometimes referred to as
exchange sorts.
• The crucial bottleneck is that only adjacent
records are compared; thus, comparisons and
moves (in all but Selection Sort) proceed by single steps.
• The cost of any exchange sort can be at best the
total number of steps that the records in the array
must move to reach their “correct” location (i.e.,
the number of inversions for each record).
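This bound can be made precise with the standard inversion-counting
argument (a reconstruction; the slide's own figures were lost). In LaTeX
form:

    % A permutation and its reverse together contain every one of the
    % \binom{n}{2} pairs as an inversion exactly once, so a random
    % permutation contains, on average,
    \frac{1}{2}\binom{n}{2} = \frac{n(n-1)}{4} \text{ inversions.}
    % Each adjacent swap removes at most one inversion, so any exchange
    % sort must perform \Omega(n^2) swaps in the average and worst cases.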
Shell Sort
• Named after the inventor D.L. Shell.
• It is also sometimes called the diminishing increment sort.
• Unlike Insertion and Selection Sort, there is no real-life
intuitive equivalent to Shell Sort.
• Unlike the exchange sorts, Shell sort makes comparisons
and swaps between non-adjacent elements.
• Shell sort also exploits the best-case performance of
Insertion Sort.
• Shell sort’s strategy is to make the list “mostly sorted” so
that a final Insertion Sort can finish the job.
• When properly implemented, Shell Sort will give
substantially better performance than Θ(n²) in the worst
case.
Shell Sort
• Shell sort uses a process that forms the basis for
many of the sorts presented in the following
sections:
• Break the list into sublists,
• sort them,
• then recombine the sublists.
• Shellsort breaks the array of n elements into “virtual”
sublists.
• Each sublist is sorted using an Insertion Sort.
• Another group of sublists is then chosen and
sorted, and so on.
Shell Sort
• During each iteration, Shell sort breaks the list into disjoint
sublists so that each element in a sublist is a fixed number of
positions apart.
• For example, let us assume for convenience that n, the
number of values to be sorted, is a power of two.
• One possible implementation of Shell Sort will begin by
breaking the list into n/2 sublists of 2 elements each, where
the array index of the 2 elements in each sublist differs by n/2.
• If there are 16 elements in the array indexed from 0 to 15,
there would initially be 8 sublists of 2 elements each.
• The first sublist would be the elements in positions 0 and 8,
the second in positions 1 and 9, and so on.
• Each list of two elements is sorted using Insertion Sort.
Shell Sort
• The second pass of Shellsort looks at fewer, bigger lists.
• For our example, the second pass would have lists of
size 4, with the elements in the list being n/4 positions
apart.
• Thus, the second pass would have as its first sublist the
4 elements in positions 0, 4, 8, and 12; the second
sublist would have elements in positions 1, 5, 9, and 13;
and so on.
• Each sublist of four elements would also be sorted
using an Insertion Sort.
• The third pass would be made on two lists, one
consisting of the odd positions and the other consisting
of the even positions.
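A minimal C++ sketch of this scheme with increments n/2, n/4, …, 1 (a
reconstruction of the lost code slide; this increment series is one
common choice, not the only one):

    #include <utility>  // std::swap

    // Insertion Sort on the virtual sublist of A that starts at 'start'
    // and whose elements are 'incr' positions apart.
    void insSortIncr(int A[], int n, int start, int incr) {
        for (int i = start + incr; i < n; i += incr)
            for (int j = i; j >= incr && A[j] < A[j - incr]; j -= incr)
                std::swap(A[j], A[j - incr]);
    }

    // Shell Sort: sort the sublists for each increment in turn; the final
    // pass (increment 1) is a plain Insertion Sort on a "mostly sorted" list.
    void shellSort(int A[], int n) {
        for (int incr = n / 2; incr >= 1; incr /= 2)
            for (int start = 0; start < incr; start++)
                insSortIncr(A, n, start, incr);
    }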
Merge Sort
• A natural approach to problem solving is divide and
conquer.
• In terms of sorting, we might consider breaking the
list to be sorted into pieces, process the pieces, and
then put them back together somehow.
• A simple way to do this would be to split the list in
half, sort the halves, and then merge the sorted
halves together.
• This is the idea behind Merge sort.
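A minimal C++ sketch of this idea (a reconstruction; the lecture's own
code slide was lost), using a scratch array for the merge step:

    #include <vector>

    // Merge Sort on A[left..right]: split in half, sort each half
    // recursively, then merge the two sorted halves through 'temp'.
    void mergeSort(std::vector<int>& A, std::vector<int>& temp,
                   int left, int right) {
        if (left >= right) return;              // a list of size 1 is sorted
        int mid = (left + right) / 2;
        mergeSort(A, temp, left, mid);          // sort first half
        mergeSort(A, temp, mid + 1, right);     // sort second half
        for (int i = left; i <= right; i++)     // copy the run into temp
            temp[i] = A[i];
        int i1 = left, i2 = mid + 1;            // heads of the two halves
        for (int cur = left; cur <= right; cur++) {
            if (i1 > mid)                  A[cur] = temp[i2++];  // left done
            else if (i2 > right)           A[cur] = temp[i1++];  // right done
            else if (temp[i1] <= temp[i2]) A[cur] = temp[i1++];  // stable pick
            else                           A[cur] = temp[i2++];
        }
    }

Called as mergeSort(A, temp, 0, n − 1), with temp the same size as A.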
Merge Sort
• The merging part takes Θ(i) time, where i is the total
length of the two subarrays being merged.
• The array to be sorted is repeatedly split in half until
subarrays of size 1 are reached, at which time they are
merged to be of size 2, these merged to subarrays of
size 4, and so on.
• The depth of the recursion is log n for n elements
(assume for simplicity that n is a power of two).
• The first level of recursion can be thought of as working
on one array of size n, the next level working on two
arrays of size n/2, the next on four arrays of size n/4,
and so on.
• The bottom of the recursion has n arrays of size 1.
Merge Sort
• Thus, n arrays of size 1 are merged (requiring Θ(n)
total steps), n/2 arrays of size 2 (again requiring Θ(n)
total steps), n/4 arrays of size 4, and so on.
• At each of the log n levels of recursion, Θ(n) work
is done, for a total cost of Θ(n log n).
• This cost is unaffected by the relative order of the
values being sorted, thus this analysis holds for the
best, average, and worst cases.
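The same total follows from the standard recurrence (reconstructed here;
the slide's own formulas were lost):

    T(n) = 2\,T(n/2) + \Theta(n), \qquad T(1) = \Theta(1)
    \;\Rightarrow\; T(n) = \sum_{i=0}^{\log n - 1} 2^{i}\,\Theta(n/2^{i})
                         = \Theta(n \log n).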
Merge Sort
• Merge sort is one of the simplest sorting algorithms
conceptually, and has good performance both in
the asymptotic sense and in empirical running time.
• Surprisingly, even though it is based on a simple
concept, it is relatively difficult to implement in
practice.
Quick Sort
• Quicksort is aptly named because, when properly
implemented, it is the fastest known general-purpose
in-memory sorting algorithm in the average case.
• It does not require the extra array needed by
Mergesort, so it is space efficient as well.
• Quicksort is widely used, and is typically the algorithm
implemented in a library sort routine such as the UNIX
qsort function.
• Interestingly, Quicksort is hampered by exceedingly
poor worst-case performance, thus making it
inappropriate for certain applications.
Quick Sort
• Quicksort first selects a value called the pivot.
• Assume that the input array contains k values less than
the pivot.
• The records are then rearranged in such a way that the
values less than the pivot are placed in the first, or
leftmost, positions in the array.
• The values greater than or equal to the pivot are placed
in the last, or rightmost, positions.
• This is called a partition of the array.
• The values placed in a given partition need not (and
typically will not) be sorted with respect to each other.
Quick Sort
• All that is required is that all values end up in the
correct partition.
• The pivot value itself is placed in position k.
• Quicksort then proceeds to sort the resulting
subarrays now on either side of the pivot, one of
size k and the other of size n − k − 1.
• Intuitively, Quicksort is then applied recursively to each
subarray to put its elements into the correct order.
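A minimal C++ sketch of this partition-then-recurse structure (a
reconstruction, not the lecture's original code; it uses the first key as
the pivot, the "simplest" choice discussed on the next slide):

    #include <utility>  // std::swap

    // Partition A[left..right] around the value at index 'pivot': smaller
    // values end up leftmost; returns the pivot's final position k.
    int partition(int A[], int left, int right, int pivot) {
        std::swap(A[pivot], A[right]);          // park pivot at the right end
        int k = left;
        for (int i = left; i < right; i++)
            if (A[i] < A[right])                // belongs left of the pivot
                std::swap(A[i], A[k++]);
        std::swap(A[k], A[right]);              // place pivot in position k
        return k;
    }

    // Quicksort: partition, then sort the subarrays on either side of
    // the pivot (sizes k - left and right - k).
    void quickSort(int A[], int left, int right) {
        if (left >= right) return;              // 0 or 1 records: done
        int k = partition(A, left, right, left);
        quickSort(A, left, k - 1);
        quickSort(A, k + 1, right);
    }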
Quick Sort
• Selecting a pivot can be done in many ways. The
simplest is to use the first key.
• However, if the input is sorted or reverse sorted, this
will produce a poor partitioning with all values to one
side of the pivot.
• It is better to pick a value at random, thereby reducing
the chance of a bad input order affecting the sort.
• In the worst case, Quicksort is Θ(n²).
• When will this worst case occur? Only when each pivot
yields a bad partitioning of the array.
• If the pivot values are selected at random, then this is
extremely unlikely to happen.
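In recurrence form (a reconstruction), a bad partition strips off only
the pivot each time:

    T(n) = T(n-1) + \Theta(n) \;\Rightarrow\;
    T(n) = \sum_{i=1}^{n} \Theta(i) = \Theta(n^2).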
Quick Sort
• Quicksort’s best case occurs when findpivot
always breaks the array into two equal halves.
• Best case and average case running time for Quicksort
is Θ(n log n).