
Sorting Algorithms

1. Selection
2. Insertion
3. Merge
4. Quick
5. Counting
6. Radix
7. Bucket
8. Heap
9. Bubble

For each algorithm, we examine:
• Recurrence relation
• Number of comparisons
• Number of swaps

Selection Sort

• The method can be outlined as follows:
1. Find the minimum value in the list
2. Swap it with the value in the first position
3. Repeat the steps above for the remainder of the list (starting at the second position and advancing each time)
• The algorithm starts by selecting the minimum of all the elements and allocating it to the first position.
• Then, it does the same with the minimum of the remaining (n-1) elements, allocating it to the second position.
• At any iteration i, the first (i-1) elements are already ordered, and the algorithm looks for the minimum among the remaining (n-i+1) elements and allocates it to the ith position.

Selection Sort

• Consider the number of swaps required to sort n elements using selection sort in the worst case.
• As we can see from the algorithm, selection sort performs a swap only after finding the appropriate position of the currently picked element.
• So there are O(n) swaps performed in selection sort.
Selection Sort

• T(n): time to run selection sort on an array of length n.
• O(n) steps to find the minimum and move it to position zero.
• T(n-1) time to run selection sort on A[1] to A[n-1].
• Recurrence:
• T(n) = T(n-1) + O(n)
• T(1) = 1
• T(n) = n + T(n-1) = n + ((n-1) + T(n-2)) = ... = n + (n-1) + (n-2) + ... + 1 = n(n+1)/2 = O(n²)

Selection Sort

• Selecting the lowest element requires scanning all n elements (this takes (n − 1) comparisons) and then swapping it into the first position.
• Finding the next lowest element requires scanning the remaining (n − 1) elements, and so on, for (n − 1) + (n − 2) + ... + 2 + 1 = n(n − 1)/2 ∈ Θ(n²) comparisons.
• Each of these scans requires one swap, giving (n − 1) swaps in total (the final element is already in place).
• Selection sort is not difficult to analyse compared to other sorting algorithms, since none of the loops depend on the data in the array.

Selection Sort

• The number of comparisons made by the algorithm (for finding the minimum) is given in any case by
• T(n) = (n-1) + (n-2) + ... + 1 = n(n-1)/2 = O(n²)
• The number of swaps required to sort n elements using selection sort, in the worst case, is n-1 = O(n).

Insertion Sort

1. Get a list of unsorted numbers
2. Set a marker for the sorted section after the first number in the list
3. Repeat steps 4 through 6 until the unsorted section is empty
4. Select the first unsorted number
5. Swap this number to the left until it arrives at the correct sorted position
6. Advance the marker to the right one position
7. Stop
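A minimal Python sketch of these steps (the function name is illustrative, not from the slides): the marker is the loop index, and the inner loop swaps the selected number leftwards exactly as described.

def insertion_sort(a):
    """Sort a in place by growing a sorted prefix a[0:marker]."""
    for marker in range(1, len(a)):
        j = marker                      # the first unsorted number
        # Swap it to the left until it reaches its sorted position.
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return a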
Insertion Sort

• Searching for a value:
• In an unsorted array: linear scan, O(n)
• In a sorted array: binary search, O(log n)
• Unsorted array: A[0] ... A[n-2] A[n-1]

Insertion Sort

Worst case (each new element must travel to the front of the sorted section):

Element   Pos   Swaps   Comparisons
A[0]      1     0       0
A[1]      2     1       1
A[2]      3     2       2
...       ...   ...     ...
A[n-2]    n-1   n-2     n-2
A[n-1]    n     n-1     n-1
Total:          O(n²)   O(n²)

Can binary search be used to improve efficiency?

Insertion Sort: Sorted array

If the array is already sorted, each element needs one comparison and no swaps; there is no right-shifting of elements:

Element   Pos   Swaps   Comparisons
A[0]      1     0       0
A[1]      2     0       1
A[2]      3     0       1
...       ...   ...     ...
A[n-2]    n-1   0       1
A[n-1]    n     0       1
Total:          0       O(n)
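On the question above: binary search can locate each insertion point in O(log n) comparisons, but the elements to its right must still be shifted, which costs O(n) per insertion, so the worst case remains O(n²). A sketch of this "binary insertion sort" variant (illustrative, not part of the slides):

import bisect

def binary_insertion_sort(a):
    """Insertion sort with binary search for the insertion point:
    O(n log n) comparisons, but still O(n^2) element moves."""
    for i in range(1, len(a)):
        key = a[i]
        pos = bisect.bisect_right(a, key, 0, i)   # O(log i) comparisons
        a[pos + 1:i + 1] = a[pos:i]               # the shift is still O(i)
        a[pos] = key
    return a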
Insertion Sort

• T(n): time to run insertion sort on an array of length n.
• Time T(n-1) to sort the segment A[0] to A[n-2] by recursion.
• (n-1) steps to insert A[n-1] into the sorted segment.

• T(n) = T(n-1) + (n-1)
• T(1) = 1

• T(n) = (n-1) + T(n-1) = (n-1) + ((n-2) + T(n-2)) = ... = (n-1) + (n-2) + ... + 1 = n(n-1)/2 = O(n²)

O(n²) sorting algorithms

• Selection sort and insertion sort are both O(n²).
• O(n²) sorting is infeasible for n over 100,000.
• A different strategy is divide and conquer.

Divide-and-conquer

• Both merge sort and quicksort employ a common algorithmic paradigm based on recursion.
• This paradigm, divide-and-conquer:
• breaks a problem into sub-problems that are similar to the original problem,
• recursively solves the sub-problems, and
• finally combines the solutions to the sub-problems to solve the original problem.
• Because divide-and-conquer solves sub-problems recursively:
• each sub-problem must be smaller than the original problem, and
• there must be a base case for sub-problems.

Divide-and-conquer

• You should think of a divide-and-conquer algorithm as having three parts:
• Divide the problem into a number of sub-problems that are smaller instances of the same problem.
• Conquer the sub-problems by solving them recursively. If they are small enough, solve the sub-problems as base cases.
• Combine the solutions to the sub-problems into the solution for the original problem.
Divide-and-conquer: Merge Sort

Divide phase (split into halves until single elements remain):

43 32 22 78 63 57 91 13
43 32 22 78 | 63 57 91 13
43 32 | 22 78 | 63 57 | 91 13
43 | 32 | 22 | 78 | 63 | 57 | 91 | 13

Merge Sort

Merge phase (combine sorted runs, bottom up):

43 | 32 | 22 | 78 | 63 | 57 | 91 | 13
32 43 | 22 78 | 57 63 | 13 91
22 32 43 78 | 13 57 63 91
13 22 32 43 57 63 78 91

• T(n) = T(n/2) + T(n/2) + O(n), for n > 1
• T(n) = 0, for n < 2
• Two sub-problems, each of size n/2, for the left and right halves.
• O(n) time required to merge the two halves.
• T(n) = n(1 + log₂ n) = O(n log₂ n)
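A minimal merge sort sketch in Python mirroring the recurrence (the function returns a new list; names are illustrative). Calling merge_sort([43, 32, 22, 78, 63, 57, 91, 13]) reproduces the trace above.

def merge_sort(a):
    """Divide and conquer: sort each half recursively, then merge in O(n)."""
    if len(a) < 2:                            # base case
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])                # T(n/2)
    right = merge_sort(a[mid:])               # T(n/2)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # O(n) merge
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]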
Quick Sort

Recurrence relation based on the two recursive calls:
a. Best case: each call is on half the array, hence the time is 2T(n/2).
b. Worst case: one array is empty, the other has n-1 elements, hence the time is T(n-1).

T(n) = T(i) + T(n - i - 1) + cn

The time to sort the file is equal to
• the time to sort the left partition with i elements, plus
• the time to sort the right partition with n - i - 1 elements, plus
• the time to build the partitions.

Quick Sort

Worst-case analysis: the pivot is the smallest element.
T(n) = T(n-1) + cn, n > 1

Best-case analysis: the pivot is in the middle.
T(n) = 2T(n/2) + cn

Quick Sort: Worst case

T(n) = T(n-1) + cn = T(n-2) + c(n-1) + cn = ... = T(1) + c(2 + 3 + ... + n) = O(n²)

Quick Sort: Best case

T(n) = 2T(n/2) + cn = O(n log n)
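The slides do not fix a partitioning scheme, so the following sketch uses the common Lomuto partition with the last element as pivot (an assumption, chosen for brevity); the two recursive calls correspond to T(i) and T(n - i - 1) in the recurrence.

def quick_sort(a, lo=0, hi=None):
    """In-place quicksort; pivot is the last element (Lomuto partition)."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    pivot, i = a[hi], lo
    for j in range(lo, hi):          # build the partitions: cn work
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]        # pivot lands at its final index i
    quick_sort(a, lo, i - 1)         # left partition (i - lo elements)
    quick_sort(a, i + 1, hi)         # right partition (hi - i elements)
    return a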


Quick Sort: Average case

• Averaged over all pivot positions, the pivot splits the array into two parts of comparable size, and T(n) = O(n log n). (The derivation appears as a figure in the original slides.)

Heap Sort
• The heap data structure is an array object that we can view as a nearly complete binary tree.
• Each node of the tree corresponds to an element of the array. The tree is completely filled on all levels except possibly the lowest, which is filled from the left up to a point.
• An array A that represents a heap is an object with two attributes:
• A.length, which (as usual) gives the number of elements in the array, and
• A.heap-size, which represents how many elements in the heap are stored within array A.
• The root of the tree is A[1], and given the index i of a node, we can easily compute the indices of its parent, left child, and right child:
• PARENT(i) = ⌊i/2⌋, LEFT(i) = 2i, RIGHT(i) = 2i + 1
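The slides use 1-based indexing with the root at A[1]. In 0-indexed languages the same arithmetic shifts by one; a small illustrative sketch:

def parent(i):
    return (i - 1) // 2   # 1-based form: floor(i/2)

def left(i):
    return 2 * i + 1      # 1-based form: 2i

def right(i):
    return 2 * i + 2      # 1-based form: 2i + 1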

Heap Sort: Max heap

• There are two kinds of binary heaps: max-heaps and min-heaps.
• In both kinds, the values in the nodes satisfy a heap property, the specifics of which depend on the kind of heap.
• In a max-heap, the max-heap property is that for every node i other than the root, A[PARENT(i)] ≥ A[i]; that is, the value of a node is at most the value of its parent.
• Thus, the largest element in a max-heap is stored at the root, and the sub-tree rooted at a node contains values no larger than that contained at the node itself.
Heap Sort: Min heap

• A min-heap is organized in the opposite way; the min-heap property is that for every node i other than the root, A[PARENT(i)] ≤ A[i]. The smallest element in a min-heap is at the root.
• For the heap sort algorithm, we use max-heaps. Min-heaps commonly implement priority queues.

Heap Sort: Height

• Viewing a heap as a tree, we define the height of a node in a heap to be the number of edges on the longest simple downward path from the node to a leaf.
• We define the height of the heap to be the height of its root.
• Since a heap of n elements is based on a complete binary tree, its height is Θ(lg n).
• We shall see that the basic operations on heaps run in time at most proportional to the height of the tree and thus take O(lg n) time.

Max Heapify / Build Max Heap / Heap Sort

(The MAX-HEAPIFY, BUILD-MAX-HEAP, and HEAPSORT procedures, along with worked examples and exercises for each, appear only as figures in the original slides.)
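Since the procedure bodies do not survive in this extract, the following is a reconstruction of the standard CLRS procedures as 0-indexed Python (a sketch under that assumption, not the slides' exact pseudocode), using the parent/left/right arithmetic above:

def max_heapify(a, i, heap_size):
    """Sift a[i] down until the subtree rooted at i is a max-heap: O(lg n)."""
    l, r = 2 * i + 1, 2 * i + 2
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)

def build_max_heap(a):
    """Make a max-heap bottom up; the leaves are already heaps."""
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))

def heap_sort(a):
    """Move the maximum to the end, shrink the heap, and repeat."""
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        max_heapify(a, 0, end)
    return a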

Heap Sort: analysis

• We can bound the running time of BUILD-MAX-HEAP as O(n).
• Each call to MAX-HEAPIFY costs O(lg n) time, and HEAPSORT makes n - 1 such calls after building the heap.
• Thus, the running time of heap sort is O(n) + O(n lg n) = O(n lg n).

About the sorting algorithms

• The lower bound for all comparison-based sorting algorithms (merge sort, heap sort, quick sort, etc.) is Ω(n log n); i.e., they cannot do better than n log n.
• Counting sort (pigeonhole sort) is a linear-time sorting algorithm that sorts in O(n+k) time when the elements are in the range from 1 to k.
• Where does counting sort fail? When the elements are in the range from 1 to n², it takes O(n²) time, which is worse than the above-mentioned comparison sorts.
About the sorting algorithms

• Can we do better than O(n log n) for the range 1 to n²? The answer is radix sort.
• The idea of radix sort: sort digit by digit, starting from the least significant digit and moving to the most significant digit; counting sort is used as a subroutine.
• The radix sort algorithm: for each i, where i runs from the least significant to the most significant digit of the number, sort the input array using counting sort according to its i-th digit.

Concept of Stability

• A sorting algorithm is stable if elements with the same key appear in the output array in the same order as they do in the input array.
• That is, it breaks ties between two elements by the rule that whichever element appears first in the input array appears first in the output array.
• Normally, the property of stability is important when satellite data are carried around with the element being sorted.
• For example, in order for radix sort to work correctly, the digit sorts must be stable.

Satellite Data

• In practice, the numbers to be sorted are rarely isolated values. Each is usually part of a collection of data called a record.
• Each record contains a key, which is the value to be sorted; the remainder of the record consists of satellite data, which are usually carried around with the key.
• In practice, when a sorting algorithm permutes the keys, it must permute the satellite data as well.
• If each record includes a large amount of satellite data, we often permute an array of pointers to the records rather than the records themselves, in order to minimize data movement.
• For example: when you sort, you can't break the structure. If you have a collection of people with the attributes of a name, a current address, a social security number, and an age, and you want to sort them by age, you can't change the association between the four fields. You just need to sort the records based on the value of one field. In this example, age is the key, and the satellite data are the name, address, and social security number.

Satellite Data

• To keep things simple, we assume, as we have for binary search trees and red-black trees, that any "satellite information" associated with a key is stored in the same node as the key.
• In practice, one might actually store with each key just a pointer to another disk page containing the satellite information for that key.
• This implicitly assumes that the satellite information associated with a key, or the pointer to such satellite information, travels with the key whenever the key is moved from node to node.

Significance of stability:
There are a few reasons why stability can be important. One is that, if two records don't need to be swapped, then swapping them anyway causes a memory update: a page is marked dirty and needs to be rewritten to disk (or another slow medium).
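A small concrete illustration of keys versus satellite data, and of stability (Python's built-in sort is stable; the records here are made up for the example):

# Each record: (age, name) — age is the key, the name is satellite data.
people = [(30, "Asha"), (25, "Ben"), (30, "Chen"), (25, "Dev")]

# Sorting by the key carries the satellite data along, and stability keeps
# equal-aged people in input order: Ben before Dev, Asha before Chen.
print(sorted(people, key=lambda record: record[0]))
# [(25, 'Ben'), (25, 'Dev'), (30, 'Asha'), (30, 'Chen')]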
Radix Sort

• Radix sort solves the problem of card sorting by sorting on the least significant digit first.
• The algorithm then combines the cards into a single deck, with the cards in the 0 bin preceding the cards in the 1 bin, preceding the cards in the 2 bin, and so on.
• Then it sorts the entire deck again on the second-least significant digit and recombines the deck in a like manner.
• The process continues until the cards have been sorted on all d digits. Remarkably, at that point the cards are fully sorted on the d-digit number.
• Thus, only d passes through the deck are required to sort.

Analysis: given n d-digit numbers in which each digit can take up to k possible values, and a stable Θ(n + k) digit sort such as counting sort, radix sort runs in Θ(d(n + k)) time.
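A least-significant-digit radix sort sketch for non-negative integers (illustrative; it uses per-digit bins, which behaves like the stable counting-sort subroutine described above):

def radix_sort(a, base=10):
    """Sort non-negative integers digit by digit, least significant first."""
    exp = 1
    while max(a) // exp > 0:                # one pass per digit: d passes
        bins = [[] for _ in range(base)]
        for x in a:                         # stable: equal digits keep order
            bins[(x // exp) % base].append(x)
        a = [x for b in bins for x in b]    # recombine bin 0, then 1, ...
        exp *= base
    return a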

Bucket Sort

• Bucket sort assumes that the input is drawn from a uniform distribution over [0, 1) and has an average-case running time of O(n).
• Bucket sort divides the interval [0, 1) into n equal-sized subintervals, or buckets, and then distributes the n input numbers into the buckets.
• Since the inputs are uniformly and independently distributed, we do not expect many numbers to fall into each bucket.
• To produce the output, we simply sort the numbers in each bucket and then go through the buckets in order, listing the elements in each.
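A minimal bucket sort sketch under the stated assumption that inputs lie in [0, 1) (the function name is illustrative); it also handles the exercise array given later.

def bucket_sort(a):
    """Distribute values in [0, 1) into n buckets, sort each, concatenate."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        buckets[int(n * x)].append(x)   # x falls in bucket floor(n*x)
    for b in buckets:
        b.sort()                        # each bucket is small on average
    return [x for b in buckets for x in b]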
Bucket Sort: analysis

• Linearity of expectation is the key property that enables us to perform probabilistic analyses by using indicator random variables.

Linearity of Expectation

• Indicator random variables provide a convenient method for converting between probabilities and expectations. Suppose we are given a sample space S and an event A.
• Then the indicator random variable I{A} associated with event A is defined as I{A} = 1 if A occurs, and I{A} = 0 if A does not occur.
• As a simple example, let us determine the expected number of heads that we obtain when flipping a fair coin. Our sample space is S = {H, T}, and we define a random variable Y which takes on the values H and T, each with probability 1/2. We can then define an indicator random variable X_H, associated with the coin coming up heads, i.e., the event Y = H. This variable counts the number of heads obtained in this flip: it is 1 if the coin comes up heads and 0 otherwise. We write X_H = I{Y = H}, so E[X_H] = E[I{Y = H}] = 1 · Pr{Y = H} + 0 · Pr{Y = T} = 1/2.
• Thus, the expected number of heads obtained by one flip of a fair coin is 1/2. As the lemma E[I{A}] = Pr{A} shows, the expected value of an indicator random variable associated with an event A is equal to the probability that A occurs.
Bucket Sort: analysis

• Using this expected value, we conclude that the average-case running time for bucket sort is linear.
• Even if the input is not drawn from a uniform distribution, bucket sort may still run in linear time.
• As long as the input has the property that the sum of the squares of the bucket sizes is linear in the total number of elements, by linearity of expectation bucket sort will run in linear time.

Exercise

• Illustrate the operation of BUCKET-SORT on the array A = ⟨.79, .13, .16, .64, .39, .20, .89, .53, .71, .42⟩.
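For reference, tracing the bucket_sort sketch from earlier on this array (one way to check the exercise, not the slides' worked figure): with n = 10, bucket i receives the values in [i/10, (i+1)/10), so bucket 1 gets .13 and .16, and bucket 7 gets .79 and .71 (sorted to .71, .79); concatenating the buckets yields .13, .16, .20, .39, .42, .53, .64, .71, .79, .89.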

Counting sort

• Assumes that each of the n input elements is an integer in the range 0 to k, for some integer k.
• When k = O(n), the sort runs in Θ(n) time.
• Counting sort determines, for each input element x, the number of elements less than x.
• It uses this information to place element x directly into its position in the output array.
• For example, if 17 elements are less than x, then x belongs in output position 18.
• We must modify this scheme slightly to handle the situation in which several elements have the same value, since we do not want to put them all in the same position.

Counting sort

• Note that counting sort beats the lower bound of Ω(n lg n) because it is not a comparison sort.
• There is no comparison between elements.
• Counting sort uses the actual values of the elements to index into an array.
Counting sort

The algorithm in outline:
• Count the instances: since the values range from 1 to k, we create a count array c with k slots.
• Modify the count array by adding the previous counts (prefix sums).
• We place each object in its correct position and decrease its count by one.
• Now the output array contains the sorted objects.

Counting sort

COUNTING_SORT(A, B, k)
  for i ← 1 to k do
      c[i] ← 0
  for j ← 1 to n do                  // count the instances
      c[A[j]] ← c[A[j]] + 1
  // c[i] now contains the number of elements equal to i
  for i ← 2 to k do                  // add the previous counts
      c[i] ← c[i] + c[i-1]
  // c[i] now contains the number of elements ≤ i
  for j ← n downto 1 do              // place each object, decrement its count
      B[c[A[j]]] ← A[j]
      c[A[j]] ← c[A[j]] - 1
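A runnable 0-indexed Python version of the pseudocode above (a sketch assuming integer values in 0..k; the right-to-left final pass is what makes it stable):

def counting_sort(a, k):
    """Stable counting sort for integers in the range 0..k."""
    count = [0] * (k + 1)
    for x in a:                    # count the instances
        count[x] += 1
    for i in range(1, k + 1):      # prefix sums: count[i] = #elements <= i
        count[i] += count[i - 1]
    out = [0] * len(a)
    for x in reversed(a):          # scan right to left for stability
        count[x] -= 1
        out[count[x]] = x
    return out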

Counting sort: Analysis

• The overall time is Θ(k + n). In practice, we usually use counting sort when we have k = O(n), in which case the running time is Θ(n).
• Counting sort beats the lower bound of Ω(n lg n) because it is not a comparison sort.
• In fact, no comparisons between input elements occur anywhere in the code.
• Instead, counting sort uses the actual values of the elements to index into an array.
• The Ω(n lg n) lower bound for sorting does not apply when we depart from the comparison-sort model.

Counting sort: Analysis

• An important property of counting sort is that it is stable: numbers with the same value appear in the output array in the same order as they do in the input array.
• That is, ties between two numbers are broken by the rule that whichever number appears first in the input array appears first in the output array.
• Normally, the property of stability is important only when satellite data are carried around with the element being sorted.
• Counting sort's stability is important for another reason: counting sort is often used as a subroutine in radix sort.
