Module 2 AOA

Merge sort is a divide and conquer algorithm that works by recursively splitting the array into halves, sorting each half, and then merging the sorted halves into a single sorted array. It has a worst-case running time of O(n log n) which makes it one of the most efficient sorting algorithms.


Introduction to Algorithms

Quicksort
QuickSort Design
● Follows the divide-and-conquer paradigm.
● Divide: Partition (separate) the array A[p..r] into two
(possibly empty) subarrays A[p..q–1] and A[q+1..r].
● Each element in A[p..q–1] ≤ A[q].
● A[q] ≤ each element in A[q+1..r].
● Index q is computed as part of the partitioning procedure.
● Conquer: Sort the two subarrays by recursive calls to
quicksort.
● Combine: The subarrays are sorted in place – no
work is needed to combine them.
● How do the divide and combine steps of quicksort
compare with those of merge sort?
Pseudocode

Quicksort(A, p, r)
    if p < r then
        q := Partition(A, p, r);
        Quicksort(A, p, q – 1);
        Quicksort(A, q + 1, r)

Partition(A, p, r)
    x := A[r], i := p – 1;
    for j := p to r – 1 do
        if A[j] ≤ x then
            i := i + 1;
            A[i] ↔ A[j]
    A[i + 1] ↔ A[r];
    return i + 1

(Figure: Partition splits A[p..r] into A[p..q–1] ≤ A[q] and A[q+1..r] ≥ A[q]; the slide's picture uses pivot 5.)
Example
Partition on A[1..10] with pivot x = A[r] = 6:

initially:        2 5 8 3 9 4 1 7 10 6
j = 1 (2 ≤ 6):    2 5 8 3 9 4 1 7 10 6    (i = 1; A[1] ↔ A[1], no change)
j = 2 (5 ≤ 6):    2 5 8 3 9 4 1 7 10 6    (i = 2; A[2] ↔ A[2], no change)
j = 3 (8 > 6):    2 5 8 3 9 4 1 7 10 6    (no change)
j = 4 (3 ≤ 6):    2 5 3 8 9 4 1 7 10 6    (i = 3; A[3] ↔ A[4])
j = 5 (9 > 6):    2 5 3 8 9 4 1 7 10 6    (no change)
j = 6 (4 ≤ 6):    2 5 3 4 9 8 1 7 10 6    (i = 4; A[4] ↔ A[6])
j = 7 (1 ≤ 6):    2 5 3 4 1 8 9 7 10 6    (i = 5; A[5] ↔ A[7])
j = 8 (7 > 6):    2 5 3 4 1 8 9 7 10 6    (no change)
j = 9 (10 > 6):   2 5 3 4 1 8 9 7 10 6    (no change)
after final swap: 2 5 3 4 1 6 9 7 10 8    (A[6] ↔ A[10]; return q = 6)
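The pseudocode above transcribes almost line-for-line into Python. This is an added illustrative sketch (not part of the slides), 0-indexed where the slides are 1-indexed:

```python
def partition(A, p, r):
    """Lomuto partition: pivot x = A[r]; returns the pivot's final index."""
    x = A[r]
    i = p - 1
    for j in range(p, r):          # j = p .. r-1
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def quicksort(A, p, r):
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)

A = [2, 5, 8, 3, 9, 4, 1, 7, 10, 6]   # the slide's example, pivot 6
q = partition(A, 0, len(A) - 1)
print(q, A)   # pivot lands at 0-based index 5: [2, 5, 3, 4, 1, 6, 9, 7, 10, 8]
quicksort(A, 0, len(A) - 1)
print(A)      # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Note the swap `A[i], A[j] = A[j], A[i]` plays the role of A[i] ↔ A[j] in the slides' notation.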
Partitioning
● Select the last element A[r] in the subarray A[p..r] as the pivot – the element around which to partition.
● As the procedure executes, the array is partitioned into four (possibly empty) regions:
1. A[p..i] — all entries in this region are ≤ pivot.
2. A[i+1..j – 1] — all entries in this region are > pivot.
3. A[r] = pivot.
4. A[j..r – 1] — not yet known how they compare to the pivot.
● Conditions 1–3 hold before each iteration of the for loop and constitute a loop invariant. (Region 4 is not part of the invariant.)
Correctness of Partition
● Use the loop invariant.
● Initialization:
  ● Before the first iteration, A[p..i] and A[i+1..j – 1] are empty – conditions 1 and 2 are satisfied (trivially).
  ● r is the index of the pivot – condition 3 is satisfied.
● Maintenance:
  ● Case 1: A[j] > x
    ● Increment j only.
    ● The loop invariant is maintained.
Correctness of Partition
Case 1:
(Figure: A[j] > x — only j advances; the "≤ x" region A[p..i], the "> x" region A[i+1..j–1], and the pivot A[r] are all unchanged.)
Correctness of Partition
● Case 2: A[j] ≤ x
  ● Increment i; swap A[i] and A[j] – condition 1 is maintained.
  ● Increment j – condition 2 is maintained.
  ● A[r] is unaltered – condition 3 is maintained.
(Figure: A[j] ≤ x — A[j] is swapped into the "≤ x" region, which grows by one element; the "> x" region shifts right by one.)
Correctness of Partition

● Termination:
  ● When the loop terminates, j = r, so every element of A[p..r] falls into one of three regions:
    ● A[p..i] ≤ pivot
    ● A[i+1..j – 1] > pivot
    ● A[r] = pivot
  ● The last two lines swap A[i+1] and A[r]:
    ● the pivot moves from the end of the array to between the two subarrays.
● Thus, procedure Partition correctly performs the divide step.
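The postcondition argued above can also be checked empirically. This randomized check is an added sketch (not from the slides): after q = Partition(A, p, r), every element of A[p..q–1] is ≤ A[q] and every element of A[q+1..r] is > A[q].

```python
import random

def partition(A, p, r):
    """Lomuto partition as in the slides (0-indexed)."""
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

random.seed(0)
for _ in range(1000):
    A = [random.randrange(20) for _ in range(random.randrange(1, 15))]
    q = partition(A, 0, len(A) - 1)
    assert all(v <= A[q] for v in A[:q])      # left region <= pivot
    assert all(v > A[q] for v in A[q + 1:])   # right region > pivot
print("postcondition holds on 1000 random arrays")
```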
Complexity of Partition

● PartitionTime(n) is dominated by the number of iterations of the for loop: r – p iterations, each doing constant work.
● Partition therefore runs in Θ(n) time, where n = r – p + 1.
Quicksort Overview

● To sort a[left...right]:
1. if left < right:
1.1. Partition a[left...right] such that:
all a[left...p-1] are ≤ a[p], and
all a[p+1...right] are ≥ a[p]
1.2. Quicksort a[left...p-1]
1.3. Quicksort a[p+1...right]
2. Terminate
Partitioning in Quicksort

● A key step in the Quicksort algorithm is partitioning the array.
● We choose some (any) number p in the array to use as a pivot.
● We partition the array into three parts:
  numbers less than p | p | numbers greater than or equal to p
Recurrence

Alg.: QUICKSORT(A, p, r)        Initially: p = 1, r = n
    if p < r
        then q ← PARTITION(A, p, r)
            QUICKSORT(A, p, q – 1)
            QUICKSORT(A, q + 1, r)

Recurrence (q is the pivot's final position, so the subproblems have sizes q – 1 and n – q):
T(n) = T(q – 1) + T(n – q) + n
Worst Case Partitioning
● Worst-case partitioning:
  ● one region has 1 element and the other has n – 1 elements (q = 1) – maximally unbalanced.
● Recurrence:
  T(n) = T(1) + T(n – 1) + n,  T(1) = Θ(1)
  ⇒ T(n) = T(n – 1) + n = n + (n – 1) + … + 2 + 1 = Θ(n²)
(Figure: the recursion tree is a path of depth n; the level costs n, n – 1, …, 2, 1 sum to Θ(n²).)

● When does the worst case happen? When the input is already sorted (or reverse sorted), the last-element pivot produces the 1 / (n – 1) split at every level.
Best Case Partitioning
● Best-case partitioning:
  ● partitioning produces two regions of size n/2 (q = n/2).
● Recurrence:
  T(n) = 2T(n/2) + Θ(n)
  T(n) = Θ(n lg n) (Master theorem)
Case Between Worst and Best
● 9-to-1 proportional split:
  T(n) = T(9n/10) + T(n/10) + n
● Any split of constant proportions yields a recursion tree of depth Θ(lg n) with Θ(n) work per level, so T(n) = Θ(n lg n) – asymptotically as good as the best case.
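These cases can be observed empirically by counting comparisons. This is an added sketch, not from the slides: with the last element as pivot, an already-sorted input drives quicksort to the Θ(n²) worst case, while a shuffled input stays near n lg n.

```python
import random

def quicksort_count(A, p, r):
    """Lomuto-partition quicksort; returns the number of key comparisons."""
    if p >= r:
        return 0
    x = A[r]
    i = p - 1
    comps = 0
    for j in range(p, r):
        comps += 1                  # one comparison per loop iteration
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    q = i + 1
    return comps + quicksort_count(A, p, q - 1) + quicksort_count(A, q + 1, r)

n = 256
sorted_comps = quicksort_count(list(range(n)), 0, n - 1)
print(sorted_comps)                 # n(n-1)/2 = 32640 on sorted input

random.seed(1)
shuffled = list(range(n))
random.shuffle(shuffled)
print(quicksort_count(shuffled, 0, n - 1))   # much smaller, about 1.39 n lg n
```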
Analysis of Algorithms
Merge Sort and Quick Sort
Sorting
• Insertion sort
  – Design approach: incremental
  – Sorts in place: Yes
  – Best case: Θ(n)
  – Worst case: Θ(n²)

• Bubble Sort
  – Design approach: incremental
  – Sorts in place: Yes
  – Running time: Θ(n²)
Sorting
• Selection sort
  – Design approach: incremental
  – Sorts in place: Yes
  – Running time: Θ(n²)

• Merge Sort
  – Design approach: divide and conquer
  – Sorts in place: No
  – Running time: Let’s see!!
Divide-and-Conquer
• Divide the problem into a number of sub-problems
– Similar sub-problems of smaller size

• Conquer the sub-problems


– Solve the sub-problems recursively
– Sub-problem size small enough ⇒ solve the sub-problems in a
straightforward manner

• Combine the solutions of the sub-problems


– Obtain the solution for the original problem

Merge Sort Approach
• To sort an array A[p . . r]:
• Divide
– Divide the n-element sequence to be sorted into two
subsequences of n/2 elements each
• Conquer
– Sort the subsequences recursively using merge sort
– When the size of the sequences is 1 there is nothing
more to do
• Combine
– Merge the two sorted subsequences

Merge Sort

Alg.: MERGE-SORT(A, p, r)
    if p < r                        Check for base case
        then q ← ⌊(p + r)/2⌋        Divide
            MERGE-SORT(A, p, q)     Conquer
            MERGE-SORT(A, q + 1, r) Conquer
            MERGE(A, p, q, r)       Combine

• Initial call: MERGE-SORT(A, 1, n)
(Example: A[1..8] = 5 2 4 7 1 3 2 6, so p = 1, q = 4, r = 8.)
Example – n Power of 2

Divide (A[1..8], q = 4 at the top level):
5 2 4 7 1 3 2 6
5 2 4 7 | 1 3 2 6
5 2 | 4 7 | 1 3 | 2 6
5 | 2 | 4 | 7 | 1 | 3 | 2 | 6
Example – n Power of 2

Conquer and Merge (read bottom-up):
1 2 2 3 4 5 6 7
2 4 5 7 | 1 2 3 6
2 5 | 4 7 | 1 3 | 2 6
5 | 2 | 4 | 7 | 1 | 3 | 2 | 6
Example – n Not a Power of 2

Divide (A[1..11], q = 6 at the top level, then q = 3 and q = 9):
4 7 2 6 1 4 7 3 5 2 6
4 7 2 6 1 4 | 7 3 5 2 6
4 7 2 | 6 1 4 | 7 3 5 | 2 6
4 7 | 2 | 6 1 | 4 | 7 3 | 5 | 2 | 6
4 | 7 | 6 | 1 | 7 | 3    (the remaining two-element runs split into singletons)
Example – n Not a Power of 2

Conquer and Merge (read bottom-up):
1 2 2 3 4 4 5 6 6 7 7
1 2 4 4 6 7 | 2 3 5 6 7
2 4 7 | 1 4 6 | 3 5 7 | 2 6
4 7 | 2 | 1 6 | 4 | 3 7 | 5 | 2 | 6
4 | 7 | 6 | 1 | 7 | 3
Merging
(Example: A[1..8] = 2 4 5 7 1 2 3 6, with p = 1, q = 4, r = 8.)

• Input: Array A and indices p, q, r such that p ≤ q < r
  – Subarrays A[p..q] and A[q+1..r] are sorted
• Output: One single sorted subarray A[p..r]
Merging
• Idea for merging (think of A[p..q] and A[q+1..r] as two piles of sorted cards):
  – Choose the smaller of the two top cards
  – Remove it and place it in the output pile
  – Repeat the process until one pile is empty
  – Take the remaining input pile and place it face-down onto the output pile
Example: MERGE(A, 9, 12, 16)
(Figures: the slides step through merging the sorted subarrays A[9..12] and A[13..16] into A[9..16]; the images are not reproduced in this text version.)
Merge - Pseudocode

Alg.: MERGE(A, p, q, r)    (example: A[1..8] = 2 4 5 7 1 2 3 6, p = 1, q = 4, r = 8)
1. Compute n1 ← q – p + 1 and n2 ← r – q
2. Copy the first n1 elements into L[1..n1] and the next n2 elements into R[1..n2]
3. L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
4. i ← 1; j ← 1
5. for k ← p to r
6.     do if L[i] ≤ R[j]
7.         then A[k] ← L[i]
8.             i ← i + 1
9.         else A[k] ← R[j]
10.            j ← j + 1

(In the example, L = 2 4 5 7 ∞ and R = 1 2 3 6 ∞.)
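The MERGE pseudocode above translates almost line-for-line into Python. This sketch is an added illustration (not from the slides), using float('inf') for the ∞ sentinels and 0-indexed arrays, plus a small MERGE-SORT driver:

```python
def merge(A, p, q, r):
    """Merge sorted runs A[p..q] and A[q+1..r] in place, using sentinels."""
    L = A[p:q + 1] + [float('inf')]      # left run plus sentinel
    R = A[q + 1:r + 1] + [float('inf')]  # right run plus sentinel
    i = j = 0
    for k in range(p, r + 1):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1

def merge_sort(A, p, r):
    if p < r:
        q = (p + r) // 2
        merge_sort(A, p, q)
        merge_sort(A, q + 1, r)
        merge(A, p, q, r)

A = [2, 4, 5, 7, 1, 2, 3, 6]   # the slide's example: two sorted halves
merge(A, 0, 3, 7)
print(A)                       # [1, 2, 2, 3, 4, 5, 6, 7]

B = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(B, 0, len(B) - 1)
print(B)                       # [1, 2, 2, 3, 4, 5, 6, 7]
```

The sentinels let the for loop run exactly n times without checking whether either run is exhausted, matching the Θ(n) analysis that follows.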
Running Time of Merge
• Initialization (copying into the temporary arrays L and R):
  – Θ(n1 + n2) = Θ(n)
• Adding the elements to the final array (the for loop):
  – n iterations, each taking constant time ⇒ Θ(n)
• Total time for Merge:
  – Θ(n)
Analyzing Divide-and-Conquer Algorithms
• The recurrence is based on the three steps of the paradigm:
  – T(n) – running time on a problem of size n
  – Divide the problem into a subproblems, each of size n/b: takes D(n) time
  – Conquer (solve) the subproblems recursively: takes aT(n/b) time
  – Combine the solutions: takes C(n) time

  T(n) = Θ(1)                      if n ≤ c
  T(n) = aT(n/b) + D(n) + C(n)     otherwise
MERGE-SORT Running Time
• Divide:
  – compute q as the average of p and r: D(n) = Θ(1)
• Conquer:
  – recursively solve 2 subproblems, each of size n/2 ⇒ 2T(n/2)
• Combine:
  – MERGE on an n-element subarray takes Θ(n) time ⇒ C(n) = Θ(n)

  T(n) = Θ(1)             if n = 1
  T(n) = 2T(n/2) + Θ(n)   if n > 1
Solve the Recurrence
T(n) = c              if n = 1
T(n) = 2T(n/2) + cn   if n > 1

Use the Master Theorem with a = 2, b = 2:
compare f(n) = cn with n^(log_b a) = n.
Case 2 (f(n) = Θ(n)): T(n) = Θ(n lg n)
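As a numeric sanity check (an added sketch with c = 1, not from the slides): unrolling the recurrence shows the exact solution T(n) = n lg n + n for powers of two, consistent with Θ(n lg n).

```python
from math import log2

def T(n):
    """Exact value of the recurrence T(1) = 1, T(n) = 2T(n/2) + n."""
    return 1 if n == 1 else 2 * T(n // 2) + n

for k in range(11):               # n = 1, 2, 4, ..., 1024
    n = 2 ** k
    assert T(n) == n * log2(n) + n
print(T(1024))                    # 1024*10 + 1024 = 11264
```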
Merge Sort - Discussion
• Running time is insensitive to the input

• Advantages:
  – Guaranteed to run in Θ(n lg n)

• Disadvantage:
  – Requires ≈ n extra space
Sorting Challenge 1
Problem: Sort a file of huge records with tiny keys
Example application: Reorganize your MP3 files

Which method to use?
A. merge sort, guaranteed to run in time ~N lg N
B. selection sort
C. bubble sort
D. a custom algorithm for huge records/tiny keys
E. insertion sort
Sorting Files with Huge Records and Small Keys
• Insertion sort or bubble sort?
  – NO, too many exchanges
• Selection sort?
  – YES, it needs only a linear number of exchanges
• Merge sort or custom method?
  – Probably not: selection sort is simpler and does fewer swaps
Sorting Challenge 2
Problem: Sort a huge randomly-ordered file of small records
Application: Process transaction records for a phone company

Which sorting method to use?
A. Bubble sort
B. Selection sort
C. Mergesort, guaranteed to run in time ~N lg N
D. Insertion sort
Sorting Huge, Randomly - Ordered Files
• Selection sort?
– NO, always takes quadratic time
• Bubble sort?
– NO, quadratic time for randomly-ordered keys
• Insertion sort?
– NO, quadratic time for randomly-ordered keys
• Mergesort?
– YES, it is designed for this problem

Sorting Challenge 3
Problem: Sort a file that is already almost in order
Applications:
– Re-sort a huge database after a few changes
– Double-check that someone else sorted a file
Which sorting method to use?
A. Mergesort, guaranteed to run in time ~N lg N
B. Selection sort
C. Bubble sort
D. A custom algorithm for almost in-order files
E. Insertion sort
Sorting Files That are Almost in Order
• Selection sort?
– NO, always takes quadratic time
• Bubble sort?
– NO, bad for some definitions of “almost in order”
– Ex: B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
• Insertion sort?
– YES, takes linear time for most definitions of “almost
in order”
• Mergesort or custom method?
– Probably not: insertion sort simpler and faster

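The claim above can be made concrete: insertion sort does work proportional to n plus the number of inversions, so a file that is "almost in order" sorts in near-linear time. This is an added sketch, not from the slides:

```python
def insertion_sort_shifts(A):
    """Sort A in place; return how many element shifts were performed.

    The shift count equals the number of inversions in the input, so it
    measures how far the input was from sorted order.
    """
    shifts = 0
    for i in range(1, len(A)):
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:
            A[j + 1] = A[j]        # shift one element right
            j -= 1
            shifts += 1
        A[j + 1] = key
    return shifts

n = 1000
nearly = list(range(n))
nearly[10], nearly[990] = nearly[990], nearly[10]   # "a few changes"
reverse = list(range(n - 1, -1, -1))

print(insertion_sort_shifts(nearly))    # 1959 shifts: near-linear work
print(insertion_sort_shifts(reverse))   # n(n-1)/2 = 499500: quadratic work
```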
