
Advanced Analysis of Algorithms

Cormen: chapter 4
Levitin: chapter 5
Designing Algorithms using
Divide & Conquer Approach
Divide and Conquer Approach
A General Divide and Conquer Algorithm
Step 1:
• If the problem size is small, solve this problem
directly
• Otherwise, split the original problem into 2 or more
sub-problems with almost equal sizes.
Step 2:
• Recursively solve these sub-problems by applying
this algorithm.
Step 3:
• Merge the solutions of the sub-problems into a
solution of the original problem (see the sketch below).
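The three steps above can be captured in one short generic sketch (Python here; the function names and parameters are illustrative, not from the lecture):

def divide_and_conquer(problem, is_small, solve_directly, split, merge):
    # Step 1: small instances are solved directly
    if is_small(problem):
        return solve_directly(problem)
    # Step 1 (otherwise): split into 2 or more subproblems
    subproblems = split(problem)
    # Step 2: recursively solve each subproblem
    solutions = [divide_and_conquer(p, is_small, solve_directly, split, merge)
                 for p in subproblems]
    # Step 3: merge the partial solutions
    return merge(solutions)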
Divide and Conquer (contd.)
[Figure: a problem of size n is split into subproblem 1 and subproblem 2, each of size n/2; the solutions to the subproblems are combined into the solution to the original problem. Don't assume a problem always breaks up into 2 subproblems; there could be more than 2.]
Divide and Conquer (contd.)
• Let us add n numbers using the divide and conquer technique:

a0 + a1 + …… + an-1
= (a0 + …… + a⌊n/2⌋-1) + (a⌊n/2⌋ + …… + an-1)

Is it more efficient than brute force? Let's see with an example.
Div. & Conq. (add n numbers)
[Figure: the array 2 10 3 5 7 1 6 10 1 3 is split recursively into halves; the partial sums are then combined pairwise (12, 3, 5, 7, …, then 15 and 14, then 27 and 21) until the total 48 is reached.]

The number of additions is the same as in brute force, and it needs a stack for recursion. Bad! Not all divide and conquer works! It could be efficient for parallel processors, though.
Div. & Conq. (contd.)
• Usually in div. & conq., a problem instance of size n
is divided into two instances of size n/2
• More generally, an instance of size n can be divided
into b instances of size n/b, with a of them needing to
be solved
• Assuming that n is a power of b (n = b^m), we get
– T(n) = aT(n/b) + f(n), the general divide-and-conquer
recurrence
– Here, f(n) accounts for the time spent on dividing an
instance of size n into subproblems of size n/b and
combining their solutions
– For adding n numbers, a = b = 2 and f(n) = 1 (see the sketch below)
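As a concrete instance of this recurrence, here is a minimal Python sketch (illustrative, not from the slides) of adding n numbers by divide and conquer, with a = b = 2 and one addition, f(n) = 1, to combine the two half-sums:

def dc_sum(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo == hi:                 # problem size 1: solve directly
        return a[lo]
    mid = (lo + hi) // 2         # split into two halves
    return dc_sum(a, lo, mid) + dc_sum(a, mid + 1, hi)   # f(n) = 1 addition

print(dc_sum([2, 10, 3, 5, 7, 1, 6, 10, 1, 3]))   # 48, as in the example above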
Time Complexity of General Algorithms
• Time complexity:

T(n) = 2T(n/2) + S(n) + M(n), for n ≥ c
T(n) = b, for n < c

– where S(n) is the time for splitting
– M(n) is the time for merging
– b and c are constants
Examples
• Binary search
• Quick sort
• Merge sort
Merge-sort
Merge-sort is based on the divide-and-conquer approach
and can be described by the following three steps:
Divide Step:
• If the given array A has zero or one element, return A.
• Otherwise, divide A into two arrays, A1 and A2,
• each containing about half of the elements of A.
Recursion Step:
• Recursively sort arrays A1 and A2.
Conquer Step:
• Combine the elements back in A by merging the sorted
arrays A1 and A2 into a sorted sequence.
Visualization of Merge-sort as Binary Tree
• We can visualize Merge-sort by means of a binary
tree where each node of the tree represents a
recursive call
• Each external node represents an individual element
of the given array A.
• Such a tree is called the Merge-sort tree.
• The heart of the Merge-sort algorithm is the conquer
step, which merges two sorted sequences into a
single sorted sequence
• The merge algorithm is explained next
Div. & Conq.: Mergesort
• Sort an array A[0..n-1]

A[0……n-1]
divide
A[0……⌊n/2⌋-1]   A[⌊n/2⌋……n-1]
sort
A[0……⌊n/2⌋-1]   A[⌊n/2⌋……n-1]
merge
A[0……n-1]

Go on dividing recursively…
Div. & Conq.: Mergesort(contd.)
ALGORITHM Mergesort(A[0..n-1])
//sorts array A[0..n-1] by recursive mergesort
//Input: A[0..n-1] to be sorted
//Output: Sorted A[0..n-1]
if n > 1
copy A[0..⌊n/2⌋-1] to B[0..⌊n/2⌋-1]
copy A[⌊n/2⌋..n-1] to C[0..⌈n/2⌉-1]
Mergesort(B[0..⌊n/2⌋-1])
Mergesort(C[0..⌈n/2⌉-1])
Merge(B, C, A)
B: 2 3 8 9 C: 1 4 5 7

A: 1 2 3 4 5 7 8 9
Div. & Conq.: Mergesort(contd.)
ALGORITHM Merge(B[0..p-1], C[0..q-1], A[0..p+q-1])
//Merges two sorted arrays into one sorted array
//Input: Arrays B[0..p-1] and C[0..q-1] both sorted
//Output: Sorted array A[0..p+q-1] of elements of B and C
i <- 0; j <- 0; k <- 0;
while i < p and j < q do
if B[i] ≤ C[j]
A[k] <- B[i]; i <- i+1
else
A[k] <- C[j]; j <- j+1
k <- k+1
if i = p
copy C[j..q-1] to A[k..p+q-1]
else
copy B[i..p-1] to A[k..p+q-1]
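Since the two routines above map almost line for line onto Python, here is a runnable sketch (for illustration; the array copies mirror the pseudocode rather than aiming for speed):

def merge(b, c, a):
    # Merge sorted lists b and c into a (len(a) == len(b) + len(c))
    i = j = k = 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            a[k] = b[i]; i += 1
        else:
            a[k] = c[j]; j += 1
        k += 1
    if i == len(b):
        a[k:] = c[j:]      # B exhausted: copy the rest of C
    else:
        a[k:] = b[i:]      # C exhausted: copy the rest of B

def mergesort(a):
    # Sorts list a in place by recursive mergesort
    if len(a) > 1:
        b, c = a[:len(a) // 2], a[len(a) // 2:]   # copy halves into B and C
        mergesort(b)
        mergesort(c)
        merge(b, c, a)

a = [8, 3, 2, 9, 7, 1, 5, 4]
mergesort(a)
print(a)   # [1, 2, 3, 4, 5, 7, 8, 9]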
Div. & Conq.: Mergesort(contd.)
Divide:
8 3 2 9 7 1 5 4
8 3 2 9 | 7 1 5 4
8 3 | 2 9 | 7 1 | 5 4
8 | 3 | 2 | 9 | 7 | 1 | 5 | 4
Merge:
3 8 | 2 9 | 1 7 | 4 5
2 3 8 9 | 1 4 5 7
1 2 3 4 5 7 8 9
Div. & Conq.: Mergesort(contd.)
[This slide repeats the Mergesort and Merge pseudocode from the previous slides.]

What is the time-efficiency of Mergesort?

Input size: n = 2^m
Basic operation: comparison

C(n) = 2C(n/2) + C_Merge(n) for n > 1, C(1) = 0

C(n) depends on the input type… In the worst case neither array runs out before the other until the very end, so C_Merge(n) = n-1. How many comparisons are needed to merge, e.g., B: 2 5 8 with C: 3 4 9? Five, i.e., n-1 for n = 6.

C_worst(n) = 2C_worst(n/2) + n - 1 for n > 1, C_worst(1) = 0
C_worst(n) = n lg n - n + 1 ∈ Θ(n lg n)

Could use the Master Theorem too!
Analysis of Merge-sort Algorithm
• Let T(n) be the time taken by this algorithm to sort
an array of n elements, dividing A into sub-arrays A1
and A2.
• It is easy to see that Merge(A1, A2, A) takes
linear time. Consequently,
T(n) = T(n/2) + T(n/2) + θ(n)
T(n) = 2T(n/2) + θ(n)
• The above recurrence relation can be solved by
any of the methods
– Substitution
– recursion tree or
– master method
Analysis: Substitution Method

T(n) = 2T(n/2) + n
T(n/2) = 2T(n/2²) + n/2
T(n/2²) = 2T(n/2³) + n/2²
T(n/2³) = 2T(n/2⁴) + n/2³
. . .
T(n/2^(k-1)) = 2T(n/2^k) + n/2^(k-1)
Analysis of Merge-sort Algorithm

T(n) = 2T(n/2) + n
= 2²T(n/2²) + n + n
= 2³T(n/2³) + n + n + n
...
= 2^k T(n/2^k) + n + n + … + n   (k times)
= 2^k T(n/2^k) + k·n

Let us suppose that n = 2^k, i.e., k = log₂ n.

Hence, T(n) = n·T(1) + n·log₂ n = n + n·log₂ n

T(n) ∈ Θ(n·log₂ n)
Div. & Conq.: Mergesort(contd.)
• Worst-case of Mergesort is Θ(n lg n)
• Average-case is also Θ(n lg n)
• It is stable, but quicksort and heapsort are not
• Possible improvements
– Implement bottom-up: merge pairs of elements, merge the
sorted pairs, and so on (does not require a recursion stack
anymore); see the sketch below
– Could divide into more than two parts, particularly useful for
sorting large files that cannot be loaded into main memory
at once: this version is called "multiway mergesort"
• Not in-place, needs a linear amount of extra memory
– Though we could make it in-place, that adds a bit more
"complexity" to the algorithm
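A minimal sketch of the bottom-up variant mentioned above (an illustration assuming plain Python lists; a production version would reuse a single buffer). It merges runs of width 1, 2, 4, … and needs no recursion:

def mergesort_bottom_up(a):
    n, width = len(a), 1
    while width < n:
        for lo in range(0, n, 2 * width):
            mid, hi = min(lo + width, n), min(lo + 2 * width, n)
            merged, i, j = [], lo, mid
            while i < mid and j < hi:        # merge runs a[lo:mid] and a[mid:hi]
                if a[i] <= a[j]:
                    merged.append(a[i]); i += 1
                else:
                    merged.append(a[j]); j += 1
            merged.extend(a[i:mid]); merged.extend(a[j:hi])
            a[lo:hi] = merged
        width *= 2                           # double the run width each pass

a = [5, 3, 1, 9, 8, 2, 4, 7]
mergesort_bottom_up(a)
print(a)   # [1, 2, 3, 4, 5, 7, 8, 9]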
Div. & Conq.: Quicksort
• Another divide and conquer based
sorting algorithm, discovered by C. A.
R. Hoare (British) in 1960 while trying
to sort words for a machine translation
project from Russian to English
• Instead of “Merge” in Mergesort,
Quicksort uses the idea of
partitioning.
Div. & Conqr.: Quicksort (contd.)

[Figure: a partition step rearranges A[0]…A[n-1] into A[0]…A[s-1], A[s], A[s+1]…A[n-1], where all elements of A[0]…A[s-1] are ≤ A[s] and all elements of A[s+1]…A[n-1] are ≥ A[s].]

Notice: A[s] is in its final position. Now continue working with these two parts.

In Mergesort all work is in combining the partial solutions. In Quicksort all work is in dividing the problem; combining does not require any work!
Div. & Conqr.: Quicksort (contd.)
ALGORITHM Quicksort(A[l..r])
//Sorts a subarray by quicksort
//Input: Subarray of array A[0..n-1] defined by its
//left and right indices l and r
//Output: Subarray A[l..r] sorted in nondecreasing
//order
if l < r
s <- Partition( A[l..r] ) // s is a split position
Quicksort( A[l..s-1] )
Quicksort( A[s+1..r] )
Div. & Conqr.: Quicksort (contd.)
• As a partition algorithm, we start by selecting
a "pivot"
• There are various strategies to select the
pivot; we shall use the simplest: select pivot
p = A[l], the first element of A[l..r]
Div. & Conqr.: Quicksort (contd.)

[Figure: two scans over A[l..r] with pivot p = A[l]: index i moves right, index j moves left.]

If A[i] < p, we continue incrementing i and stop when A[i] ≥ p.
If A[j] > p, we continue decrementing j and stop when A[j] ≤ p.

[Figure: the three ways the scans can end: (1) i < j, with A[i] ≥ p and A[j] ≤ p, so swap A[i] and A[j] and continue; (2) j < i, the indices have crossed, so swap the pivot A[l] with A[j]; (3) j = i, where A[i] = A[j] = p, and the pivot is again swapped with A[j]. In every case the pivot ends up in its final position with all elements ≤ p to its left and all elements ≥ p to its right.]
Div. & Conqr.: Quicksort (contd.)
ALGORITHM HoarePartition(A[l..r])
//Output: the split position
p <- A[l]
i <- l; j <- r+1
repeat
repeat i <- i+1 until A[i] ≥ p
repeat j <- j-1 until A[j] ≤ p
swap( A[i], A[j] )
until i ≥ j
swap( A[i], A[j] ) // undo last swap when i ≥ j
swap( A[l], A[j] )
return j

Do you see any possible problem with this pseudocode? i could go out of the array's bound; we could check for that, or we could put a "sentinel" at the end… More sophisticated pivot selection, which we shall see shortly, makes this "sentinel" unnecessary.
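Here is a runnable Python sketch of HoarePartition and Quicksort. One liberty taken: instead of a sentinel, the left-to-right scan is bounded by r explicitly, which handles the out-of-bounds case noted above.

def hoare_partition(a, l, r):
    p = a[l]
    i, j = l, r + 1
    while True:
        i += 1
        while i <= r and a[i] < p:   # stop when a[i] >= p (or at the end)
            i += 1
        j -= 1
        while a[j] > p:              # stop when a[j] <= p
            j -= 1
        if i >= j:                   # indices crossed or met: done scanning
            break
        a[i], a[j] = a[j], a[i]
    a[l], a[j] = a[j], a[l]          # put the pivot in its final position
    return j

def quicksort(a, l=0, r=None):
    if r is None:
        r = len(a) - 1
    if l < r:
        s = hoare_partition(a, l, r)
        quicksort(a, l, s - 1)
        quicksort(a, s + 1, r)

a = [5, 3, 1, 9, 8, 2, 4, 7]
quicksort(a)
print(a)   # [1, 2, 3, 4, 5, 7, 8, 9]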
Div. & Conqr.: Quicksort (contd.)
[Trace: HoarePartition applied to 5 3 1 9 8 2 4 7 with pivot 5. The scans and swaps transform the array step by step (5 3 1 4 8 2 9 7, then 5 3 1 4 2 8 9 7) until i and j cross; swapping the pivot with A[j] yields 2 3 1 4 5 8 9 7 with 5 in its final position. The two parts 2 3 1 4 and 8 9 7 are then partitioned recursively, eventually producing 1 2 3 4 5 7 8 9.]
Div. & Conqr.: Quicksort (contd.)
[Figure: the tree of recursive calls for quicksorting 5 3 1 9 8 2 4 7 (indices 0..7). The root call l=0, r=7 splits at s=4; its children are l=0, r=3 (split at s=1) and l=5, r=7 (split at s=6). The leaves are single-element or empty subarrays: l=0, r=0; l=2, r=3 (split at s=2, with children l=2, r=1 and l=3, r=3); l=5, r=5; and l=7, r=7.]
Div. & Conqr.: Quicksort (contd.)
• Let us analyze Quicksort

What is the time-complexity of one call to Partition on a subarray of size n? Each scan step compares an element with the pivot. If the indices cross over (i > j), n+1 comparisons are made; if they coincide (i = j), n comparisons are made. What if the array contains elements equal to the pivot, e.g., 5 3 1 4 5 8 9 7?

If all splits happen in the middle, it is the best case:
C_best(n) = 2C_best(n/2) + n for n > 1
C_best(1) = 0
Div. & Conqr.: Quicksort (contd.)
Master Theorem: for T(n) = aT(n/b) + f(n), a ≥ 1, b > 1,
if f(n) ∈ Θ(n^d) with d ≥ 0, then
T(n) ∈ Θ(n^d) if a < b^d
T(n) ∈ Θ(n^d lg n) if a = b^d
T(n) ∈ Θ(n^(log_b a)) if a > b^d

C_best(n) = 2C_best(n/2) + n for n > 1, C_best(1) = 0
Using the Master Theorem, C_best(n) ∈ Θ(n lg n)

What is the worst case? With our pivot rule it occurs on an already sorted array such as 2 5 6 8 9: each partition peels off only the pivot, costing 5+1 = 6, then 4+1 = 5, then 3+1 = 4, then 2+1 = 3 comparisons.

C_worst(n) = (n+1) + n + … + 3 = (n+1)(n+2)/2 - 3 ∈ Θ(n²)!

So, Quicksort's fate depends on its average-case!
Div. & Conqr.: Quicksort (contd.)
• Let us sketch the outline of the average-case analysis…

C_avg(n) is the average number of key comparisons made by Quicksort on a randomly ordered array of size n.

After n+1 comparisons, a partition can happen in any position s (0 ≤ s ≤ n-1); let us assume each position s is equally likely, with probability 1/n. After the partition, the left part has s elements and the right part has n-1-s elements.

Averaging over all possibilities:
C_avg(n) = (1/n) Σ_{s=0}^{n-1} [ (n+1) + C_avg(s) + C_avg(n-1-s) ] for n > 1
C_avg(0) = 0, C_avg(1) = 0

Solving this recurrence gives C_avg(n) ≈ 1.39 n lg n
Div. & Conqr.: Quicksort (contd.)
• Recall that for Quicksort, Cbest(n) ≈ nlgn
• So, Cavg(n) ≈ 1.39nlgn is not far from Cbest(n)
• Quicksort is usually faster than Mergesort or
Heapsort on randomly ordered arrays of nontrivial
sizes
• Some possible improvements
– Randomized quicksort: selects a random element as pivot
– Median-of-three: selects the median of the left-most, middle, and
right-most elements as pivot (see the sketch below)
– Switching to insertion sort on very small subarrays, or not
sorting small subarrays at all and finishing the algorithm with
insertion sort applied to the entire, nearly sorted array
– Modified partitioning: three-way partition

These improvements can speed up quicksort by 20% to 30%
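A sketch of the median-of-three rule from the list above (one of several common variants; the exact placement of the pivot is an implementation choice). The median of A[l], A[middle], and A[r] is moved into A[l] so the partition routine above can be used unchanged:

def median_of_three(a, l, r):
    m = (l + r) // 2
    # order the three candidates so that a[m] holds their median
    if a[m] < a[l]: a[l], a[m] = a[m], a[l]
    if a[r] < a[l]: a[l], a[r] = a[r], a[l]
    if a[r] < a[m]: a[m], a[r] = a[r], a[m]
    a[l], a[m] = a[m], a[l]   # move the median into the pivot slot A[l]
    return a[l]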


Div. & Conqr.: Quicksort (contd.)
• Weaknesses
– Not stable
– Requires a stack to store the parameters of
subarrays that are yet to be sorted; the
stack can be made O(lg n), but that
is still worse than the O(1) space efficiency of
Heapsort

DONE with Quicksort!
Div. & Conq. : Multiplication of Large Integers
• We want to efficiently multiply two very large
numbers, say each with more than 100
decimal digits
• How do we usually multiply 23 and 14?
• 23 = 2*10^1 + 3*10^0, 14 = 1*10^1 + 4*10^0
• 23*14 = (2*10^1 + 3*10^0) * (1*10^1 + 4*10^0)
• 23*14 = (2*1)10^2 + (2*4+3*1)10^1 + (3*4)10^0
• How many multiplications?
4 = n^2
Div. & Conq. : Multiplication of Large Integers
• 23*14 = (2*1)10^2 + (2*4+3*1)10^1 + (3*4)10^0
We can rewrite the middle term as:
(2*4+3*1) = (2+3)*(1+4) - 2*1 - 3*4
What has been gained? We have reused 2*1 and 3*4 and now need
one less multiplication.

If we have a pair of 2-digit numbers a and b,
a = a1a0 and b = b1b0,
we can write c = a*b = c2*10^2 + c1*10^1 + c0*10^0

c2 = a1*b1, c0 = a0*b0
c1 = (a1+a0)*(b1+b0) - (c2+c0)
Div. & Conq. : Multiplication of Large Integers
The same trick scales up. For example,
a = 1234 = 1*10^3 + 2*10^2 + 3*10^1 + 4*10^0 = (12)10^2 + (34)

If we have two n-digit numbers a and b (assume n is a power of 2),
split each into a high half and a low half of n/2 digits:
a: a1 a0
b: b1 b0
We can write
a = a1*10^(n/2) + a0
b = b1*10^(n/2) + b0
c = a*b = c2*10^n + c1*10^(n/2) + c0
c2 = a1*b1, c0 = a0*b0
c1 = (a1+a0)*(b1+b0) - (c2+c0)
Apply the same idea recursively to get c2, c1, c0, until n is so small
that you can multiply directly.
Div. & Conq. : Multiplication of Large Integers
Notice: a1, a0, b1, b0 are all n/2-digit numbers, so computing a*b
requires three n/2-digit multiplications, plus 5 additions and
1 subtraction:
c2 = a1*b1, c0 = a0*b0
c1 = (a1+a0)*(b1+b0) - (c2+c0)

Recurrence for the number of multiplications (assume n = 2^m):
M(n) = 3M(n/2) for n > 1, M(1) = 1
M(n) = 3M(n/2) = 3[ 3M(n/2²) ] = 3²M(n/2²) = … = 3^m M(n/2^m) = 3^m
M(n) = 3^m = 3^(lg n) = n^(lg 3) ≈ n^1.585
(Why does 3^(lg n) = n^(lg 3)? Let x = 3^(lg n); then lg x = lg n * lg 3, so x = n^(lg 3).)

For the number of additions and subtractions:
A(n) = 3A(n/2) + cn for n > 1, A(1) = 0
Using the Master Theorem, A(n) ∈ Θ(n^(lg 3))
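A sketch of this idea on Python integers (it splits in binary rather than the decimal digits of the slides, and the 2**32 cutoff is an arbitrary choice, but the recurrence is the same: three half-size multiplications per step):

def karatsuba(a, b):
    if a < 2**32 or b < 2**32:        # small enough: one hardware multiply
        return a * b
    n = max(a.bit_length(), b.bit_length())
    half = n // 2
    a1, a0 = a >> half, a & ((1 << half) - 1)   # a = a1*2^half + a0
    b1, b0 = b >> half, b & ((1 << half) - 1)   # b = b1*2^half + b0
    c2 = karatsuba(a1, b1)                       # c2 = a1*b1
    c0 = karatsuba(a0, b0)                       # c0 = a0*b0
    c1 = karatsuba(a1 + a0, b1 + b0) - c2 - c0   # one mult instead of two
    return (c2 << (2 * half)) + (c1 << half) + c0

x, y = 1234567890123456789, 9876543210987654321
print(karatsuba(x, y) == x * y)   # True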
Div. & Conq. : Multiplication of Large Integers
• People used to believe that multiplying two n-digit
numbers has complexity Ω(n²)
• In 1960, Russian mathematician Anatoly
Karatsuba gave this algorithm, whose asymptotic
complexity is Θ(n^1.585)
• One use of large-number multiplication is in modern
cryptography
• It does not generally make sense to recurse all the
way down to 1 digit: for most processors, 16- or 32-bit
multiplication is a single operation, so by that point
the numbers should be handed over to the built-in
procedure
Next we see how to multiply
Matrices efficiently…
Div. & Conq. : Strassen's Matrix Multiplication
• How do we multiply two 2×2 matrices?

[Example: a 2×2 matrix product computed by the brute-force formula. How many multiplications and additions did we need? 8 mults and 4 adds.]

V. Strassen found out in 1969 that he can do the above multiplication in the following way:

c00 = m1+m4-m5+m7   c01 = m3+m5
c10 = m2+m4         c11 = m1+m3-m2+m6

m1 = (a00+a11)*(b00+b11)   m2 = (a10+a11)*b00   m3 = a00*(b01-b11)
m4 = a11*(b10-b00)         m5 = (a00+a01)*b11
m6 = (a10-a00)*(b00+b01)   m7 = (a01-a11)*(b10+b11)

7 mults, 18 adds/subs
Div. & Conq. : Strassen's Matrix Multiplication
• Let us see how we can apply Strassen's idea to
multiplying two n×n matrices.
Let A and B be two n×n matrices where n is a power of 2.
Partition each of A, B, and C = A*B into four (n/2)×(n/2) blocks:

A = [A00 A01; A10 A11], B = [B00 B01; B10 B11], C = [C00 C01; C10 C11]

In Strassen's method, you can treat the blocks as if they were
numbers to get C = A*B, e.g.,
M1 = (A00+A11)*(B00+B11)
M2 = (A10+A11)*B00
etc.
Div. & Conq. : Strassen's Matrix Multiplication
ALGORITHM Strassen(A, B, n)
//Input: A and B are n×n matrices,
//where n is a power of two
//Output: C = A*B
if n = 1
return C = A*B
else
Partition A = [A00 A01; A10 A11] and B = [B00 B01; B10 B11],
where the blocks Aij and Bij are (n/2)-by-(n/2)
M1 <- Strassen(A00+A11, B00+B11, n/2)
M2 <- Strassen(A10+A11, B00, n/2)
M3 <- Strassen(A00, B01-B11, n/2)
M4 <- Strassen(A11, B10-B00, n/2)
M5 <- Strassen(A00+A01, B11, n/2)
M6 <- Strassen(A10-A00, B00+B01, n/2)
M7 <- Strassen(A01-A11, B10+B11, n/2)
C00 <- M1+M4-M5+M7
C01 <- M3+M5
C10 <- M2+M4
C11 <- M1+M3-M2+M6
return C = [C00 C01; C10 C11]

Recurrence for the number of multiplications:
M(n) = 7M(n/2) for n > 1, M(1) = 1
For n = 2^m: M(n) = 7M(n/2) = 7²M(n/2²) = … = 7^m M(n/2^m) = 7^m = 7^(lg n) = n^(lg 7) ≈ n^2.807

For the number of adds/subs:
A(n) = 7A(n/2) + 18(n/2)² for n > 1, A(1) = 0
Using the Master Theorem, A(n) ∈ Θ(n^(lg 7)), better than brute force's Θ(n³).

DONE WITH STRASSEN!
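For illustration, a Python sketch of the algorithm above on plain lists, for n a power of 2 (a practical version would switch to the classical method below some cutoff size):

def madd(X, Y): return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]
def msub(X, Y): return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    # partition A and B into (n/2)x(n/2) blocks
    A00 = [row[:h] for row in A[:h]]; A01 = [row[h:] for row in A[:h]]
    A10 = [row[:h] for row in A[h:]]; A11 = [row[h:] for row in A[h:]]
    B00 = [row[:h] for row in B[:h]]; B01 = [row[h:] for row in B[:h]]
    B10 = [row[:h] for row in B[h:]]; B11 = [row[h:] for row in B[h:]]
    # the seven recursive products
    M1 = strassen(madd(A00, A11), madd(B00, B11))
    M2 = strassen(madd(A10, A11), B00)
    M3 = strassen(A00, msub(B01, B11))
    M4 = strassen(A11, msub(B10, B00))
    M5 = strassen(madd(A00, A01), B11)
    M6 = strassen(msub(A10, A00), madd(B00, B01))
    M7 = strassen(msub(A01, A11), madd(B10, B11))
    C00 = madd(msub(madd(M1, M4), M5), M7)
    C01 = madd(M3, M5)
    C10 = madd(M2, M4)
    C11 = madd(msub(madd(M1, M3), M2), M6)
    # reassemble the four blocks into C
    top = [r0 + r1 for r0, r1 in zip(C00, C01)]
    bot = [r0 + r1 for r0, r1 in zip(C10, C11)]
    return top + bot

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]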
Searching: Finding Maxima in 1-D
A Simple Example in 1-D
Finding the maximum of a set S of n numbers

[Figure: a tournament tree over 14, 10, 4, 8, 2, 12, 6, 0. The pairwise maxima 14, 8, 12, 6 are computed first, then 14 and 12, and finally the overall maximum 14.]
Time Complexity

T(n) = 2T(n/2) + 1, for n > 2
T(n) = 1, for n ≤ 2

• Assume n = 2^k, then

T(n) = 2T(n/2) + 1 = 2(2T(n/4) + 1) + 1
= 2²T(n/2²) + 2 + 1
= 2²(2T(n/2³) + 1) + 2 + 1
= 2³T(n/2³) + 2² + 2¹ + 1
:
= 2^(k-1)T(n/2^(k-1)) + 2^(k-2) + … + 2² + 2¹ + 1
= 2^(k-1)T(2) + 2^(k-2) + … + 2² + 2¹ + 1
= 2^(k-1) + 2^(k-2) + … + 4 + 2 + 1 = 2^k - 1 = n - 1 = Θ(n)
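A sketch of this tournament-style maximum in Python (illustrative only; in practice the built-in max does a simple linear scan):

def dc_max(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if hi - lo <= 1:                 # n <= 2: solve directly (1 comparison)
        return max(a[lo], a[hi])
    mid = (lo + hi) // 2
    # two half-size subproblems plus 1 comparison to combine
    return max(dc_max(a, lo, mid), dc_max(a, mid + 1, hi))

print(dc_max([14, 10, 4, 8, 2, 12, 6, 0]))   # 14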
Finding Maxima in 2-D using Divide and Conquer
How to Find Maxima in 2-D

[Figure: a set of points P1, P2, P3, P4 split by a vertical line L into SL and SR. {P1, P2} are both maximal in SL and {P3} is the only maximal point in SR.]

Merging SL and SR

[Figure: the same point set. After merging the maximal points of SL and SR, only {P2, P3} remain maximal.]
Divide and Conquer for Maxima Finding Problem

[Figure: points P1, P2, P3, P6, P10, P12 with the maximal points of SL and SR marked.]

[Figure: the same points after merging; P3, though maximal in SL, is not a maximal point of the whole set.]

2-D Maxima Finding Problem

[Figure: the resulting maximal set of all the points.]
Algorithm: Maxima Finding Problem
Input: A set S of 2-dimensional points.
Output: The maximal set of S.
Maxima(P[1..n])
1. Sort the points in ascending order w.r.t. the X axis.
2. If |S| = 1, then return it; else find a line perpendicular
to the X-axis which separates S into SL and SR, each
consisting of n/2 points.
3. Recursively find the maximal points of SL and SR.
4. Project the maximal points of SL and SR onto L and sort
these points according to their y-values.
5. Conduct a linear scan on the projections and discard
each maximal point of SL if its y-value is less than the
y-value of some maximal point of SR.
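A Python sketch of the algorithm above (assuming, as the slides implicitly do, that the coordinates are distinct). Since every point of SR has a larger x-coordinate, a maximal point of SL survives the merge only if its y-value beats every maximal point of SR:

def maxima(points):
    pts = sorted(points)                 # step 1: sort by x-coordinate
    def rec(s):
        if len(s) == 1:                  # step 2: |S| = 1, return it
            return s
        half = len(s) // 2               # split by a vertical line
        max_l, max_r = rec(s[:half]), rec(s[half:])   # step 3: recurse
        # steps 4-5: discard points of SL dominated by some point of SR
        top_y = max(y for _, y in max_r)
        return [p for p in max_l if p[1] > top_y] + max_r
    return rec(pts)

pts = [(1, 4), (2, 2), (3, 5), (4, 1), (5, 3), (6, 2)]
print(maxima(pts))   # [(3, 5), (5, 3), (6, 2)]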
Time Complexity

T(n) = 2T(n/2) + O(n) + O(n), for n ≥ 2
T(n) = 1, for n < 2

Assume n = 2^k, then

T(n) = 2T(n/2) + n + n
= 2(2T(n/4) + n/2 + n/2) + n + n
= 2²T(n/2²) + n + n + n + n
= 2²T(n/2²) + 4n
= 2²(2T(n/2³) + n/4 + n/4) + 4n
= 2³T(n/2³) + 2n + 4n = 2³T(n/2³) + 6n
Time Complexity
T(n) = 2³T(n/2³) + 6n
.
.
T(n) = 2^k T(n/2^k) + 2kn
= 2^k T(2^k/2^k) + 2kn    (since n = 2^k)
Hence
T(n) = 2^k + 2kn
Since n = 2^k, we have k = log(n), so
T(n) = n + 2n·log n = Θ(n·log n)
Is it Necessary to Divide the Problem into two Parts?
Maximal Points: Dividing Problem into four Parts
Maximal points in S11 = {P1}
Maximal points in S12 = {P3, P4}
Maximal points in S21 = {P5, P6}
Maximal points in S22 = {P7, P8}

[Figure: the point set divided into four quadrants, with P1, P3, P4, P5, P6, P7, P8 marked.]
Maximal Points: Dividing Problem into four Parts

Merging S11, S12: A1 = {P3, P4}
Merging S21, S22: A2 = {P7, P8}
Merging A1, A2: A = {P3, P7, P8}

[Figure: the merge steps, showing the surviving maximal points P3, P7, P8.]

Finding Closest Pair in 2-D
Closest Pair in 2-D using Divide and Conquer
Problem
The closest pair problem is defined as follows:
• Given a set of n points,
• determine the two points that are closest to each
other in terms of distance.
• Furthermore, if there is more than one pair of
points with the closest distance, all such pairs
should be identified.
Div. & Conq.: Closest pair
• Find the two closest points in a set of n points

ALGORITHM BruteForceClosestPair(P)
//Input: A list P of n (n≥2) points p1(x1,y1),
//p2(x2,y2), …, pn(xn,yn)
//Output: distance between closest pair
d <- ∞
for i <- 1 to n-1 do
for j <- i+1 to n do
d <- min( d, sqrt( (xi-xj)² + (yi-yj)² ) )
return d

Application (traffic control): detect the two vehicles most likely to collide.

Idea: consider each pair of points and keep track of the pair having the minimum distance. There are n(n-1)/2 pairs, so the time-efficiency is in Θ(n²).
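A direct Python rendering of BruteForceClosestPair (a sketch; one could also compare squared distances and defer the sqrt):

from math import sqrt, inf

def brute_force_closest_pair(pts):
    d = inf
    for i in range(len(pts) - 1):          # each pair is examined once
        for j in range(i + 1, len(pts)):
            (x1, y1), (x2, y2) = pts[i], pts[j]
            d = min(d, sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2))
    return d

print(brute_force_closest_pair([(0, 0), (3, 4), (1, 1), (5, 5)]))   # sqrt(2)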
Closest Pair: Divide and Conquer Approach
• First we sort the points on an x-coordinate basis and
divide them into left and right parts:
p1 p2 ... p_{n/2} and p_{n/2+1} ... p_n
• Solve the left and right sub-problems recursively
• Let d = min {dl, dr}
• How do we combine the two solutions to the sub-problems?

[Figure: the point set split into PL and PR by a vertical line, with the closest-pair distance d.]
Closest Pair: Divide and Conquer Approach
• How do we combine two solutions?
– Let d = min {dl, dr}, where d is the distance of the closest
pair whose points are both in the left or both in the right part.
– Something is missing: we also have to check pairs where
one point is from the left and the other from the right.
– Such a closest pair can only be in a strip of width
2d around the dividing line; otherwise the points
would be more than d units apart.
• Combining solutions:
• Find the closest pair in a strip of width 2d,
knowing that no pair within either half is
closer than d.
Closest Pair: Divide and Conquer Approach
• Combining solutions:
• For a given point p from one partition, where can
there be a point q from the other partition that can
form the closest pair with p?
• How many points can there be in a d×d square
of the strip?
– At most 4
• Algorithm for checking the strip:
– Sort all the points in the strip on the y-coordinate
– For each point p, only the 7 points ahead of it in that
order have to be checked to see whether any of them is
closer to p than d
Div. & Conq.: Closest pair (contd.)
• We shall apply the "divide and conquer"
technique to find a better solution.
Any idea how to divide and then conquer? Solve the right and left
portions recursively and then combine the partial solutions.

Let P be the set of points sorted by x-coordinates, and let Q be the
same points sorted by y-coordinates. Divide along the vertical line
x = m into a left portion and a right portion, with closest-pair
distances dl and dr.

How should we combine? Is d = min {dl, dr} the answer?
That does not work on its own, because one point can be in the left
portion and the other in the right portion, with distance < d
between them…
Div. & Conq.: Closest pair (contd.)
d = min{dl, dr}. We wish to find a pair having distance < d.
It is enough to consider the points inside the symmetric vertical
strip of width 2d around the separating line x = m. Why? Because
the distance between any other pair of points is at least d.

Let S be the list of points inside the strip of width 2d, obtained
from Q (meaning S is sorted by y-coordinates). But S can contain
all the points, right? We shall scan through S, updating the
information about dmin, initially dmin = d.

Let p(x, y) be a point in S. For a point p'(x', y') to have a
chance to be closer to p than dmin, p' must "follow" p in S and
the difference between their y-coordinates must be less than dmin.
Div. & Conq.: Closest pair (contd.)
Why? Geometrically, such a p' must lie in the rectangle of height
dmin and width 2d centered on the line x = m just above p. It seems
this rectangle can contain many points, maybe all of them…

Now comes the crucial observation; everything hinges on this one:
how many points can there be in the dmin-by-2d rectangle? At most 8,
including p itself, because the points within each half of the
rectangle are at least d apart. So, with p being one of these 8, we
need to check at most 7 pairs to find whether any pair has
distance < dmin.
Div. & Conq.: Closest pair (contd.)
ALGORITHM EfficientClosestPair(P, Q)
//Solves the closest-pair problem by divide and conquer
//Input: An array P of n ≥ 2 points sorted by x-coordinates and another array Q of the same
//points sorted by y-coordinates
//Output: Distance between the closest pair
if n ≤ 3
return minimal distance found by brute force
else
copy the first ⌈n/2⌉ points of P to array Pl
copy the same points from Q to array Ql
copy the remaining ⌊n/2⌋ points of P to array Pr
copy the same points from Q to array Qr
dl <- EfficientClosestPair( Pl, Ql )
dr <- EfficientClosestPair( Pr, Qr )
d <- min{ dl, dr }
m <- P[⌈n/2⌉-1].x // x-coordinate of the dividing line
copy all points of Q for which |x-m| < d into array S[0..num-1] // the 2d-wide strip
dminsq <- d²
for i <- 0 to num-2 do
k <- i+1
while k ≤ num-1 and ( S[k].y - S[i].y )² < dminsq
dminsq <- min( (S[k].x-S[i].x)² + (S[k].y-S[i].y)² , dminsq )
k <- k+1
return sqrt( dminsq ) // could easily keep track of the pair of points

The algorithm spends linear time in dividing and merging, so assuming
n = 2^m, we have the following recurrence for the running time:
T(n) = 2T(n/2) + f(n), where f(n) ∈ Θ(n).
Applying the Master Theorem, T(n) ∈ Θ(n lg n).
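A Python sketch following the pseudocode above (assuming distinct points; it works with squared distances internally and takes the sqrt once at the end):

from math import sqrt

def closest_pair(points):
    P = sorted(points)                           # sorted by x
    Q = sorted(points, key=lambda p: p[1])       # same points sorted by y
    return sqrt(_cp(P, Q))

def _brute_sq(P):
    # squared distance of the closest pair, for n <= 3
    return min((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
               for i, a in enumerate(P) for b in P[i + 1:])

def _cp(P, Q):
    n = len(P)
    if n <= 3:
        return _brute_sq(P)
    half = (n + 1) // 2
    Pl, Pr = P[:half], P[half:]
    left = set(Pl)
    Ql = [p for p in Q if p in left]         # keep y-order within each half
    Qr = [p for p in Q if p not in left]
    dsq = min(_cp(Pl, Ql), _cp(Pr, Qr))
    m = Pl[-1][0]                            # x of the dividing line
    S = [p for p in Q if (p[0] - m) ** 2 < dsq]   # the 2d-wide strip
    for i in range(len(S) - 1):
        k = i + 1
        while k < len(S) and (S[k][1] - S[i][1]) ** 2 < dsq:
            dsq = min(dsq, (S[k][0] - S[i][0]) ** 2 + (S[k][1] - S[i][1]) ** 2)
            k += 1
    return dsq

pts = [(0, 0), (3, 4), (1, 1), (5, 5), (2, 7)]
print(closest_pair(pts))   # 1.414..., the pair (0,0)-(1,1)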
Closest Pair: Divide and Conquer Approach
1. Partition the strip into squares of side length d/2,
as shown in the picture.
2. Each square contains at most 1 point, by
definition of d.
3. If there are at least 2 rows of squares between two
points, they cannot be the closest points.
4. So there are at most 8 squares to check.

[Figure: the strip around the dividing line L partitioned into d/2 × d/2 squares.]
Closest Pair: Divide and Conquer Approach
Running Time
• The running time of this divide-and-conquer algorithm
can be described by a recurrence
– Divide = O(1)
– Combine = O(n lg n)
– This gives the recurrence below
– Total running time: O(n log² n)

T(n) = 1, if n ≤ 3
T(n) = 2T(n/2) + n log n, otherwise
Improved Version: Divide and Conquer Approach
• Sort all the points by x and by y coordinate once
• Before the recursive calls, partition the sorted lists
into two sorted sublists for the left and right
halves; this takes only O(n) time
• When combining, run through the y-sorted list
once and select all points that are in a 2d strip
around the partition line, again in O(n) time
• New recurrence:

T(n) = 1, if n ≤ 3
T(n) = 2T(n/2) + n, otherwise

which gives a total running time of O(n log n).
Conclusion
• The brute-force approach was discussed, and the
design of some algorithms based on it was also discussed.
• Algorithms computing maximal points are a
generalization of sorting algorithms.
• Maximal points are useful in computer science and
mathematics; a point is maximal if no other point
dominates it in every component.
• In fact, we put elements in a certain order.
• Formally, the output of any sorting algorithm
must satisfy the following two conditions:
– the output is in decreasing/increasing order,
and
– the output is a permutation of the input.
