Design and Analysis of Algorithms
Lecture 1
Algorithms:
Electronic commerce.
The ability to keep information (credit card numbers, passwords, bank statements) private, safe, and secure.
The algorithms involved include encryption/decryption techniques.
Hard problems (e.g., NP-complete problems).
We can judge the efficiency of an algorithm by its speed (how long the algorithm takes to produce the result).
Computers may be fast, but they are not infinitely fast, and memory may be cheap, but it is not free. These resources should be used wisely.
As an example, the path from problem to solution:
Problem → (analysis) → Specification → (design) → Algorithm → (implementation) → Program → (compilation) → Executable (solution)
Components of an Algorithm
• Variables and values
• Instructions
• Procedures (functions)
• Selection (if statements)
• Repetition (for and while loops)
• Documentation
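A minimal Python sketch (my illustration, not from the lecture) showing each of these components in one place:

    # Documentation: compute the average of a list of numbers.
    def average(numbers):                # procedure (function)
        if len(numbers) == 0:            # selection (if statement)
            return 0.0
        total = 0.0                      # variables and values
        for x in numbers:                # repetition (for loop)
            total += x                   # instruction
        return total / len(numbers)

    print(average([2, 4, 6]))            # prints 4.0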
Running time & input size
The running time of an algorithm on a particular input is the number of primitive operations or "steps" executed.
The best notion of "input size" depends on the problem being studied. For example:
In sorting or computing discrete Fourier transforms, the most natural measure is the number of items in the input, e.g., the array size n for sorting.
On the other hand, in multiplying two integers, the best measure is the total number of bits needed to represent the input in ordinary binary notation.
• The running time depends on the input: an
already sorted sequence is easier to sort.
• Parameterize the running time by the size of the
input, since short sequences are easier to sort
than long ones.
• Generally, we seek upper bounds on the running
time, because everybody likes a guarantee.
Complexity of an Algorithm
The complexity of an algorithm M is the function f(n) which gives the running time and/or storage space requirement of the algorithm in terms of the size n of the input data. Frequently, the storage space required by an algorithm is simply a multiple of the data size n. Accordingly, unless otherwise stated, the term "complexity" will refer to the running time of an algorithm.
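To make "number of primitive operations" concrete, here is a small Python sketch (mine, not from the lecture) that counts the steps executed by a linear search, so f(n) can be observed directly:

    def linear_search_steps(A, target):
        """Return (index or None, number of primitive steps executed)."""
        steps = 0
        for i in range(len(A)):
            steps += 1                    # one comparison per element examined
            if A[i] == target:
                return i, steps
        return None, steps

    # Worst case (target absent): f(n) = n comparisons.
    for n in (10, 100, 1000):
        _, steps = linear_search_steps(list(range(n)), -1)
        print(n, steps)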
Cases for the complexity function (asymptotic notation)

Theta notation (Θ): f(n) = Θ(g(n)) if there exist positive constants c1, c2, and n0 such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
[Figure: f(n) sandwiched between c1·g(n) and c2·g(n) to the right of n0.]

Big-oh notation (O): f(n) = O(g(n)) if there exist positive constants c and n0 such that f(n) ≤ c·g(n) for all n ≥ n0.
[Figure: f(n) bounded above by c·g(n) to the right of n0.]

Omega notation (Ω): f(n) = Ω(g(n)) if there exist positive constants c and n0 such that f(n) ≥ c·g(n) for all n ≥ n0.
[Figure: f(n) bounded below by c·g(n) to the right of n0.]

Small-oh notation (o): f(n) = o(g(n)) if for every constant c > 0 there exists n0 > 0 such that f(n) < c·g(n) for all n ≥ n0.

Properties:
Reflexivity: f(n) = Θ(f(n)), f(n) = O(f(n)), f(n) = Ω(f(n)).
Symmetry: f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).
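As a concrete instance (my example, not from the slides), a quick Python check that f(n) = 2n² + 3n is Θ(n²) with witnesses c1 = 2, c2 = 5, n0 = 1:

    # Verify c1*g(n) <= f(n) <= c2*g(n) for n >= n0 over a sample range.
    f = lambda n: 2 * n * n + 3 * n
    g = lambda n: n * n
    c1, c2, n0 = 2, 5, 1
    assert all(c1 * g(n) <= f(n) <= c2 * g(n) for n in range(n0, 10_000))
    print("2n^2 + 3n = Theta(n^2) holds on the sampled range")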
Divide and conquer:
Divide the problem into a number of sub-problems.
Conquer the sub-problems by solving them recursively.
Combine the solutions to the sub-problems into the solution for the original problem.
Lecture 4: Merge sort
The divide step simply computes an index q that partitions A[p..r] into two sub-arrays: A[p..q], containing ⌈n/2⌉ elements, and A[q+1..r], containing ⌊n/2⌋ elements.
To sort the entire sequence A = {A[1], A[2], . . . , A[n]}, we make the initial call MERGE-SORT(A, 1, length[A]), where length[A] = n.
Merge sort – cont.
[Figure: the operation of merge sort on the array A = {1, 5, 2, 4, 6, 3, 2, 6}, merging sorted sub-arrays bottom-up into one sorted sequence.]
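A runnable Python sketch of this scheme (my translation of the textbook procedure, using 0-based slices instead of the 1-based indices above; the helper name merge is mine):

    def merge_sort(A):
        """Sort list A, returning a new sorted list (divide and conquer)."""
        if len(A) <= 1:                      # base case: already sorted
            return A
        q = len(A) // 2                      # divide around the midpoint
        left = merge_sort(A[:q])             # conquer the two halves
        right = merge_sort(A[q:])
        return merge(left, right)            # combine

    def merge(left, right):
        """Merge two sorted lists into one sorted list."""
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        out.extend(left[i:])                 # at most one of these is non-empty
        out.extend(right[j:])
        return out

    print(merge_sort([1, 5, 2, 4, 6, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 6]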
Quicksort: PARTITION
PARTITION(A, p, r)
  x ← A[r]
  i ← p − 1
  for j ← p to r − 1
    do if A[j] ≤ x
        then i ← i + 1
             exchange A[i] ↔ A[j]
  exchange A[i + 1] ↔ A[r]
  return i + 1
The figure below shows the operation of PARTITION on an 8-element array.
PARTITION always selects the element x = A[r] as a pivot element around which to partition the sub-array A[p..r].
Partitioning the array
[Figure: the operation of PARTITION on the array {2, 8, 7, 1, 3, 5, 6, 4} with pivot x = 4. Successive states as j scans from p to r − 1:
(a) 2 8 7 1 3 5 6 4
(b) 2 8 7 1 3 5 6 4
(c) 2 8 7 1 3 5 6 4
(d) 2 8 7 1 3 5 6 4
(e) 2 1 7 8 3 5 6 4
(f) 2 1 3 8 7 5 6 4
(g) 2 1 3 8 7 5 6 4
(h) 2 1 3 8 7 5 6 4
(i) 2 1 3 4 7 5 6 8
In (i) the final exchange places the pivot 4 between the two partitions.]
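A Python rendering of PARTITION, plus a small quicksort driver of mine to exercise it (0-based indexing, so the sketch mirrors rather than reproduces the 1-based pseudocode):

    def partition(A, p, r):
        """Partition A[p..r] in place around pivot x = A[r]; return pivot index."""
        x = A[r]                      # pivot
        i = p - 1                     # boundary of the "<= x" region
        for j in range(p, r):
            if A[j] <= x:
                i += 1
                A[i], A[j] = A[j], A[i]
        A[i + 1], A[r] = A[r], A[i + 1]
        return i + 1

    def quicksort(A, p=0, r=None):
        if r is None:
            r = len(A) - 1
        if p < r:
            q = partition(A, p, r)
            quicksort(A, p, q - 1)
            quicksort(A, q + 1, r)

    A = [2, 8, 7, 1, 3, 5, 6, 4]
    quicksort(A)
    print(A)  # [1, 2, 3, 4, 5, 6, 7, 8]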
• MERGE-SORT
  – With details:
      T(n) = Θ(1)                            if n = 1
      T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + Θ(n)      if n > 1
  – Ignoring details, T(n) = 2T(n/2) + Θ(n):
      T(n) = Θ(1)               if n = 1
      T(n) = 2T(n/2) + Θ(n)     if n > 1
Methods for solving recurrences
  Substitution method
  Recursion-tree method
  The master method
The Substitution Method
• Two steps:
  1. Guess the form of the solution.
     • By experience and creativity.
     • By some heuristics:
       – If a recurrence is similar to one you have seen before.
         » T(n) = 2T(n/2 + 17) + n is similar to T(n) = 2T(n/2) + n, so guess O(n lg n).
       – Prove loose upper and lower bounds on the recurrence and then reduce the range of uncertainty.
         » For T(n) = 2T(n/2) + n, prove the lower bound T(n) = Ω(n) and the upper bound T(n) = O(n²), then guess the tight bound T(n) = O(n lg n).
     • By recursion tree.
  2. Use mathematical induction to find the constants and show that the solution works.
Solve T(n) = 2T(⌊n/2⌋) + n
• Guess the solution: T(n) = O(n lg n),
  – i.e., T(n) ≤ cn lg n for some c.
• Prove the solution by induction:
  – Suppose this bound holds for ⌊n/2⌋, i.e.,
    • T(⌊n/2⌋) ≤ c⌊n/2⌋ lg(⌊n/2⌋).
  – T(n) ≤ 2(c⌊n/2⌋ lg(⌊n/2⌋)) + n
    • ≤ cn lg(n/2) + n
    • = cn lg n − cn lg 2 + n
    • = cn lg n − cn + n
    • ≤ cn lg n (as long as c ≥ 1)
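A quick numeric sanity check (my sketch; the base case T(1) = 1 is an assumption) that this recurrence indeed stays within a modest constant times n lg n:

    import math
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def T(n):
        """Evaluate T(n) = 2T(floor(n/2)) + n with T(1) = 1."""
        return 1 if n == 1 else 2 * T(n // 2) + n

    # The ratio T(n) / (n lg n) should stay bounded (here by c = 2).
    for n in (2, 16, 256, 4096, 65536):
        print(n, T(n), round(T(n) / (n * math.log2(n)), 3))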
Subtleties
• The guess is correct, but the induction proof does not work.
• The problem is that the inductive assumption is not strong enough.
• Solution: revise the guess by subtracting a lower-order term.
• Example: T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 1.
  – Guess T(n) = O(n), i.e., T(n) ≤ cn for some c.
  – However, T(n) ≤ c⌊n/2⌋ + c⌈n/2⌉ + 1 = cn + 1, which does not imply T(n) ≤ cn for any c.
  – Attempting T(n) = O(n²) will work, but is overkill.
  – The new guess T(n) ≤ cn − b will work as long as b ≥ 1 (proved exactly below).
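The exact inductive step the slide asks for, assuming the stronger bound holds for both ⌊n/2⌋ and ⌈n/2⌉:

\[
T(n) \le \left(c\lfloor n/2\rfloor - b\right) + \left(c\lceil n/2\rceil - b\right) + 1
       = cn - 2b + 1 \le cn - b \quad \text{as long as } b \ge 1.
\]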
Avoiding Pitfalls
• It is easy to guess T(n) = O(n) (i.e., T(n) ≤ cn) for T(n) = 2T(⌊n/2⌋) + n.
• And to wrongly "prove":
  – T(n) ≤ 2(c⌊n/2⌋) + n
    • ≤ cn + n
    • = O(n). Wrong!!
• The problem is that this does not prove the exact form of the inductive hypothesis, T(n) ≤ cn.
Floors, ceilings, lower-order terms – domain transformation
• Find the bound: T(n) = 2T(n/2) + n is O(n log n).
• How about T(n) = 2T(⌊n/2⌋) + n?
• How about T(n) = 2T(⌈n/2⌉) + n?
  – T(n) ≤ 2T(n/2 + 1) + n, since ⌈n/2⌉ ≤ n/2 + 1.
  – Domain transformation:
    • Set S(n) = T(n + a) and assume S(n) = 2S(n/2) + n (so S(n) = O(n log n)).
    • S(n) = 2S(n/2) + n means T(n + a) = 2T(n/2 + a) + n.
    • T(n) ≤ 2T(n/2 + 1) + n means T(n + a) ≤ 2T((n + a)/2 + 1) + n.
    • Thus, set n/2 + a = (n + a)/2 + 1 and get a = 2.
    • So T(n) = S(n − 2) = O((n − 2) log(n − 2)) = O(n log n).
• How about T(n) = 2T(⌊n/2⌋ + 19) + n?
  – Set S(n) = T(n + a) and get a = 38.
• As a result, ceilings, floors, and lower-order terms do not affect the asymptotic bound.
  – Moreover, the master theorem also provides a proof of this.
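A small Python check (my sketch, with T(1) = 1 assumed) that the floor and ceiling variants both track n log n just like the idealized recurrence:

    import math
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def T_floor(n):
        return 1 if n <= 1 else 2 * T_floor(n // 2) + n

    @lru_cache(maxsize=None)
    def T_ceil(n):
        return 1 if n <= 1 else 2 * T_ceil(-(-n // 2)) + n   # -(-n//2) = ceil(n/2)

    for n in (100, 1000, 10000):
        ref = n * math.log2(n)
        print(n, round(T_floor(n) / ref, 3), round(T_ceil(n) / ref, 3))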
Changing Variables
• Suppose T(n) = 2T(√n) + lg n.
• Rename m = lg n, so T(2^m) = 2T(2^(m/2)) + m.
• Domain transformation:
  – Set S(m) = T(2^m), so S(m) = 2S(m/2) + m.
  – This is similar to T(n) = 2T(n/2) + n.
  – So the solution is S(m) = O(m lg m).
  – Changing back from S(m) to T(n), the solution is T(n) = T(2^m) = S(m) = O(m lg m) = O(lg n lg lg n).
The Recursion-tree Method
• Idea:
  – Each node represents the cost of a single subproblem.
  – Sum the costs within each level to get the per-level cost.
  – Sum all the per-level costs to get the total cost.
• Particularly suitable for divide-and-conquer recurrences.
• Best used to generate a good guess, tolerating "sloppiness".
• If the recursion tree is drawn and the costs computed carefully, it can serve as a direct proof.
Recursion Tree for T(n) = 3T(n/4) + Θ(n²)
[Figure: the recursion tree. The root T(n) costs cn²; its 3 children T(n/4) each cost c(n/4)²; the 9 grandchildren each cost c(n/16)²; and so on. Level i contains 3^i nodes, each costing c(n/4^i)², for a per-level cost of (3/16)^i cn². There are log_4 n levels of internal nodes.]
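Summing the per-level costs of this tree gives the guess (a standard geometric-series step consistent with the figure; the Θ(n^(log_4 3)) term counts the leaves):

\[
T(n) = \sum_{i=0}^{\log_4 n - 1} \left(\tfrac{3}{16}\right)^{i} c n^2 + \Theta\!\left(n^{\log_4 3}\right)
     < \frac{1}{1 - 3/16}\, c n^2 + o(n^2)
     = \tfrac{16}{13}\, c n^2 + o(n^2)
     = O(n^2).
\]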
Prove the above Guess
• T(n) = 3T(n/4) + Θ(n²) = O(n²).
• Show T(n) ≤ dn² for some d.
• T(n) ≤ 3(d(n/4)²) + cn²
       = (3/16)dn² + cn²
       ≤ dn², as long as d ≥ (16/13)c.
One more example
• T(n) = T(n/3) + T(2n/3) + O(n).
• Construct its recursion tree (Figure 4.2, page 71 of the textbook).
• T(n) = O(cn·log_(3/2) n) = O(n lg n).
• Prove T(n) ≤ dn lg n (sketch below).
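A worked version of the requested substitution step, assuming the bound holds for both n/3 and 2n/3:

\[
\begin{aligned}
T(n) &\le d\tfrac{n}{3}\lg\tfrac{n}{3} + d\tfrac{2n}{3}\lg\tfrac{2n}{3} + cn \\
     &= dn\lg n - dn\left(\lg 3 - \tfrac{2}{3}\right) + cn \\
     &\le dn\lg n \quad \text{as long as } d \ge \frac{c}{\lg 3 - 2/3}.
\end{aligned}
\]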
[Figure: recursion tree of T(n) = T(n/3) + T(2n/3) + O(n); every level costs at most cn, and the longest root-to-leaf path has length log_(3/2) n.]
Master Method/Theorem
• Theorem 4.1 (page 73)
  – For T(n) = aT(n/b) + f(n), where n/b may mean ⌊n/b⌋ or ⌈n/b⌉,
  – where a ≥ 1 and b > 1 are constants and f(n) is a nonnegative function:
  1. If f(n) = O(n^(log_b a − ε)) for some ε > 0, then T(n) = Θ(n^(log_b a)).
  2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) lg n).
  3. If f(n) = Ω(n^(log_b a + ε)) for some ε > 0, and if af(n/b) ≤ cf(n) for some c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).
Implications of Master Theorem
• Compare f(n) and n^(log_b a) (<, =, >).
• f(n) must be asymptotically smaller (or larger) by a polynomial factor, i.e., by n^ε for some ε > 0.
• In case 3, the "regularity" condition must also be satisfied, i.e., af(n/b) ≤ cf(n) for some c < 1.
• There are gaps:
  – between cases 1 and 2: f(n) is smaller than n^(log_b a), but not polynomially smaller.
  – between cases 2 and 3: f(n) is larger than n^(log_b a), but not polynomially larger.
  – in case 3, if the "regularity" condition fails to hold.
Application of Master Theorem
• T(n) = 9T(n/3) + n:
  – a = 9, b = 3, f(n) = n
  – n^(log_b a) = n^(log_3 9) = Θ(n²)
  – f(n) = O(n^(log_3 9 − ε)) for ε = 1
  – By case 1, T(n) = Θ(n²).
• T(n) = T(2n/3) + 1:
  – a = 1, b = 3/2, f(n) = 1
  – n^(log_b a) = n^(log_(3/2) 1) = Θ(n⁰) = Θ(1)
  – By case 2, T(n) = Θ(lg n).
Application of Master Theorem
• T(n) = 3T(n/4) + n lg n:
  – a = 3, b = 4, f(n) = n lg n
  – n^(log_b a) = n^(log_4 3) = Θ(n^0.793)
  – f(n) = Ω(n^(log_4 3 + ε)) for ε ≈ 0.2
  – Moreover, for large n, the "regularity" condition holds for c = 3/4:
    • af(n/b) = 3(n/4) lg(n/4) ≤ (3/4) n lg n = cf(n)
  – By case 3, T(n) = Θ(f(n)) = Θ(n lg n).
Exception to Master Theorem
• T(n) = 2T(n/2) + n lg n:
  – a = 2, b = 2, f(n) = n lg n
  – n^(log_b a) = n^(log_2 2) = Θ(n)
  – f(n) is asymptotically larger than n^(log_b a), but not polynomially larger, because
  – f(n)/n^(log_b a) = lg n, which is asymptotically smaller than n^ε for any ε > 0.
  – Therefore, this recurrence falls into the gap between cases 2 and 3.
Where Are the Gaps
[Diagram: f(n) compared against n^(log_b a).
  f(n) in case 3: at least polynomially larger, i.e., by a factor n^ε.
  (Gap between cases 3 and 2.)
  f(n) in case 2: within constant factors c1, c2 of n^(log_b a).
  (Gap between cases 1 and 2.)
  f(n) in case 1: at least polynomially smaller, i.e., by a factor n^ε.]
• Proof of the master theorem:
  – By iterating the recurrence.
  – By recursion tree (see Figure 4.3).
[Figure: recursion tree for T(n) = aT(n/b) + f(n); level j has a^j nodes, each costing f(n/b^j), and there are Θ(n^(log_b a)) leaves.]
Proof of Master Theorem (cont.)
• Lemma 4.3:
  – Let a ≥ 1, b > 1, and let f(n) be a nonnegative function defined on exact powers of b. Then
      g(n) = Σ_{j=0}^{log_b n − 1} a^j f(n/b^j)
    can be bounded for exact powers of b as:
  1. If f(n) = O(n^(log_b a − ε)) for some ε > 0, then g(n) = O(n^(log_b a)).
  2. If f(n) = Θ(n^(log_b a)), then g(n) = Θ(n^(log_b a) lg n).
  3. If f(n) = Ω(n^(log_b a + ε)) for some ε > 0, and if af(n/b) ≤ cf(n) for some c < 1 and all sufficiently large n ≥ b, then g(n) = Θ(f(n)).
Proof of Lemma 4.3
• For case 1: f(n) = O(n^(log_b a − ε)) implies f(n/b^j) = O((n/b^j)^(log_b a − ε)), so
    g(n) = Σ_{j=0}^{log_b n − 1} a^j f(n/b^j)
         = O( Σ_{j=0}^{log_b n − 1} a^j (n/b^j)^(log_b a − ε) )
         = O( n^(log_b a − ε) Σ_{j=0}^{log_b n − 1} a^j / (b^(log_b a − ε))^j )
         = O( n^(log_b a − ε) Σ_{j=0}^{log_b n − 1} a^j / (a^j b^(−εj)) )
         = O( n^(log_b a − ε) Σ_{j=0}^{log_b n − 1} (b^ε)^j )
         = O( n^(log_b a − ε) · ((b^ε)^(log_b n) − 1) / (b^ε − 1) )
         = O( n^(log_b a − ε) · (n^ε − 1) / (b^ε − 1) )
         = O( n^(log_b a) ).
Proof of Lemma 4.3 (cont.)
• For case 2: f(n) = Θ(n^(log_b a)) implies f(n/b^j) = Θ((n/b^j)^(log_b a)), so
    g(n) = Σ_{j=0}^{log_b n − 1} a^j f(n/b^j)
         = Θ( Σ_{j=0}^{log_b n − 1} a^j (n/b^j)^(log_b a) )
         = Θ( n^(log_b a) Σ_{j=0}^{log_b n − 1} a^j / (b^(log_b a))^j )
         = Θ( n^(log_b a) Σ_{j=0}^{log_b n − 1} 1 )
         = Θ( n^(log_b a) log_b n ) = Θ( n^(log_b a) lg n ).
Proof of Lemma 4.3 (cont.)
• For case 3:
  – Since g(n) contains the term f(n) (the j = 0 term), g(n) = Ω(f(n)).
  – Since af(n/b) ≤ cf(n), iterating the regularity condition j times gives a^j f(n/b^j) ≤ c^j f(n).
  – g(n) = Σ_{j=0}^{log_b n − 1} a^j f(n/b^j) ≤ Σ_{j=0}^{log_b n − 1} c^j f(n) ≤ f(n) Σ_{j=0}^{∞} c^j
         = f(n) · (1/(1 − c)) = O(f(n)).
  – Thus, g(n) = Θ(f(n)).
Proof of Master Theorem (cont.)
• Lemma 4.4:
  – For T(n) = Θ(1)             if n = 1
        T(n) = aT(n/b) + f(n)   if n = b^k for k ≥ 1
  – where a ≥ 1, b > 1, and f(n) is a nonnegative function:
  1. If f(n) = O(n^(log_b a − ε)) for some ε > 0, then T(n) = Θ(n^(log_b a)).
  2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) lg n).
  3. If f(n) = Ω(n^(log_b a + ε)) for some ε > 0, and if af(n/b) ≤ cf(n) for some c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).
Proof of Lemma 4.4 (cont.)
• Combining Lemmas 4.2 and 4.3:
  – For case 1:
    • T(n) = Θ(n^(log_b a)) + O(n^(log_b a)) = Θ(n^(log_b a)).
  – For case 2:
    • T(n) = Θ(n^(log_b a)) + Θ(n^(log_b a) lg n) = Θ(n^(log_b a) lg n).
  – For case 3:
    • T(n) = Θ(n^(log_b a)) + Θ(f(n)) = Θ(f(n)), because f(n) = Ω(n^(log_b a + ε)).
Floors and Ceilings
• T(n) = aT(⌈n/b⌉) + f(n) and T(n) = aT(⌊n/b⌋) + f(n).
• We want to prove that both have the same solution as T(n) = aT(n/b) + f(n).
• Two results:
  – The master theorem applies to all integers n.
  – Floors and ceilings do not change the result.
    • (Note: we proved this by domain transformation too.)
• Since ⌊n/b⌋ ≤ n/b and ⌈n/b⌉ ≥ n/b, the upper bound for floors and the lower bound for ceilings hold immediately.
• So it remains to prove the upper bound for ceilings (the lower bound for floors is similar).
Upper bound proof for T(n) = aT(⌈n/b⌉) + f(n)
• Consider the sequence n, ⌈n/b⌉, ⌈⌈n/b⌉/b⌉, ⌈⌈⌈n/b⌉/b⌉/b⌉, …
• Let us define n_j as follows:
    n_j = n               if j = 0
    n_j = ⌈n_(j−1)/b⌉     if j > 0
• The sequence is n_0, n_1, …, n_(log_b n).
• Draw the recursion tree:
[Figure: recursion tree of T(n) = aT(⌈n/b⌉) + f(n); level j has a^j nodes, each costing f(n_j).]
The proof of the upper bound for ceilings
    T(n) = Θ(n^(log_b a)) + Σ_{j=0}^{⌊log_b n⌋ − 1} a^j f(n_j)
The simple format of the master theorem
For T(n) = aT(n/b) + Θ(n^k), with constants a ≥ 1, b > 1, k ≥ 0:
  – If a > b^k, then T(n) = Θ(n^(log_b a)).
  – If a = b^k, then T(n) = Θ(n^k lg n).
  – If a < b^k, then T(n) = Θ(n^k).
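A small Python helper (my sketch, implementing exactly the simple n^k format above, not the full theorem; integer a, b, k are assumed so the comparisons are exact):

    import math

    def master_simple(a, b, k):
        """Solve T(n) = a*T(n/b) + Theta(n^k) via the simple master format.

        Returns a string describing Theta(T(n)). Assumes a >= 1, b > 1, k >= 0.
        """
        if a > b ** k:
            return f"Theta(n^{math.log(a, b):.3f})"      # n^(log_b a)
        if a == b ** k:
            return f"Theta(n^{k} lg n)"
        return f"Theta(n^{k})"

    print(master_simple(9, 3, 1))   # a=9, b=3, f(n)=n      -> Theta(n^2.000)
    print(master_simple(2, 2, 1))   # merge sort            -> Theta(n^1 lg n)
    print(master_simple(3, 4, 2))   # T(n)=3T(n/4)+n^2      -> Theta(n^2)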
Summary
Recurrences and their bounds:
  – Substitution
  – Recursion tree
  – Master theorem
  – Proofs of the subtleties
  – Recurrences to which the master theorem does not apply
Design and Analysis of Algorithms
Heap sort
Lecture 7
References:
Introduction to Algorithms, Thomas H. Cormen et al., 2nd edition.
The (binary) heap data structure is an array object that can be viewed as a nearly complete binary tree. Each node of the tree corresponds to an element of the array that stores the value in the node. The tree is completely filled on all levels except possibly the lowest, which is filled from the left up to a point. An array A that represents a heap is an object with two attributes: length[A], which is the number of elements in the array, and heap-size[A], the number of elements in the heap stored within array A.
The root of the tree is A[1], and given the index i of a node, the indices of its parent PARENT(i), left child LEFT(i), and right child RIGHT(i) can be computed simply:
    PARENT(i) = ⌊i/2⌋
    LEFT(i) = 2i
    RIGHT(i) = 2i + 1
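In 0-based Python arrays the same arithmetic shifts by one; a small sketch of mine:

    def parent(i):   # 0-based analogue of PARENT(i) = floor(i/2)
        return (i - 1) // 2

    def left(i):     # 0-based analogue of LEFT(i) = 2i
        return 2 * i + 1

    def right(i):    # 0-based analogue of RIGHT(i) = 2i + 1
        return 2 * i + 2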
Heap types
There are two types of binary heaps:
Max-heap: in a max-heap, the max-heap property is that for every node i other than the root,
    A[PARENT(i)] ≥ A[i].
Min-heap: in a min-heap, the min-heap property is that for every node i other than the root,
    A[PARENT(i)] ≤ A[i].
For the heap sort algorithm we use a max-heap.
Viewing a heap as a tree, we define the height of a node in a heap to be the length of the longest simple downward path from the node to a leaf, and we define the height of the tree as the height of its root. Since a heap of n elements is based on a complete binary tree, its height is Θ(lg n).
Basic procedures
The MAX-HEAPIFY procedure, which runs in O(lg n) time, is the key to maintaining the max-heap property.
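The procedure itself is not reproduced on these slides; here is a 0-based Python rendering of the textbook's MAX-HEAPIFY (my translation, so indices differ from the 1-based pseudocode convention):

    def max_heapify(A, i, heap_size):
        """Float A[i] down so the subtree rooted at i obeys the max-heap property.

        Assumes the subtrees rooted at the children of i are already max-heaps.
        """
        l, r = 2 * i + 1, 2 * i + 2       # left(i), right(i) in 0-based indexing
        largest = i
        if l < heap_size and A[l] > A[largest]:
            largest = l
        if r < heap_size and A[r] > A[largest]:
            largest = r
        if largest != i:
            A[i], A[largest] = A[largest], A[i]   # exchange and recurse downward
            max_heapify(A, largest, heap_size)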
This loop-invariant argument shows that BUILD-MAX-HEAP(A), whose for loop runs i from ⌊n/2⌋ down to 1 and calls MAX-HEAPIFY(A, i), is correct. The invariant: at the start of each iteration, each node i + 1, i + 2, . . . , n is the root of a max-heap.
Initialization: Prior to the first iteration of the loop, i = ⌊n/2⌋. Each node ⌊n/2⌋ + 1, ⌊n/2⌋ + 2, . . . , n is a leaf and is thus the root of a trivial max-heap.
Maintenance: To see that each iteration maintains the loop invariant, observe that the children of node i are numbered higher than i. By the loop invariant, therefore, they are both roots of max-heaps. This is precisely the condition required for the call MAX-HEAPIFY(A, i) to make node i a max-heap root. Moreover, the MAX-HEAPIFY call preserves the property that nodes i + 1, i + 2, . . . , n are all roots of max-heaps. Decrementing i in the for loop update reestablishes the loop invariant for the next iteration.
The running time of BUILD-MAX-HEAP is O(n Σ_{h=0}^{⌊lg n⌋} h/2^h). The last summation can be evaluated by substituting x = 1/2 in formula (A.8), Σ_{h=0}^{∞} h·x^h = x/(1 − x)², which yields Σ_{h=0}^{∞} h/2^h = (1/2)/(1/2)² = 2, so BUILD-MAX-HEAP runs in O(n) time.
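To round off the lecture, a Python sketch of BUILD-MAX-HEAP and HEAPSORT built on the max_heapify above (my 0-based rendering of the textbook procedures):

    def build_max_heap(A):
        """Turn an arbitrary list into a max-heap in O(n) time."""
        for i in range(len(A) // 2 - 1, -1, -1):   # from floor(n/2)-1 down to 0
            max_heapify(A, i, len(A))

    def heapsort(A):
        """Sort A in place in O(n lg n) time."""
        build_max_heap(A)
        for end in range(len(A) - 1, 0, -1):
            A[0], A[end] = A[end], A[0]            # move current max to the end
            max_heapify(A, 0, end)                 # restore heap on the prefix

    A = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
    heapsort(A)
    print(A)  # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]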