Algorithms: Greedy Algorithms

Amotz Bar-Noy

CUNY

Spring 2012

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 1 / 62


Greedy Algorithms

Greedy algorithms make decisions that “seem” to be the best
following some greedy criterion.

In Off-Line problems:
The whole input is known in advance.
Possible to do some preprocessing of the input.
Decisions are irrevocable.

In Real-Time and On-Line problems:
The present cannot change the past.
The present cannot rely on the unknown future.



How and When to use Greedy Algorithms?

Initial solution: Establish trivial solutions for a problem of small
size, usually $n = 0$ or $n = 1$.

Top-down procedure: For a problem of size $n$, look for a greedy
decision that reduces the size of the problem to some $k < n$, and
then apply recursion.

Bottom-up procedure: Construct the solution for a problem of
size $n$ based on some greedy criterion applied to the solutions of
the problems of size $k = 1, \ldots, n - 1$.



The Coin Changing Problem

Input:
Integer coin denominations $d_n > \cdots > d_2 > d_1 = 1$.
An integer amount to pay: $A$.

Output: The number of coins $n_i$ for each denomination $d_i$ that gives the
exact amount:
$A = n_n d_n + n_{n-1} d_{n-1} + \cdots + n_2 d_2 + n_1 d_1$.

Goal: Minimize the total number of coins:
$N = n_n + \cdots + n_2 + n_1$.

Remark: There is always a solution with $N = A$ since $d_1 = 1$.



Examples

USA: $d_6 = 100$, $d_5 = 50$, $d_4 = 25$, $d_3 = 10$, $d_2 = 5$, $d_1 = 1$.
$A = 73 = 1 \cdot 50 + 2 \cdot 10 + 3 \cdot 1$.
$N = 1 + 2 + 3 = 6$.

Old British: $d_3 = 240$, $d_2 = 20$, $d_1 = 1$.
$A = 307 = 1 \cdot 240 + 3 \cdot 20 + 7 \cdot 1$.
$N = 1 + 3 + 7 = 11$.



Greedy Solution

Idea: Use the largest possible denomination and update $A$.

Implementation:
Coin-Changing(d_n > ... > d_2 > d_1 = 1, A)
    for i = n downto 1
        n_i = floor(A / d_i)
        A = A mod d_i      (* equivalently, A = A - n_i * d_i *)
    Return(N = n_n + ... + n_2 + n_1)

Correctness: $A = n_n d_n + n_{n-1} d_{n-1} + \cdots + n_2 d_2 + n_1 d_1$.

Complexity: Θ(n) integer division and mod operations.
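
The greedy procedure above can be sketched in Python as follows. This is a minimal illustration, not the author's code; the function name `greedy_coin_change` and the dictionary return format are assumptions made for the example.

```python
def greedy_coin_change(denominations, amount):
    """Return (counts, total) using the largest-denomination-first rule.

    Assumes `denominations` contains 1, so every amount can be paid exactly.
    """
    counts = {}
    remaining = amount
    for d in sorted(denominations, reverse=True):
        # n_i = floor(A / d_i), then A = A mod d_i
        counts[d], remaining = divmod(remaining, d)
    return counts, sum(counts.values())


print(greedy_coin_change([100, 50, 25, 10, 5, 1], 73)[1])  # prints 6
print(greedy_coin_change([4, 3, 1], 6)[1])                 # prints 3
```

The second call uses a coin system in which the greedy answer (3 coins) is not the minimum (2 coins).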



Optimality

Greedy is optimal for the USA system.

A coin system for which Greedy is not optimal:
$d_3 = 4$, $d_2 = 3$, $d_1 = 1$ and $A = 6$:
Greedy: $6 = 1 \cdot 4 + 2 \cdot 1 \Rightarrow N = 3$.
Optimal: $6 = 2 \cdot 3 \Rightarrow N = 2$.

A coin system for which Greedy is very “bad”:
$d_3 = x + 1$, $d_2 = x$, $d_1 = 1$ and $A = 2x$:
Greedy: $2x = 1 \cdot (x + 1) + (x - 1) \cdot 1 \Rightarrow N = x$.
Optimal: $2x = 2 \cdot x \Rightarrow N = 2$.



Efficiency

Optimal solution: Check all possible combinations.
Not a polynomial time algorithm.

Another optimal solution: Polynomial in both $n$ and $A$.
Not a strongly polynomial time algorithm.

Objective:
Find a solution that is polynomial only in $n$.
Probably impossible!?
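
The slides do not say which algorithm is polynomial in both $n$ and $A$. One standard candidate is dynamic programming over all amounts up to $A$, sketched below under that assumption; the name `min_coins` is hypothetical.

```python
def min_coins(denominations, amount):
    """Minimum number of coins that sum to `amount`, in O(n * amount) time.

    Assumes 1 is among the denominations, so a solution always exists.
    """
    INF = float("inf")
    best = [0] + [INF] * amount  # best[a] = fewest coins that sum to a
    for a in range(1, amount + 1):
        for d in denominations:
            if d <= a and best[a - d] + 1 < best[a]:
                best[a] = best[a - d] + 1
    return best[amount]


print(min_coins([4, 3, 1], 6))  # prints 2
```

The running time grows with the value of $A$ rather than with the number of bits needed to write $A$, which is why such an algorithm is not strongly polynomial.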



The Knapsack Problem

Input:
A thief enters a store and finds $n$ items $I_1, \ldots, I_n$.
The value of item $I_i$ is $v(I_i)$ and its weight is $w(I_i)$.
Both are positive integers.
The thief can carry at most weight $W$.
The thief either takes all of item $I_i$ or doesn't take item $I_i$.

Goal: Carry items with maximum total value.
Which are these items?
What is their total value?



A General Greedy Scheme

Order the items according to some greedy criterion.
Assume this order is $J_1, J_2, \ldots, J_n$.
Assume $J_1$ is the most desired item and $J_n$ is the least desired item.

If $J_1$ is not too heavy ($w(J_1) \le W$):
Take item $J_1$.
Continue recursively with $J_2, J_3, \ldots, J_n$ and updated maximum
weight $W - w(J_1)$.

If $J_1$ is too heavy ($w(J_1) > W$):
Ignore item $J_1$.
Continue recursively with $J_2, J_3, \ldots, J_n$ and the same maximum
weight $W$.



A General Greedy Scheme – Implementation

Non-Recursive Knapsack(I_1, ..., I_n, w(·), v(·), W)
    Let J_1, ..., J_n be the new order on the items.
    S = ∅ (* the set of items the thief takes *)
    V = 0 (* the value of these items *)
    for i = 1 to n
        if w(J_i) ≤ W then
            S = S ∪ {J_i}
            V = V + v(J_i)
            W = W − w(J_i)
    Return(S, V)



Greedy Criteria

Greedy criterion I: Order the items by their value from the most
expensive to the cheapest.

Greedy criterion II: Order the items by their weight from the
lightest to the heaviest.

Greedy criterion III: Order the items by their ratio of value over
weight from the largest ratio to the smallest ratio.



The three criteria are not optimal

Counterexample for Greedy-by-Value and Greedy-by-Ratio:
3 items and maximum weight $W = 10$. The (weight, value) pairs are:
$I_1 = \langle 6, 10 \rangle$, $I_2 = \langle 5, 6 \rangle$, and $I_3 = \langle 5, 6 \rangle$.
Optimal takes items $I_2$ and $I_3$ for a profit of 12.
Greedy-by-Value and Greedy-by-Ratio take only item $I_1$ for a profit
of 10.

Counterexample for Greedy-by-Weight:
3 items and maximum weight $W = 10$. The (weight, value) pairs are:
$I_1 = \langle 6, 13 \rangle$, $I_2 = \langle 5, 6 \rangle$, and $I_3 = \langle 5, 6 \rangle$.
Optimal takes only item $I_1$ for a profit of 13.
Greedy-by-Weight takes items $I_2$ and $I_3$ for a profit of 12.
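
The general greedy scheme and the three criteria can be tried on these counterexamples with a short Python sketch. The names `greedy_knapsack`, `by_value`, `by_weight`, and `by_ratio` are illustrative assumptions, and items are represented as (weight, value) tuples.

```python
def greedy_knapsack(items, capacity, key):
    """Scan items in the order given by `key` and take each one that still fits.

    items: list of (weight, value) pairs; key: sort key, most desired item first.
    """
    chosen, total_value = [], 0
    for weight, value in sorted(items, key=key):
        if weight <= capacity:
            chosen.append((weight, value))
            total_value += value
            capacity -= weight
    return chosen, total_value


def by_value(item):   # criterion I: most expensive first
    return -item[1]

def by_weight(item):  # criterion II: lightest first
    return item[0]

def by_ratio(item):   # criterion III: largest value/weight ratio first
    return -item[1] / item[0]


first = [(6, 10), (5, 6), (5, 6)]
second = [(6, 13), (5, 6), (5, 6)]
print(greedy_knapsack(first, 10, by_value)[1])    # prints 10 (optimal is 12)
print(greedy_knapsack(first, 10, by_ratio)[1])    # prints 10 (optimal is 12)
print(greedy_knapsack(second, 10, by_weight)[1])  # prints 12 (optimal is 13)
```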



Very Bad Counter Examples for Criteria I and II

Counterexample for Greedy-by-Value:
$n$ items and maximum weight $W \ge n - 1$. The (weight, value) pairs are:
$I_1 = \langle W, 2 \rangle$, $I_2 = \langle 1, 1 \rangle$, \ldots, $I_n = \langle 1, 1 \rangle$.
Optimal takes items $I_2, \ldots, I_n$ for a profit of $n - 1$.
Greedy-by-Value takes only item $I_1$ for a profit of 2.
The ratio is $(n - 1)/2$.

Counterexample for Greedy-by-Weight:
2 items and maximum weight 2. The (weight, value) pairs are:
$I_1 = \langle 1, 1 \rangle$ and $I_2 = \langle 2, x \rangle$ for a very large $x$.
Optimal takes item $I_2$ for a profit of $x$.
Greedy-by-Weight takes item $I_1$ for a profit of 1.
The ratio is $x$.


A Very Bad Counter Example for Criterion III

Counterexample for Greedy-by-Ratio:
2 items and maximum weight $W$. The (weight, value) pairs are:
$I_1 = \langle 1, 2 \rangle$ and $I_2 = \langle W, W \rangle$.
Optimal takes item $I_2$ for a profit of $W$.
Greedy-by-Ratio takes item $I_1$ for a profit of 2.
The ratio is $W/2$.

A 1/2 guaranteed approximation algorithm:
With a small tweak, Greedy-by-Ratio guarantees half of the profit of
Optimal.
Select the better of the output of Greedy-by-Ratio and the single item of
maximum value whose weight is at most $W$.
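
A minimal sketch of the tweaked algorithm, assuming items are (weight, value) pairs with positive weights. The name `half_approx_knapsack` is hypothetical, and the function returns only the achieved profit.

```python
def half_approx_knapsack(items, capacity):
    """Return the better of Greedy-by-Ratio and the best single item that fits."""
    # Greedy-by-Ratio pass: largest value/weight ratio first.
    greedy_value, remaining = 0, capacity
    for weight, value in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if weight <= remaining:
            greedy_value += value
            remaining -= weight
    # Most valuable single item whose weight is at most the capacity.
    best_single = max((v for w, v in items if w <= capacity), default=0)
    return max(greedy_value, best_single)


# The bad instance for Greedy-by-Ratio with W = 100.
print(half_approx_knapsack([(1, 2), (100, 100)], 100))  # prints 100
```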



The Fractional Knapsack Problem

The thief can take portions of items.

If the thief takes a fraction $0 \le p_i \le 1$ of item $I_i$:
Its value is $p_i \, v(I_i)$.
Its weight is $p_i \, w(I_i)$.

Theorem: Greedy-by-Ratio is optimal.



Proof

Assume that Greedy-by-Ratio fails on the input $I_1, \ldots, I_n$ and the
weight $W$.
Let the portions taken by Optimal be $p_1, \ldots, p_n$.
$p_i = 1$: all of item $I_i$ is taken.
$p_i = 0$: none of item $I_i$ is taken.
$0 < p_i < 1$: some but not all of item $I_i$ is taken.
Since Greedy-by-Ratio fails, there exist $I_i$ and $I_j$ such that:
$\frac{v(I_i)}{w(I_i)} > \frac{v(I_j)}{w(I_j)}$ and $p_i < 1$ and $p_j > 0$.
Because each unit of weight of item $I_i$ has more value than each
unit of weight of item $I_j$, it is more profitable to take more of item $I_i$
and less of item $I_j$.
A contradiction to the optimality of Optimal.
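
For the fractional problem, Greedy-by-Ratio can be sketched as below. The function name `fractional_knapsack` and the use of exact `Fraction` arithmetic are choices made for this illustration only.

```python
from fractions import Fraction


def fractional_knapsack(items, capacity):
    """Return (portions, total_value); portions[i] is the fraction taken of items[i].

    items: list of (weight, value) pairs with positive integer weights.
    """
    # Indices sorted by value/weight ratio, largest first.
    order = sorted(range(len(items)),
                   key=lambda i: Fraction(items[i][1], items[i][0]),
                   reverse=True)
    portions = [Fraction(0)] * len(items)
    total_value = Fraction(0)
    remaining = Fraction(capacity)
    for i in order:
        if remaining == 0:
            break
        weight, value = items[i]
        p = min(Fraction(1), remaining / weight)  # take as much as still fits
        portions[i] = p
        total_value += p * value
        remaining -= p * weight
    return portions, total_value


portions, total = fractional_knapsack([(6, 10), (5, 6), (5, 6)], 10)
print(total)  # prints 74/5
```

On the instance where the 0-1 greedy earned only 10, the fractional greedy takes all of the first item and 4/5 of the second, for a profit of 14.8.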



The 0-1 Knapsack Problem

Optimal solution: Check all possible sets of items.
Not a polynomial time algorithm.

Another optimal solution: Polynomial in both $n$ and $W$.
Not a strongly polynomial time algorithm.

Objective:
Find a solution that is polynomial only in $n$.
Probably impossible!?
However, Greedy-by-Ratio produces “good” solutions.
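
The slides do not name the algorithm that is polynomial in both $n$ and $W$. A common choice is the dynamic program over capacities sketched here, offered as an assumption; `knapsack_01` is a hypothetical name.

```python
def knapsack_01(items, capacity):
    """Maximum total value in O(n * capacity) time.

    items: list of (weight, value) pairs with positive integer weights.
    """
    best = [0] * (capacity + 1)  # best[c] = max value with total weight at most c
    for weight, value in items:
        # Scan capacities downwards so that each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


print(knapsack_01([(6, 10), (5, 6), (5, 6)], 10))  # prints 12
print(knapsack_01([(6, 13), (5, 6), (5, 6)], 10))  # prints 13
```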



The Activity-Selection Problem

Input:
Activities $A_1, \ldots, A_n$ that need the service of a common resource.
Activity $A_i$ is associated with a time interval $[s_i, f_i)$ for $s_i < f_i$.
$A_i$ needs the service from time $s_i$ until just before time $f_i$.

Mutual Exclusion: The resource serves at most one activity at
any time.

Definition: $A_i$ and $A_j$ are compatible if either $f_i \le s_j$ or $f_j \le s_i$.

Goal: Find a maximum size set of compatible activities.



Example

Input: 3 activities $A_1 = [1, 4)$, $A_2 = [3, 6)$, $A_3 = [5, 8)$.

A graphical representation:
[Figure: activities $A_1$, $A_2$, $A_3$ drawn as intervals on a time axis from 1 to 8.]

The best solution:
[Figure: activities $A_1$ and $A_3$ on the same time axis.]



Static vs. Dynamic Greedy

Static: The greedy criterion is determined in advance and cannot
be changed during the execution of the algorithm.

Dynamic: The greedy criterion may be modified during the
execution of the algorithm based on prior decisions.

Remark: A static criterion is also a dynamic criterion.



A General Static Greedy Scheme

Maintain a set $S$ of the activities that have been selected so far.

Initially, $S = \emptyset$ and at the end, $S$ is an optimal solution.

Order the activities following some greedy criterion and consider
the activities according to this order.

Let $A$ be the current considered activity. If $A$ is compatible with all
the activities in $S$:
Then add $A$ to $S$.
Else ignore $A$.

Continue until there are no activities to consider.



A General Dynamic Greedy Scheme

Maintain two sets of activities:
$S$: those that have been selected so far.
$R$: those that can still be selected.
Initially, $S = \emptyset$ and $R = \{A_1, \ldots, A_n\}$.
At the end, $S$ is an optimal solution and $R = \emptyset$.

Select a “good” activity $A$ from $R$, following some greedy criterion.

Add $A$ to $S$.

Delete from $R$ the activities that are not compatible with activity $A$.

Continue until $R$ is empty.



Greedy Criteria

Four criteria:
Prefer short activities.
Prefer activities intersecting few other activities.
Prefer activities that start earlier.
Prefer activities that terminate earlier.

Optimality: Only the fourth criterion is optimal.

Remarks:
All four criteria are static in their nature.
The second criterion has a dynamic version.



An Optimal Greedy Solution

Preprocessing(A_1, ..., A_n)
    Sort the activities according to their finish time
    Let this order be A_1, ..., A_n (* i < j ⇒ f_i ≤ f_j *)

Greedy-Activity-Selector(A_1, ..., A_n)
    S = {A_1} (* A_1 terminates the earliest *)
    j = 1 (* A_j is the most recently selected activity *)
    for i = 2 to n (* scan all the activities *)
        if s_i ≥ f_j (* check compatibility *)
            then (* select A_i, which is compatible with S *)
                S = S ∪ {A_i}
                j = i
            else (* A_i is not compatible; skip it *)
    Return(S)
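
The preprocessing and selection steps above can be combined in a few lines of Python. This is a sketch under the assumption that activities are given as (start, finish) pairs for half-open intervals; `select_activities` is a hypothetical name.

```python
def select_activities(activities):
    """Earliest-finish-time greedy for half-open intervals [start, finish)."""
    selected = []
    last_finish = float("-inf")  # finish time of the most recently selected activity
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:  # compatible with everything selected so far
            selected.append((start, finish))
            last_finish = finish
    return selected


print(select_activities([(1, 4), (3, 6), (5, 8)]))  # prints [(1, 4), (5, 8)]
```

Checking only the most recently selected activity is enough because the activities are scanned in order of finish time.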



Correctness and Complexity

Correctness: By definition.

Complexity:
The sorting can be done in O(n log n) time.
There are O(1) operations per activity.
All together: O(n log n) + n · O(1) = O(n log n) time.



Example - Input

[Figure: eleven activities $A_1, \ldots, A_{11}$ drawn as intervals on a time axis from 1 to 15.]



Example - Output

[Figure: the selected activities $A_1$, $A_4$, $A_8$, and $A_{11}$ on a time axis from 1 to 15.]



Optimality

Let $T$ be an optimal set of activities.

Transform $T$ to $S$ preserving the size of $T$.

Let $A_1, \ldots, A_n$ be ordered by their finish time.

Let $A_i$ be the first activity that is in $T$ and not in $S$.

All the activities in $T$ that finish before $A_i$ are also in $S$.



Optimality

$A_i \notin S \Rightarrow \exists A_j \in S$ that is not in $T$, with $j < i$.

$A_j$ is compatible with all the activities in $T$ that finish before it since
they are all in $S$.

$A_j$ is compatible with all the activities in $T$ that finish after $A_i$ since
it finishes before $A_i$.

Therefore, $(T \cup \{A_j\}) \setminus \{A_i\}$ is a solution with the same size as $T$
and hence optimal.

Continue this way until $T$ becomes $S$.



Example

[Figure: the eleven activities $A_1, \ldots, A_{11}$ on a time axis from 1 to 15.]

Another optimal solution with 4 activities.



Example

[Figure: the eleven activities $A_1, \ldots, A_{11}$ on a time axis from 1 to 15.]

A third optimal solution: after the first transformation.



Example

[Figure: the eleven activities $A_1, \ldots, A_{11}$ on a time axis from 1 to 15.]

The greedy solution: after the second transformation.



Huffman Codes

Input:
An alphabet of $n$ symbols $a_1, \ldots, a_n$.
A frequency $f_i$ for each symbol $a_i$, with $\sum_{i=1}^{n} f_i = 1$.
A file $F$ containing $L$ symbols from the alphabet.
$a_i$ appears exactly $n_i = f_i \cdot L$ times in $F$.

Output:
For each symbol $a_i$, $1 \le i \le n$: a binary codeword $w_i$ of length $\ell_i$.
A compressed (encoded) binary file $F'$ of $F$.



Huffman Codes – Goals

$L'$, the length of $F'$, should be minimal.

An efficient algorithm to find the $n$ codewords.
Good polynomial running time: O(n log n).

Efficient encoding and decoding of the file.
Should be done in O(B) time.
B is the size of the original file in bits.



Example

A file with the alphabet a, b, c, d, e, f containing 100 symbols.
$n_a = 45$, $n_b = 13$, $n_c = 12$, $n_d = 16$, $n_e = 9$, $n_f = 5$.

Code I:
$w_a = 000$, $w_b = 001$, $w_c = 010$, $w_d = 011$, $w_e = 100$, $w_f = 101$.
Length of encoded file is 300.

Code II:
$w_a = 0$, $w_b = 101$, $w_c = 100$, $w_d = 111$, $w_e = 1101$, $w_f = 1100$.
Length of encoded file is 224:
$1 \cdot 45 + 3 \cdot 13 + 3 \cdot 12 + 3 \cdot 16 + 4 \cdot 9 + 4 \cdot 5 = 224$.

Remark: Code II is optimal, ≈ 25% better than code I.



Prefix Free Codes

Definition: A prefix free code is a code in which no codeword is a
prefix of another codeword.

Examples: Both code I and code II are prefix free.

Proposition: A code in which all the codewords have the same length
is a prefix free code.

Theorem: There always exists an optimal code that is prefix free.

Encoding: “Easy” using tables.

Decoding: By scanning the coded text once.
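
Table-based encoding and single-scan decoding might look like the following sketch. The helper names `encode` and `decode` and the dictionary representation of the code are assumptions; the codewords are those of code II.

```python
def encode(text, code):
    """Concatenate the codeword of every symbol (table lookup)."""
    return "".join(code[symbol] for symbol in text)


def decode(bits, code):
    """Scan the bits once; emit a symbol whenever the buffer equals a codeword.

    This is unambiguous only because the code is prefix free.
    """
    inverse = {word: symbol for symbol, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


code_ii = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}
print(encode("abc", code_ii))      # prints 0101100
print(decode("0101100", code_ii))  # prints abc
```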



Binary Tree Representation for Prefix Free Codes

A code can be represented by a rooted and ordered binary tree
with $n$ leaves.

Each leaf stores a codeword.

The codeword corresponding to a leaf is defined by the unique
path from the root to the leaf:
0 for going left.
1 for going right.



Example: Code II

100
├─0─ a:45                  (codeword 0)
└─1─ 55
     ├─0─ 25
     │    ├─0─ c:12        (codeword 100)
     │    └─1─ b:13        (codeword 101)
     └─1─ 30
          ├─0─ 14
          │    ├─0─ f:5    (codeword 1100)
          │    └─1─ e:9    (codeword 1101)
          └─1─ d:16        (codeword 111)

A leaf is represented by the symbol and its frequency.

An internal node is labelled by the sum of the frequencies of all
the leaves in its subtree.



Binary Tree Representation

Proposition: The binary tree represents a prefix free code since
a path to a leaf cannot be a prefix of any other path.

Complexity Parameters:
$f(x)$: the frequency of a leaf $x$.
$\ell(x)$: the length of the path from the root to $x$.
The cost of the tree is: $B(T) = \sum_{\text{leaf } x} f(x) \cdot \ell(x)$.
$B(T)$ is the average length of a codeword.
The length of the encoded file: $\sum_{\text{leaf } x} n(x) \cdot \ell(x)$.



A Structural Claim

Lemma: Let $T$ be a tree that represents an optimal code. Then
each internal node in the tree has two children.

Proof:
Let $z$ be an internal node with only one child $y$.
There are 2 cases:
Case I: $z$ is the root.
Case II: $z$ is not the root.



Case I

z is the root: Make y the new root.

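[Figure: before, the root z has the single child y with subtrees B and C; after, y is the root with subtrees B and C.]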



Case II

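z is not the root and p is its parent: Bypass z by making y the child
of p.

[Figure: before, p has the subtree A and the node z, and z has the single child y with subtrees B and C; after, p has the subtree A and the node y, and y keeps subtrees B and C.]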



Proof

In both cases:
$\ell(x)$ of all the leaves in the sub-tree rooted at $z$ is reduced by 1.

These are the only changes.

As a result the cost of the tree is improved.

A contradiction to the optimality of the code.



Example: Code I

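100
├─0─ 86
│    ├─0─ 58
│    │    ├─0─ a:45    (codeword 000)
│    │    └─1─ b:13    (codeword 001)
│    └─1─ 28
│         ├─0─ c:12    (codeword 010)
│         └─1─ d:16    (codeword 011)
└─1─ 14
     └─0─ 14
          ├─0─ e:9     (codeword 100)
          └─1─ f:5     (codeword 101)

$B(T) = 300$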



Example: Improving Code I

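100
├─0─ 86
│    ├─0─ 58
│    │    ├─0─ a:45    (codeword 000)
│    │    └─1─ b:13    (codeword 001)
│    └─1─ 28
│         ├─0─ c:12    (codeword 010)
│         └─1─ d:16    (codeword 011)
└─1─ 14
     ├─0─ e:9          (codeword 10)
     └─1─ f:5          (codeword 11)

$B(T) = 3 \cdot 86 + 2 \cdot 14 = 286$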



Huffman Algorithm

Construct a coding tree bottom-up.


Maintain a forest with n leaves in all of its trees. Each tree is
optimal for its leaves.
Initially, there are n singleton trees in the forest. Each tree is a leaf.
The frequency of a tree is the sum of the frequencies of all of its
leaves.
Greedy step:
Find the two trees with the minimum frequencies.
Combine them together into one tree.
The frequency of the new tree is the sum of the frequencies of the
two combined trees.
Terminate when there is only one tree in the forest.



Example

Each tree is written as frequency(left subtree, right subtree).

Start:
    f:5   e:9   c:12   b:13   d:16   a:45

Step 1: combine f:5 and e:9 into a tree of frequency 14.
    c:12   b:13   14(f:5, e:9)   d:16   a:45

Step 2: combine c:12 and b:13 into a tree of frequency 25.
    14(f:5, e:9)   d:16   25(c:12, b:13)   a:45

Step 3: combine 14 and d:16 into a tree of frequency 30.
    25(c:12, b:13)   30(14(f:5, e:9), d:16)   a:45

Step 4: combine 25 and 30 into a tree of frequency 55.
    a:45   55(25(c:12, b:13), 30(14(f:5, e:9), d:16))

Step 5: combine a:45 and 55 into a tree of frequency 100.
    100(a:45, 55(25(c:12, b:13), 30(14(f:5, e:9), d:16)))



Huffman Code Animation

http://www.cs.auckland.ac.nz/~jmor159/PLDS210/huffman.html



Correctness

Huffman algorithm generates a binary tree with n leaves.

A binary tree represents a prefix free code.



Implementation – Data Structure

A forest of binary trees.


Initially, the forest contains n singleton trees.
At the end, the forest contains one tree.

The frequencies of the trees in the forest are maintained in a
priority queue $Q$.
Initially, the queue contains the $n$ original frequencies.
At the end, the queue contains one frequency which is the sum of
all original frequencies.



Implementation – Procedure

Huffman(⟨a_1, f_1⟩, ..., ⟨a_n, f_n⟩)
    Build-Queue({f_1, ..., f_n}, Q)
    for i = 1 to n − 1 (* the combination loop *)
        z = Allocate-Node() (* creating a new root *)
        x = left(z) = Extract-Min(Q)
            (* lightest tree is the left sub-tree *)
        y = right(z) = Extract-Min(Q)
            (* second lightest tree is the right sub-tree *)
        f(z) = f(x) + f(y) (* frequency of new root *)
        Insert(Q, f(z)) (* inserting the new root to the queue *)
    return Extract-Min(Q) (* last tree is the Huffman code *)
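
A compact Python sketch of the procedure, using the standard `heapq` module as the priority queue. The name `huffman_code`, the nested-tuple tree representation, and the tie-breaking counter are choices made for this illustration; it assumes at least one symbol and that no symbol is itself a tuple.

```python
import heapq
from itertools import count


def huffman_code(frequencies):
    """frequencies: dict symbol -> frequency. Returns dict symbol -> codeword."""
    tie = count()  # tie-breaker so the heap never has to compare trees
    heap = [(f, next(tie), symbol) for symbol, f in frequencies.items()]
    heapq.heapify(heap)  # Build-Queue
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)  # lightest tree: left sub-tree
        fy, _, y = heapq.heappop(heap)  # second lightest: right sub-tree
        heapq.heappush(heap, (fx + fy, next(tie), (x, y)))

    codes = {}

    def assign(tree, prefix):
        if isinstance(tree, tuple):  # internal node: (left, right)
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")
        else:                        # leaf: a symbol
            codes[tree] = prefix or "0"  # single-symbol alphabet gets "0"

    assign(heap[0][2], "")
    return codes


counts = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
codes = huffman_code(counts)
print(sum(counts[s] * len(codes[s]) for s in counts))  # prints 224
```

On the example alphabet this reproduces code II, with an encoded length of 224 bits.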



Complexity

Implement the priority queue with a Binary Heap.

The complexity of Build-Queue is O(n).

The complexity of Extract-Min and Insert is O(log n).

The loop is executed O(n) times.

The complexity of all the Extract-Min and Insert operations is
O(n log n).

The total complexity is: O(n log n).



Optimality - First Lemma

Let A be an alphabet.

Let x and y be the two symbols in A with the smallest frequencies.

Then, there exists an optimal tree in which:


x and y are adjacent leaves (differ only in their last bit).
x and y are the farthest leaves from the root.



Proof

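[Figure: an optimal tree before and after exchanging the leaves x and y with the deepest adjacent leaves z and w.]

Let z and w be adjacent leaves in an optimal tree that are the
farthest from the root.

Exchanging z and w with x and y yields a tree with a smaller or
equal cost, since x and y have the smallest frequencies and move to
the deepest level.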



Optimality - Second Lemma

Let $T$ be an optimal tree for the alphabet $A$.

Let $x$, $y$ be adjacent leaves in $T$ and let $z$ be their parent.

Let $A'$ be $A$ with a new symbol $z$ replacing $x$ and $y$ with
frequency: $f(z) = f(x) + f(y)$.

Let $T'$ be the tree $T$ without the leaves $x$ and $y$ and with $z$ as a
new leaf.

Then $T'$ is an optimal tree for the alphabet $A'$.



Proof

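[Figure: the internal node z with children x:f(x) and y:f(y) in $T$, and the single leaf z:f(x)+f(y) that replaces it in $T'$.]

Assume, for contradiction, that there is a tree $T''$ for $A'$ with smaller
cost than $T'$.
Replacing the leaf z in $T''$ with an internal node whose children are the
leaves x and y creates a tree for $A$ with a smaller cost than $T$.
A contradiction to the optimality of $T$.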



Optimality

Theorem: Huffman code is optimal.

Proof by Induction:
The first lemma implies that the first greedy step is a first step
towards an optimal solution.

The second lemma justifies the inductive steps, applying again and
again the first lemma.
