Greedy Algorithms and Data Compression. Curs Fall 2017
Fractional Knapsack
INPUT: a set I = {1, . . . , n} of items that can be fractioned, each item i with weight wi and value vi, and a maximum permissible weight W.
QUESTION: select whole or fractional items so as to maximize the profit within the allowed weight W.
Example.
Item i:     1   2   3
Value vi:   60 100 120
Weight wi:  10  20  30
W = 28
Fractional knapsack
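The greedy rule here is to sort the items by the ratio vi/wi and take as much as possible of the best remaining item. A minimal Python sketch (function and variable names are ours, not from the slides):

def fractional_knapsack(values, weights, W):
    # Greedy: best value/weight ratio first; take the whole item,
    # or the fraction that fills the remaining capacity.
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    profit = 0.0
    for v, w in items:
        if W <= 0:
            break
        take = min(w, W)
        profit += v * take / w
        W -= take
    return profit

print(fractional_knapsack([60, 100, 120], [10, 20, 30], 28))   # 150.0

On the example above (W = 28) it takes item 1 whole and 18/20 of item 2, for a profit of 150; for the fractional problem this greedy rule is optimal.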
0-1 Knapsack
INPUT: a set I = {1, . . . , n} of items that can NOT be fractioned, each item i with weight wi and value vi, and a maximum permissible weight W.
QUESTION: select items so as to maximize the profit within the allowed weight W.
For example, with W = 50 and
Item i:     1   2   3
Value vi:   60 100 120
Weight wi:  10  20  30
vi/wi:       6   5   4
any solution which includes item 1 is not optimal. The optimal solution consists of items 2 and 3.
Activity scheduling problems
Activity (A): 1 2 3 4 5 6 7 8
Start (s): 3 2 2 1 8 6 4 7
Finish (f): 5 5 3 5 9 9 5 8
[Figure: the eight activities drawn as intervals on the time line from 1 to 10]
To apply the greedy technique to a problem, we must take into consideration the following:
▶ a local criterion to allow the selection,
▶ a condition to determine if a partial solution can be completed,
▶ a procedure to test that we have the optimal solution.
The Activity Selection problem.
Given a set A of activities, we wish to maximize the number of compatible activities.
Activity selection (A)
  Sort A by increasing order of fi
  Let a1, a2, . . . , an be the resulting sorted list of activities
  S := {a1}
  j := 1 {index of the last selected activity in the sorted list}
  for i = 2 to n do
    if si ≥ fj then
      S := S ∪ {ai} and j := i
    end if
  end for
  return S
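A direct Python transcription of the pseudocode (the representation of activities as (s, f) pairs is ours):

def activity_selection(acts):
    # acts: list of (start, finish) pairs
    order = sorted(acts, key=lambda a: a[1])   # sort by increasing f_i
    sol = [order[0]]
    last_f = order[0][1]                       # f_j of the last selected activity
    for s, f in order[1:]:
        if s >= last_f:                        # s_i >= f_j: compatible, select it
            sol.append((s, f))
            last_f = f
    return sol

acts = [(3, 5), (2, 5), (2, 3), (1, 5), (8, 9), (6, 9), (4, 5), (7, 8)]
print(activity_selection(acts))   # [(2, 3), (3, 5), (7, 8), (8, 9)]: activities 3, 1, 8, 5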
Sorting by fi gives A: 3, 1, 2, 7, 8, 5, 6 with fi: 3, 5, 5, 5, 8, 9, 9 ⇒ SOL: {3, 1, 8, 5}
[Figure: the selected activities on the time line from 1 to 10]
Notice: in the activity problem we are maximizing the number of activities, independently of the occupancy of the resource under consideration. For example, take the intervals (2, 5) and (5, 9) with weight 6 each, and the interval (1, 10) with weight 10:
[Figure: the eight activities on the time line 1 to 10, and the three weighted intervals on the time line 1 to 10]
Choosing the single interval (1, 10) yields weight 10, while the solution with the two intervals (2, 5) and (5, 9) has total weight 12.
Job scheduling problem
Also known as the lateness minimisation problem.
We have a single resource and n requests to use the resource, each request i taking a time ti.
In contrast to the previous problem, each request, instead of having a starting and a finishing time, has a deadline di. The goal is to schedule the tasks so as to minimize, over all the requests, the maximal amount of time by which a request exceeds its deadline.
Minimize Lateness
▶ We have a single processor.
▶ We have n jobs such that job i:
  ▶ requires ti units of processing time,
  ▶ has to be finished by time di.
▶ Lateness of i:
  Li = 0 if fi ≤ di, and Li = fi − di otherwise; i.e. Li = max{0, fi − di}.

i:   1  2  3   4  5   6
ti:  1  2  2   3  3   4
di:  9  8  15  6  14  9

[Figure: the jobs scheduled in order 1, . . . , 6 on the time line from 0 to 15, with lateness 0, 0, 0, 2, 0, 6]
Minimize Lateness
Schedule the jobs according to some ordering. Two natural orderings fail:
(1.-) Sort in increasing order of ti: process jobs with short time first. Counterexample:
i:   1  2
ti:  1  5
di:  6  5
Shortest-first runs job 1 first and job 2 finishes at 6 > 5; running job 2 first gives no lateness at all.
(2.-) Sort in increasing order of slack di − ti. Counterexample:
i:        1   2
ti:       1  10
di:       2  10
di − ti:  1   0
Smallest slack first runs job 2 first, so job 1 finishes at 11 with lateness 9; the schedule 1, 2 has maximal lateness 1.
(3.-) Sort in increasing order of deadline di: earliest deadline first. On the six-job example this gives the schedule
d: 6 8 9 9 14 15 (jobs 4, 2, 1, 6, 5, 3)
with lateness 0, 0, 0, 1, 0, 0, i.e. maximal lateness 1.
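A sketch of the earliest-deadline-first rule in Python (jobs as (t, d) pairs; names are ours):

def min_lateness(jobs):
    # jobs: list of (t_i, d_i); schedule by increasing deadline
    schedule = sorted(jobs, key=lambda job: job[1])
    t, max_late = 0, 0
    for ti, di in schedule:
        t += ti                            # f_i: finishing time of this job
        max_late = max(max_late, t - di)   # lateness max(0, f_i - d_i)
    return schedule, max_late

jobs = [(1, 9), (2, 8), (2, 15), (3, 6), (3, 14), (4, 9)]
print(min_lateness(jobs)[1])   # 1, as in the six-job example above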
Complexity and idle time
Time complexity
Running time of the algorithm, apart from the comparison sort: O(n).
Total running time: O(n lg n).
Idle steps
From an optimal schedule with idle steps, we can always eliminate the gaps to obtain another optimal schedule:
[Figure: a schedule with an idle gap on the time line from 0 to 8, and the same schedule with the gap closed]
[Figure: jobs i and j with dj < di and i scheduled first: an inversion, removed by exchanging i and j]
Lemma
Exchanging two adjacent inverted jobs reduces the number of inversions by 1 and does not increase the max lateness.
Proof Let L be the lateness before the exchange and L′ the lateness after it, and let Li, Lj, L′i, L′j be the corresponding quantities for i and j. Suppose i is scheduled just before j although dj < di. After the exchange j finishes earlier than before, so L′j ≤ Lj. For i, notice that f′i = fj; using the fact that dj < di,
L′i = f′i − di = fj − di < fj − dj = Lj.
Therefore the swapping does not increase the maximum lateness of the schedule. □
Correctness of LatenessA
Theorem
Algorithm LatenessA returns an optimal schedule S.
Proof
Assume Ŝ is an optimal schedule with the minimal number of inversions (and no idle steps).
If Ŝ has 0 inversions then Ŝ = S.
If the number of inversions in Ŝ is > 0, let i − j be an adjacent inversion. Exchanging i and j does not increase the max lateness and decreases the number of inversions, contradicting the minimality of Ŝ.
Therefore, max lateness of S ≤ max lateness of Ŝ. □
Network construction: Minimum Spanning Tree
▶ We have a set of locations V = {v1, . . . , vn},
▶ we want to build a communication network on top of them,
▶ we want that any vi can communicate with any vj,
▶ for any pair (vi, vj) there is a cost w(vi, vj) of building a direct link,
▶ if E is the set of all n(n − 1)/2 possible edges, we want to find a subset T(E) ⊆ E s.t. (V, T(E)) is connected and minimizes Σ_{e∈T(E)} w(e).
Construct the MST
Minimum Spanning Tree (MST).
[Figure: the example graph on vertices a-h with edge weights, and the same graph with its MST highlighted]
Some definitions
Given G = (V, E):
A path is a sequence of consecutive edges. A cycle is a path with no repeated vertices other than the one in which it starts and ends.
A cut is a partition of V into S and V − S.
[Figure: the example graph on vertices a-h, illustrating a cut]
The cut rule (Blue rule)
Given a cut with no blue edge crossing it, the crossing edge of less weight can be colored blue (it belongs to some MST).
The cycle rule (Red rule)
The MST problem has the property that the optimal solution can't contain a cycle: given a cycle with no red edge, its edge of largest weight can be colored red (discarded).
Greedy for MST
Greedy scheme:
Given G with |V(G)| = n, apply the red and blue rules until having n − 1 blue edges; those form the MST.
Greedy for MST: Correctness
Theorem
There exists a MST T containing all the blue edges. Moreover the algorithm finishes and finds a MST.
Sketch of proof Induction on the number of iterations of the blue and red rules. The base case (no edges colored) is trivial. The induction step is the same as in the proofs of the cut and cycle rules. Moreover, while some edge e is not colored, a rule always applies: if the ends of e are in different blue trees, apply the blue rule to a cut separating them; otherwise e closes a cycle and the red rule applies. □
[Figure: the example graph on vertices a-h during the coloring]
Jarník-Prim vs. Kruskal
Kruskal: how to establish a minimum distance cost network connecting all the points (first 6 edges chosen).
Jarník-Prim greedy algorithm.
MST (G, w, r)
  T := ∅
  for i = 1 to |V| − 1 do
    Let e ∈ E be the minimum weight edge that touches T and does not form a cycle
    T := T ∪ {e}
  end for
Use a priority queue to choose the minimum weight edge connected to the tree already formed.
For every v ∈ V − T, let k[v] = the minimum weight of any edge connecting v to a vertex in T; start with k[v] = ∞ for all v.
For v ∈ T, let π[v] be the parent of v. During the algorithm,
T = {(v, π[v]) : v ∈ V − {r} − Q},
where r is the arbitrary starting vertex and Q is a min priority queue storing k[v]. The algorithm finishes when Q = ∅.
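A heap-based Python sketch of Jarník-Prim (a lazy variant: instead of decreasing the keys k[v] it pushes duplicate entries and discards stale ones, which keeps the code short at the same O(m lg n) cost; the adjacency-dict representation is ours):

import heapq

def prim(graph, r):
    # graph: dict vertex -> list of (weight, neighbour) pairs; r: start vertex
    in_tree, tree = {r}, []
    heap = [(w, r, v) for w, v in graph[r]]
    heapq.heapify(heap)
    while heap:
        w, u, v = heapq.heappop(heap)      # lightest edge touching the tree
        if v in in_tree:
            continue                       # stale entry: would close a cycle
        in_tree.add(v)
        tree.append((u, v, w))
        for w2, x in graph[v]:
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return tree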
Example.
[Figure: step-by-step run of Jarník-Prim on the example graph on vertices a-h, growing the tree from the start vertex; final tree of weight w(T) = 52]
J. Kruskal, 1956
Similar to Jarník-Prim, but it chooses minimum weight edges without keeping the selected subgraph connected.
MST (G, w)
  Sort E by increasing weight
  T := ∅
  for i = 1 to |V| − 1 do
    Let e ∈ E be the edge with minimum weight that does not form a cycle with T
    T := T ∪ {e}
  end for
We have an O(m lg m) cost from sorting the edges.
To implement the addition of edges to T and the cycle test efficiently, we use the Union-Find data structure.
Example.
[Figure: step-by-step run of Kruskal on the same example graph on vertices a-h]
Union-find: A DS to implement Kruskal
[Figure: a union-find forest, with the rank of each node]
Link by rank
Union rule: link the root of the smaller rank tree to the root of the larger rank tree.
In case the roots of both trees have the same rank, choose the winner arbitrarily and increase its rank by 1.
Except while it is a root, a node does not change its rank during the process.
[Figure: Union(D, F): the rank-1 tree rooted at B (children C, D) is linked below the rank-2 tree rooted at H (children E, F, G)]
Make-set (x)
  parent(x) = x
  rank(x) = 0

Find (x)
  while x ≠ parent(x) do
    x = parent(x)
  end while
  return x

Union (x, y)
  rx = Find(x); ry = Find(y)
  if rx = ry then
    stop
  else if rank(rx) > rank(ry) then
    parent(ry) = rx
  else if rank(rx) < rank(ry) then
    parent(rx) = ry
  else
    parent(rx) = ry
    rank(ry) = rank(ry) + 1
  end if
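A direct Python transcription of the three procedures (dictionaries parent and rank play the role of the fields; no path compression, as in the slides):

def make_set(parent, rank, x):
    parent[x] = x
    rank[x] = 0

def find(parent, x):
    while x != parent[x]:      # walk up to the root
        x = parent[x]
    return x

def union(parent, rank, x, y):
    rx, ry = find(parent, x), find(parent, y)
    if rx == ry:
        return                 # already in the same set
    if rank[rx] > rank[ry]:    # link smaller-rank root below the larger
        parent[ry] = rx
    elif rank[rx] < rank[ry]:
        parent[rx] = ry
    else:                      # equal ranks: ry wins and its rank grows by 1
        parent[rx] = ry
        rank[ry] += 1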
Example construction Union-find
[Figure: starting from singletons A, . . . , K of rank 0, successive unions build trees of rank 1 (roots B, C, F, H), then rank 2 (roots C, H), and finally a single tree of rank 3 rooted at H]
Properties of Link by rank
P1.- If x is not a root then rank(x) < rank(parent(x)).
P2.- If parent(x) changes then rank(parent(x)) increases.
P3.- Any root of rank k has ≥ 2^k descendants.
P4.- The highest rank of a root is ≤ ⌊lg n⌋.
P5.- For any r ≥ 0, there are ≤ n/2^r nodes with rank r.
Properties of Link by rank
Theorem
Using link-by-rank, each Union(x, y) and Find(x) operation takes O(lg n) steps.
Proof The number of steps for each operation is bounded by the height of the tree, which is O(lg n). □
Theorem
Starting from an empty data structure with n disjoint singleton sets, link-by-rank performs any intermixed sequence of m ≥ n Find and n − 1 Union operations in O(m lg n) steps.
Back to Kruskal
MST (G, w)
  Sort E by increasing weight: {e1, . . . , em}
  T := ∅
  for all v ∈ V do
    Make-set(v)
  end for
  for i = 1 to m do
    Choose ei = (u, v) in order from E
    if Find(u) ≠ Find(v) then
      T := T ∪ {ei}
      Union(u, v)
    end if
  end for
The cost is dominated by the O(m lg m) sorting.
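Putting it together, a Kruskal sketch on top of the find/union helpers from the union-find block above (the edge representation is ours):

def kruskal(vertices, edges):
    # edges: list of (weight, u, v) triples
    parent = {v: v for v in vertices}   # Make-set for every vertex
    rank = {v: 0 for v in vertices}
    tree = []
    for w, u, v in sorted(edges):       # O(m lg m): the dominating cost
        if find(parent, u) != find(parent, v):
            tree.append((u, v, w))
            union(parent, rank, u, v)
    return tree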
Greedy and Approximation algorithms
Many times the greedy strategy yields a feasible solution whose value is near the optimal one.
In many practical cases, when finding the global optimum is hard, it is sufficient to find a good approximation.
Given an optimization problem (maximization or minimization), an optimal algorithm computes the best output OPT(e) on any instance e of size n.
An approximation algorithm for the problem computes a valid output, not necessarily the optimal one.
We want to design approximation algorithms that are fast and in the worst case give an output as close as possible to OPT(e).
Greedy and Approximation algorithms
An algorithm Apx is an α-approximation if, for every instance e,
1/α ≤ Apx(e)/OPT(e) ≤ α.
An easy example: Vertex cover
GreedyVC (G = (V, E))
  E′ := E, S := ∅
  while E′ ≠ ∅ do
    Pick e ∈ E′, say e = (u, v)
    S := S ∪ {u, v}
    E′ := E′ − ({(u, v)} ∪ {edges incident to u, v})
  end while
  return S
[Figure: run of GreedyVC on an example graph with vertices 1-7]
An easy example: Vertex cover
Theorem
The algorithm runs in O(m + n) steps. Moreover, |Apx(e)| ≤ 2|OPT(e)|.
Proof.
We use induction to prove |Apx(e)| ≤ 2|OPT(e)|. Notice that for every edge {u, v} we add to Apx(e), at least one of u, v is in OPT(e).
Base: if V = ∅ then |Apx(e)| = |OPT(e)| = 0.
Hypothesis: |Apx(e) − {u, v}| ≤ 2|OPT(e) − {u, v}|. Then, since adding {u, v} puts 2 vertices into Apx(e) and at least one of them is in OPT(e), we get |Apx(e)| ≤ 2|OPT(e)|. □
Greedy k-center
[Figure: points x1, x2 chosen as centers, with radii r1, r2]
Greedy algorithm: Complexity
We have the set X of points and all their O(n^2) distances. We assume a data structure that keeps the set of distances D ordered, so that any distance between points in X can be retrieved quickly. How?
▶ At each step i we have to compute the distance from every x ∈ X to the current centers c ∈ Ci−1, and choose the new ci and ri; but
▶ for each x ∈ X it suffices to maintain (see the sketch below)
  di[x] = d(x, Ci) = min{d(x, Ci−1), d(x, ci)} = min{di−1[x], d(x, ci)}   (∗)
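A sketch of the greedy k-center algorithm using the update rule (∗) (representing the input as a list of points and a distance function d is our choice):

def greedy_k_center(points, k, d):
    centers = [points[0]]                           # arbitrary first center
    dist = {x: d(x, centers[0]) for x in points}    # dist[x] = d(x, C_1)
    for _ in range(k - 1):
        ci = max(points, key=lambda x: dist[x])     # farthest point: new center
        centers.append(ci)
        for x in points:
            dist[x] = min(dist[x], d(x, ci))        # rule (*): O(n) per step
    return centers, max(dist.values())              # the radius r(C)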
Theorem
The previous greedy algorithm is an approximation algorithm for the k-center problem, with approximation ratio α = 2
(i.e. it returns a set C s.t. r(C) ≤ 2r(C∗), where C∗ is an optimal set of k centers).
Proof
Let C∗ = {c∗i}, i = 1, . . . , k, and r∗ be the optimal centers and radius, and let C = {ci}, i = 1, . . . , k, and r be the values returned by the algorithm. We want to prove r ≤ 2r∗.
Case 2: at least one ball of C∗ does not cover any center in C. Then, by pigeonhole, there is a ball of C∗ covering at least two centers ci and cj ⇒ d(ci, cj) ≤ 2r∗.
We need to prove that r ≤ d(ci, cj). Wlog assume the algorithm chooses cj at iteration j and that ci has been selected as a centre in a previous iteration; then d(ci, cj) ≥ rj.
Moreover, notice that r1 ≥ r2 ≥ . . . ≥ rk = r,
therefore d(ci, cj) ≥ rj ≥ r, and r ≤ d(ci, cj) ≤ 2r∗. □
Data Compression
AAACAGTTGCAT · · · GGTCCCTAGG
(a string of length 130.000.000)
An encoding of GATACATGCT with the variable-length code A = 1, T = 01, C = 001, G = 000:
000 1 01 1 001 1 01 000 001 01
 G  A T  A  C  A T   G   C  T
Prefix tree.
Represent an encoding with the prefix property as a binary tree, the prefix tree.
A prefix tree T is a binary tree with the following properties:
▶ one leaf per symbol,
▶ left edges labeled 0 and right edges labeled 1,
▶ the labels on the path from the root to a leaf spell the code for that leaf.
[Figure: the prefix tree for Σ = {A, T, G, C} giving A = 1, T = 01, G = 000, C = 001]
Frequency.
[Figure: two prefix trees for the symbols {a, b, c, d, e}, assigning symbols to leaves at different depths]
Proof. (The prefix tree of an optimal code is a full binary tree: every internal node has two sons.)
Let T be the prefix tree of an optimal code, and suppose it contains a node u with a single son v.
If u is the root, construct T′ by deleting u and using v as root. T′ yields a code with fewer bits to code the symbols: contradiction to the optimality of T.
If u is not the root, let w be the father of u. Construct T′ by deleting u and connecting v directly to w. Again this decreases the number of bits, contradicting the optimality of T.
Greedy approach: Huffman code
We wish to produce a labeled full binary tree in which the leaves are as close to the root as possible, with symbols of low frequency placed deeper than symbols of high frequency.
Greedy approach: Huffman code
Huffman (Σ, S)
  Given Σ and S, compute the frequencies {f}
  Construct a priority queue Q on Σ, ordered by increasing f
  while |Q| > 1 do
    create a new node z
    x = Extract-Min(Q)
    y = Extract-Min(Q)
    make x, y the sons of z
    f(z) = f(x) + f(y)
    Insert(Q, z)
  end while
  return Extract-Min(Q) {the root of the prefix tree}
If Q is implemented by a heap, the algorithm has complexity O(n lg n).
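A compact Python sketch using heapq; instead of building explicit tree nodes it keeps, for each pending subtree, the list of symbols below it and extends their codewords at every merge (a counter breaks frequency ties so the heap never compares symbol lists):

import heapq

def huffman(freqs):
    # freqs: dict symbol -> frequency; returns dict symbol -> codeword
    heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    code = {s: "" for s in freqs}
    count = len(heap)
    while len(heap) > 1:
        fx, _, xs = heapq.heappop(heap)   # x = Extract-Min(Q)
        fy, _, ys = heapq.heappop(heap)   # y = Extract-Min(Q)
        for s in xs:
            code[s] = "0" + code[s]       # x hangs from the 0-edge of z
        for s in ys:
            code[s] = "1" + code[s]       # y hangs from the 1-edge of z
        heapq.heappush(heap, (fx + fy, count, xs + ys))   # Insert(Q, z)
        count += 1
    return code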
Example
Consider the text: for each rose, a rose is a rose, the rose.
with Σ = {for, each, rose, a, is, the, `,', ␣} (␣ = blank).
Frequencies: f(for) = 1/21, f(each) = 1/21, f(is) = 1/21, f(the) = 1/21, f(a) = 2/21, f(,) = 2/21, f(rose) = 4/21, f(␣) = 9/21.
Priority queue, ordered by increasing frequency:
Q = (for(1/21), each(1/21), is(1/21), the(1/21), a(2/21), ,(2/21), rose(4/21), ␣(9/21))
[Figure: step-by-step construction of the Huffman tree: successive merges create the internal nodes z1, . . . , z7, ending with the root z7 of frequency 21/21; the 0/1 labels on the edges give the final codewords]
Lemma (Greedy choice)
Let x, y be the two symbols with lowest frequency. Then there is an optimal prefix tree in which x and y are siblings at maximum depth.
Proof.
Take T optimal, with a and b siblings at maximum depth, and assume f(b) ≤ f(a). Construct T′ by exchanging x with a and y with b. As f(x) ≤ f(a) and f(y) ≤ f(b), then B(T′) ≤ B(T).
Theorem (Optimal substructure)
Assume T′ is an optimal prefix tree for (Σ − {x, y}) ∪ {z}, where x, y are the symbols with lowest frequency and z has frequency f(z) = f(x) + f(y). Then the tree T obtained from T′ by making x and y the children of z is an optimal prefix tree for Σ.
Proof.
Let T0 be any prefix tree for Σ; we must show B(T) ≤ B(T0). By the previous lemma we only need to consider T0 in which x and y are siblings; let T0′ be obtained from T0 by removing x, y and putting z in the place of their father. As T0′ is a prefix tree for (Σ − {x, y}) ∪ {z}, we have B(T0′) ≥ B(T′).
Comparing each tree with its reduced version we get
B(T0) = B(T0′) + f(x) + f(y) and B(T) = B(T′) + f(x) + f(y).
Putting the three relations together, B(T) ≤ B(T0).
Optimality of Huffman