
Greedy Algorithms

Greedy Technique
Constructs a solution to an optimization problem piece by piece through a sequence of choices that are:
• feasible, i.e. satisfying the constraints (defined by an objective function and a set of constraints)
• locally optimal (with respect to some neighborhood definition)
• greedy (in terms of some measure), and irrevocable

For some problems, it yields a globally optimal solution for every instance. For most, it does not, but it can be useful for fast approximations. We are mostly interested in the former case in this class.
Optimization Problems

• Optimization Problem
– Problem with an objective function to either:
• Maximize some profit
• Minimize some cost

• Optimization problems appear in many applications
– Maximize the number of jobs using a resource [Activity-Selection Problem]
– Encode the data in a file to minimize its size [Huffman Encoding Problem]
– Collect the maximum value of goods that fit in a given bucket [Knapsack Problem]
– Select the smallest-weight set of edges to connect all nodes in a graph [Minimum Spanning Tree]
Solving Optimization Problems
• Two techniques for solving optimization problems:
– Greedy Algorithms (“Greedy Strategy”)
– Dynamic Programming
Space of optimization problems

• Greedy algorithms can solve some problems optimally
• Dynamic programming can solve more problems optimally (a superset)
• We still care about greedy algorithms because, for some problems:
– Dynamic programming is overkill (slow)
– The greedy algorithm is simpler and more efficient
Greedy Algorithms

• Main Concept
– Divide the problem into multiple steps (sub-problems)
– For each step, take the best choice at the current moment (the locally optimal, greedy choice)
– A greedy algorithm always makes the choice that looks best at the moment
– The hope: a locally optimal choice will lead to a globally optimal solution
• For some problems, it works. For others, it does not
Change-Making Problem
Given unlimited amounts of coins of denominations d1 > … > dm, give change for amount n with the least number of coins.

Q: What are the objective function and constraints?

Example: d1 = 25c, d2 = 10c, d3 = 5c, d4 = 1c and n = 48c

Ex: Prove the greedy algorithm is optimal for the above denominations.

Greedy is not optimal for every denomination set. For example, with d1 = 25c, d2 = 10c, d3 = 1c and n = 30c, greedy gives 25 + 1 + 1 + 1 + 1 + 1 (6 coins), while the optimum is 10 + 10 + 10 (3 coins).
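A minimal Python sketch of this greedy rule (the function name and data layout are illustrative, not from the slides):

def greedy_change(denominations, n):
    """Greedy change-making: repeatedly take the largest coin that fits.
    denominations must be sorted in decreasing order; amounts are in cents."""
    coins = []
    for d in denominations:
        count, n = divmod(n, d)   # take as many of coin d as fit
        coins.extend([d] * count)
    return coins

print(greedy_change([25, 10, 5, 1], 48))  # optimal here: [25, 10, 10, 1, 1, 1]
print(greedy_change([25, 10, 1], 30))     # suboptimal: 6 coins instead of 10+10+10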


Making change

Consider this commonplace example:


– Making the exact change with the minimum number of coins
– Consider the Euro denominations of 1, 2, 5, 10, 20, 50 cents
– Stating with an empty set of coins, add the largest coin possible into the
set which does not go over the required amount
Making change

To make change for €0.72:
– Start with a €0.50 (total: €0.50)
– Add a €0.20 (total: €0.70)
– Skip the €0.10 and the €0.05, but add a €0.02 (total: €0.72)
Making change
Notice that each decimal digit can be worked with separately
– The maximum number of coins for any digit is three
– Thus, making change for anything less than €1 requires at most six coins
– The solution is optimal
Another Example with TL
To make change for 9.78 TL (978 Kuruş) using the denominations 50, 25, 10, 5, and 1 Kuruş:
We start with the largest coin, 50 Kuruş, taking as many as fit:
978 Kuruş - 19 × 50 Kuruş = 28 Kuruş remaining; number of 50 Kuruş coins: 19
Next, one 25 Kuruş coin fits:
28 Kuruş - 1 × 25 Kuruş = 3 Kuruş remaining; number of 25 Kuruş coins: 1
The 10 and 5 Kuruş coins are now too large, so we finish with 1 Kuruş coins:
3 Kuruş - 3 × 1 Kuruş = 0; number of 1 Kuruş coins: 3
In total, greedy uses 19 + 1 + 3 = 23 coins to make 9.78 TL.
Activity-Selection Problem
• Given a set of activities A1, A2, …, An (e.g., talks or lectures)
• Each activity has a start time and an end time
– Each Ai has (Si, Ei)
• Activities share a common resource (e.g., a lecture hall)
• Objective: maximize the number of "compatible" activities that use the resource
– Cannot have two overlapping activities

[Timeline diagram: activities A1–A8 laid out along the time dimension, with several overlapping]
Another Version Of The Same Problem

• A set of events A1, A2, …An

• Each event has a start time and end time

• You want to attend as many events as possible


Example of Compatible Activities

• Set of activities A = {A1, A2, …, An}
• Each Ai = (Si, Ei)

Examples of compatible activity sets:
{A1, A3, A8}
{A1, A4, A5, A7}
{A2, A5, A8}
…

[Timeline diagram: activities A1–A8 laid out along the time dimension]
Greedy Algorithm
• Select the activity that ends first (smallest end time)
– Greedy choice: select the next best activity (local optimum)
– Intuition: it leaves the largest possible empty space for more activities

• Once an activity is selected
– Delete all non-compatible activities; they cannot be selected

• Repeat the algorithm for the remaining activities
– Either using iteration or recursion
– Sub-problem: we created one sub-problem to solve (find the optimal schedule after the selected activity)

Hopefully, when we merge the local optimum with the sub-problem's optimal solution, we get a global optimum.
Example

The greedy algorithm will select: {A1, A4, A6, A7}
Is that an optimal answer? Can we find a larger set?
Recursive Solution

Two arrays contain the start and end times (assumption: they are sorted based on end times). k is the activity chosen in the last call; n is the problem size.

Recursive-Activity-Selection(S, E, k, n)
  m = k + 1
  while (m <= n) && (S[m] < E[k])   // find the next activity starting after the end of k
    m++
  if (m <= n)
    return {Am} ∪ Recursive-Activity-Selection(S, E, m, n)
  else
    return Φ

Time complexity: O(n) (assuming the arrays are already sorted; otherwise add O(n log n) for sorting)
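A runnable Python rendering of this recursion (a sketch: 0-based indexing replaces the slides' 1-based arrays, and the sample data is illustrative):

def recursive_activity_selection(S, E, k=None):
    """Return indices of selected activities; S and E are sorted by end time."""
    m = 0 if k is None else k + 1
    # skip activities that start before the end of activity k
    while m < len(S) and k is not None and S[m] < E[k]:
        m += 1
    if m < len(S):
        return [m] + recursive_activity_selection(S, E, m)
    return []

S = [1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
E = [4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(recursive_activity_selection(S, E))  # [0, 3, 7, 10]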
Iterative Solution

Two arrays contain the start and end times (assumption: they are sorted based on end times).

Iterative-Activity-Selection(S, E)
  n = S.Length
  List = {A1}
  lastSelection = 1
  for i = 2 to n
    if (S[i] >= E[lastSelection])
      List = List ∪ {Ai}
      lastSelection = i
  return List
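The same loop in runnable Python (a sketch with 0-based indexing):

def iterative_activity_selection(S, E):
    """S, E: start/end times sorted by end time; returns selected indices."""
    selected = [0]           # the first activity to finish is always taken
    last = 0
    for i in range(1, len(S)):
        if S[i] >= E[last]:  # compatible with the last selected activity
            selected.append(i)
            last = i
    return selected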
Elements Of Greedy Algorithms
• Greedy-Choice Property
– At each step, we make a greedy (locally optimal) choice

• Top-Down Solution
– The greedy choice is usually made independently of the sub-problems
– It is usually made "before" solving the sub-problem

• Optimal Substructure
– The globally optimal solution can be composed from the locally optimal solutions of the sub-problems
Interval scheduling

Suppose we want to maximize the number of processes that are run.

To create a greedy algorithm, we need a fast selection rule that quickly determines which process should be run next.

The first thought may be to always run the process that is next ready to run
– A little thought, however, quickly demonstrates that this fails
– In the worst case we would run only 1 of n possible processes when the other n – 1 processes could have been run
Interval scheduling

To maximize the number of processes that are run, we should try to free up the processor as quickly as possible
– Instead of looking at the start times, look at the end times
– Whenever the processor is available, select the process with the earliest end time: the earliest-deadline-first algorithm

[Figure: an example in which Process B is the first to start, followed by Process C]
Interval scheduling

Consider the following list of 12 processes together with the time interval during which they must be run
– Find the optimal schedule with the earliest-deadline-first greedy algorithm

Process   Interval
A         5 – 8
B         10 – 13
C         6 – 9
D         12 – 15
E         3 – 7
F         8 – 11
G         1 – 6
H         8 – 12
J         3 – 5
K         2 – 4
L         11 – 16
M         10 – 15
Interval scheduling

To simplify this, sort the processes on their end times:

Process   Interval
K         2 – 4
J         3 – 5
G         1 – 6
E         3 – 7
A         5 – 8
C         6 – 9
F         8 – 11
H         8 – 12
B         10 – 13
D         12 – 15
M         10 – 15
L         11 – 16
Interval scheduling

Walking through the sorted list with the earliest-deadline-first rule:
– To begin, choose Process K (2 – 4)
– At this point, Processes J, G and E can no longer be run
– Next, run Process A (5 – 8)
– We can no longer run Process C
– Next, we can run Process F (8 – 11)
– This restricts us from running Processes H, B and M
– The next available process is D (12 – 15)
– This prevents us from running Process L; we are therefore finished
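Running the earliest-deadline-first rule on this data in Python confirms the walkthrough (a sketch; the tuples encode the intervals from the table above):

# Processes sorted by end time, as (name, start, end)
processes = [("K", 2, 4), ("J", 3, 5), ("G", 1, 6), ("E", 3, 7),
             ("A", 5, 8), ("C", 6, 9), ("F", 8, 11), ("H", 8, 12),
             ("B", 10, 13), ("D", 12, 15), ("M", 10, 15), ("L", 11, 16)]

schedule, last_end = [], float("-inf")
for name, start, end in processes:
    if start >= last_end:     # compatible with everything chosen so far
        schedule.append(name)
        last_end = end
print(schedule)               # ['K', 'A', 'F', 'D']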
Application: Interval scheduling

We have scheduled four processes: K, A, F and D
– The selection may not be unique

Once the processes are sorted, the run time is linear: we simply look ahead to find the next process that can be run
– Thus, the run time is dominated by the time needed to sort the list
Application: Interval scheduling

For example, we could have chosen Process L (11 – 16) instead of D

In this case, processor usage would go up, but no significance is given to that criterion
Application: Interval scheduling

We could add weights to the individual processes
– The weights could be the durations of the processes: maximize processor usage
– The weights could be the revenue gained from running them: maximize revenue

We will see an efficient algorithm for this weighted version in the topic on dynamic programming
KNAPSACK PROBLEM

There are two versions of the knapsack problem:

1. 0-1 knapsack problem:
• Items are indivisible (either take an item or not)
• Can be solved with dynamic programming

2. Fractional knapsack problem:
• Items are divisible (can take any fraction of an item)
• Can be solved with a greedy method
Knapsack Problem: Definition

• A thief has a knapsack with maximum capacity W, and a set S consisting of n items
• Each item i has some weight wi and benefit value vi (all wi, vi and W are integer values)
• Problem: how to pack the knapsack to achieve the maximum total value of packed items?
0-1 Knapsack

• Items cannot be divided: either take an item or leave it
• Find xi ∈ {0, 1} for i = 1, 2, …, n such that
  Σ wi·xi ≤ W and Σ vi·xi is maximized
  (xi = 1 means item i is taken; xi = 0 means item i is skipped)
0-1 Knapsack - Greedy Strategy Does Not Work

• E.g.: W = 50; Item 1: 10 pounds, $60 ($6/pound); Item 2: 20 pounds, $100 ($5/pound); Item 3: 30 pounds, $120 ($4/pound)

• Greedy choice: not optimal
– Compute the benefit per pound
– Sort the items based on these values
– Greedy then takes items 1 and 2 for $160, but the optimal choice is items 2 and 3 for $220
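A small Python sketch comparing the density-greedy choice against exhaustive search on this instance:

from itertools import combinations

items = [(10, 60), (20, 100), (30, 120)]   # (weight, value)
W = 50

# Greedy by value density ($/pound), largest first
greedy_value, load = 0, 0
for w, v in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
    if load + w <= W:
        load += w
        greedy_value += v

# Brute force over all subsets
best = max(sum(v for _, v in c)
           for r in range(len(items) + 1)
           for c in combinations(items, r)
           if sum(w for w, _ in c) <= W)

print(greedy_value, best)   # 160 220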
Project management
0/1 knapsack problem
Situation:
– The next cycle for a given product is 26 weeks
– We have ten possible projects which could be completed in that
time, each with an expected number of weeks to complete the
project and an expected increase in revenue

This is also called the 0/1 knapsack problem


– You can place n items in a knapsack where each item has a
value and a weight in kilograms
– The knapsack can hold a maximum of m kilograms
Project management
0/1 knapsack problem

Objective: choose the set of projects which can be completed in the required amount of time and which maximizes revenue
Project management
0/1 knapsack problem
The projects:

Product ID   Completion Time (wks)   Expected Revenue (1000 $)
A            15                      210
B            12                      220
C            10                      180
D            9                       120
E            8                       160
F            7                       170
G            5                       90
H            4                       40
J            3                       60
K            1                       10
Project management
0/1 knapsack problem
First, try to find an optimal schedule by being as productive as possible during the 26 weeks:
– start with the projects in order from most time to least time and, at each step, select the longest-running project which does not put us over 26 weeks
– fill in the gaps with the smaller projects
Project management
0/1 knapsack problem
Greedy-by-time (make use of all 26 wks):
– Project A: 15 wks
– Project C: 10 wks
– Project K: 1 wk

Total time: 26 wks
Expected revenue: $400 000
Project management
0/1 knapsack problem
Next, let us attempt to find an optimal schedule by starting with the best-paying projects:
– start with the projects in order from highest to lowest expected revenue and, at each step, select the best-paying project which does not put us over 26 weeks
– fill in the gaps with the smaller projects
Project management
0/1 knapsack problem
Greedy-by-revenue (best-paying projects):
– Project B: $220K (12 wks)
– Project C: $180K (10 wks)
– Project J: $ 60K (3 wks)
– Project K: $ 10K (1 wk)

Total time: 26 wks
Expected revenue: $470 000
Project management
0/1 knapsack problem
Unfortunately, each of these techniques focuses only on projects with high expected revenues or long run times.

What we really want is to complete those jobs which pay the most per unit of development time.

Thus, rather than using development time or revenue alone, let us calculate the expected revenue per week of development time.
Project management
0/1 knapsack problem
This is summarized here:

Product ID   Completion Time (wks)   Expected Revenue (1000 $)   Revenue Density ($/wk)
A            15                      210                         14 000
B            12                      220                         18 333
C            10                      180                         18 000
D            9                       120                         13 333
E            8                       160                         20 000
F            7                       170                         24 286
G            5                       90                          18 000
H            4                       40                          10 000
J            3                       60                          20 000
K            1                       10                          10 000
Project management
0/1 knapsack problem
Greedy-by-revenue-density:
– Project F: $24 286/wk (7 wks)
– Project E: $20 000/wk (8 wks)
– Project J: $20 000/wk (3 wks)
– Project G: $18 000/wk (5 wks)
– Project K: $10 000/wk (1 wk)

Total time: 24 wks
Expected revenue: $490 000
Bonus: 2 weeks for bug fixing
Project management
0/1 knapsack problem
Using brute force, we find that the optimal solution is:
– Project C: $180 000 (10 wks)
– Project E: $160 000 (8 wks)
– Project F: $170 000 (7 wks)
– Project K: $ 10 000 (1 wk)

Total time: 26 wks
Expected revenue: $520 000
Project management
0/1 knapsack problem
In this case, greedy-by-revenue-density came closest to the optimal solution:

Algorithm                      Expected Revenue
Greedy-by-time                 $400 000
Greedy-by-expected-revenue     $470 000
Greedy-by-revenue-density      $490 000
Brute force                    $520 000

– The run time is Θ(n log n): the time required to sort the list
– Later, we will see a dynamic program for finding an optimal solution with one additional constraint
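A Python sketch that reproduces these numbers; the helper and the project list mirror the tables above (names and structure are illustrative):

from itertools import combinations

# (project, weeks, revenue in $1000)
projects = [("A", 15, 210), ("B", 12, 220), ("C", 10, 180), ("D", 9, 120),
            ("E", 8, 160), ("F", 7, 170), ("G", 5, 90), ("H", 4, 40),
            ("J", 3, 60), ("K", 1, 10)]
BUDGET = 26

def greedy(projects, key):
    """Scan projects in decreasing order of `key`, taking each one that fits."""
    weeks = revenue = 0
    for _, w, r in sorted(projects, key=key, reverse=True):
        if weeks + w <= BUDGET:
            weeks, revenue = weeks + w, revenue + r
    return revenue

print(greedy(projects, key=lambda p: p[1]))         # by time: 400
print(greedy(projects, key=lambda p: p[2]))         # by revenue: 470
print(greedy(projects, key=lambda p: p[2] / p[1]))  # by density: 490

# Brute force over all 2^10 subsets: 520
print(max(sum(r for _, _, r in c)
          for n in range(len(projects) + 1)
          for c in combinations(projects, n)
          if sum(w for _, w, _ in c) <= BUDGET))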
Project management
0/1 knapsack problem
Of course, in reality, there are numerous other factors
affecting projects, including:
– Flexible deadlines (if a delay by a week would result in a
significant increase in expected revenue, this would be
acceptable)
– Probability of success for particular projects
Fractional Knapsack

• Items can be divided: take any fraction as needed
• Find xi with 0 ≤ xi ≤ 1 for i = 1, 2, …, n such that
  Σ wi·xi ≤ W and Σ vi·xi is maximized
  (xi = 0 means item i is skipped; xi > 0 means an xi fraction of item i is taken)
Fractional Knapsack - Greedy Strategy Works

• E.g.: W = 50; Item 1: 10 pounds, $60 ($6/pound); Item 2: 20 pounds, $100 ($5/pound); Item 3: 30 pounds, $120 ($4/pound)

• Greedy choice: optimal
– Compute the benefit per pound
– Sort the items based on these values
– Take as much as you can from the top items in the list
– Greedy takes all of items 1 and 2, plus 2/3 of item 3 ($80), for a total of $240
THE OPTIMAL KNAPSACK ALGORITHM

Input:
– an integer n
– positive values wi and vi such that 1 ≤ i ≤ n
– a positive value W

Output:
– n values xi such that 0 ≤ xi ≤ 1
– the total profit
THE OPTIMAL KNAPSACK ALGORITHM

Initialization:
– Sort the n objects from large to small based on their ratios vi / wi
– We assume the arrays w[1..n] and v[1..n] store the respective weights and values after sorting
– Initialize array x[1..n] to zeros
– weight = 0; i = 1
THE OPTIMAL KNAPSACK ALGORITHM

while (i ≤ n and weight < W) do
  if weight + w[i] ≤ W then
    x[i] = 1
  else
    x[i] = (W – weight) / w[i]
  weight = weight + x[i] * w[i]
  i++
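The same loop as runnable Python (a sketch with 0-based arrays):

def fractional_knapsack(weights, values, W):
    """Greedy fractional knapsack; returns (fractions, total profit)."""
    # Sort item indices by value density, largest first
    order = sorted(range(len(weights)), key=lambda i: values[i] / weights[i],
                   reverse=True)
    x = [0.0] * len(weights)
    weight = profit = 0.0
    for i in order:
        if weight + weights[i] <= W:
            x[i] = 1.0                        # take the whole item
        else:
            x[i] = (W - weight) / weights[i]  # take the fitting fraction
        weight += x[i] * weights[i]
        profit += x[i] * values[i]
        if weight >= W:
            break
    return x, profit

# The example above: $240 with all of items 1 and 2 and 2/3 of item 3
print(fractional_knapsack([10, 20, 30], [60, 100, 120], 50))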
Examples

• Many algorithms can be viewed as applications of the greedy strategy, including (but not limited to):

1. Minimum Spanning Tree
2. Dijkstra's algorithm for shortest paths from a single source
3. Huffman codes (data-compression codes)

• We will come back to these slides after we start graphs
MINIMUM SPANNING TREE
• Let G = (N, A) be a connected, undirected graph where N is the set of nodes and A is the set of edges. Each edge has a given nonnegative length. The problem is to find a subset T of the edges of G such that all the nodes remain connected when only the edges in T are used, and the sum of the lengths of the edges in T is as small as possible.
• Since G is connected, at least one solution must exist.
Finding Spanning Trees

There are two basic algorithms for finding minimum-cost spanning trees, and both are greedy algorithms

• Kruskal's algorithm: published in 1956 by Joseph Kruskal

• Prim's algorithm: published in 1957 by Robert C. Prim
Kruskal's Algorithm

• The steps for implementing Kruskal's algorithm are as follows (see the sketch below):

1. Sort all the edges from low weight to high
2. Take the edge with the lowest weight and add it to the spanning tree; if adding the edge creates a cycle, reject it
3. Keep adding edges until all vertices are reached
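A compact Python sketch of these steps using a union-find structure (the (weight, u, v) edge format and the sample graph are assumptions for illustration):

def kruskal(num_vertices, edges):
    """edges: list of (weight, u, v); returns the MST edge list."""
    parent = list(range(num_vertices))

    def find(a):                       # find root, with path compression
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    mst = []
    for w, u, v in sorted(edges):      # consider edges from low weight to high
        ru, rv = find(u), find(v)
        if ru != rv:                   # no cycle: endpoints in different trees
            parent[ru] = rv
            mst.append((u, v, w))
    return mst

# Example: 4 vertices, MST weight 1 + 2 + 3 = 6
print(kruskal(4, [(1, 0, 1), (3, 1, 2), (2, 0, 2), (4, 2, 3), (3, 1, 3)]))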
Prim's Algorithm

• The steps for implementing Prim's algorithm are as follows (see the sketch below):

1. Initialize the minimum spanning tree with a vertex chosen at random
2. Find all the edges that connect the tree to new vertices, pick the minimum-weight one, and add it to the tree
3. Keep repeating step 2 until all vertices are in the tree
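A heap-based Python sketch of these steps (the adjacency-list format and sample graph are assumptions for illustration):

import heapq

def prim(adj, start=0):
    """adj: {u: [(weight, v), ...]}; returns total MST weight."""
    visited = {start}
    heap = list(adj[start])            # candidate edges out of the tree
    heapq.heapify(heap)
    total = 0
    while heap and len(visited) < len(adj):
        w, v = heapq.heappop(heap)     # cheapest edge leaving the tree
        if v not in visited:
            visited.add(v)
            total += w
            for edge in adj[v]:        # new candidate edges
                heapq.heappush(heap, edge)
    return total

adj = {0: [(1, 1), (2, 2)], 1: [(1, 0), (3, 2), (3, 3)],
       2: [(2, 0), (3, 1), (4, 3)], 3: [(3, 1), (4, 2)]}
print(prim(adj))                       # 6, matching the Kruskal example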
Dijkstra's Algorithm Pseudocode

function dijkstra(G, S)
  for each vertex V in G
    distance[V] <- infinite
    previous[V] <- NULL
    if V != S, add V to priority queue Q
  distance[S] <- 0

  while Q IS NOT EMPTY
    U <- extract MIN from Q
    for each unvisited neighbour V of U
      tempDistance <- distance[U] + edge_weight(U, V)
      if tempDistance < distance[V]
        distance[V] <- tempDistance
        previous[V] <- U
  return distance[], previous[]
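A runnable Python version; it uses a lazy-deletion heap rather than the decrease-key priority queue of the pseudocode (a common practical variant, shown here as a sketch):

import heapq

def dijkstra(adj, source):
    """adj: {u: [(weight, v), ...]}; returns shortest distances and predecessors."""
    distance = {v: float("inf") for v in adj}
    previous = {v: None for v in adj}
    distance[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > distance[u]:            # stale heap entry, skip
            continue
        for w, v in adj[u]:
            if d + w < distance[v]:    # found a shorter path to v
                distance[v] = d + w
                previous[v] = u
                heapq.heappush(heap, (distance[v], v))
    return distance, previous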
Huffman Codes

Huffman codes are an effective technique for 'lossless data compression', which means no information is lost
– The algorithm builds a table of the frequencies of each character in a file
– The table is then used to determine an optimal way of representing each character as a binary string
Constructing a Huffman Code
Huffman developed a greedy algorithm for constructing an optimal prefix code
• The algorithm builds the tree in a bottom-up manner
• It begins with the leaves, then performs merging operations to build up the tree
• At each step, it merges the two least frequent members together
– It removes these characters from the set, and replaces them with a "metacharacter" whose frequency is the sum of the removed characters' frequencies
Example

Let A = {a/20, b/15, c/5, d/15, e/45} be the alphabet and its frequency distribution.

In the first step, Huffman coding merges c and d.
The alphabet is now A1 = {a/20, b/15, n1/20, e/45}.

The algorithm merges a and b (it could also have merged n1 and b).
The new alphabet is A2 = {n2/35, n1/20, e/45}.

The algorithm merges n1 and n2.
The new alphabet is A3 = {n3/55, e/45}.

The algorithm merges e and n3 and finishes.

The Huffman code is obtained from the Huffman tree:
a = 000, b = 001, c = 010, d = 011, e = 1.
This is the optimum (minimum-cost) prefix code for this distribution.
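A heap-based Python sketch of this construction; on the example alphabet it reproduces the code lengths above (tie-breaking may vary, so the exact bit patterns can differ from the slides):

import heapq
from itertools import count

def huffman(freqs):
    """freqs: {symbol: frequency}; returns {symbol: bitstring}."""
    tiebreak = count()                 # avoids comparing tree tuples on ties
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # the two least frequent members
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):    # internal node: recurse into children
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes

print(huffman({"a": 20, "b": 15, "c": 5, "d": 15, "e": 45}))
# e gets a 1-bit code; a, b, c, d get 3-bit codes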
