A Greedy Algorithm

A greedy algorithm is an algorithm that makes locally optimal choices at each step in an attempt to find a global optimum. While greedy algorithms often provide good approximations, they do not always yield optimal solutions because they make irreversible decisions at each step without considering future choices or the overall problem. For example, a greedy algorithm for finding the longest path in a graph by always selecting the highest value node at each step would fail to find the optimal solution.


A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the

locally optimal choice at each stage[1] with the intent of finding a global optimum. In many
problems, a greedy strategy does not usually produce an optimal solution, but nonetheless a greedy
heuristic may yield locally optimal solutions that approximate a globally optimal solution in a
reasonable amount of time.

For example, a greedy strategy for the traveling salesman problem (which is of high
computational complexity) is the following heuristic: "At each step of the journey, visit the nearest
unvisited city." This heuristic does not aim to find the best solution, but it terminates in a
reasonable number of steps; finding an optimal solution to such a complex problem typically
requires unreasonably many steps. In mathematical optimization, greedy algorithms optimally solve
combinatorial problems having the properties of matroids, and give constant-factor approximations
to optimization problems with submodular structure.
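The nearest-neighbor heuristic above can be sketched in a few lines. This is a minimal illustration, not a reference implementation; the cities are hypothetical points in the plane and distance is plain Euclidean distance.

```python
# Sketch of the nearest-neighbor heuristic for the traveling salesman
# problem. The city coordinates below are hypothetical illustration data.
from math import dist

def nearest_neighbor_tour(cities, start):
    """Greedily visit the nearest unvisited city at each step."""
    unvisited = set(cities) - {start}
    tour = [start]
    while unvisited:
        # Locally optimal choice: the closest city not yet visited.
        nxt = min(unvisited, key=lambda c: dist(tour[-1], c))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

cities = [(0, 0), (1, 0), (2, 0), (0, 5)]
print(nearest_neighbor_tour(cities, (0, 0)))  # [(0, 0), (1, 0), (2, 0), (0, 5)]
```

The tour terminates after exactly one pass over the cities, which is the point of the heuristic: speed, not optimality.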

In general, greedy algorithms have five components:

1. A candidate set, from which a solution is created
2. A selection function, which chooses the best candidate to be added to the solution
3. A feasibility function, which is used to determine whether a candidate can contribute to a
solution
4. An objective function, which assigns a value to a solution or a partial solution, and
5. A solution function, which indicates when we have discovered a complete solution
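The five components can be sketched as a generic skeleton. This is an assumed illustration, not from the source: the concrete example makes change with US coin denominations, a case where the greedy choice happens to be optimal.

```python
# Generic greedy skeleton built from the five components listed above.
def greedy(candidates, select, feasible, is_solution):
    """candidates (1), select (2), feasible (3); is_solution combines
    the objective (4) and solution (5) checks for brevity."""
    solution = []
    candidates = list(candidates)
    while candidates and not is_solution(solution):
        best = select(candidates)          # locally best candidate
        candidates.remove(best)
        if feasible(solution + [best]):
            solution.append(best)
    return solution

# Illustrative use: make 63 cents from US coins (greedy is optimal here).
coins = [25, 25, 10, 10, 10, 5, 1, 1, 1, 1]
target = 63
result = greedy(
    coins,
    select=max,                                # take the largest coin first
    feasible=lambda s: sum(s) <= target,
    is_solution=lambda s: sum(s) == target,
)
print(result)  # [25, 25, 10, 1, 1, 1]
```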

Greedy algorithms produce good solutions on some mathematical problems, but not on others. Most
problems for which they work will have two properties:

Greedy choice property


We can make whatever choice seems best at the moment and then solve the subproblems
that arise later. The choice made by a greedy algorithm may depend on choices made so far,
but not on future choices or all the solutions to the subproblem. It iteratively makes one
greedy choice after another, reducing each given problem into a smaller one. In other words,
a greedy algorithm never reconsiders its choices. This is the main difference from dynamic
programming, which is exhaustive and is guaranteed to find the solution. After every stage,
dynamic programming makes decisions based on all the decisions made in the previous
stage, and may reconsider the previous stage's algorithmic path to solution.
Optimal substructure
"A problem exhibits optimal substructure if an optimal solution to the problem contains
optimal solutions to the sub-problems."[2]

Cases of failure

Examples of how a greedy algorithm may fail to achieve the optimal solution:

Starting from A, a greedy algorithm that tries to find the maximum by following the greatest slope
will find the local maximum at "m", oblivious to the global maximum at "M".
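The hill-climbing behavior in the caption can be sketched on a one-dimensional height profile. The profile below is hypothetical; the point is only that always following the greatest slope strands the search at a local maximum.

```python
# Hill climbing as described above: always step to the taller neighbor.
# The height profile is a hypothetical illustration.
def climb(heights, i):
    """Return the index where greedy ascent from position i stops."""
    while True:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]
        best = max(neighbors, key=lambda j: heights[j])
        if heights[best] <= heights[i]:
            return i  # no taller neighbor: a local maximum, like "m"
        i = best

heights = [1, 3, 5, 4, 2, 6, 9, 7]  # local max 5 at index 2, global max 9 at index 6
print(climb(heights, 0))  # 2 -- the search never reaches index 6
```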

With a goal of reaching the largest sum, at each step, the greedy algorithm will choose what appears
to be the optimal immediate choice, so it will choose 12 instead of 3 at the second step, and will not
reach the best solution, which contains 99.

For many other problems, greedy algorithms fail to produce the optimal solution, and may even
produce the unique worst possible solution. One example is the traveling salesman problem
mentioned above: for each number of cities, there is an assignment of distances between the cities
for which the nearest-neighbor heuristic produces the unique worst possible tour.[3]

Greedy algorithms can be characterized as being 'short-sighted', and also as 'non-recoverable'. They
are ideal only for problems which have 'optimal substructure'. Despite this, for many simple
problems, the best suited algorithms are greedy algorithms. It is important, however, to note that the
greedy algorithm can be used as a selection algorithm to prioritize options within a search, or
branch-and-bound algorithm. There are a few variations to the greedy algorithm:

 Pure greedy algorithms
 Orthogonal greedy algorithms
 Relaxed greedy algorithms

Greedy algorithms mostly (but not always) fail to find the globally optimal solution because they
usually do not operate exhaustively on all the data. They can make commitments to certain choices
too early which prevent them from finding the best overall solution later. For example, all known
greedy coloring algorithms for the graph coloring problem and all other NP-complete problems do
not consistently find optimum solutions. Nevertheless, they are useful because they are quick to
think up and often give good approximations to the optimum.

If a greedy algorithm can be proven to yield the global optimum for a given problem class, it
typically becomes the method of choice because it is faster than other optimization methods like
dynamic programming. Examples of such greedy algorithms are Kruskal's algorithm and Prim's
algorithm for finding minimum spanning trees, and the algorithm for finding optimum Huffman
trees.

Greedy algorithms appear in network routing as well. Using greedy routing, a message is forwarded
to the neighboring node which is "closest" to the destination. The notion of a node's location (and
hence "closeness") may be determined by its physical location, as in geographic routing used by ad
hoc networks. Location may also be an entirely artificial construct as in small world routing and
distributed hash table.

A greedy algorithm is a simple, intuitive algorithm that is used in optimization problems. The algorithm makes the locally optimal choice at each step as it attempts to find the overall optimal way to solve the entire problem. Greedy algorithms are quite successful in some problems, such as Huffman encoding, which is used to compress data, or Dijkstra's algorithm, which is used to find the shortest path through a graph.

However, in many problems, a greedy strategy does not produce an optimal solution. For example, in the example below, the greedy algorithm seeks to find the path with the largest sum. It does this by selecting the largest available number at each step. The greedy algorithm fails to find the largest sum, however, because it makes decisions based only on the information it has at any one step, without regard to the overall problem.

With a goal of reaching the largest sum, at each step, the greedy algorithm will choose what appears to be the optimal immediate choice, so it will choose 12 instead of 3 at the second step and will not reach the best solution, which contains 99.[1]

Greedy Algorithms

Structure of a Greedy Algorithm

Greedy algorithms take all of the data in a particular problem, and then set a rule for which elements to add to the solution at each step of the algorithm. In the example above, the set of data is all of the numbers in the graph, and the rule is to select the largest number available at each level of the graph. The solution that the algorithm builds is the sum of all of those choices.

If both of the properties below are true, a greedy algorithm can be used to solve the problem.

 Greedy choice property: A global (overall) optimal solution can be reached by choosing the optimal choice at each step.
 Optimal substructure: A problem has an optimal substructure if an optimal solution to the entire problem contains the optimal solutions to the sub-problems.

In other words, greedy algorithms work on problems for which it is true that, at every step, there is a choice that is optimal for the problem up to that step, and after the last step, the algorithm produces the optimal solution of the complete problem.

To make a greedy algorithm, identify an optimal substructure or subproblem in the problem. Then, determine what the solution will include (for example, the largest sum, the shortest path, etc.). Create some sort of iterative way to go through all of the subproblems and build a solution.

If there is a greedy algorithm that will traverse a graph, selecting the largest node value at each point until it reaches a leaf of the graph, what path will the greedy algorithm follow in the graph below? Choices: 4 to 5 to 8; 4 to 7 to 3; 4 to 5 to 4 to 9; 4 to 7 to 2 to 10.

What is the length of the longest path through the graph below? Calculate the length by adding the values of the nodes. Choices: 14, 19, 22, 23, 24, 26.

Limitations of Greedy Algorithms

Sometimes greedy algorithms fail to find the globally optimal solution because they do not consider all the data. The choice made by a greedy algorithm may depend on choices it has made so far, but it is not aware of future choices it could make.

In the graph below, a greedy algorithm is trying to find the longest path through the graph (the number inside each node contributes to a total length). To do this, it selects the largest number at each step of the algorithm. With a quick visual inspection of the graph, it is clear that this algorithm will not arrive at the correct solution. What is the correct solution? Why is a greedy algorithm ill-suited for this problem?

An example of a greedy algorithm searching for the largest path in a tree.[2]

The correct solution for the longest path through the graph is 7, 3, 1, 99. This is clear to us because we can see that no other combination of nodes will come close to a sum of 99, so whatever path we choose, we know it should have 99 in the path. There is only one option that includes 99: 7, 3, 1, 99.

The greedy algorithm fails to solve this problem because it makes decisions purely based on what the best answer at the time is: at each step it did choose the largest number. However, since there could be some huge number that the algorithm hasn't seen yet, it could end up selecting a path that does not include the huge number. The solutions to the subproblems for finding the largest sum or longest path do not necessarily appear in the solution to the total problem. The optimal substructure and greedy choice properties don't hold in this type of problem.
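The tree discussed above can be sketched in code. The values on the optimal path (7, 3, 1, 99) and the greedy misstep (12 instead of 3) come from the text; the remaining children are hypothetical fillers.

```python
# Tree from the text: optimal path 7 -> 3 -> 1 -> 99, but greedy takes 12.
# Children not named in the text (4, 5, 6, 2) are hypothetical fillers.
tree = {7: [3, 12], 3: [1, 4], 12: [5, 6], 1: [99, 2]}

def greedy_path(tree, root):
    """Follow the largest child value until reaching a leaf."""
    path = [root]
    while tree.get(path[-1]):
        path.append(max(tree[path[-1]]))  # locally best choice only
    return path

def best_path(tree, root):
    """Exhaustively find the root-to-leaf path with the largest sum."""
    children = tree.get(root)
    if not children:
        return [root]
    return [root] + max((best_path(tree, c) for c in children), key=sum)

print(greedy_path(tree, 7))  # [7, 12, 6] -> sum 25
print(best_path(tree, 7))    # [7, 3, 1, 99] -> sum 110
```

The greedy walk commits to 12 at the second step and never sees the 99 hidden below the 3.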

Here, we will look at one form of the knapsack problem. The knapsack problem involves deciding which subset of items you should take from a set of items if you want to optimize some value: perhaps the worth of the items, the size of the items, or the ratio of worth to size.

In this problem, we will assume that we can either take an item or leave it (we cannot take a fractional part of an item). We will also assume that there is only one of each item. Our knapsack has a fixed size, and we want to optimize the worth of the items we take, so we must choose the items we take with care.[3]

Our knapsack can hold at most 25 units of space.

Here is the list of items and their worths.

Item         Size   Price
Laptop       22     12
PlayStation  10     9
Textbook     9      9
Basketball   7      6

Which items do we choose to optimize for price?

There are two greedy algorithms we could propose to solve this. One has a rule that selects the item with the largest price at each step, and the other has a rule that selects the smallest-sized item at each step.

 Largest-price algorithm: At the first step, we take the laptop. We gain 12 units of worth, but can now only carry 25 − 22 = 3 units of additional space in the knapsack. Since no items that remain will fit into the bag, we can only take the laptop and have a total of 12 units of worth.
 Smallest-sized-item algorithm: At the first step, we will take the smallest-sized item: the basketball. This gives us 6 units of worth, and leaves us with 25 − 7 = 18 units of space in our bag. Next, we select the next smallest item, the textbook. This gives us a total of 6 + 9 = 15 units of worth, and leaves us with 18 − 9 = 9 units of space. Since no remaining items are 9 units of space or less, we can take no more items.

The greedy algorithms yield solutions that give us 12 units of worth and 15 units of worth. But neither of these is the optimal solution. Inspect the table yourself and see if you can determine a better selection of items.

Taking the textbook and the PlayStation yields 9 + 9 = 18 units of worth and takes up 10 + 9 = 19 units of space. This is the optimal answer, and we can see that a greedy algorithm will not solve the knapsack problem since the greedy choice and optimal substructure properties do not hold.
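The two greedy rules and the exhaustive check can be run directly on the table above. The sizes and prices come from the text; the subset enumeration is feasible here because there are only four items.

```python
# The two greedy rules from the text, plus an exhaustive check, on the
# item table above (sizes and prices taken from the text).
from itertools import combinations

items = {"Laptop": (22, 12), "PlayStation": (10, 9),
         "Textbook": (9, 9), "Basketball": (7, 6)}
CAPACITY = 25

def greedy(key):
    """Take items in the order given by `key` while they still fit."""
    space, worth = CAPACITY, 0
    for name, (size, price) in sorted(items.items(), key=key):
        if size <= space:
            space -= size
            worth += price
    return worth

largest_price = greedy(lambda kv: -kv[1][1])   # highest price first
smallest_size = greedy(lambda kv: kv[1][0])    # smallest size first

# Exhaustive search over all 2^4 subsets that fit in the knapsack.
best = max(sum(items[n][1] for n in combo)
           for r in range(len(items) + 1)
           for combo in combinations(items, r)
           if sum(items[n][0] for n in combo) <= CAPACITY)

print(largest_price, smallest_size, best)  # 12 15 18
```

Both greedy rules fall short of the optimum of 18, matching the discussion above.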

In problems where greedy algorithms fail, dynamic programming might be a better approach.

Applications

There are many applications of greedy algorithms. Below is a brief explanation of the greedy nature of a famous graph search algorithm, Dijkstra's algorithm.

Dijkstra's Algorithm

Dijkstra's algorithm is used to find the shortest path between nodes in a graph. The algorithm maintains a set of unvisited nodes and calculates a tentative distance from a given node to another. If the algorithm finds a shorter way to get to a given node, the path is updated to reflect the shorter distance. This problem has optimal substructure: if A is connected to B, B is connected to C, and the path must go through A and B to get to the destination C, then the shortest path from A to B and the shortest path from B to C must be a part of the shortest path from A to C. So the optimal answers from the subproblems do contribute to the optimal answer for the total problem. This is because the algorithm keeps track of the shortest path possible to any given node.
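A compact sketch of the algorithm using a priority queue is shown below; the example graph is a hypothetical illustration. At each step it greedily settles the unvisited node with the smallest tentative distance.

```python
# A priority-queue sketch of Dijkstra's algorithm; the graph is a
# hypothetical illustration (adjacency list of (neighbor, weight) pairs).
import heapq

def dijkstra(graph, source):
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale entry: a shorter path to u was already found
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w          # relax the edge u -> v
                heapq.heappush(heap, (dist[v], v))
    return dist

graph = {"a": [("b", 7), ("c", 2)], "c": [("b", 3)], "b": []}
print(dijkstra(graph, "a"))  # {'a': 0, 'b': 5, 'c': 2}
```

Note how the direct edge a → b of weight 7 is later replaced by the shorter route a → c → b of weight 5, exactly the tentative-distance update the paragraph describes.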

Dijkstra's algorithm finding the shortest path between a and b. It picks the unvisited vertex with the lowest distance, calculates the distance through it to each unvisited neighbor, and updates the neighbor's distance if smaller. Mark visited (set to red) when done with neighbors.[4]

Huffman Coding

Huffman encoding is another example of an algorithm where a greedy approach is successful. The Huffman algorithm analyzes a message and, depending on the frequencies of the characters used in the message, assigns a variable-length encoding for each symbol. A more commonly used symbol will have a shorter encoding, while a rare symbol will have a longer encoding.

The Huffman coding algorithm takes in information about the frequencies or probabilities of a particular symbol occurring. It begins to build the prefix tree from the bottom up, starting with the two least probable symbols in the list. It takes those symbols and forms a subtree containing them, and then removes the individual symbols from the list. The algorithm sums the probabilities of elements in a subtree and adds the subtree and its probability to the list. Next, the algorithm searches the list and selects the two symbols or subtrees with the smallest probabilities. It uses those to make a new subtree, removes the original subtrees/symbols from the list, and then adds the new subtree and its combined probability to the list. This repeats until there is one tree and all elements have been added. At each subtree, the optimal encoding for each symbol is created and together composes the overall optimal encoding.
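The bottom-up merging of the two least probable subtrees can be sketched with a priority queue. The symbol frequencies below are hypothetical; subtrees are represented as partial code tables rather than explicit tree nodes.

```python
# Sketch of bottom-up Huffman construction as described above.
# The symbol frequencies are hypothetical illustration data.
import heapq

def huffman_codes(freqs):
    """Build a prefix code; more frequent symbols get shorter codes."""
    # Each heap entry: (total frequency, tiebreaker, partial code table).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        # Greedy step: merge the two least probable subtrees.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

codes = huffman_codes({"a": 45, "b": 13, "c": 12, "d": 5})
print(codes)  # the most frequent symbol "a" gets a 1-bit code
```

As expected, the most frequent symbol receives the shortest codeword, and the two rarest symbols end up deepest in the tree.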

For many more applications of greedy algorithms, see the See Also section.
