A Greedy Algorithm
A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the
locally optimal choice at each stage[1] with the intent of finding a global optimum. In many
problems, a greedy strategy does not usually produce an optimal solution, but nonetheless a greedy
heuristic may yield locally optimal solutions that approximate a globally optimal solution in a
reasonable amount of time.
For example, a greedy strategy for the traveling salesman problem (which is of a high
computational complexity) is the following heuristic: "At each step of the journey, visit the nearest
unvisited city." This heuristic does not intend to find a best solution, but it terminates in a
reasonable number of steps; finding an optimal solution to such a complex problem typically
requires unreasonably many steps. In mathematical optimization, greedy algorithms optimally solve
combinatorial problems having the properties of matroids, and give constant-factor approximations
to optimization problems with submodular structure.
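The nearest-neighbor heuristic described above can be sketched in a few lines of Python. The city coordinates and the use of plain Euclidean distance below are illustrative assumptions, not part of the text:

```python
import math

def nearest_neighbor_tour(cities, start=0):
    """Greedy TSP heuristic: from the current city, always visit the
    nearest unvisited city next. `cities` is a list of (x, y) points."""
    unvisited = set(range(len(cities))) - {start}
    tour = [start]
    while unvisited:
        current = cities[tour[-1]]
        # Greedy step: commit to the closest unvisited city
        # (sorted() makes tie-breaking deterministic by index).
        nxt = min(sorted(unvisited),
                  key=lambda i: math.dist(current, cities[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

# Four cities on a unit square; the heuristic walks the perimeter here.
cities = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(nearest_neighbor_tour(cities))
```

Each step is cheap, so the heuristic terminates quickly, but nothing in it guards against a locally close city leading to a globally long tour.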
Greedy algorithms produce good solutions on some mathematical problems, but not on others. Most
problems for which they work have two properties: the greedy choice property and optimal
substructure, both described later in this document.
Cases of failure
Examples on how a greedy algorithm may fail to achieve the optimal solution.
Starting from A, a greedy algorithm that tries to find the maximum by following the greatest slope
will find the local maximum at "m", oblivious to the global maximum at "M".
With a goal of reaching the largest sum, at each step, the greedy algorithm will choose what appears
to be the optimal immediate choice, so it will choose 12 instead of 3 at the second step, and will not
reach the best solution, which contains 99.
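The tree in this example can be modeled directly. In the sketch below, only the values 7, 3, 12, and 99 come from the text; the values beneath the second level are assumptions added to complete the picture:

```python
def greedy_path(node):
    """Descend the tree by always taking the child with the largest
    value. Each node is a (value, children) tuple."""
    value, children = node
    path = [value]
    while children:
        # Greedy step: commit to the biggest immediate child.
        value, children = max(children, key=lambda c: c[0])
        path.append(value)
    return path

# 7 branches to 3 and 12; the 99 hides beneath the 3 branch.
tree = (7, [(3, [(1, [(99, [])]), (6, [])]),
            (12, [(5, []), (6, [])])])

path = greedy_path(tree)
print(path, "sum =", sum(path))
```

The greedy descent takes 12 over 3 at the second step and ends with a sum of 25, far below the 7 + 3 + 1 + 99 = 110 available down the other branch.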
For many other problems, greedy algorithms fail to produce the optimal solution, and may even
produce the unique worst possible solution. One example is the traveling salesman problem
mentioned above: for each number of cities, there is an assignment of distances between the cities
for which the nearest-neighbor heuristic produces the unique worst possible tour.[3]
Greedy algorithms can be characterized as 'short-sighted' and 'non-recoverable': they commit to
each choice permanently. They are ideal only for problems which have optimal substructure. Despite
this, for many simple problems, the best-suited algorithms are greedy algorithms. It is important,
however, to note that a greedy rule can also be used as a selection heuristic to prioritize options
within a search or branch-and-bound algorithm. There are also several variations of the basic
greedy algorithm.
Greedy algorithms mostly (but not always) fail to find the globally optimal solution because they
usually do not operate exhaustively on all the data. They can commit to certain choices too early,
which prevents them from finding the best overall solution later. For example, no known greedy
coloring algorithm for the graph coloring problem (or for any other NP-complete problem)
consistently finds optimum solutions. Nevertheless, greedy algorithms are useful because they are
quick to devise and often give good approximations to the optimum.
If a greedy algorithm can be proven to yield the global optimum for a given problem class, it
typically becomes the method of choice because it is faster than other optimization methods like
dynamic programming. Examples of such greedy algorithms are Kruskal's algorithm and Prim's
algorithm for finding minimum spanning trees, and the algorithm for finding optimum Huffman
trees.
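As a sketch of one of these, here is a minimal Kruskal's algorithm in Python; the example graph and its edge weights are invented for illustration:

```python
def kruskal(n, edges):
    """Kruskal's MST: greedily add the cheapest edge that does not
    close a cycle, using union-find to detect cycles.
    `edges` is a list of (weight, u, v) tuples over nodes 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):          # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                       # u and v not yet connected
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

# Small example graph on 4 nodes (weights are illustrative).
edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 2, 3), (5, 1, 3)]
mst = kruskal(4, edges)
print(mst, "total weight:", sum(w for w, _, _ in mst))
```

The greedy choice (cheapest safe edge) is provably optimal here because spanning trees form a matroid, which is why Kruskal's algorithm is exact rather than approximate.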
Greedy algorithms appear in network routing as well. Using greedy routing, a message is forwarded
to the neighboring node which is "closest" to the destination. The notion of a node's location (and
hence "closeness") may be determined by its physical location, as in geographic routing used by ad
hoc networks. Location may also be an entirely artificial construct, as in small-world routing and
distributed hash tables.
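A toy sketch of greedy geographic forwarding, assuming each node knows its own coordinates and those of its neighbors; the network below is invented for illustration:

```python
import math

def greedy_route(positions, neighbors, src, dst):
    """Forward a message to whichever neighbor is closest to the
    destination. Stops (and reports failure) if no neighbor is closer
    than the current node -- the 'local minimum' where greedy
    geographic routing gets stuck."""
    path, current = [src], src
    while current != dst:
        best = min(neighbors[current],
                   key=lambda n: math.dist(positions[n], positions[dst]))
        if math.dist(positions[best], positions[dst]) >= \
           math.dist(positions[current], positions[dst]):
            return path, False   # stuck: greedy forwarding failed
        path.append(best)
        current = best
    return path, True

# Toy ad hoc network: node coordinates and links are illustrative.
positions = {"a": (0, 0), "b": (1, 0), "c": (2, 1), "d": (3, 0)}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(greedy_route(positions, neighbors, "a", "d"))
```

Each hop uses only local information, which is exactly what makes the scheme scale, and exactly why it can fail when the geometry hides a dead end.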
A greedy algorithm is a simple, intuitive algorithm that is used in optimization problems. The algorithm makes the
optimal choice at each step as it attempts to find the overall optimal way to solve the entire problem. Greedy
algorithms are quite successful in some problems, such as Huffman encoding which is used to compress data, or
Dijkstra's algorithm, which is used to find the shortest path through a graph.
However, in many problems, a greedy strategy does not produce an optimal solution. For example, in the example
below, the greedy algorithm seeks to find the path with the largest sum. It does this by selecting the largest
available number at each step. The greedy algorithm fails to find the largest sum, however, because it makes
decisions based only on the information available at each step, without regard to the overall problem.
With a goal of reaching the largest sum, at each step the greedy algorithm
will choose what appears to be the optimal immediate choice, so it will choose 12 instead of 3 at the second step, and will not reach the best
solution, which contains 99.
Greedy Algorithms
Greedy algorithms take all of the data in a particular problem, and then set a rule for which elements to add to the
solution at each step of the algorithm. In the example above, the set of data is all of the numbers in the graph, and
the rule is to select the largest number available at each level of the graph. The solution that the algorithm builds is
the sum of all of those choices.
If both of the properties below are true, a greedy algorithm can be used to solve the problem.
Greedy choice property: A global (overall) optimal solution can be reached by choosing the optimal
choice at each step.
Optimal substructure: A problem has an optimal substructure if an optimal solution to the entire problem
contains the optimal solutions to the sub-problems.
In other words, greedy algorithms work on problems for which it is true that, at every step, there is a choice that is
optimal for the problem up to that step, and after the last step, the algorithm produces the optimal solution of the
complete problem.
To make a greedy algorithm, identify an optimal substructure or subproblem in the problem. Then, determine what
the solution will include (for example, the largest sum, the shortest path, etc.). Create some sort of iterative way to
go through all of the subproblems and build a solution.
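That recipe can be sketched as a generic template. The `feasible` and `score` hooks below are hypothetical names standing in for whatever rule a particular problem supplies:

```python
def greedy(candidates, feasible, score):
    """Generic greedy template: repeatedly commit to the best-scoring
    candidate that keeps the partial solution feasible."""
    solution = []
    for c in sorted(candidates, key=score, reverse=True):
        if feasible(solution, c):   # greedy step: take it if it fits
            solution.append(c)
    return solution

# Example use: pick numbers summing to at most 10, biggest first.
pick = greedy([4, 7, 2, 5],
              feasible=lambda sol, c: sum(sol) + c <= 10,
              score=lambda c: c)
print(pick)
```

The whole algorithm is the rule: change `score` or `feasible` and you get a different greedy algorithm over the same data.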
Practice problem: If there is a greedy algorithm that will traverse a graph, selecting the largest node value at each
point until it reaches a leaf of the graph, what path will the greedy algorithm follow in the graph below?
(Choices: 4 to 5 to 8; 4 to 7 to 3; 4 to 5 to 4 to 9; 4 to 7 to 2 to 10)
Practice problem: What is the length of the longest path through the graph below? Calculate the length by adding
the values of the nodes. (Choices: 14, 19, 22, 23, 24, 26)
Limitations of Greedy Algorithms
Sometimes greedy algorithms fail to find the globally optimal solution because they do not consider all the data. The
choice made by a greedy algorithm may depend on choices it has made so far, but it is not aware of future choices it
could make.
In the graph below, a greedy algorithm is trying to find the longest path through the graph (the number inside each
node contributes to a total length). To do this, it selects the largest number at each step of the algorithm. With a
quick visual inspection of the graph, it is clear that this algorithm will not arrive at the correct solution. What is the
correct solution for the longest path?
The correct solution for the longest path through the graph is 7, 3, 1, 99. This is clear to us
because we can see that no other combination of nodes will come close to a sum of 99, so whatever path we
choose, we know it should have 99 in it. There is only one option that includes 99: 7, 3, 1, 99.
The greedy algorithm fails to solve this problem because it makes decisions purely based on the best answer
available at the time: at each step it did choose the largest number. However, since there could be some huge
number that the algorithm hasn't seen yet, it can end up selecting a path that does not include that huge number.
The solutions to the subproblems for finding the largest sum or longest path do not necessarily appear in the
solution to the total problem. The optimal substructure and greedy choice properties don't hold in this type of
problem. □
Here, we will look at one form of the knapsack problem. The knapsack problem involves deciding which subset of
items you should take from a set of items if you want to optimize some value: perhaps the total worth of the items.
In this problem, we will assume that we can either take an item or leave it (we cannot take a fractional part of an
item). We will also assume that there is only one of each item. Our knapsack has a fixed size of 25 units, and we
want to optimize the worth of the items we take, so we must choose the items we take with care.[3]
Item          Size   Price
Laptop         22     12
PlayStation    10      9
Textbook        9      9
Basketball      7      6
There are two greedy algorithms we could propose to solve this. One has a rule that selects the item with the largest
price at each step, and the other has a rule that selects the smallest sized item at each step.
Largest-price Algorithm: At the first step, we take the laptop. We gain 12 units of worth, but can
now only carry 25 − 22 = 3 units of additional space in the knapsack. Since no items that
remain will fit into the bag, we can only take the laptop and have a total of 12 units of worth.
Smallest-sized-item Algorithm: At the first step, we will take the smallest-sized item: the basketball. This
gives us 6 units of worth, and leaves us with 25 − 7 = 18 units of space in our bag.
Next, we select the next smallest item, the textbook. This gives us a total of 6 + 9 = 15 units
of worth, and leaves us with 18 − 9 = 9 units of space. Since no remaining items fit in 9 units of
space, we can take nothing more, for a total of 15 units of worth.
The greedy algorithms yield solutions that give us 12 units of worth and 15 units of worth, but neither of
these is the optimal solution. Inspect the table yourself and see if you can determine a better selection of items.
Taking the textbook and the PlayStation yields 9 + 9 = 18 units of worth and takes up
10 + 9 = 19 units of space. This is the optimal answer, and we can see that a greedy algorithm
will not solve the knapsack problem since the greedy choice and optimal substructure properties do not hold.
□
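Both greedy rules from the worked example can be run against the table directly (capacity 25 taken from the arithmetic above):

```python
items = {  # name: (size, price), from the table above
    "Laptop": (22, 12), "PlayStation": (10, 9),
    "Textbook": (9, 9), "Basketball": (7, 6),
}

def greedy_knapsack(items, capacity, key):
    """Take items one at a time in the order given by `key`,
    skipping any item that no longer fits in the remaining space."""
    taken, space = [], capacity
    for name in sorted(items, key=key):
        size, _ = items[name]
        if size <= space:
            taken.append(name)
            space -= size
    return taken, sum(items[n][1] for n in taken)

# Largest price first -> laptop only, 12 units of worth.
print(greedy_knapsack(items, 25, key=lambda n: -items[n][1]))
# Smallest size first -> basketball + textbook, 15 units of worth.
print(greedy_knapsack(items, 25, key=lambda n: items[n][0]))
```

Neither ordering reaches the optimal 18 units (textbook + PlayStation), confirming that no greedy rule of this shape solves 0-1 knapsack in general.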
In problems where greedy algorithms fail, dynamic programming might be a better approach.
Applications
There are many applications of greedy algorithms. Below is a brief explanation of the greedy nature of two famous
algorithms.
Dijkstra's Algorithm
Dijkstra's algorithm is used to find the shortest path between nodes in a graph. The algorithm maintains a set of
unvisited nodes and calculates a tentative distance from a given node to another. If the algorithm finds a shorter way
to get to a given node, the path is updated to reflect the shorter distance. This problem has the optimal
substructure property since, if A is connected to B, B is connected to C, and the path must go through A and
B to get to the destination C, then the shortest path from A to B and the shortest path from B to C
must be part of the shortest path from A to C. So the optimal answers from the subproblems do contribute to
the optimal answer for the total problem, because the algorithm keeps track of the shortest path possible to
each node.
Dijkstra's algorithm finding the shortest path between a and b. It picks the
unvisited vertex with the lowest distance, calculates the distance through it to each unvisited neighbor, and updates the neighbor's distance if
it is smaller.
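A minimal sketch of Dijkstra's algorithm using a binary heap for the greedy "lowest tentative distance" choice; the example graph and weights are invented:

```python
import heapq

def dijkstra(graph, source):
    """Dijkstra's algorithm: greedily settle the unvisited node with
    the smallest tentative distance, then relax its neighbors.
    `graph` maps a node to a list of (neighbor, weight) pairs."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)       # greedy step: closest node
        if d > dist.get(u, float("inf")):
            continue                     # stale heap entry, skip
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd             # found a shorter way to v
                heapq.heappush(heap, (nd, v))
    return dist

# Small weighted digraph (weights are illustrative).
graph = {
    "a": [("b", 2), ("c", 5)],
    "b": [("c", 1), ("d", 4)],
    "c": [("d", 1)],
    "d": [],
}
print(dijkstra(graph, "a"))
```

Because edge weights are non-negative, the node popped from the heap is already settled: no later path can beat it, which is the greedy choice property in action.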
Huffman Coding
Huffman encoding is another example of an algorithm where a greedy approach is successful. The Huffman
algorithm analyzes a message and, depending on the frequencies of the characters used in the message, assigns a
variable-length encoding to each symbol. A more commonly used symbol will have a shorter encoding, while a
rarely used symbol will have a longer encoding.
The Huffman coding algorithm takes in information about the frequencies or probabilities of a particular symbol
occurring. It begins to build the prefix tree from the bottom up, starting with the two least probable symbols in the
list. It takes those symbols and forms a subtree containing them, and then removes the individual symbols from the
list. The algorithm sums the probabilities of elements in a subtree and adds the subtree and its probability to the list.
Next, the algorithm searches the list and selects the two symbols or subtrees with the smallest probabilities. It uses
those to make a new subtree, removes the original subtrees/symbols from the list, and then adds the new subtree
and its combined probability to the list. This repeats until there is one tree and all elements have been added. The
path from the root to each symbol in the finished tree gives that symbol's optimal encoding, and together these
encodings compose the overall optimal code.
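A compact sketch of this bottom-up construction, using a heap as the list of symbols/subtrees ordered by frequency; the input string is illustrative:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Greedy Huffman coding: repeatedly merge the two least-frequent
    symbols/subtrees. Each heap entry is [freq, [sym, code], ...]."""
    heap = [[freq, [sym, ""]] for sym, freq in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)       # the two smallest frequencies
        hi = heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]    # left branch of the new subtree
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]    # right branch
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return dict(heap[0][1:])

codes = huffman_codes("abracadabra")
print(codes)
```

The most frequent symbol (`a`, appearing 5 times) ends up with a one-bit code, and the resulting code set is prefix-free, so the encoded stream decodes unambiguously.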
For many more applications of greedy algorithms, see the See Also section.