
Minimum Cost Path Algorithm

This document presents a formal basis for using heuristic information to determine minimum cost paths through graphs. It introduces the problem of finding optimal paths through graphs, which arises in many applications. Previous approaches have either used mathematical algorithms to systematically search graphs or heuristic strategies specific to problem domains. The paper proposes a general algorithm that incorporates heuristic domain knowledge into a formal theory to guide graph searches. It proves the algorithm is optimal in the sense that it examines the fewest possible nodes to guarantee a minimum cost solution.

IEEE TRANSACTIONS ON SYSTEMS SCIENCE AND CYBERNETICS, VOL. SSC-4, NO. 2, JULY 1968


A Formal Basis for the Heuristic Determination of Minimum Cost Paths

PETER E. HART, MEMBER, IEEE, NILS J. NILSSON, MEMBER, IEEE, AND BERTRAM RAPHAEL

Manuscript received November 24, 1967. The authors are with the Artificial Intelligence Group of the Applied Physics Laboratory, Stanford Research Institute, Menlo Park, Calif.

Abstract: Although the problem of determining the minimum cost path through a graph arises naturally in a number of interesting applications, there has been no underlying theory to guide the development of efficient search procedures. Moreover, there is no adequate conceptual framework within which the various ad hoc search strategies proposed to date can be compared. This paper describes how heuristic information from the problem domain can be incorporated into a formal mathematical theory of graph searching and demonstrates an optimality property of a class of search strategies.

I. INTRODUCTION

A. The Problem of Finding Paths Through Graphs

Many problems of engineering and scientific importance can be related to the general problem of finding a path through a graph. Examples of such problems include routing of telephone traffic, navigation through a maze, layout of printed circuit boards, and mechanical theorem-proving and problem-solving. These problems have usually been approached in one of two ways, which we shall call the mathematical approach and the heuristic approach.

1) The mathematical approach typically deals with the properties of abstract graphs and with algorithms that prescribe an orderly examination of nodes of a graph to establish a minimum cost path. For example, Pollock and Wiebenson [1] review several algorithms which are guaranteed to find such a path for any graph. Busacker and Saaty [2] also discuss several algorithms, one of which uses the concept of dynamic programming [3]. The mathematical approach is generally more concerned with the ultimate achievement of solutions than it is with the computational feasibility of the algorithms developed.

2) The heuristic approach typically uses special knowledge about the domain of the problem being represented by a graph to improve the computational efficiency of solutions to particular graph-searching problems. For example, Gelernter's [4] program used Euclidean diagrams to direct the search for geometric proofs. Samuel [5] and others have used ad hoc characteristics of particular games to reduce
the "look-ahead" effort in searching game trees. Procedures developed via the heuristic approach generally have not been able to guarantee that minimum cost solution paths will always be found.

This paper draws together the above two approaches by describing how information from a problem domain can be incorporated in a formal mathematical approach to a graph analysis problem. It also presents a general algorithm which prescribes how to use such information to find a minimum cost path through a graph. Finally, it proves, under mild assumptions, that this algorithm is optimal in the sense that it examines the smallest number of nodes necessary to guarantee a minimum cost solution.

The following is a typical illustration of the sort of problem to which our results are applicable. Imagine a set of cities with roads connecting certain pairs of them. Suppose we desire a technique for discovering a sequence of cities on the shortest route from a specified start to a specified goal city. Our algorithm prescribes how to use special knowledge (e.g., the knowledge that the shortest road route between any pair of cities cannot be less than the airline distance between them) in order to reduce the total number of cities that need to be considered.

First, we must make some preliminary statements and definitions about graphs and search algorithms.

B. Some Definitions About Graphs

A graph $G$ is defined to be a set $\{n_i\}$ of elements called nodes and a set $\{e_{ij}\}$ of directed line segments called arcs. If $e_{pq}$ is an element of the set $\{e_{ij}\}$, then we say that there is an arc from node $n_p$ to node $n_q$ and that $n_q$ is a successor of $n_p$. We shall be concerned here with graphs whose arcs have costs associated with them. We shall represent the cost of arc $e_{ij}$ by $c_{ij}$. (An arc from $n_i$ to $n_j$ does not imply the existence of an arc from $n_j$ to $n_i$. If both arcs exist, in general $c_{ij} \ne c_{ji}$.) We shall consider only those graphs $G$ for which there exists $\delta > 0$ such that the cost of every arc of $G$ is greater than or equal to $\delta$. Such graphs shall be called $\delta$ graphs.

In many problems of interest the graph is not specified explicitly as a set of nodes and arcs, but rather is specified implicitly by means of a set of source nodes $S \subset \{n_i\}$ and a successor operator $\Gamma$, defined on $\{n_i\}$, whose value for each $n_i$ is a set of pairs $\{(n_j, c_{ij})\}$. In other words, applying $\Gamma$ to node $n_i$ yields all the successors $n_j$ of $n_i$ and the costs $c_{ij}$ associated with the arcs from $n_i$ to the various $n_j$. Application of $\Gamma$ to the source nodes, to their successors, and so forth as long as new nodes can be generated results in an explicit specification of the graph thus defined. We shall assume throughout this paper that a graph $G$ is always given in implicit form.

The subgraph $G_n$ from any node $n$ in $\{n_i\}$ is the graph defined implicitly by the single source node $n$ and some $\Gamma$ defined on $\{n_i\}$. We shall say that each node in $G_n$ is accessible from $n$.

A path from $n_1$ to $n_k$ is an ordered set of nodes $(n_1, n_2, \ldots, n_k)$ with each $n_{i+1}$ a successor of $n_i$. There exists a path from $n_i$ to $n_j$ if and only if $n_j$ is accessible from $n_i$. Every path has a cost which is obtained by adding the individual costs of each arc, $c_{i,i+1}$, in the path. An optimal path from $n_i$ to $n_j$ is a path having the smallest cost over the set of all paths from $n_i$ to $n_j$. We shall represent this cost by $h(n_i, n_j)$.

This paper will be concerned with the subgraph $G_s$ from some single specified start node $s$. We define a nonempty set $T$ of nodes in $G_s$ as the goal nodes.¹ For any node $n$ in $G_s$, an element $t \in T$ is a preferred goal node of $n$ if and only if the cost of an optimal path from $n$ to $t$ does not exceed the cost of any other path from $n$ to any member of $T$. For simplicity, we shall represent the unique cost of an optimal path from $n$ to a preferred goal node of $n$ by the symbol $h(n)$; i.e.,

$$h(n) = \min_{t \in T} h(n, t).$$

¹ We exclude the trivial case of $s \in T$.

C. Algorithms for Finding Minimum Cost Paths

We are interested in algorithms that search $G_s$ to find an optimal path from $s$ to a preferred goal node of $s$. What we mean by searching a graph and finding an optimal path is made clear by describing in general how such algorithms proceed. Starting with the node $s$, they generate some part of the subgraph $G_s$ by repetitive application of the successor operator $\Gamma$. During the course of the algorithm, if $\Gamma$ is applied to a node, we say that the algorithm has expanded that node.

We can keep track of the minimum cost path from $s$ to each node encountered as follows. Each time a node is expanded, we store with each successor node $n$ both the cost of getting to $n$ by the lowest cost path found thus far, and a pointer to the predecessor of $n$ along that path. Eventually the algorithm terminates at some goal node $t$, and no more nodes are expanded. We can then reconstruct a minimum cost path from $s$ to $t$ known at the time of termination simply by chaining back from $t$ to $s$ through the pointers.

We call an algorithm admissible if it is guaranteed to find an optimal path from $s$ to a preferred goal node of $s$ for any $\delta$ graph. Various admissible algorithms may differ both in the order in which they expand the nodes of $G_s$ and in the number of nodes expanded. In the next section, we shall propose a way of ordering node expansion and show that the resulting algorithm is admissible. Then, in a following section, we shall show, under a mild assumption, that this algorithm uses information from the problem represented by the graph in an optimal way. That is, it expands the smallest number of nodes necessary to guarantee finding an optimal path.

II. AN ADMISSIBLE SEARCHING ALGORITHM

A. Description of the Algorithm

In order to expand the fewest possible nodes in searching for an optimal path, a search algorithm must constantly make as informed a decision as possible about which node to expand next. If it expands nodes which obviously cannot be on an optimal path, it is wasting effort. On the other hand, if it continues to ignore nodes that might be on an
optimal path, it will sometimes fail to find such a path and thus not be admissible. An efficient algorithm obviously needs some way to evaluate available nodes to determine which one should be expanded next. Suppose some evaluation function $\hat{f}(n)$ could be calculated for any node $n$. We shall suggest a specific function below, but first we shall describe how a search algorithm would use such a function.

Let our evaluation function $\hat{f}(n)$ be defined in such a way that the available node having the smallest value of $\hat{f}$ is the node that should be expanded next. Then we can define a search algorithm as follows.

Search Algorithm A*:

1) Mark $s$ "open" and calculate $\hat{f}(s)$.

2) Select the open node $n$ whose value of $\hat{f}$ is smallest. Resolve ties arbitrarily, but always in favor of any node $n \in T$.

3) If $n \in T$, mark $n$ "closed" and terminate the algorithm.

4) Otherwise, mark $n$ closed and apply the successor operator $\Gamma$ to $n$. Calculate $\hat{f}$ for each successor of $n$ and mark as open each successor not already marked closed. Remark as open any closed node $n_i$ which is a successor of $n$ and for which $\hat{f}(n_i)$ is smaller now than it was when $n_i$ was marked closed. Go to Step 2.

We shall next show that for a suitable choice of the evaluation function $\hat{f}$ the algorithm A* is guaranteed to find an optimal path to a preferred goal node of $s$ and thus is admissible.

B. The Evaluation Function

For any subgraph $G_s$ and any goal set $T$, let $f(n)$ be the actual cost of an optimal path constrained to go through $n$, from $s$ to a preferred goal node of $n$.

Note that $f(s) = h(s)$ is the cost of an unconstrained optimal path from $s$ to a preferred goal node of $s$. In fact, $f(n) = f(s)$ for every node $n$ on an optimal path, and $f(n) > f(s)$ for every node $n$ not on an optimal path. Thus, although $f(n)$ is not known a priori (in fact, determination of the true value of $f(n)$ may be the main problem of interest), it seems reasonable to use an estimate of $f(n)$ as the evaluation function $\hat{f}(n)$. In the remainder of this paper, we shall exhibit some properties of the search algorithm A* when the cost $f(n)$ of an optimal path through node $n$ is estimated by an appropriate evaluation function $\hat{f}(n)$.

We can write $f(n)$ as the sum of two parts:

$$f(n) = g(n) + h(n) \tag{1}$$

where $g(n)$ is the actual cost of an optimal path from $s$ to $n$, and $h(n)$ is the actual cost of an optimal path from $n$ to a preferred goal node of $n$.

Now, if we had estimates of $g$ and $h$, we could add them to form an estimate of $f$. Let $\hat{g}(n)$ be an estimate of $g(n)$. An obvious choice for $\hat{g}(n)$ is the cost of the path from $s$ to $n$ having the smallest cost so far found by the algorithm. Notice that this implies $\hat{g}(n) \ge g(n)$.

A simple example will illustrate that this estimate is easy to calculate as the algorithm proceeds. Consider the subgraph shown in Fig. 1. It consists of a start node $s$ and three other nodes, $n_1$, $n_2$, and $n_3$. The arcs are shown with arrowheads and costs. Let us trace how algorithm A* proceeded in generating this subgraph. Starting with $s$, we obtain successors $n_1$ and $n_2$. The estimates $\hat{g}(n_1)$ and $\hat{g}(n_2)$ are then 3 and 7, respectively. Suppose A* expands $n_1$ next with successors $n_2$ and $n_3$. At this stage $\hat{g}(n_3) = 3 + 2 = 5$, and $\hat{g}(n_2)$ is lowered (because a less costly path to it has been found) to $3 + 3 = 6$. The value of $\hat{g}(n_1)$ remains equal to 3.

Fig. 1.

Next we must have an estimate $\hat{h}(n)$ of $h(n)$. Here we rely on information from the problem domain. Many problems that can be represented as a problem of finding a minimum cost path through a graph contain some "physical" information that can be used to form the estimate $\hat{h}$. In our example of cities connected by roads, $\hat{h}(n)$ might be the airline distance between city $n$ and the goal city. This distance is the shortest possible length of any road connecting city $n$ with the goal city; thus it is a lower bound on $h(n)$. We shall have more to say later about using information from the problem domain to form an estimate $\hat{h}$, but first we can prove that if $\hat{h}$ is any lower bound of $h$, then the algorithm A* is admissible.

C. The Admissibility of A*

We shall take as our evaluation function to be used in A*

$$\hat{f}(n) = \hat{g}(n) + \hat{h}(n) \tag{2}$$

where $\hat{g}(n)$ is the cost of the path from $s$ to $n$ with minimum cost so far found by A*, and $\hat{h}(n)$ is any estimate of the cost of an optimal path from $n$ to a preferred goal node of $n$. We first prove a lemma.

Lemma 1

For any nonclosed node $n$ and for any optimal path $P$ from $s$ to $n$, there exists an open node $n'$ on $P$ with $\hat{g}(n') = g(n')$.

Proof: Let $P = (s = n_0, n_1, n_2, \ldots, n_k = n)$. If $s$ is open (that is, A* has not completed even one iteration), let $n' = s$, and the lemma is trivially true since $\hat{g}(s) = g(s) = 0$. Suppose $s$ is closed. Let $\Delta$ be the set of all closed nodes $n_i$ in $P$ for which $\hat{g}(n_i) = g(n_i)$. $\Delta$ is not empty, since by assumption $s \in \Delta$. Let $n^*$ be the element of $\Delta$ with highest index. Clearly, $n^* \ne n$, as $n$ is nonclosed. Let $n'$ be the successor of $n^*$ on $P$. (Possibly $n' = n$.) Now $\hat{g}(n') \le \hat{g}(n^*) + c_{n^*,n'}$ by definition of $\hat{g}$; $\hat{g}(n^*) = g(n^*)$ because $n^*$ is in $\Delta$, and $g(n') = g(n^*) + c_{n^*,n'}$ because $P$ is an optimal path. Therefore, $\hat{g}(n') \le g(n')$. But in
general, $\hat{g}(n') \ge g(n')$, since the lowest cost $\hat{g}(n')$ from $s$ to $n'$ discovered at any time is certainly not lower than the optimal cost $g(n')$. Thus $\hat{g}(n') = g(n')$, and moreover, $n'$ must be open by the definition of $\Delta$.

Corollary

Suppose $\hat{h}(n) \le h(n)$ for all $n$, and suppose A* has not terminated. Then, for any optimal path $P$ from $s$ to any preferred goal node of $s$, there exists an open node $n'$ on $P$ with $\hat{f}(n') \le f(s)$.

Proof: By the lemma, there exists an open node $n'$ on $P$ with $\hat{g}(n') = g(n')$, so by definition of $\hat{f}$

$$\begin{aligned} \hat{f}(n') &= \hat{g}(n') + \hat{h}(n') \\ &= g(n') + \hat{h}(n') \\ &\le g(n') + h(n') = f(n'). \end{aligned}$$

But $P$ is an optimal path, so $f(n') = f(s)$ for all $n' \in P$, which completes the proof. We can now prove our first theorem.

Theorem 1

If $\hat{h}(n) \le h(n)$ for all $n$, then A* is admissible.

Proof: We prove this theorem by assuming the contrary, namely that A* does not terminate by finding an optimal path to a preferred goal node of $s$. There are three cases to consider: either the algorithm terminates at a nongoal node, fails to terminate at all, or terminates at a goal node without achieving minimum cost.

Case 1

Termination is at a nongoal node. This case contradicts the termination condition (Step 3) of the algorithm, so it may be eliminated immediately.

Case 2

There is no termination. Let $t$ be a preferred goal node of $s$, accessible from the start in a finite number of steps, with associated minimum cost $f(s)$. Since the cost on any arc is at least $\delta$, then for any node $n$ further than $M = f(s)/\delta$ steps from $s$, we have $\hat{f}(n) \ge \hat{g}(n) \ge g(n) > M\delta = f(s)$. Clearly, no node $n$ further than $M$ steps from $s$ is ever expanded, for by the corollary to Lemma 1, there will be some open node $n'$ on an optimal path such that $\hat{f}(n') \le f(s) < \hat{f}(n)$, so, by Step 2, A* will select $n'$ instead of $n$. Failure of A* to terminate could then only be caused by continued reopening of nodes within $M$ steps of $s$. Let $\chi(M)$ be the set of nodes accessible within $M$ steps from $s$, and let $\nu(M)$ be the number of nodes in $\chi(M)$. Now, any node $n$ in $\chi(M)$ can be reopened at most a finite number of times, say $\rho(n, M)$, since there are only a finite number of paths from $s$ to $n$ passing only through nodes within $M$ steps of $s$. Let

$$\rho(M) = \max_{n \in \chi(M)} \rho(n, M),$$

the maximum number of times any one node can be reopened. Hence, after at most $\nu(M)\rho(M)$ expansions, all nodes in $\chi(M)$ must be forever closed. Since no nodes outside $\chi(M)$ can be expanded, A* must terminate.

Case 3

Termination is at a goal node without achieving minimum cost. Suppose A* terminates at some goal node $t$ with $\hat{f}(t) = \hat{g}(t) > f(s)$. But by the corollary to Lemma 1, there existed just before termination an open node $n'$ on an optimal path with $\hat{f}(n') \le f(s) < \hat{f}(t)$. Thus at this stage, $n'$ would have been selected for expansion rather than $t$, contradicting the assumption that A* terminated.

The proof of Theorem 1 is now complete. In the next section, we shall show that for a certain choice of the function $\hat{h}(n)$, A* is not only admissible but optimal, in the sense that no other admissible algorithm expands fewer nodes.

III. ON THE OPTIMALITY OF A*

A. Limitation of Subgraphs by Information from the Problem

In the preceding section, we proved that if $\hat{h}(n)$ is any lower bound on $h(n)$, then A* is admissible. One such lower bound is $\hat{h}(n) = 0$ for all $n$. Such an estimate amounts to assuming that any open node $n$ might be arbitrarily close to a preferred goal node of $n$. Then the set $\{G_n\}$ is unconstrained; anything is possible at node $n$, and, in particular, if $\hat{g}$ is a minimum at node $n$, then node $n$ must be expanded by every admissible algorithm.

Often, however, we have information from the problem that constrains the set $\{G_n\}$ of possible subgraphs at each node. In our example with cities connected by roads, no subgraph $G_n$ is possible for which $h(n)$ is less than the airline distance between city $n$ and a preferred goal city of $n$. In general, if the set of possible subgraphs is constrained, one can find a higher lower bound of $h(n)$ than one can for the unconstrained situation. If this higher lower bound is used for $\hat{h}(n)$, then A* is still admissible, but, as will become obvious later, A* will generally expand fewer nodes. We shall assume in this section that at each node $n$, certain information is available from the physical situation on which we can base a computation to limit the set $\{G_n\}$ of possible subgraphs.

Suppose we denote the set of all subgraphs from node $n$ by the symbol $\{G_{n,\omega}\}$, where $\omega$ indexes each subgraph, and $\omega$ is in some index set $\Omega_n$. Now, we presume that certain information is available from the problem domain about the state that node $n$ represents; this information limits the set of subgraphs from node $n$ to the set $\{G_{n,\theta}\}$, where $\theta$ is in some smaller index set $\Theta_n \subset \Omega_n$.

For each $G_{n,\theta}$ in $\{G_{n,\theta}\}$ there corresponds a cost $h_\theta(n)$ of the optimum path from $n$ to a preferred goal node of $n$. We shall now take as our estimate $\hat{h}(n)$ the greatest lower bound for $h_\theta(n)$. That is,

$$\hat{h}(n) = \inf_{\theta \in \Theta_n} h_\theta(n). \tag{3}$$

We assume the infimum is achieved for some $\theta \in \Theta_n$.

In actual problems one probably never has an explicit representation for $\{G_{n,\theta}\}$, but instead one selects a procedure for computing $\hat{h}(n)$, known from information about the problem domain, to be a lower bound on $h(n)$. This selection itself induces the set $\{G_{n,\theta}\}$ by (3). Nevertheless, it is convenient to proceed with our formal discussion as if $\{G_{n,\theta}\}$ were available and as if $\hat{h}(n)$ were explicitly calculated from (3). For the rest of this paper, we assume that the algorithm A* uses (3) as the definition of $\hat{h}$.

B. A Consistency Assumption

When a real problem is modeled by a graph, each node of the graph corresponds to some state in the problem domain. Our general knowledge about the structure of the problem domain, together with the specific state represented by a node $n$, determines how the set $\Omega_n$ is reduced to the set $\Theta_n$. However, we shall make one assumption about the uniformity of the manner in which knowledge of the problem domain is used to impose this reduction. This assumption may be stated formally as follows. For any nodes $m$ and $n$,

$$h(m, n) + \inf_{\theta \in \Theta_n} h_\theta(n) \ge \inf_{\theta \in \Theta_m} h_\theta(m). \tag{4}$$

Using the definition of $\hat{h}$ given in (3), we can restate (4) as a kind of triangle inequality:

$$h(m, n) + \hat{h}(n) \ge \hat{h}(m). \tag{5}$$

The assumption expressed by (4) [and therefore (5)] amounts to a type of consistency assumption on the estimate $\hat{h}(n)$ over the nodes. It means that any estimate $\hat{h}(n)$ calculated from data available in the "physical" situation represented by node $n$ alone would not be improved by using corresponding data from the situations represented by the other nodes. Let us see what this assumption means in the case of our example of cities and roads. Suppose we decide, in this example, to use as an estimate $\hat{h}(n)$ the airline distance from city $n$ to its closest goal city. As we have stated previously, such an estimate is certainly a lower bound on $h(n)$. It induces at each node $n$ a set $\{G_{n,\theta}\}$ of possible subgraphs from $n$ by (3). If we let $d(m, n)$ be the airline distance between the two cities corresponding to nodes $n$ and $m$, we have $h(m, n) \ge d(m, n)$ and, therefore, by the triangle inequality for Euclidean distance

$$h(m, n) + \hat{h}(n) \ge d(m, n) + \hat{h}(n) \ge \hat{h}(m),$$

which shows that this $\hat{h}$ satisfies the assumption of (5).

Now let us consider for a moment the following $\hat{h}$ for the roads-and-cities problem. Suppose the nodes of the graph are numbered sequentially in the order in which they are discovered. Let $\hat{h}$ for cities represented by nodes with odd-numbered indexes be the airline distance to a preferred goal city of these nodes, and let $\hat{h} = 1$ for nodes with even-numbered indexes. For the graph of Fig. 2, $\hat{f}(s) = \hat{h}(n_1) = 8$. Nodes $n_2$ and $n_3$ are the successors of $n_1$ along arcs with costs as indicated. By the above rule for computing $\hat{h}$, $\hat{h}(n_2) = 1$ while $\hat{h}(n_3) = 5$. Then $\hat{f}(n_2) = 6 + 1 = 7$, while $\hat{f}(n_3) = 3 + 5 = 8$, and algorithm A* would erroneously choose to expand node $n_2$ next. This error occurs because the estimates $\hat{h}(s) = 8$ and $\hat{h}(n_2) = 1$ are inconsistent in view of the fact that $n_2$ is only six units away from $s$. The information that there cannot exist a path from $s$ to a goal with total cost less than eight was somehow available for the computation of $\hat{h}(s)$, and then ignored during the computation of $\hat{h}(n_2)$. The result is that (5) was violated, i.e.,

$$h(s, n_2) + \hat{h}(n_2) = 6 + 1 < 8 = \hat{h}(s).$$

Fig. 2.

For the rest of this paper, we shall assume that the family $\{\Theta_n\}$ of index sets satisfies (4) or, equivalently, the procedures for computing the estimates $\hat{h}$ always lead to values that satisfy (5). We shall call this assumption the consistency assumption. Note that the estimate $\hat{h}(n) = 0$ for all $n$ trivially satisfies the consistency assumption.

Intuitively, the consistency assumption will generally be satisfied by a computation rule for $\hat{h}$ that uniformly uses measurable parameters of the problem state at all nodes; it will generally be violated if the computation rule depends upon any parameter that varies between nodes independently of the problem state (such as a parity count or a random variable), or if the computations at some nodes are more elaborate than at others.

C. Proof of the Optimality of A*

The next lemma makes the important observation about the operation of A* that, under the consistency assumption, if node $n$ is closed, then $\hat{g}(n) = g(n)$. This fact is important for two reasons. First, it is used in the proof of the theorem about the optimality of A* to follow, and second, it states that A* need never reopen a closed node. That is, if A* expands a node, then the optimal path to that node has already been found. Thus, in Step 4 of the algorithm A*, the provision for reopening a closed node is vacuous and may be eliminated.

Lemma 2

Suppose the consistency assumption is satisfied, and suppose that node $n$ is closed by A*. Then $\hat{g}(n) = g(n)$.

Proof: Consider the subgraph $G_s$ just before closing $n$, and suppose the contrary, i.e., suppose $\hat{g}(n) > g(n)$. Now there exists some optimal path $P$ from $s$ to $n$. Since $\hat{g}(n) > g(n)$, A* did not find $P$. By Lemma 1, there exists an open node $n'$ on $P$ with $\hat{g}(n') = g(n')$. If $n' = n$, we have proved the lemma. Otherwise,
g(n) g(n') + h(n',n) uses no more information about the problem thanl does A *.
Let 0/, be the index set used by algorithm A at node n.
g(n') + h(n',n). Then, if OnA* C 0"A for all nodes n in Gs, we shall say that
Thus, algorithm A is no more informed than algorithm A*.
The next theorem states that if an admissible algorithm
9(n) > g(n') + h(n',n). A is no more informed than A *, then any node expanded by
Adding h(n) to both sides yields A * must also be expanded by A. We prove this theorem
for the special case for which ties never occur in the value
0(n) + h(n) > g(n') + h(n',n) + A(n). of f used by A *. Later we shall generalize the theorem to
We can apply (5) to the right-hand side of the above in- cover the case where ties can occur, but the proof of the no-
equality to yield

$$\hat{g}(n) + \hat{h}(n) > \hat{g}(n') + \hat{h}(n')$$

or

$$\hat{f}(n) > \hat{f}(n'),$$

contradicting the fact that A* selected n for expansion when n' was available and thus proving the lemma.

The next lemma states that $\hat{f}$ is monotonically nondecreasing on the sequence of nodes closed by A*.

Lemma 3

Let $(n_1, n_2, \ldots, n_t)$ be the sequence of nodes closed by A*. Then, if the consistency assumption is satisfied, $p \le q$ implies $\hat{f}(n_p) \le \hat{f}(n_q)$.

Proof: Let n be the next node closed by A* after closing m. Suppose first that the optimum path to n does not go through m. Then n was available at the time m was selected, and the lemma is trivially true. Then suppose that the optimum path to n does, in fact, go through m. Then $g(n) = g(m) + h(m, n)$. Since, by Lemma 2, we have $\hat{g}(n) = g(n)$ and $\hat{g}(m) = g(m)$,

$$\begin{aligned}
\hat{f}(n) &= \hat{g}(n) + \hat{h}(n) \\
&= g(n) + \hat{h}(n) \\
&= g(m) + h(m, n) + \hat{h}(n) \\
&\ge g(m) + \hat{h}(m) \\
&= \hat{g}(m) + \hat{h}(m)
\end{aligned}$$

where the inequality follows by application of (5). Thus we have

$$\hat{f}(n) \ge \hat{f}(m).$$

Since this fact is true for any pair of nodes $n_k$ and $n_{k+1}$ in the sequence, the proof is complete.

Corollary

Under the premises of the lemma, if n is closed then $\hat{f}(n) \le f(s)$.

Proof: Let t be the goal node found by A*. Then $\hat{f}(n) \le \hat{f}(t) = f(t) = f(s)$.

We can now prove a theorem about the optimality of A* as compared with any other admissible algorithm A that

ties theorem is so transparent that we include it for clarity.

Theorem 2

Let A be any admissible algorithm no more informed than A*. Let $G_s$ be any $\delta$ graph such that $n \ne m$ implies $\hat{f}(n) \ne \hat{f}(m)$, and let the consistency assumption be satisfied by the $\hat{h}$ used in A*. Then if node n was expanded by A*, it was also expanded by A.

Proof: Suppose the contrary. Then there exists some node n expanded by A* but not by A. Let t* and t be the preferred goal nodes of s found by A* and A, respectively. Since A* and A are both admissible,

$$\hat{f}(t^*) = \hat{g}(t^*) + \hat{h}(t^*) = \hat{g}(t^*) + 0 = f(t^*) = f(t) = f(s).$$

Since A* must have expanded n before closing t*, by Lemma 3 we have

$$\hat{f}(n) < \hat{f}(t^*) = f(t).$$

(Strict inequality occurs because no ties are allowed.) There exists some graph $G_{n,\theta}$, $\theta \in \Theta_n$, for which $\hat{h}(n) = h(n)$ by the definition of $\hat{h}$. Now by Lemma 2, $\hat{g}(n) = g(n)$. Then on the graph $G_{n,\theta}$, $\hat{f}(n) = f(n)$. Since A is no more informed than A*, A could not rule out the existence of $G_{n,\theta}$; but A did not expand n before termination and is, therefore, not admissible, contrary to our assumption and completing the proof.

Upon defining $N(A, G_s)$ to be the total number of nodes in $G_s$ expanded by the algorithm A, the following simple corollary is immediate.

Corollary

Under the premises of Theorem 2,

$$N(A^*, G_s) \le N(A, G_s)$$

with equality if and only if A expands the identical set of nodes as A*.

In this sense, we claim that A* is an optimal algorithm. Compared with other no more informed admissible algorithms, it expands the fewest possible nodes necessary to guarantee finding an optimal path.

In case of ties, that is if there exist two or more open nodes $n_1, \ldots, n_k$ with $\hat{f}(n_1) = \cdots = \hat{f}(n_k) < \hat{f}(n)$ for every other open node n, A* arbitrarily chooses one of the $n_i$. Consider the set $\mathcal{A}^*$ of all algorithms that act identically to A* if there are no ties, but whose members resolve ties differently. An algorithm is a member of $\mathcal{A}^*$ if it is simply the original A* with any arbitrary tie-breaking rule.
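The containment asserted by Theorem 2 and its corollary can be checked by hand on a small example. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the six-city road map, its coordinates and costs, and the helper names `a_star` and `straight_line` are all hypothetical. Each road cost equals the straight-line distance between its endpoints, so the straight-line estimate satisfies the consistency assumption. The zero estimate plays the role of an admissible algorithm that is no more informed than A* with the straight-line estimate.

```python
import heapq
import math

# Hypothetical road map (not from the paper): city -> (x, y) coordinates.
COORDS = {
    "s": (0, 0), "a": (3, 4), "t": (6, 8),
    "c": (6, 0), "w": (-3, -4), "v": (-6, -8),
}

# Undirected roads. Each cost equals the straight-line distance between the
# two cities, so the straight-line estimate is consistent on this graph.
ROADS = [("s", "a", 5.0), ("a", "t", 5.0), ("s", "c", 6.0),
         ("c", "t", 8.0), ("s", "w", 5.0), ("w", "v", 5.0)]

GRAPH = {city: [] for city in COORDS}
for u, v, cost in ROADS:
    GRAPH[u].append((v, cost))
    GRAPH[v].append((u, cost))


def a_star(start, goal, h_hat):
    """Return (path cost, set of expanded nodes) for the estimate h_hat."""
    g_hat = {start: 0.0}
    open_heap = [(h_hat(start), start)]      # entries are (f_hat, node)
    closed = set()
    expanded = set()
    while open_heap:
        _, n = heapq.heappop(open_heap)      # open node with smallest f_hat
        if n in closed:
            continue                         # stale heap entry
        closed.add(n)
        if n == goal:
            return g_hat[n], expanded        # the goal is closed, not expanded
        expanded.add(n)
        for m, cost in GRAPH[n]:
            candidate = g_hat[n] + cost
            if candidate < g_hat.get(m, math.inf):
                g_hat[m] = candidate
                heapq.heappush(open_heap, (candidate + h_hat(m), m))
    return math.inf, expanded


def straight_line(goal):
    """Build an estimate that returns the airline distance to the goal."""
    gx, gy = COORDS[goal]
    return lambda n: math.hypot(COORDS[n][0] - gx, COORDS[n][1] - gy)


cost_informed, expanded_informed = a_star("s", "t", straight_line("t"))
cost_blind, expanded_blind = a_star("s", "t", lambda n: 0.0)
print(cost_informed, sorted(expanded_informed))  # 10.0 ['a', 's']
print(cost_blind, sorted(expanded_blind))        # 10.0 ['a', 'c', 's', 'w']
```

Both runs return the same optimal cost, and every node expanded with the straight-line estimate is also expanded with the zero estimate, which additionally expands the two cities that lie away from the goal.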

The next theorem extends Theorem 2 to situations where ties may occur. It states that for any admissible algorithm A, one can always find a member A* of $\mathcal{A}^*$ such that each node expanded by A* is also expanded by A.

Theorem 3

Let A be any admissible algorithm no more informed than the algorithms in $\mathcal{A}^*$, and suppose the consistency assumption is satisfied by the $\hat{h}$ used in the algorithms in $\mathcal{A}^*$. Then for any $\delta$ graph $G_s$ there exists an $A^* \in \mathcal{A}^*$ such that every node expanded by A* is also expanded by A.

Proof: Let $G_s$ be any $\delta$ graph and $A_1^*$ be any algorithm in $\mathcal{A}^*$. If every node of $G_s$ that $A_1^*$ expands is also expanded by A, let $A_1^*$ be the A* of the theorem. Otherwise, we will show how to construct the A* of the theorem by changing the tie-breaking rule of $A_1^*$. Let L be the set of nodes expanded by A, and let $P = (s, n_1, n_2, \ldots, n_k, t)$ be the optimal path found by A.

Expand nodes as prescribed by $A_1^*$ as long as all nodes selected for expansion are elements of L. Let n be the first node selected for expansion by $A_1^*$ which is not in L. Now $\hat{f}(n) \le f(s)$ by the corollary to Lemma 3. Since $\hat{f}(n) < f(s) = f(t)$ would imply that A is inadmissible (by the argument of Theorem 2), we may conclude that $\hat{f}(n) = f(s)$. At the time $A_1^*$ selected n, goal node t was not closed (or $A_1^*$ would have been terminated). Then by the corollary to Lemma 1, there is an open node n' on P such that $\hat{f}(n') \le f(s) = \hat{f}(n)$. But since n was selected for expansion by $A_1^*$ instead of n', $\hat{f}(n) \le \hat{f}(n')$. Hence $\hat{f}(n) \le \hat{f}(n') \le \hat{f}(n)$, so $\hat{f}(n) = \hat{f}(n')$. Let $A_2^*$ be identical to $A_1^*$ except that the tie-breaking rule is modified just enough to choose n' instead of n. By repeating the above argument, we obtain for some i an $A_i^* \in \mathcal{A}^*$ that expands only nodes that are also expanded by A, completing the proof of the theorem.

Corollary 1

Suppose the premises of the theorem are satisfied. Then for any $\delta$ graph $G_s$ there exists an $A^* \in \mathcal{A}^*$ such that $N(A^*, G_s) \le N(A, G_s)$, with equality if and only if A expands the identical set of nodes as A*.

Since we cannot select the most fortuitous tie-breaking rule ahead of time for each graph, it is of interest to ask how all members of $\mathcal{A}^*$ compare against any admissible algorithm A in the number of nodes expanded. Let us define a critical tie between n and n' as one for which $\hat{f}(n) = \hat{f}(n') = f(s)$. Then we have the following as a second corollary to Theorem 3.

Corollary 2

Suppose the premises of the theorem are satisfied. Let $R(A^*, G_s)$ be the number of critical ties which occurred in the course of applying A* to $G_s$. Then for any $\delta$ graph $G_s$ and any $A^* \in \mathcal{A}^*$,

$$N(A^*, G_s) \le N(A, G_s) + R(A^*, G_s).$$

Proof: For any noncritical tie, all alternative nodes must be expanded by A as well as by A* or A would not be admissible. Therefore, we need merely observe that each node expanded by A* but not by A must correspond to a different critical tie in which A*'s tie-breaking rule made the inappropriate choice.

Of course, one must remember that when A does expand fewer nodes than some particular A* in $\mathcal{A}^*$, it is only because A was in some sense "lucky" for the graph being searched, and that there exists a graph consistent with the information available to A and A* for which A* would not search more nodes than A.

Note that, although one cannot keep a running estimate of R while the algorithm proceeds because one does not know the value of f(s), this value is established as soon as the algorithm terminates, and R can then be easily computed. In most practical situations, R is not likely to be large because critical ties are likely to occur only very close to termination of the algorithm, when $\hat{h}$ can become a perfect estimator of h.

IV. DISCUSSION AND CONCLUSIONS

A. Comparisons Between A* and Other Search Techniques

Earlier we mentioned that the estimate $\hat{h}(n) \equiv 0$ for all n trivially satisfies the consistency assumption. In this case, $\hat{f}(n) = \hat{g}(n)$, the lowest cost so far discovered to node n. Such an estimate is appropriate when no information at all is available from the problem domain. In this case, an admissible algorithm cannot rule out the possibility that the goal might be as close as $\delta$ to that node with minimum $\hat{g}(n)$. Pollack and Wiebenson [1] discuss an algorithm, proposed to them by Minty in a private communication, that is essentially identical to our A* using $\hat{f}(n) = \hat{g}(n)$.

Many algorithms, such as Moore's "Algorithm D" [6] and Busacker and Saaty's implementation of dynamic programming, keep track of $\hat{g}(n)$ but do not use it to order the expansion of nodes. The nodes are expanded in a "breadth-first" order, meaning that all nodes one step away from the start are expanded first, then all nodes two steps away, etc. Such methods must allow for changes in the value of $\hat{g}(n)$ as a node previously expanded is later reached again by a less costly route.

It might be argued that the algorithms of Moore, Busacker and Saaty, and other equivalent algorithms (sometimes known as "water flow" or "amoeba" algorithms) are advantageous because they first encounter the goal by a path with a minimum number of steps. This argument merely reflects an imprecise formulation of the problem, since it implies that the number of steps, and not the cost of each step, is the quantity to be minimized. Indeed, if we set $c_{ij} = 1$ for all arcs, this class of algorithms is identical to A* with $\hat{h} \equiv 0$. We emphasize that, as is always the case when a mathematical model is used to represent a real problem, the first responsibility of the investigator is to ensure that the model is an adequate representation of the problem for his purposes.
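The remark that unit arc costs together with a zero estimate reduce A* to breadth-first expansion can be illustrated with a short sketch. The successor table and both function names below are hypothetical and chosen only for illustration. The goal test is omitted so that the complete order of expansion is visible, which is an assumption of this sketch and not part of the algorithm as stated in the paper.

```python
import heapq
from collections import deque

# Hypothetical successor lists (not from the paper); every arc has cost 1.
SUCCESSORS = {
    "s": ["a", "b"],
    "a": ["c"],
    "b": ["c", "d"],
    "c": ["t"],
    "d": ["t"],
    "t": [],
}


def a_star_zero_estimate(start):
    """Expand all reachable nodes in order of f_hat = g_hat (estimate is 0).

    Returns a list of (node, g_hat) pairs in the order of expansion.
    """
    g_hat = {start: 0}
    open_heap = [(0, start)]
    closed = set()
    order = []
    while open_heap:
        _, n = heapq.heappop(open_heap)
        if n in closed:
            continue                      # stale heap entry
        closed.add(n)
        order.append((n, g_hat[n]))
        for m in SUCCESSORS[n]:
            candidate = g_hat[n] + 1      # unit cost on every arc
            if candidate < g_hat.get(m, float("inf")):
                g_hat[m] = candidate
                heapq.heappush(open_heap, (candidate, m))
    return order


def breadth_first_depths(start):
    """Return the number of steps from start to each node, found by BFS."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        n = queue.popleft()
        for m in SUCCESSORS[n]:
            if m not in depth:
                depth[m] = depth[n] + 1
                queue.append(m)
    return depth


order = a_star_zero_estimate("s")
depth = breadth_first_depths("s")
print(order)
print(depth)
```

On this graph the cost at which each node is expanded equals its breadth-first depth, and the costs never decrease along the expansion order. Nodes at the same depth may still be visited in a different order by the two procedures, because that order depends only on tie-breaking.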

It is beyond the scope of the discussion to consider how to define a successor operator $\Gamma$ or assign costs $c_{ij}$ so that the resulting graph realistically reflects the nature of a specific problem domain.²

B. The Heuristic Power of the Estimate $\hat{h}$

The algorithm A* is actually a family of algorithms; the choice of a particular function $\hat{h}$ selects a particular algorithm from the family. The function $\hat{h}$ can be used to tailor A* for particular applications.

As was discussed above, the choice $\hat{h} \equiv 0$ corresponds to the case of knowing, or at least of using, absolutely no information from the problem domain. In our example of cities connected by roads, this would correspond to assuming a priori that roads could travel through "hyperspace," i.e., that any city may be an arbitrarily small road distance from any other city regardless of their geographic coordinates.

Since we are, in fact, "more informed" about the nature of Euclidean space, we might increase $\hat{h}(n)$ from 0 to $\sqrt{x^2 + y^2}$ (where x and y are the magnitudes of the differences in the x, y coordinates of the city represented by node n and its closest goal city). The algorithm would then still find the shortest path, but would do so by expanding, typically, considerably fewer nodes. In fact, A* expands no more nodes³ than any admissible algorithm that uses no more information from the problem domain; viz., the information that a road between two cities might be as short as the airline distance between them.

Of course, the discussion thus far has not considered the cost of computing $\hat{h}$ each time a node is generated on the graph. It could be that the computational effort required to compute $\sqrt{x^2 + y^2}$ is significant when compared to the effort involved in expanding a few extra nodes; the optimal procedure in the sense of minimum number of nodes expanded might not be optimal in the sense of minimum total resources expended. In this case one might, for example, choose $\hat{h}(n) = (x + y)/2$. Since $(x + y)/2 \le \sqrt{x^2 + y^2}$, the algorithm is still admissible. Since we are not using "all" our knowledge of the problem domain, a few extra nodes may be expanded, but total computational effort may be reduced; again, each "extra" node must also be expanded by other admissible algorithms that limit themselves to the "knowledge" that the distance between two cities may be as small as $(x + y)/2$.

Now suppose we would like to reduce our computational effort still further, and would be satisfied with a solution path whose cost is not necessarily minimal. Then we could choose an $\hat{h}$ somewhat larger than the one defined by (3). The algorithm would no longer be admissible, but it might be more desirable, from a heuristic point of view, than any admissible algorithm. In our roads-and-cities example, we might let $\hat{h} = x + y$. Since road distance is usually substantially greater than airline distance, this $\hat{h}$ will usually, but not always, result in an optimal solution path. Often, but not always, fewer nodes will be expanded and less arithmetic effort required than if we used $\hat{h}(n) = \sqrt{x^2 + y^2}$.

Thus we see that the formulation presented uses one function, $\hat{h}$, to embody in a formal theory all knowledge available from the problem domain. The selection of $\hat{h}$, therefore, permits one to choose a desirable compromise between admissibility, heuristic effectiveness, and computational efficiency.

² We believe that appropriate choices for $\Gamma$ and c will permit many of the problem domains in the heuristic programming literature [7] to be mapped into graphs of the type treated in this paper. This could lead to a clearer understanding of the effects of "heuristics" that use information from the problem domain.

³ Except for possible critical ties, as discussed in Corollary 2 of Theorem 3.

REFERENCES

[1] M. Pollack and W. Wiebenson, "Solutions of the shortest-route problem - a review," Operations Res., vol. 8, March-April 1960.
[2] R. Busacker and T. Saaty, Finite Graphs and Networks: An Introduction with Applications. New York: McGraw-Hill, 1965, ch. 3.
[3] R. Bellman and S. Dreyfus, Applied Dynamic Programming. Princeton, N. J.: Princeton University Press, 1962.
[4] H. Gelernter, "Realization of a geometry-theorem proving machine," in Computers and Thought. New York: McGraw-Hill, 1963.
[5] A. Samuel, "Some studies in machine learning using the game of checkers," in Computers and Thought. New York: McGraw-Hill, 1963.
[6] E. Moore, "The shortest path through a maze," Proc. Internat'l Symp. on Theory of Switching (April 2-5, 1957), pt. 2. Also, The Annals of the Computation Laboratory of Harvard University, vol. 30. Cambridge, Mass.: Harvard University Press, 1959.
[7] E. Feigenbaum and J. Feldman, Computers and Thought. New York: McGraw-Hill, 1963.
