
MODULE 2

INFORMED SEARCH STRATEGIES

Artificial Intelligence
Informed (Heuristic) Search Strategies
 An informed search strategy uses problem-specific knowledge beyond the definition of the problem itself, and can therefore find solutions more efficiently than an uninformed strategy.
 The general approach we consider is called best-first search.
 Best-first search is an instance of the general Tree-search or Graph-search
algorithm in which a node is selected for expansion based on an evaluation
function, f(n).
 The evaluation function is construed as a cost estimate, and the node with
the lowest evaluation is expanded first.
 The implementation of best-first graph search is identical to uniform-cost
search except for the use of f instead of g to order the priority queue.
 The choice of f determines the search strategy.
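As an illustration, the following is a minimal Python sketch of this best-first framework (not code from the text): a priority queue ordered by an arbitrary evaluation function f. The problem interface assumed here, a start state, a goal test, and a successor function returning (state, step_cost) pairs, is a simplification chosen for this example.

```python
import heapq

def best_first_search(start, is_goal, successors, f):
    """Generic best-first graph search: repeatedly expand the frontier
    node with the lowest value of the evaluation function f(state, g)."""
    frontier = [(f(start, 0), 0, start, [start])]   # (f-value, g, state, path)
    explored = set()
    while frontier:
        _, g, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path, g
        if state in explored:            # skip states that have already been expanded
            continue
        explored.add(state)
        for next_state, step_cost in successors(state):
            if next_state not in explored:
                g2 = g + step_cost
                heapq.heappush(frontier,
                               (f(next_state, g2), g2, next_state, path + [next_state]))
    return None, float("inf")            # no solution found
```

Uniform-cost search, greedy best-first search, and A∗ then differ only in the f that is passed in: g alone, h alone, or g + h.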

Informed (Heuristic) Search Strategies
 Most best-first algorithms include a heuristic function h(n) as a component of f, defined as:
h(n) = estimated cost of the cheapest path from the state at node n to a goal state.
 Note that the function h(n) takes a node as input but unlike the function
g(n) it depends only on the state at that node.
 For example, in Romania, one might estimate the cost of the cheapest
path from Arad to Bucharest via the straight-line distance from Arad to
Bucharest.
 Heuristic functions are the most common form in which additional
knowledge about the problem is imparted to the search algorithm.
 We consider them to be arbitrary, nonnegative, problem-specific functions, with one constraint: if n is a goal node, then h(n) = 0.

Greedy Best-First Search
 The greedy best-first search algorithm tries to expand the node that is closest to the goal, on the grounds that this is likely to lead to a solution quickly.
 It evaluates the nodes by using the heuristic function, i.e., f(n) = h(n).
 For route-finding problems in Romania, we use the straight-line distance
heuristic, hSLD.
 If the goal is Bucharest, we need to know the straight-line distances to
Bucharest which are shown in Figure 3.22.
 For example, hSLD(In(Arad))=366.
 Notice that the values of hSLD cannot be computed from the problem
description itself.
 Moreover, it takes a certain amount of experience to know that hSLD is
correlated with actual road distances and therefore, it is a useful heuristic.

[Figure 3.22: Values of hSLD, the straight-line distances to Bucharest]
Greedy Best-First Search
 Figure 3.23 shows the progress of a greedy best-first search using hSLD to
find a path from Arad to Bucharest.
 The first node to be expanded from Arad will be Sibiu because it is closer
to Bucharest than either Zerind or Timisoara.
 The next node to be expanded will be Fagaras because it is closest.
 Fagaras in turn generates Bucharest, which is the goal.
 For this particular problem, greedy best-first search using hSLD finds a
solution without ever expanding a node that is not on the solution path,
and hence its search cost is minimal.
 It is not optimal as the path via Sibiu and Fagaras to Bucharest is 32
kilometers longer than the path through Rimnicu Vilcea and Pitesti.
 This shows why the algorithm is called greedy: at each step it tries to get as close to the goal as it can. The sketch below reproduces this search on a small excerpt of the Romania map.
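Here is how greedy best-first search reproduces the Arad to Bucharest example using the best_first_search sketch given earlier; the road segments and hSLD values are a small excerpt transcribed from Figures 3.2 and 3.22.

```python
# A small excerpt of the Romania road map (Figure 3.2) and the
# straight-line distances to Bucharest (Figure 3.22).
ROADS = {
    "Arad": [("Zerind", 75), ("Sibiu", 140), ("Timisoara", 118)],
    "Sibiu": [("Arad", 140), ("Oradea", 151), ("Fagaras", 99), ("Rimnicu Vilcea", 80)],
    "Fagaras": [("Sibiu", 99), ("Bucharest", 211)],
    "Rimnicu Vilcea": [("Sibiu", 80), ("Pitesti", 97)],
    "Pitesti": [("Rimnicu Vilcea", 97), ("Bucharest", 101)],
    "Zerind": [("Arad", 75)], "Timisoara": [("Arad", 118)],
    "Oradea": [("Sibiu", 151)], "Bucharest": [],
}
H_SLD = {"Arad": 366, "Sibiu": 253, "Fagaras": 176, "Rimnicu Vilcea": 193,
         "Pitesti": 100, "Zerind": 374, "Timisoara": 329, "Oradea": 380,
         "Bucharest": 0}

# Greedy best-first search: f(n) = h(n), ignoring the path cost so far.
path, cost = best_first_search("Arad", lambda s: s == "Bucharest",
                               lambda s: ROADS[s],
                               f=lambda state, g: H_SLD[state])
print(path, cost)   # ['Arad', 'Sibiu', 'Fagaras', 'Bucharest'] 450  (not optimal)
```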
[Figure 3.23: Stages in a greedy best-first tree search for Bucharest with the straight-line distance heuristic hSLD]
Greedy Best-First Search
 Greedy best-first tree search is also incomplete even in a finite state space,
much like depth-first search.
 Consider the problem of getting from Iasi to Fagaras.
 The heuristic suggests that Neamt be expanded first because it is closest to
Fagaras, but it is a dead end.
 The solution is to go first to Vaslui, a step that is actually farther from the
goal according to the heuristic, and then to continue to Urziceni, Bucharest,
and Fagaras.
 The tree-search version will never find this solution, however, because expanding Neamt puts Iasi back into the frontier; Iasi is closer to Fagaras than Vaslui is, and so Iasi will be expanded again, leading to an infinite loop.
 The graph search version is complete in finite spaces but not in infinite
ones.
Greedy Best-First Search
 The worst-case time and space complexity for the tree version is O(b^m), where m is the maximum depth of the search space.
• With a good heuristic function the time and space complexity can be
reduced substantially.
 The amount of the reduction depends on the particular problem, and on
the quality of the heuristic.

A* search: Minimizing the total estimated solution cost
 The most widely known form of best-first search is called A∗ search.
 It evaluates nodes by combining g(n), the cost to reach the node, and h(n),
the cost to get from the node to the goal:
f(n) = g(n) + h(n).
 Since g(n) gives the path cost from the start node to node n, and h(n) is the
estimated cost of the cheapest path from n to the goal, we have
f(n) = estimated cost of the cheapest solution through n.
 Thus, if we are trying to find the cheapest solution, it is better to try first the
node with the lowest value of g(n) + h(n).
 It turns out that this strategy is more than just reasonable: provided that the heuristic function h(n) satisfies certain conditions, A∗ search is both complete and optimal.
 The algorithm is identical to uniform-cost search except that A∗ uses g + h instead of g; a short sketch on the same Romania excerpt is given below.
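Using the same framework and map excerpt as the greedy example, only the evaluation function changes; on this instance A∗ returns the optimal 418 km route through Rimnicu Vilcea and Pitesti rather than the 450 km route through Fagaras.

```python
# A* search: f(n) = g(n) + h(n), reusing best_first_search, ROADS, and H_SLD.
path, cost = best_first_search("Arad", lambda s: s == "Bucharest",
                               lambda s: ROADS[s],
                               f=lambda state, g: g + H_SLD[state])
print(path, cost)   # ['Arad', 'Sibiu', 'Rimnicu Vilcea', 'Pitesti', 'Bucharest'] 418
```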
A* search: Minimizing the total estimated solution cost
Conditions for Optimality: Admissibility and Consistency
 The first condition we require for optimality is that h(n) be an admissible
heuristic.
 An admissible heuristic is one that never overestimates the cost to reach
the goal.
 Because g(n) is the actual cost to reach n along the current path, and
f(n) = g(n) + h(n), we have an immediate consequence that f(n) never
overestimates the true cost of a solution along the current path through n.
 Admissible heuristics are optimistic because they think the cost of solving
the problem is less than the actual cost.
 The straight-line distance hSLD that we used in getting to Bucharest from Arad is an example of an admissible heuristic.
 Straight-line distance is admissible because the shortest path between any
two points is a straight line, so the straight line cannot be an overestimate.
A* search: Minimizing the total estimated solution cost
 The progress of an A∗ tree search for Bucharest is shown in Figure 3.24.
 The values of g are computed from the step costs in Figure 3.2, and the values of
hSLD are given in Figure 3.22.
 Notice that Bucharest first appears on the frontier at step (e), but it is not selected for expansion because its f-cost (450) is higher than that of Pitesti (417).
 Since there might be another solution through Pitesti whose cost is as low as
417, the algorithm will not settle for a solution that costs 450.
 A second condition called consistency (or monotonicity) is required only for
applications of A∗ to graph search.
• A heuristic h(n) is consistent if, for every node n and every successor n′ of n generated by any action a, the estimated cost of reaching the goal from n is no greater than the step cost of getting to n′ plus the estimated cost of reaching the goal from n′:
h(n) ≤ c(n, a, n′) + h(n′).
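For an explicit graph, consistency can be checked mechanically by testing this inequality on every edge; the helper below is an illustrative sketch that reuses the ROADS and H_SLD excerpt defined earlier.

```python
def is_consistent(h, roads):
    """True if h(n) <= c(n, n') + h(n') holds for every edge (n, n')."""
    return all(h[n] <= step_cost + h[n2]
               for n, edges in roads.items()
               for n2, step_cost in edges)

print(is_consistent(H_SLD, ROADS))   # True: straight-line distance is consistent
```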
[Figure 3.24: Stages in an A∗ search for Bucharest; nodes are labeled with f = g + h, where the h values are the straight-line distances to Bucharest from Figure 3.22]
A* search: Minimizing the total estimated solution cost
Optimality of A∗:
 The tree-search version of A∗ is optimal if h(n) is admissible, while the graph-search version is optimal if h(n) is consistent (i.e., h(n) ≤ c(n, a, n′) + h(n′)).
 We show the second of these two claims since it is more useful.
 The argument essentially mirrors the argument for the optimality of uniform-
cost search with g replaced by f just as in the A∗ algorithm itself.
 The first step is to establish the following:
 If h(n) is consistent, then the values of f(n) along any path are nondecreasing.
 Using the definition of consistency, we can prove this as shown below.
 Suppose n′ is a successor of n; then g(n′) = g(n) + c(n, a, n′) for some action a, and we have f(n′) = g(n′) + h(n′) = g(n) + c(n, a, n′) + h(n′) ≥ g(n) + h(n) = f(n), where the last step uses the consistency condition c(n, a, n′) + h(n′) ≥ h(n).

A* search: Minimizing the total estimated solution cost
 The next step is to prove that “whenever A∗ selects a node n for expansion, the
optimal path to that node has been found”.
 If this were not true, then there would have to be another frontier node n′ on the optimal path from the start node to n.
 By the graph separation property of Figure 3.9, and because f is nondecreasing along any path, n′ would have lower f-cost than n and would have been selected first.
 From these observations, it follows that the sequence of nodes expanded by A∗
using graph-search is in nondecreasing order of f(n).
 Hence, the first goal node selected for expansion must be an optimal solution
because f is the true cost for goal nodes (which have h=0) and all later goal
nodes will be at least as expensive.
 Since f-costs are nondecreasing on any path, we can draw contours in the state
space as shown below just like the contours in a topographic map.
[Figure 3.25: Map of Romania showing contours at f = 380, f = 400, and f = 420, with Arad as the start state]
A* search: Minimizing the total estimated solution cost
 Inside the contour labeled 400, all nodes have f(n) less than or equal to 400, and inside the contour labeled 420, all nodes have f(n) less than or equal to 420.
 Because A∗ expands the frontier node with the lowest f-cost, we can see that an A∗ search fans out from the start node, adding nodes in concentric bands of increasing f-cost.
 With better heuristics, the bands will stretch towards the goal state and become
more narrowly focused around the optimal path.
 If C∗ is the cost of the optimal solution path, then we can say that:
 A∗ expands all nodes with f(n) < C∗.
 A∗ might then expand some of the nodes on the goal contour (where f(n) = C∗) before
selecting a goal node.

 Completeness requires that there be only finitely many nodes with cost less than or equal to C∗, a condition that is true if all step costs exceed some finite 𝜖 and if b is finite.
A* search: Minimizing the total estimated solution cost
 A∗ is optimally efficient for any given consistent heuristic.
 That is, no other optimal algorithm is guaranteed to expand fewer nodes
than A∗.
 This is because any algorithm that does not expand all nodes with f(n) < C∗
runs the risk of missing the optimal solution.
 A∗ search is complete, optimal, and optimally efficient among all such
algorithms.
 This does not mean, however, that A∗ is practical for every search problem.
 For most problems, the number of states within the goal contour search
space is still exponential in the length of the solution.
 For the problems with constant step costs, the growth in run time as a
function of the optimal solution depth d is analyzed in terms of the absolute
error or the relative error of the heuristic.
A* search: Minimizing the total estimated solution cost
 The absolute error is defined as Δ ≡ h∗ − h, where h∗ is the actual cost of getting from the root to the goal, and the relative error is defined as 𝜖 ≡ (h∗ − h)/h∗.
• The complexity results depend very strongly on the assumptions made about
the state space.
• The simplest model studied is a state space that has a single goal and is
essentially a tree with reversible actions.
• In this case, the time complexity of A∗ is exponential in the maximum absolute error, that is, O(b^Δ).
• For constant step costs, we can write this as O(b^(𝜖d)), where d is the solution depth.
• For almost all heuristics in practical use, the absolute error is at least
proportional to the path cost h∗, so 𝜖 is constant or growing and the time
complexity is exponential in d.

A* search: Minimizing the total estimated solution cost
 We can also see the effect of a more accurate heuristic: O(b^(𝜖d)) = O((b^𝜖)^d), so the effective branching factor is b^𝜖.
 When the state space has many goal states, particularly near-optimal goal states, the search process can be led astray from the optimal path, and there is an extra cost proportional to the number of goals whose cost is within a factor 𝜖 of the optimal cost.
 In the general case of a graph search, the situation is even worse.
 There can be exponentially many states with f(n) < C∗ even if the absolute
error is bounded by a constant.
 For example, consider a version of the vacuum world where the agent can
clean up any square for unit cost without even having to visit it.
 In that case, squares can be cleaned in any order.

A* search: Minimizing the total estimated solution cost
 With N dirty squares initially, there are 2^N states in which some subset of the squares has been cleaned; all of them are on an optimal solution path and hence satisfy f(n) < C∗, even if the heuristic has an error of 1.
 The complexity of A∗ often makes it impractical to insist on finding an optimal solution.
 We can use variants of A∗ that find suboptimal solutions quickly, or one can
design heuristics that are more accurate but not strictly admissible.
 The use of a good heuristic provides enormous savings compared to the use of
an uninformed search.
 Because A∗ keeps all generated nodes in the memory, it usually runs out of
space long before it runs out of time.
 For this reason, A∗ is not practical for many large-scale problems.
 But there are algorithms that overcome the space problem without sacrificing
optimality or completeness at a small cost in execution time.
Heuristic Functions
 To understand the nature of heuristics in general, we will consider the
heuristics for the 8-puzzle problem.
 The 8-puzzle was one of the earliest heuristic search problems.
 We know that the objective of the puzzle is to slide the tiles horizontally or vertically into the empty space until the configuration matches the goal configuration shown in Figure 3.28.

Heuristic Functions
 The average solution cost for a randomly generated 8-puzzle instance is
about 22 steps.
 The branching factor is about 3. (When the empty tile is in the middle, four
moves are possible, when it is in a corner, two moves are possible, and when
it is along an edge, three moves are possible).
 This means that an exhaustive tree search to depth 22 would look at about 3^22 ≈ 3.1 × 10^10 states.
 A graph search would cut this down by a factor of about 170,000 because only 9!/2 = 181,440 distinct states are reachable.
 This is a manageable number, but the corresponding number for the 15-puzzle is roughly 10^13, so we need to find a good heuristic function.
 If we want to find the shortest solutions by using A∗, then we need a heuristic
function that never overestimates the number of steps to the goal.
 There is a long history of such heuristics for the 15-puzzle.
Heuristic Functions
 The two commonly used heuristics are:
 h1 = the number of misplaced tiles.
 For Figure 3.28, all of the eight tiles are out of position, so the start state would have
h1 = 8.
 h1 is an admissible heuristic because it is clear that any tile that is out of place must be
moved at least once.
 h2 = the sum of the distances of the tiles from their goal positions.
 Since tiles cannot move along diagonals, the distance we count is the sum of the
horizontal and vertical distances.
 This distance is called the city block distance or Manhattan distance.
 h2 is also admissible because all any move can do is move one tile one step closer to the goal.
 Tiles 1 to 8 in the start state give a total Manhattan distance of h2 = 3+1+2+2+2+3+3+2 = 18.

 Neither of these overestimates the true solution cost, which is 26.
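Both heuristics take only a few lines of code. The sketch below assumes the standard Figure 3.28 instance (states written as 9-tuples read row by row, with 0 for the blank); the printed values match the slide: h1 = 8 and h2 = 18.

```python
START = (7, 2, 4, 5, 0, 6, 8, 3, 1)   # assumed Figure 3.28 start state
GOAL  = (0, 1, 2, 3, 4, 5, 6, 7, 8)   # assumed Figure 3.28 goal state

def h1(state, goal=GOAL):
    """Number of misplaced tiles (the blank is not counted)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def h2(state, goal=GOAL):
    """Sum of Manhattan (city-block) distances of the tiles from their goal squares."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        goal_idx = goal.index(tile)
        total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total

print(h1(START), h2(START))   # 8 18
```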

The Effect of Heuristic Accuracy on Performance
• The quality of a heuristic can be characterized by the effective branching factor b∗.
• If the total number of nodes generated by A∗ for some problem is N and the solution depth is d, then b∗ is the branching factor that a uniform tree of depth d would need in order to contain N + 1 nodes.
• Thus, N + 1 = 1 + b∗ + (b∗)^2 + ... + (b∗)^d.
• For example, if A∗ finds a solution at depth 5 using 52 nodes, then the effective
branching factor is equal to 1.92.
• The effective branching factor can vary across problem instances, but usually it is
constant for sufficiently hard problems.
• Therefore, experimental measurements of b∗ on a small set of problems can be used as a good guide to the overall usefulness of a heuristic.
• A good heuristic will have a value of b∗ close to 1 and allows large problems to be
solved at reasonable computational cost.
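Since N + 1 = 1 + b∗ + (b∗)^2 + ... + (b∗)^d has no simple closed form for b∗, it is usually solved numerically; the sketch below uses bisection and reproduces the b∗ ≈ 1.92 value quoted above for 52 nodes at depth 5.

```python
def effective_branching_factor(nodes_generated, depth, tol=1e-6):
    """Solve N + 1 = 1 + b* + (b*)^2 + ... + (b*)^d for b* by bisection."""
    target = nodes_generated + 1
    def tree_size(b):
        return sum(b ** i for i in range(depth + 1))
    lo, hi = 1.0, float(target)      # tree_size(lo) <= target <= tree_size(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if tree_size(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(effective_branching_factor(52, 5), 2))   # 1.92
```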
The Effect of Heuristic Accuracy on Performance
 To test the heuristic functions h1 and h2, we generated 1200 random problems
with solution lengths from 2 to 24 (100 problems for each even number) and
solved them with iterative deepening search and with A∗ tree search using both
h1 and h2.
 Figure 3.29 gives the average number of nodes generated by each strategy
and the effective branching factor.
 The results show that h2 is better than h1, and that both are far better than iterative deepening search.
 Even for small problems with d = 12, A∗ with h2 is 50,000 times more efficient
than uninformed iterative deepening search.
 From the definitions of the two heuristics, we can see that for any node n,
h2(n) ≥ h1(n).
 Hence, we say that h2 dominates h1.
 Consequently, A∗ using h2 will never expand more nodes than A∗ using h1.
[Figure 3.29: Comparison of the search costs (nodes generated) and effective branching factors for iterative deepening search and for A∗ with h1 and h2, averaged over 100 instances of the 8-puzzle per solution depth]
Generating Admissible Heuristics from Relaxed Problems
 We have seen that both h1 (misplaced tiles) and h2 (Manhattan distance) are
good heuristics for the 8-puzzle and we also know that h2 is better than h1.
 How might one have come up with h2?
 Is it possible for a computer to invent such a heuristic mechanically?
 h1 and h2 are estimates of the remaining path length for the 8-puzzle, but they
are also perfectly accurate path lengths for simplified versions of the puzzle.
 If the rules of the puzzle were changed so that a tile could move anywhere
instead of just to the adjacent empty square, then h1 would give the exact
number of steps in the shortest solution.
 Similarly, if a tile could move one square in any direction, even onto an
occupied square, then h2 would give the exact number of steps in the shortest
solution.
 A problem with fewer restrictions on the actions is called a relaxed problem.
Generating Admissible Heuristics from Relaxed Problems
 The state-space graph of the relaxed problem is a supergraph of the original
state space because the removal of restrictions creates additional edges in the
graph.
 Because the relaxed problem adds edges to the state space, any optimal solution of the original problem is, by definition, also a solution in the relaxed problem; but the relaxed problem may have better solutions if the added edges provide short cuts.
 Hence, the cost of an optimal solution to a relaxed problem is an admissible
heuristic for the original problem.
 Also, because the derived heuristic is an exact cost for the relaxed problem, it is also consistent.
 If a problem definition is written in a formal language, then it is possible to
construct relaxed problems automatically.
 For example, if the 8-puzzle actions are described as given below:
Generating Admissible Heuristics from Relaxed Problems
 A tile can move from square A to square B if A is horizontally or vertically
adjacent to B and B is blank.
 we can generate three relaxed problems by removing one or both of the
conditions:
a) A tile can move from square A to square B if A is adjacent to B.
b) A tile can move from square A to square B if B is blank.
c) A tile can move from square A to square B.

 From (a), we can derive h2 (Manhattan distance).


 The reasoning is that h2 would be the proper score if we moved each tile in
turn to its destination.
 The heuristic derived from (b) is called Gaschnig’s heuristic which is at least as
accurate as h1 (misplaced tiles).
 From (c), we can derive h1 (misplaced tiles) because it would be the proper
score if tiles could move to their intended destination in one step.
Generating Admissible Heuristics from Relaxed Problems
 Note that the relaxed problems generated by this technique can be solved essentially without search, because the relaxed rules allow the problem to be decomposed into eight independent subproblems.
 If the relaxed problem is hard to solve, then the values of the corresponding
heuristic will be expensive to obtain.
 A program called ABSOLVER can generate heuristics automatically from
the problem definitions using the relaxed problem method and various other
techniques.
 ABSOLVER generated a new heuristic for the 8-puzzle that was better than
any preexisting heuristic and found the first useful heuristic for the famous
Rubik’s Cube puzzle.
 The problem with generating new heuristic functions is that we often fail to
get a single clearly best heuristic.

Generating Admissible Heuristics from Relaxed Problems
 If a collection of admissible heuristics say h1 . . . hm are available for some
problem and none of them dominates any of the others, then which should
we choose?
 We need not make a choice.
 We can have the best of all by defining:
h(n) = max{h1(n), h2(n), . . . , hm(n)} .
 This composite heuristic uses whichever function is most accurate on the
given node.
 Because the component heuristics are admissible, h is also admissible.
 It can also be proved that h is consistent.
 Also, h dominates all the component heuristics.
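The composite heuristic is a one-line definition; the sketch below reuses the h1 and h2 functions from the 8-puzzle example above.

```python
def combined_heuristic(*heuristics):
    """Maximum of several admissible heuristics: the result is still admissible,
    and consistent whenever every component heuristic is consistent."""
    return lambda state: max(h(state) for h in heuristics)

h = combined_heuristic(h1, h2)
print(h(START))   # 18, since h2 dominates h1 on this state
```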

Generating Admissible Heuristics from Subproblems: Pattern databases
 Admissible heuristics can also be derived from the solution cost of a
subproblem of a given problem.
 For example, Figure 3.30 shows a subproblem of the 8-puzzle instance of
Figure 3.28.
 The subproblem involves getting tiles 1, 2, 3, 4 into their correct positions.
 Clearly, the cost of the optimal solution of this subproblem is a lower bound
on the cost of the complete problem.
 It looks to be more accurate than Manhattan distance in some cases.
 The idea behind pattern databases is to store these exact solution costs for every possible subproblem instance: in our example, every possible configuration of the four tiles and the blank. (The locations of the other four tiles are irrelevant for the purposes of solving the subproblem, but moves of those tiles do count toward the cost.)

Generating Admissible Heuristics from Subproblems: Pattern databases

[Figure 3.30: A subproblem of the 8-puzzle instance given in Figure 3.28; the task is to get tiles 1, 2, 3, and 4 into their correct positions]
 We compute an admissible heuristic hDB for each complete state encountered during a search simply by looking up the corresponding subproblem configuration in the database.
 The database itself is constructed by searching back from the goal and recording the cost of each new pattern encountered; the expense of this search is amortized over many subsequent problem instances (a sketch of this backward construction is given below).
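The sketch below illustrates this backward construction for the 1-2-3-4 pattern of Figure 3.30, reusing START and GOAL from the earlier 8-puzzle example. The abstraction (replacing non-pattern tiles by a wildcard) and the breadth-first sweep are an illustrative implementation, not code from the text.

```python
from collections import deque

PATTERN = (1, 2, 3, 4)          # the 1-2-3-4 subproblem of Figure 3.30

def abstract(state, pattern=PATTERN):
    """Keep only the pattern tiles and the blank; other tiles become a wildcard."""
    return tuple(t if t in pattern or t == 0 else -1 for t in state)

def neighbours(state):
    """All states reachable by sliding one tile into the blank (cost 1 each)."""
    i = state.index(0)
    r, c = divmod(i, 3)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        r2, c2 = r + dr, c + dc
        if 0 <= r2 < 3 and 0 <= c2 < 3:
            j = 3 * r2 + c2
            s = list(state)
            s[i], s[j] = s[j], s[i]
            yield tuple(s)

def build_pattern_db(goal=GOAL, pattern=PATTERN):
    """Breadth-first search backwards from the goal over abstract states,
    recording the cost of each new pattern configuration encountered."""
    start = abstract(goal, pattern)
    db = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for s2 in neighbours(s):
            if s2 not in db:
                db[s2] = db[s] + 1
                queue.append(s2)
    return db

DB_1234 = build_pattern_db()

def h_db(state):
    """Admissible heuristic: exact cost of the 1-2-3-4 subproblem."""
    return DB_1234[abstract(state)]

print(h_db(START))   # a lower bound on the cost of the full Figure 3.28 instance
```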
Generating Admissible Heuristics from Subproblems: Pattern databases
 The choice of constructing databases for 1-2-3-4 is arbitrary; we could also construct databases for 5-6-7-8, for 2-4-6-8, and so on.
 Each database yields an admissible heuristic, and these heuristics can be
combined by taking the maximum value.
 A combined heuristic of this type is much more accurate than the Manhattan
distance, and the number of nodes generated while solving random 15-puzzles
can be reduced by a factor of 1000.
 One question that arises is whether the heuristics obtained from the 1-2-3-4 database and the 5-6-7-8 database could simply be added, since the two subproblems seem not to overlap.
 Would this still give an admissible heuristic?
 The answer is no, because the solutions of the 1-2-3-4 subproblem and the
5-6-7-8 subproblem for a given state will share some moves.
 It is unlikely that 1-2-3-4 can be moved into place without touching 5-6-7-8, and
vice versa.
Generating Admissible Heuristics from Subproblems: Pattern databases
 What happens if we do not record the total cost of solving the 1-2-3-4 subproblem, but record just the number of moves involving tiles 1, 2, 3, and 4?
 It is easy to see that the sum of the two costs is still a lower bound on the cost of
solving the entire problem.
 This is the idea behind disjoint pattern databases.
 With such databases, it is possible to solve random 15-puzzles in a few
milliseconds, and the number of nodes generated is reduced by a factor of 10,000
compared with the use of Manhattan distance.
 For 24-puzzles, a speedup of roughly a factor of a million can be obtained.
 Disjoint pattern databases work for sliding-tile puzzles because the problem can
be divided up such that each move affects only one subproblem because only one
tile is moved at a time.
 For a problem such as Rubik’s Cube, this type of subdivision is difficult because
each move affects 8 or 9 of the 26 cubies.
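A sketch of the disjoint (additive) variant, building on the pattern-database code above: a move is charged only when it slides one of the pattern's own tiles, so the values looked up in the 1-2-3-4 and 5-6-7-8 databases can be added and the sum remains admissible.

```python
import heapq

def build_disjoint_db(goal=GOAL, pattern=(1, 2, 3, 4)):
    """Like build_pattern_db, but moves of non-pattern tiles cost 0, so a
    0/1-cost Dijkstra sweep replaces the plain breadth-first search."""
    start = abstract(goal, pattern)
    db = {start: 0}
    frontier = [(0, start)]
    while frontier:
        d, s = heapq.heappop(frontier)
        if d > db.get(s, float("inf")):
            continue                     # stale queue entry
        blank = s.index(0)
        for s2 in neighbours(s):
            moved_tile = s2[blank]       # the tile that slid into the old blank square
            cost = 1 if moved_tile in pattern else 0
            if d + cost < db.get(s2, float("inf")):
                db[s2] = d + cost
                heapq.heappush(frontier, (d + cost, s2))
    return db

DB_A = build_disjoint_db(pattern=(1, 2, 3, 4))
DB_B = build_disjoint_db(pattern=(5, 6, 7, 8))
h_disjoint = lambda s: DB_A[abstract(s, (1, 2, 3, 4))] + DB_B[abstract(s, (5, 6, 7, 8))]
print(h_disjoint(START))   # an admissible estimate, typically larger than Manhattan distance
```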
Learning Heuristics from Experience
 A heuristic function h(n) is used to estimate the cost of a solution beginning
from the state at node n.
 How could an agent construct such a function?
 One solution for this is to devise relaxed problems for which an optimal
solution can be found easily.
 Another solution is to learn from experience (for example, solving lots of
8-puzzles).
 Each optimal solution to an 8-puzzle problem provides examples from which
h(n) can be learned.
 Each example consists of a state from the solution path and the actual cost
of the solution from that point.
 From these examples, a learning algorithm (for example, one based on neural networks or decision trees) can be used to construct a function h(n) that can predict the solution costs for other states that arise during search.
Learning Heuristics from Experience
 Inductive learning methods work best when supplied with features of a state
that are relevant to predicting the state’s value, rather than with just the raw
state description.
 For example, the feature “number of misplaced tiles” can be helpful in
predicting the actual distance of a state from the goal.
 Let’s call this feature x1(n).
 We could take 100 randomly generated 8-puzzle configurations and gather
statistics on their actual solution costs.
 We might find that when x1(n) is 5, the average solution cost is around 14,
and so on.
 Given these data, the value of x1 can be used to predict h(n).
 A second feature x2(n) might be “number of pairs of adjacent tiles that are
not adjacent in the goal state.”
Learning Heuristics from Experience
 How should x1(n) and x2(n) be combined to predict h(n)?
 One common approach is to use a linear combination:
h(n) = c1 x1(n) + c2 x2(n) .
 The constants c1 and c2 are adjusted to give the best fit to the actual
data on solution costs.
 Both c1 and c2 need to be positive because misplaced tiles and incorrect adjacent pairs make the problem harder to solve.
 Note that this heuristic satisfies the condition that h(n) = 0 for goal states, but it is not necessarily admissible or consistent.
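A minimal sketch of fitting such a linear combination by ordinary least squares (with no intercept term, so h = 0 whenever both features are 0, as at a goal state). The training triples shown are hypothetical numbers invented for illustration, not measured solution costs.

```python
def fit_linear_heuristic(examples):
    """Least-squares fit of h = c1*x1 + c2*x2 from (x1, x2, true_cost) triples,
    solved via the 2x2 normal equations."""
    s11 = sum(x1 * x1 for x1, _, _ in examples)
    s22 = sum(x2 * x2 for _, x2, _ in examples)
    s12 = sum(x1 * x2 for x1, x2, _ in examples)
    s1y = sum(x1 * y for x1, _, y in examples)
    s2y = sum(x2 * y for _, x2, y in examples)
    det = s11 * s22 - s12 * s12
    c1 = (s1y * s22 - s2y * s12) / det
    c2 = (s2y * s11 - s1y * s12) / det
    return c1, c2

# Hypothetical data: (misplaced tiles x1, wrong adjacent pairs x2, actual solution cost)
data = [(5, 4, 14), (8, 7, 26), (3, 2, 8), (6, 5, 18), (2, 1, 5)]
c1, c2 = fit_linear_heuristic(data)
h = lambda x1, x2: c1 * x1 + c2 * x2    # learned heuristic; not guaranteed admissible
```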
