AI Module 2
Best First Search
Best first search is a way of combining the advantages of both depth first search and breadth first search into a single method. Depth first search is good because it allows a solution to be found without all competing branches having to be expanded. Breadth first search is good because it does not get trapped on dead-end paths. One way of combining the two is to follow a single path at a time, but switch paths whenever some competing path looks more promising than the current one does.
[Figure: five snapshots of a best first search. Step 1: A alone. Step 2: A expands to B (3), C (5), D (1). Step 3: D expands to E (4), F (6). Step 4: B expands to G (6), H (5). Step 5: E expands to I (2), J (1).]
At each step of the best first search process we select the most promising of the
nodes we have generated so far. This is done by applying an appropriate heuristic
function to each of them. We then expand the chosen node by using the rules to generate
its successors. If one of them is a solution, we can quit. If not, all those new nodes are added to the set of nodes generated so far. Again the most promising node is selected and
the process is repeated.
Usually what happens is that a bit of depth first searching occurs as the most
promising branch is explored. But eventually if a solution is not found, that branch will
start to look less promising than one of the top level branches that had been ignored. At
that point the now more promising, previously ignored branch will be explored. But the
old branch is not forgotten. Its last node remains in the set of generated but unexpanded
nodes. The search can return to it whenever all the others get bad enough that it is again
the most promising path.
The figure above shows the beginning of a best first search procedure. Initially there is only one node, so it will be expanded. Doing so generates 3 new nodes. The heuristic function, which in this example is the estimated cost of getting to a solution from a given node, is applied to each of these new nodes. Since node D is the most promising, it is expanded next, producing 2 successor nodes, E and F. The heuristic function is then applied to them. Now another path, the one going through node B, looks more promising, so it is pursued, generating nodes G and H. But again, when these new nodes are evaluated, they look less promising than another path, so attention is returned to the path through D to E. E is then expanded, yielding nodes I and J. At the next step, J will be expanded, since it is the most promising. This process can continue until a solution is found.
Although the example above illustrates a best first search of a tree, it is sometimes
important to search a graph instead so that duplicate paths will not be pursued. An
algorithm to do this will operate by searching a directed graph in which each node
represents a point in the problem space. Each node will contain, in addition to a description of the problem state it represents, an indication of how promising it is, a parent link that points back to the best node from which it came, and a list of the nodes that were generated from it.
The parent link will make it possible to recover the path to the goal once the goal
is found. The list of successors will make it possible, if a better path is found to an
already existing node, to propagate the improvement down to its successors. We will call
a graph of this sort an OR graph, since each of its branches represents an alternative
problem-solving path. To implement such a graph search procedure, we will need to use 2
lists of nodes:
OPEN: nodes that have been generated, and have had the heuristic function
applied to them but which have not yet been examined (i.e., had their successors
generated). This is actually a priority queue in which the elements with the
highest priority are those with the most promising values of the heuristic function.
CLOSED: nodes that have already been examined. We need to keep these nodes in memory if we want to search a graph rather than a tree, since whenever a new node is generated, we need to check whether it has been generated before.
Algorithm:
1. Start with OPEN containing just the initial state.
2. Until a goal is found or there are no nodes left on OPEN, do:
a) Pick the best node on OPEN
b) Generate its successors
c) For each successor do:
i. If it has not been generated before, evaluate it, add it to OPEN, and
record its parent
ii. If it has been generated before, change the parent if this new path
is better than the previous one. In that case, update the cost of
getting to this node and to any successors that this node may
already have.
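The following is a minimal Python sketch of this procedure, assuming the caller supplies is_goal, successors, and the heuristic h (none of these names come from the text). For brevity it implements the pure greedy variant, ranking nodes by h alone, and omits step c(ii)'s re-parenting and cost propagation.

import heapq
from itertools import count

def best_first_search(start, is_goal, successors, h):
    ties = count()                       # tie-breaker so states need not be comparable
    open_heap = [(h(start), next(ties), start)]   # OPEN as a priority queue keyed on h
    parent = {start: None}               # parent links for path recovery
    closed = set()                       # CLOSED: already-expanded states
    while open_heap:
        _, _, node = heapq.heappop(open_heap)
        if node in closed:
            continue                     # skip stale duplicate heap entries
        if is_goal(node):
            path = []
            while node is not None:      # walk parent links back to start
                path.append(node)
                node = parent[node]
            return path[::-1]
        closed.add(node)
        for s in successors(node):
            if s not in parent:          # not generated before
                parent[s] = node
                heapq.heappush(open_heap, (h(s), next(ties), s))
    return None                          # OPEN exhausted: no solution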
Completeness: Yes. This means that, given unlimited time and memory, the algorithm
will always find the goal state if the goal can possibly be found in the graph. Even if the
heuristic function is highly inaccurate, the goal state will eventually be added to the open
list and will be closed in some finite amount of time.
Optimality: No. The best-first search algorithm is not guaranteed to find the shortest path from the start node to the goal node, even when the heuristic function perfectly estimates the remaining cost to reach the goal from each node. Therefore the solutions found by this algorithm should be regarded as quick approximations of the optimal solutions.
A* Algorithm
A-Star (or A*) is a general search algorithm that is extremely competitive with
other search algorithms, and yet intuitively easy to understand and simple to implement.
Search algorithms are used in a wide variety of contexts, ranging from A.I. planning
problems to English sentence parsing. Because of this, an effective search algorithm
allows us to solve a large number of problems with greater ease.
The problems that A-Star is best used for are those that can be represented as a
state space. Given a suitable problem, you represent the initial conditions of the problem
with an appropriate initial state, and the goal conditions as the goal state. For each action
that you can perform, generate successor states to represent the effects of the action. If
you keep doing this and at some point one of the generated successor states is the goal
state, then the path from the initial state to the goal state is the solution to your problem.
What A-Star does is generate and process the successor states in a certain way.
Whenever it is looking for the next state to process, A-Star employs a heuristic function
to try to pick the “best” state to process next. If the heuristic function is good, not only
will A-Star find a solution quickly, but it can also find the best solution possible.
A search technique that finds minimal cost solutions and is also directed towards goal states is called A* (A-star) search. A* is a typical heuristic search algorithm, in which the heuristic function estimates the shortest distance from the initial state, through the current node, to the closest goal state: the distance traveled so far plus the predicted distance ahead. In best-first search and hill climbing, the estimate of the distance to the goal was used alone as the heuristic value of a state. In A*, we add the estimate of the remaining cost to the actual cost needed to get to the current state.
That is, f(n) = g(n) + h(n), where g(n) is the actual cost of reaching node n from the start and h(n) is the estimated cost of reaching the goal from n. Intuitively, f(n) is the estimate of the best solution that goes through n. For example, if reaching n has cost g(n) = 5 and the heuristic predicts h(n) = 3 more, then f(n) = 8.
Like breadth-first search, A* is complete in the sense that it will always find a solution if there is one. If the heuristic function h is admissible, meaning that it never overestimates the actual minimal cost of reaching the goal, then A* is itself admissible (or optimal) if we do not use a closed set. If a closed set is used, then h must also be monotonic (or consistent) for A* to be optimal. This means that h never overestimates the cost of getting from a node to its neighbor. Formally, for all nodes x, y where y is a successor of x:

h(x) <= d(x, y) + h(y)

where d(x, y) is the cost of the edge from x to y.
A* is not only admissible; it also considers fewer nodes than any other admissible search algorithm with the same heuristic. This is because A* works from an "optimistic" estimate of the cost of a path through every node that it considers -- optimistic in that the true cost of a path through that node to the goal will be at least as great as the estimate. But, critically, as far as A* "knows", that optimistic estimate might be achievable.
When A* terminates its search, it has, by definition, found a path whose actual
cost is lower than the estimated cost of any path through any open node. But since those
estimates are optimistic, A* can safely ignore those nodes. In other words, A* will never
overlook the possibility of a lower-cost path and so is admissible.
Suppose now that some other search algorithm A terminates its search with a
path whose actual cost is not less than the estimated cost of a path through some open
node. Algorithm A cannot rule out the possibility, based on the heuristic information it
has, that a path through that node might have a lower cost. So while A might consider
fewer nodes than A*, it cannot be admissible. Accordingly, A* considers the fewest nodes of any admissible search algorithm that uses a heuristic estimate no more accurate than A*'s.
Algorithm:
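The algorithm can be written much like the best-first sketch above, now ordering OPEN by f = g + h and re-recording a node whenever a cheaper path to it is found. Here is a minimal Python sketch under the same assumed interface (successors now yields (state, step_cost) pairs); it is an illustration, not a definitive implementation:

import heapq
from itertools import count

def a_star(start, is_goal, successors, h):
    ties = count()                        # tie-breaker for the heap
    g = {start: 0}                        # cheapest known cost to each state
    parent = {start: None}
    open_heap = [(h(start), next(ties), start)]   # ordered by f = g + h
    closed = set()
    while open_heap:
        _, _, node = heapq.heappop(open_heap)
        if node in closed:
            continue                      # stale duplicate entry
        if is_goal(node):
            cost, path = g[node], []
            while node is not None:       # recover the path via parent links
                path.append(node)
                node = parent[node]
            return path[::-1], cost
        closed.add(node)                  # with consistent h, no re-expansion needed
        for s, step in successors(node):
            tentative = g[node] + step
            if tentative < g.get(s, float("inf")):   # found a better path to s
                g[s] = tentative
                parent[s] = node
                heapq.heappush(open_heap, (tentative + h(s), next(ties), s))
    return None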
Completeness: Yes, as long as the branching factor is finite.
Optimality: Yes, if h is admissible (and consistent when a closed list is used). Moreover, with a good heuristic, optimal solutions can be found for many problems in reasonable time.
Advantages:
1. A* benefits from the information contained in the Open and Closed lists to avoid
repeating search effort.
2. A*’s Open list maintains the search frontier, whereas IDA*’s iterative deepening results in the search repeatedly visiting states as it reconstructs the frontier (leaf nodes) of the search.
Constraint Satisfaction
Many problems in AI can be viewed as problems of constraint satisfaction, in which the goal is to discover some problem state that satisfies a given set of constraints. Examples of this sort of problem include cryptarithmetic puzzles and many real-world perceptual labeling problems. By viewing a problem as one of constraint satisfaction, it is often possible to substantially reduce the amount of search that would be required by some other method.
First, the constraint satisfaction formulation makes the choice of good heuristics to guide the solution strategy more straightforward.
Secondly, although CSP algorithms are essentially very simple, they can sometimes find solutions more quickly than if integer programming methods are used.
The first step, constraint propagation, arises from the fact that there are usually dependencies among the constraints. These dependencies occur because many constraints involve more than one object and many objects participate in more than one constraint. For example, assume we start with one constraint, N = E + 1. If we then add the constraint N = 3, we can propagate it to get a stronger constraint on E, namely E = 2.
At this point the second step begins: some hypothesis about a way to strengthen the constraints must be made. In the case of a cryptarithmetic problem this usually means guessing a particular value for some letter. Once this has been done, constraint propagation can begin again from this new state. If a solution is found, it can be reported. If a contradiction is detected, backtracking can be used to try a different guess and proceed from there.
Algorithm:
1. Propagate available constraints. To do this, first set OPEN to the set of all objects that must have values assigned to them in a complete solution. Then do until an inconsistency is detected or until OPEN is empty:
a) Select an object OB from OPEN. Strengthen as much as
possible the set of constraints that apply to OB
b) If this set is different from the set that was assigned the last time OB was examined, or if this is the first time OB has been examined, then add to OPEN all objects that share any constraint with OB.
c) Remove OB from OPEN.
2. If the union of the constraints discovered above defines a solution, then quit and report the solution.
3. If the union of the constraints discovered above defines a contradiction, then quit and return failure.
4. If neither of the above occurs, then it is necessary to make a guess at something in order to proceed. To do this, loop until a solution is found or all possible solutions have been eliminated:
a) Select an object whose value is not yet determined and select
a way of strengthening the constraints on that object.
b) Recursively invoke constraint satisfaction with the current
set of constraints augmented by the strengthening
constraint just selected.
Consider the classic cryptarithmetic puzzle:

  SEND
+ MORE
=========
 MONEY
The goal state is a problem state in which all letters have been assigned a digit in such a way that all the initial constraints are satisfied. The solution proceeds in cycles. At each cycle, two significant things are done: constraints are propagated by using rules that correspond to the properties of arithmetic, and a value is guessed for some letter whose value is not yet determined.
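To make the constraints concrete, here is a brute-force Python sketch that simply enumerates digit assignments. It illustrates the puzzle itself, not the propagate-and-guess algorithm above, which would deduce facts such as M = 1 instead of searching blindly:

from itertools import permutations

def solve_send_more_money():
    letters = 'SENDMORY'                      # the 8 distinct letters in the puzzle
    for digits in permutations(range(10), len(letters)):
        a = dict(zip(letters, digits))        # one candidate assignment
        if a['S'] == 0 or a['M'] == 0:        # leading digits cannot be zero
            continue
        def word(w):                          # numeric value of a word under `a`
            n = 0
            for ch in w:
                n = n * 10 + a[ch]
            return n
        if word('SEND') + word('MORE') == word('MONEY'):
            return a
    return None

print(solve_send_more_money())
# -> S=9, E=5, N=6, D=7, M=1, O=0, R=8, Y=2  (9567 + 1085 = 10652)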
Game Playing
A utility function (payoff function) produces a numerical value for (only) the terminal states. Example: in chess, outcome = win/loss/draw, with values +1, -1, and 0 respectively.
Game Trees
The sequence of states formed by possible moves is called a game tree; each level of the tree is called a ply. In two-player games we call the two players Max (us) and Min (the opponent). WIN refers to a win for Max. At each ply, the "turn" switches to the other player.
Each level of search nodes in the tree corresponds to all the possible board configurations for a particular player, Max or Min. Utility values found at the end can be propagated back up to their parent nodes. Winning for Min is losing for Max: Max wants to end in a board with value +1 and Min in a board with value -1.
Max chooses the board with the max utility value, Min the minimum. Max is the
first player and Min is the second. Every player needs a strategy. For example, the
strategy for Max is to reach a winning terminal state regardless of what Min does. Even
for simple games, the search tree is huge.
Minimax Algorithm
The Minimax Game Tree is used for programming computers to play games in
which there are two players taking turns to play moves. Physically, it is just a tree of all
possible moves.
With a full minimax tree, the computer could look ahead for each move to determine the best possible move. Of course, as you can see in the example diagram shown below, the tree can get very big after only a few moves. Thus, for large games like Chess
and Go, computer programs are forced to estimate who is winning or losing by focusing
on just the top portion of the entire tree. In addition, programmers have come up with all
sorts of algorithms and tricks such as Alpha-Beta pruning.
The minimax game tree, of course, cannot be used very well for games in which
the computer cannot see the possible moves. So, minimax game trees are best used for
games in which both players can see the entire game situation. These kinds of games, such as Checkers, Othello, Chess, and Go, are called games of perfect information.
For instance, take a look at the following (partial) search tree for Tic-Tac-Toe.
Notice that unlike other trees like binary trees, 2-3 trees, and heap trees, a node in the
game tree can have any number of children, depending on the game situation. Let us
assign points to the outcome of a game of Tic-Tac-Toe. If X wins, the game situation is
given the point value of 1. If O wins, the game has a point value of -1. Now, X will be
trying to maximize the point value, while O will be trying to minimize the point value.
So, one of the first researchers on the minimax tree decided to name player X as Max and
player O as Min. Thus, the entire data structure came to be called the minimax game
tree.
Minimax applies to games with the following properties:
two-person: there are two players.
perfect information: both players have complete information about the state of the game. (Chess has this property, but poker does not.)
zero-sum: if we count a win as +1, a tie as 0, and a loss as -1, the sum of the scores for both players is always zero.
This minimax logic can also be extended to games like chess. In these more complicated games, however, the programs can only look at a part of the minimax tree; often, the programs can't even see the end of the game because it is so far down the tree.
So, the computer only looks at a certain number of nodes and then stops. Then the
computer tries to estimate who is winning and losing in each node, and these estimates
result in a numerical point value for that game position. If the computer is playing as
Max, the computer will try to maximize the point value of the position, with a win
(checkmate) being equal to the largest possible value (positive 1 million, let's say). If the
computer is playing as Min, it will obviously try to minimize the point value, with a win
being equal to the smallest possible value (negative 1 million, for instance).
Fig: A (partial) search tree for the game of Tic-Tac-Toe is shown above. The top node is the initial state, and Max moves first, placing an X in an empty square. We show part of the search tree, giving alternating moves by Min (O) and Max (X) until we eventually reach terminal states, which can be assigned utilities according to the rules of the game.
Minimax Evaluation
A search tree is generated, depth first, starting with the current game position and running down to the end-game positions.
Compute the values (through the utility function) for all the terminal states.
Afterwards, compute the utility of the nodes one level higher up in the search tree (up from the terminal states). The nodes that belong to the MAX player receive the maximum value of their children; the nodes for the MIN player select the minimum value of their children.
Continue backing up the values from the leaf nodes towards the root.
When the root is reached, Max chooses the move that leads to the highest value (the optimal move).
Given a game tree, the optimal strategy can be determined by examining the minimax value of each node, which we can write as MINIMAX-VALUE(n). The utility value is the value of a terminal node in the game tree, and the minimax value of a terminal state is just its utility. The minimax value indicates the best value that the current player can possibly achieve: it is either the max or the min of the values of the node's children. The minimax algorithm is a depth-first search. The space requirements are linear in b and m, where b is the number of legal moves at each point and m is the maximum depth of the tree. For real games, the time cost is impractical.
Algorithm:
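This procedure fits in a few lines of Python. The sketch assumes a game object offering is_terminal(s), utility(s), and moves(s), which yields successor states; those names are illustrative, not from the text:

def minimax_decision(state, game):
    # MAX picks the successor with the best backed-up value,
    # assuming MIN replies optimally at the next ply.
    return max(game.moves(state), key=lambda s: min_value(s, game))

def max_value(state, game):
    if game.is_terminal(state):
        return game.utility(state)      # minimax value of a terminal = utility
    return max(min_value(s, game) for s in game.moves(state))

def min_value(state, game):
    if game.is_terminal(state):
        return game.utility(state)
    return min(max_value(s, game) for s in game.moves(state))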
Above shown is an algorithm for calculating minimax decisions. It returns the action corresponding to the best possible move, that is, the move that leads to the outcome with the best utility, under the assumption that the opponent plays to minimize utility. The functions MAX-VALUE and MIN-VALUE go through the whole game tree, all the way to the leaves, to determine the backed-up value of a state.
Alpha-Beta Search
The problem with minimax search is that the number of game states it has to examine is exponential in the number of moves. Minimax lets us look ahead only four or five ply in chess, while average human chess players can make plans six or eight ply ahead. But it is possible to compute the correct minimax decision without looking at every node in the game tree. Alpha-beta pruning can be used in this context. When applied to a standard minimax tree, it returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision.
alpha is the value of the best choice (highest value) we have found so far along the path for MAX.
beta is the value of the best choice (lowest value) we have found so far along the path for MIN.
Alpha-beta pruning updates the values of alpha and beta as it moves through the tree and prunes a subtree as soon as it finds that the subtree is worse than the current value of alpha or beta. If two nodes in the hierarchy have incompatible inequalities (no possible overlap), then we know that the node below will not be chosen, and we can stop searching.
Alpha-Beta Search Example
Bounded depth-first search is usually used with the alpha-beta algorithm for game trees. However:
1. The depth bound may stop the search just as things get interesting (e.g., in the middle of a piece exchange in chess). For this reason, the depth bound is usually extended to the end of an exchange.
2. The search may tend to postpone bad news until after the depth bound: the
horizon effect.
Frequently, large parts of the search space are irrelevant to the final decision and can be pruned. There is no need to explore options that are already definitely worse than the current best option. Consider again the 2 ply game tree from Fig 5.2. If we go through the calculation of the optimal decision once more, we can identify the minimax decision without ever evaluating 2 of the leaf nodes.
Let the 2 unevaluated nodes be x and y, and let z = min(x, y). The value of the root node is then given by
root = max(3, min(2, x, y), 2)
     = max(3, min(2, z), 2)
     = 3,
since min(2, z) can never exceed 2, whatever x and y turn out to be.
In other words, the value of the root, and hence the minimax decision, are independent of the values of the pruned leaves x and y. Alpha-beta pruning can be applied to trees of any depth, and it is often possible to prune entire subtrees rather than just leaves. The general principle is this: consider a node n somewhere in the tree such that a player has a choice of moving to that node. If the player has a better choice m, either at the parent node of n or at any choice point further up, then n will never be reached in actual play.
Algorithm:
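A compact Python sketch of alpha-beta search, under the same assumed game interface as the minimax sketch above (is_terminal, utility, and moves are illustrative names):

import math

def alpha_beta_decision(state, game):
    best, alpha = None, -math.inf
    for s in game.moves(state):           # root is a MAX node
        v = ab_value(s, game, False, alpha, math.inf)
        if v > alpha:                     # strictly better backed-up value
            best, alpha = s, v
    return best

def ab_value(state, game, maximizing, alpha, beta):
    if game.is_terminal(state):
        return game.utility(state)
    if maximizing:
        v = -math.inf
        for s in game.moves(state):
            v = max(v, ab_value(s, game, False, alpha, beta))
            if v >= beta:                 # MIN above will never allow this branch:
                return v                  # beta cutoff, prune remaining successors
            alpha = max(alpha, v)
        return v
    else:
        v = math.inf
        for s in game.moves(state):
            v = min(v, ab_value(s, game, True, alpha, beta))
            if v <= alpha:                # MAX above already has a better option:
                return v                  # alpha cutoff
            beta = min(beta, v)
        return v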
The minimax search is depth first, so at any one time we just have to consider the nodes along a single path in the tree. The effectiveness of alpha-beta pruning is highly dependent on the order in which the successors are examined. In games, repeated states occur frequently because of transpositions: different permutations of the move sequence that end up in the same position. It is worthwhile to store the evaluation of such a position in a hash table the first time it is encountered, so that we don't have to recompute it on subsequent occurrences; such a table is traditionally called a transposition table.
Limitations of Alpha-Beta Pruning