
Principles of Artificial Intelligence/18AI55 Module – 2

Syllabus
Module – 1

Introduction to AI: history, intelligent systems, foundations and sub-areas of AI, applications, current trends and
development of AI. Problem solving: state space search and control strategies.

Module – 2
Problem reduction and Game playing: Problem reduction, game playing, Bounded look-ahead strategy, alpha-
beta pruning, Two player perfect information games.

Module – 3
Logic concepts and logic Programming: propositional calculus, Propositional logic, natural deduction system,
semantic tableau system, resolution refutation, predicate logic, Logic programming.

Module – 4

Advanced problem solving paradigm: Planning: types of planning system, block world problem, logic based
planning, Linear planning using a goal stack, Means-ends analysis, Non-linear planning strategies, learning plans.

Module – 5
Knowledge Representation, Expert system
Approaches to knowledge representation, knowledge representation using semantic network, extended
semantic networks for KR, Knowledge representation using Frames.
Expert system: introduction, phases, architecture, ES versus traditional systems.

Course Learning Objectives: This course will enable students to:


1. Gain a historical perspective of AI and its foundations.
2. Become familiar with basic principles of AI toward problem solving.
3. Get to know approaches of inference, perception, knowledge representation, and learning.
Course outcomes: The students should be able to:
1. Apply the knowledge of Artificial Intelligence to write simple algorithm for agents.
2. Apply the AI knowledge to solve problem on search algorithm.
3. Develop knowledge base sentences using propositional logic and first order logic.
4. Apply first order logic to solve knowledge engineering process.
Textbooks:
1. Saroj Kaushik, Artificial Intelligence, Cengage learning, 2014.

Reference Books:
1. Elaine Rich, Kevin Knight, Artificial Intelligence, Tata McGraw Hill.
2. Nils J. Nilsson, Principles of Artificial Intelligence, Elsevier, 1980.
3. Stuart Russell, Peter Norvig, Artificial Intelligence: A Modern Approach, Pearson Education, 3rd Edition,
2009.
4. George F. Luger, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, Pearson
Education, 5th Edition, 2011.

Web Resource Link:

Prof. Manzoor Ahmed, Dept. of AI&ML, SKIT 1



Module – 2
Problem reduction and Game playing: Problem reduction, game playing, Bounded look-ahead strategy, alpha-
beta pruning, Two player perfect information games.

Introduction
An effective way of solving a complex problem is to reduce it to simpler parts and solve each part separately.
The problem is automatically solved when we obtain solutions to all the smaller, simpler problems. This is the
basic intuition behind the method of problem reduction.
The structure called AND-OR graph (or tree) is useful for representing the solution of complicated problems.
In this chapter, we will discuss the concept of AND-OR graphs and their use in game playing. Game playing is one
of the most direct applications of the state-space search problem-solving paradigm. However, the search
procedures employed in game playing differ from the ones used in state-space search problems, as they are
based on the generate-and-test philosophy.
In this method, the generator generates individual moves in the search space; each of these moves is then
evaluated by the tester and the most promising one is chosen. The effectiveness of a search may be improved
by improving the generate-and-test procedures used.
The generate procedure should be such that it generates good moves (or paths), while the test procedure
recognizes the best moves out of these and explores them first.
In this chapter, we will develop search procedures for two-player games as they are more common and easier
to design and execute.

Problem Reduction
In real-world applications, complicated problems can be divided into simpler sub-problems; the solution of
each sub-problem may then be combined to obtain the final solution.
A given problem may be solved in a number of ways. For instance, if you wish to own a cellular phone then it
may be possible that either someone gifts one to you or you earn money and buy one for yourself. The AND-
OR graph which depicts these possibilities is shown in Fig. 3.1 (Rich & Knight, 2003).
An AND-OR graph provides a simple representation of a complex problem and hence aids in better
understanding.


Thus, this structure may prove to be useful to us in a number of problems involving real-life situations. To find
a solution using AND-OR graphs, we need an algorithm similar to the A* algorithm (discussed in Chapter 2) with
an ability to handle AND arcs.
Let us consider a problem known as the Tower of Hanoi to illustrate the need for the problem-reduction concept.
It consists of three rods and a number of disks of different sizes which can slide onto any rod (Paul Brna, 1996).
The puzzle starts with the disks being stacked in descending order of their sizes, with the largest at the bottom
of the stack and the smallest at the top, thus making a conical shape. The objective of the puzzle is to move the
entire stack to another rod by using the following rules:

 Only one disk may be moved at a time.

 Each move consists of taking the uppermost disk from one of the rods and sliding it onto another rod,
on top of the other disks that may already be present on that rod.

 No disk may be placed on top of a smaller disk.


Consider that there are n disks in one rod (rod_1). Now, our aim is to move these n disks from rod_1 to rod_2
making use of rod_3. Let us develop an algorithm which shows that this problem can be solved by reducing it
to smaller problems.
Basically the method of recursion will be used to solve this problem. The game tree that is generated will contain
AND-OR arcs. The solution of this problem will involve the following steps:
• If n = 1, then simply move the disk from rod_1 to rod_2.

• If n > 1, then first move the top n - 1 smaller disks, in the same order, from rod_1 to rod_3, and
then move the largest disk from rod_1 to rod_2.

• Finally, move the n - 1 smaller disks from rod_3 to rod_2. The problem is thus reduced to moving n - 1 disks
from one rod to another: first from rod_1 to rod_3, and then from rod_3 to rod_2.
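The reduction above translates directly into a recursive procedure (a minimal sketch; the function and rod names are chosen here for illustration):

```python
def hanoi(n, source, target, spare):
    """Return the list of moves that transfers n disks from source to target."""
    if n == 1:
        return [(1, source, target)]
    moves = hanoi(n - 1, source, spare, target)   # clear the top n-1 disks
    moves.append((n, source, target))             # move the largest disk
    moves += hanoi(n - 1, spare, target, source)  # restack the n-1 disks on top
    return moves

for disk, src, dst in hanoi(3, "rod_1", "rod_2", "rod_3"):
    print(f"Move disk {disk} from {src} to {dst}")
```

For n disks this produces 2^n - 1 moves, each sub-call being a smaller instance of the same problem, which is exactly the AND-decomposition in the game tree.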
Example 3.1: Let us consider the case of 3 disks. The start and goal states are shown in Fig.3.2


The search space graph (not completely expanded) shown in Fig. 3.3 is an AND-OR graph. Here, we have shown
two alternative paths. One path is from root to state A and the other is from root to state A'.
The path through A requires all the three states A, B, and C to be achieved to solve the problem. In order to
achieve state A we need to expand it in a similar fashion, which will again require drawing of AND arcs.
The process continues till we achieve the goal state, that is, state C is achieved. It is important to note that the
subtasks in the process are not independent of each other and therefore cannot be achieved in parallel.
State B will be obtained after state A has been achieved, and state C will be obtained after state B has been
achieved. The second path is from root to state A', then to state B', and then to state C'. This path is to be
continued till we reach the goal state. The path from root to state A is optimal whereas the path from root to
state A' is longer.
We will use the heuristic function f for each node in the AND-OR graph similar to the one used in the algorithm
A* to compute the estimated value. A given node in the graph may be either an OR node or an AND node. In an
AND-OR graph, the estimated costs for all paths generated from the start node to level one are calculated by
the heuristic function and placed at the start node itself.
The best path is then chosen to continue the search further; unexpanded nodes on the chosen best path are explored
and their successors are generated. The heuristic values of the successor nodes are calculated and the cost of
parent nodes is revised accordingly. This revised cost is propagated back to the start node through the chosen
path.


Let us explain this concept by considering a hypothetical example. Consider an AND-OR graph (Fig. 3.4) where
each arc with a single successor has a cost of 1; also assume that each AND arc with multiple successors has a
cost of 1 for each of its components for the sake of simplicity.
In the tree shown in Fig. 3.4, let us assume that the numbers listed in parenthesis, (), denote the estimated
costs, while the numbers in the square brackets, [ ], represent the revised costs of path.
Thick lines in the figure indicate paths from a given node. We begin our search from start node A and compute
the heuristic values for each of its successors, say B and (C, D) as 19 and (8,9) respectively.
The estimated cost of paths from A to B is 20 (19 + cost of one arc from A to B) and that from A to (C, D) is 19 (8
+ 9 + cost of two arcs, A to C and A to D).
The path from A to (C, D) seems to be better than that from A to B. So, we expand this AND path by extending
C to (G, H), and D to (I, J).
Now, the heuristic values of G, H, I, and J are 3, 4, 8, and 7, respectively, which lead to revised costs of C and D
as 9 and 17, respectively. These values are then propagated up and the revised costs of path from A to (C, D) is
calculated as 28 (9+17 + cost of arcs A to C and A to D).

Note that the revised cost of this path is now 28 instead of the earlier estimation of 19; thus, this path is no
longer the best path now. Therefore, choose the path from A to B for expansion. After expansion we see that
the heuristic value of node B is 17 thus making the cost of the path from A to B equal to 18.
This path is the best so far; therefore, we further explore the path from A to B. The process continues until
either a solution is found or all paths lead to dead ends, indicating that there is no solution.
It should be noted that the propagation of the estimated cost of the path is not relevant in A* algorithm as it is
used for an OR graph where there is a clear path from the start to the current node and the best node is
expanded.
In case of an AND-OR graph, there need not be a direct path from the start node to the current node because
of the presence of AND arcs. Therefore, the cost of the path is recalculated by propagating the revised costs.
For handling such graphs, a modified version of the algorithm A* called AO* algorithm is used.
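The revised-cost computation described above can be sketched as follows (a simplified illustration using a hypothetical tree encoding, with each arc costing 1 as in the example):

```python
def cost(node, arc=1):
    """Revised cost of a node in an AND-OR graph.

    A tip node is just its heuristic estimate h. An internal node is a list of
    connectors (one per outgoing OR choice); each connector is the list of
    successors joined by a single AND arc. The node's cost is that of the
    cheapest connector, where a connector costs the sum of (successor cost +
    arc cost) over its components.
    """
    if isinstance(node, (int, float)):
        return node
    return min(sum(cost(child) + arc for child in connector)
               for connector in node)

# Fig. 3.4 after one round of expansion: A offers B (h = 19) or the AND pair
# (C, D), where C leads to the AND pair (G, H) and D to (I, J).
C = [[3, 4]]        # one AND connector to G (h = 3) and H (h = 4)
D = [[8, 7]]        # one AND connector to I (h = 8) and J (h = 7)
A = [[19], [C, D]]  # OR: either B (h = 19) or the AND arc to C and D
print(cost(C))      # 9
print(cost(D))      # 17
print(cost(A))      # 20: the path through B is now cheaper than 28
```

This reproduces the numbers in the worked example: C and D are revised to 9 and 17, the AND path to (C, D) rises to 28, so the path through B (cost 20) becomes the best choice.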


Node Status Labelling Procedure


At any point in time, a node in an AND-OR graph may be either a terminal node or a non-terminal AND/OR
node. The labels used to represent these nodes in a graph (or tree) are described as follows:

 Terminal node: A terminal node in a search tree is a node that cannot be expanded further. If this node
is the goal node, then it is labelled as solved; otherwise, it is labelled as unsolved. It should be noted that
this node might represent a sub-problem.

 Non-terminal AND node: A non-terminal AND node is labelled as unsolved as soon as one of its
successors is found to be unsolvable; it is labelled as solved if all of its successors are solved.

 Non-terminal OR node: A non-terminal OR node is labelled as solved as soon as one of its successors is
labelled solved; it is labelled as unsolved if all its successors are found to be unsolvable.
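The three labelling rules can be sketched as a small recursive procedure (a minimal illustration; the dictionary-based node representation here is an assumption, not part of the text):

```python
def solved(node):
    """Label a node of an AND-OR tree as solved (True) or unsolved (False).

    A node is a dict with keys:
      'type'       -- 'terminal', 'and', or 'or'
      'goal'       -- for terminal nodes: True if it is a goal node
      'successors' -- for non-terminal nodes: list of child nodes
    """
    if node["type"] == "terminal":
        return node["goal"]  # solved only if the terminal node is a goal node
    if node["type"] == "and":
        # AND node: every successor must be solved
        return all(solved(s) for s in node["successors"])
    # OR node: a single solved successor suffices
    return any(solved(s) for s in node["successors"])
```

For example, an AND node over one goal leaf and one dead-end leaf comes out unsolved, while an OR node over the same two leaves comes out solved.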
Let us explain the labelling procedure with the help of an example. The AND-OR trees are generated level wise
as shown in Figs 3.5 to 3.7. The first two cycles are shown in Fig. 3.5.
In the first cycle, we expand the start node A to node B and nodes (C, D) (Fig. 3.5). The heuristic values at node
B and nodes (C, D) are computed as 4 and (2, 3), respectively.
The estimated costs of paths from A to B and from A to (C, D) are determined as 5 and 7, respectively assuming
that the cost of each arc is one. Here dotted lines show propagation of heuristic value to the root.
Thus, we find that the path from A to B is better. In the second cycle, node B is expanded to nodes E and F (Fig.
3.5). The estimated cost of the path on this route is revised to 8 at the start node A. Thus, the best path is found
to be that from A to (C, D) instead of A to B.

In the third cycle, node C is expanded to {G, (H, I)} (Fig. 3.6). Here we notice that the heuristic value at nodes H
and I is 0, indicating that these are terminal solution nodes; H and I are thus labelled as solved. Node C also gets
labelled as solved using the status labelling procedure.


In the fourth cycle, node D is expanded to J (Fig. 3.7). This node is also labelled as solved, and subsequently
node D attains the solved label. The start node A also gets labelled as solved as C and D both are labelled as
solved. Along with the labelling status, the cost of the path is also propagated.
In this example, the solution graph with minimal cost equal to 5 is obtained by tracing down through the marked
arrows as A → (C → (H, I), D → J).

It is not always necessary that the shorter path will lead to a solution. Sometimes, the longer path may prove
to be better. Consider the example discussed above. Assume that node J is labelled as unsolved after the fourth
cycle (Fig. 3.7), then D is also labelled as unsolved. As a result, A will also be labelled as unsolved through this
path.
Therefore, another path, even though having higher cost, will be tried. Suppose, node E is expanded to P, P to
Q, Q to R, and finally, R to C (Fig. 3.8).
We notice that node C is already solved; then, in accordance with the labelling procedure, nodes R, Q, P, E,
B, and A get labelled as solved with the total cost as 8. This new path to C is longer than the previous one coming
directly from A to C. However, since the shorter path will not lead to a solution (as D is unsolvable), the longer
path through R is better.


Algorithm Steps for AND-OR Graphs

Cyclic Graphs
If the graph is cyclic (containing cyclic paths) then the algorithm outlined above will not work properly unless
modified. If the successor is generated and found to be already in the graph, then we must check that the node
in the graph is not an ancestor of the node being expanded.
If not, then the newly discovered path to the node may be entered in the graph. We can now precisely state
the steps that need to be taken for performing heuristic search of an AND-OR graph. N.J. Nilsson named this
algorithm AO*, as it is used for searching for a solution in an AND-OR graph.
Rather than using the two lists, OPEN and CLOSED, as used in OR graph search algorithms given in the previous
chapter, a single structure called graph G is used in AO* algorithm. This graph represents the part of the search
graph generated explicitly so far.
Each node in the graph will point down to its immediate successors as well as up to its immediate predecessor,
and will have h value (an estimate of the cost of a path from current node to a set of solution nodes) associated
with it.
The value of g (cost from start to current node) is not computed at each node, unlike the case of A* algorithm,
as it is not possible to compute a single such value since there may be many paths to the same state.
Moreover, such a value is not necessary because of the top-down traversing of the best-known path which
guarantees that only nodes that are on the best path will be considered for expansion.
So, h alone, instead of f, will serve as a good estimate for an AND-OR graph search. While propagating the cost
upward to the parent node, the arc cost (the g component) is added to h in order to obtain the revised cost of the parent.


Further, we have to use the node-labelling (solved or unsolved) procedure described earlier for determining the
status of the ancestor nodes on the best path. The detailed algorithm for AO* is given below. The threshold
value is chosen to take care of unsolved nodes.


The algorithm for propagating the newly discovered information up through the graph is given below:

Interaction between Sub-Goals


The AO* algorithm discussed above fails to take into account the interaction between sub-goals, which may lead
to a non-optimal solution. Let us explain the need to consider interaction between sub-goals. In the graph shown
in Fig. 3.9, we assume that both C and D ultimately lead to a solution.
In order to solve A (an AND node), both B and D have to be solved. The AO* algorithm considers the solution of B
as a completely separate process from the solution of D.
Node B is expanded to C and D, both of which eventually lead to a solution. Using the AO* algorithm, node C is
solved in order to solve B, as the path B → C seems to be better than the path B → D.
We note that it is necessary to solve D in order to solve A. But we realize that node D will also solve B, and hence
there would be no need to solve C.
We can clearly see that the cost of solving A through the path A → B → D is 9, whereas in the case of solving B
through C, the cost of A comes out to be 12. Since AO* does not consider such interactions, we may fail to find
the optimal path for this problem.


Game Playing
A game is defined as a sequence of choices where each choice is made from a number of discrete alternatives.
Each sequence ends in a certain outcome and every outcome has a definite value for the opening player.
Games can be classified into two types: perfect information games and imperfect information games.
Perfect information games are those in which both players have access to the same information about the
game in progress; for example, Checkers, Tic-Tac-Toe, Chess, Go, etc.
On the other hand, in imperfect information games, players do not have access to complete information about
the game; for example, games involving the use of cards (such as Bridge) and dice.

Game Problem versus State Space Problem


It should be noted that there is a natural correspondence between games and state space problems. For example,
in state space problems, we have a start state, intermediate states, rules or operators, and a goal state. In game
problems also, we have a start state, legal moves, and winning positions (goals). To further clarify the
correspondence between the two, a comparison is shown in Table 3.1.

A game begins from a specified initial state and ends in a position that can be declared a win for one, a loss for
the other, or possibly a draw. A game tree is an explicit representation of all possible plays of the game.
The root node is an initial position of the game. Its successors are the positions that the first player can reach in
one move; their successors are the positions resulting from the second player's moves and so on. Terminal or
leaf nodes are represented by WIN, LOSS, or DRAW. Each path from the root to a terminal node represents a
different complete play of the game.
Game theory is based on the philosophy of minimizing the maximum possible loss and maximizing the minimum
gain. In game playing involving computers, one player is assumed to be the computer, while the other is a
human.
During a game, two types of nodes are encountered, namely, MAX and MIN. The MAX node will try to maximize
its own game, while minimizing the opponent's (MIN) game.
Either of the two players, MAX and MIN, can play as the first player. We will assign the computer to be the MAX
player and the opponent to be the MIN player.
Our aim is to make the computer win the game by always making the best possible move at its turn. For this,
we have to look ahead at all possible moves in the game by generating the complete game tree and then decide
which move is the best for MAX. As a part of game playing, game trees labelled as MAX level and MIN level are
generated alternately.


Status Labelling Procedure in Game Tree


We label each level in the game tree according to the player who makes the move at that point in the game.
The leaf nodes are labelled as WIN, LOSS, or DRAW depending on whether they represent a win, loss, or draw
position from MAX's point of view.
Status labelling procedure for a node with WIN, LOSS, or DRAW in case of game tree is given as follows:

The function STATUS(j) assigns the best status that MAX can achieve from position j if it plays optimally against
a perfect opponent. The status of the leaf nodes is assigned by the rules of the game from MAX's point of view.
Solving a game tree implies labelling the root node with one of the labels WIN (W), LOSS (L), or DRAW (D).
There is an optimal playing strategy associated with each root label, which tells how that label can be
guaranteed regardless of the way MIN plays. An optimal strategy for MAX is a sub-tree in which all nodes,
starting from first MAX, are WIN.
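The status labelling for game trees can be sketched as a small recursive procedure (a minimal illustration; the tree encoding below, with leaves given directly as their W/L/D labels from MAX's point of view, is an assumption for the example):

```python
def status(node, max_to_move=True):
    """Best outcome MAX can guarantee from a position: 'W', 'L', or 'D'.

    A leaf is encoded as its label; an internal node is a list of
    successor positions, with the two players alternating by level.
    """
    if isinstance(node, str):   # leaf: labelled by the rules of the game
        return node
    order = ["W", "D", "L"]     # best to worst, from MAX's point of view
    results = [status(child, not max_to_move) for child in node]
    if max_to_move:
        return min(results, key=order.index)  # MAX picks the best for MAX
    return max(results, key=order.index)      # MIN picks the worst for MAX

# MAX to move: one branch loses, the other wins, so the root is a WIN.
print(status(["L", "W"]))  # W
```

Note how the same tree evaluated with MIN to move yields L: the labelling depends on who plays at each level, matching the alternating MAX/MIN levels of the game tree.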


The hypothetical game tree shown in Fig. 3.10 is generated when the MAX player plays first. As mentioned
earlier, the status of the leaf nodes is calculated in accordance with the rules of the game as W, L, or D from
MAX's point of view. The status labelling procedure is used to propagate the status to non-terminal nodes till
root and is shown by attaching the status to the node. Thick lines in the game tree show the winning paths for
MAX, while dotted lines show the status propagation to the root node. It should be noted that all the nodes on
the winning path are labelled as W.

Nim Game Playing


The Nim game is believed to have originated in China, but the exact location of its origin is not certain. Charles
L. Bouton of Harvard University developed the complete theory of the Nim game in the year 1901.
The Game: There is a single pile of matchsticks (> 1) and two players. Moves are made by the players alternately.
In a move, each player can pick up a maximum of half the number of matchsticks in the pile. Whoever takes
the last matchstick loses.
Let us consider the game with a single pile of 7 matchsticks for the sake of simplicity. Each player in a particular
move can pick up a maximum of half the number of matchsticks in the pile at that point in time. We will develop
the complete game tree with either MAX or MIN playing as the first player.
The convention used for drawing a game tree is that each node contains the total number of sticks in the pile
and is labelled as W or L in accordance with the status labelling procedure.
The player who has to pick up the last stick loses. If a single stick is left at the MAX level then as a rule of the
game, MAX node is assigned the status L, whereas if one stick is left at the MIN level, then W is assigned to MIN
node as MAX wins.


The labels L and W have been assigned from MAX's point of view at the leaf nodes. Arcs carry the number of sticks to
be removed. Dotted lines show the propagation of status. The complete game tree for Nim with MAX playing
first is shown in Fig. 3.12. We can see from this figure that the MIN player always wins irrespective of the move
made by the first player.

Now, let us consider a game tree with MIN as the first player and see the results. The game tree for this situation
is shown in Fig. 3.13. Thick lines show the winning path for MAX. From the search tree given in the figure, we
notice that MAX wins irrespective of the moves of MIN player. Thick lines show the winning paths where all
nodes have been labelled as W.
From the trees given in Figs 3.12 and 3.13, we can infer that the second player always wins regardless of the
moves of the first player in this particular case.
Since the game is played between a computer and a human being, we will now discuss game-playing
strategies with respect to the computer. In this case, the MAX player is considered to be the computer program.
Let us formulate a strategy for the MAX player so that MAX can win the game.

Strategy: If at the time of the MAX player's turn there are N matchsticks in the pile, then MAX can force a win by
leaving M matchsticks for the MIN player to play, where M ϵ {1, 3, 7, 15, 31, ...}, using the rule of the game (that is,
MAX can pick up a maximum of half the number of matchsticks in the pile). The sequence {1, 3, 7, 15, 31, 63, ...}
can be generated using the formula X_i = 2X_{i-1} + 1, where X_0 = 1, for i > 0.
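As a quick check, the sequence of safe values of M can be generated directly from this recurrence (a small sketch):

```python
def safe_numbers(count):
    """First `count` terms of 1, 3, 7, 15, 31, ... using X_i = 2*X_{i-1} + 1, X_0 = 1."""
    terms, x = [], 1
    for _ in range(count):
        terms.append(x)
        x = 2 * x + 1  # the recurrence; each term is 2^k - 1 for k = 1, 2, 3, ...
    return terms

print(safe_numbers(6))  # [1, 3, 7, 15, 31, 63]
```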


Now we will formulate a method which determines the number of matchsticks that have to be picked up by the
MAX player. There are two ways of finding this number:
• The first method is to look up the sequence {1, 3, 7, 15, 31, 63, ...} and figure out the closest number
less than the given number N of matchsticks in the pile. The difference between N and that number
gives the desired number of sticks that have to be picked up. For example, if N = 45, the closest number
to 45 in the sequence is 31, so we obtain the desired number of matchsticks to be picked up as 14 on
subtracting 31 from 45. In this case we have to maintain the sequence {1, 3, 7, 15, 31, 63, ...}.
• The second method is a simple one, in which the desired number is obtained by removing the most
significant digit from the binary representation of N and adding it at the least significant digit position.
Consider the same example discussed above, where N = 45. The binary representation of 45 is (101101)₂.
Remove the 1 from the most significant digit position of (101101)₂ and add it at the least significant position,
that is, 001101 + 000001 = 001110 = 14. Thus, 14 matchsticks must be withdrawn to leave a safe position and to
enable the MAX player to force a win. Table 3.2 illustrates the working of the second method for some values of
N to get the number of matchsticks that have to be removed.
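The second method can be sketched directly on the bit pattern of N (a small illustration of the trick just described; the function name is ours):

```python
def sticks_to_remove(n):
    """Number of sticks MAX should pick up from a pile of n to leave a safe position.

    Moving the most significant 1-bit of n to the least significant position is
    the same as computing n - (2**k - 1), where 2**k - 1 is the largest all-ones
    (safe) number below n. When n is itself a safe number, the result exceeds
    the legal maximum of n // 2, reflecting that no safe move exists.
    """
    msb = 1 << (n.bit_length() - 1)  # value of the most significant bit
    return (n - msb) + 1             # drop the MSB, add 1 at the LSB position

print(sticks_to_remove(45))  # 14: leaves 31 sticks, a safe position
print(sticks_to_remove(29))  # 14: leaves 15 sticks
```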


It should be noted that the complete game tree is never generated in order to guess the best path; instead,
depending on the move made by the MIN player, MAX has to apply the above-mentioned strategy and play the
game accordingly. We can clearly formulate two cases where MAX player will always win if the above strategy
is applied. These cases are as follows:
• CASE 1: MAX is the first player and initially there are N ∉ {3, 7, 15, 31, 63, ...} matchsticks.
• CASE 2: MAX is the second player and initially there are N ∈ {3, 7, 15, 31, 63, ...} matchsticks.

Validity of Cases for Winning of MAX Player


Let us show the validity of the cases mentioned above by considering suitable examples.
CASE 1: If MAX is the first player and N ∉ {3, 7, 15, 31, 63, ...}, then MAX will always win. Consider a pile of 29
sticks and let MAX be the first player. The complete game tree for this case is shown in Fig. 3.14. From the figure,
it can be seen that MAX always wins. This case can be validated for any number of sticks ∉ {3, 7, 15, 31, ...}.
Thus, in this case, we can conclude by observing the figure that MAX is bound to win irrespective of how MIN
plays.


CASE 2: If MAX is the second player and N ∈ {3, 7, 15, 31, 63, ...}, then MAX will always win. Consider a pile of 15
sticks and let MAX be the second player. The complete game tree for this case is shown in Fig. 3.15. From the
figure, it can be observed that MAX always wins. This case can be validated for any number of sticks ∈ {3, 7, 15,
31, 63, ...}.

There are two other cases where MAX can force a win if MIN is not playing optimally. These cases are
discussed below with examples. We do not have any clear strategy for these cases except that whenever
possible MAX should leave M matchsticks for MIN to play, where M ∈ {1, 3, 7, 15, 31, 63, ...}.
CASE 3: If MAX is the first player and N ∈ {3, 7, 15, 31, 63, ...} at the root of the game, then MAX can force a win
using the strategy mentioned above in all cases except when MAX gets a number from the sequence {3, 7, 15,
31, 63, ...} at its turn.
Assume that N = 15. Fig. 3.16 shows that MAX wins in all cases except when it gets 7 matchsticks in its turn.


We can easily see from Fig. 3.17 that MAX can even win the game when it gets 7 sticks at its turn, in all cases
except when it gets 3 sticks at its turn.
Therefore, we can conclude that if MAX is playing with M ∈ {3, 7, 15, 31, 63, ...} sticks at any point in the game,
it can win in all cases except when it gets the value 3. Even in these situations, MAX can win if the opponent
plays without any strategy.


CASE 4: If MAX is the second player and N ∉ {3, 7, 15, 31, 63, ...}, then MAX can force a win using the above-
mentioned strategy in all cases except when it gets a number from the sequence {3, 7, 15, 31, 63, ...} at its
turn.
Let us consider an example where N = 29 and let MIN be the first player. Figure 3.18 shows that MAX wins in all
cases except when it gets 15 matchsticks at its turn; in that case, MAX might lose.


Bounded Look-Ahead Strategy and Use of Evaluation Functions


In all the examples discussed in the previous section, complete game trees were generated, and with the help
of the status labelling procedure, the status was propagated up to the root.
The status labelling procedure therefore requires the generation of the complete game tree, or at least a sizable
portion of it. In reality, for most games, the trees of possibilities are too large to be generated and evaluated
backward from the terminal nodes to the root in order to determine the optimal first move.
For example, in the game of Checkers, there are about 10^40 non-terminal nodes, and generating these would
require 10^21 centuries even if 3 billion nodes were generated every second. Similarly, in Chess, about 10^120
non-terminal nodes are generated, which would require 10^101 centuries (Rich & Knight, 2003).
Therefore, this approach of generating complete game trees and then deciding on the optimal first move is not
practical. One may think of looking ahead up to a few levels before deciding the move.
If a player can develop the game tree to a limited extent before deciding on the move, then this shows that the
player is looking ahead; this is called look-ahead strategy.
If a player is looking ahead n number of levels before making a move, then the strategy is called n-move look-
ahead. For example, a look-ahead of 2 levels from the current state of the game means that the game tree is to
be developed up to 2 levels from the current state.
The game may be one-ply (depth one), two-ply (depth two), and so on. In this strategy, the actual value of a
terminal state is unknown since we are not doing an exhaustive search. Hence, we need to make use of an
evaluation function.

Using Evaluation Functions


A game position is evaluated on the basis of the structural features of the current state. The steps
involved in the evaluation procedure are as follows:
• The first step is to decide which features are of value in a particular game.
• The next step is to provide each feature with a range of possible values.
• The last step is to devise a set of weights in order to combine all the feature values into a single value.
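The three steps above amount to a weighted linear combination of feature values. A minimal sketch (the feature names, values, and weights below are purely illustrative):

```python
def evaluate(features, weights):
    # Combine all feature values into a single score via a weighted sum.
    return sum(weights[name] * value for name, value in features.items())

# Hypothetical features of some board state, each already scaled to its range.
features = {"material": 3, "mobility": 5, "king_safety": -1}
weights  = {"material": 1.0, "mobility": 0.5, "king_safety": 2.0}
print(evaluate(features, weights))  # 3*1.0 + 5*0.5 + (-1)*2.0 = 3.5
```

A positive result favours MAX and a negative one favours MIN, per the sign convention described in the text.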
Evaluation functions represent estimates of a given situation in the game rather than accurate calculations. This
function provides numerical assessment of how favourable the game state is for MAX.
We can use a convention in which a positive number indicates a good position for MAX while a negative number
indicates a bad position.
The general strategy for MAX is to play in such a manner that it maximizes its winning chances, while
simultaneously minimizing the chances of the opponent.
The heuristic values of the nodes are determined at some level and then the status is accordingly propagated
up to the root. The node which offers the best path is then chosen to make a move.
For the sake of convenience, let us assume the root node to be a MAX node. Consider the one-ply and two-ply
games shown in Fig. 3.19; the score of leaf nodes is assumed to be calculated using evaluation functions. The
values at the nodes are backed up to the starting position.

The procedure through which the scoring information travels up the game tree is called the MINIMAX
procedure. This procedure is a recursive algorithm for choosing the next move in a two-player game.
In this, a value is associated with each position or state of the game; the value is computed using an evaluation
function and it denotes the extent to which it would be favourable for a player to reach that position.
The player is then required to make a move which maximizes the minimum value of the position resulting from
the opponent's possible following moves. The MINIMAX procedure evaluates each leaf node (up to some fixed
depth) using a heuristic evaluation function and obtains the values corresponding to the state.
By convention of this algorithm, the moves which lead to a win of the MAX player are assigned a positive
number, while the moves that lead to a win of the MIN player are assigned a negative number. MINIMAX
procedure is a depth-first, depth-limited search procedure.
For a two-player, perfect-information game, the MINIMAX procedure can solve the problem provided there are
sufficient computational resources. This procedure assumes that each player takes the best option at
each step.
MINIMAX procedure starts from the leaves of the tree (which contain the final scores with respect to the MAX
player) and then proceeds upwards towards the root. In the following section, we will describe MINIMAX
procedure in detail.

MINIMAX Procedure
Lack of sufficient computational resources prevents the generation of a complete game tree; hence, the search
depth is restricted to a constant.
The estimated scores generated by a heuristic evaluation function for leaf nodes are propagated to the root
using MINIMAX procedure, which is a recursive algorithm where a player tries to maximize its chances of a win
while simultaneously minimizing that of the opponent.
The player hoping to achieve a positive number is called the maximizing player, while the opponent is called the
minimizing player.

At each move, the MAX player will try to take a path that leads to a large positive number; on the other hand,
the opponent will try to force the game towards situations with strongly negative static evaluations.
Figure 3.20 shows a hypothetical game tree in which the leaf nodes carry heuristic values and the internal
nodes carry the backed-up values. This game tree is generated using the MINIMAX procedure up to a depth
of three. At a MAX level, the maximum value of the successor nodes is assigned, whereas at a MIN level, the
minimum value of the successor nodes is assigned.
In the example considered above, the MAX node moves to the state that has a score of 5. After this, MIN will get
a chance to play a move. Whenever MAX gets a chance to play, it will generate a game tree of depth 3 from the
state produced by the MIN player in order to decide its next move.
The process will continue till the game ends with, hopefully, the MAX player winning. The algorithmic steps of a
MINIMAX procedure can be written in the following manner (Rich & Knight, 2003):

MINIMAX Procedure

The algorithmic steps of this procedure may be written as follows:

• Keep on generating the search tree till the limit, say depth d of the tree, has been reached from the current
position.

• Compute the static value of the leaf nodes at depth d from the current position of the game tree using evaluation
function.

• Propagate the values till the current position on the basis of the MINIMAX strategy.

MINIMAX Strategy
The steps in the MINIMAX strategy are written as follows:
• If the level is minimizing level (level reached during the minimizer's turn), then
• Generate the successors of the current position
• Apply MINIMAX to each of the successors
• Return the minimum of the results
• If the level is a maximizing level (level reached during the maximizer's turn), then
• Generate the successors of current position
• Apply MINIMAX to each of these successors
• Return the maximum of the results
Algorithm 3.3 makes use of the following functions:
• GEN (Pos): This function generates a list of SUCCs (successors) of the Pos, where Pos represents a
variable corresponding to position.
• EVAL (Pos, Player): This function returns a number representing the goodness of Pos for the player from
the current position.
• DEPTH (Pos, Depth): It is a Boolean function that returns true if the search has reached the maximum
depth from the current position, else it returns false.

The MINIMAX function returns a structure consisting of Val field containing heuristic value of the current state
obtained by EVAL function and Path field containing the entire path from the current state. This path is
constructed backwards starting from the last element to the first element because of recursion.
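A minimal Python sketch of this structure, with GEN, EVAL, and DEPTH stood in for by a hand-coded successor table, a leaf-score table, and a leaf test (the tree and its values are hypothetical):

```python
# Hypothetical game tree: positions map to their successors; leaves carry scores.
SUCCS = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
SCORE = {"D": 3, "E": 5, "F": 6, "G": 1}

def minimax(pos, maximizing):
    # Returns the (Val, Path) structure: backed-up value and line of play from pos.
    if pos not in SUCCS:              # DEPTH test: a leaf has been reached
        return SCORE[pos], [pos]      # EVAL applied at the leaf
    results = [minimax(s, not maximizing) for s in SUCCS[pos]]
    pick = max if maximizing else min
    val, path = pick(results, key=lambda r: r[0])
    return val, [pos] + path          # the path is built backwards as recursion unwinds

print(minimax("A", True))  # (3, ['A', 'B', 'D'])
```

Note how each recursive return prepends the current position, so the full path emerges last-to-first, exactly as described above.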

Let us consider the following Tic-Tac-Toe example to illustrate the use of static evaluation function and
MINIMAX algorithm.

Tic-Tac-Toe Game
Tic-tac-toe is a two-player game in which the players take turns marking the spaces in a 3 x 3 grid. One
player uses the symbol 'o' and the other uses 'x'. The player who succeeds in placing three of their symbols
in a horizontal, vertical, or diagonal row wins the game.
Let us define the static evaluation function f to a position P of the grid (board) as follows:
• If P is a win for MAX, then
f(P) = n (where n is a very large positive number)
• If P is a win for MIN, then
f(P) = -n
• If P is not a winning position for either player, then
f(P) = (total number of rows, columns, and diagonals that are still open for MAX)
- (total number of rows, columns, and diagonals that are still open for MIN)
Consider the symbol x for MAX and the symbol o for MIN and the board position P at some given point in time
(Fig. 3.21). Assume that MAX starts the game. After MIN has played, it is the turn of MAX at board position P.

Tic-Tac-Toe (Example)
Grey lines in Fig. 3.21 represent the positions that are open for x (MAX) and dotted grey lines represent those
open for o (MIN). We notice that both the diagonals, the last two rows, and the first and third columns are
still open for the MAX player (i.e., 6 lines are available for the symbol x). On the other hand, the first and third
rows and columns are open for MIN (i.e., 4 lines are available for the symbol o). Thus,

 Total number of rows, columns, and diagonals still open for MAX (thick lines) = 6
 Total number of rows, columns, and diagonals still open for MIN (dotted lines) = 4
f(P) = (total number of rows, columns, and diagonals that are still open for MAX) - (total number of rows,
columns, and diagonals that are still open for MIN) = 2
Therefore, the board position P has been evaluated by static evaluation function and assigned the value 2.
Similarly, all the board positions after MIN player has played are evaluated and the move by MAX player to the
best-scored board position is made. Using the fact that MINIMAX algorithm is a depth-first process, we can
improve its efficiency by using a dynamic branch-and-bound technique; in this technique, partial solutions that
appear to be clearly worse than known solutions are abandoned.
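The evaluation of P can be checked in code. Figure 3.21 is not reproduced here, so the board below is an assumption consistent with the counts in the text: x at the centre and o in the top-middle square.

```python
# 3 x 3 board: 'x' for MAX, 'o' for MIN, '.' for an empty square.
P = [list(".o."),
     list(".x."),
     list("...")]

def lines(board):
    # All 8 lines of the grid: 3 rows, 3 columns, 2 diagonals.
    rows  = [board[i] for i in range(3)]
    cols  = [[board[i][j] for i in range(3)] for j in range(3)]
    diags = [[board[i][i] for i in range(3)],
             [board[i][2 - i] for i in range(3)]]
    return rows + cols + diags

def f(board):
    # Lines still open for MAX minus lines still open for MIN.
    open_max = sum(1 for ln in lines(board) if "o" not in ln)
    open_min = sum(1 for ln in lines(board) if "x" not in ln)
    return open_max - open_min

print(f(P))  # 6 - 4 = 2
```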

Alpha-Beta Pruning
The strategy used to reduce the number of tree branches explored and the number of static evaluations applied
is known as alpha-beta pruning. This procedure is also called backward pruning, and is a modified depth-first
generation procedure. The purpose of applying this procedure is to reduce the amount of work done in
generating useless nodes (nodes that do not affect the outcome) and is based on common sense or basic logic.
The alpha-beta pruning procedure requires the maintenance of two threshold values: one representing a lower
bound (α) on the value that a maximizing node may ultimately be assigned (we call this alpha) and another
representing upper bound (β) on the value that a minimizing node may be assigned (we call it beta).
Each MAX node has an alpha value, which never decreases and each MIN node has a beta value, which never
increases. These values are set and updated when the value of a successor node is obtained. The search is
depth-first and stops at any MIN node whose beta value is smaller than or equal to the alpha value of its parent,
as well as at any MAX node whose alpha value is greater than or equal to the beta value of its parent.

Let us consider the systematic development of a game tree and propagation of α and β values using alpha-beta
(α-β) pruning algorithm up to second level stepwise in depth-first order. In Fig. 3.22, the MAX player expands
root node A to B and suppose MIN player expands B to D.
Assume that the evaluation function generates the value 2 for state D. At this point, the upper bound at
state B is β = 2, shown as ≤ 2.
After the first step, we have to backtrack and generate another state E from B in the second step, as shown in
Fig. 3.23. State E gets the value 7, and since there is no further successor of B (assumed), the β value at state B
becomes equal to 2. Once the β value is fixed at the MIN level, the lower bound α = 2 gets propagated to state A as
≥ 2.
In the third step, expand A to another successor C, and then expand C's successor F, which gets the value 1. From Fig. 3.24
we note that the β value at state C is ≤ 1, while the value of root A cannot be less than 2; the path from A through
C is therefore not useful, and further expansion of C is pruned. Thus, there is no need to explore the right side
of the tree fully, as that result is not going to alter the move decision. Since there are no further successors of A
(assumed), the value of the root is fixed as 2, that is, α = 2.
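The stepwise trace above can be reproduced in code. The tree mirrors Figs 3.22-3.24; a second successor G of C (with an arbitrary score) is assumed in order to show that it is never evaluated:

```python
SUCCS = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
SCORE = {"D": 2, "E": 7, "F": 1, "G": 9}   # G's score should never be read
evaluated = []                             # records the order of static evaluations

def alphabeta(pos, alpha, beta, maximizing):
    if pos not in SUCCS:                   # leaf: apply the static evaluation
        evaluated.append(pos)
        return SCORE[pos]
    if maximizing:
        for s in SUCCS[pos]:
            alpha = max(alpha, alphabeta(s, alpha, beta, False))
            if alpha >= beta:              # cut-off below a MAX node
                break
        return alpha
    for s in SUCCS[pos]:
        beta = min(beta, alphabeta(s, alpha, beta, True))
        if beta <= alpha:                  # cut-off below a MIN node
            break
    return beta

root = alphabeta("A", float("-inf"), float("inf"), True)
print(root, evaluated)   # 2 ['D', 'E', 'F'] -- G is pruned
```

After F is scored, β at C is 1 ≤ α = 2, so the loop over C's successors stops and G is pruned, exactly as in the text.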

The complete diagram of game tree generation using (α-β) pruning algorithm is shown in Fig. 3.25 as follows:

Let us consider an example of a game tree of depth 3 and branching factor 3 (Fig. 3.26). If the full game tree of
depth 3 is generated then there are 27 leaf nodes for which static evaluation needs to be done. On the other
hand, if we apply the α-β pruning, then only 16 static evaluations need to be made.

Let us write the MINIMAX algorithm using α-β pruning concept (Algorithm 3.4). We notice that at the
maximizing level, we use β to determine whether the search is cut-off, while at the minimizing level, we use α
to prune the search.
Therefore, the values of α and β must be known at maximizing or minimizing levels so that they can be passed
to the next levels in the tree. Thus, each level should have both values: one to use and the other to pass to the
next level. This procedure will therefore simply negate these values at each level.
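This negation scheme is commonly known as the negamax formulation: one routine serves both levels by negating the returned value and passing the window down as (-β, -α). A sketch on a small hypothetical tree:

```python
SUCCS = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
SCORE = {"D": 2, "E": 7, "F": 1, "G": 9}   # leaf scores from MAX's point of view

def negamax(pos, alpha, beta, colour):
    # colour is +1 on MAX's turn and -1 on MIN's turn.
    if pos not in SUCCS:
        return colour * SCORE[pos]
    best = float("-inf")
    for s in SUCCS[pos]:
        # Negate the child's value; swap and negate the (alpha, beta) window.
        best = max(best, -negamax(s, -beta, -alpha, -colour))
        alpha = max(alpha, best)
        if alpha >= beta:                  # the same cut-off test serves both levels
            break
    return best

print(negamax("A", float("-inf"), float("inf"), 1))  # 2
```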
The effectiveness of α-β pruning procedure depends greatly on the order in which the paths are examined. If
the worst paths are examined first, then there will be no cut-offs at all. So, the best possible paths should be
examined first, in case they are known in advance.
It has been shown by researchers that if the nodes are perfectly ordered, then the number of terminal nodes
considered by a search to depth d using α-β pruning is approximately twice the number of terminal nodes
considered by a search to depth d/2 without it. This effective doubling of the search depth is a significant gain.
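As a rough illustration of this result, with branching factor b = 3 and depth d = 6 (the figures below just evaluate the two expressions from the text):

```python
b, d = 3, 6
full = b ** d                  # leaf nodes evaluated without pruning
ordered = 2 * b ** (d // 2)    # approximate best-case figure with perfect ordering
print(full, ordered)           # 729 54
```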

Refinements to α-β Pruning


In addition to α-β pruning, a number of modifications can be made to the MINIMAX procedure in order to
improve its performance. One of the important factors that need to be considered during a search is when to
stop going deeper in the search tree. Further, the idea behind α-β pruning procedure can be extended by cutting
off additional paths that appear to be slight improvements over paths that have already been explored (Rich &
Knight, 2003).

Pruning of Slightly Better Paths


In Fig. 3.27, we see that the value 4.2 is only slightly better than 4, so we may terminate further exploration of
node C. Terminating exploration of a sub-tree that offers little possibility for improvement over known paths is
called futility cut off.

Waiting for Quiescence


Consider a one-level deep game tree shown in Fig. 3.28.

If node B is extended to one more level, we obtain a tree as shown in Fig. 3.29.

From Fig. 3.29, we can see that our estimate of the worth of node B has changed. We could stop exploring the
tree at this level, assign the value –2 to B, and therefore decide that B is not a good move. To ensure that such
short-term measures do not unduly influence our choice of moves, we should continue the search until no such
drastic change occurs from one level to the next, that is, until the situation is stable. This is called waiting
for quiescence (Fig. 3.30).

Secondary Search
To provide a double check, the game tree is explored to an average depth and, on the basis of that, a particular
move is chosen. The chosen branch is then further expanded up to two more levels to make sure that it still
looks good. This technique is called secondary search.

Alternative to α-β Pruning MINIMAX Procedure


The MINIMAX procedure still has some problems even with all the refinements discussed above. It is based on
the assumption that the opponent will always choose an optimal move. In a winning situation, this assumption
is acceptable but in a losing situation, one may try other options and gain some benefit in case the opponent
makes a mistake (Rich & Knight, 2003).

Suppose, we have to choose one move out of two possible moves, both of which may lead to bad situations for
us if the opponent plays perfectly. MINIMAX procedure will always choose the bad move out of the two;
however, here we can choose an option which is slightly less bad than the other. This is based on the assumption
that the less bad move could lead to a good situation for us if the opponent makes a single mistake. Similarly,
there might be a situation when one move appears to be only slightly more advantageous than the other. Then,
it might be better to choose the less advantageous move. To implement such systems, we should have a model
of individual playing styles of opponents.

Iterative Deepening
Rather than searching to a fixed depth in a given game tree, it is advisable to first search only one-ply, then
apply MINIMAX to two-ply, then three-ply, and so on, until the final goal state is reached (CHESS 5 uses this
procedure). There is a good reason why iterative deepening is popular in games such as chess.
In competitions, there is an average amount of time allowed per move. The idea that enables us to conquer this
constraint is to do as much look-ahead as can be done in the available time. If we use iterative deepening, we
can keep on increasing the look-ahead depth until we run out of time. We can arrange to have a record of the
best move for a given look-ahead even if we have to interrupt our attempt to go one level deeper. This could
not be done using (unbounded) depth-first search. With effective ordering, the α-β pruning MINIMAX algorithm
can prune many branches and the total search time can be decreased.
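The time-bounded loop described above can be sketched as follows. The game itself is abstracted into a small hypothetical tree, and the time budget and function names are illustrative:

```python
import time

# Toy tree so the loop is runnable; a real program would search actual game states.
SUCCS = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
SCORE = {"A": 0, "B": 4, "C": 3, "D": 2, "E": 7, "F": 1, "G": 9}

def minimax(pos, depth, maximizing):
    if depth == 0 or pos not in SUCCS:
        return SCORE[pos]
    vals = [minimax(s, depth - 1, not maximizing) for s in SUCCS[pos]]
    return max(vals) if maximizing else min(vals)

def best_move(budget_seconds=0.1, max_depth=10):
    # Deepen one ply at a time, keeping the best move of the last finished depth.
    deadline = time.monotonic() + budget_seconds
    best = None
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break                          # out of time: fall back on the record
        best = max(SUCCS["A"], key=lambda m: minimax(m, depth - 1, False))
    return best

print(best_move())
```

A real implementation would also abort the search that is in progress when time runs out; here the deadline is checked only between depths for brevity.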

Two-Player Perfect Information Games


Even though a number of approaches and methods have been discussed in this chapter, it is still difficult to
develop programs that can play difficult games well. This is because every game requires thorough
analysis and a careful combination of search and knowledge. AI researchers have developed programs for various
games. Some of them are as follows:

Chess
The first two chess programs were proposed by Greenblatt et al. (1967) and Newell & Simon (1972). Chess is
a competitive two-player game played on a chequered board of 64 squares arranged in an 8 x 8 grid. Each
player is given sixteen pieces of the same colour (black or white): one king, one queen, two rooks, two knights,
two bishops, and eight pawns. Each of these pieces moves in a unique manner.
The player who chooses the white pieces gets the first turn. The objective of the game is to checkmate the
opponent's king; the player who achieves this wins. The players move alternately, one piece at a time. Pieces
may be moved to either an unoccupied square or a square occupied by an opponent's piece; in the latter case,
the opponent's piece is captured and removed from the game. Checkmate occurs when the opponent's king is
under immediate attack and there is no way to save it. Players must avoid making moves that place their own
king under direct threat (check).

Checkers
The Checkers program was first developed by Arthur Samuel (1959, 1967); it had a learning component to improve
its performance with experience. Checkers (or draughts) is a two-player game played on a chequered
8 x 8 square board. Each player gets 12 pieces of the same colour (dark or light), which are placed on the dark
squares of the three rows closest to that player. The row of the board farthest from a player is called that
player's king row. All pieces start as men; a man that reaches its king row is crowned and becomes a king. Kings
can move diagonally forward as well as backward, whereas men may move only diagonally forward.

A player can remove the opponent's pieces from the game by diagonally jumping over them. A man that ends its
move on the opponent's back row is crowned and becomes a king. The objective of the game is to remove all of
the opponent's pieces from the board, or to leave the opponent in a situation where no legal moves
remain.

Othello
Othello (also known as Reversi) is a two-player board game which is played on an 8 x 8 square grid with pieces
that have two distinct bi-coloured sides. The pieces typically are shaped as coins, but each possesses a light and
a dark face, each face representing one player. The objective of the game is to make your pieces constitute a
majority of the pieces on the board at the end of the game, by turning over as many of your opponent's pieces
as possible. Advanced computer programs for Othello were developed by Rosenbloom in 1982 and
subsequently by Lee & Mahajan in 1990, eventually reaching world-championship level.

GO
It is a strategic two-player board game in which the players play alternately by placing black and white stones
on the vacant intersections of a 19 x 19 board. The object of the game is to control a larger part of the board
than the opponent. To achieve this, players try to place their stones in such a manner that they cannot be
captured by the opposing player. Placing stones close to each other helps them support one another and avoid
capture. On the other hand, placing them far apart creates an influence across a larger part of the board. It is a
strategy that enables players to play a defensive as well as an offensive game and choose between tactical
urgency and strategic planning. A stone or a group of stones is captured and removed if it has no empty adjacent
intersections, that is, when it is completely surrounded by stones of the opposing colour. The game is declared
over and the score is counted when both players consecutively pass on a turn, indicating that neither side can
increase its own territory or reduce that of its opponent.

Backgammon
It is also a two-player board game in which the playing pieces are moved using dice. A player wins by removing
all of his pieces from the board. Although luck plays an important role, there is a large scope for strategy. With
each roll of the dice a player must choose from numerous options for moving his checkers and anticipate the
possible counter-moves by the opponent. Players may raise the stakes during the game. Backgammon has been
studied with great interest by computer scientists. Similar to chess, advanced backgammon software has been
developed which is capable of beating world-class human players. Backgammon programs with high level of
competence were developed by Berliner in 1980 and by Tesauro & Sejnowski in 1989.
