AI Unit 3
Game theory is a branch of decision theory in which one's choice of action is determined
after taking into account all possible alternatives available to an opponent playing the
same game, rather than just the possible outcomes.
Game theory does not dictate how a game should be played, but describes the procedure and
principles by which actions should be selected.
Working of the Minimax Algorithm
Step 1: In the first step, the algorithm generates the entire game tree and applies the utility
function to obtain the utility values for the terminal states. In the tree diagram below, let A
be the initial state of the tree. Suppose the maximizer takes the first turn, which has a
worst-case initial value of -∞, and the minimizer takes the next turn, which has a worst-case
initial value of +∞.
Step 2: Now, we first find the utility values for the Maximizer. Its initial value is -∞, so we
compare each terminal-state value with the Maximizer's initial value and determine the
values of the higher nodes. It takes the maximum among them all.
Step 3: In the next step, it is the Minimizer's turn, so it compares each node's value of +∞
with the values of its children and takes the minimum to determine the values of the
next-higher layer.
Step 4: Now it is the Maximizer's turn again, and it chooses the maximum of all node values,
giving the maximum value for the root node. In this game tree there are only 4 layers, so we
reach the root node immediately, but in real games there will be many more layers.
That was the complete workflow of the minimax two-player game.
def minimax(node, depth, maximizingPlayer):
    if depth == 0 or isTerminal(node):
        return utility(node)
    if maximizingPlayer:
        bestValue = -infinity
        for child in children(node):
            v = minimax(child, depth - 1, False)
            bestValue = max(bestValue, v)
        return bestValue
    else:
        bestValue = +infinity
        for child in children(node):
            v = minimax(child, depth - 1, True)
            bestValue = min(bestValue, v)
        return bestValue
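As a concrete illustration, here is a minimal runnable sketch of the pseudocode above. The tree encoding is an assumption chosen for brevity: a leaf is a number (its utility) and an internal node is a list of its children, so no explicit depth parameter is needed.

```python
# Minimax over an explicit game tree: a leaf is a utility value,
# an internal node is a list of child nodes.
def minimax(node, maximizing_player):
    if isinstance(node, (int, float)):   # terminal state: return its utility
        return node
    if maximizing_player:
        return max(minimax(child, False) for child in node)
    else:
        return min(minimax(child, True) for child in node)

# A 4-layer tree: root (Max) -> Min nodes -> Max nodes -> terminal utilities.
tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
print(minimax(tree, True))   # optimal value for the maximizer
```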
The main drawback of the minimax algorithm is that it becomes very slow for complex games
such as chess and Go. These games have a huge branching factor, and the player has many
choices to consider. This limitation of the minimax algorithm can be alleviated by
alpha-beta pruning.
a. Alpha: The best (highest-value) choice we have found so far at any point along
the path of Maximizer. The initial value of alpha is -∞.
b. Beta: The best (lowest-value) choice we have found so far at any point along
the path of Minimizer. The initial value of beta is +∞.
Alpha-beta pruning applied to a standard minimax algorithm returns the same move as
the standard algorithm does, but it prunes all the nodes that do not really affect the
final decision yet make the algorithm slow. By pruning these nodes, it makes the
algorithm fast.
The condition required for pruning is: α ≥ β
Let's take an example of a two-player search tree to understand the working of alpha-beta
pruning.
Step 1: In the first step, the Max player makes the first move from node A, where α = -∞ and
β = +∞. These values of alpha and beta are passed down to node B, where again α = -∞ and
β = +∞, and node B passes the same values to its child D.
Step 2: At node D, the value of α is calculated, as it is Max's turn. The value of α is
compared first with 2 and then with 3, and max(2, 3) = 3 becomes the value of α at node D;
the node value is also 3.
Step 3: The algorithm now backtracks to node B, where the value of β changes, as it is Min's
turn. β = +∞ is compared with the available successor node values, i.e. min(∞, 3) = 3;
hence at node B, α = -∞ and β = 3.
In the next step, the algorithm traverses the next successor of node B, which is node E, and
the values α = -∞ and β = 3 are passed down as well.
Step 4: At node E, Max takes its turn, and the value of alpha changes. The current value of
alpha is compared with 5, so max(-∞, 5) = 5; hence at node E, α = 5 and β = 3. Since α ≥ β,
the right successor of E is pruned, the algorithm does not traverse it, and the value at
node E is 5.
Step 5: Next, the algorithm again backtracks the tree, from node B to node A. At node A, the
value of alpha changes to the maximum available value, which is 3, as max(-∞, 3) = 3, with
β = +∞. These two values are now passed to the right successor of A, which is node C.
At node C, α=3 and β= +∞, and the same values will be passed on to node F.
Step 6: At node F, the value of α is again compared with the left child, which is 0:
max(3, 0) = 3. It is then compared with the right child, which is 1: max(3, 1) = 3, so α
remains 3, but the node value of F becomes 1.
Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; here the value of
beta changes: it is compared with 1, so min(∞, 1) = 1. Now at C, α = 3 and β = 1, and again
the condition α ≥ β is satisfied, so the next child of C, which is G, is pruned, and the
algorithm does not compute the entire subtree of G.
Step 8: C now returns the value 1 to A, where the best value for A is max(3, 1) = 3.
The final game tree shows which nodes were computed and which were never computed.
Hence, the optimal value for the maximizer is 3 for this example.
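The walkthrough above can be sketched in code. The nested-list tree is a hypothetical stand-in for the example (D = [2, 3], E = [5, ...], F = [0, 1]); the leaf values under the pruned branches are invented, since the text never evaluates them. The `visited` list records which terminal nodes are actually evaluated, so the pruning of E's right child and of subtree G can be observed.

```python
import math

def alphabeta(node, alpha, beta, maximizing, visited):
    if isinstance(node, (int, float)):   # terminal: record and return utility
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:            # prune the remaining siblings
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value

# Same shape as the walkthrough: A -> B, C; B -> D, E; C -> F, G
tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
visited = []
best = alphabeta(tree, -math.inf, math.inf, True, visited)
```

Running this gives `best == 3`, and `visited` contains only the leaves 2, 3, 5, 0, 1: the 9 under E and the whole subtree G are never evaluated, exactly as in steps 4 and 7.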
MONTE CARLO TREE SEARCH DEFINITION
Monte Carlo tree search is a heuristic search algorithm that relies on intelligent tree search to
make decisions. It’s most often used to perform game simulations, but it can also be utilized
in cyber security, robotics and text generation.
Monte Carlo tree search is a method that relies on intelligent tree search that balances
exploration and exploitation. It performs random sampling in the form of simulations and
stores the statistics of actions to make more educated choices in each subsequent iteration.
Monte Carlo tree search only searches a few layers deep into the tree and prioritizes which
parts of the tree to explore. It then simulates the outcome rather than exhaustively expanding
the search space. In doing so, it limits how many evaluations it has to make.
The individual evaluation relies on a playout/simulation in which the algorithm effectively
plays the game from a given starting point all the way to a leaf state by making completely
random decisions; it then records the result, which is used to update all the nodes along the
random path back up to the root. When the simulations are complete, it selects the state
that has the best rollout score.
The Monte Carlo tree search algorithm has four phases. We store a state value and a visit
count for each node.
1. SELECTION
In this phase, the algorithm uses the following formula to calculate the state value of all the
possible next moves and pick the one which gives the maximum value.
S(i) = vi + C * sqrt(ln N / ni)
The first term, vi, is the exploitation term. The second term, C * sqrt(ln N / ni), is the
exploration term, where N is the number of visits to the parent node and ni is the number of
visits to node i. The algorithm considers both kinds of nodes, one with a high state value
and one that is relatively unexplored, when making a selection. The constant C controls the
weighting between exploitation and exploration.
At the beginning when no node is explored, it makes a random selection because there is no
data available to make a more educated selection.
When a node is unexplored, i.e. when ni = 0, the second term becomes ∞ and thus attains the
maximum possible value, so the node automatically becomes a candidate for selection. Thus,
the equation makes sure all children get selected at least once.
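The selection rule can be sketched directly from the formula. The child statistics below are invented for illustration; `v` is a node's average rollout value and `n` its visit count.

```python
import math

def ucb1(node_value, node_visits, parent_visits, c=math.sqrt(2)):
    """Selection score: exploitation term v_i plus exploration term."""
    if node_visits == 0:
        return math.inf            # unvisited children are selected first
    return node_value + c * math.sqrt(math.log(parent_visits) / node_visits)

children = [
    {"name": "a", "v": 0.6, "n": 10},
    {"name": "b", "v": 0.4, "n": 3},
    {"name": "c", "v": 0.0, "n": 0},   # unexplored
]
N = sum(ch["n"] for ch in children)
best = max(children, key=lambda ch: ucb1(ch["v"], ch["n"], N))
```

The unexplored child `c` wins the selection (its score is ∞). Once every child has been visited, the score trades off a high average value against a low visit count: here the rarely visited `b` outranks the better-scoring but well-explored `a`.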
2. EXPANSION
In this phase, we expand from one node and start looking one level deeper. The node we
expanded from becomes the parent (current state), and its children become the possible next
moves.
3. SIMULATION
In this phase, we simulate the game from the selected child node in phase one and continue
the game by making random choices until we reach an end state, i.e. a win, lose or draw.
Let's assign the following values to these results/outcomes:
Win = +1
Lose = -1
Draw = 0
4. BACKPROPAGATION
In this phase, we backpropagate the result found in the simulation phase, updating all the
nodes along the random path we traversed, up to the root node. This updates the value v(i),
which is then used by the formula in the selection phase.
There are several advantages to using a Monte Carlo tree search, including:
1. Domain agnostic.
2. The ability to halt it at any time.
3. Asymmetric tree growth.
1. DOMAIN AGNOSTIC
Monte Carlo tree search doesn’t require any strategic or tactical knowledge about the given
domain to make reasonable decisions. The algorithm can function effectively with no
knowledge of a game, apart from its legal moves and end conditions. This means that a single
Monte Carlo tree search implementation can be reused for a number of games with little
modification.
2. ANYTIME ALGORITHM
The algorithm can be halted at any time to return the current best estimate. The search tree
built thus far may be discarded or preserved for future reusability.
3. ASYMMETRIC
Monte Carlo tree search performs asymmetric tree growth that adapts to the topology of the
search space. The algorithm visits more interesting nodes more often and focuses its search
time in more relevant parts of the tree.
This makes the Monte Carlo tree search suitable for games with large branching factors, such
as 19x19 Go. Such large combinatorial spaces typically cause problems for standard depth-
or breadth-based search methods, but the adaptive nature of Monte Carlo tree search means
that it will eventually find optimal moves and focus its search effort there.
DISADVANTAGES
1. MEMORY REQUIREMENT
Because the tree grows rapidly after only a few iterations, the algorithm requires a huge
amount of memory.
2. RELIABILITY
There is a reliability issue with Monte Carlo tree search. In certain turn-based game
scenarios, a single branch or path may lead to a loss against the opponent, and the search
can overlook it because its random sampling rarely reaches that line.
3. ITERATIONS
Monte Carlo tree search algorithm needs a huge number of iterations to be able to effectively
decide the most efficient path. So, there is a bit of a speed issue there.
APPLICATIONS OF MONTE CARLO TREE SEARCH
1. GAME SIMULATION
It’s used in two-player board games like tic-tac-toe, Chess and Go.
2. SECURITY
Malware is one of the biggest threats in IT security, with millions of malicious applications
released every year at an ever-growing rate. Active malware analysis is an approach that
focuses on acquiring knowledge about dangerous software by executing actions that trigger a
response in the malware. It uses Monte Carlo tree search.
3. ROBOTICS
Mobile robots hold great promise in reducing the need for humans to perform jobs such as
vacuuming, seeding, harvesting, painting, search and rescue, and inspection. Many
multi-robot online coverage path-planning algorithms have been developed, and a Monte Carlo
tree search planner is now being used because of its efficient completion time.
4. TEXT GENERATION
Monte Carlo Tree search is used in simulation-based natural language generation that
accounts for both building a correct syntactic structure and reflecting the given situational
information as input for the generated sentence. For this nontrivial search problem, the
Monte Carlo tree search simulation uses context-free grammar rules as search operators.
Constraint Satisfaction Problem (CSP)
The goal of a CSP is to find an assignment of values to the variables that satisfies all the
constraints. This assignment is called a solution to the CSP.
Unary Constraints :
A unary constraint is a constraint on a single variable. For example, Variable A not
equal to “Red”.
Binary Constraints :
A binary constraint involves two variables and specifies a constraint on their values.
For example, a constraint that two tasks cannot be scheduled at the same time would
be a binary constraint.
Global Constraints :
Global constraints involve more than two variables and specify complex relationships
between them. For example, a constraint that no two tasks can be scheduled at the
same time if they require the same resource would be a global constraint.
Finite domains have a finite number of possible values, such as colors or integers.
Infinite domains have an infinite number of possible values, such as real numbers.
Continuous domains have an infinite number of possible values, but they can be represented
by a finite set of parameters, such as the coefficients of a polynomial function. In
mathematics, a continuous domain is a set of values that can be described as a continuous
range of real numbers. This means that there are no gaps or interruptions in the values
between any two points in the set.
On the other hand, an infinite domain refers to a set of values that extends indefinitely in one
or more directions. It may or may not be continuous, depending on the specific context.
To express this problem mathematically, one needs a variable set, a domain set, and a
constraint set. These can be defined as follows -
The variable set is V = {A, B, C, D, E}
The domain set is D = {Blue, Orange, Brown}
The constraint is that no two adjacent areas can have the same color. Thus, Const = {A ≠ B,
A ≠ C, B ≠ C, B ≠ D, C ≠ D, D ≠ E}
Visually, it's clear that A, B, and C need different colors. However, A and D, as well as
A and E, can share a color, while D and E need different colors.
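This exact instance can be solved by backtracking search. The sketch below is illustrative: the variable ordering and the `consistent`/`backtrack` helper names are arbitrary choices, not part of the problem statement.

```python
# Map-coloring CSP from the example: V = {A..E}, three colors,
# adjacent regions must differ.
variables = ["A", "B", "C", "D", "E"]
domain = ["Blue", "Orange", "Brown"]
constraints = [("A", "B"), ("A", "C"), ("B", "C"),
               ("B", "D"), ("C", "D"), ("D", "E")]

def consistent(assignment):
    # every constraint whose two endpoints are assigned must hold
    return all(assignment[x] != assignment[y]
               for x, y in constraints
               if x in assignment and y in assignment)

def backtrack(assignment):
    if len(assignment) == len(variables):
        return dict(assignment)              # complete, consistent assignment
    var = next(v for v in variables if v not in assignment)
    for value in domain:
        assignment[var] = value
        if consistent(assignment):
            result = backtrack(assignment)
            if result:
                return result
        del assignment[var]                  # undo and try the next value
    return None

solution = backtrack({})
```

Any returned assignment gives A, B, and C three distinct colors while D and E may reuse colors from {A, B, C}, matching the observation above.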
The n-queens problem, in which n queens are placed on an n×n chessboard such that no
two queens can attack each other. This can be solved through backtracking: placing
one queen on the board at a time and checking to ensure that there are no conflicts.
Crossword puzzles, in which only words that meet the constraints solve the puzzle.
Backtracking is also a workable approach here.
Sudoku, as mentioned above, which puts constraints on the numbers that solve the
puzzle. Experts recommend filling in the rows and columns that are already the most
full, because starting with the easy problems helps make it easier to solve harder
problems.
Map coloring problems, like the one mentioned in the example. Backtracking is a
good approach here as well.
Converting Process
To be converted to a CSP, a problem must be broken down into a set of variables, a domain
of discrete values, and constraints. Then, it can be solved as a CSP problem.
CONSTRAINT PROPAGATION
Constraint propagation is the process of using the constraints to reduce the domain of
possible values for each variable, and to infer new constraints from the existing ones. For
example, if you have a variable X that can take values from 1 to 10, and a constraint that X
must be even, then you can reduce the domain of X to 2, 4, 6, 8, and 10. Similarly, if you
have a constraint that X + Y = 12, and you know that X = 4, then you can infer that Y = 8. By
applying constraint propagation iteratively, you can eliminate inconsistent values and
simplify the problem.
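The domain-reduction reasoning in this paragraph can be sketched as a single "revise" step of arc consistency; the function and variable names are illustrative, not standard API.

```python
def revise(domains, x, y, constraint):
    """Remove values of x that have no supporting value of y under the constraint."""
    removed = False
    for vx in set(domains[x]):
        if not any(constraint(vx, vy) for vy in domains[y]):
            domains[x].discard(vx)
            removed = True
    return removed

domains = {"X": set(range(1, 11)), "Y": set(range(1, 11))}
domains["X"] = {v for v in domains["X"] if v % 2 == 0}   # unary: X must be even
revise(domains, "Y", "X", lambda vy, vx: vx + vy == 12)  # binary: X + Y = 12
domains["X"] = {4}                                       # suppose we then learn X = 4
revise(domains, "Y", "X", lambda vy, vx: vx + vy == 12)  # propagation forces Y = 8
```

After the first revision Y is narrowed to the even values {2, 4, 6, 8, 10}; after fixing X = 4 the second revision leaves Y = {8}, exactly the inference described in the text.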
One of the main advantages of constraint propagation is that it can reduce the search space
and prune branches that lead to dead ends. This can make the problem easier to solve and
improve the performance of your algorithm. For example, if you use constraint propagation
to assign colors to a map, you might find that some regions have only one possible color left,
and you can assign it without further exploration. Another advantage of constraint
propagation is that it can reveal hidden structures and symmetries in the problem, and help
you find more elegant and general solutions. For example, if you use constraint propagation
to solve a Sudoku puzzle, you might discover that some cells belong to a subset that can be
solved independently of the rest.
To use constraint propagation in algorithm design, you need to follow some steps. First, you
need to formulate your problem as a CSP, by identifying the variables, the domains, and the
constraints. Second, you need to choose a suitable algorithm for applying constraint
propagation, such as arc consistency, path consistency, or k-consistency. Third, you need to
implement the algorithm in your preferred programming language, using data structures such
as queues, stacks, or graphs. Fourth, you need to test and evaluate your algorithm on different
instances of the problem, and compare it with other approaches.
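The steps above can be sketched for arc consistency with a compact AC-3-style loop. The encoding here is an assumed formulation: `constraints` maps each directed arc (x, y) to a predicate that must hold between a value of x and some value of y.

```python
from collections import deque

def ac3(domains, constraints):
    queue = deque(constraints)               # start with every directed arc
    while queue:
        x, y = queue.popleft()
        pred = constraints[(x, y)]
        revised = False
        for vx in set(domains[x]):
            if not any(pred(vx, vy) for vy in domains[y]):
                domains[x].discard(vx)       # vx has no support in y's domain
                revised = True
        if revised:
            for (a, b) in constraints:       # re-check arcs that point at x
                if b == x:
                    queue.append((a, b))
    return all(domains[v] for v in domains)  # False if some domain is emptied

# X in {2,4,6,8,10} (even), Y in 1..10, constraint X + Y = 12 in both directions
domains = {"X": {2, 4, 6, 8, 10}, "Y": set(range(1, 11))}
constraints = {
    ("X", "Y"): lambda vx, vy: vx + vy == 12,
    ("Y", "X"): lambda vy, vx: vx + vy == 12,
}
ok = ac3(domains, constraints)
```

Propagation prunes Y down to the even values that can pair with some even X, without any search at all.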
Graph coloring: the problem where the constraint is that no two adjacent vertices can have
the same color.
Sudoku playing: the gameplay where the constraint is that no number from 1-9 can be
repeated in the same row, column, or 3×3 sub-grid.
A cryptarithmetic problem is a type of constraint satisfaction problem in which the puzzle is
about digits and their unique replacement by letters or other symbols.
In a cryptarithmetic problem, each letter or symbol stands for one of the digits 0-9, and no
two letters stand for the same digit.
The task is to substitute each letter with a digit so that the resulting arithmetic statement
is correct.
We can perform all the usual arithmetic operations on a given cryptarithmetic problem.
The result should satisfy the predefined arithmetic rules, i.e., 2 + 2 = 4, nothing else.
The problem can be approached from either side, i.e., the left-hand side (L.H.S.) or the
right-hand side (R.H.S.). Let's understand the cryptarithmetic problem and its constraints
better with the help of an example:
Starting from the left-hand side (L.H.S.), the leading letters are S and M. Assign digits
that could give a satisfactory result. Let's assign S → 9 and M → 1.
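The assignments S → 9 and M → 1 match the classic SEND + MORE = MONEY instance, which appears to be the worked example here. A brute-force sketch of that instance (trying every assignment of distinct digits to the eight letters):

```python
from itertools import permutations

def solve_send_more_money():
    # letters: S, E, N, D, M, O, R, Y -- all distinct digits
    for p in permutations(range(10), 8):
        s, e, n, d, m, o, r, y = p
        if s == 0 or m == 0:                 # leading letters cannot be zero
            continue
        send  = 1000*s + 100*e + 10*n + d
        more  = 1000*m + 100*o + 10*r + e
        money = 10000*m + 1000*o + 100*n + 10*e + y
        if send + more == money:
            return send, more, money
    return None
```

The puzzle has a unique solution, 9567 + 1085 = 10652, so S = 9 and M = 1 as in the text. A real CSP solver would prune with carry constraints instead of enumerating all permutations.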
LOCAL SEARCH FOR CSP
The initial state assigns a value to every variable, and the search changes the value of one
variable at a time.
The min-conflicts heuristic: in choosing a new value for a variable, select the value that
results in the minimum number of conflicts with other variables.
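The min-conflicts heuristic can be sketched on n-queens, a standard CSP benchmark. The column-per-queen encoding, the step limit, and the random tie-breaking are implementation choices, not part of the heuristic's definition.

```python
import random

def min_conflicts_nqueens(n, max_steps=10000, seed=0):
    """Local search: queens[c] is the row of the queen in column c."""
    rng = random.Random(seed)
    queens = [rng.randrange(n) for _ in range(n)]     # complete initial assignment

    def conflicts(col, row):
        # count queens attacking square (row, col) along rows and diagonals
        return sum(1 for c in range(n) if c != col and
                   (queens[c] == row or abs(queens[c] - row) == abs(c - col)))

    for _ in range(max_steps):
        conflicted = [c for c in range(n) if conflicts(c, queens[c]) > 0]
        if not conflicted:
            return queens                              # no conflicts: solved
        col = rng.choice(conflicted)
        # choose the value (row) with the fewest conflicts, breaking ties randomly
        counts = [conflicts(col, r) for r in range(n)]
        best = min(counts)
        queens[col] = rng.choice([r for r in range(n) if counts[r] == best])
    return None                                        # step budget exhausted
```

Unlike backtracking, this starts from a complete (possibly inconsistent) assignment and repairs one variable at a time; on n-queens it typically converges in a few dozen steps.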
BACKTRACKING SEARCH FOR CSP
Backtracking is a general algorithm for solving computational problems, most notably
constraint satisfaction problems, that incrementally builds candidates for the solution and
abandons a candidate ("backtracks") as soon as it determines that the candidate cannot be
completed to a valid solution. The backtracking algorithm is used in various applications,
including the n-queens problem, the knight's tour problem, maze-solving problems, and the
search for all Hamiltonian paths in a graph.
Backtracking is an algorithmic technique whose goal is to use brute force to find all solutions
to a problem. It entails gradually compiling a set of all possible solutions. Because a problem
will have constraints, solutions that do not meet them will be removed.
Backtrack(s)
    if s is not a valid candidate, return false
    if s is a new (complete) solution, record s
    for each candidate c in expand(s), Backtrack(c)
It finds a solution by building the solution step by step, increasing levels over time,
using recursive calls. A search tree known as the state-space tree is used to find these
solutions: each level of the tree represents one component (variable) of the solution, and
each root-to-leaf path represents a candidate solution.
A backtracking algorithm uses the depth-first search method. As the algorithm begins to
explore the solutions, a bounding function is applied so that the algorithm can determine
whether the proposed solution satisfies the constraints. If it does, the search continues.
If it does not, the branch is removed, and the algorithm returns to the previous level.
In any backtracking algorithm, the algorithm seeks a path to a feasible solution that includes
some intermediate checkpoints. If the checkpoints do not lead to a viable solution, the
problem can return to the checkpoints and take another path to find a solution. Consider the
following scenario:
1. In this case, S represents the problem's starting point. You start at S and work your way to
solution S1 via the midway point M1. However, you discovered that solution S1 is not a
viable solution to our problem. As a result, you backtrack (return) from S1, return to M1,
return to S, and then look for the feasible solution S2. This process is repeated until you
arrive at a workable solution.
2. S1 and S2 are not viable options in this case. According to this example, only S3 is a
   viable solution. Looking at this example, you can see that all possible combinations are
   tried until a viable solution is found. As a result, backtracking is referred to as a
   brute-force algorithmic technique.
3. A "space state tree" is the above tree representation of a problem. It represents all possible
states of a given problem (solution or non-solution).
Step 1: If the current point is a feasible solution, return success.
Step 2: Otherwise, if all paths have been exhausted (i.e., the current point is an endpoint),
return failure, because there is no feasible solution.
Step 3: If the current point is not an endpoint, backtrack, explore other points, and then
repeat the preceding steps.
Now, this tutorial is going to use a straightforward example to explain the theory behind the
backtracking process. You need to arrange the three letters x, y, and z so that z cannot be next
to x.
According to the backtracking, you will first construct a state-space tree. Look for all
possible solutions and compare them to the given constraint. You must only keep solutions
that meet the following constraint:
The following are the possible arrangements: (x,y,z), (x,z,y), (y,x,z), (y,z,x), (z,x,y),
(z,y,x).
Of these, the valid solutions are the ones that satisfy the constraint, which keeps only
(x,y,z) and (z,y,x) in the final solution set.
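The enumeration above can be reproduced directly, treating "z cannot be next to x" as the positions of x and z differing by more than one:

```python
from itertools import permutations

# enumerate the state space for the x, y, z example and keep only
# the arrangements in which z is not next to x
valid = ["".join(p) for p in permutations("xyz")
         if abs(p.index("x") - p.index("z")) > 1]
```

Of the six permutations, only "xyz" and "zyx" survive the constraint, matching the solution set above. A real backtracking search would prune a partial arrangement like (x, z, ...) without ever extending it.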
The following are scenarios in which you can use backtracking:
It is used to solve a variety of problems. You can use it, for example, to find a feasible
solution to a decision problem.
In some cases, it is used to find all feasible solutions to the enumeration problem.
Backtracking, on the other hand, is not regarded as an optimal problem-solving technique.
It is useful when the solution to a problem does not have a time limit.
Following an example of a backtracking algorithm, you will now look at different types.
1. Algorithm Backtrack(s)
2. // This schema describes the backtracking process using recursion.
3. // On entering, the first s-1 values z[1], ..., z[s-1] of the
4. // solution vector z[1:n] have been assigned.
5. // X() gives the candidate values for z[s]; Bs is the bounding function.
6. {
7.   for (each z[s] ∈ X(z[1], ..., z[s-1])) do
8.   {
9.     if (Bs(z[1], ..., z[s]) is true) then
10.    {
11.      if ((z[1], ..., z[s]) is a solution) then
12.        write (z[1:s]);
13.      if (s < n) then Backtrack(s + 1);
14.    }
15.  }
16. }
The solution vector (z1, z2, ..., zs) is treated as a global array z[1:n]. All of the
possible elements for the sth position of the tuple that satisfy Bs are generated, one at a
time, and adjoined to the current vector (z1, ..., zs-1). Each time zs is attached, a check
is made to see whether a solution has been found. When the for loop on line 7 is finished,
there are no more values for zs, and the current invocation of Backtrack ends.
1. Algorithm Backtrack(n)
2. // This schema describes the backtracking process iteratively.
3. // All solutions are generated in z[1:n] and printed
4. // as soon as they are determined.
5. {
6.   s = 1;
7.   while (s != 0) do
8.   {
9.     if (there remains an untried z[s] ∈ X(z[1], z[2], ..., z[s-1]) and Bs(z[1], ..., z[s]) is true) then
10.    {
11.      if ((z[1], ..., z[s]) is a solution) then
12.        write (z[1:s]);
13.      s = s + 1;
14.    }
15.    else s = s - 1;  // backtrack to the previous set
16.  }
17. }
X() returns the set of all possible values that can be assigned to the first component z[1]
of the solution vector. The component z[1] will accept only those values that satisfy the
bounding function B1(z[1]).
A Hamiltonian path, also known as a Hamilton path, is a graph path between two vertices
that visits each vertex exactly once. If a Hamiltonian path exists whose endpoints are
adjacent, the resulting graph cycle is called a Hamiltonian cycle.
The problem of placing n queens on the nxn chessboard so that no two queens attack each
other is known as the n-queens puzzle.
Return all distinct solutions to the n-queens puzzle given an integer n. You are free to
return the answer in any order.
Each solution has a unique board configuration for the placement of the n queens, where
'Q' and '.' represent a queen and an empty space, respectively.
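A backtracking solver in this spirit: place one queen per row and prune any column or diagonal already under attack. The row-by-row ordering and the sets used for O(1) conflict checks are implementation choices.

```python
def solve_n_queens(n):
    """Return every distinct n-queens board as a list of 'Q'/'.' row strings."""
    solutions, cols, diag1, diag2, placement = [], set(), set(), set(), []

    def place(row):
        if row == n:                      # all rows filled: record the board
            solutions.append(["." * c + "Q" + "." * (n - c - 1) for c in placement])
            return
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue                  # square is attacked: prune this branch
            cols.add(col); diag1.add(row - col); diag2.add(row + col)
            placement.append(col)
            place(row + 1)
            placement.pop()               # undo the move and try the next column
            cols.discard(col); diag1.discard(row - col); diag2.discard(row + col)

    place(0)
    return solutions
```

For n = 4 this yields the two known solutions, and for n = 8 all 92, without ever extending a board that already contains a conflict.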
There are numerous maze-solving algorithms, that is, automated methods for solving mazes.
The random mouse, wall follower, Pledge, and Trémaux's algorithms are used by a traveler
inside the maze who has no prior knowledge of it. In contrast, the dead-end filling and
shortest-path algorithms are used by a person or computer program that can see the entire
maze at once.
The knight's tour problem is the mathematical problem of determining a knight's tour. A
common problem assigned to computer science students is to write a program to find a
knight's tour. Variations of the knight's tour problem involve chessboards of different
sizes than the usual n × n, as well as irregular (non-rectangular) boards.
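A plain backtracking sketch of the knight's tour on a small 5 × 5 board. The small board is an assumption for speed: plain depth-first search is far too slow on 8 × 8 without heuristics such as Warnsdorff's rule.

```python
# the eight knight moves as (row, column) offsets
MOVES = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def knights_tour(n, start=(0, 0)):
    """Return a list of n*n squares visited by a knight, or None if no tour."""
    board = [[-1] * n for _ in range(n)]      # -1 marks an unvisited square
    board[start[0]][start[1]] = 0
    path = [start]

    def extend(step):
        if step == n * n:                     # every square visited: tour found
            return True
        r, c = path[-1]
        for dr, dc in MOVES:
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n and board[nr][nc] == -1:
                board[nr][nc] = step
                path.append((nr, nc))
                if extend(step + 1):
                    return True
                path.pop()                    # dead end: undo and try another move
                board[nr][nc] = -1
        return False

    return path if extend(1) else None
```

Each recursive call tries the legal moves from the current square and backtracks when no unvisited square is reachable, exactly the state-space-tree search described earlier in this section.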