Unit II Games and Search Strategies
Minimax Algorithm:
The minimax algorithm selects moves by recursively evaluating the game tree: the maximizing player chooses the child with the highest value and the minimizing player the one with the lowest, under the assumption that both sides play optimally.
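A minimal sketch of the recursion, assuming a game tree encoded as nested Python lists in which numbers are leaf utilities (the tree and encoding are illustrative, not from a specific game):

def minimax(node, maximizing):
    # A node is either a number (a leaf utility) or a list of child nodes.
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Example: MAX chooses between two MIN nodes.
tree = [[3, 5], [2, 9]]
print(minimax(tree, True))   # the MIN nodes give 3 and 2; MAX picks 3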
Alpha-Beta Pruning:
This technique can significantly speed up the search process and make it
feasible to search deeper into the game tree.
Monte Carlo Tree Search (MCTS):
MCTS is a popular technique for making decisions in games with large
branching factors and complex decision spaces.
It combines random simulations (rollouts) with tree exploration to focus on
promising branches of the game tree.
MCTS has been highly successful in games like Go and has led to
breakthroughs in AI game playing.
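A compact sketch of one MCTS variant (UCT) on a toy Nim game, where a move removes 1-3 stones and the player who takes the last stone wins; the Node class, the game rules, and the iteration count are illustrative assumptions, not part of the original text:

import math, random

def moves(n):
    # Legal Nim moves: remove 1, 2, or 3 stones (never more than remain).
    return [m for m in (1, 2, 3) if m <= n]

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones, self.parent, self.move = stones, parent, move
        self.children, self.untried = [], moves(stones)
        self.wins, self.visits = 0, 0
    def ucb1(self, c=1.4):
        # Exploitation term plus an exploration bonus for rarely visited children.
        return (self.wins / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts(root_stones, iterations=2000):
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes via UCB1.
        while not node.untried and node.children:
            node = max(node.children, key=Node.ucb1)
        # 2. Expansion: add one previously untried child.
        if node.untried:
            m = node.untried.pop()
            child = Node(node.stones - m, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: play random moves (a rollout) to the end of the game.
        stones, ply = node.stones, 0
        while stones > 0:
            stones -= random.choice(moves(stones))
            ply += 1
        # 4. Backpropagation: credit the result up the path, flipping the point
        #    of view at every level (each node stores wins for the player
        #    whose move created it).
        mover_won = (ply % 2 == 0)
        while node is not None:
            node.visits += 1
            node.wins += mover_won
            mover_won = not mover_won
            node = node.parent
    # The most visited root move is the most promising one.
    return max(root.children, key=lambda c: c.visits).move

print(mcts(5))   # with 5 stones the optimal move is to take 1, which MCTS should find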
Q-Learning:
Q-learning is a model-free reinforcement learning technique that learns a value Q(s, a) for every state-action pair. The Q-values represent the expected cumulative rewards for taking certain actions in specific states.
Q-learning is well-suited for discrete action spaces and has been used in games
like checkers and backgammon.
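A minimal sketch of the tabular Q-learning update, Q(s, a) <- Q(s, a) + α[r + γ max_a' Q(s', a') - Q(s, a)]; the state and action names and the parameter values are illustrative assumptions:

import random
from collections import defaultdict

Q = defaultdict(float)          # Q-values default to 0 for unseen pairs

def q_update(s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    # Move Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a').
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def epsilon_greedy(s, actions, epsilon=0.1):
    # Explore with probability epsilon, otherwise exploit the best Q-value.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

# One learning step in some game, with made-up identifiers:
q_update(s='opening', a='advance', r=1.0, s_next='midgame',
         actions=['advance', 'defend'])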
Policy Gradient Methods:
Policy gradient methods aim to directly learn a policy (a mapping from states to
actions) that maximizes expected rewards.
These methods are suitable for both discrete and continuous action spaces.
They have been applied to games like poker and continuous control tasks.
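A toy REINFORCE-style sketch on a 3-armed bandit: the gradient of the log-probability of the sampled action is scaled by the received reward. The win probabilities, learning rate, and iteration count are made-up illustration values:

import numpy as np

rng = np.random.default_rng(0)
win_prob = np.array([0.2, 0.5, 0.8])   # hypothetical per-action win rates
theta = np.zeros(3)                    # policy parameters (action preferences)
alpha = 0.1                            # learning rate

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for _ in range(5000):
    probs = softmax(theta)
    a = rng.choice(3, p=probs)                    # sample an action from the policy
    reward = float(rng.random() < win_prob[a])    # play it, observe the reward
    # REINFORCE update: the gradient of log softmax(theta)[a] is one_hot(a) - probs.
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0
    theta += alpha * reward * grad_log_pi

print(softmax(theta))   # most probability mass should end up on action 2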
Evolutionary Algorithms:
They have been used for game playing by evolving strategies and decision
policies.
Heuristic Search and Pattern Databases:
For games with large state spaces, heuristic search methods and pattern
databases can be used to estimate the optimal value of game states.
Deep Learning:
Deep learning methods, such as neural networks, have been applied to various
aspects of game playing, including decision-making.
Rule-Based Systems:
These systems encode human expertise and strategies to help the AI agent make
informed decisions.
Ultimately, the choice of approach depends on the nature of the game, the
complexity of the decision space, the available computational resources, and the
specific goals of the AI agent. Many modern AI game-playing systems combine
multiple techniques to achieve optimal decisions in complex game
environments.
Alpha-Beta Pruning
o Alpha-beta pruning is a modified version of the minimax algorithm. It is an
optimization technique for the minimax algorithm.
o As we have seen, the number of game states the minimax search algorithm has to
examine is exponential in the depth of the tree. We cannot eliminate the exponent,
but we can effectively cut it in half: there is a technique by which, without
checking every node of the game tree, we can compute the correct minimax decision.
This technique is called pruning. It involves two threshold parameters, alpha
and beta, for future expansion, so it is called alpha-beta pruning. It is also
known as the Alpha-Beta algorithm.
o Alpha-beta pruning can be applied at any depth of a tree, and sometimes it prunes
not only leaves but entire sub-trees.
o The two parameters can be defined as:
a. Alpha: The best (highest-value) choice we have found so far at any point
along the path of Maximizer. The initial value of alpha is -∞.
b. Beta: The best (lowest-value) choice we have found so far at any point along
the path of Minimizer. The initial value of beta is +∞.
o Alpha-beta pruning returns the same move as the standard minimax algorithm, but
it skips the nodes that do not affect the final decision and only slow the
algorithm down. Pruning these nodes makes the algorithm faster.
Note: To better understand this topic, kindly study the minimax algorithm.
The condition required for alpha-beta pruning is: α >= β
1. We will first start with the initial move. We will initially define the alpha
and beta values as the worst case i.e. α = -∞ and β= +∞. We will
prune the node only when alpha becomes greater than or equal to
beta.
2. Since the initial value of alpha is less than beta, we do not prune. Now it is
MAX's turn at node D, so the value of alpha at node D will be calculated:
max(2, 3) = 3.
3. Now the next move is at node B, and it is MIN's turn. So, at node B, beta will
be min(3, +∞) = 3. So, at node B, alpha = -∞ and beta = 3.
In the next step, algorithms traverse the next successor of Node B which is
node E, and the values of α= -∞, and β= 3 will also be passed.
4. Now it is MAX's turn at node E. The current value of alpha at E is -∞, and it
is compared with the leaf 5: max(-∞, 5) = 5. So, at node E, alpha = 5 and
beta = 3. Now alpha is greater than or equal to beta, which satisfies the
pruning condition, so the right successor of node E is pruned: the algorithm
does not traverse it, and the value at node E will be 5.
5. Node E returns 5 to node B. At node B, beta = min(3, 5) = 3, so the value of
node B remains 3.
6. In the next step, the algorithm comes back to node A from node B. At node A,
alpha will be changed to max(-∞, 3) = 3. So now the values of alpha and beta at
node A are (3, +∞), and they will be passed down to node C. These same values
will be passed to node F.
7. At node F, alpha is compared with the left child, which is 0: max(3, 0) = 3.
It is then compared with the right child, which is 1: max(3, 1) = 3, so alpha
remains 3, while the value of node F becomes max(0, 1) = 1.
8. Node F returns the value 1 to C, where it is compared with the beta value.
It is MIN's turn, so beta = min(+∞, 1) = 1. Now at node C, α = 3 and β = 1, and
alpha is greater than beta, which again satisfies the pruning condition. So the
next successor of node C, i.e. G, is pruned, and the algorithm does not compute
the subtree under G.
C then returns its value 1 to A, and the best value at A will be max(3, 1) = 3.
The final tree contains only the nodes that were actually computed; the pruned
nodes are never examined. For this example, the optimal value for the maximizer
is 3, which the sketch below reproduces.
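A sketch of alpha-beta pruning run on the example tree above, using the same nested-list encoding as the earlier minimax sketch; the pruned leaves (the right child of E and both children of G) are given placeholder values of 9 to show they cannot change the result:

import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    # Minimax with alpha-beta pruning; a node is a leaf value or a list.
    if isinstance(node, (int, float)):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:        # the minimizer already has a better option
                break                # prune the remaining children
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:        # the maximizer already has a better option
                break
        return value

# The walkthrough tree: A (MAX) -> B, C (MIN); B -> D, E (MAX); C -> F, G (MAX).
# The 9s stand in for the leaves the walkthrough never evaluates.
D, E, F, G = [2, 3], [5, 9], [0, 1], [9, 9]
A = [[D, E], [F, G]]
print(alphabeta(A, True))   # 3, matching the optimal value found above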
Constraint Propagation:
Constraint propagation is a fundamental technique used in artificial
intelligence, particularly in constraint satisfaction problems (CSPs). It's a
process where the values of variables are adjusted or pruned based on the
constraints placed upon them. The main goal is to reduce the search space
by iteratively enforcing the constraints until no further deductions can be
made.
Initial State:
Start with a set of variables, each with a domain of possible values.
Apply initial constraints to these variables.
Propagation:
When a variable is assigned a value, the constraints involving this variable
are used to reduce the domains of other variables.
This reduction can result in further variable assignments or in pruning the
domain of a variable.
Iterative Process:
Constraint propagation is an iterative process. After a variable is assigned
a value, the constraints are used to update the domains of other variables.
This process continues until no further changes can be made.
Domains Reduction:
Constraint propagation reduces the domains of the variables, making the
problem more manageable.
Domains are reduced by eliminating values that cannot participate in a
solution, as determined by the constraints.
Termination:
The process terminates when no further changes are possible. This can
happen when all variables are assigned a value or when no more reductions
can be made.
Consistency Checking:
At each step, the consistency of the problem is checked. If any inconsistency
is detected, the process stops, and it is determined that the problem has no
solution.
Constraint propagation ensures that as soon as a value is assigned to a
variable, any immediate consequences of that assignment are inferred and
applied to other variables. This method can greatly reduce the search
space, making the problem more efficient to solve.
Backtracking in CSPs:
Example: a small map-coloring problem with three variables X1, X2, X3, where X1
and X2 are adjacent and X2 and X3 are adjacent; adjacent regions must receive
different colors.
Execution:
Initial State:
Domains: X1 = {Red, Green, Blue}, X2 = {Red, Green, Blue}, X3 = {Red,
Green, Blue}
Assignment: {}
Apply Constraints:
Initially, no constraint has been enforced, so adjacent regions could still
receive the same color.
Apply constraint propagation to make the constraints arc-consistent.
After Applying Constraint Propagation:
Apply AC-3 algorithm:
Begin with a queue of all the arcs in the CSP: {(X1, X2), (X2, X1), (X2, X3),
(X3, X2)}
Until the queue is empty, select an arc (X, Y) from the queue.
Revise the domain of X: remove every value of X for which no consistent value
exists in the domain of Y.
If removals are made from the domain of X, add all arcs (Z, X) to the queue,
where Z is a neighbor of X and Z ≠ Y (see the sketch after this example).
After applying AC-3, we get the following:
X1 = {Red, Green, Blue}
X2 = {Red, Green, Blue}
X3 = {Red, Green, Blue}
Further Reduction with Backtracking Search:
AC-3 alone removes nothing here, because with three colors every value still has
a support in the neighboring domain. Backtracking search therefore assigns the
variables one at a time, using the constraints to rule out conflicting values,
until each variable is left with a single consistent value, for example:
X1 = {Red}
X2 = {Green}
X3 = {Blue}
Result:
X1 = Red, X2 = Green, X3 = Blue
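A sketch of AC-3 followed by a simple backtracking search for this example; the encoding (a dict of domains and a neighbor map for the not-equal constraints) is an illustrative assumption:

from collections import deque

domains = {'X1': {'Red', 'Green', 'Blue'},
           'X2': {'Red', 'Green', 'Blue'},
           'X3': {'Red', 'Green', 'Blue'}}
# X1 and X2 are adjacent, X2 and X3 are adjacent.
neighbors = {'X1': ['X2'], 'X2': ['X1', 'X3'], 'X3': ['X2']}

def revise(x, y):
    # Remove values of x that have no consistent (unequal) value in y.
    removed = {vx for vx in domains[x]
               if not any(vx != vy for vy in domains[y])}
    domains[x] -= removed
    return bool(removed)

def ac3():
    queue = deque((x, y) for x in domains for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        if revise(x, y):
            if not domains[x]:
                return False          # a domain was wiped out: no solution
            for z in neighbors[x]:
                if z != y:
                    queue.append((z, x))
    return True

def backtrack(assignment):
    # Assign one variable at a time, skipping values that conflict with neighbors.
    if len(assignment) == len(domains):
        return assignment
    var = next(v for v in domains if v not in assignment)
    for value in domains[var]:
        if all(assignment.get(n) != value for n in neighbors[var]):
            result = backtrack({**assignment, var: value})
            if result:
                return result
    return None

print(ac3())          # True: all arcs consistent, but no domain shrinks here
print(backtrack({}))  # one consistent coloring (exact colors depend on iteration order)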
Conclusion:
Constraint propagation prunes values that cannot appear in any solution, and
backtracking search completes the assignment; together they solve the coloring
problem without exploring the full search space.
Inference system
Inference means deriving new sentences from old ones. An inference system allows us to add a
new sentence to the knowledge base. A sentence is a proposition about the world. The
inference system applies logical rules to the KB to deduce new information.
An inference system generates new facts so that an agent can update the KB. An inference
system mainly works with two rules, which are:
o Forward chaining
o Backward chaining
1. TELL: This operation tells the knowledge base what it perceives from the environment.
2. ASK: This operation asks the knowledge base what action it should perform.
3. Perform: It performs the selected action.
function KB-AGENT(percept) returns an action
    persistent: KB, a knowledge base
                t, a counter, initially 0, indicating time
    TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))    # record what the agent perceives
    action = ASK(KB, MAKE-ACTION-QUERY(t))         # ask the KB for the best action
    TELL(KB, MAKE-ACTION-SENTENCE(action, t))      # record that the action was taken
    t = t + 1
    return action
The knowledge-based agent takes percept as input and returns an action as output. The agent
maintains the knowledge base, KB, and it initially has some background knowledge of the
real world. It also has a counter to indicate the time for the whole process, and this counter is
initialized with zero.
Each time the function is called, it performs its three operations: it TELLs the knowledge
base what it perceives, it ASKs the knowledge base what action it should perform, and
MAKE-ACTION-SENTENCE then generates a sentence asserting that the chosen action was
executed, which is also TELLed to the knowledge base.
1. Knowledge level
The knowledge level is the first level of a knowledge-based agent. At this level, we specify
what the agent knows and what the agent's goals are; with these specifications, we can fix
its behavior. For example, suppose an automated taxi agent needs to go from station A to
station B and it knows the route from A to B; this knowledge belongs to the knowledge level.
2. Logical level:
At this level, we specify how the knowledge is represented and stored: sentences are encoded
into some logic. At the logical level, an encoding of knowledge into logical sentences
occurs; for example, the fact that the automated taxi agent can reach destination B is
expressed as a logical sentence.
3. Implementation level:
This is the physical representation of logic and knowledge. At the implementation level, the
agent performs actions according to the logical and knowledge levels; the automated taxi
agent actually executes its knowledge and logic to reach the destination.
Example: the following are propositions, i.e. declarative sentences that are either true or false:
a) It is Sunday.
b) The Sun rises from the West. (false proposition)
c) 3 + 3 = 7 (false proposition)
d) 5 is a prime number.
a. Atomic Propositions: simple propositions consisting of a single proposition symbol; they
cannot be broken down further.
Example: "2 + 2 is 4" is an atomic proposition.
b. Compound Propositions: constructed by combining simpler propositions using logical
connectives.
Example: "It is raining today, and the street is wet" is a compound proposition.
Logical Connectives:
Logical connectives are used to connect two simpler propositions or to represent a
sentence logically. We can create compound propositions with the help of logical
connectives. There are mainly five connectives, which are given as follows:
a. Negation: ¬P
b. Conjunction: P ∧ Q
c. Disjunction: P ∨ Q
d. Implication: P → Q
e. Biconditional: P ↔ Q
Truth Table:
In propositional logic, we need to know the truth values of propositions in all
possible scenarios. We can combine all the possible truth-value combinations with
logical connectives, and the representation of these combinations in a tabular
format is called a truth table. The following is the truth table for all logical
connectives:
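P      Q      ¬P     P ∧ Q   P ∨ Q   P → Q   P ↔ Q
True   True   False  True    True    True    True
True   False  False  False   True    False   False
False  True   True   False   True    True    False
False  False  True   False   False   True    True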
Truth table with three propositions:
We can build a proposition composing three propositions P, Q, and R. Since we have
taken three proposition symbols, this truth table is made up of 2³ = 8 rows.
Inference in Artificial intelligence
Inference:
In artificial intelligence, we need intelligent computers that can derive new
conclusions from old knowledge or from evidence; generating conclusions from
evidence and facts is termed inference.
Inference rules:
Inference rules are templates for generating valid arguments. They are applied to
derive proofs in artificial intelligence, where a proof is a sequence of
conclusions that leads to the desired goal.
In inference rules, the implication among all the connectives plays an important role.
Following are some terminologies related to inference rules:
Implication: a sentence of the form P → Q.
Converse: the converse of P → Q is Q → P.
Contrapositive: the contrapositive of P → Q is ¬Q → ¬P.
Inverse: the inverse of P → Q is ¬P → ¬Q.
Some of these compound statements are equivalent to each other, which we can prove
using a truth table:
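P      Q      P → Q   ¬Q → ¬P   Q → P   ¬P → ¬Q
True   True   True    True      True    True
True   False  False   False     True    True
False  True   True    True      False   False
False  False  True    True      True    True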
Hence from the above truth table, we can prove that P → Q is equivalent to ¬ Q → ¬
P, and Q→ P is equivalent to ¬ P → ¬ Q.
1. Modus Ponens:
The Modus Ponens rule states that if P → Q is true and P is true, then Q will also
be true. It can be represented as:
Statement-1: "If I am sleepy then I go to bed" ==> P → Q
Statement-2: "I am sleepy" ==> P
Conclusion: "I go to bed." ==> Q
2. Modus Tollens:
The Modus Tollens rule states that if P → Q is true and ¬Q is true, then ¬P will
also be true. It can be represented as:
Statement-1: "If I am sleepy then I go to bed" ==> P→ Q
Statement-2: "I do not go to the bed."==> ~Q
Statement-3: Which infers that "I am not sleepy" => ~P
3. Hypothetical Syllogism:
The Hypothetical Syllogism rule states that if P → Q is true and Q → R is true,
then P → R will also be true. It can be represented with the following notation:
Example:
Statement-1: If you have my home key then you can unlock my home. P→Q
Statement-2: If you can unlock my home then you can take my money. Q→R
Conclusion: If you have my home key then you can take my money. P→R
4. Disjunctive Syllogism:
The Disjunctive Syllogism rule states that if P ∨ Q is true and ¬P is true, then Q
will be true. It can be represented as:
Example:
Statement-1: "Today is Sunday or Monday." ==> P ∨ Q
Statement-2: "Today is not Sunday." ==> ¬P
Conclusion: "Today is Monday." ==> Q
5. Addition:
The Addition rule is one of the common inference rules, and it states that if P is
true, then P ∨ Q will be true.
Example:
Statement-1: "I have a vanilla ice-cream." ==> P
Conclusion: "I have vanilla ice-cream or chocolate ice-cream." ==> P ∨ Q
Proof by Truth-Table:
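P      Q      P ∨ Q
True   True   True
True   False  True
False  True   True
False  False  False
Whenever P is true (the first two rows), P ∨ Q is also true.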
6. Simplification:
The Simplification rule states that if P ∧ Q is true, then P will be true and Q
will be true. It can be represented as:
Proof by Truth-Table:
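P      Q      P ∧ Q
True   True   True
True   False  False
False  True   False
False  False  False
In the only row where P ∧ Q is true, both P and Q are true.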
7. Resolution:
The Resolution rule states that if P ∨ Q and ¬P ∨ R are true, then Q ∨ R will also
be true. It can be represented as:
Proof by Truth-Table:
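P      Q      R      P ∨ Q   ¬P ∨ R   Q ∨ R
True   True   True   True    True     True
True   True   False  True    False    True
True   False  True   True    True     True
True   False  False  True    False    False
False  True   True   True    True     True
False  True   False  True    True     True
False  False  True   False   True     True
False  False  False  False   True     False
In every row where both premises P ∨ Q and ¬P ∨ R are true, the conclusion Q ∨ R is also true.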
Prolog:
Prolog, which stands for Programming in Logic, is a high-level programming language based on
formal logic. It was created in the 1970s and is primarily used in artificial intelligence
and computational linguistics. It relies heavily on recursion and backtracking algorithms,
which are widely used in dealing with hierarchical data and complex problem-solving tasks.
A Prolog program is built from three kinds of clauses:
Facts: Base knowledge about the world. For example, parent(bob, alice). asserts that Bob is Alice's
parent.
Rules: Logical expressions that describe how facts relate. For instance, grandparent(X, Z) :- parent(X,
Y), parent(Y, Z). states that X is a grandparent of Z if X is a parent of Y and Y is a parent of Z.
Queries: Used to interrogate the database using the facts and rules defined. For example, ?-
grandparent(bob, charlie). asks if Bob is a grandparent of Charlie.
Applications of Prolog:
1. Artificial Intelligence and Computational Linguistics: Prolog's primary application
areas, where it supports symbolic reasoning and language processing.
2. Expert Systems: Prolog is suited for developing systems that require decision-making
support based on complex rules, such as medical diagnosis or legal advice systems.
3. Education: Often used in academic settings to teach the concepts of logic programming and
symbolic reasoning.
4. Database Querying: Prolog can act as a query language for deductive databases.
Here are a few defining characteristics of Prolog:
1. Logic-based: Prolog programs consist of a series of rules and facts which express
relationships between items and set out procedures for solving problems.
2. Declarative: In Prolog, you declare what needs to be achieved rather than specifying how to
achieve it, which is typical in imperative programming languages.
3. Recursive: Prolog relies on recursion, rather than explicit loops, as its primary
control structure.
4. Pattern Matching: Prolog uses pattern matching (unification) to execute appropriate
rules and manipulate symbolic data.
Each of these characteristics highlights Prolog's utility in scenarios that require complex
decision-making and pattern recognition, making it uniquely suited for tasks in AI and
systems that require intricate rule-based logic.
Simulated Annealing:
Basic Concept:
The main idea behind simulated annealing is to simulate the cooling process of a material in a way
that avoids being trapped in local optima, thereby increasing the probability of finding a global
optimum. The algorithm achieves this by occasionally accepting worse solutions early in the process,
allowing it to explore the solution space more thoroughly. As the "temperature" decreases, the
algorithm becomes less likely to accept worse solutions, gradually focusing the search around the
best found solutions.
1. Initialization:
Start with an initial candidate solution and a high initial temperature T.
2. Iteration:
At each iteration, generate a "neighbor" solution from the current solution. This
typically involves making a small random change to the current solution.
Evaluate the change in the cost function between the current solution and the new
neighbor solution.
3. Acceptance Criterion:
If the neighbor solution is better than the current solution, it is always accepted.
If the neighbor solution is worse, it may still be accepted with a probability
that depends on the difference in cost and the current temperature. This
probability is calculated as P = e^(-ΔE/T), where ΔE is the increase in the cost
and T is the current temperature. Accepting worse solutions in this way helps the
search escape local minima.
4. Cooling Schedule:
Gradually decrease the temperature according to a cooling schedule, for example
T ← αT with a constant 0 < α < 1 (see the sketch after this list).
5. Termination:
The process continues until the system stabilizes (no further changes occur), or a
predetermined stopping condition is met (like a minimum temperature or a
maximum number of iterations).
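A sketch of the loop above minimizing a one-dimensional cost function; the cost function, the neighbor step, the starting temperature, and the cooling rate are all illustrative choices:

import math, random

def simulated_annealing(cost, x0, T0=10.0, alpha=0.95, T_min=1e-3):
    # Minimize `cost` starting from x0 with a geometric cooling schedule.
    x, best, T = x0, x0, T0
    while T > T_min:
        for _ in range(100):                       # several moves per temperature level
            x_new = x + random.uniform(-1.0, 1.0)  # small random neighbor
            delta = cost(x_new) - cost(x)
            # Always accept improvements; accept worse moves with
            # probability e^(-delta / T) to escape local minima.
            if delta <= 0 or random.random() < math.exp(-delta / T):
                x = x_new
                if cost(x) < cost(best):
                    best = x
        T *= alpha                                 # cooling schedule: T <- alpha * T
    return best

f = lambda x: x * x + 10 * math.sin(x)   # non-convex, global minimum near x = -1.3
print(simulated_annealing(f, x0=8.0))    # typically returns a value close to -1.3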
Simulated annealing is used in various fields where optimal solutions are crucial yet difficult to find
due to the complexity of the problem space. Some common applications include:
1. Combinatorial optimization: such as job-shop scheduling and circuit placement, where the
space of possible configurations is enormous.
2. Routing problems: such as the traveling salesman problem, where the goal is to find the
shortest possible route that visits each city and returns to the origin city.
3. Machine learning and AI: for training neural networks and other models where the goal is to
find the set of parameters that minimize a loss function.
Advantages:
1. Flexibility: It can be applied to any optimization problem, provided a suitable cost
function and a method for generating neighbor solutions are available.
2. Global Optimization: It has a higher probability of finding a global optimum than simple
local search algorithms.
Simulated annealing is an effective optimization technique, particularly for complex problems where
other methods might easily become trapped in local optima. Its effectiveness depends significantly
on the choice of cooling schedule, initial temperature, and method for generating neighboring
solutions.