
21CSC206T

ARTIFICIAL INTELLIGENCE
UNIT – 3
Adversarial Search Methods - Game Playing

• Adversarial search: Adversarial search is a game-playing technique where the agents are placed in a competitive environment. The agents (multiagent) are given conflicting goals: they compete with one another and try to defeat one another in order to win the game.
• Search based on game theory; agents operate in a competitive environment.
• According to game theory, a game is played between two players. To complete the game, one has to win and the other automatically loses.
• Such a conflicting goal gives rise to adversarial search.
  – Tic-Tac-Toe, Checkers, Chess: only reasoning matters, no luck is involved.
Adversarial Search Methods - Game Playing: Important Concepts
• Techniques required to get the best optimal solution (choose algorithms that find the best solution within limited time):
  – Pruning: a technique that allows ignoring the unwanted portions of a search tree which make no difference to its final result.
  – Heuristic evaluation function: allows approximating the cost value at each level of the search tree, before reaching the goal node.

https://www.tutorialandexample.com/adversarial-search-in-artificial-intelligence
Game Playing and Knowledge Structure - Elements of Game Playing Search
• To play a game, we use a game tree to enumerate all the possible choices and to pick the best one. A game-playing problem has the following elements:
• S0: the initial state from which the game begins.
• PLAYER (s): defines which player has the current turn to make a move in state s.
• ACTIONS (s): defines the set of legal moves that can be used in state s.
• RESULT (s, a): the transition model, which defines the result of applying move a in state s.
• TERMINAL-TEST (s): returns true when the game has ended.
• UTILITY (s, p): defines the final value with which the game has ended for player p. This function is also known as the objective function or payoff function. The payoff the players receive is:
  – (+1): if the PLAYER wins.
  – (-1): if the PLAYER loses.
  – (0): if there is a draw between the PLAYERS.
• For example, in chess or tic-tac-toe, there are three possible outcomes: win, lose, or draw, with values +1, -1, or 0.
• Game tree for Tic-Tac-Toe: nodes are the game states, edges are the moves taken by the players.
https://www.tutorialandexample.com/adversarial-search-in-artificial-intelligence
Game Playing and Knowledge Structure - Elements of Game Playing Search
• INITIAL STATE (S0): the top node in the game tree; it represents the initial state and shows all the possible choices to pick from.
• PLAYER (s): there are two players, MAX and MIN. MAX begins the game by picking the best move and placing X in an empty square.
• ACTIONS (s): both players can make moves into the empty boxes, turn by turn.
• RESULT (s, a): e.g., the moves made by MIN and MAX decide the outcome of the game.
• TERMINAL-TEST (s): when all the empty boxes are filled (or one player has completed a line), the game reaches a terminating state.
• UTILITY: at the end, MAX or MIN wins, and the payoff is awarded accordingly.
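A minimal sketch of these six elements for tic-tac-toe is shown below. The board representation (a list of 9 cells holding 'X', 'O', or None) is an illustrative assumption, not something fixed by the slides:

S0 = [None] * 9  # INITIAL STATE: an empty board

def PLAYER(s):
    # MAX plays 'X' and moves first, so 'X' is to move whenever the counts match.
    return 'X' if s.count('X') == s.count('O') else 'O'

def ACTIONS(s):
    # Legal moves: the indices of the empty squares.
    return [i for i, cell in enumerate(s) if cell is None]

def RESULT(s, a):
    # Transition model: place the current player's mark on square a.
    t = list(s)
    t[a] = PLAYER(s)
    return t

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
             (0, 4, 8), (2, 4, 6)]                 # diagonals

def winner(s):
    for i, j, k in WIN_LINES:
        if s[i] is not None and s[i] == s[j] == s[k]:
            return s[i]
    return None

def TERMINAL_TEST(s):
    # The game ends when someone has a line or the board is full.
    return winner(s) is not None or all(cell is not None for cell in s)

def UTILITY(s):
    # Payoff from MAX's point of view: +1 win, -1 loss, 0 draw.
    w = winner(s)
    return +1 if w == 'X' else -1 if w == 'O' else 0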
Game as a Search Problem
• Types of algorithms in adversarial search:
  – In a normal search, we follow a sequence of actions to reach the goal or to finish the game optimally.
  – But in an adversarial search, the result depends on the players, who together decide the outcome of the game.
  – It is also expected that the solution for the goal state will be an optimal one, because each player tries to win the game by the shortest path and within limited time.
• Minimax algorithm
• Alpha-beta pruning
Minimax Approach
• Minimax/Minmax:
  – A decision strategy from game theory
• Minimize losing chances, maximize winning chances
• A two-player game strategy
• The two players are:
  • MAX: tries to increase his chances of winning the game.
  • MIN: tries to decrease the chances of MAX winning the game.
• Result of the game / utility value:
  – A heuristic function evaluates the leaf nodes, and these values are propagated back up to the root node
  – Backtracking technique: the best choice is made at each level
Minimax Algorithm
• Follows DFS
  – Follows the same path and cannot change it midway; a move once made cannot be altered (which is why this is DFS and not BFS)
• Algorithm
  – Keep generating the complete game tree / search tree down to a depth limit d.
  – Use the utility of nodes at level n to derive the utility of nodes at level n-1 (propagate the values from the leaf nodes up to the current position following the minimax strategy).
  – Continue backing up values towards the root (one layer at a time).
  – Make the best move from the available choices, as in the sketch below.
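A compact recursive sketch of these steps, reusing the tic-tac-toe functions sketched earlier (MAX plays 'X'). For simplicity it searches to full depth rather than to a limit d, which is fine for a game this small:

def minimax_value(s):
    # Utilities are computed at the leaves and backed up one layer at a time.
    if TERMINAL_TEST(s):
        return UTILITY(s)
    values = [minimax_value(RESULT(s, a)) for a in ACTIONS(s)]
    # MAX nodes take the maximum of their successors, MIN nodes the minimum.
    return max(values) if PLAYER(s) == 'X' else min(values)

def minimax_decision(s):
    # The best move for the player whose turn it is in state s.
    best = max if PLAYER(s) == 'X' else min
    return best(ACTIONS(s), key=lambda a: minimax_value(RESULT(s, a)))

For deeper games, the recursion would stop at the depth limit d and fall back on the heuristic evaluation function instead of UTILITY.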
Minimax Algorithm
Example 1: Minimax Algorithm
Example 2: Minimax Algorithm
Example 3: Use the Minimax algorithm to compute
the minimax value at each node for the game tree
below
Example 4
Minimax Algorithm Example 5
• For example, in the figure there are two players, MAX and MIN. MAX starts the game by choosing one path and propagating all the nodes of that path. Then MAX backtracks to the initial node and chooses the best path, the one where his utility value will be maximum. After this, it is MIN's turn. MIN will also propagate through a path and backtrack, but MIN will choose the path which minimizes MAX's winning chances, i.e., the utility value.
• So, if the level is minimizing, the node accepts the minimum value from its successor nodes. If the level is maximizing, the node accepts the maximum value from its successors.
• Note: the time complexity of the MINIMAX algorithm is O(b^d), where b is the branching factor and d is the depth of the search tree.
Alpha-Beta Pruning
• Cuts off the search by exploring fewer nodes
• Makes the same moves as the Minimax algorithm does, but prunes the unwanted branches using the pruning technique
• Alpha-beta pruning works on two threshold values, α and β:
  – α: the best (highest) value found so far for the MAX player. It is a lower bound, initialized to negative infinity.
  – β: the best (lowest) value found so far for the MIN player. It is an upper bound, initialized to positive infinity.
• So, each MAX node has an α-value, which never decreases, and each MIN node has a β-value, which never increases.
• Note: the alpha-beta pruning technique can be applied to trees of any depth, and it often makes it possible to prune entire subtrees.
Alpha-Beta Pruning
• An advanced version of the MINIMAX algorithm
• For any optimization algorithm, the performance measure is the first consideration
• Drawback of Minimax:
  – Explores each node in the tree deeply to provide the best path among all the paths
  – This increases the time complexity
• Alpha-beta pruning overcomes this drawback by exploring fewer nodes of the search tree
• 1. Alpha-beta pruning is a modified version of the minimax algorithm. It is an optimization technique for the minimax algorithm.
• 2. The number of nodes (game states) that the minimax search algorithm must examine grows exponentially with the depth of the tree. We cannot eliminate the exponent completely, but pruning can effectively cut it in half.
• 3. There is a technique called pruning that allows us to compute the correct minimax decision without having to check every node of the game tree. Because it involves two threshold parameters, alpha and beta, for future expansion, it is referred to as alpha-beta pruning.
• 4. Alpha-beta pruning can be applied at any depth in a tree, and it sometimes prunes not only the tree leaves but also entire subtrees.
• 5. The two parameters can be defined as:
  a. Alpha: the best (highest-value) choice we have found so far at any point along the path of the Maximizer. The initial value of alpha is -∞.
  b. Beta: the best (lowest-value) choice we have found so far at any point along the path of the Minimizer. The initial value of beta is +∞.
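A sketch of the same search with the two thresholds threaded through, again reusing the tic-tac-toe functions assumed earlier (alpha starts at -∞, beta at +∞):

import math

def alphabeta(s, alpha=-math.inf, beta=math.inf):
    # Returns the same value as minimax, but skips branches that cannot
    # influence the final decision.
    if TERMINAL_TEST(s):
        return UTILITY(s)
    if PLAYER(s) == 'X':                  # MAX node: only alpha can rise
        v = -math.inf
        for a in ACTIONS(s):
            v = max(v, alphabeta(RESULT(s, a), alpha, beta))
            alpha = max(alpha, v)
            if alpha >= beta:             # the MIN player above already has a better option
                break                     # prune the remaining moves
        return v
    else:                                 # MIN node: only beta can fall
        v = math.inf
        for a in ACTIONS(s):
            v = min(v, alphabeta(RESULT(s, a), alpha, beta))
            beta = min(beta, v)
            if beta <= alpha:             # the MAX player above already has a better option
                break
        return v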
Algorithm: Alpha-Beta Pruning
• Any one player will start the game. Following the DFS order, the player will choose one path and reach down to its depth, i.e., where the TERMINAL values are found.
• If the game is started by player P, he will choose the maximum value in order to increase his winning chances with the maximum utility value.
• If the game is started by player Q, he will choose the minimum value in order to decrease the winning chances of P with the best possible minimum utility value.
• Both will play the game alternately.
• The evaluation starts from the last level of the game tree, and values are chosen accordingly.
• As in the figure below, the game is started by player Q. He picks the leftmost TERMINAL value and fixes it as β. The next TERMINAL value is then compared with the β-value: if it is smaller than or equal to the β-value, it replaces the current β-value; otherwise the β-value is kept.
• After completing one part, the achieved β-value is moved up to the parent node and fixed as the other threshold value, α.
• Now it is P's turn; he will pick the best maximum value. P will explore the next part only after comparing its values with the current α-value: if a value is equal to or greater than the current α-value it replaces α; otherwise that branch is pruned.
• The steps are repeated until the result is obtained.
• So, the number of pruned nodes in the above example is four, and MAX wins the game with the maximum UTILITY value, i.e., 3.
• The rule followed is: "Explore nodes only if necessary; otherwise prune the unnecessary nodes."
• Note: the result has the same UTILITY value that plain Minimax would compute, only with fewer nodes explored.
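To check a walkthrough like this by hand, it helps to run alpha-beta over an explicit tree of leaf values. The tiny tree below is hypothetical, not the one from the slide:

import math

def ab(node, alpha, beta, maximizing):
    # A node is either a leaf value or a list of child nodes.
    if not isinstance(node, list):
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, ab(child, alpha, beta, False))
            alpha = max(alpha, v)
            if alpha >= beta:
                break                     # prune the remaining children
        return v
    v = math.inf
    for child in node:
        v = min(v, ab(child, alpha, beta, True))
        beta = min(beta, v)
        if beta <= alpha:
            break
    return v

# MAX root over two MIN nodes: the first yields 3, so once the second
# MIN node reaches 2 (<= alpha = 3), its remaining leaf 9 is pruned.
print(ab([[3, 5], [2, 9]], -math.inf, math.inf, True))   # prints 3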
Use the Alpha-Beta pruning algorithm to prune the game tree in Problem 1
(a) assuming child nodes are visited from left to right. Show all final alpha and
beta values computed at root, each internal node explored, and at the top of
pruned branches. Note: Follow the algorithm in Figure 5.7 in the textbook
[edition 3]. Also show the pruned branches.
Game Theory Problems
• Game theory is a branch of mathematics used to model strategic interaction between different players (agents), all of whom are equally rational, in a context with predefined rules (of playing or maneuvering) and outcomes.
• A GAME can be defined as a set of players, actions, strategies, and a final payoff for which all the players are competing.
• Game theory has now become a describing factor for both machine learning algorithms and many daily-life situations.
https://www.geeksforgeeks.org/game-theory-in-ai/
Game Theory Problems
• Types of Games
  – Zero-Sum and Non-Zero-Sum Games: In non-zero-sum games, there are multiple players and all of them have the option to gain a benefit from any move by another player. In zero-sum games, however, if one player gains something, the other players are bound to lose a corresponding payoff.
  – Simultaneous and Sequential Games: Sequential games are the more popular games, where every player is aware of the moves of the other players. Simultaneous games are more difficult, as in them the players move concurrently. BOARD GAMES are the perfect example of sequential games and are also referred to as turn-based or extensive-form games.
  – Imperfect Information and Perfect Information Games: In a perfect information game, every player is aware of the movements of the other players and is also aware of the various strategies the other players might apply to win the ultimate payoff. In imperfect information games, however, no player is aware of what the others are up to. CARD games are a good example of imperfect information games, while CHESS is the perfect example of a perfect information game.
  – Asymmetric and Symmetric Games: Asymmetric games are those in which each player has a different and usually conflicting final goal. Symmetric games are those in which all players have the same ultimate goal but the strategy used by each may be completely different.
  – Co-operative and Non-Co-operative Games: In non-co-operative games, every player plays for himself, while in co-operative games players form alliances in order to achieve the final goal.
https://www.geeksforgeeks.org/game-theory-in-ai/
Game Theory Problems - Examples
• Prisoner's Dilemma
• Closed-bag Exchange Game
• The Friend or Foe Game
• The Iterated Snowdrift Game
https://www.geeksforgeeks.org/game-theory-in-ai/
Constraint Satisfaction Problems (CSP)
• Constraint satisfaction problems (CSPs) are mathematical questions defined as a set of objects whose state must satisfy a number of constraints or limitations.
• Solving a CSP is a search procedure that operates in a space of constraints.
• Many problems in the world can be mathematically represented as CSPs.
• The solution is typically a state that satisfies all the constraints.
• A constraint satisfaction problem (CSP) consists of:
  CSP = {V, D, C}
  – a set of variables, V = {V1, V2, V3, ...}
  – a domain for each variable, D = {D1, D2, D3, ...}
  – and a set of constraints, C = {C1, C2, C3, ...}
Constraint
• A constraint is a mathematical/logical relationship among the attributes of one or more objects.
• It is important to know the type of constraint:
  – Unary constraint: involves a single variable.
  – Binary constraint: involves two variables.
  – Higher-order constraint: involves 3 or more variables.
• Constraints restrict the values that variables can take.
Example problems:
• Cryptarithmetic puzzles
• Map coloring
• Crossword puzzles



Examples of CSPs
Some examples of CSPs in real-world problems:
• Assignment problems
  – e.g., who teaches what class
• Timetabling problems
  – e.g., which class is offered when and where?
• Transportation scheduling
• Factory scheduling


Crypt Arithmetic Problem
Crypt Arithmetic Puzzles

• Constraints:
  1. Variables can take values from 0-9.
  2. No two variables may take the same value.
  3. The values should be selected in such a way that they comply with the arithmetic properties (the equation, including its carries, must hold).
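The worked puzzle itself appears on the slides as images; as an assumed stand-in, here is the classic SEND + MORE = MONEY instance solved by brute force under exactly the three constraints above:

from itertools import permutations

def solve_send_more_money():
    letters = 'SENDMORY'                     # 8 distinct variables, domain 0-9
    for digits in permutations(range(10), len(letters)):
        a = dict(zip(letters, digits))       # constraint 2: all values distinct
        if a['S'] == 0 or a['M'] == 0:       # leading letters cannot be zero
            continue
        send  = 1000*a['S'] + 100*a['E'] + 10*a['N'] + a['D']
        more  = 1000*a['M'] + 100*a['O'] + 10*a['R'] + a['E']
        money = 10000*a['M'] + 1000*a['O'] + 100*a['N'] + 10*a['E'] + a['Y']
        if send + more == money:             # constraint 3: the arithmetic holds
            return a

print(solve_send_more_money())
# {'S': 9, 'E': 5, 'N': 6, 'D': 7, 'M': 1, 'O': 0, 'R': 8, 'Y': 2}, i.e. 9567 + 1085 = 10652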



Constraint Domain
• A constraint domain describes the different constrainers, operators, arguments, variables, and their domains.
• It consists of:
  1. Set of variables (var)
  2. Set of all types of functions (f)
  3. Legal set of operators (o)
  4. Domain variables (dv)
  5. Range of variables (rg)
• A constraint domain is a five-tuple, represented as D = {var, f, o, dv, rg}
CSP as a Search Problem
• Initial state:
  – {}: all variables are unassigned
• Successor function:
  – a value is assigned to one of the unassigned variables, provided it causes no conflict
• Goal test:
  – a complete assignment
• Path cost:
  – a constant cost for each step
• A solution appears at depth n if there are n variables



Example: The Map Coloring Problem
• The task is to color each region red, green, or blue in such a way that no neighboring regions have the same color.
• To formulate this as a CSP, we define the variables to be the regions: WA, NT, Q, NSW, V, SA, and T.
• The domain of each variable is the set {red, green, blue}.
• The constraints require neighboring regions to have distinct colors: for example, the allowable combinations for WA and NT are the pairs {(red,green), (red,blue), (green,red), (green,blue), (blue,red), (blue,green)}.
• (The constraint can also be represented as the inequality WA ≠ NT.) There are many possible solutions, such as {WA = red, NT = green, Q = red, NSW = green, V = red, SA = blue, T = red}.
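This formulation translates directly into data. A minimal sketch, where the variable list and the neighbor relation follow the Australia map described above:

# CSP = {V, D, C} for the map-coloring problem
VARIABLES = ['WA', 'NT', 'Q', 'NSW', 'V', 'SA', 'T']
DOMAINS = {v: ['red', 'green', 'blue'] for v in VARIABLES}
# Binary constraints: neighboring regions must receive different colors.
NEIGHBORS = {
    'WA': ['NT', 'SA'], 'NT': ['WA', 'SA', 'Q'], 'Q': ['NT', 'SA', 'NSW'],
    'NSW': ['Q', 'SA', 'V'], 'V': ['SA', 'NSW'],
    'SA': ['WA', 'NT', 'Q', 'NSW', 'V'], 'T': [],
}

def consistent(var, value, assignment):
    # A value is consistent if no already-assigned neighbor has the same color.
    return all(assignment.get(n) != value for n in NEIGHBORS[var])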
Constraint Graph
• A CSP is usually represented as an undirected graph, called a constraint graph, where the nodes are the variables and the edges are the binary constraints. (The NEIGHBORS dictionary in the sketch above is this graph in adjacency-list form.)
CSP – Backtracking Algorithm
• The backtracking algorithm is a depth-first search algorithm that methodically investigates the search space of potential solutions until a solution is discovered that satisfies all the constraints.
• The backtracking algorithm is a popular method for solving CSPs. It explores the search space by picking a variable, assigning a value to it, and then recursively moving on to the remaining variables.
• In the event of a conflict, it goes back and tries a different value for the preceding variable.
CSP – Backtracking Algorithm
• The backtracking algorithm's essential elements are:
• Variable ordering: the order in which variables are chosen.
• Value ordering: the sequence in which values are assigned to each variable.
• Constraint propagation: reducing the domains of variables based on constraint compliance.
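A recursive sketch of backtracking search over the map-coloring CSP sketched earlier. Variable and value ordering here are simply list order, the simplest possible choice:

def backtracking_search(assignment=None):
    if assignment is None:
        assignment = {}
    if len(assignment) == len(VARIABLES):      # goal test: complete assignment
        return assignment
    var = next(v for v in VARIABLES if v not in assignment)   # variable ordering
    for value in DOMAINS[var]:                 # value ordering
        if consistent(var, value, assignment):
            assignment[var] = value
            result = backtracking_search(assignment)
            if result is not None:
                return result
            del assignment[var]                # conflict downstream: undo, try next value
    return None                                # no value works: backtrack in the caller

print(backtracking_search())
# {'WA': 'red', 'NT': 'green', 'Q': 'red', 'NSW': 'green', 'V': 'red', 'SA': 'blue', 'T': 'red'}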
Improving Backtracking
CSP – Forward Checking
• Forward-checking algorithm: an improvement on the backtracking technique.
• After each assignment, it tracks the remaining legal values of the unassigned variables and reduces the domains of the variables whose values conflict with the assigned ones. As a result, the search space is smaller, and constraint propagation is accomplished more effectively.
• IDEA:
  • Keep track of the remaining legal values for the unassigned variables.
  • Terminate the search when any variable has no legal values left.
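A sketch of this idea on the same map-coloring CSP: after each assignment, prune the assigned value from each neighbor's remaining legal values, and fail as soon as any domain becomes empty:

import copy

def forward_check(var, value, domains):
    # Prune `value` from each neighbor's remaining legal values.
    pruned = copy.deepcopy(domains)
    pruned[var] = [value]
    for n in NEIGHBORS[var]:
        if value in pruned[n]:
            pruned[n].remove(value)
            if not pruned[n]:                # a variable has no legal value left:
                return None                  # failure detected without expanding it
    return pruned

def fc_search(assignment, domains):
    if len(assignment) == len(VARIABLES):
        return assignment
    var = next(v for v in VARIABLES if v not in assignment)
    for value in domains[var]:
        pruned = forward_check(var, value, domains)
        if pruned is not None:
            result = fc_search({**assignment, var: value}, pruned)
            if result is not None:
                return result
    return None

print(fc_search({}, DOMAINS))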
CSP – Forward Checking
• SA can no longer be green.
CSP – Forward Checking
• Now forward checking identifies a failure; hence it backtracks to look for another solution.
https://www.youtube.com/watch?v=R6S7UqkFg8E
Forward Checking
• To understand forward checking, we shall look at the 4-queens problem.
• If placing queen x on the board hampers the possible positions of queen x+1, the forward check ensures that queen x is not placed at the selected position and a new position is looked for.



• When Q1 and Q2 are placed in rows 1 and 2 in the left subtree, the search is halted, since no positions are left for Q3 and Q4.
• Forward checking keeps track of the moves that remain available for the unassigned variables.
• The search is terminated when there is no legal move available for some unassigned variable.
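A sketch of forward checking on the 4-queens problem described above; the row-by-row formulation (one queen per row, domains of legal columns) is an assumption for illustration:

def queens_fc(n=4, placed=(), domains=None):
    # One variable per row; each domain holds the columns still legal for that row.
    if domains is None:
        domains = [set(range(n)) for _ in range(n)]
    row = len(placed)
    if row == n:
        return placed                          # all queens placed without conflict
    for col in sorted(domains[row]):
        # Forward check: prune attacked squares from all later rows.
        pruned = [set(d) for d in domains]
        for r in range(row + 1, n):
            pruned[r] -= {col, col + (r - row), col - (r - row)}
        if all(pruned[r] for r in range(row + 1, n)):   # no domain emptied
            result = queens_fc(n, placed + (col,), pruned)
            if result is not None:
                return result
    return None                                # dead end: backtrack

print(queens_fc())   # (1, 3, 0, 2): queen columns for rows 1-4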

Solutions for the 4-Queens Problem



Intelligent Backtracking
• A conflict set is maintained (e.g., using forward checking).
• Considering the 4-queens problem, a conflict needs to be detected with the help of the conflict set so that a backtrack can occur.
• Backtracking with respect to the conflict set is called conflict-directed backjumping.
• The backjumping approach cannot, however, correct mistakes already committed in other branches.



Homework
• Solve the given map coloring problem using forward checking.
Homework
• Try to assign {Purple, Pink, Yellow} to the given graph below using intelligent backtracking in CSP.
Homework
• Color the graph given below using heuristics in CSP.
Intelligent Agents
• An agent is something that perceives (senses) its environment, makes decisions, and acts on those decisions to achieve a goal (through actuators).
  Ex: robot vacuum cleaner
• Human agent:
  eyes, ears, and other organs as sensors; hands, legs, mouth, and other body parts as actuators.
  Humans act on their environment through what they sense.
• Robotic agent:
  cameras and infrared range finders as sensors; various motors as actuators.
Agents and Environments
• Percept: the agent's perceptual inputs at an instant.
• The agent function maps from percept sequences to actions:
  f : P* → A
• The agent program runs on the physical architecture to produce f.
• agent = architecture + program


Architecture of an Agent
An agent typically consists of 3 main components:
• Perception: the agent receives information from the environment (sensors).
• Decision making: the agent decides what action to take based on the perception (processing and reasoning).
• Action: the agent performs an action in the environment (actuators).
Ex: robot vacuum cleaner
How does the agent interact with the environment?
1. Perception
2. Action
3. Feedback
Vacuum-Cleaner World
• Percepts: location and state of the environment, e.g., [A, Dirty], [B, Clean]
• Actions: Left, Right, Suck, NoOp
Intelligent Agents
• The agent can operate without direct human intervention or other software methods. It controls its own activities and internal state. The agent independently decides which steps to take in its current condition to achieve the best improvement. The agent achieves autonomy when its behavior is determined by its own experience, through learning and adapting.
Rationality
• A rational agent is one that does the right thing, i.e., the table for the agent function is filled out "correctly."
• But what does it mean to do the right thing? We use a performance measure to evaluate any given sequence of environment states (what matters is whether the resulting sequence of states is desirable, not a fixed rule).
Definition of a rational agent:
For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent possesses.
• Importantly, we emphasize that performance is assessed in terms of environment states and not agent states; self-assessment is often susceptible to self-delusion.
• Here is a relevant rule of thumb: it is advisable to design performance measures according to what one actually wants in the environment, as opposed to how one believes the agent should behave.
Rationality:
What is rational at any given time depends on (at least) four things:
(1) The performance measure (e.g., the amount of dirt cleaned)
(2) The agent's prior knowledge (e.g., after trying to clean Room A, move to Room B)
(3) The actions the agent can perform (e.g., the agent cannot clean corners, but can clean the center)
(4) The agent's percept sequence to date (e.g., what the agent has learnt over time)
Rationality
• Rationality is distinct from omniscience.
• An omniscient agent knows the actual outcome of its actions and can act accordingly; but omniscience is impossible in reality.
• Rationality maximizes expected performance, not actual performance, while perfection is maximizing actual performance.
• A rational agent should not only gather information but also learn as much as possible from what it perceives (information gathering, exploration).
• A rational agent should be autonomous: it should learn what it can to compensate for partial or incorrect prior knowledge. After sufficient experience of its environment, the behavior of a rational agent can become effectively independent of its prior knowledge.
PEAS
To design a rational agent, we must specify the task environment.
Consider, e.g., the task of designing an automated taxi:
• Performance measure: safety, destination, profits, legality, comfort, ...
• Environment: streets/freeways, traffic, pedestrians, weather, ...
• Actuators: steering, accelerator, brake, horn, speaker/display, ...
• Sensors: video, accelerometers, gauges, engine sensors, keyboard, GPS, ...
PEAS
• To design a rational agent, we must specify the task environment.
• PEAS stands for Performance measure, Environment, Actuators, and Sensors. PEAS defines AI models and helps determine the task environment for an intelligent agent.
• Performance measure: defines the success of an agent. It evaluates the criteria that determine whether the system performs well.
• Environment: the external context in which an AI system operates. It encapsulates the physical and virtual surroundings, including other agents, objects, and conditions.
• Actuators: responsible for executing actions based on the decisions made. They interact with the environment to bring about the desired changes.
• Sensors: the means by which an agent observes and perceives its environment.
PEAS Example
Environment Types
https://artificialintelligence.readthedocs.io/en/latest/part1/chap2.html
https://www.youtube.com/watch?v=V6lWWaAInvg
task environment | observable | determ./stochastic | episodic/sequential | static/dynamic | discrete/continuous | agents
crossword puzzle | fully | determ. | sequential | static | discrete | single
chess with clock | fully | strategic | sequential | semi | discrete | multi
poker | partial | stochastic | sequential | static | discrete | multi
backgammon | fully | stochastic | sequential | static | discrete | multi
taxi driving | partial | stochastic | sequential | dynamic | continuous | multi
medical diagnosis | partial | stochastic | sequential | dynamic | continuous | single
image analysis | fully | determ. | episodic | semi | continuous | single
part-picking robot | partial | stochastic | episodic | dynamic | continuous | single
refinery controller | partial | stochastic | sequential | dynamic | continuous | single
interactive English tutor | partial | stochastic | sequential | dynamic | discrete | multi
Agent Types
• Five basic types, in order of increasing generality (problem solvers):
• Table-driven agents
• Simple reflex agents
• Model-based reflex agents
• Goal-based agents
• Utility-based agents
Table-Driven Agent
• A table-driven agent keeps the current state of the decision process in a lookup table indexed by the percept sequence. (An example agent program for the vacuum world is below.)

function REFLEX-VACUUM-AGENT([location, status]) returns an action
  if status = Dirty then return Suck
  else if location = A then return Right
  else if location = B then return Left

• The simple reflex agent: these agents select actions on the basis of the current percept, ignoring the rest of the percept history.
• NO MEMORY: fails if the environment is only partially observable.
• Example: vacuum cleaner world
The main difference between simple reflex agents and model-based reflex agents is that the latter keep track of the state of the world.
Model-Based Reflex Agents
• Keep a description of the current world state, and model the state of the world by:
  – modeling how the world changes
  – modeling how the agent's actions change the world
• This can work even with partial information.
• It is, however, unclear what to do without a clear goal.
function MODEL-BASED-REFLEX-AGENT(percept) returns an action
  persistent: state, the agent's current conception of the world state
              model, a description of how the next state depends on the current state and action
              rules, a set of condition-action rules
              action, the most recent action, initially none

  state ← UPDATE-STATE(state, action, percept, model)
  rule ← RULE-MATCH(state, rules)
  action ← rule.ACTION
  return action
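A minimal runnable sketch of this loop for the vacuum world; the internal model and the condition-action rules below are illustrative assumptions, not from the slides:

class ModelBasedVacuumAgent:
    # Internal state: what the agent believes about each square.
    def __init__(self):
        self.model = {'A': None, 'B': None}     # believed status of each square

    def __call__(self, percept):
        location, status = percept
        self.model[location] = status           # UPDATE-STATE from the percept
        if status == 'Dirty':                   # rule: dirty square -> clean it
            return 'Suck'
        if all(s == 'Clean' for s in self.model.values()):
            return 'NoOp'                       # model says the whole world is clean
        return 'Right' if location == 'A' else 'Left'

agent = ModelBasedVacuumAgent()
print(agent(('A', 'Dirty')))   # Suck
print(agent(('A', 'Clean')))   # Right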
The main difference between model-based reflex agents and goal-based agents is that the latter do not act on fixed condition-action rules, but on some sort of goal information that describes situations that are desirable (e.g., in the case of route-finding, the destination).
Goal-Based Agents
Goals provide a reason to prefer one action over another. We need to predict the future: we need to plan and search.
The main difference between goal-based agents and utility-based agents is that the performance measure is more general. It does not only consider a binary distinction between "goal achieved" and "goal not achieved" but allows comparing different world states according to their relative or expected utility (i.e., how happy the agent is with the resulting state, based on, e.g., speed and energy consumption in routing).
Utility-Based Agents
Some solutions to goal states are better than others; which one is best is given by a utility function.
Which combination of goals is preferred?
Learning Agents
How does an agent improve over time? By monitoring its own performance and suggesting better modeling, new action rules, etc.
[Learning agent diagram: one component evaluates the current world state; another suggests changes to the action rules; the "old agent" models the world and decides on the actions to be taken; a further component suggests explorations.]
References
1. Parag Kulkarni, Prachi Joshi, "Artificial Intelligence: Building Intelligent Systems", PHI Learning Private Ltd, 2015.
2. Kevin Knight, Elaine Rich, B. Nair, "Artificial Intelligence (SIE)", McGraw Hill, 2008.
3. Stuart Russell, Peter Norvig, "Artificial Intelligence: A Modern Approach", 2nd Edition, Pearson Education, 2007.
4. www.javatpoint.com
5. www.geeksforgeeks.org
6. www.mygreatlearning.com
7. www.tutorialspoint.com
8. https://www.youtube.com/watch?v=cS7Khd0qtVY
9. https://www.youtube.com/watch?v=R6S7UqkFg8E