
ARTIFICIAL INTELLIGENCE

UNIT - III

BY
Ms.P.VIDYASRI, AP/CSE
Adversarial Search Problems and Intelligent Agent
1. Adversarial Search Methods (Game Theory)
2. Minimax Algorithm
3. Alpha-Beta Pruning
4. Constraint Satisfaction Problems
5. Constraints
6. Cryptarithmetic Puzzles
7. Constraint Domain
8. CSP as a search problem (Room colouring)
9. Intelligent Agent
10. Rationality and Rational Agent
11. Performance Measures
12. Rationality and Performance
13. Flexibility and Intelligent Agents
14. Task environment and its properties
15. Types of agents
Adversarial Search Methods (Game Theory):
• Searches required for game playing (between two players) are a bit different from ordinary search.

• State space is represented in a tree.

• Games may give either perfect or imperfect information based on the type of game.

• Games have a multi-agent environment.

• The environment may be cooperative or competitive.

• Adversarial search is a technique used in artificial intelligence for decision-making in competitive scenarios.

• The goal is to choose the best possible action considering the moves of an opponent.

• Two primary algorithms used for adversarial search are Minimax and Alpha-Beta Pruning.
Minimax Algorithm:
• In this algorithm, the AI player explores the entire game tree, considering all possible moves and their
outcomes.
• It alternates between maximizing its own utility (score) and minimizing the opponent's utility.
• The AI assumes that the opponent will make the best possible move.
• The Minimax algorithm recursively evaluates the game tree until it reaches a terminal state (end of the game),
and then it chooses the move that leads to the best possible outcome for itself.
• The Minimax algorithm can be used in two-player games (one player is called the Maximizer and the other the
Minimizer) such as tic-tac-toe, chess, etc.
• The Maximizer tries to get the maximum possible score; the Minimizer tries to get the minimum possible score.
• The algorithm performs a depth-first search (DFS), going all the way down through the tree to reach the terminal nodes.
• At the terminal nodes, the utility values are given; these values are compared and backed up the tree until
the initial state is reached.
Algorithm:
minimax(node, depth, player)
    if depth == 0 or node is a terminal node then
        return value(node)
    if player == 'MAX' then                 // for the Maximizer player
        best = -infinity
        for each child of node do
            value = minimax(child, depth-1, 'MIN')
            best = max(best, value)         // take the maximum of the child values
        return best
    else                                    // for the Minimizer player
        best = +infinity
        for each child of node do
            value = minimax(child, depth-1, 'MAX')
            best = min(best, value)         // take the minimum of the child values
        return best
A maximizing player calls it as minimax(start, depth, 'MAX').
Step-1:
• The algorithm generates the entire game tree and applies the utility function to get the utility values for the
terminal states.
• Let A be the initial state of the tree.
• Suppose the Maximizer takes the first turn, which has worst-case initial value = -∞,
and the Minimizer takes the next turn,
which has worst-case initial value = +∞.
Step-2:
• Find the utility values for the Maximizer. Its initial value is -∞, so each terminal value is compared with the
Maximizer's initial value to determine the values of the nodes one level up, taking the maximum among them all.

• For node D: max(-1, 4) = 4
• For node E: max(2, 6) = 6
• For node F: max(-3, -5) = -3
• For node G: max(0, 7) = 7
Step-3:
• In the next step, it is the Minimizer's turn, so it compares all node values with +∞ and finds the 3rd-layer
node values.

• For node B= min(4,6) = 4


• For node C= min(-3, 7) = -3
Step-4:
• Now it is the Maximizer's turn again; it chooses the maximum of all node values and finds the value for
the root node.
• In this game tree there are only 4 layers, so we immediately reach the root node, but in real games there
will be more than 4 layers.

• For node A max(4, -3)= 4
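The walkthrough above can be verified with a short program. Below is a minimal Python sketch of minimax applied
to the example tree (terminal values -1, 4, 2, 6, -3, -5, 0, 7, as in the steps above); the list-based tree
representation and the names are illustrative:

def minimax(node, depth, maximizing):
    # A leaf is a number; an internal node is a list of children.
    if depth == 0 or not isinstance(node, list):
        return node
    if maximizing:   # Maximizer: take the largest child value
        return max(minimax(child, depth - 1, False) for child in node)
    else:            # Minimizer: take the smallest child value
        return min(minimax(child, depth - 1, True) for child in node)

# A -> B, C; B -> D, E; C -> F, G; leaves as in the example.
tree = [[[-1, 4], [2, 6]],    # B: D = [-1, 4], E = [2, 6]
        [[-3, -5], [0, 7]]]   # C: F = [-3, -5], G = [0, 7]
print(minimax(tree, 3, True))  # prints 4, the value found for node A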


Alpha-beta pruning:
• Alpha-beta pruning is a modified version of the minimax algorithm.
• It is an optimization technique for the minimax algorithm.
• The two-parameter can be defined as:
• Alpha: The best (highest-value) choice at any point along the path of Maximizer. The initial value of alpha
is -infinity.
• Beta: The best (lowest-value) choice at any point along the path of Minimizer. The initial value of beta
is +infinity.
• The main condition required for alpha-beta pruning (i.e., for cutting off a branch) is:
α >= β
• The Max player will only update the value of alpha.
• The Min player will only update the value of beta.
• While backtracking the tree, the node values (not the alpha and beta values) are passed up to the parent nodes.
• The alpha and beta values are only passed down to the child nodes.
Algorithm:
pruning(node, depth, alpha, beta, player)
    if depth == 0 or node is a terminal node then
        return value(node)
    if player == 'MAX' then                          // for the Maximizer player
        best = -infinity
        for each child of node do
            value = pruning(child, depth-1, alpha, beta, 'MIN')
            best = max(best, value)                  // take the maximum of the child values
            alpha = max(alpha, best)                 // only Max updates alpha
            if beta <= alpha then
                break                                // prune the remaining children
        return best
    else                                             // for the Minimizer player
        best = +infinity
        for each child of node do
            value = pruning(child, depth-1, alpha, beta, 'MAX')
            best = min(best, value)                  // take the minimum of the child values
            beta = min(beta, best)                   // only Min updates beta
            if beta <= alpha then
                break                                // prune the remaining children
        return best
Step 1: In the first step, the Max player makes the first move from node A, where α = -∞ and β = +∞. These values
of alpha and beta are passed down to node B, where again α = -∞ and β = +∞, and node B passes the same values
to its child D.
Step 2: At node D, the value of α is calculated, as it is Max's turn. The value of α is compared first with 2
and then with 3; max(2, 3) = 3 becomes the value of α at node D, and the node value is also 3.
Step 3: Now the algorithm backtracks to node B, where the value of β changes, as this is Min's turn. Now β =
+∞ is compared with the available successor node value: min(∞, 3) = 3, hence at node B now α = -∞ and
β = 3.

In the next step, the algorithm traverses the next successor of node B, which is node E, and the values α = -∞ and
β = 3 are passed down as well.
Step 4: At node E, Max takes its turn, and the value of alpha changes. The current value of alpha is
compared with 5, so max(-∞, 5) = 5, hence at node E α = 5 and β = 3. Since α >= β, the right successor of E is
pruned, the algorithm does not traverse it, and the value at node E becomes 5.
Step 5: Next, the algorithm again backtracks the tree, from node B to node A. At node A, the value of
alpha is changed to the maximum available value, 3, as max(-∞, 3) = 3, and β = +∞. These two values are now
passed to the right successor of A, which is node C.
At node C, α = 3 and β = +∞, and the same values are passed on to node F.
Step 6: At node F, the value of α is again compared, first with the left child 0 (max(3, 0) = 3) and then with the
right child 1 (max(3, 1) = 3), so α remains 3; the node value of F itself becomes max(0, 1) = 1.
Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; here the value of beta changes by
comparison with 1, so min(∞, 1) = 1. Now at C, α = 3 and β = 1, and again the condition α >= β is satisfied, so the
next child of C, which is G, is pruned, and the algorithm does not compute the entire subtree of G.
Step 8: C now returns the value 1 to A, where the best value for A is max(3, 1) = 3. The final game tree shows
which nodes were computed and which were never computed (pruned). Hence the optimal value for the Maximizer
is 3 for this example.
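The walkthrough can be reproduced in code. Below is a minimal Python sketch run on the tree above, with
D = [2, 3], E = [5, ?], F = [0, 1] and G pruned. The example never reveals the pruned leaves, so the values 9, 7
and 5 below are placeholders (they cannot affect the result), and the names are illustrative:

def alphabeta(node, depth, alpha, beta, maximizing, visited):
    if depth == 0 or not isinstance(node, list):
        visited.append(node)             # record which leaves are actually evaluated
        return node
    if maximizing:
        best = float('-inf')
        for child in node:
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False, visited))
            alpha = max(alpha, best)
            if beta <= alpha:            # cut-off: prune the remaining children
                break
        return best
    else:
        best = float('inf')
        for child in node:
            best = min(best, alphabeta(child, depth - 1, alpha, beta, True, visited))
            beta = min(beta, best)
            if beta <= alpha:            # cut-off: prune the remaining children
                break
        return best

# Tree from the walkthrough; 9, 7, 5 stand in for the pruned leaves.
tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
visited = []
print(alphabeta(tree, 3, float('-inf'), float('inf'), True, visited))  # prints 3
print(visited)  # [2, 3, 5, 0, 1] -- the pruned leaves are never evaluated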
Constraint Satisfaction Problem:
• Constraint Satisfaction Problem (CSP) is a classical problem-solving technique in artificial intelligence that
involves finding a solution to a set of variables subject to specified constraints.
• CSPs are widely used in various domains, including scheduling, planning, resource allocation, and configuration.
• Adding constraints to a problem reduces its search space.
• The problem needs to satisfy a set of constraints, such as logical constraints, algebraic constraints, data
constraints, and even resource constraints and action constraints.

• Here are the key components of a Constraint Satisfaction Problem:

1.Variables: Variables represent the unknowns or decision variables in the problem. Each variable has a domain,
which is the set of possible values it can take.
2.Domains: Domains represent the set of values that each variable can take. These values must satisfy certain
constraints.
3.Constraints: Constraints specify relationships or restrictions among variables. They define which combinations
of variable assignments are allowed or disallowed.

• In a 6x6 Sudoku problem, we have a 6x6 grid, divided into 6 rows and 6 columns.
• The objective is to fill in each cell of the grid with a number from 1 to 6, such that each row, each column,
and each of the six 2x3 subgrids (often called "boxes") contains all of the numbers from 1 to 6 without any
repetition.
Variables:
• Each cell in the grid represents a variable. So, in a 6x6 Sudoku, there are 36 variables in total, one for each
cell.
Domain:
• The domain of each variable is the set {1, 2, 3, 4, 5, 6}, since we're dealing with numbers from 1 to 6.
Constraints:
1.Row Constraint: Each number in a row must be unique.
2.Column Constraint: Each number in a column must be unique.
3.Box Constraint: Each number in a 2x3 subgrid (box) must be unique.
These constraints ensure that the solution satisfies the rules of Sudoku.
• Now, the CSP solver's task is to find an assignment of values to variables (i.e., fill in the grid) such that all
constraints are satisfied.
• This might involve various techniques like constraint propagation, backtracking, or local search algorithms
to efficiently search through the solution space and find a valid solution.
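The three constraint types can be expressed directly in code. Below is a minimal Python sketch (the function name
and the sample grid are illustrative) that tests whether a completed 6x6 grid is a consistent assignment under the
row, column, and box constraints:

def is_consistent(grid):
    digits = [1, 2, 3, 4, 5, 6]
    # Row constraint: each number in a row must be unique.
    if any(sorted(row) != digits for row in grid):
        return False
    # Column constraint: each number in a column must be unique.
    if any(sorted(grid[r][c] for r in range(6)) != digits for c in range(6)):
        return False
    # Box constraint: each 2x3 subgrid must contain 1..6 exactly once.
    for br in range(0, 6, 2):        # box top rows: 0, 2, 4
        for bc in range(0, 6, 3):    # box left columns: 0, 3
            box = [grid[br + i][bc + j] for i in range(2) for j in range(3)]
            if sorted(box) != digits:
                return False
    return True

grid = [[1, 2, 3, 4, 5, 6],
        [4, 5, 6, 1, 2, 3],
        [2, 3, 1, 5, 6, 4],
        [5, 6, 4, 2, 3, 1],
        [3, 1, 2, 6, 4, 5],
        [6, 4, 5, 3, 1, 2]]
print(is_consistent(grid))  # True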
• Constraints reduce the computational complexity of the search, in both time and space.
• In Constraint Satisfaction Problems (CSPs), various terms describe different aspects of assignments of values to
variables within the problem:
1.Assignment: An assignment is a mapping of values to the variables in the CSP. For example, in a Sudoku puzzle,
assigning a number to a cell is an assignment.
2.Consistent Assignment: An assignment is consistent if it does not violate any of the constraints of the CSP, i.e.,
it adheres to all the rules and requirements of the problem. For instance, in a Sudoku puzzle, if no row, column,
or box has repeated numbers, the assignment is consistent.
3.Complete Assignment: An assignment is complete if it includes values for all variables in the CSP, i.e., every
variable has been assigned a value. In a Sudoku puzzle, a complete assignment is one where every cell has a
number filled in; a complete and consistent assignment solves the puzzle.
Problem: Suppose we have three tasks (Task A, Task B, and Task C) that need to be scheduled on three days (Day
1, Day 2, and Day 3). Each task has specific requirements regarding which days it can be scheduled.
Components:
1.Variables: Each task represents a variable.
1. Variable 1: Task A
2. Variable 2: Task B
3. Variable 3: Task C
2.Domains: The domain of each variable represents the set of possible days on which the task can be scheduled.
1. Domain 1: {Day 1, Day 2, Day 3}
2. Domain 2: {Day 1, Day 2, Day 3}
3. Domain 3: {Day 1, Day 2, Day 3}
3.Constraints: The constraints specify the relationships or restrictions among variables. Let's say the constraints
are as follows:
1. Task A cannot be scheduled on the same day as Task B.
2. Task C must be scheduled on Day 3.
Solution:
Given the problem and constraints, a valid solution could be:
• Task A: Day 1
• Task B: Day 2
• Task C: Day 3
This solution satisfies all constraints:
• Task C is scheduled on Day 3, as required.
• Task A and Task B are scheduled on different days, satisfying the other constraint.
This is a simple example of a CSP involving scheduling tasks on specific days while satisfying constraints. CSP
algorithms can be used to find solutions to more complex scheduling problems with additional constraints and
variables.
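A problem this small can be solved by plain enumeration. Below is a minimal Python sketch (the names are
illustrative) that enumerates all day assignments for the three tasks and keeps those satisfying both constraints:

from itertools import product

days = ['Day 1', 'Day 2', 'Day 3']

solutions = [
    {'A': a, 'B': b, 'C': c}
    for a, b, c in product(days, repeat=3)
    if a != b            # Task A cannot be on the same day as Task B
    and c == 'Day 3'     # Task C must be scheduled on Day 3
]

for s in solutions:
    print(s)
# 6 assignments satisfy both constraints;
# {'A': 'Day 1', 'B': 'Day 2', 'C': 'Day 3'} is among them.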
• In Constraint Satisfaction Problems (CSPs), the domain refers to the set of possible values that each variable
can take.
• Depending on the problem domain, these values can be categorized into different types:

1. Discrete Domain: In a discrete domain, the values that variables can take are distinct and separate. Examples
include the digits {1, 2, 3, 4, 5, 6} in a Sudoku puzzle or the set {red, green, blue} representing colors in a map
coloring problem.

2. Continuous Domain: In a continuous domain, the values that variables can take form a continuous range, often
representing real numbers. For example, with a sensor monitoring the outdoor temperature, the variable
representing the temperature could take values like 10.5°C, 20.2°C, 30.1°C, and so on.

3. Finite Domain: In a finite domain, the set of possible values for variables is finite, meaning there is a limited
number of distinct values that can be assigned to each variable. Examples include the digits {1, 2, 3, 4, 5, 6} in a
6x6 Sudoku puzzle or the set {red, green, blue} representing colors in a map coloring problem with a limited
number of regions.

4. Infinite Domain: In an infinite domain, the set of possible values for variables is infinite, meaning there is an
unlimited number of values that can be assigned to each variable. Examples include problems involving real
numbers, where the domain extends infinitely in one or more directions.
• In Constraint Satisfaction Problems (CSPs), constraints define relationships among variables.
• Here are different types of constraints commonly encountered:
1. Unary Constraint: A unary constraint involves a single variable. Example: X ≠ 3
2. Binary Constraint: A binary constraint relates two variables. It specifies restrictions on the combinations of
values that two variables can simultaneously take. Example: X1 ≠ X2
3. Ternary Constraint: A ternary constraint involves three variables. It specifies restrictions on the combinations of
values that three variables can simultaneously take. Example: Y lies between X and Z, i.e., X < Y < Z
4. Global Constraint: Global constraints involve an arbitrary number of variables (more than three) and can
express complex relationships between them. They are often used to capture patterns or structures that involve
more than just a few variables. Example: 6x6 Sudoku.
5. Preference Constraint: Preference constraints express preferences or priorities among solutions. They are not
hard constraints that must be satisfied, but rather guide the search towards more desirable solutions. Example:
allocating classes for a professor in a college timetable-scheduling problem, with morning sessions preferred.
Cryptarithmetic Puzzles:
• Cryptarithmetic puzzles, also known as alphametics, are puzzles where arithmetic operations (usually
addition or multiplication) are performed using letters to represent digits.
• The challenge is to assign each letter a digit in such a way that the arithmetic equation is satisfied.
• This type of problem is commonly used in AI and constraint satisfaction problems.
Problem: 1
USA
+ USSR
________
PEACE
• In this puzzle, each letter represents a unique digit from 0 to 9.
• The goal is to find the digit-to-letter mappings that satisfy the equation.
To solve this cryptarithmetic puzzle using AI techniques, we can formulate it as a Constraint Satisfaction
Problem (CSP):
Variables:
Each letter in the puzzle represents a variable. In this case, the variables are U, S, A, R, P, E, and C.
Domains:
The domain of each variable is the set of digits {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.
Constraints:
All letters must be assigned different digits (uniqueness constraint), and the leading letters U and P cannot be 0.

The addition equation must hold true, column by column, where c1, c2, c3, c4 are the carry digits:

1. A + R = E + 10·c1 (units place)
2. S + S + c1 = C + 10·c2 (tens place)
3. U + S + c2 = A + 10·c3 (hundreds place)
4. U + c3 = E + 10·c4 (thousands place)
5. c4 = P (ten-thousands place)
With these constraints, we can use constraint satisfaction algorithms (e.g., backtracking with constraint
propagation) to find the digit-to-letter mappings that satisfy the puzzle.
For the example above, a solution could be:

   932
+ 9338
______
 10270

where P = 1, E = 0, A = 2, C = 7, U = 9, S = 3, R = 8.

AI techniques such as constraint satisfaction can be applied to solve a wide range of these puzzles efficiently.
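Such puzzles are small enough to solve by enumerating digit permutations. Below is a minimal Python sketch for
USA + USSR = PEACE (the helper name word_value is illustrative):

from itertools import permutations

letters = 'USARPEC'   # the seven distinct letters in the puzzle

def word_value(word, assign):
    # Convert a word to its numeric value under a letter-to-digit assignment.
    value = 0
    for ch in word:
        value = value * 10 + assign[ch]
    return value

for digits in permutations(range(10), len(letters)):
    assign = dict(zip(letters, digits))
    if assign['U'] == 0 or assign['P'] == 0:   # no leading zeros
        continue
    if word_value('USA', assign) + word_value('USSR', assign) == word_value('PEACE', assign):
        print(assign)   # U=9, S=3, A=2, R=8, P=1, E=0, C=7 is among the solutions printed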
Problem: 2

Find A! + B! + C! = ABC

As we know, 1! = 1, 2! = 2, 3! = 6, 4! = 24, 5! = 120, 6! = 720

Solution:

1! + 4! + 5! = 1 + 24 + 120 = 145
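A brute-force check over all three-digit numbers confirms that 145 is the only solution; a minimal Python sketch:

from math import factorial

for n in range(100, 1000):
    a, b, c = n // 100, (n // 10) % 10, n % 10   # the digits A, B, C
    if factorial(a) + factorial(b) + factorial(c) == n:
        print(n)   # prints only 145 (1! + 4! + 5! = 145)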
Map coloring problem:
• The map coloring problem is a classic problem in artificial intelligence (AI) and graph theory.
• It involves assigning colors to regions on a map in such a way that no two adjacent regions have the same
color.
• This problem is often used to illustrate various AI techniques, including constraint satisfaction, search
algorithms, and heuristic methods.
• Representation of Western Australia as WA, Northern Territory as NT and so on.
Problem Description:
Given a map with regions (vertices) and borders (edges) connecting them, the task is to assign a color to each region
in such a way that no two adjacent regions share the same color. The objective is to find a valid coloring for the map.

CSP Formulation:
1.Variables:
Each variable represents a region on the map.
{WA,SA,NT,Q,NSW,V,T}
2.Domain:
• The domain of each variable is the set of colors that can be used to color the region. Typically, this domain
consists of a finite number of colors
{RED, GREEN, BLUE}.
3.Constraints:
• For each pair of adjacent regions, there is a constraint that the colors assigned to them must be different.
{WA ≠ NT, WA ≠ SA, NT ≠ Q, NT ≠ SA, SA ≠ Q, SA ≠ NSW, SA ≠ V, Q ≠ NSW, NSW ≠ V}. T does not appear in
any constraint, as Tasmania has no common border with any other region. If SA is coloured BLUE, then WA and
NT must not be BLUE; if, in addition, WA is RED, then NT must be GREEN.

Benefits:
• If any constraint fails, that branch of the search stops immediately.
• This greatly reduces the search space of 3^7 = 2187 possible assignments (7 regions, 3 colours each).
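The colouring CSP above can be solved with plain backtracking. Below is a minimal Python sketch over the
regions and constraints listed above (the function names and the variable ordering are illustrative):

regions = ['WA', 'NT', 'SA', 'Q', 'NSW', 'V', 'T']
colors = ['RED', 'GREEN', 'BLUE']
neighbors = [('WA','NT'), ('WA','SA'), ('NT','Q'), ('NT','SA'),
             ('SA','Q'), ('SA','NSW'), ('SA','V'), ('Q','NSW'), ('NSW','V')]

def consistent(region, color, assignment):
    # A colour is allowed if no already-coloured neighbour has the same colour.
    for a, b in neighbors:
        if a == region and assignment.get(b) == color:
            return False
        if b == region and assignment.get(a) == color:
            return False
    return True

def backtrack(assignment):
    if len(assignment) == len(regions):   # complete assignment found
        return assignment
    region = regions[len(assignment)]     # next unassigned region
    for color in colors:
        if consistent(region, color, assignment):
            assignment[region] = color
            result = backtrack(assignment)
            if result:
                return result
            del assignment[region]        # undo and try the next colour
    return None

print(backtrack({}))
# {'WA': 'RED', 'NT': 'GREEN', 'SA': 'BLUE', 'Q': 'RED',
#  'NSW': 'GREEN', 'V': 'RED', 'T': 'RED'}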
Intelligent Agents:
• Intelligence is a cumulative manifestation of different activities such as learning, sensing, understanding and
knowledge augmentation.
• Intelligence has an association with environment.
This intelligence has to be gathered and processed in order to take appropriate action, so there is a need for an
entity called an agent.
• Intelligent agents are a fundamental concept in artificial intelligence (AI) that refers to autonomous entities
capable of perceiving their environment, reasoning, and taking actions to achieve specific goals.
• An agent is anything that perceives its environment through sensors and acts upon that environment
through actuators.
• Agent, environment and interaction between them are very important.
• Here are some key components of intelligent agents:
1. Perception: Agents perceive their environment through sensors or by receiving data inputs.
2. Reasoning: Agents use their internal knowledge representation and processing capabilities to make decisions
based on the information they've perceived. This may involve logical reasoning, probabilistic reasoning, or
other methods of inference.
3. Decision-making: Based on their perception and reasoning, agents select appropriate actions to achieve their
goals. This could involve choosing from a set of predefined actions or learning new actions through
experience.
4. Acting: Agents execute the chosen actions in their environment, which may lead to changes in the
environment that the agent perceives, thus starting the cycle anew.
Three essential components:

• Sensor: Sensor is a device which detects the change in the environment and sends the information to other
electronic devices. An agent observes its environment through sensors.

• Actuators: Actuators are the components of a machine that carry out the agent's decisions by taking
actions.

• Effectors: Effectors are the devices which perform action in the environment. Effectors can be legs, wheels,
arms, fingers, wings, fins, and display screen.
How it works:
1. Perceptions: The agent constantly gathers information about its environment through sensors (cameras,
LiDAR in robots). Each piece of information is called a percept.
2. Percept Sequence: As the agent operates, it builds a history of all its past perceptions. This sequence of
percepts becomes the input for the agent function.
3. Agent Function: This function acts like a decision-making algorithm. The agent function is a
theoretical/mathematical concept: the agent's behavior is described by an agent function that maps any given
percept sequence to an action,

f : P* → A

where P* represents the set of percept sequences and A represents the set of actions. Practically, the agent
function is described by an agent program, which is an actual implementation of that function in code form; the
program translates the mathematical idea of the function into the real world of the agent. It analyzes the entire
percept sequence, considering the complete history of the agent's experience in the environment.
4. Action Output: Based on the analysis, the agent function determines the most suitable action for the agent to
take in the current situation. This action could be physical (a robot moving its arm) or digital (a recommendation
system suggesting a movie).
• Essentially, the agent function acts as a bridge between the agent's perception of the world and its actions
within that world.
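To make the mapping concrete, here is a minimal Python sketch of a table-driven agent function, using the
vacuum-world percepts described later in this unit. The table entries and names are hypothetical illustrations; a
real table would need one entry per possible percept sequence:

# The table maps whole percept sequences (not just the current percept) to actions.
table = {
    (('A', 'Dirty'),): 'Suck',
    (('A', 'Clean'),): 'Right',
    (('A', 'Clean'), ('B', 'Dirty')): 'Suck',
    # ... one entry per possible percept sequence
}

percepts = []   # the percept sequence built up so far

def table_driven_agent(percept):
    percepts.append(percept)                    # record the new percept
    return table.get(tuple(percepts), 'NoOp')   # look up the agent function

print(table_driven_agent(('A', 'Clean')))   # Right
print(table_driven_agent(('B', 'Dirty')))   # Suck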

• Imagine a self-driving car navigating a busy street.

• Perceptions: The car receives a continuous stream of data from its cameras and LiDAR (seeing a red light,
detecting a car slowing down in front).

• Percept Sequence: This builds a history of what the car has observed (red light appeared at time X, car in front
started braking at time Y).

• Agent Function: This function analyzes the entire sequence (considering the red light and the slowing car).

• Action Output: Based on this analysis, the function might decide to apply the brakes (action) to avoid a
collision.
Agent = Architecture + Program

• Architecture (Hardware): This refers to the physical platform on which the agent operates. It is essentially the
"body" of the agent. For example, for a robot, its architecture would include its sensors (cameras, LiDAR),
motors, and actuators (arms, wheels).

• Program (Software): This is the "brain" of the agent. It's the software or code that implements the agent
function and makes the agent intelligent. The program receives sensory data from the architecture, processes it
based on the agent's goals and knowledge, and decides on actions.

• Together, the architecture and program work in tandem to create a complete intelligent agent.

• Think of a person as an intelligent agent.

• The architecture would be the person's body - their eyes, ears, and muscles.

• The program would be the person's brain - their thoughts, decision-making processes, and learned
knowledge.
Types of agent programs/agents:

• Simple reflex agents

• Model-based reflex agents

• Goal-based agents

• Utility-based agents

• Learning agents
Simple Reflex Agent:
• A simple reflex agent is the most basic type of AI agent.

• Sensors receive percepts from the environment.

• Actuators perform the actions in the environment.

• React Now, Think Later: It reacts purely on its current perception of the world, without considering the past or
future.

• Condition-Action Rules: It has a set of pre-programmed rules like "if X happens, then do Y." If it's raining (X),
it opens the umbrella (Y).

• Choose the action based on Condition-Action Rules.

• Implement the action.

• These agents succeed only in a fully observable environment.

• They have no knowledge of the non-perceptual parts of the current state.

Model-based Reflex Agent:

• These agents maintain an internal state or model of the world and use it to make decisions beyond just the
current percept.

• A model-based reflex agent is a step up from a simple reflex agent in AI. It's like a simple agent with a built-in
mental map to help it navigate the world.

• Handling Partial Observability.

• Sense-Think-Act Cycle: It follows a three-step cycle:

• Sense: It perceives its surroundings using sensors.

• Think: It uses its internal model to understand what the current perception means in the context of the
bigger picture.
• Act: Based on its understanding, it chooses an action from a set of pre-programmed rules.

• For example, a chess-playing AI that considers the history of moves and the current board state to decide the
best move.
Goal-based Agent:

• A goal-based agent in AI is a significant step up in sophistication from reflex-based agents.

• These agents are driven by goals or objectives.

• They evaluate multiple possible actions based on how well they help achieve those goals.

• It has some knowledge about the environment and how its actions will affect it.

• This knowledge, stored as a knowledge representation, helps it evaluate different options and make informed
decisions.

• A goal-based agent has a clear objective in mind and actively works towards achieving it.

• After every action, the current state is compared with the goal state.

• Helpful for searching and planning problems.

• For example, A cleaning robot, cleans the room by identifying the dirt, until the room meets the predefined
cleanliness standard.
Utility-based Agent:

• Similar to goal-based agents, but they consider not just whether an action achieves a goal, but how desirable
the outcome is.
• Goals with Options: They still have goals, but they can often achieve those goals in multiple ways. The key
for a utility-based agent is to choose the best way, not just any way.
• Utility Function: This agent uses a special function called a utility function. This function assigns a
numerical value (the utility) to each possible outcome of an action. Higher values mean a more desirable
outcome.
• Maximizing Happiness (sort of): The agent tries to choose the action that leads to the outcome with the
highest utility. In a way, it's trying to maximize its "happiness" by achieving the goal in the most preferable
way according to the utility function.

• An associated degree of happiness (the utility) is calculated for each outcome.

• For example, A delivery drone that delivers packages to customers, considering factors such as delivery time,
customer satisfaction.
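As a sketch of how a utility function drives the choice, below is a minimal Python example for the drone scenario;
the weights, outcome attributes, and values are illustrative assumptions, not from the slides:

# Hypothetical utility function: weigh customer satisfaction against delivery time.
def utility(outcome):
    return 0.7 * outcome['customer_satisfaction'] - 0.3 * outcome['delivery_time']

outcomes = {
    'fast_route':   {'delivery_time': 10, 'customer_satisfaction': 8},
    'scenic_route': {'delivery_time': 30, 'customer_satisfaction': 9},
}

# Choose the action whose outcome has the highest utility.
best = max(outcomes, key=lambda action: utility(outcomes[action]))
print(best)   # fast_route: utility 2.6 vs -2.7 for scenic_route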
Learning Agent:

• In artificial intelligence, a learning agent is an agent that possesses the ability to learn from its past experiences.

• It begins with rudimentary information and gains the ability to act and adapt on its own through learning.

• Four primary conceptual components:

• Learning element: Advancing by picking up knowledge from the environment.

• Critic: Provides criticism/feedback that indicates how well the agent is performing in relation to a predetermined
performance criterion.

• Performance element: Deciding which outside action to take.

• Problem generator: Making recommendations/suggestions for activities that will result in novel and informative
experiences.

• As a result, learning agents are capable of learning, evaluating performance, and identifying fresh approaches to raise
performance.

• For example, A spam filter that learns from user feedback.


Vacuum-cleaner world (Agent):

Goal: To clean up the whole area.


Environment: Squares A and B
Perception: Location and status (e.g., A/B, Clean/Dirty)
Actions: Move left, Move right, Suck, Do nothing (NoOp)
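This world is small enough that the whole agent program fits in a few lines. Below is a minimal Python sketch of
a simple reflex agent for it (modeled on the classic reflex vacuum agent; the names are illustrative):

def reflex_vacuum_agent(percept):
    # Map the current percept (location, status) directly to an action.
    location, status = percept
    if status == 'Dirty':
        return 'Suck'          # condition-action rule: if dirty, clean it
    elif location == 'A':
        return 'Right'         # square A is clean, move to B
    else:
        return 'Left'          # square B is clean, move to A

print(reflex_vacuum_agent(('A', 'Dirty')))   # Suck
print(reflex_vacuum_agent(('A', 'Clean')))   # Right
print(reflex_vacuum_agent(('B', 'Clean')))   # Left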
Rationality:
• In AI, rationality refers to an agent's ability to make decisions that achieve the best possible outcome, considering
its goals and the current situation.

• It's essentially about "doing the right thing."

• The rationality of an agent depends on:


• Performance Measure: This defines what success means for the agent. It could be winning a game,
maximizing efficiency, or achieving a specific user goal. The performance measure defines the criterion of
success (the goal state).
• Agent's Perception: This refers to the information the agent gathers about its environment through sensors
(real world) or data inputs (digital world), together with the agent's prior knowledge of the environment.
• Possible Actions: The agent considers all the actions it can take within its environment.
• Choosing the Right Action: The agent selects the action that best achieves its performance measure, given
its percept sequence up to now and its understanding of the environment.
Rational Agent:
• Building on the idea of rationality, a rational agent is an intelligent agent that strives to make the most logical
decisions to achieve its goals.

• It’s a special breed of Intelligent agent.

• Here are some key aspects of a rational agent:

• Clear Preferences: It has a well-defined set of goals and priorities.

• Models Uncertainty: It can account for incomplete information and make decisions even when the
environment is unpredictable.
• Maximizes Performance: It chooses actions that have the highest chance of achieving its goals,
considering the available information and potential outcomes.

• Think of a chess-playing AI program as an example. It perceives the board (environment), considers all
possible moves (actions), and chooses the one that best positions it for victory (performance measure).
Example of a rational agent (Vacuum-cleaner)
• Performance measure:
Awards one point/credit for each clean square at each time step, over 10000 time steps.
• Prior knowledge about the environment
The geography of the environment
Only two squares
The effect of the actions
• Actions that can perform
Left, Right, Suck and NoOp
• Percept sequences
Where is the agent?
Does the location contain dirt?

Under these circumstances, the agent is rational.


Task environments:
• Task environments are the problems, while rational agents are the solutions.
• To build a rational agent, we first specify the task environment using the PEAS description/representation.
• The PEAS representation includes
P - Performance measure
E - Environment
A - Actuators
S - Sensors
For the vacuum-cleaner agent above: P = cleanliness score over time, E = squares A and B, A = wheels and
suction mechanism, S = location and dirt sensors.

Environment types:
In AI, the environment refers to everything outside the agent that the agent interacts with to achieve a goal. These environments can be categorized
along several different aspects:

1.Observability:
Fully observable: The agent has access to all the information it needs about the environment. (Example: Chessboard)
Partially observable: The agent only has access to some of the information about the environment (Example: A robot navigating a maze with limited
sensors)

2.Determinism:
Deterministic: The environment follows a set of rules, and the next state is completely predictable based on the current state and the agent's actions.
(Example: Tic-tac-toe)
Stochastic: There's randomness in the environment, and the next state is not always predictable. (Example: Weather forecasting)

3.Agent Numbers:
Single-agent: There's only one agent interacting with the environment. (Example: Playing a game against a computer)
Multi-agent: There are multiple agents that may cooperate or compete with each other. (Example: Playing a game against another human player)
4.Agent interaction:
Episodic environments: These are like a collection of independent tasks. The agent makes decisions based
solely on the current situation, and past actions have no bearing on future ones. (Example: Customer Service
Chatbot)
Sequential environments: These are more like ongoing stories. The current state of the environment depends
on past actions, and the agent needs to consider this history to make optimal decisions. (Example: Robot
Navigation)

5.Dynamics:
Static: The environment remains constant throughout the interaction with the agent. (Example: Solving a math
equation)
Dynamic: The environment changes over time, and the agent's actions can influence those changes. (Example:
Self-driving car)

6.Action and State Space:
Discrete: There are a finite number of actions the agent can take and states the environment can be in.
(Example: Moving a knight in chess)
Continuous: The number of actions and states are infinite or very large. (Example: Robot arm with many
degrees of freedom)
7.Environmental awareness:
Known Environment: The agent has complete knowledge of the environment's rules, possible actions, and the
outcomes of those actions. (Example: Playing tic-tac-toe)
Unknown Environment: The agent has limited or no prior knowledge about the environment. It needs to learn
the rules and dynamics through exploration and interaction. (Example: Self-driving car in a new city)

8.State perception:
Accessible Environment: The agent has complete and accurate information about everything relevant in the
environment. It's like having a clear picture of the entire playing field. (Example: A chessboard)
Inaccessible Environment: The agent lacks complete or accurate information about the environment. It's like
operating in a fog or with limited senses. (Example: Self-driving car in bad weather)
