2021 Lecture02 IntelligentAgents
INTELLIGENT AGENTS
Agents and Environments
What is an Agent?
• AI studies how to make computers do things that, at the moment, people do better
• Extend what they do to huge data sets
• Do it fast, in near real time
• Make no mistakes
• Such systems are called agents.
What is an Agent?
• An agent perceives its environment through sensors and
acts upon that environment through actuators.
Examples of agents
The agent’s behavior
• Percept: the agent’s perceptual inputs at any given instant
• Percept sequence: the complete history of everything the
agent has ever perceived
• An agent’s behavior is described by the agent function that
maps any given percept sequence to an action.
𝒇: 𝓟* → 𝓐 (where 𝓟* is the set of percept sequences and 𝓐 the set of actions)
• Agent program: the implementation of the agent function
• The agent function is an abstract mathematical description; the agent program is its concrete, practical implementation.
The Vacuum-cleaner world
The Vacuum-cleaner world
Percept sequence                                Action
[A, Clean]                                      Right
[A, Dirty]                                      Suck
[B, Clean]                                      Left
[B, Dirty]                                      Suck
[A, Clean], [A, Clean]                          Right
[A, Clean], [A, Dirty]                          Suck
...
[A, Clean], [A, Clean], [A, Clean]              Right
[A, Clean], [A, Clean], [A, Dirty]              Suck

Partial tabulation of a simple agent function for the vacuum-cleaner world
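A minimal Python sketch of this tabulated agent function, with the table keyed by percept-sequence tuples (the representation is an illustrative assumption, not from the slides):

# Partial tabulation of the vacuum-world agent function.
# Each key is a percept sequence; each percept is (location, status).
agent_function = {
    (("A", "Clean"),): "Right",
    (("A", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
    (("B", "Dirty"),): "Suck",
    (("A", "Clean"), ("A", "Clean")): "Right",
    (("A", "Clean"), ("A", "Dirty")): "Suck",
}

percepts = (("A", "Clean"), ("A", "Dirty"))
print(agent_function[percepts])  # Suck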
The Vacuum-cleaner world
The agent program for a simple reflex agent in the two-state vacuum environment.
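The figure on this slide corresponds to AIMA's REFLEX-VACUUM-AGENT; a short Python transcription, assuming percepts arrive as (location, status) pairs:

def reflex_vacuum_agent(percept):
    # Simple reflex agent for the two-square vacuum world:
    # suck if dirty, otherwise move to the other square.
    location, status = percept
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    else:  # location == "B"
        return "Left"

print(reflex_vacuum_agent(("A", "Dirty")))  # Suck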
Why do we need agents?
• A tool for analyzing systems
• All areas of engineering can be seen as designing artifacts
that interact with the world.
• AI designs artifacts that have significant computational resources and whose task environments require nontrivial decision making
The concept of rationality
• Rationality
• Omniscience, learning, and autonomy
Rational agents
• A rational agent is one that does the right thing.
• Every entry in the table for the agent function is filled out correctly.
• What is the “right” thing?
• The actions that cause the agent to be most successful
• We need ways to measure success → a performance measure
Performance measure
• An agent, based on its percepts, generates a sequence of actions, which moves the environment through a sequence of states.
• If this sequence of states is desirable, then the agent has performed well.
• The performance measure evaluates any given sequence of environment states (remember, not agent states!).
• An objective function that decides how successfully the agent performs, e.g., 90%? 30%?
Design performance measures
• General rule: Design performance measures according to
What one actually wants in the environment
Not how one thinks the agent should behave
Rationality
• What is rational at any given time depends on
• The performance measure that defines the criterion of success
• The agent's prior knowledge of the environment
• The actions that the agent can perform
• The agent's percept sequence to date
Definition of a rational agent
For each possible percept sequence, a rational agent should select
an action that is expected to maximize its performance measure,
given the evidence provided by the percept sequence and whatever
built-in knowledge the agent has.
The Vacuum-cleaner agent
• Performance measure
• Award one point for each clean square at each time step, over 10,000 time steps (sketched in code after this list)
• Prior knowledge about the environment
• The geography of the environment (2 squares)
• The effect of the actions
• Actions that the agent can perform
• Left, Right, Suck and Do Nothing
• Percept sequences
• Where is the agent?
• Whether the location contains dirt?
• Under this circumstance, the agent is rational.
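A sketch of this performance measure as a function of the environment-state sequence; the state encoding (square name → Clean/Dirty) is an assumption for illustration:

def performance(state_sequence):
    # Award one point for each clean square at each time step.
    return sum(
        sum(1 for status in state.values() if status == "Clean")
        for state in state_sequence
    )

states = [{"A": "Dirty", "B": "Clean"}, {"A": "Clean", "B": "Clean"}]
print(performance(states))  # 3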
Omniscience, learning, and autonomy
Omniscience vs. Rationality
Omniscience:
• Knows the actual outcome of its actions in advance
• No other possible outcomes
• However, impossible in the real world
Rationality:
• Maximizes the performance measure given the percept sequence to date and prior knowledge
• Rationality is not perfection
• Example? Crossing a road after carefully checking for traffic is rational, even if an unforeseeable accident then occurs: the available percepts could not have predicted it.
Information gathering
• The agent must not engage in unintelligent activities through inattention, e.g., crossing a road without looking both ways.
• Information gathering: doing actions in order to modify future percepts (e.g., exploration)
• This is an important part of rationality.
Learning
• A rational agent must learn as much as possible from what it
perceives.
• Its initial configuration may be modified and augmented as it gains
experience.
• There are extreme cases in which the environment is completely known a priori; in such cases, the agent need not perceive or learn, it simply acts correctly.
Autonomy
• A rational agent should be autonomous – Learn what it can
to compensate for partial or incorrect prior knowledge.
• If an agent just relies on the prior knowledge of its designer rather
than its own percepts, then the agent lacks autonomy.
• E.g., a clock
• No input (percepts)
• Run its own algorithm (prior knowledge)
• No learning, no experience, etc.
The Nature of Environments
• Specifying the task environment
• Properties of task environments
The task environment
• Task environments are essentially the “problems” to which
rational agents are the “solutions.”
Problems – Solutions
The task environment
• The task environment includes (PEAS):
• Performance measure
• Environment
• Agent's Actuators
• Agent's Sensors
An example: Automated taxi driver
• Performance measure
• How can we judge the automated driver?
• Which factors are considered?
• getting to the correct destination
• minimizing fuel consumption
• minimizing the trip time and/or cost
• minimizing the violations of traffic laws
• maximizing the safety and comfort
• etc.
An example: Automated taxi driver
• Environment
• A variety of roads (rural lane, urban alley, etc.)
• Traffic lights, other vehicles, pedestrians, stray animals, road works,
police cars, puddles, potholes, etc.
• Interaction with the passengers
• Actuators (for outputs)
• Control over the accelerator, steering, gear shifting, and braking
• A display to communicate with the customers
• Sensors (for inputs)
• Controllable cameras for detecting other vehicles, road situations
• GPS (Global Positioning System) to know where the taxi is
• Many more devices are necessary: speedometer, accelerometer, etc.
Software agents
• Sometimes, the environment may not be the real world.
• E.g., flight simulator, video games, Internet
• They are all artificial but very complex environments
Properties of Task environment
Fully observable vs. Partially observable
• Fully observable: the agent's sensors give it access to the complete state of the environment at each point in time.
• Partially observable: sensors are noisy or inaccurate, or parts of the state are simply missing from the sensor data.
• E.g., a vacuum agent with only a local dirt sensor cannot tell whether the other square is dirty.
Single agent vs. Multiagent
• Single agent: An agent operates by itself in an environment.
• E.g., solving crossword → single-agent, playing chess → two-agent
• Which entities must be viewed as agents?
• B must be treated as an agent if its behavior is best described as maximizing a performance measure whose value depends on A's behavior.
• Competitive vs. Cooperative multiagent environments
• E.g., playing chess → competitive, driving on a road → cooperative
Deterministic vs. Stochastic
• Deterministic: The next state of the environment is
completely determined by the current state and the action
executed by the agent.
• E.g., the vacuum world → deterministic, driving on road → stochastic
• Most real situations are so complex that they must be
treated as stochastic.
Episodic vs. Sequential
• Episodic: The agent’s experience is divided into atomic
episodes, in each of which the agent receives a percept and
then performs a single action.
• The quality of each action depends only on the episode itself
• No need to think ahead
• Sequential: A current decision could affect all future decisions.
Static vs. Dynamic
• Static: The environment is unchanged while an agent is
deliberating.
• E.g., crossword puzzles → static, taxi driving → dynamic
• Dynamic: The environment can change while the agent is deliberating; the agent is continuously being asked what it wants to do.
• If it has not decided yet, that counts as deciding to do nothing.
Properties of Task environment
• Discrete vs. continuous
• The distinction applies to the state of the environment, to the way
time is handled, and to the agent’s percepts and actions
• E.g., chess has a finite number of distinct states, percepts, and actions, while a vehicle's speed and location sweep through a range of continuous values smoothly over time.
• Known vs. unknown
• Known environment: the outcomes (or outcome probabilities if the
environment is stochastic) for all actions are given.
• Unknown environment: the agent needs to learn how it works to
make good decisions.
Environments and their characteristics
Properties of Task environment
• The simplest environment: Fully observable, deterministic,
episodic, static, discrete and single-agent.
• Most real situations: Partially observable, stochastic,
sequential, dynamic, continuous and multi-agent.
Quiz 02: Task environment
• For each of the following activities, characterize its task environment in terms of the properties listed above.
• Playing a tennis match in a tournament
• Practicing tennis against a wall
The structure of agents
• Agent programs
• Simple reflex agents
• Model-based reflex agents
• Goal-based agents
• Utility-based agents
• Learning agents
The agent architecture
• Agent = Architecture + Program
• The architecture is the computing device, with physical sensors and actuators, on which the agent program runs.
• The program we design must be one the architecture can run.
The agent programs
• They take the current percept as input from the sensors and
return an action to the actuators.
• Agent program vs. Agent function
• The agent program takes only the current percept, because nothing
more is available from the environment.
• The agent function takes the entire percept sequence; to act on it, the agent must remember the percepts.
A trivial agent program
• Keep track of the percept sequence and index into a table of
actions to decide what to do.
A trivial agent program
• 𝑃 = the set of possible percepts
• 𝑇 = lifetime of the agent
• I.e., the total number of percepts it receives
• The size of the lookup table is Σ_{t=1}^{T} |𝑃|^t
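A rough Python rendering of this table-driven scheme (helper names are illustrative); note that the table must enumerate every possible percept sequence, which is exactly why the approach does not scale:

def make_table_driven_agent(table):
    # Keep the percept history and index the full sequence into the table.
    percepts = []

    def agent_program(percept):
        percepts.append(percept)
        return table.get(tuple(percepts))  # LOOKUP(percepts, table)

    return agent_program

agent = make_table_driven_agent({
    (("A", "Dirty"),): "Suck",
    (("A", "Dirty"), ("A", "Clean")): "Right",
})
print(agent(("A", "Dirty")))  # Suck
print(agent(("A", "Clean")))  # Right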
The key challenge of AI
• Write programs that produce rational behavior from a small
amount of code rather than a large amount of table entries
• E.g., calculating square roots: a five-line program using Newton's method vs. a huge lookup table (see the sketch below)
https://en.wikipedia.org/wiki/Newton%27s_method
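A minimal sketch of such a program, iterating the Newton update y ← (y + x/y) / 2 until convergence:

def sqrt_newton(x, tolerance=1e-12):
    # Newton's method for f(y) = y^2 - x.
    y = x or 1.0  # initial guess (1.0 when x == 0)
    while abs(y * y - x) > tolerance:
        y = (y + x / y) / 2.0
    return y

print(sqrt_newton(2.0))  # 1.414213562373095...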
Types of agent programs
• Simple reflex agents
• Model-based reflex agents
• Goal-based agents
• Utility-based agents
(in order of increasing generality)
Simple reflex agents
• The simplest kind of agent, limited intelligence
• Select actions based on the current percept, ignoring the
rest of the percept history
• The connection from percept to action is represented by
condition-action rules.
IF current percept THEN action
• E.g., IF car-in-front-is-braking THEN initiate-braking.
• Limitations
• Knowledge sometimes cannot be stated explicitly → low applicability
• They work correctly only if the environment is fully observable
function SIMPLE-REFLEX-AGENT(percept) returns an action
persistent: rules, a set of condition–action rules
state ← INTERPRET-INPUT(percept)
rule ← RULE-MATCH(state, rules)
action ← rule.ACTION
return action
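A runnable Python analogue of this pseudocode, with INTERPRET-INPUT and RULE-MATCH made concrete; the rule representation (condition predicate, action) is an assumption:

def interpret_input(percept):
    # Build a state description from the raw percept.
    location, status = percept
    return {"location": location, "status": status}

def simple_reflex_agent(percept, rules):
    state = interpret_input(percept)      # INTERPRET-INPUT
    for condition, action in rules:       # RULE-MATCH
        if condition(state):
            return action
    return "NoOp"

rules = [
    (lambda s: s["status"] == "Dirty", "Suck"),
    (lambda s: s["location"] == "A", "Right"),
    (lambda s: s["location"] == "B", "Left"),
]
print(simple_reflex_agent(("B", "Dirty"), rules))  # Suck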
A simple reflex agent in nature
Percepts: size, motion
RULES:
(1) If small moving object, then activate SNAP
(2) If large moving object, then activate AVOID and inhibit SNAP
(3) Else (not moving), NOOP
IF                                                              THEN
Saw an object ahead and turned right, and it's now clear ahead  Go straight
Saw an object ahead and turned right, and object ahead again    Halt
See no object ahead                                             Go straight
See an object ahead                                             Turn randomly

Example table agent with internal state (sketched in code below)
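A sketch of this table as an agent program with internal state; the flag remembering "saw an object and turned" and the action names are illustrative assumptions:

def make_table_agent():
    turned = False  # internal state: did we just turn after seeing an object?

    def agent_program(object_ahead):
        nonlocal turned
        if turned:
            turned = False
            return "Halt" if object_ahead else "Go straight"
        if object_ahead:
            turned = True
            return "Turn randomly"
        return "Go straight"

    return agent_program

agent = make_table_agent()
print(agent(True))   # Turn randomly
print(agent(False))  # Go straight (clear ahead after turning)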
Goal-based agents
• The current state of the environment is not always enough
• The agent further needs some sort of goal information that
describes desired situations.
• E.g., at a road junction, the taxi can turn left, turn right, or go straight
on, depending on where the taxi is trying to get to.
• Less efficient but more flexible
• Knowledge supporting the decisions is represented explicitly and can
be modified.
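A minimal sketch of goal-based selection: use a model to predict each action's outcome and pick one whose predicted state satisfies the goal (the junction model below is a toy assumption):

def goal_based_agent(state, actions, model, is_goal):
    # Look one step ahead with the model and test the goal.
    for action in actions:
        if is_goal(model(state, action)):
            return action
    return "NoOp"

model = lambda state, action: {"left": "airport",
                               "right": "downtown",
                               "straight": "suburbs"}[action]
print(goal_based_agent("junction", ["left", "right", "straight"],
                       model, lambda s: s == "downtown"))  # right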
Goal-based agents
Utility-based agent
• Goals alone are inadequate to generate high-quality behavior in most environments.
• Many action sequences can achieve the goal, but some are better and some are worse, e.g., go home by taxi or by Grab car?
• An agent's utility function is essentially an internalization of the performance measure.
• Goal → success; utility → degree of success (how successful it is)
• If state A is preferred over the others, then A has higher utility.
Utility-based agent
Utility-based agent: Advantages
• When there are conflicting goals
• Only some of which can be achieved, e.g., speed and safety
• The utility function specifies the appropriate tradeoff.
• When there are several goals that the agent can aim for
• None of which can be achieved with certainty
• The utility weights the likelihood of success against the importance of
the goals.
• The rational utility-based agent chooses the action that maximizes the expected utility of the action outcomes, as in the sketch below.
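A sketch of that choice rule; the outcome distributions and utility values below are illustrative assumptions only:

def expected_utility(action, outcomes, utility):
    # Sum utility over outcomes weighted by probability.
    return sum(p * utility(s) for p, s in outcomes(action))

def rational_choice(actions, outcomes, utility):
    return max(actions, key=lambda a: expected_utility(a, outcomes, utility))

# Conflicting goals: the fast route is risky, the slow route is safe.
outcomes = lambda a: {"fast": [(0.8, "on_time"), (0.2, "accident")],
                      "slow": [(1.0, "late")]}[a]
utility = {"on_time": 10, "accident": -100, "late": 2}.get
print(rational_choice(["fast", "slow"], outcomes, utility))  # slow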
Learning agents
• After an agent is programmed, can it work immediately?
• No, it still needs teaching.
• Once an agent is built, what can we do next?
• Teach it by giving it a set of examples
• Test it by using another set of examples
• We then say the agent learns → a learning agent
Learning agents
Learning agents
• A learning agent is divided into four conceptual components
1. Learning element → Makes improvements
2. Performance element → Selects external actions
3. Critic → Tells the learning element how well the agent is doing with respect to a fixed performance standard (feedback from users or examples: good or not?)
4. Problem generator → Suggests actions leading to new and informative experiences
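A skeletal wiring of these four components in Python; every name and the trivial learning rule are illustrative assumptions, not a real design:

class LearningAgent:
    def __init__(self, rules, performance_standard):
        self.rules = rules                    # knowledge used to act
        self.standard = performance_standard  # fixed performance standard

    def performance_element(self, percept):
        # Select an external action using the current rules.
        for condition, action in self.rules:
            if condition(percept):
                return action
        return "NoOp"

    def critic(self, outcome):
        # Judge the outcome against the fixed performance standard.
        return self.standard(outcome)

    def learning_element(self, feedback):
        # Improve the rules from the critic's feedback (stub: drop a bad rule).
        if feedback < 0 and self.rules:
            self.rules.pop(0)

    def problem_generator(self):
        # Suggest an exploratory action for new, informative experiences.
        return "Explore"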
THE END