Chapter 3
2. Consistency
A second, slightly stronger condition called consistency (or sometimes
monotonicity) is required only for applications of A∗ to graph search. A
heuristic h(n) is consistent if, for every node n and every successor n′ of n
generated by any action a, the estimated cost of reaching the goal from n is no
greater than the step cost of getting to n′ plus the estimated cost of reaching the
goal from n′:
h(n) ≤ c(n, a, n′) + h(n′).
This is a form of the general triangle inequality, which stipulates that each side
of a triangle cannot be longer than the sum of the other two sides. Here, the
triangle is formed by n, n′, and the goal Gn closest to n.
For an admissible heuristic, the inequality makes perfect sense: if there were a
route from n to Gn via n′ that was cheaper than h(n), that would violate the
property that h(n) is a lower bound on the cost to reach Gn.
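Consistency can be checked numerically on a small explicit graph by testing the inequality on every edge. The sketch below uses a hypothetical weighted graph and heuristic table (neither appears in the text) purely for illustration.

```python
# Minimal sketch (not from the text): verify h(n) <= c(n, a, n') + h(n')
# on every edge of a small, made-up weighted graph.

graph = {            # successor lists with step costs c(n, a, n')
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("G", 6)],
    "C": [("G", 3)],
    "G": [],
}
h = {"A": 5, "B": 4, "C": 3, "G": 0}   # hypothetical heuristic values

def is_consistent(graph, h):
    """Return True if h(n) <= c(n, n') + h(n') holds for every edge."""
    return all(h[n] <= cost + h[succ]
               for n, successors in graph.items()
               for succ, cost in successors)

print(is_consistent(graph, h))   # True for the values above
```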
Optimality of A*
To test the heuristic functions h1 and h2, we generated 1200 random problems
with solution lengths from 2 to 24 (100 for each even number) and solved them
with iterative deepening search and with
A∗ tree search using both h1 and h2. Figure 3.29 gives the average number of
nodes generated by each strategy and the effective branching factor.
One might ask whether h2 is always better than h1. The answer is “Essentially,
yes.” It is easy to see from the definitions of the two heuristics that, for any
node n, h2(n) ≥ h1(n). We thus say that h2 dominates h1. Domination translates
directly into efficiency: A∗ using h2 will never expand more nodes than A∗
using h1.
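The domination claim is easy to check empirically. The sketch below implements h1 (misplaced tiles) and h2 (sum of Manhattan distances) for the 8-puzzle; the encoding of a state as a 9-tuple in row-major order with 0 for the blank is an assumption made here for illustration, not something fixed by the text.

```python
# Sketch: h1 (misplaced tiles) and h2 (Manhattan distance) for the 8-puzzle.
# States are tuples of length 9 in row-major order; 0 denotes the blank.

GOAL = (0, 1, 2, 3, 4, 5, 6, 7, 8)   # assumed goal configuration

def h1(state, goal=GOAL):
    """Number of misplaced tiles (the blank is not counted)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def h2(state, goal=GOAL):
    """Sum of Manhattan distances of each tile from its goal position."""
    pos = {tile: (i // 3, i % 3) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile != 0:
            r, c = i // 3, i % 3
            gr, gc = pos[tile]
            total += abs(r - gr) + abs(c - gc)
    return total

s = (7, 2, 4, 5, 0, 6, 8, 3, 1)
print(h1(s), h2(s))   # 8 and 18; h2(n) >= h1(n) for every state, so h2 dominates h1
```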
The idea behind pattern databases is to store these exact solution costs for
every possible subproblem instance—in our example, every possible
configuration of the four tiles and the blank. Then we compute an admissible
heuristic hDB for each complete state encountered during a search simply by
looking up the corresponding subproblem configuration in the database. The
database itself is constructed by searching back from the goal and recording the
cost of each new pattern encountered; the expense of this search is amortized
over many subsequent problem instances.
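A pattern database can be built exactly as described: breadth-first search backward from the goal pattern, recording the cost at which each pattern is first reached. The sketch below is a simplified illustration for the 8-puzzle with a hypothetical tile subset {1, 2, 3, 4}; the encoding of patterns and the goal configuration are assumptions.

```python
# Sketch: building a pattern database by breadth-first search back from the goal.
# A "pattern" keeps only the positions of the chosen tiles plus the blank;
# all other tiles are treated as indistinguishable (encoded as -1).

from collections import deque

PATTERN_TILES = {1, 2, 3, 4}           # hypothetical choice of subproblem tiles
GOAL = (0, 1, 2, 3, 4, 5, 6, 7, 8)     # assumed goal configuration, 0 = blank

def to_pattern(state):
    return tuple(t if t in PATTERN_TILES or t == 0 else -1 for t in state)

def successors(state):
    """All states reachable by sliding a tile into the blank (3x3 board)."""
    i = state.index(0)
    r, c = divmod(i, 3)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            j = nr * 3 + nc
            s = list(state)
            s[i], s[j] = s[j], s[i]
            yield tuple(s)

def build_pattern_db():
    """Exact subproblem solution cost for every reachable pattern."""
    start = to_pattern(GOAL)
    db = {start: 0}
    frontier = deque([start])
    while frontier:
        pattern = frontier.popleft()
        for succ in successors(pattern):
            if succ not in db:
                db[succ] = db[pattern] + 1
                frontier.append(succ)
    return db

DB = build_pattern_db()

def h_db(state):
    """Admissible heuristic: look up the state's pattern (assumes a solvable state)."""
    return DB[to_pattern(state)]
```

Because moves are reversible, the distance from the goal pattern to any pattern equals the distance from that pattern back to the goal, so the recorded costs are exact subproblem costs.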
The choice of 1-2-3-4 is fairly arbitrary; we could also construct databases for
5-6-7-8, for 2-4-6-8, and so on. Each database yields an admissible heuristic,
and these heuristics can be combined, as explained earlier, by taking the
maximum value.
One might wonder whether the heuristics obtained from the 1-2-3-4 database and
the 5-6-7-8 database could simply be added, since the two subproblems seem not
to overlap. The sum is not, in general, an admissible heuristic, because the
solutions of the 1-2-3-4 subproblem and the 5-6-7-8 subproblem for a given state
will almost certainly share some moves: it is unlikely that 1-2-3-4 can be moved
into place without touching 5-6-7-8, and vice versa.
But if we count only the moves that involve the 1-2-3-4 tiles when solving the
first subproblem (and similarly for 5-6-7-8), then the sum of the two costs is
still a lower bound on the cost of solving the entire problem. This is the idea
behind disjoint pattern databases.
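Combining several pattern-database heuristics by taking the maximum always preserves admissibility, whereas plain addition does not; disjoint databases restore additivity by counting only the moves of their own tiles. A minimal sketch, assuming the component heuristics are given as callables (names such as h_1234 are hypothetical):

```python
# Sketch: combining admissible heuristics. max() is always admissible;
# plain addition is admissible only for disjoint pattern databases,
# which count only the moves of their own tiles.

def combine_max(*heuristics):
    """Admissible whenever each component heuristic is admissible."""
    return lambda state: max(h(state) for h in heuristics)

def combine_disjoint_sum(*disjoint_heuristics):
    """Admissible only if the component costs count non-overlapping moves."""
    return lambda state: sum(h(state) for h in disjoint_heuristics)

# e.g. h = combine_max(h_1234, h_5678) with two database heuristics as above
```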
iv. Learning heuristics from experience
A heuristic function h(n) is supposed to estimate the cost of a solution
beginning from the state at node n.
How could an agent construct such a function?
Solution: learn from experience.
Example:
Each optimal solution to an 8-puzzle problem provides examples from which
h(n) can be learned. Each example consists of a state from the solution path and
the actual cost of the solution from that point. From these examples, a learning
algorithm can be used to construct a function h(n) that can (with luck) predict
solution costs for other states that arise during search. Techniques for doing just
this include neural nets, decision trees, and other learning methods.
Inductive learning methods work best when supplied with features of a state
that are relevant to predicting the state’s value, rather than with just the raw
state description.
For example, the feature “number of misplaced tiles” might be helpful in
predicting the actual distance of a state from the goal. Let’s call this feature
x1(n). We could take 100 randomly generated 8-puzzle configurations and
gather statistics on their actual solution costs.
We might find that when x1(n) is 5, the average solution cost is around 14, and
so on. Given these data, the value of x1 can be used to predict h(n). Of course,
we can use several features. A second feature x2(n) might be “number of pairs
of adjacent tiles that are not adjacent in the goal state.” How should x1(n) and
x2(n) be combined to predict h(n)? A common approach is to use a linear
combination:
h(n) = c1x1(n) + c2x2(n).
The constants c1 and c2 are adjusted to give the best fit to the actual data on
solution costs.
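Fitting the constants c1 and c2 is an ordinary least-squares problem once feature values and true solution costs have been collected. The sketch below uses NumPy on made-up data; the numbers are illustrative only (chosen so that x1 = 5 corresponds to a cost near 14, as in the text).

```python
# Sketch: fitting h(n) = c1*x1(n) + c2*x2(n) by least squares on made-up data.
import numpy as np

# Each row: (x1 = misplaced tiles, x2 = adjacent-tile pairs not adjacent in goal);
# y = true solution cost. All values are invented for illustration.
X = np.array([[5, 3], [7, 4], [3, 2], [8, 6], [4, 3]], dtype=float)
y = np.array([14.0, 19.0, 9.0, 23.0, 12.0])

coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
c1, c2 = coeffs

def h(x1, x2):
    """Learned heuristic: a linear combination of the two features."""
    return c1 * x1 + c2 * x2

print(c1, c2, h(5, 3))
```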
LOGICAL AGENTS
3.5 Knowledge-based agents
• An intelligent agent needs knowledge about the real world in order to make
decisions, and reasoning in order to act efficiently.
• Knowledge-based agents are agents that can maintain an internal state of
knowledge, reason over that knowledge, update it after observations, and take
actions. These agents represent the world in some formal representation and act
intelligently on it.
• Knowledge-based agents are composed of two main parts:
• Knowledge-base and
• Inference system.
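The generic knowledge-based agent program follows a simple cycle: TELL the knowledge base what was perceived, ASK it what action to take, then TELL it which action was chosen. The Python rendering below is a minimal sketch; the KB interface (tell/ask methods) and the sentence encodings are assumptions.

```python
# Sketch of a generic knowledge-based agent: TELL the KB the percept,
# ASK it for an action, then TELL it which action was taken.

class KnowledgeBasedAgent:
    def __init__(self, kb):
        self.kb = kb      # assumed to provide tell(sentence) and ask(query)
        self.t = 0        # time step counter

    def __call__(self, percept):
        self.kb.tell(self.make_percept_sentence(percept, self.t))
        action = self.kb.ask(self.make_action_query(self.t))
        self.kb.tell(self.make_action_sentence(action, self.t))
        self.t += 1
        return action

    # Hypothetical sentence encodings, used only for illustration.
    def make_percept_sentence(self, percept, t):
        return ("Percept", percept, t)

    def make_action_query(self, t):
        return ("ActionAt", t)

    def make_action_sentence(self, action, t):
        return ("Action", action, t)
```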
The Wumpus world is a cave with a 4×4 grid of rooms connected by passageways,
so there are 16 rooms in total. A knowledge-based agent explores this world.
The cave contains a room occupied by a beast called the Wumpus, which eats
anyone who enters that room. The Wumpus can be shot by the agent, but the agent
has only a single arrow.
• The agent explores a cave consisting of rooms connected by passageways.
• Lurking somewhere in the cave is the Wumpus, a beast that eats any agent that
enters its room.
• Some rooms contain bottomless pits that trap any agent that wanders into the
room.
• Occasionally, there is a heap of gold in a room.
• The goal is to collect the gold and exit the world without being eaten.
Environment:
• A 4×4 grid of rooms.
• The agent starts in square [1,1], facing to the right.
• The locations of the Wumpus and the gold are chosen randomly from all squares
other than the start square [1,1].
• Each square other than the start square can contain a pit with probability 0.2.
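The environment description above translates directly into a small random world generator; the dictionary representation of a world is an assumption made here for illustration.

```python
# Sketch: generating a random 4x4 wumpus world as described above.
import random

def random_wumpus_world(pit_prob=0.2):
    squares = [(x, y) for x in range(1, 5) for y in range(1, 5)]
    non_start = [sq for sq in squares if sq != (1, 1)]
    return {
        "wumpus": random.choice(non_start),                          # not in [1,1]
        "gold": random.choice(non_start),                            # not in [1,1]
        "pits": {sq for sq in non_start if random.random() < pit_prob},
    }

print(random_wumpus_world())
```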
Actions/Actuators:
The agent can move Forward, TurnLeft by 90°, or TurnRight by 90°.
The agent dies a miserable death if it enters a square containing a pit or a
live wumpus.
If an agent tries to move forward and bumps into a wall, then the agent
does not move.
The action Grab can be used to pick up the gold if it is in the same square
as the agent.
The action Shoot can be used to fire an arrow in a straight line in the
direction the agent is facing.
The arrow continues until it either hits (and hence kills) the wumpus or
hits a wall. The agent has only one arrow, so only the first Shoot action
has any effect.
Finally, the action Climb can be used to climb out of the cave, but only
from square [1,1].
Sensors:
The agent has five sensors, each of which gives a single bit of information:
In the square containing the wumpus and in the directly (not diagonally)
adjacent squares, the agent will perceive a Stench.
In the squares directly adjacent to a pit, the agent will perceive a Breeze.
In the square where the gold is, the agent will perceive a Glitter.
When an agent walks into a wall, it will perceive a Bump.
When the wumpus is killed, it emits a woeful Scream that can be
perceived anywhere in the cave.
The percepts will be given to the agent program in the form of a list of
five symbols;
For example: if there is a stench and a breeze, but no glitter, bump, or scream,
the agent program will get [Stench, Breeze, None, None, None].
The Wumpus agent’s first step
The first step taken by the agent in the wumpus world.
(a) The initial situation, after percept [None, None, None, None, None].
(b) After one move, with percept [None, Breeze, None, None, None].
Now the agent needs to move forward, so it will move to either [1,2] or [2,1].
Suppose the agent moves to [2,1]. In this room it perceives a breeze, which
means there is a pit in an adjacent square. The pit could be in [3,1] or [2,2],
so we mark those squares with the symbol P? to indicate a possible pit.
The agent will now stop and think, and will not make any risky move. It goes
back to room [1,1]. Rooms [1,1] and [2,1] have now been visited, so we use
the symbol V to mark visited squares.
At the third step, the agent moves to room [1,2], which is OK. In room [1,2]
the agent perceives a stench, which means there must be a Wumpus nearby. By the
rules of the game the Wumpus cannot be in [1,1], and it cannot be in [2,2]
either (the agent detected no stench while it was at [2,1]). Therefore the
agent infers that the Wumpus is in [1,3]. In the current square there is no
breeze, so [2,2] contains neither a pit nor the Wumpus. Thus [2,2] is safe; we
mark it OK, and the agent moves on to [2,2].
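The reasoning in this walkthrough (no breeze at a square rules out pits in all adjacent squares; no stench rules out the Wumpus there) can be sketched as simple set bookkeeping. The representation below is an illustrative assumption, not the book's agent program.

```python
# Sketch of the adjacency reasoning used above: percepts at visited squares
# rule pits and the wumpus in or out of neighbouring squares.

def adjacent(square):
    x, y = square
    return [(x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 1 <= x + dx <= 4 and 1 <= y + dy <= 4]

def known_safe(percepts):
    """percepts: {square: (breeze, stench)} for visited squares.
    A square is known pit-free if some visited neighbour felt no breeze,
    and known wumpus-free if some visited neighbour smelt no stench."""
    pit_free = set(percepts)       # visited squares are themselves safe
    wumpus_free = set(percepts)
    for square, (breeze, stench) in percepts.items():
        if not breeze:
            pit_free.update(adjacent(square))
        if not stench:
            wumpus_free.update(adjacent(square))
    return pit_free & wumpus_free

# After visiting [1,1] (nothing), [2,1] (breeze), and [1,2] (stench):
percepts = {(1, 1): (False, False), (2, 1): (True, False), (1, 2): (False, True)}
print(sorted(known_safe(percepts)))   # includes (2, 2), as inferred in the text
```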
3.7 Logic
The possible models are just all possible assignments of real numbers to
the variables x and y.
• Each such assignment fixes the truth of any sentence of arithmetic whose
variables are x and y.
• If a sentence α is true in model m, we say that m satisfies α, or sometimes
that m is a model of α.
• Notation: M(α) denotes the set of all models of α.
• The notion of truth involves the relation of logical entailment between
sentences: the idea that a sentence follows logically from another sentence.
Mathematical notation: α |= β (sentence α entails the sentence β.)
• The formal definition of entailment is this: α |= β if and only if, in every
model in which α is true, β is also true.
In the wumpus-world example, KB ⊭ α2: the agent cannot conclude that there is
no pit in [2,2].
The inference procedure illustrated in Figure 7.5 is called model checking
because it enumerates all possible models to check that α is true in every
model in which KB is true.
⇔ (if and only if). The sentence W1,3 ⇔ ¬W2,2 is a biconditional. Some
other books write this as ≡.
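Model checking by enumeration is easy to sketch: list every assignment to the proposition symbols and verify that α holds in each model in which the KB holds. Representing sentences as Python predicates over a model dictionary is an encoding chosen here for illustration.

```python
# Sketch: entailment by model checking (truth-table enumeration).
# A sentence is a function that maps a model (dict of symbol -> bool) to bool.

from itertools import product

def tt_entails(kb, alpha, symbols):
    """KB |= alpha iff alpha is true in every model in which KB is true."""
    for values in product([True, False], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if kb(model) and not alpha(model):
            return False
    return True

# Example: KB = (B11 <=> (P12 or P21)) and not B11; alpha = not P12.
kb = lambda m: (m["B11"] == (m["P12"] or m["P21"])) and not m["B11"]
alpha = lambda m: not m["P12"]
print(tt_entails(kb, alpha, ["B11", "P12", "P21"]))   # True
```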
Semantics
• The semantics defines the rules for determining the truth of a sentence
with respect to a particular model.
• In propositional logic, a model simply fixes the truth value—true or
false—for every proposition symbol.
For example,
If the sentences in the knowledge base make use of the proposition
symbols P1,2, P2,2, and P3,1, then one possible model is
m1 = {P1,2 = false, P2,2 = false, P3,1 = true} .
The semantics for propositional logic must specify how to compute the
truth value of any sentence, given a model.
Atomic sentences are easy:
• True is true in every model and False is false in every model.
• The truth value of every other proposition symbol must be specified
directly in the model.
For example, in the model m1 given earlier, P1,2 is false.
For complex sentences, we have five rules, which hold for any
subsentences P and Q in any model m
(here “iff” means “if and only if”):
• ¬P is true iff P is false in m.
• P ∧ Q is true iff both P and Q are true in m.
• P ∨ Q is true iff either P or Q is true in m.
• P ⇒ Q is true unless P is true and Q is false in m.
• P ⇔ Q is true iff P and Q are both true or both false in m.
For example, we can easily show (using truth tables) that P ∧ Q and Q ∧ P are
logically equivalent; two sentences are logically equivalent if they are true
in the same set of models.
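These rules translate directly into a recursive truth-evaluation procedure. In the sketch below a complex sentence is a nested tuple whose first element names the connective; this encoding is an assumption made for illustration.

```python
# Sketch: evaluating the truth of a propositional sentence in a model.
# Atoms are strings; complex sentences are tuples such as ("and", s1, s2),
# ("or", s1, s2), ("not", s), ("implies", s1, s2), ("iff", s1, s2).

def pl_true(sentence, model):
    if isinstance(sentence, str):                    # proposition symbol
        return model[sentence]
    op, *args = sentence
    if op == "not":
        return not pl_true(args[0], model)
    p, q = (pl_true(a, model) for a in args)
    if op == "and":
        return p and q
    if op == "or":
        return p or q
    if op == "implies":
        return (not p) or q
    if op == "iff":
        return p == q
    raise ValueError(f"unknown connective: {op}")

m1 = {"P12": False, "P22": False, "P31": True}       # the model m1 above
print(pl_true(("or", "P12", ("not", "P22")), m1))    # True
```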
The best-known rule is called Modus Ponens (Latin for “mode that affirms”)
and is written
α ⇒ β,   α
──────────
β
The notation means that, whenever any sentences of the form α ⇒ β and
α are given, then the sentence β can be inferred.
Let us see how these inference rules and equivalences can be used in the
wumpus world. We start with the knowledge base containing R1 through
R5 and show how to prove ¬P1,2, that is, that there is no pit in [1,2].
Now apply Modus Ponens with R8 and the percept R4 (i.e., ¬B1,1) to obtain
R9 : ¬(P1,2 ∨ P2,1) .
• GOAL: the goal is a state that contains the sentence we are trying to
prove.
Proof by resolution
By the same process that led to R10 earlier, we can now derive the
absence of pits in [2,2] and [1,3] (remember that [1,1] is already known
to be pitless):
R13 : ¬P2,2 .
R14 : ¬P1,3 .
Now comes the first application of the resolution rule: the literal ¬P2,2 in
R13 resolves with the literal P2,2 in R15 to give
R16 : P1,1 ∨ P3,1 .
In English: if there’s a pit in one of [1,1], [2,2], and [3,1] and it’s not in
[2,2], then it’s in [1,1] or [3,1]. Similarly, the literal ¬P1,1 in R1 resolves
with the literal P1,1 in R16 to give R17 : P3,1 .
In English: if there’s a pit in [1,1] or [3,1] and it’s not in [1,1], then it’s in
[3,1]. These last two inference steps are examples of the unit resolution
inference rule,
There is one more technical aspect of the resolution rule: the resulting
clause should contain only one copy of each literal. The removal of multiple
copies of literals is called factoring and is applied wherever possible. For
example, if we resolve (A ∨ B) with (A ∨ ¬B), we obtain (A ∨ A), which is
reduced to just A.
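The resolution step itself, including factoring, can be written down explicitly. In the sketch below a clause is a frozenset of literals, a literal is a string, and negation is marked with a leading "~"; this representation is an assumption for illustration.

```python
# Sketch: the binary resolution rule with factoring.
# A clause is a frozenset of literals; "~P" is the negation of "P".

def negate(literal):
    return literal[1:] if literal.startswith("~") else "~" + literal

def resolve(c1, c2):
    """All clauses obtainable by resolving c1 with c2 on one complementary pair.
    Using frozensets removes duplicate literals automatically (factoring)."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return resolvents

# R15: P11 v P22 v P31 resolved with R13: ~P22 gives R16: P11 v P31
R15 = frozenset({"P11", "P22", "P31"})
R13 = frozenset({"~P22"})
print(resolve(R15, R13))   # [frozenset({'P11', 'P31'})]
```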
A resolution algorithm
We can apply the resolution procedure to a very simple inference in the wumpus
world. When the agent is in [1,1], there is no breeze, so there can be no pits
in neighboring squares. The relevant knowledge base is
KB = R2 ∧ R4 = (B1,1 ⇔ (P1,2 ∨ P2,1)) ∧ ¬B1,1
and we wish to prove α, which is, say, ¬P1,2.
(In the completeness proof for resolution, RC(S) denotes the resolution closure
of a set of clauses S, and a model is constructed one symbol at a time as
follows.)
If a clause in RC(S) contains the literal ¬Pi and all its other literals are
false under the assignment chosen for P1, ..., Pi−1, then assign false to Pi.
Otherwise, assign true to Pi.
The best way to understand the algorithm is through an example and a picture.
Figure 7.16(a) shows a simple knowledge base of Horn clauses with A and B as
known facts. Figure 7.16(b) shows the same knowledge base drawn as an
AND–OR graph.
In AND–OR graphs, multiple links joined by an arc indicate a
conjunction—every link must be proved—while multiple links without
an arc indicate a disjunction—any link can be proved.
It is easy to see how forward chaining works in the graph.
It is easy to see that forward chaining is sound: every inference is
essentially an application of Modus Ponens.
Forward chaining is also complete: every entailed atomic sentence will be
derived.
Forward chaining is an example of the general concept of data-driven
reasoning—that is, reasoning in which the focus of attention starts with
the known data.
It can be used within an agent to derive conclusions from incoming
percepts, often without a specific query in mind.
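Forward chaining over Horn clauses can be sketched by keeping a count of unsatisfied premises for each rule, in the manner of the PL-FC-ENTAILS? algorithm; the small Horn knowledge base below, with A and B as known facts, is written in the spirit of Figure 7.16 and is illustrative only.

```python
# Sketch: forward chaining for definite (Horn) clauses.
# Each rule is (premises, conclusion); known facts are listed separately.

from collections import deque

def forward_chaining(rules, facts, query):
    count = {i: len(prem) for i, (prem, _) in enumerate(rules)}  # unsatisfied premises
    inferred = set()
    agenda = deque(facts)
    while agenda:
        p = agenda.popleft()
        if p == query:
            return True
        if p in inferred:
            continue
        inferred.add(p)
        for i, (premises, conclusion) in enumerate(rules):
            if p in premises:
                count[i] -= 1
                if count[i] == 0:
                    agenda.append(conclusion)
    return False

# Illustrative Horn KB: A^B=>L, A^P=>L, B^L=>M, L^M=>P, P=>Q, with facts A, B.
rules = [({"A", "B"}, "L"), ({"A", "P"}, "L"), ({"B", "L"}, "M"),
         ({"L", "M"}, "P"), ({"P"}, "Q")]
print(forward_chaining(rules, ["A", "B"], "Q"))   # True
```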
Backward chaining is a form of goal-directed reasoning. It is useful for
answering specific questions such as “What shall I do now?” and “Where are
my keys?”
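Backward chaining works in the opposite direction: to prove a goal, recursively try to prove the premises of some rule whose conclusion is that goal. A minimal sketch over the same rule representation as above, with a "goals in progress" set to avoid looping; this is an illustrative rendering, not the book's pseudocode.

```python
# Sketch: backward chaining for definite (Horn) clauses.
# rules: list of (premises, conclusion); facts: set of known atoms.

def backward_chaining(rules, facts, goal, _in_progress=None):
    in_progress = _in_progress or set()
    if goal in facts:
        return True
    if goal in in_progress:          # avoid infinite regress on cyclic rules
        return False
    in_progress = in_progress | {goal}
    for premises, conclusion in rules:
        if conclusion == goal and all(
                backward_chaining(rules, facts, p, in_progress) for p in premises):
            return True
    return False

rules = [({"A", "B"}, "L"), ({"A", "P"}, "L"), ({"B", "L"}, "M"),
         ({"L", "M"}, "P"), ({"P"}, "Q")]
print(backward_chaining(rules, {"A", "B"}, "Q"))   # True
```

Note that backward chaining touches only rules relevant to the query, which is why it suits goal-directed questions, whereas forward chaining derives everything that follows from the known data.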