Artificial Intelligence 1

What is AI?

 AI is an attempt at the reproduction of human reasoning and
intelligent behavior by computational methods

[Diagram: a computer reproducing the intelligent behavior of humans]


What is AI?
(R&N)

 Discipline that systematizes and automates reasoning
processes to create machines that:
 Act like humans    Act rationally
 Think like humans  Think rationally


Act like humans Act rationally
Think like humans Think rationally

 The goal of AI is to create computer systems that


perform tasks regarded as requiring intelligence when
done by humans

 AI Methodology: Take a task at which people are better,


e.g.:
 Prove a theorem
 Play chess
 Plan a surgical operation
 Diagnose a disease
 Navigate in a building
 and build a computer system that does it automatically
But do we want to duplicate human imperfections?
Act like humans Act rationally
Think like humans Think rationally

 Here, how the computer performs tasks does matter


 The reasoning steps are important
  Ability to create and manipulate symbolic
knowledge (definitions, concepts, theorems, …)
 What is the impact of hardware on low-level
reasoning, e.g., to go from signals to symbols?



Act like humans Act rationally
Think like humans Think rationally

 Now, the goal is to build agents that always make the
“best” decision given what is available (knowledge,
time, resources)
 “Best” means maximizing the expected value of a
utility function
  Connections to economics and control theory
 What is the impact of self-consciousness,
emotions, desires, love for music, fear of dying,
etc ... on human intelligence?
Acting humanly: The Turing Test approach
One of the earliest papers to address the question of
machine intelligence specifically in relation to the
modern digital computer was written in 1950 by the
British mathematician
Alan Turing.
 Turing, known mainly for his contributions to the
theory of computability, considered the question of
whether or not a machine could actually be made to
think.
 Noting that the fundamental ambiguities in the
question itself (what is thinking? what is a machine?)
precluded any rational answer, he proposed that the
question of intelligence be replaced by a more clearly
defined empirical test.
The interrogator is free,
however, to ask any
questions, no matter how
devious or indirect, in an
effort to uncover the
computer’s identity.

If the interrogator cannot


distinguish the machine from
the human, then, Turing
argues, the machine may be
assumed to be intelligent.

 The interrogator has to guess if the conversation is with
a program or a person; the program passes the test if it
fools the interrogator 30% of the time.

 Turing conjectured that, by the year 2000, a computer
with a storage of 10^9 units (about 1024 MB) could be
programmed well enough to pass the test, but he was wrong.
 Some people have been fooled for 5 minutes; for example,
the ELIZA program and the Internet chatbot called MGONZ
have fooled humans who didn't realize they might be
talking to a program, and the program ALICE fooled one
judge in the 2001 Loebner Prize competition.

 But no program has come close to the 30% criterion against


trained judges.
Capabilities that a Computer
Program needs to have to pass the
Turing Test
 Natural language processing to enable it to
communicate successfully in English.
 Knowledge representation to store what it knows or
hears;
 Automated reasoning to use the stored information
to answer questions and to draw new conclusions;
 Machine learning to adapt to new circumstances and
to detect and extrapolate patterns.



The total Turing Test includes a video signal so that
the interrogator can test the subject's perceptual
abilities, as well as the opportunity for the interrogator
to pass physical
objects "through the hatch."

To pass the total Turing Test, the computer will need


 Computer vision to perceive objects, and

 Robotics to manipulate objects and move about.

These six disciplines compose most of AI, and


Turing deserves credit for designing a test that
remains relevant 50 years later.



Today: The Difference Between Us and
Them
CAPTCHA: Telling Humans and Computers Apart
Automatically
A CAPTCHA is a program that protects websites against bots by
generating and grading tests that humans can pass but current
computer programs cannot. For example, humans can read distorted
text like the one shown below, but current computer programs can't.

CAPTCHA stands for "Completely Automated Public Turing test to tell
Computers and Humans Apart".

Can Machines Act/Think
Intelligently?
 Maybe yes, maybe not, if intelligence is not
separated from the rest of “being human”
 Yes, if intelligence is narrowly defined as
information processing

 AI has made impressive achievements showing that


tasks initially assumed to require intelligence can be
automated
 But each success of AI seems to push further the limits
of what we consider “intelligence”
Currently, no computers exhibit full artificial
intelligence (that is, are able to simulate human
behavior).
The greatest advances have occurred in the field of
game playing.
Deep
Blue?

Behind the success
The system derived its playing strength mainly out of
brute force computing power.
It was a massively parallel system with 30 nodes, each
node containing a 120 MHz processor.
Its chess playing program was written in C and ran
under the AIX operating system.
It was capable of evaluating 200 million positions per
second, twice as fast as the 1996 version.
The Deep Blue chess computer that defeated Kasparov
in 1997 would typically search to a depth of between
six and eight moves to a maximum of twenty or even
more moves in some situations.
Some Achievements
 Computers have defeated world champions in
several games, including Checkers and Chess, but
still do not do well in Go
 AI techniques are used in many systems: formal
calculus, video games, route planning, logistics
planning, pharmaceutical drug design, medical
diagnosis, hardware and software trouble-shooting,
speech recognition, traffic monitoring, facial
recognition, medical image analysis, part inspection,
etc...
 Stanford’s robotic car, Stanley, autonomously
traversed 132 miles of desert.
 Some industries (automobile, electronics) are highly
robotized, while other robots perform brain and
heart surgery, are rolling on Mars, fly autonomously,
…
 But home robots still remain a thing of the future
Some Big Open Questions
 AI (especially, the “rational agent” approach) assumes that
intelligent behaviors are based only on information processing.
Is this a valid assumption?
 If yes, can the human brain machinery solve problems that are
inherently intractable for computers?
 In a human being, where is the interface between “intelligence”
and the rest of “human nature”, e.g.:
• How does intelligence relate to emotions felt?
• What does it mean for a human to “feel” that he/she understands
something?
 Is this interface critical to intelligence? Can there exist a general
theory of intelligence independent of human beings? What is the
role of the human body?



I, Robot
In the movie I, Robot, the most impressive feature of the
robots is not their ability to solve complex problems, but
how they blend human-like reasoning with other key aspects
of human beings (especially self-consciousness, fear of
dying, distinction between right and wrong), …



MAIN AREAS OF AI
 Knowledge representation (including formal logic)
 Search, especially heuristic search (puzzles, games)
 Planning
 Reasoning under uncertainty, including probabilistic reasoning
 Learning
 Agent architectures
 Robotics and perception
 Natural language processing

[Diagram: the agent at the center, surrounded by the areas of AI:
perception, robotics, reasoning, search, learning, knowledge
representation, planning, constraint satisfaction, natural
language, expert systems, ...]
Bits of History
 1956: The name “Artificial Intelligence” is coined
 60’s: Search and games, formal logic and theorem
proving
 70’s: Robotics, perception, knowledge representation,
expert systems
 80’s: More expert systems, AI becomes an industry
 90’s: Rational agents, probabilistic reasoning,
machine learning
 00’s: Systems integrating many AI methods, machine
learning, reasoning under uncertainty, robotics again
The human brain: Perhaps the most complex
information processing machine in nature
Forebrain (Cerebral Cortex):
Language, maths, sensation,
movement, cognition, emotion
Midbrain: Information Routing;
involuntary controls
Cerebellum: Motor
Control
Hindbrain: Control of breathing,
heartbeat, blood circulation

Spinal cord: Reflexes,
information highways between body and brain
Brain: a computational machine?
 brains better at perception / cognition
 slower at numerical calculations
 parallel and distributed Processing
 associative memory
 Evolutionarily, brain has developed algorithms most
suitable for survival
 Algorithms unknown: the search is on
 Brain astonishing in the amount of information it
processes
o Typical computers: 10^9 operations/sec
o Housefly brain: 10^11 operations/sec
Brain facts & figures
• Basic building block of
nervous system: nerve cell
(neuron)
• ~ 10^12 neurons in brain
• ~ 10^15 connections between them
• Connections made at “synapses”
• The speed: events on millisecond scale in neurons, nanosecond
scale in silicon chips
Difference between brain &
computers

• Highly efficient use of energy in brain


• High Adaptability
• Tremendous amount of compression: space is at a
premium in the cranium
• One cubic centimeter of brain tissue contains
– 50 million neurons
– Several hundred miles of axons which are
“wires” for transmitting signals
– Close to a trillion synapses, the connections
between neurons
Immense memory capacity
• It is estimated that the human brain's ability to store
memories is equivalent to about 2.5 petabytes of binary
data.
• The brain is remarkably energy-efficient, running on
about 12 watts—the electricity it takes to light some
high-efficiency light bulbs. It would require so much
energy to run a computer as powerful as the human
brain—perhaps as much as “a gigawatt of power, the
amount currently consumed by all of Washington,
D.C.”—that it may be impractical.
Brain vs. computer’s processing
• Associative memory vs. addressable memory

• Parallel Distributed Processing (PDP) vs. Serial


computation

• Fast responses to complex situations vs. precisely


repeatable steps

• Preference for Approximations and “good enough”


solutions vs exact solutions

• Mistakes and biases vs. cold logic


Brain vs. Computers (contd.)
• Excellent pattern recognition vs. excellent number
crunching
• Emotion- brain’s steer man- assigning values to
experiences and future possibilities vs. computer being
insensitive to emotions
• Evaluate potential outcomes efficiently and rapidly
when information is uncertain vs. “Garbage in Garbage
out” situation
INTELLIGENT
AGENTS

ABILITY TO EXIST
TO BE AUTONOMOUS,
REACTIVE,
GOAL-ORIENTED, ETC.
Agents
An agent is anything that can be viewed as perceiving
its environment through sensors and acting upon that
environment through actuators

• Human agent: eyes, ears, and other organs for sensors;
actuators: hands, legs, mouth, and other body parts
• Robotic agent: cameras and infrared range finders for sensors;
actuators: various motors
An agent and its
environment
[Diagram: the agent receives percepts from the environment through
sensors and acts on the environment through effectors]
 Percept refers to the agent's perceptual inputs at any given
instant.
 An agent's percept sequence is the complete history of
everything the agent has ever perceived. In general, an agent's
choice of action at any given instant can depend on the entire
percept sequence observed to date.
 An agent's behavior is described by the agent function that
maps any given percept sequence to an action. It can be given
in tabular form.
Vacuum-cleaner world
• Percepts: location and
contents, e.g., [A,Dirty]
• Actions: Left, Right, Suck
Internally, the agent function for an artificial agent will
be implemented by an agent program. The agent function is an
abstract mathematical description; the agent program is a
concrete implementation, running on the agent architecture.
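As an illustration, here is a minimal Python sketch (not from the
slides) of an agent program for this world; the function name is an
assumption, while the two location names 'A' and 'B' follow the
percept example above:

# A minimal sketch of a reflex agent program for the two-square
# vacuum world; percepts are (location, contents) pairs.
def reflex_vacuum_agent(percept):
    location, status = percept        # e.g., ('A', 'Dirty')
    if status == 'Dirty':
        return 'Suck'
    elif location == 'A':
        return 'Right'
    else:
        return 'Left'

print(reflex_vacuum_agent(('A', 'Dirty')))   # -> 'Suck'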
RATIONAL
AGENTS
Ideal Rational Agent: For each possible percept sequence,
such an agent does whatever action is expected to maximize
its performance measure, on the basis of the evidence
provided by the percept sequence and whatever built-in
knowledge the agent has.
Performance measure
An objective criterion for success of an agent's behavior.
E.g., performance measure of a vacuum-cleaner agent could be
amount of dirt cleaned up, amount of time taken, amount of electricity
consumed, amount of noise generated, etc.
Rationality
What is rational at any given time depends on
four things:
The performance measure that defines the criterion of
success.
The agent's prior knowledge of the environment.
The actions that the agent can perform.
The agent's percept sequence to date.



Task Environments :
"problems" to which rational agents are the
"solutions."

In designing an agent, the


first step must always be to
specify the task
environment as fully as
possible.



PEAS description of the task environment for an
automated taxi.



• fully observable vs. partially observable
– sensors capture all relevant information from the
environment
• deterministic vs. stochastic (non-deterministic)
– changes in the environment are predictable
• episodic vs. sequential (non-episodic)
– independent perceiving-acting episodes
• static vs. dynamic
– no changes while the agent is “thinking”
• discrete vs. continuous
– limited number of distinct percepts/actions
• single vs. multiple agents
– interaction and collaboration among agents
Given that almost all AI formalisms (planning,
learning, etc) are NP-Complete or worse, some
form of search is generally unavoidable (i.e., no
smarter algorithm available).



Problem-solving agents decide what to do by finding sequences of
actions that lead to desirable states.



• Formulate goal:
  – be in Bucharest (Romania)
• Formulate problem:
  – action: drive between a pair of connected cities (direct road)
  – state: be in a city (20 world states)
• Find solution:
  – sequence of cities leading from start to goal state,
    e.g., Arad, Sibiu, Fagaras, Bucharest
• Execution
  – drive from Arad to Bucharest according to the solution

Environment: fully observable (map), deterministic, and the agent
knows the effects of each action. Is this really the case?

The map is somewhat of a “toy” example. Our real interest:
exponentially large spaces, e.g. with 10^100 or more states, far
beyond full search. Humans can often still handle those! One of
the mysteries of human cognition.
An agent with several immediate options of unknown value can
decide what to do by first examining different possible
sequences of actions that lead to states of known value, and
then choosing the best sequence.
This process of looking for such a sequence is called search.
A search algorithm takes a problem as input and returns a
solution in the form of an action sequence. Once a solution is
found, the actions it recommends can be carried out. This is
called the execution phase.
Thus, we have a simple "formulate, search, execute" design for
the agent
After formulating a goal and a problem to solve, the agent
calls a search procedure to solve it.
A problem can be defined formally by four components:

1. The initial state that the agent starts in. E.g., the initial
state for the agent in Romania might be described as In(Arad).

2.A description of the possible actions available to the agent. The


most common formulation uses a successor function. Given a
particular state x, SUCCESSOR-FN(x) returns a set of (action,
successor) ordered pairs, where each action is one of the legal
actions in state x and each successor is a state that can be reached
from x by applying the action.
1. The set of all states reachable from the initial state is known
as the state space.
2. A path in the state space is a sequence of states connected
by a sequence of actions.
3. The goal test, which determines whether a given state is a
goal state.
o Sometimes there is an explicit set of possible goal states,
and the test simply checks whether the given state is one of
them.
o Sometimes the goal is specified by an abstract property
rather than an explicitly enumerated set of states. For
example, in chess, the goal is to reach a state called
“checkmate”, where the opponent's king is under attack and
can't escape.
4. A path cost function that assigns a numeric cost to each
path. The problem-solving agent chooses a cost function that
reflects its own performance measure.
o The cost of a path can be described as the sum of the costs
of the individual actions along the path.
o The step cost of taking action a to go from state x to state
y is denoted by c(x, a, y).
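To make the four components concrete, here is a minimal Python
sketch (not from the slides); the RouteProblem class name, the tiny
road map, and the uniform step cost of 1 are assumptions made for
the illustration:

# A minimal sketch of the four-component problem definition,
# using the Romania route-finding example (only a few cities).
ROADS = {  # successor function as a map: state -> {action: successor}
    'Arad':    {'go-Sibiu': 'Sibiu', 'go-Zerind': 'Zerind'},
    'Sibiu':   {'go-Fagaras': 'Fagaras', 'go-Arad': 'Arad'},
    'Fagaras': {'go-Bucharest': 'Bucharest', 'go-Sibiu': 'Sibiu'},
}

class RouteProblem:
    def __init__(self, initial, goal):
        self.initial = initial          # 1. initial state, e.g. In(Arad)
        self.goal = goal

    def successors(self, state):        # 2. SUCCESSOR-FN(x): (action, successor) pairs
        return list(ROADS.get(state, {}).items())

    def goal_test(self, state):         # 3. goal test
        return state == self.goal

    def step_cost(self, x, action, y):  # 4. step cost c(x, a, y); 1 per road here
        return 1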
If there are n locations, what is the number of possible states?
[Figure: the vacuum world state graph, with a Start state and a
set of Goal states (reach any one in the set)]
• states? The agent is in one of 8 possible world states.
• actions? Left, Right, Suck [simplified: left out No-op]
• goal test? No dirt at all locations (i.e., in one of the bottom
two states)
• path cost? 1 per action
Minimum path from Start to Goal state: 3 actions.
Alternative, longer plan: 4 actions.
Note: paths with thousands of steps before reaching the goal also
exist.
Example: The 8-puzzle “sliding tile puzzle”
Aside: variations on the goal state exist, e.g. empty square at
bottom right or in the middle.
• states? the board configurations, i.e., locations of tiles
• actions? move blank left, right, up, down
• goal test? goal state (given; tiles in order)
• path cost? 1 per move

• Finding an optimal solution of the n-puzzle family is NP-hard!
• Also, from certain states you can't reach the goal.
• Total number of states? 9! = 362,880 (not all connected… only
half can reach the goal state)
 State space S
 Successor function: x ∈ S → SUCCESSORS(x) ∈ 2^S
 Initial state s0
 Goal test: x ∈ S → GOAL?(x) = T or F
 Arc cost
 Each state is represented by a distinct node
 An arc (or edge) connects a node s to a node s’ if
s’ ∈ SUCCESSORS(s)
 The state graph may contain more than one connected component
 A solution is a path connecting the initial node I to a goal
node G (any one)
 The cost of a path is the sum of the arc costs along this path
 An optimal solution is a solution path of minimum cost
Defining the Problem as State Space Search
 State space is a set of states that a problem
can be in.

 The set of states forms a graph where two


states are connected if there is an operation
that can be performed to transform the first
state into the second.

 State space search is a process used in


artificial intelligence, in which successive
configurations or states of an instance are
considered, with the goal of finding a goal state
with a desired property.
 E.g. Game of Tic-Tac-Toe
Given any board situation, there is only a finite number of moves
that a player can make.
 Starting with an empty board, the first player may place an X in
any one of nine places. Each of these moves yields a different
board that will allow the opponent eight possible responses, and
so on.
 We can represent this collection of possible moves and responses
by regarding each board configuration as a node or state in a
graph. The links of the graph represent legal moves from one
board configuration to another. The resulting structure is a state
space graph.
 The state space representation thus enables us to treat all
possible games of tic-tac-toe as different paths through the state
space graph.
 Given this representation, an effective game strategy will search
through the graph for the paths that lead to the most wins and
fewest losses.
Portion of
the State
space for
Tic-Tac-Toe

E.g. Water Jug Problem
We give you two jugs with a maximum capacity of 4 litres and
3 litres each and a pump to fill each of the jugs. Neither has
any measuring markers on it. Now your job is to get exactly
2 litres of water into the 4-litre jug. How will you do that?
How will you define the state space?

The state space can be described as the set of ordered pairs of
integers (x, y), viz. x = 0, 1, 2, 3, or 4 and y = 0, 1, 2, or 3;
where x and y represent the quantity of water in the 4-litre jug
and 3-litre jug respectively.
The start state is (0, 0).
The goal state is (2, n) for any n.
Production rules for Water Jug Problem
[Table: the production rules, e.g. fill/empty each jug and pour
one jug into the other]
What other rules can we have?
Find a sequence of rules to solve the water jug
problem
 Requires a control structure that
loops through a simple cycle in
which some rule whose left side
matches the current state is
chosen, the appropriate change to
the state is made as described in
the corresponding right side, and
the resulting state is checked to
see if it corresponds to goal state.
 One solution to the water jug
problem
 The shortest such sequence will have an
impact on the choice of appropriate
mechanism to guide the search for
solution.
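As an illustration, here is a minimal Python sketch (not from the
slides) of such a control structure, using breadth-first search over
(x, y) states so that a shortest rule sequence is found; the rule
names are informal assumptions:

# Match a rule, apply it, check the goal - via BFS over states.
from collections import deque

def successors(x, y):                    # the classic production rules
    return {
        'fill 4-litre jug':   (4, y),
        'fill 3-litre jug':   (x, 3),
        'empty 4-litre jug':  (0, y),
        'empty 3-litre jug':  (x, 0),
        'pour 3 into 4':      (min(4, x + y), max(0, x + y - 4)),
        'pour 4 into 3':      (max(0, x + y - 3), min(3, x + y)),
    }

def solve(start=(0, 0), goal_x=2):
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        (x, y), path = frontier.popleft()
        if x == goal_x:                  # goal state (2, n)
            return path
        for rule, nxt in successors(x, y).items():
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [rule]))

print(solve())                           # one shortest rule sequence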
Introduced in 1878 by Sam Loyd.

Sam Loyd offered $1,000 of his own money to the


first person who would solve the following problem:

1 2 3 4 1 2 3 4

5 6 7 8 ? 5 6 7 8

9 10 11 12 9 10 11 12

13 14 15 13 15 14

But no one ever won the prize !!


[Figure: example 8-puzzle and 15-puzzle boards]
 8-puzzle → 9! = 362,880 states
 15-puzzle → 16! ≈ 2.09 × 10^13 states
 24-puzzle → 25! ≈ 10^25 states

But only half of these states are reachable from any


given state (but you may not know that in advance)

 A tile j appears after a tile i if either j appears on the same row
as i to the right of i, or on another row below the row of i.
 For every i = 1, 2, ..., 15, let ni be the number of tiles j < i that
appear after tile i (permutation inversions)
 N = n2 + n3 + … + n15 + row number of empty tile

Find N for the following state:
 1  2  3  4
 5 10  7  8
 9  6 11 12
13 14 15  _

n2 = 0   n3 = 0   n4 = 0
n5 = 0   n6 = 0   n7 = 1
n8 = 1   n9 = 1   n10 = 4
n11 = 0  n12 = 0  n13 = 0
n14 = 0  n15 = 0
N = 7 + 4 = 11
Proposition: (N mod 2) is invariant under any legal move
of the empty tile
Proof:
 Any horizontal move of the empty tile leaves N unchanged
 A vertical move of the empty tile changes N by an even
increment (±1 ±1 ±1 ±1)
 For a goal state g to be reachable from a state s, a necessary &
sufficient condition is that N(g) and N(s) have the same parity
 The state graph consists of two connected components of equal
size
s =  1  2  3  4      s’ =  1  2  3  4
     5  6  7  _            5  6 11  7
     9 10 11  8            9 10  _  8
    13 14 15 12           13 14 15 12

N(s’) = N(s) + 3 + 1
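As an illustration, here is a minimal Python sketch (not from the
slides) of this reachability test; boards are flat 16-entry lists
with 0 standing for the empty tile:

# Compute N (inversions + the empty tile's row) and compare parities.
def N(board):
    tiles = [t for t in board if t != 0]
    inversions = sum(1 for a in range(len(tiles))
                       for b in range(a + 1, len(tiles))
                       if tiles[a] > tiles[b])
    empty_row = board.index(0) // 4 + 1    # rows numbered 1..4
    return inversions + empty_row

def same_parity(s, g):
    return N(s) % 2 == N(g) % 2            # necessary & sufficient

goal = list(range(1, 16)) + [0]
loyd = list(range(1, 13)) + [13, 15, 14, 0]   # Sam Loyd's swapped board
print(N(goal), N(loyd))                    # -> 4 5 (different parity)
print(same_parity(loyd, goal))             # -> False: not reachable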
15-Puzzle
Sam Loyd offered $1,000 of his own money to
the first person who would solve the following
problem:

1 2 3 4 1 2 3 4

5 6 7 8 ? 5 6 7 8

9 10 11 12 9 10 11 12

13 14 15 13 15 14

N=4 N=5
So, the second state is not
reachable from the first, and
Sam Loyd took no risk with
his money ...

What is the Actual State Space?
a) The set of all states?
[e.g., a set of 16! states for the 15-puzzle]

b) The set of all states reachable from a given initial


state?
[e.g., a set of 16!/2 states for the 15-puzzle]
In general, the answer is a)
[because one does not know in advance which states are
reachable]

But a fast test determining whether a state is reachable from


another is very useful, as search techniques are often
inefficient when a problem has no solution
15-puzzle
Search space: 16!/2 = 1.0461395e+13, about 10 trillion states.
Too large to store in RAM (>= 100 TB). A challenge to search for
a path from a given board to the goal state. (Korf: disk errors
become a problem.)

Longest minimum path: 80 moves (just 17 such boards).
Average minimum solution length: 53.
People can find solutions, but not necessarily of minimum length.
See "solve it!" (gives a strategy).

Korf, R., and Schultze, P. 2005. Large-scale parallel breadth-first
search. In Proceedings of the 20th National Conference on
Artificial Intelligence (AAAI-05).
See Fifteen Puzzle Optimal Solver. With effective search: optimal
solutions in seconds! Average: milliseconds.
Where are the 10 trillion states?
[Histogram: number of states (in billions) vs. minimum distance
from the goal state (# moves); the 17 boards farthest from the
goal state are at 80 moves.]

What is it about these 17 boards, out of over 10 trillion, that
each requires 80 moves to reach? Intriguing similarities: each
number has its own few locations (e.g. the tile groups <2,5,6>,
<15,12,11>/<9,10,14>, and <3,7,8>).
Interesting machine learning task: learn to recognize the hardest
boards! (Extremal Combinatorics, e.g. LeBras, Gomes, and Selman
AAAI-12)
17 boards farthest away from goal state (80 moves)
Most regular extreme case: each quadrant of the goal state
reflected along the diagonal (“move tiles furthest away”).
 It is often not feasible (or too expensive) to build a
complete representation of the state graph.

8-, 15-, 24-Puzzles, at 100 million states/sec:
 8-puzzle → 362,880 states: 0.036 sec
 15-puzzle → 2.09 × 10^13 states: ~ 55 hours
 24-puzzle → 10^25 states: > 10^9 years
These search problems are extremely complex.

VLSI layout requires positioning millions of components and
connections on a chip to minimize area, circuit delays, and stray
capacitances, and to maximize manufacturing yield.
The layout problem comes after the logical design phase, and is
usually split into two parts:
Cell layout: the primitive components of the circuit are grouped
into cells, each of which performs some recognized function. In
cell layout, the aim is to place the cells on the chip so that
they do not overlap and there is room for the connecting wires to
be placed between the cells.
Channel routing: finds a specific route for each wire through the
gaps between the cells.
Having formulated some problems, we now need to
solve them. This is done by a search through the state
space.
These search techniques use an explicit search tree that is
generated by the initial state and the successor function.
The essence of search is following up one option and putting the
others aside for later, in case the first choice does not lead to
a solution.
The choice of which state to expand is determined by the search
strategy.
In general, we may have a search graph instead of a search tree,
when the same state can be reached from multiple paths.
A node is a data structure with five components:
STATE: the state in the state space to which the node
corresponds;
PARENT-NODE: the node in the search tree that generated this
node;
ACTION: the action that was applied to the parent to generate the
node;
PATH-COST: the cost, traditionally denoted by g(n), of the path
from the initial state to the node,
as indicated by the parent pointers;
DEPTH: the number of steps
along the path from the initial state.
If the state space is too large, it may be preferable to only
represent the initial state and (re-)generate the other states
when needed.
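As an illustration, here is a minimal Python sketch (not from the
slides) of this five-component node, plus a helper that builds a
child node; problem.step_cost is the interface sketched earlier:

from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    state: object                      # STATE in the state space
    parent: Optional['Node'] = None    # PARENT-NODE that generated this node
    action: Optional[str] = None       # ACTION applied to the parent
    path_cost: float = 0.0             # PATH-COST g(n) from the initial state
    depth: int = 0                     # DEPTH: number of steps from the root

def child_node(problem, parent, action, state):
    # Follows the parent pointers implicitly via the parent field.
    return Node(state, parent, action,
                parent.path_cost + problem.step_cost(parent.state, action, state),
                parent.depth + 1)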
Fringe
• Set of search nodes that have not been
expanded yet
• Implemented as a queue FRINGE
– INSERT(node,FRINGE)
– REMOVE(FRINGE)
• The ordering of the nodes in FRINGE defines
the search strategy
[Figure: the general tree-search algorithm]
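As an illustration of the FRINGE interface and the tree-search
figure above, here is a minimal Python sketch (not from the slides),
reusing the Node/child_node sketch earlier; with the FIFO fringe
shown, it behaves as breadth-first search:

from collections import deque

class FIFOFringe:                      # FIFO queue -> breadth-first search
    def __init__(self):
        self.q = deque()
    def insert(self, node):            # INSERT(node, FRINGE)
        self.q.append(node)
    def remove(self):                  # REMOVE(FRINGE)
        return self.q.popleft()
    def __bool__(self):
        return bool(self.q)

def tree_search(problem, fringe):
    fringe.insert(Node(problem.initial))
    while fringe:
        node = fringe.remove()
        if problem.goal_test(node.state):
            return node                # solution found
        for action, successor in problem.successors(node.state):
            fringe.insert(child_node(problem, node, action, successor))
    return None                        # failure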


• A search strategy is defined by picking the order of node
expansion.
• Strategies are evaluated along the following dimensions:
completeness: does it always find a solution if one exists?
time complexity: number of nodes generated
space complexity: maximum number of nodes in memory
optimality: does it always find a least-cost solution?
• In AI, where the graph is represented implicitly by the initial state
and successor function and is frequently infinite, complexity is
expressed in terms of three quantities:
b: maximum branching factor of the search tree
d: depth of the least-cost solution
m: maximum depth of the state space (may be ∞)
Uninformed search strategies use only the information
available in the problem definition. They have no additional
information about states beyond that provided in the
problem definition. All they can do is generate successors
and distinguish a goal state from a non-goal state.

Strategies that know whether one non-goal state is


"more promising" than another are called informed
search or heuristic search strategies.

All search strategies are distinguished by the


order in which nodes are expanded.
[Tree diagram: root A with children B, C; next level D E F G;
then H I J K; leaves L M N O P Q]
• A breadth-first search (BFS) explores nodes nearest the root
before exploring nodes further away.
• Breadth-first search can be implemented by calling TREE-SEARCH
with an empty fringe that is a first-in-first-out (FIFO) queue,
assuring that the nodes that are visited first will be expanded
first.
• The FIFO queue puts all newly generated successors at the end of
the queue, which means that shallow nodes are expanded before
deeper nodes.
• Nodes are explored in the order A B C D E F G H I J K L M N O P Q.
• J will be found before N.
Is it complete?
It is complete. If the shallowest goal node is at some finite depth d, and b
is finite, BFS will eventually find it after expanding all shallower nodes.
Is it optimal?
The shallowest goal node is not necessarily the optimal one; technically,
breadth-first search is optimal if the path cost is a non-decreasing
function of the depth of the node.
Time & Space Complexity?
The no. of nodes generated if every state has b successors and the
solution is the last node at level d (check for goal only when node is
expanded)
1 + b + b^2 + b^3 + … + b^d + b(b^d − 1) = O(b^(d+1))
Every node that is generated must remain in memory, because it is
either part of the fringe or is an ancestor of a fringe node. The
space complexity is, therefore, the same as the time complexity.
Those who do complexity analysis
are worried. Why?

The following table lists the time and memory required for a
breadth-first search with
branching factor b = 10, for various values of the solution depth
d.
The table assumes that 10,000 nodes can be generated per
second
A node requires 1000 bytes of storage.
Many search problems fit roughly within these assumptions (give
or take a factor of 100) when run on a modern personal computer.
Time and memory requirements for breadth-first search.
The numbers shown assume branching factor b = 10;
10,000 nodes/second; 1000 bytes/node.



The memory requirements are a bigger problem for BFS than is the
execution time. 31 hours would not be too long to wait for the
solution to an important problem of depth 8, but few computers
have the terabyte of main memory it would take.
The time requirements are still a major factor. If your
problem has a solution at depth 12, then (given our
assumptions) it will take 35 years for BFS (or indeed
any uninformed search) to find it.
In general, exponential-complexity search problems
cannot be solved by uninformed methods for any but
the smallest instances.
Expands the node n with the lowest path cost.
It does not care about the number of steps a path has, but only
about their total cost.
fringe = queue ordered by path cost
Complete - provided the cost of every step is greater than or
equal to some small positive constant c.
Optimality - the algorithm expands nodes in order of
increasing path cost. Therefore, the first goal node selected for
expansion is the optimal solution.

Time & Space Complexity
o Uniform-cost search is guided by path costs rather than depths, so
its complexity cannot easily be characterized in terms of b and d.
o Instead, let C* be the cost of the optimal solution, & assume that
every action costs at least ε.
o Then the algorithm's worst-case time and space complexity is
O(b^(1+C*/ε)), which can be much greater than b^d.
o This is because uniform-cost search can, and often does, explore
large trees of small steps before exploring paths involving large
and perhaps useful steps.
o When all step costs are equal, b^(1+C*/ε) is just b^d.

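As an illustration, here is a minimal Python sketch (not from the
slides) of uniform-cost search, using the problem interface sketched
earlier; the fringe is a priority queue ordered by path cost g(n):

import heapq

def uniform_cost_search(problem):
    frontier = [(0, problem.initial, [])]   # (g, state, path of actions)
    explored = set()
    while frontier:
        g, state, path = heapq.heappop(frontier)   # lowest path cost first
        if problem.goal_test(state):
            return path, g                  # first goal selected is optimal
        if state in explored:
            continue                        # skip stale queue entries
        explored.add(state)
        for action, succ in problem.successors(state):
            heapq.heappush(frontier,
                           (g + problem.step_cost(state, action, succ),
                            succ, path + [action]))
    return None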


• fringe = LIFO queue, i.e., put successors at front (“push on
stack”)
• Very modest memory requirements. Needs to store only a single
path from the root to a leaf node, along with the remaining
unexpanded sibling nodes for each node on the path. Once a node
has been expanded, it can be removed from memory as soon as all
its descendants have been fully explored.
• For a state space with maximum depth m, DFS requires storage of
only (bm + 1) nodes.
• Using the same assumptions as before, and assuming that nodes at
the same depth as the goal node have no successors, we find that
depth-first search would require 118 kilobytes instead of 10
petabytes at depth d = 12, a factor of 10 billion times less space.
• Nodes are explored in depth-first order (one branch is fully
explored before the next); in the tree above, N will be found
before J.
Search Strategies: Blind Search

Criterion    Breadth-First   Depth-First
Time         b^d             b^m
Space        b^d             bm
Optimal?     Yes             No
Complete?    Yes             No

b: branching factor   d: solution depth   m: maximum depth
Do the analysis of BFS & DFS on this tree.

BFS

DFS

The problem of unbounded trees can be alleviated by
supplying depth-first search with a predetermined depth
limit l.
Nodes at depth l are treated as if they have no successors.
It introduces an additional source of incompleteness if we
choose l < d, that is, the shallowest goal is beyond the
depth limit.
Depth-limited search will also be nonoptimal if we choose l
> d.
Its time complexity is O(b^l) and its space complexity is O(bl).
Depth-first search can be viewed as a special case of
depth-limited search with l = ∞.
function ID-DFS(problem) returns solution/fail
  for depth = 0 to ∞ do
    result ← DLS(problem, depth)
    if result ≠ cutoff then return result

•Iterative deepening combines the benefits of depth-first


and breadth-first search.
•Like depth-first search, its memory requirements are
very modest: O(bd).
•Like breadth-first search, it is complete when the
branching factor is finite and optimal when the path cost
is a non-decreasing function of the depth of the node.
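A minimal runnable Python sketch (not from the slides) of the
pseudocode above, including the depth-limited search (DLS) it calls;
the problem interface is the one sketched earlier, and a finite
max_depth stands in for "to ∞":

CUTOFF = object()                      # sentinel: hit the depth limit

def dls(problem, state, depth, path):
    if problem.goal_test(state):
        return path
    if depth == 0:
        return CUTOFF
    cut = False
    for action, succ in problem.successors(state):
        result = dls(problem, succ, depth - 1, path + [action])
        if result is CUTOFF:
            cut = True
        elif result is not None:
            return result              # solution found below this node
    return CUTOFF if cut else None

def id_dfs(problem, max_depth=50):
    for depth in range(max_depth + 1):
        result = dls(problem, problem.initial, depth, [])
        if result is not CUTOFF:
            return result              # solution, or None = failure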
An informed search strategy
one that uses problem-specific knowledge beyond the
definition of the problem itself
can find solutions more efficiently than an uninformed
strategy.
might not always find the best solution
but is guaranteed to find a good solution in reasonable
time.
By sacrificing completeness it increases efficiency.
Heuristics play a major role in search strategies because of
the exponential nature of most problems. Heuristics help to
reduce the number of alternatives from an exponential number
to a polynomial number.

 Unfortunately, like all rules of discovery and invention,
heuristics are fallible.
 A heuristic is only an informed guess of the next step to be
taken in solving a problem.
 It is often based on experience or intuition.
 Because heuristics use limited information, such as
knowledge of the present situation or descriptions of states
currently on the open list, they are not always able to
predict the exact behavior of the state space farther along in
the search.
 A heuristic can lead a search algorithm to a suboptimal solution
or fail to find any solution at all.
 This is an inherent limitation of heuristic search. It cannot be
eliminated by “better” heuristics or more efficient search
algorithms.
h(n) = estimated cost of the cheapest path
from node n to a goal node.
Some heuristics are better than others, and the better (more
informed) the heuristic is, the fewer nodes it needs to examine in
the search tree to find a solution.

In choosing heuristics, we usually consider that a heuristic that


reduces the number of nodes that need to be examined in
the search tree is a good heuristic.

It is also important to consider the efficiency of running the


heuristic itself. In other words, if it takes an hour to compute a
heuristic value for a given state, the fact that doing so saves a few
minutes of total search time is irrelevant.
Admissible Heuristics
h(n) is an admissible heuristic if h(n) never overestimates the
cost to reach the goal.
Admissible heuristics are by nature optimistic, because
they think the cost of solving the problem is less than it actually
is.

An algorithm in which a node is selected for expansion
based on an evaluation function f(n)
Traditionally the node with the lowest evaluation function is
selected
Not an accurate name…expanding the best node first would
be a straight march to the goal.
Choose the node that appears to be the best
There is a whole family of BEST-FIRST-SEARCH algorithms with
different evaluation functions.



The most widely-known form of best-first search is called A*
search. It evaluates nodes by combining
g(n) - the cost to reach the node, and
h(n) - the estimated cost to get from the node to any one of
the goal states:
f(n) = g(n) + h(n)
So, f(n) = estimated cost of the cheapest solution through n.
Can the value of f(n) change?
1. Initialize: Set OPEN = {s}, CLOSED = { }, g(s) = 0, f(s) = h(s)
2. Fail: If OPEN = { }, terminate & fail
3. Select: Select the minimum-cost state, n, from OPEN.
   Save n in CLOSED.
4. Terminate: If n ∈ G, terminate with success & return f(n)
5. Expand: For each successor, m, of n
   If m ∉ [OPEN ∪ CLOSED]   // expanding m for the first time
     Set g(m) = g(n) + C(n, m)
     Set f(m) = g(m) + h(m)
     Insert m in OPEN
   If m ∈ [OPEN ∪ CLOSED]
     Set g(m) = min{g(m), g(n) + C(n, m)}
     Set f(m) = g(m) + h(m)
     If f(m) has decreased & m ∈ CLOSED, move m to OPEN
6. Loop: Go to Step 2.
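As an illustration, here is a minimal Python sketch in the spirit of
the OPEN/CLOSED formulation above (simplified: reopening is handled
by pushing improved entries onto the heap); the example graph, edge
costs C(n, m), and heuristic values are invented for the sketch:

import heapq

GRAPH = {'s': [('a', 1), ('b', 4)],      # hypothetical example graph
         'a': [('b', 2), ('g', 5)],
         'b': [('g', 1)]}
H = {'s': 3, 'a': 2, 'b': 1, 'g': 0}     # admissible estimates h(n)

def a_star(start='s', goal='g'):
    g = {start: 0}
    open_heap = [(H[start], start)]      # OPEN, ordered by f = g + h
    while open_heap:
        f, n = heapq.heappop(open_heap)  # select minimum-cost state
        if n == goal:
            return f                     # f(n) = cost of cheapest solution
        for m, cost in GRAPH.get(n, []):
            if m not in g or g[n] + cost < g[m]:
                g[m] = g[n] + cost       # improved path: (re)open m
                heapq.heappush(open_heap, (g[m] + H[m], m))
    return None

print(a_star())   # -> 4  (s -> a -> b -> g)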
[Figure: a worked A* example on a 12-node weighted graph; each node
is labeled node/h-value. Node 12 is the goal state, so its
heuristic value is 0.]

CLOSED order: 1(12), 2(12), 6(12), 5(13), 10(13), 11(11), 12(13)
[Exercise figure: a 6-node weighted graph, each node labeled
node/h-value; apply the A* algorithm. Node 6 is the goal, with
h = 0.]
How Heuristics Are Developed
Consider the 8-puzzle problem. The start state of
the puzzle is a random configuration, and the
goal state is as shown.

Typically, it takes about 20 moves to get from a random start
state to the goal state, so the search tree has a depth of
around 20.
The branching factor depends on where the blank square is:
 if it is in the middle of the grid, the branching factor is 4
 if it is on an edge, the branching factor is 3
 if it is in a corner, the branching factor is 2
 So, an exhaustive search of the search tree would need to
examine around 3^20 states, which is around 3.5 billion.
 Because there are only 9! or 362,880 possible states in the
8-puzzle problem, the search tree could clearly be cut down
significantly by avoiding repeated states.
 It is useful to find ways to reduce the search tree
further, in order to devise a way to solve the problem
efficiently.
 A heuristic would help us to do this, by telling us
approximately how many moves a given state is from
the goal state.
[Figure: a start state beside the GOAL state
  1 2 3
  8   4
  7 6 5
and the three successor states produced by moving the blank
left, up, or right]

How to decide which move is the best?
The first heuristic we consider is to count
how many tiles are in the wrong place.
We will call this heuristic, h1(node).

In the case of the state shown, h1(node) = 8 because all


the tiles are in the wrong place.

However, this is misleading because we could imagine a


state with a heuristic value of 8 but where each tile
could be moved to its correct place in one move.
An improved heuristic, h2, takes into
account how far each tile has to move to
get to its correct state. This is achieved
by summing the Manhattan distances
of each tile from its correct
position.

h2(node) = 3 + 3 + 2 + 2 + 1 + 2 + 2 + 3 =
18

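As an illustration, here is a minimal Python sketch (not from the
slides) of h1 and h2; boards are flat 9-tuples with 0 for the blank,
and GOAL is the goal layout used in these slides:

GOAL = (1, 2, 3, 8, 0, 4, 7, 6, 5)   # goal layout assumed from the slides

def h1(state):
    # number of tiles in the wrong place (the blank is not counted)
    return sum(1 for i, t in enumerate(state) if t != 0 and t != GOAL[i])

def h2(state):
    # sum of Manhattan distances of each tile from its correct position
    total = 0
    for i, t in enumerate(state):
        if t == 0:
            continue
        j = GOAL.index(t)
        total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return total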


Note
h2(node) ≥ h1(node) for any node.

This means that h2 dominates h1, which means that a search
method using heuristic h2 will always perform more efficiently
than the same search method using h1.

This is because h2 is more informed than h1 .

 Although a heuristic must never overestimate the cost, it is
always better to choose the heuristic that gives the highest
possible underestimate of cost. The ideal heuristic would thus
be one that gave exactly accurate costs every time.
A third heuristic function, h3 , takes into account the fact
that there is extra difficulty involved if two tiles have to
move past each other because tiles cannot jump over
each other.
This heuristic uses a function k(node), which is equal to
the number of direct swaps that need to be made
between adjacent tiles to move them into the correct
sequence.
h3(node) = h2(node) + (2 × k(node))

 Because k(node) is at least 0, h3(node) must be greater than
or equal to h2(node), so h3 dominates h2.
In many optimization problems, the path to the goal is
irrelevant; the goal state itself is the solution
Find configuration satisfying constraints, e.g., n-queens
In such cases, we can use local search algorithms -
keep a single "current" state, try to improve it
Although local search algorithms are not systematic,
they have two key advantages:
they use very little memory-usually a constant amount;
they can often find reasonable solutions in large or infinite
(continuous) state spaces for which systematic algorithms are
unsuitable.

• State space = set of "complete" configurations

A complete local search algorithm always finds a goal if one
exists; an optimal algorithm always finds a global
minimum/maximum.

[Figure: a one-dimensional state space landscape in which
elevation corresponds to the objective function]


Hill climbing is an example of an informed search
method because it uses information about the search
space to search in a reasonably efficient manner.
You try to climb a mountain in fog with an altimeter
but no map.
Check the height 1 foot away from your current
location in each direction: north, south, east, and
west.
As soon as you find a position where the height is
higher than your current position, move to that
location and restart the algorithm.
If all directions lead lower than your current position, then
you stop and assume you have reached the summit.
Simple Hill Climbing
Algorithm:
1. Evaluate the initial state. If it is also goal state, then
return it & quit. Else continue with the initial state as
the current state.
2. Loop until a solution is found or until there are no new
operators left to be applied in the current state:
a. Select an operator that has not yet been applied to
the current state and apply it to produce a new state
b. Evaluate the new state
i. If it is the goal state, then return it and quit.
ii. If it is not a goal state but it is better than the
current state, then make it the current state.
iii. If it is not better than the current state, then
continue in the loop.
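As an illustration, here is a minimal Python sketch of this loop
(not from the slides); the neighbors() and value() functions are
placeholders standing in for the operators and the evaluation:

def simple_hill_climbing(state, neighbors, value):
    # Move to the FIRST neighbor that improves on the current state;
    # stop when no neighbor does (a local maximum, or the goal).
    while True:
        for candidate in neighbors(state):   # operators, tried in order
            if value(candidate) > value(state):
                state = candidate            # first better neighbor wins
                break
        else:
            return state                     # no better neighbor: stop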
Hill climbing example
[Figure: an 8-puzzle hill-climbing trace from a start state with
h = -4, through intermediate states, to the goal state with h = 0;
sibling states with worse scores are not chosen]
f(n) = -(number of tiles out of place)
Hill-climbing does not look ahead beyond the immediate
neighbors of the current state.
"Like climbing Everest in thick fog with amnesia”

• Will terminate when at local optimum.


• The order of application of operators can make a big
difference.
• In examining a search tree, hill climbing will move to the first
successor node that is “better” than the current node—in other
words, the first node that it comes across with a heuristic value
lower than that of the current node.
• Can’t see past a single move in the state space.
Example of a local maximum
[Figure: an 8-puzzle state scoring -3 whose successors all score
-4, even though the goal (score 0) is only a few moves away]
Steepest-Ascent Hill Climbing
A variation on simple hill climbing.
Instead of moving to the first state that is better, move
to the best possible state that is one move away.
Consider all the moves from the current state and select
the best one as the next state.
In steepest-ascent hill climbing you will always make your
next state the best successor of your current state, and will
only make a move if that successor is better than your current
state.
The order of operators does not matter.
Not just climbing to a better state, but climbing up the
steepest slope.
Algorithm: Steepest-Ascent Hill
Climbing

Jan 2, 2014 112


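As an illustration, here is a minimal Python sketch of the variant
(not from the slides), with the same placeholder interface as the
simple hill climbing sketch above:

def steepest_ascent(state, neighbors, value):
    # Examine ALL successors, then move to the best one - but only
    # if it is better than the current state.
    while True:
        succs = list(neighbors(state))
        if not succs:
            return state
        best = max(succs, key=value)
        if value(best) <= value(state):
            return state              # local maximum or plateau reached
        state = best                  # climb the steepest slope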
When Hill-climbing fails
This simple policy has three well-
known drawbacks:
 Foothills / Local Maxima: a local
maximum as opposed to global
maximum.
 Plateaus: An area of the search space
where evaluation function is flat, thus
requiring random walk.
 Ridge: The orientation of the high
region, compared to the set of available
moves, makes it impossible to climb up.
However, two moves executed serially
may increase the height.
Hill Climbing: Ways Out
 Backtrack to some earlier node and try going in a
different direction.
 This is particularly reasonable if at that node there was
another direction that looked promising.
 This is a fairly good way of dealing with local maxima.
 Make a big jump to try to get in a new section.
 This is a particularly good way of dealing with plateaus.
 If the only rules available describe single small steps, apply
them several times in the same direction.
 Moving in several directions at once.
 Apply two or more rules before doing the test.
 This is a particularly good strategy for dealing with ridges.
Hill Climbing: Some Disadvantages Still Remain
 Hill climbing is not always very effective. It is particularly
unsuited to problems where the value of the heuristic
function drops off suddenly as you move away from a
solution.
 Hill climbing is a local method: it decides what to do next by
looking only at the “immediate” consequences of its choices,
rather than by exhaustively exploring all the consequences.
 It shares with other local methods, such as the nearest neighbor
heuristic, the advantage of being less combinatorially explosive than
comparable global methods.
 But it also shares with other local methods a lack of a guarantee that
it will be effective.
 Global information might be encoded in heuristic functions.
Blocks World Problem – the operators are: pick up one block and
put it on the table; pick up one block and put it on another one

Start (top to bottom): A, D, C, B
Goal  (top to bottom): D, C, B, A


Local heuristic:
+1 for each block that is resting on the thing it is supposed to
be resting on.
-1 for each block that is resting on a wrong thing.

Start (top to bottom): A, D, C, B - score 0
Goal  (top to bottom): D, C, B, A - score 4


Hill climbing will halt because all these states have lower scores than the
current state. The process has reached a local maximum that is not the
global maximum. The problem is that by purely local examination of
support structures, the current state appears to be better than any of its
successors because more blocks rest on the correct objects. To solve this
problem, it is necessary to disassemble a good local structure (the stack B
through D) because it is in the wrong global context.

[Figure: the state reached after putting A on the table, and its
successor states with their local-heuristic scores (e.g. 0 and 2)]
From the above state, what are the possible states in which the
system can be? What are their scores?
Global heuristic:
For each block that has the correct support structure: +1 to
every block in the support structure.
For each block that has a wrong support structure: -1 to
every block in the support structure.

Start (top to bottom): A, D, C, B - score -6
Goal  (top to bottom): D, C, B, A - score +6


This new heuristic function captures the two key aspects of this
problem:
incorrect structures are bad and should be taken apart; and
correct structures are good and should be built up.
As a result, the same hill climbing procedure that failed with the
earlier heuristic function now works perfectly.
Unfortunately, it is not always possible to construct such a
perfect heuristic function.
[Figure: successor states with their global-heuristic scores
(e.g. 1, 2, 3, and 6)]
From the above state, what are the possible states in which the
system can be? What are their scores?
Hill Climbing: Conclusion
• Can be very inefficient in a large, rough
problem space.

• Global heuristic may have to pay for


computational complexity.

• Often useful when combined with other


methods, getting it started right in the right
general neighbourhood.



A hill-climbing algorithm that never makes "downhill" moves
towards states with lower value (or higher cost) is guaranteed to
be incomplete, because it can get stuck on a local maximum.
A purely random walk - that is, moving to a successor chosen
uniformly at random from the set of successors - is complete, but
extremely inefficient.
Therefore, it seems reasonable to try to combine hill climbing
with a random walk in some way that yields both efficiency and
completeness.
Simulated annealing is such an algorithm.
In metallurgy, annealing is the process used to temper or
harden metals and glass by heating them to a high
temperature and then gradually cooling them, thus
allowing the material to coalesce into a low-energy
crystalline state.
 If you heat a solid past melting point and then cool it,
the structural properties of the solid depend on the
rate of cooling.
 If the liquid is cooled slowly enough, large crystals will be
formed.
 However, if the liquid is cooled quickly (quenched) the
crystals will contain imperfections.

To understand simulated annealing, let's switch


our point of view from hill climbing to gradient
descent (i.e., minimizing cost).
 A variation of hill climbing in which, at the beginning of the
process, some uphill moves may be made.
 To do enough exploration of the whole space early on, so that
the final solution is relatively insensitive to the starting state.
 Lowering the chances of getting caught at a local minima, or
plateau, or a ridge.

Simulated Annealing
Generate a new neighbor from current state.
◦ If it’s better take it.
◦ If it’s worse then take it with some probability
proportional to the temperature and the delta
(difference) between the new and old states.



Probability of transition to a higher energy state is given by
the function:
P = e^(−ΔE/kT)

where ΔE is the positive change in the energy level, T is the
temperature, and k is Boltzmann's constant.
 Thus, in the physical valley descending that occurs during
annealing, the probability of a large uphill move is lower than
the probability of a small one.
 Also, the probability that an uphill move will be made
decreases as the temperature decreases.
 Thus such moves are more likely during the beginning of the
process, when the temperature is high, and they become less
likely at the end, as the temperature becomes lower.
[Figure: convergence of simulated annealing - cost function C vs.
number of iterations. At INIT_TEMP moves are accepted
unconditionally; hill-climbing moves are accepted with probability
e^(−ΔC/temp); at FINAL_TEMP only improving moves remain.]

One way to characterize this process is that downhill moves are
allowed anytime. Large upward moves may occur early on, but as
the process progresses, only relatively small upward moves are
allowed, until finally the process converges to a local minimum.
• Algorithm: Simulated Annealing
1. Evaluate the initial state. If it is also a goal state, then return it and quit. Otherwise, continue
with the initial state as the current state.
2. Initialize BEST-SO-FAR to the current state.
3. Initialize T according to the annealing schedule.
4. Loop until a solution is found or until there are no new operators left to be applied in the
current state.
(a) Select an operator that has not yet been applied to the current state & apply it to
produce a new state.
(b) Evaluate the new state. Compute ∆E = (value of current) - (value of new state)
• If the new state is a goal state, then return it and quit.
• If it is not a goal state but is better than the current state, then make it the current
state. Also set
BEST-SO-FAR to this new state.
• If it is not better than the current state, then make it the current state with probability p'
as defined. This step is usually implemented by invoking a random number generator to
produce a number in the range [0,1]. If that number is less than p', then the move is
accepted. Otherwise, do nothing.
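As an illustration, here is a minimal Python sketch of simulated
annealing (not from the slides) with a geometric cooling schedule;
the neighbor() and energy() functions are placeholders, and energy
is minimized, following the gradient-descent view above:

import math, random

def simulated_annealing(state, energy, neighbor,
                        t0=1.0, cooling=0.995, t_final=1e-3):
    t = t0                                   # initial temperature
    best = state
    while t > t_final:                       # "when to quit" criterion
        nxt = neighbor(state)
        delta_e = energy(nxt) - energy(state)
        # accept every improving move; accept a worsening move with
        # probability e^(-ΔE/t), which shrinks as t decreases
        if delta_e < 0 or random.random() < math.exp(-delta_e / t):
            state = nxt
            if energy(state) < energy(best):
                best = state                 # track BEST-SO-FAR
        t *= cooling                         # reduce the temperature
    return best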
Annealing Schedule
It has three/four components:

The initial value to be used for temperature.

The criteria that will be used to decide when the


temperature of the system should be reduced.

The amount by which the temperature will be


reduced each time it is changed.

There may also be a fourth component of the


schedule, namely, when to quit.
Propositional Logic, reasoning patterns in Propositional Logic,
First Order Logic, Inference in First Order Logic – Unification &
Lifting, Forward & Backward Chaining, Resolution.
A knowledge-based agent includes a knowledge base and an
inference system.
A knowledge base is a set of representations of facts of the world.
Each individual representation is called a sentence.
The sentences are expressed in a knowledge representation
language.
The agent operates as follows:
1. It TELLs the knowledge base what it perceives.
2. It ASKs the knowledge base what action it should perform.
3. It performs the chosen action.

Examples of sentences
The moon is made of paneer
If A is true then B is true
A is false
All humans are mortal
Knowledge Base
Knowledge Base: set of sentences represented in a
knowledge representation language and represents
assertions about the world.

[Diagram: the user TELLs sentences to the KB and ASKs it queries]
Inference rule: when one ASKs questions of the KB, the
answer should follow from what has been TELLed to the KB
previously.
The agent maintains a knowledge base, KB, which may initially
contain some background knowledge.
Each time the agent program is called, it does three things.
1.It TELLS the knowledge base what it perceives.
2.It ASKS the knowledge base what action it should perform.
3.The agent records its choice with TELL and executes the action.
The second TELL is necessary to let the knowledge base know that
the hypothetical action has actually been executed.
• Performance measure
– gold +1000, death -1000
– -1 per step, -10 for using the arrow
• Environment
– Squares adjacent to wumpus are smelly
– Squares adjacent to pit are breezy
– Glitter iff gold is in the same square
– Shooting kills wumpus if you are facing it
– Shooting uses up the only arrow
– Grabbing picks up gold if in same square
– Releasing drops the gold in same square
• Actuators: Left turn, Right turn, Forward, Grab, Release, Shoot
• Sensors: Stench, Breeze, Glitter, Bump, Scream

Wumpus world characterization
• Fully Observable? No – only local perception
• Deterministic? Yes – outcomes exactly specified
• Static? Yes – Wumpus and Pits do not move
• Discrete? Yes
• Episodic? No – sequential at the level of actions
• Single-agent? Yes – The wumpus itself is essentially a natural
feature, not another agent

A typical Wumpus world
• The agent always
starts in the field
[1,1].
• The task of the
agent is to find the
gold, return to the
field [1,1] and
climb out of the
cave.

Agent in a Wumpus world: Percepts
• The agent perceives
– a stench in the square containing the wumpus and in the
adjacent squares (not diagonally)
– a breeze in the squares adjacent to a pit
– a glitter in the square where the gold is
– a bump, if it walks into a wall
– a woeful scream everywhere in the cave, if the wumpus is killed
• The percepts will be given as a five-symbol list:
– If there is a stench, and a breeze, but no glitter, no
bump, and no scream, the percept is
[Stench, Breeze, None, None, None]
• The agent can not perceive its own location.
Exploring a Wumpus world
Directly observed:
S: stench
B: breeze
G: glitter
A: agent
Inferred (mostly):
OK: safe square
P: pit
W: wumpus
Exploring a wumpus world
The first step taken
by the agent in the
wumpus world.
(a) The initial situation, after percept [None, None, None, None, None].
(b) After one move, with percept [None, Breeze, None, None, None].
In 1,1 we don't get B or S, so we know 1,2 and 2,1 are safe. Move to 2,1.
In 2,1 we feel a breeze, so we know there is a pit in 3,1 or 2,2.
Exploring a wumpus world
• So go back to 1,1, then to 1,2, where we smell a stench. Percept? [Stench, None, None, None, None]
• Stench in 1,2, so the wumpus is in 1,3 or 2,2.
• We don't smell a stench in 2,1, so 2,2 can't be the wumpus, so 1,3 must be the wumpus.
• We don't feel a breeze in 1,2, so 2,2 can't be a pit, so 3,1 must be a pit.
• 2,2 has neither pit nor wumpus and is therefore okay.
• We move to 2,2. We don't get any sensory input.
• So we know that 2,3 and 3,2 are ok.
• Move to 3,2, where we observe stench, breeze and glitter!
• We have found the gold and won.
• Can represent general knowledge about an environment by a set of rules and facts
• Can gather evidence and then infer new facts by combining evidence with the rules
• The conclusions are guaranteed to be correct if
– The evidence is correct
– The rules are correct
– The inference procedure is correct
-> logical reasoning
• The inference may be quite complex
– E.g., evidence at different times, combined with different rules, etc.
Entailment
• One thing follows from another: KB |= α
• KB entails sentence α if and only if α is true in all worlds where KB is true.
• E.g. x+y=4 entails 4=x+y
• Entailment is a relationship between sentences that is based on semantics.
Worlds
• A world is a collection of propositions and logical expressions relating those propositions
• Example:
– Propositions: MohanLovesKheer, OrangesGrowInSky, …
– Expressions: MohanLovesKheer ⇒ OrangesGrowInSky
• A proposition "says something" about the world, but since it is atomic (you can't look inside it to see component parts), propositions tend to be very specialized and inflexible
Models
• Models are formal definitions of possible states of the world
• We say m is a model of a sentence α if α is true in m
• M(α) is the set of all models of α
• Then KB ╞ α if and only if M(KB) ⊆ M(α)
– E.g. KB = KKR won and Delhi Dare Devils won
– α = KKR won
(Venn diagram: M(KB) lies inside M(α).)
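The definition KB ╞ α iff M(KB) ⊆ M(α) gives a direct (if exponential) decision procedure: enumerate every truth assignment and look for a model of KB that is not a model of α. A sketch in Python, using the KKR example above:

from itertools import product

def entails(kb, alpha, symbols):
    # KB |= alpha iff alpha is true in every model of KB.
    for values in product([True, False], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if kb(model) and not alpha(model):
            return False        # found a model of KB where alpha fails
    return True                 # M(KB) is contained in M(alpha)

kb = lambda m: m["KKR"] and m["DD"]        # KKR won and Delhi Dare Devils won
alpha = lambda m: m["KKR"]                 # KKR won
print(entails(kb, alpha, ["KKR", "DD"]))   # True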
Entailment in the Wumpus World
• Situation after detecting nothing in [1,1], moving right, breeze in [2,1]
• What are the possible models? – assume the only choice for each unvisited square is pit or no pit.
(Figure: visited squares [1,1] and [2,1] marked V, the breeze B in [2,1], and the three unvisited neighboring squares marked ?.)
Wumpus Models
• There are 3 Boolean choices ⇒ 8 possible models (ignoring sensory data)
(Figure: the eight models, one for each combination of pits in the three ?-squares.)

Wumpus Models
• There is a breeze in square 2,1.
• One of our rules says: breeze in a square ⇔ pit in an adjacent square.
• So what is entailed by these facts and this rule?
• KB = wumpus-world rules + observations.
• Only three of the eight models are consistent with the KB.
Wumpus Models
• KB = wumpus world + observations
• α1 = "[1,2] is safe"
• KB |= α1: α1 holds in every model of the KB.
Wumpus Models
• KB = wumpus world + observations
• α2 = "[2,2] is safe"
• KB |= α2 ??
Wumpus Models
• KB = wumpus world + observations
• α2 = "[2,2] is safe"
• KB |= α2 does NOT hold: in some models of the KB there is a pit in [2,2].
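The three slides above can be replayed mechanically. A sketch, assuming the figures are read as pits in [1,2], [2,2] and [3,1] being the three Boolean choices:

from itertools import product

def kb(p12, p22, p31):
    # No breeze in [1,1] rules out a pit in [1,2];
    # the breeze in [2,1] requires a pit in [2,2] or [3,1].
    return (not p12) and (p22 or p31)

def entailed(alpha):
    # alpha must hold in every one of the (three) models consistent with KB.
    return all(alpha(*m) for m in product([True, False], repeat=3) if kb(*m))

print(entailed(lambda p12, p22, p31: not p12))  # alpha1 "[1,2] is safe": True
print(entailed(lambda p12, p22, p31: not p22))  # alpha2 "[2,2] is safe": False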
Representing Knowledge
• The agent that solves the wumpus world can
most effectively be implemented by a
knowledge-based approach
• Need to represent states and actions, update
internal representations, deduce hidden
properties and appropriate actions
• Need a formal representation for the KB
• And a way to reason about that representation
Logic in general
• Logics are formal languages for representing information
such that conclusions can be drawn
• Syntax defines the sentences in the language
• Semantics define the "meaning" of sentences;
– i.e., define truth of a sentence in a world
• E.g., the language of arithmetic
• x+2 ≥ y is a sentence; x2+y > {} is not a sentence
– x+2 ≥ y is true iff the number x+2 is no less than the number
y.
– x+2 ≥ y is true in a world where x = 7, y = 1
– x+2 ≥ y is false in a world where x = 0, y = 6
Syntax of Propositional Logic
• TRUE and FALSE are sentences
• The propositional variables P, Q, R, … are sentences
• Parentheses around a sentence form a sentence
• Combining sentences with the following logical connectives forms a sentence:
Symbol | Example | Name
∧ | P ∧ Q | and (Conjunction)
∨ | P ∨ Q | or (Disjunction)
¬ | ¬P | not (Negation)
⇒ | P ⇒ Q | implies (Implication)
⇔ | P ⇔ Q | is equivalent (Equivalence / biconditional)
Idempotent: P ∨ P ≡ P;  P ∧ P ≡ P
Associative: (P ∨ Q) ∨ R ≡ P ∨ (Q ∨ R);  (P ∧ Q) ∧ R ≡ P ∧ (Q ∧ R)
Commutative: P ∨ Q ≡ Q ∨ P;  P ∧ Q ≡ Q ∧ P;  P ⇔ Q ≡ Q ⇔ P
Distributive: P ∨ (Q ∧ R) ≡ (P ∨ Q) ∧ (P ∨ R);  P ∧ (Q ∨ R) ≡ (P ∧ Q) ∨ (P ∧ R)
De Morgan: ¬(P ∨ Q) ≡ ¬P ∧ ¬Q;  ¬(P ∧ Q) ≡ ¬P ∨ ¬Q
Implication elimination: P ⇒ Q ≡ ¬P ∨ Q
Biconditional elimination: P ⇔ Q ≡ (P ⇒ Q) ∧ (Q ⇒ P)
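Any of these equivalences can be confirmed by brute force over the four truth assignments; a small sketch checking De Morgan and biconditional elimination:

from itertools import product

def equivalent(f, g, n=2):
    # True iff the two formulas agree on every assignment of n truth values.
    return all(f(*v) == g(*v) for v in product([True, False], repeat=n))

# De Morgan: not(P or Q) == (not P) and (not Q)
print(equivalent(lambda p, q: not (p or q),
                 lambda p, q: (not p) and (not q)))                 # True

# Biconditional elimination: (P <=> Q) == (P => Q) and (Q => P)
print(equivalent(lambda p, q: p == q,
                 lambda p, q: ((not p) or q) and ((not q) or p)))   # True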
Symbolize the following statements:
1. Ram or Mohan will make dinner.
2. Either Ram will not make dinner or Mohan will not make dinner.
3. Ram will make dinner iff Mohan will.
4. Ram makes dinner when, and only when, Mohan does not.
5. If Ram makes dinner then Mohan will make dinner.
6. Ram will make dinner if Mohan does not make dinner.
• Deriving a logical conclusion by combining many propositions and using formal logic: hence, determining the truth of arguments.
• Definition of Argument:
An argument is a sequence of statements in which the conjunction of the initial statements (called the premises/hypotheses) is said to imply the final statement (called the conclusion).
• An argument can be presented symbolically as
(P1 Λ P2 Λ ... Λ Pn) ⇒ Q
where P1, P2, ..., Pn represent the hypotheses and Q represents the conclusion.
• Definition of valid argument:
An argument is valid if whenever the hypotheses are all true, the conclusion must also be true.
A valid argument is intrinsically true, i.e. (P1 Λ P2 Λ ... Λ Pn) ⇒ Q is a tautology.
• How to arrive at a valid argument? Using a proof sequence.
• Definition of Proof Sequence:
It is a sequence of wffs in which each wff is either a hypothesis or the result of applying one of the formal system's derivation rules to earlier wffs in the sequence.
Inference rules
• Logical inference is used to create new sentences
that logically follow from a given set of predicate
calculus sentences (KB).
• An inference rule is sound if every sentence X
produced by an inference rule operating on a KB
logically follows from the KB.
– (That is, any derived sentence is true; the inference rule does
not create any contradictions)
• An inference rule is complete if it is able to produce
every expression that logically follows from the KB.
– (That is, any true sentence is derivable)
Rule of inference | Tautology | Name
p, p ⇒ q, therefore q | [p ∧ (p ⇒ q)] ⇒ q | Modus ponens
¬q, p ⇒ q, therefore ¬p | [¬q ∧ (p ⇒ q)] ⇒ ¬p | Modus tollens
p ⇒ q, q ⇒ r, therefore p ⇒ r | [(p ⇒ q) ∧ (q ⇒ r)] ⇒ (p ⇒ r) | Hypothetical syllogism
p ∨ q, ¬p, therefore q | [(p ∨ q) ∧ ¬p] ⇒ q | Disjunctive syllogism
p, therefore p ∨ q | p ⇒ (p ∨ q) | Addition
p ∧ q, therefore p | (p ∧ q) ⇒ p | Simplification
p, q, therefore p ∧ q | [(p) ∧ (q)] ⇒ (p ∧ q) | Conjunction
p ∨ q, ¬p ∨ r, therefore q ∨ r | [(p ∨ q) ∧ (¬p ∨ r)] ⇒ (q ∨ r) | Resolution
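The Tautology column is what makes each rule sound: the formula is true under every assignment. A quick sketch verifying two of them:

from itertools import product

def tautology(f, n):
    # True iff the formula holds under all 2**n truth assignments.
    return all(f(*v) for v in product([True, False], repeat=n))

implies = lambda a, b: (not a) or b

# Modus ponens: [p and (p => q)] => q
print(tautology(lambda p, q: implies(p and implies(p, q), q), 2))   # True

# Resolution: [(p or q) and (not p or r)] => (q or r)
print(tautology(lambda p, q, r:
                implies((p or q) and ((not p) or r), q or r), 3))   # True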
An example
Using the rules of inference to build arguments:
1. It is not sunny this afternoon and it is colder than yesterday.
2. If we go swimming it is sunny.
3. If we do not go swimming then we will take a canoe trip.
4. If we take a canoe trip then we will be home by sunset.
5. We will be home by sunset. (the conclusion)

Propositions:
p  It is sunny this afternoon
q  It is colder than yesterday
r  We go swimming
s  We will take a canoe trip
t  We will be home by sunset

Hypotheses:
1. ¬p ∧ q
2. r ⇒ p
3. ¬r ⇒ s
4. s ⇒ t
Conclusion: t
Step | Reason
1. ¬p ∧ q | Hypothesis
2. ¬p | Simplification using (1)
3. r ⇒ p | Hypothesis
4. ¬r | Modus tollens using (2) and (3)
5. ¬r ⇒ s | Hypothesis
6. s | Modus ponens using (4) and (5)
7. s ⇒ t | Hypothesis
8. t | Modus ponens using (6) and (7)
Agents have no independent access to the world
• The reasoning agent often gets its knowledge about the facts of the world as a sequence of logical sentences.
• It must draw conclusions only from them, without independent access to the world.
• Thus it is very important that the agent's reasoning is sound!
Wumpus world sentences
• Let Pi,j be true if there is a pit in [i, j].
• Let Bi,j be true if there is a breeze in [i, j].
• We have
– ¬ P1,1
– ¬B1,1
– B2,1
• "Pits cause breezes in adjacent squares"
– B1,1 ⇔ (P1,2 ∨ P2,1)
– B2,1 ⇔ (P1,1 ∨ P2,2 ∨ P3,1)
• Proposition Symbols for each i,j:
– Let Pi,j be true if there is a pit in square i,j
– Let Bi,j be true if there is a breeze in square i,j
• Sentences in KB
– “There is no pit in square 1,1”
R1: ¬P1,1
– "A square is breezy iff pit in a neighboring square"
R2: B1,1 ⇔ (P1,2 ∨ P2,1)
R3: B1,2 ⇔ (P1,1 ∨ P1,3 ∨ P2,2)
– "Square 1,1 has no breeze", "Square 1,2 has a breeze"
R4: ¬B1,1
R5: B1,2
Using the knowledge base containing R1 through R5, show that there is no pit in [1,2].
Assume that the following premises are true:
R1: ¬P1,1
R2: B1,1 ⇔ (P1,2 ∨ P2,1)
R3: B1,2 ⇔ (P1,1 ∨ P1,3 ∨ P2,2)
R4: ¬B1,1
R5: B1,2
Required to prove, C: ¬P1,2
Proof sketch: biconditional elimination on R2 gives (P1,2 ∨ P2,1) ⇒ B1,1; modus tollens with R4 gives ¬(P1,2 ∨ P2,1); De Morgan gives ¬P1,2 ∧ ¬P2,1; simplification gives ¬P1,2.
Logics in General
Language | Ontological Commitment | Epistemological Commitment
Propositional Logic | facts | true / false / unknown
First-Order Logic | facts, objects, relations | true / false / unknown
Temporal Logic | facts, objects, relations, times | true / false / unknown
Probability Theory | facts | degree of belief ∈ [0,1]
Fuzzy Logic | facts with degree of truth ∈ [0,1] | known interval value
• First-Order Logic assumes that the world contains:
– Objects: e.g. people, houses, numbers, theories, colors, football games, wars, centuries, …
– Relations: e.g. red, round, prime, bogus, multistoried, brother of, bigger than, inside, part of, has color, occurred after, owns, comes between, …
– Functions: e.g. father of, best friend, third quarter of, one more than, beginning of, …
More expressive logic than propositional
• Constants are objects: Ram, mangoes
• Predicates are properties and relations: likes(Ram, mangoes) // have truth values
• Functions transform objects: fruit_of(mango_tree) // return a value
• Variables represent any object: likes(X, mangoes)
• Quantifiers qualify values of variables
– True for all objects (Universal): ∀X. likes(X, mangoes)
– Exists at least one object (Existential): ∃X. likes(X, mangoes)
E.g. Represent the following statement in predicate logic:
Everyone loves their mother.
What is the difference between the two representations below?
∀x ∀y Mother(x, y) ⇒ Loves(x, y)
(Mother used as a predicate: it will return T or F)
∀x Loves(x, Mother(x))
(Mother used as a function: it will return a value)
Components of First-Order Logic
• Sentence → AtomicSentence | ¬Sentence | Sentence Connective Sentence | Quantifier Variable, … Sentence | (Sentence)
• AtomicSentence → Predicate(Term, …) | Term = Term
• Term → Function(Term, …) | Constant | Variable
• Connective → ∧ | ∨ | ⇒ | ⇔
• Quantifier → ∀ | ∃
Quantification examples
• Everyone likes chocolate
– ∀x likes(x, chocolate)
• Someone likes chocolate
– ∃x likes(x, chocolate)
• All children like chocolate
– ∀x child(x) ⇒ likes(x, chocolate)
• Everyone likes chocolate unless they are allergic to it
– ∀x likes(x, chocolate) ∨ allergic(x, chocolate)
– ∀x ¬allergic(x, chocolate) ⇒ likes(x, chocolate)
• Not everyone likes chocolate
– ¬(∀x likes(x, chocolate))
– ∃x ¬likes(x, chocolate)
• No one likes chocolate
– ¬(∃x likes(x, chocolate))
– ∀x ¬likes(x, chocolate)
Nesting Variables
• Everyone likes some kind of food
∀y ∃x food(x) ∧ likes(y, x)
• There is a kind of food that everyone likes
∃x ∀y food(x) ∧ likes(y, x)
• Someone likes all kinds of food
∃y ∀x food(x) ⇒ likes(y, x)
• Every food has someone who likes it
∀x ∃y food(x) ⇒ likes(y, x)
Quantification with Classes
(The worked examples on this slide were an image and are not preserved in this text version.)
Equality
• We allow the usual infix = operator
– Father(Ram) = Dasharath
– ∀x, y sibling(x, y) ⇒ ¬(x = y)
• Generally, we also allow mathematical operations when needed, e.g.
– ∀x, y NatNum(x) ∧ NatNum(y) ∧ x = (y+1) ⇒ x > y
Try this!!
• In English:
– "Every Monday and Wednesday I go to John's house for dinner"
• In first order predicate logic:
∀X ((day_of_week(X, monday) ∨ day_of_week(X, weds)) ⇒
(go_to(me, house_of(john)) ∧ eat(me, dinner)))
• Note the change from "and" to "or"
– Translating is problematic
• None of the students take both Arts & Maths.
The best way to start is to see what predicates are required:
Student(x)  x is a student
Takes(x, y)  subject x is taken by y
o One way is to negate the statement that some student takes both Arts & Maths:
¬∃x [Student(x) ∧ Takes(Arts, x) ∧ Takes(Maths, x)]
Can it be replaced by a universally quantified form? See the next slide.
o Another way, using a universal quantifier:
∀x Student(x) ⇒ ¬[Takes(Arts, x) ∧ Takes(Maths, x)]

• Only one student failed in Maths.
Failed(x, y)  student y failed in subject x
o First show one student failed in Maths; then show he was the only one.
∃x [Student(x) ∧ Failed(Maths, x)]
∃x [Student(x) ∧ Failed(Maths, x) ∧ ∀y ((Student(y) ∧ ¬(x = y)) ⇒ ¬Failed(Maths, y))]
• Only one student failed in both Arts & Maths.
∃x [Student(x) ∧ Failed(Maths, x) ∧ Failed(Arts, x) ∧ ∀y ((Student(y) ∧ ¬(x = y)) ⇒ ¬(Failed(Maths, y) ∧ Failed(Arts, y)))]

• The best score in Maths is better than the best score in Arts.
Score is not a true/false value – it is a numeric value. So we need a function which will return the score.
Function: Score(subject, student)
We need a predicate to compare two scores.
Greater(x, y): x > y
• The best score in Maths is better than the best score in Arts.
One way of showing it is that for every student x who has taken Arts, there is a student y who has taken Maths and whose score in Maths is better than the score of x in Arts.
∀x [Student(x) ∧ Takes(Arts, x) ⇒ ∃y [Student(y) ∧ Takes(Maths, y) ∧ Greater(Score(Maths, y), Score(Arts, x))]]
Since this holds for all Arts scores, it holds in particular for the best one, so the best score in Maths > the best score in Arts.
• No one likes a professor unless the professor is smart.
∀x [(Professor(x) ∧ ¬Smart(x)) ⇒ ¬∃y Likes(y, x)]
We define effective procedures for answering questions posed in First-Order Logic.
• Universal Instantiation
o The rule of Universal Instantiation (UI) says that we can infer any sentence obtained by substituting a ground term (a term without variables) for the variable.
o ∀x Likes(x, Ice-cream) with the substitution {x / Ram} gives us Likes(Ram, Ice-cream)
• Existential Instantiation
o From ∃x Likes(x, Ice-cream) we may infer Likes(Man, Ice-cream) as long as Man does not appear elsewhere in the knowledge base.
o Basically, the existential sentence says there is some object satisfying a condition, and the instantiation process is just giving a name to that object. Naturally, that name must not already belong to another object.
o In logic, the new name is called a Skolem constant.
• Existential Introduction
o From Likes(Ram, Ice-cream) we may infer ∃x Likes(x, Ice-cream)
• The law says that it is a crime for a Gaul to sell potion formulas to hostile nations.
• The country Rome, an enemy of Gaul, has acquired some potion formulas, and all of its formulas were sold to it by Druid.
• Druid is a Gaul.
• Is Druid a criminal?
Predicates needed to translate these statements:
o Gaul(x)
o Hostile(z)
o Potion(y)
o Criminal(x)
o Sells(x, y, z)  x sells y to z
o Owns(x, y)  x owns y
• The law says that it is a crime for a Gaul to sell potion formulas to hostile nations.
∀x ∀y ∀z [Gaul(x) ∧ Potion(y) ∧ Hostile(z) ∧ Sells(x, y, z) ⇒ Criminal(x)]
• The country Rome, an enemy of Gaul, has acquired some potion formulas, and all of its formulas were sold to it by Druid.
Hostile(Rome)
∃y Potion(y) ∧ Owns(Rome, y)
∀y Potion(y) ∧ Owns(Rome, y) ⇒ Sells(Druid, y, Rome)
• Druid is a Gaul.
Gaul(Druid)
1. x y z [Gaul(x)  Potion(y)  Hostile(z)  Sells(x, y,
z)  Criminal(x)]

2. Hostile(Rome)

3. y Potion(y)  Owns(Rome, y)

4. y Potion(y)  Owns(Rome, y) 
Sells(Druid, y, Rome)

5. Gaul (Druid)

6. Goal ???

– Criminal(Druid)
Jan 5, 2013 195
Forward Chaining
Start with the formulas & reach the Goal.
3. ∃y Potion(y) ∧ Owns(Rome, y)
Potion(P) ∧ Owns(Rome, P)   // Existential Instantiation
4. ∀y Potion(y) ∧ Owns(Rome, y) ⇒ Sells(Druid, y, Rome)
Potion(P) ∧ Owns(Rome, P) ⇒ Sells(Druid, P, Rome)   // Universal Instantiation
We will be using Unification.
Unifying Potion(P) ∧ Owns(Rome, P) with the rule Potion(y) ∧ Owns(Rome, y) ⇒ Sells(Druid, y, Rome) under {y / P} gives Sells(Druid, P, Rome).
Then Gaul(Druid), Potion(P), Hostile(Rome) and Sells(Druid, P, Rome) unify with the law rule [Gaul(x) ∧ Potion(y) ∧ Hostile(z) ∧ Sells(x, y, z) ⇒ Criminal(x)] under {y / P, z / Rome, x / Druid}, yielding Criminal(Druid).
Backward Chaining
Start with the Goal & try to deduce whether it is true or not.
Goal: Criminal(Druid)
The goal matches the RHS of the following rule, where x has been instantiated with Druid:
[Gaul(x) ∧ Potion(y) ∧ Hostile(z) ∧ Sells(x, y, z) ⇒ Criminal(x)]
The goal Criminal(Druid) generates four subgoals: Gaul(Druid), Potion(y), Hostile(z), Sells(Druid, y, z).
• Gaul(Druid) is already solved: it is a fact.
• Sells(Druid, y, z) matches the RHS of the rule ∀y Potion(y) ∧ Owns(Rome, y) ⇒ Sells(Druid, y, Rome); y is replaced by the potion P, giving the subgoals Potion(P) and Owns(Rome, P).
• As soon as we instantiate y with P, the subgoal Potion(y) also becomes Potion(P); likewise, instantiating z with Rome turns Hostile(z) into Hostile(Rome).
• Potion(P), Owns(Rome, P) and Hostile(Rome) are all facts, so all subgoals are solved and we have completed the deduction: Criminal(Druid).
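After instantiation the argument is entirely ground, so a naive forward chainer settles it; a sketch using the facts and instantiated rules from the slides (P is the Skolem constant):

facts = {"Gaul(Druid)", "Potion(P)", "Owns(Rome,P)", "Hostile(Rome)"}

rules = [
    # (premises, conclusion)
    ({"Potion(P)", "Owns(Rome,P)"}, "Sells(Druid,P,Rome)"),
    ({"Gaul(Druid)", "Potion(P)", "Hostile(Rome)", "Sells(Druid,P,Rome)"},
     "Criminal(Druid)"),
]

# Forward chaining: keep firing rules whose premises are all known
# until no new fact can be added.
changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print("Criminal(Druid)" in facts)   # True

Backward chaining runs the same rules in the opposite direction, starting from Criminal(Druid) and recursively proving each premise, exactly as in the tree above.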
Generalized Modus Ponens (GMP)
For atomic sentences pi, pi′ and q, where there is a substitution θ such that SUBST(θ, pi′) = SUBST(θ, pi) for all i:
from p1′, p2′, …, pn′ and (p1 ∧ p2 ∧ … ∧ pn ⇒ q), infer SUBST(θ, q).
Unification
UNIFY(p, q) = σ where SUBST(σ, p) = SUBST(σ, q)
The goal of unification is finding σ.
Unification
P | Q | σ
Student(x) | Student(Ram) | {x/Ram}
Sells(Ram, x) | Sells(x, coke) | {x/coke, x/Ram}
Is it correct? No – the single variable x cannot be bound to both coke and Ram. After standardizing the variables apart:
P | Q | σ
Student(x) | Student(Ram) | {x/Ram}
Sells(Ram, x) | Sells(y, coke) | {x/coke, y/Ram}
More Unification Examples
(Uppercase X and Y are variables; a and b are constant terms.)
1 – unify(P(a,X), P(a,b))  σ = {X/b}
2 – unify(P(a,X), P(Y,b))  σ = {Y/a, X/b}
3 – unify(P(a,X), P(Y,f(a)))  σ = {Y/a, X/f(a)}
4 – unify(P(a,X), P(X,b))  σ = failure
Note: If P(a,X) and P(X,b) are independent, then we can replace X with Y and get the unification to work.
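A compact sketch of UNIFY over tuple-encoded terms, reproducing examples 1–4 above (variables are declared explicitly, and the occur check is omitted for brevity):

VARS = {"X", "Y", "Z"}              # the variable names used in this sketch

def is_var(t):
    return t in VARS

def unify(x, y, s=None):
    # Returns a substitution s with SUBST(s, x) = SUBST(s, y), or None.
    s = {} if s is None else s
    if x == y:
        return s
    if is_var(x):
        return unify_var(x, y, s)
    if is_var(y):
        return unify_var(y, x, s)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):
            s = unify(xi, yi, s)
            if s is None:
                return None
        return s
    return None                     # e.g. two different constants

def unify_var(v, t, s):
    if v in s:
        return unify(s[v], t, s)    # v is already bound: unify its value
    if is_var(t) and t in s:
        return unify(v, s[t], s)
    return {**s, v: t}              # extend the substitution

print(unify(("P", "a", "X"), ("P", "a", "b")))          # {'X': 'b'}
print(unify(("P", "a", "X"), ("P", "Y", "b")))          # {'Y': 'a', 'X': 'b'}
print(unify(("P", "a", "X"), ("P", "Y", ("f", "a"))))   # {'Y': 'a', 'X': ('f', 'a')}
print(unify(("P", "a", "X"), ("P", "X", "b")))          # None (failure)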
Introduction to Artificial Intelligence – Introduction to AI, Intelligent agents, Problem Solving by Searching, Informed Search & Exploration – Heuristic Search Strategies, Heuristic Functions, Hill climbing, Simulated Annealing search.

Knowledge & Reasoning – Propositional Logic, reasoning patterns in Propositional Logic, First order Logic, Inference in First Order Logic – Unification & Lifting, Forward & backward Chaining, resolution.

Knowledge Representation – Ontological Engineering, Categories & Objects, Action, Situation & Events, Semantic Networks.