
Artificial Intelligence

1
Introduction

• What is AI?
• The foundations of AI
• A brief history of AI
• The state of the art
• Introductory problems

2
What is AI?

3
What is AI?

• Intelligence: "the ability to learn, understand and think" (Oxford dictionary)
• AI is the study of how to make computers do things which, at the moment, people do better.
• Examples: speech recognition, smell, face and object recognition, intuition, inferencing, learning new skills, decision making, abstract thinking.
4
What is AI?

Thinking humanly | Thinking rationally
Acting humanly   | Acting rationally

5
Acting Humanly: The Turing Test
• The Turing Test is a method of inquiry in artificial intelligence (AI) for determining whether or not a computer is capable of thinking like a human being. The test is named after Alan Turing, who proposed it in 1950.
• Turing proposed that a computer can be said to possess artificial intelligence if it can mimic human responses under specific conditions.
• The original Turing Test requires three terminals, each of which is physically
separated from the other two. One terminal is operated by a computer, while the
other two are operated by humans.
• During the test, one of the humans functions as the questioner, while the second
human and the computer function as respondents. The questioner interrogates the
respondents within a specific subject area, using a specified format and context. After
a preset length of time or number of questions, the questioner is then asked to decide
which respondent was human and which was a computer.
• The test is repeated many times. If the questioner makes the correct determination in
half of the test runs or less, the computer is considered to have artificial intelligence
because the questioner regards it as "just as human" as the human respondent.

[Figure: the imitation game — a human interrogator questions both a human and an AI system and must decide which respondent is which.]
6
Acting Humanly: The Turing Test

• Predicted that by 2000, a machine might have a 30% chance of fooling a lay person for 5 minutes.
• Anticipated all major arguments against AI in the following 50 years.
• Suggested major components of AI: knowledge, reasoning, language understanding, learning.

7
Thinking Humanly: Cognitive Modelling

• Not content to have a program correctly solve a problem; more concerned with comparing its reasoning steps to traces of humans solving the same problem.
• Requires testable theories of the workings of the human mind: cognitive science.

8
Thinking Rationally: Laws of Thought

• Aristotle was one of the first to attempt to codify "right thinking", i.e. irrefutable reasoning processes.
• Formal logic provides a precise notation and rules for representing and reasoning with all kinds of things in the world.
• Obstacles:
  – Informal knowledge representation.
  – Computational complexity and resources.
9
Acting Rationally

• Acting so as to achieve one's goals, given one's beliefs.
• Does not necessarily involve thinking.
• Advantages:
  – More general than the "laws of thought" approach.
  – More amenable to scientific development than human-based approaches.

10
The Foundations of AI

• Philosophy (c. 428 BC – present):
  – Logic, methods of reasoning.
  – Mind as a physical system.
  – Foundations of learning, language, and rationality.

• Mathematics (c. 800 – present):
  – Formal representation and proof.
  – Algorithms, computation, decidability, tractability.
  – Probability.

11
The Foundations of AI

• Psychology (1879 – present):
  – Adaptation.
  – Phenomena of perception and motor control.
  – Experimental techniques.

• Linguistics (1957 – present):
  – Knowledge representation.
  – Grammar.

12
A Brief History of AI
• The gestation of AI (1943 – 1956):
  – 1943: McCulloch & Pitts: Boolean circuit model of the brain.
  – 1950: Turing's "Computing Machinery and Intelligence".
  – 1956: McCarthy's name "Artificial Intelligence" adopted.

• Early enthusiasm, great expectations (1952 – 1969):
  – Early successful AI programs: Samuel's checkers, Newell & Simon's Logic Theorist, Gelernter's Geometry Theorem Prover.
  – Robinson's complete algorithm for logical reasoning.


13
A Brief History of AI

• A dose of reality (1966 – 1974):
  – AI discovered computational complexity.
  – Neural network research almost disappeared after Minsky & Papert's book in 1969.

• Knowledge-based systems (1969 – 1979):
  – 1969: DENDRAL by Buchanan et al.
  – 1976: MYCIN by Shortliffe.
  – 1979: PROSPECTOR by Duda et al.



14
A Brief History of AI

• AI becomes an industry (1980 – 1988):
  – The expert systems industry booms.
  – 1981: Japan's 10-year Fifth Generation project.

• The return of NNs and novel AI (1986 – present):
  – Mid-80s: the back-propagation learning algorithm is reinvented.
  – The expert systems industry busts.
  – 1988: Resurgence of probability.
  – 1988: Novel AI (ALife, GAs, Soft Computing, ...).
  – 1995: Agents everywhere.
  – 2003: Human-level AI back on the agenda.
15
Task Domains of AI
• Mundane Tasks:
  – Perception
    • Vision
    • Speech
  – Natural Language
    • Understanding
    • Generation
    • Translation
  – Common-sense reasoning
  – Robot control
• Formal Tasks:
  – Games: chess, checkers, etc.
  – Mathematics: geometry, logic, proving properties of programs
• Expert Tasks:
  – Engineering (design, fault finding, manufacturing planning)
  – Scientific analysis
  – Medical diagnosis
  – Financial analysis

16
Physical Symbol System

• The heart of AI is based on Newell & Simon's hypothesis: the Physical Symbol System Hypothesis.
• A PSS consists of:
  – Symbols: a set of entities that are physical patterns
  – Symbol structures ("expressions"): a number of instances/tokens of symbols related in some physical way
  – Processes: operate on these expressions to produce other expressions

[Figure: individual symbols (S1, S2, ...) combine into symbol structures or "expressions", and the system holds a collection of such expressions.]

• At any given time, a PSS contains:
  – a collection of expressions
  – processes that operate on these expressions: creation, modification, reproduction, destruction
Physical Symbol System

Key Concepts

• Designation: a symbol or expression can refer to something else.
• Interpretation: an expression can refer to its own computational processes, which the system can evoke and execute.
Examples of Symbol Systems

System            Symbols                          Expressions                        Processes
Logic             And, Or, Not, T/F                Propositions                       Rules of inference
Algebra           +, -, *, /, y, z, 1, 2, 3, ...   Equations (2 + 3 = 5)              Rules of algebra
Digital computer  0, 1                             00000111000...                     Program
Chess             Chess pieces                     Positions of pieces on the board   Legal chess moves
Physical Symbol System Hypothesis

– By Allen Newell and Herbert Simon:

"A physical symbol system has the necessary and sufficient means for general intelligent action."

Claims:
– Human thinking is a kind of symbol manipulation (symbol manipulation is necessary for intelligence)
– Machines can be intelligent, because symbol manipulation is sufficient for intelligence

This is the core assumption of Strong AI.

Grounds of PSS

• The brain appears similar to a physical symbol system.
• Psychological experiments: decision making, learning processes.
• Running AI programs: chess, checkers, theorem proving.
• The productivity of human thought.

Examples of PSS

System            Symbols           Expressions                        Processes
Digital computer  0, 1              00000111000...                     Program
Chess             Chess pieces      Positions of pieces on the board   Legal chess moves
Brain             Encoded in brain  Thoughts                           Mental operations (thinking)
AI program        Data              Data                               Data
AI Technique
• Intelligence requires knowledge
• Knowledge possesses some less desirable properties:
  – It is voluminous
  – It is hard to characterize accurately
  – It is constantly changing
  – It differs from data by being organized in the way it will be used
• An AI technique is a method that exploits knowledge, which should be represented in such a way that:
  – The knowledge captures generalizations
  – It can be understood by the people who must provide it
  – It can be easily modified to correct errors
  – It can be used in a variety of situations

26
The State of the Art

• Computer beats human in a chess game.


• Computer-human conversation using speech
recognition.
• Expert system controls a spacecraft.
• Robot can walk on stairs and hold a cup of water.
• Language translation for webpages.
• Home appliances use fuzzy logic.
• ......
27
Tic Tac Toe

• Three programs are presented. The series increases in:
  – complexity
  – use of generalization
  – clarity of their knowledge
  – extensibility of their approach

28
Introductory Problem: Tic-Tac-Toe

[Figure: a tic-tac-toe board with two X's and one O.]

29
Introductory Problem: Tic-Tac-Toe
Program 1:
Data Structures:
• Board: a 9-element vector representing the board, with positions 1-9 for each square. An element contains 0 if the square is blank, 1 if it is filled by X, or 2 if it is filled by O.
• Movetable: a large vector of 19,683 (3^9) elements, each of which is a 9-element vector.
Algorithm:
1. View the board vector as a ternary number. Convert it to a decimal number.
2. Use the computed number as an index into the Move-Table and access the vector stored there.
3. Set the new board to that vector.

30
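As a minimal sketch of steps 1-2 (assuming the 0/1/2 board encoding above; names are illustrative, not from the original program):

def board_to_index(board):
    """Interpret a 9-element board (0 = blank, 1 = X, 2 = O) as a
    ternary number and return its decimal value in 0..19682."""
    index = 0
    for cell in board:               # most significant ternary digit first
        index = index * 3 + cell
    return index

# The index selects the successor position from a 19,683-entry Move-Table:
#   new_board = move_table[board_to_index(board)]
print(board_to_index([0] * 9))           # 0     (empty board)
print(board_to_index([1] + [0] * 8))     # 6561  (= 3^8, X in square 1)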
Introductory Problem: Tic-Tac-Toe

Comments:
This program is very efficient in time, but:
1. It takes a lot of space to store the Move-Table.
2. It takes a lot of work to specify all the entries in the Move-Table.
3. It is difficult to extend.

31
Introductory Problem: Tic-Tac-Toe

1 2 3
4 5 6
7 8 9

32
Introductory Problem: Tic-Tac-Toe
Program 2:
Data Structure: a nine-element vector representing the board. But instead of using 0, 1 and 2 in each element, we store 2 for blank, 3 for X and 5 for O.
Functions:
Make2: returns 5 if the centre square is blank; otherwise returns any other blank square.
Posswin(p): returns 0 if player p cannot win on his next move; otherwise it returns the number of the square that constitutes a winning move. It works by computing the product of each row, column and diagonal: if the product is 18 (3x3x2), then X can win; if the product is 50 (5x5x2), then O can win.
Go(n): makes a move in square n.
Strategy:
Turn = 1: Go(1)
Turn = 2: If Board[5] is blank, Go(5), else Go(1)
Turn = 3: If Board[9] is blank, Go(9), else Go(3)
Turn = 4: If Posswin(X) ≠ 0, then Go(Posswin(X))
.......
33
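As a hedged sketch of Posswin under the 2/3/5 encoding above (the line list and helper names are mine, not from the original program):

LINES = [(1,2,3), (4,5,6), (7,8,9), (1,4,7), (2,5,8), (3,6,9), (1,5,9), (3,5,7)]

def posswin(board, p):
    """board[1..9] holds 2 (blank), 3 (X) or 5 (O).
    Return the square where player p ('X' or 'O') can win next, else 0."""
    target = 18 if p == 'X' else 50        # 3*3*2 or 5*5*2
    for line in LINES:
        product = board[line[0]] * board[line[1]] * board[line[2]]
        if product == target:
            for sq in line:
                if board[sq] == 2:         # the blank square completes the line
                    return sq
    return 0

board = [None, 3, 3, 2, 2, 5, 2, 2, 2, 2]  # X on 1 and 2; O on 5; index 0 unused
print(posswin(board, 'X'))                 # -> 3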
Introductory Problem: Tic-Tac-Toe

Comments:
1. Not efficient in time, as it has to check several conditions before making each move.
2. Easier to understand the program's strategy.
3. Hard to generalize.

34
Introductory Problem: Tic-Tac-Toe

An alternative encoding numbers the squares as a magic square, in which every row, column and diagonal sums to 15:

8 3 4
1 5 9
6 7 2

A winning square can then be found by arithmetic: for each pair of a player's squares, compute 15 minus their sum, e.g. 15 - (8 + 5) = 2.
35
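A hedged sketch of this number-counting check (the pairing logic and names are my own illustration):

from itertools import combinations

MAGIC = {1: 8, 2: 3, 3: 4, 4: 1, 5: 5, 6: 9, 7: 6, 8: 7, 9: 2}  # square -> magic value
SQUARE = {v: s for s, v in MAGIC.items()}                        # magic value -> square

def winning_square(my_squares, free_squares):
    """Return a square completing a 15-sum (i.e. a winning line), else None."""
    for a, b in combinations(my_squares, 2):
        need = 15 - (MAGIC[a] + MAGIC[b])
        sq = SQUARE.get(need)
        if sq is not None and sq in free_squares:
            return sq
    return None

print(winning_square({1, 5}, {3, 6, 7, 8, 9}))   # magic 8 + 5 -> need 2 -> square 9

This works because, in the 3x3 magic square, the eight triples of distinct values summing to 15 are exactly the eight winning lines.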
Introductory Problem: Tic-Tac-Toe

Comments:
1. Checking for a possible win is quicker.
2. A human finds the row-scan approach easier, while a computer finds the number-counting approach more efficient.

36
Introductory Problem: Tic-Tac-Toe

Program 3:
1. If it is a win, give it the highest rating.
2. Otherwise, consider all the moves the opponent could make next. Assume the opponent will make the move that is worst for us. Assign the rating of that move to the current node.
3. The best node is then the one with the highest rating.
37
Introductory Problem: Tic-Tac-Toe

Comments:
1. Requires much more time, to consider all possible moves.
2. Could be extended to handle more complicated games.

38
Introductory Problem: Question
Answering
“Mary went shopping for a new coat. She found a red
one she really liked. When she got it home, she
discovered that it went perfectly with her favourite
dress”.

Q1: What did Mary go shopping for?

Q2: What did Mary find that she liked?

Q3: Did Mary buy anything?

39
Introductory Problem: Question
Answering
Program 1:
1. Match predefined templates to questions to generate text patterns.
2. Match text patterns to input texts to get answers.

"What did X Y?" matches "What did Mary go shopping for?",
giving the text pattern "Mary go shopping for Z".
Matching this against the story yields Z = a new coat.
40
Introductory Problem: Question
Answering
Program 2:
Structured representation of sentences:

Event2:                     Thing1:
  instance: Finding           instance: Coat
  tense:    Past              colour:   Red
  agent:    Mary
  object:   Thing1
41
Introductory Problem: Question
Answering
Program 3:
Background world knowledge (a shopping script; C = customer, M = merchandise, L = location):

  C finds M
    -> C leaves L (without buying)
    -> C buys M -> C leaves L -> C takes M

Since the story says Mary took the coat home, the "buys" branch applies, so the answer to Q3 is yes.
42
To build a system for solving the
problem
• Define the problem precisely
• Analyse the problem
• Isolate and represent the task knowledge
• Choose best problem solving technique

43
State Space Search
• Move from the initial state to a final state using rules:
  – Initial state => starting board position
  – Rules => legal moves, e.g. in chess: "while pawn at square (a,b) -> move pawn from (a,b) to (c,d)"
  – Goal => a win state
• The state space representation forms the basis for most AI techniques.

44
A Water Jug Problem
Puzzle-Solving as Search
• You have a 4-gallon and a 3-gallon water jug
• You have a faucet with an unlimited amount of water
• You need to get exactly 2 gallons in 4-gallon jug

• State representation: (x, y)


– x: Contents of four gallon
– y: Contents of three gallon

• Start state: (0, 0)


• Goal state (2, n)
• Operators
– Fill 3-gallon from faucet, fill 4-gallon from faucet
– Fill 3-gallon from 4-gallon , fill 4-gallon from 3-gallon
– Empty 3-gallon into 4-gallon, empty 4-gallon into 3-gallon
– Dump 3-gallon down drain, dump 4-gallon down drain
Production Rules for the Water Jug Problem

1   (x,y) -> (4,y)  if x < 4                        Fill the 4-gallon jug
2   (x,y) -> (x,3)  if y < 3                        Fill the 3-gallon jug
3   (x,y) -> (x-d,y)  if x > 0                      Pour some water out of the 4-gallon jug
4   (x,y) -> (x,y-d)  if y > 0                      Pour some water out of the 3-gallon jug
5   (x,y) -> (0,y)  if x > 0                        Empty the 4-gallon jug on the ground
6   (x,y) -> (x,0)  if y > 0                        Empty the 3-gallon jug on the ground
7   (x,y) -> (4, y-(4-x))  if x+y >= 4 and y > 0    Pour water from the 3-gallon jug into the 4-gallon jug until the 4-gallon jug is full
8   (x,y) -> (x-(3-y), 3)  if x+y >= 3 and x > 0    Pour water from the 4-gallon jug into the 3-gallon jug until the 3-gallon jug is full
9   (x,y) -> (x+y, 0)  if x+y <= 4 and y > 0        Pour all the water from the 3-gallon jug into the 4-gallon jug
10  (x,y) -> (0, x+y)  if x+y <= 3 and x > 0        Pour all the water from the 4-gallon jug into the 3-gallon jug
11  (0,2) -> (2,0)                                  Pour the 2 gallons from the 3-gallon jug into the 4-gallon jug
12  (x,2) -> (0,2)                                  Empty the 4-gallon jug on the ground
One Solution to the Water Jug Problem

Gallons in the   Gallons in the   Rule
4-Gallon Jug     3-Gallon Jug     Applied
0                0                2
0                3                9
3                0                2
3                3                7
4                2                5 or 12
0                2                9 or 11
2                0
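The rule table translates directly into a search program. Below is a hedged BFS sketch in Python; the successor function merges the pour rules into their maximal-pour forms, and all names are my own:

from collections import deque

def successors(state):
    x, y = state                       # x: 4-gallon jug, y: 3-gallon jug
    return {
        (4, y), (x, 3),                               # fill a jug (rules 1, 2)
        (0, y), (x, 0),                               # empty a jug (rules 5, 6)
        (min(4, x + y), max(0, x + y - 4)),           # pour 3-gal into 4-gal (rules 7, 9)
        (max(0, x + y - 3), min(3, x + y)),           # pour 4-gal into 3-gal (rules 8, 10)
    }

def solve(start=(0, 0)):
    """Breadth-first search for any state with 2 gallons in the 4-gallon jug."""
    frontier, parent = deque([start]), {start: None}
    while frontier:
        state = frontier.popleft()
        if state[0] == 2:                             # goal test: (2, n)
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                frontier.append(nxt)

print(solve())   # one shortest path, e.g. [(0,0),(0,3),(3,0),(3,3),(4,2),(0,2),(2,0)]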
• To provide formal description about the
problem

– Define state space


– Define Operators/rules
– Define start state
– Define goal states

50
Production system

• A way to structure AI programs that facilitates describing and performing the search process in an efficient manner.
• It consists of:
  – A set of rules, each with an LHS that determines the applicability of the rule and an RHS that describes the operation to be performed
  – One or more knowledge bases
  – A control strategy that specifies the order in which the rules will be compared and the way of resolving conflicts

51
Control Strategy
• It should avoid loops

• It should be systematic

52
Breadth First Search
• Maintain a queue of nodes to visit
• Evaluation
  – Complete? Yes
  – Does not get trapped exploring a blind alley
  – The best (shallowest) solution can be found if multiple solutions exist

[Figure: a tree rooted at a, with children b, c and grandchildren d, e, f, g, h, expanded level by level.]
© Daniel S. Weld 53
Depth First Search
• Maintain a stack of nodes to visit
• Evaluation
  – Complete? Not for infinite spaces
  – Requires less memory
  – By chance, can find a solution without examining much of the search space

[Figure: the same tree explored depth-first: a, then b and b's descendants, before backtracking.]
© Daniel S. Weld 54
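A minimal sketch of both strategies in Python; the only difference is which end of the frontier is popped (the graph and names are illustrative assumptions):

from collections import deque

GRAPH = {'a': ['b', 'c'], 'b': ['d', 'e'], 'c': ['f', 'g', 'h'],
         'd': [], 'e': [], 'f': [], 'g': [], 'h': []}

def search(start, goal, breadth_first=True):
    """BFS pops from the front of the frontier (a queue);
    DFS pops from the back (a stack)."""
    frontier, visited = deque([start]), {start}
    while frontier:
        node = frontier.popleft() if breadth_first else frontier.pop()
        if node == goal:
            return True
        for child in GRAPH[node]:
            if child not in visited:
                visited.add(child)
                frontier.append(child)
    return False

print(search('a', 'g', breadth_first=True))    # BFS: True
print(search('a', 'g', breadth_first=False))   # DFS: True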
Heuristic search
• A technique that improves the efficiency of the search process by sacrificing claims of completeness
• Used to solve hard problems
• General-domain heuristics: e.g. the nearest-neighbour heuristic
• Domain-specific heuristics: select the locally superior alternative at each step

55
Heuristic Function

• Evaluates individual problem states and determines how desirable they are
• Maps problem state descriptions to a measure of desirability, usually represented as a number
• Gives an estimate of whether a node is on the desirable path to a solution
• Guides the search process toward a solution
• Guides which path to follow when several are available

56
Problem Characteristics

1. Is the problem decomposable?


2. Can solution steps be ignored or undone?
3. Is the universe predictable?
4. Is a good solution absolute or relative?
5. Is the solution a state or a path?
6. What is the role of knowledge?
7. Does the task require interaction with a
person?

57
58
Can solution steps be ignored?
• Mathematical theorem proving:
- Start with proving a lemma
- If not useful we ignore it and start with another one
• 8-puzzle :
- We make a wrong move and realize it
- Backtrack and find next alternative move
• Chess:
  - We make a move
  - It is not possible to take it back

The problems are classified into 3 types:
1. Ignorable, in which solution steps can be ignored, e.g. theorem proving
2. Recoverable, in which solution steps can be undone, e.g. the 8-puzzle
3. Irrecoverable, in which solution steps cannot be undone, e.g. chess
59
Is the Problem Universe Predictable?

• 8-puzzle problem:
  – When we make a move, we know what will happen
  – It is possible to plan an entire sequence of moves and be confident of the resulting state
  – We can backtrack to earlier moves
• Bridge game:
  – We need to plan before the first play, but cannot play with certainty
  – The outcome is uncertain
60
61
Is the solution a state or a path?

• "The bank president came for the function."
• "Bank" has 2 interpretations: river bank, or financial institution.
• Based on context it is interpreted as a financial institution; here the solution is a state.
• For the water jug problem, the solution is a path.

62
Role of knowledge

• Chess:
  – Knowledge to represent the rules for legal moves
  – Requires a control mechanism for the search procedure
• Election survey:
  – The knowledge is the survey reports collected from various locations (voluminous)

63
Production system
characteristics
• Monotonic: execution of a rule never prevents
the execution of other applicable rules
• Non-monotonic: without this property
• Partially commutative: if application of a
particular sequence of rules transforms state x
into state y, then any permutation of those
rules that are allowable, transforms x into y
• Commutative: monotonic + partially
commutative

64
Problems and production systems

                           Monotonic            Non-monotonic
Partially commutative      Theorem proving      Robot navigation
Non-partially commutative  Chemical synthesis   Bridge
65

Problems and production systems

                           Monotonic            Non-monotonic
Partially commutative      Ignorable            Recoverable
Non-partially commutative  Irrecoverable        Irrecoverable, unpredictable
66
Sample problems - 8-puzzle

67
Sample problems - 8-puzzle

68
Sample problems - 8 queens

69
Sample problems - 8 queens

70
Sample problems - 8 queens

71
Sample problems -
Cryptarithmetic

72
Sample problems -
Cryptarithmetic

73
Heuristic Search
• Heuristic - a “rule of thumb” used to help guide search
– often, something learned experientially and recalled when
needed
• Heuristic Function - function applied to a state in a search space to
indicate a likelihood of success if that state is selected
– heuristic search methods are known as “weak methods”
because of their generality and because they do not apply a
great deal of knowledge
– the methods themselves are not domain or problem specific,
only the heuristic function is problem specific
• Heuristic Search –
– given a search space, a current state and a goal state
– generate all successor states and evaluate each with our
heuristic function
– select the move that yields the best heuristic value
• Here and in the accompanying notes, we examine various heuristic
search algorithms
– heuristic functions can be generated for a number of problems
like games, but what about a planning or diagnostic situation?
Example Heuristic Function
• Simple heuristic for 8-puzzle:
– add 1 point for each tile in the right location
– subtract 1 point for each tile in the wrong location
• Better heuristic for 8-puzzle
– add 1 point for each tile in the right location
– subtract 1 point for each move to get a tile to the right location
• The first heuristic only takes into account the local tile
position
– it doesn’t consider such factors as groups of tiles in proper
position
– we might differentiate between the two types of heuristics as
local vs global

Goal:      Current:     Moves:
1 2 3      4 2 3        7 down  (simple: -5, better: -8)
4 5 6      5 7 1        6 right (simple: -5, better: -8)
7 8        6 8          8 left  (simple: -3, better: -7)

• Simple heuristic after "7 down" (+1 per tile in the right place, -1 otherwise):
    4(-1)  2(+1)  3(+1)
    5(-1)   (-1)  1(-1)
    6(-1)  7(-1)  8(-1)
• Better heuristic after "7 down": -5 - 3 (moves needed to put 1 and 5 in the correct locations) = -8
Example Heuristics: 8 Puzzle

From the start state, which operator do we select (which state do we move
into)? The first two heuristics would recommend the middle choice (in this case,
we want the lowest heuristic value) while the third heuristic tells us nothing
useful (at this point because too much of the puzzle is not yet solved)
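As a hedged sketch, here are the two kinds of 8-puzzle heuristics in Python; the board encoding is my assumption, and the distance-based function is a common stand-in for the "better", move-counting heuristic:

GOAL = {1: (0,0), 2: (0,1), 3: (0,2), 4: (1,0), 5: (1,1), 6: (1,2), 7: (2,0), 8: (2,1)}

def simple_score(board):
    """+1 per tile in the right place, -1 per tile in the wrong place.
    board is a 3x3 list of lists with 0 for the blank."""
    score = 0
    for r in range(3):
        for c in range(3):
            tile = board[r][c]
            if tile:
                score += 1 if GOAL[tile] == (r, c) else -1
    return score

def manhattan_cost(board):
    """Sum of each tile's distance from its goal square (lower is better)."""
    cost = 0
    for r in range(3):
        for c in range(3):
            tile = board[r][c]
            if tile:
                gr, gc = GOAL[tile]
                cost += abs(gr - r) + abs(gc - c)
    return cost

state = [[4, 2, 3], [5, 7, 1], [6, 8, 0]]
print(simple_score(state), manhattan_cost(state))   # -2 10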
Heuristic Search Algorithms

• Generate and Test


• Hill Climbing
• Best First Search
• Problem Reduction
• Constraint Satisfaction
• Means Ends Analysis

78
Generate and Test

• Depth First Search with backtracking


• Time consuming
• Space complexity is high
• By chance may find best solution in minimal
time

79
Hill Climbing
• Given an initial state perform the following
until you reach a goal state or a dead end
– generate all successor states
– evaluate each state with the heuristic function
– move to the state that is highest
• This algorithm only tries to improve during
each selection, but not find the best solution
• 3 types
– Simple HC
– Steepest Ascent HC
– Simulated Annealing
• In simple hill climbing, generate and evaluate states one at a time; as soon as you find one with a better value than the current state, immediately move to it.

[Figure: a search tree with root A(7), children B(3), C(4), D(2), and further successors E, F, G, H; simple hill climbing moves to the first successor that improves on the current state.]
STEEPEST ASCENT HILL CLIMBING
• In steepest ascent hill climbing, generate all successor states, evaluate them, and then move to the best value available (as long as it is better than the current value).
  – In both of these, you can get stuck in a local maximum and never reach the global maximum.

[Figure: root A(7) with successors B(3), C(5), D(2); steepest ascent evaluates all successors and moves to the best one.]
82
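A minimal steepest-ascent hill-climbing loop in Python (the neighbour and scoring functions are placeholders to be supplied by the problem):

def steepest_ascent(start, neighbours, value):
    """Repeatedly move to the best neighbour until no neighbour
    improves on the current state (a local maximum)."""
    current = start
    while True:
        best = max(neighbours(current), key=value, default=None)
        if best is None or value(best) <= value(current):
            return current            # local maximum (or edge of a plateau)
        current = best

# Toy example: climb towards x = 5 on f(x) = -(x - 5)^2 over the integers.
print(steepest_ascent(0,
                      neighbours=lambda x: [x - 1, x + 1],
                      value=lambda x: -(x - 5) ** 2))   # -> 5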
Simulated Annealing
• Another idea is simulated annealing
  – The idea is that early in the search we haven't invested much yet, so we can make some downhill moves
    • in the 8-puzzle, we have to be willing to "mess up" part of the solution to move other tiles into better positions
  – We use an objective function rather than a heuristic function
  – Valley descending rather than hill climbing
  – It is based on the process of annealing, where physical substances such as metals are melted and then gradually cooled until a solid state is reached
  – Goal: to produce a minimal-energy final state
  – Physical substances move from higher-energy configurations to lower ones -> valley descending
  – Probability of a transition:
      p = e^(-ΔE/kT)
    ΔE – positive change in energy level
    T  – temperature
    k  – Boltzmann constant
  – Annealing schedule: the rate at which the system is cooled
83
Algorithm – Simulated Annealing
1. Evaluate the initial state. If it is a goal, return; else continue with the initial state as the current state.
2. Initialize BEST-SO-FAR to the current state.
3. Initialize T according to the annealing schedule.
4. Loop until a solution is found or until there are no new operators left to be applied to the current state:
   a. Select an operator that has not yet been applied to the current state and apply it to produce a new state.
   b. Evaluate the new state:
        ΔE = val(current state) - val(new state)
   c. If the new state is a goal, return.
   d. If it is not a goal but is better than the current state, make it the current state and record it in BEST-SO-FAR.
   e. If it is not better than the current state, compute p = e^(-ΔE/kT); if p > rand(0,1), the downhill move is accepted.
   f. Revise T according to the annealing schedule.
5. Return BEST-SO-FAR.
84
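A hedged Python sketch of this loop (the geometric cooling schedule is illustrative, and the Boltzmann constant k is folded into T):

import math
import random

def simulated_annealing(start, neighbours, value, t0=10.0, cooling=0.95, steps=500):
    """Maximize value(state); worse moves are accepted with
    probability e^(-dE/T), where dE is the (positive) loss in value."""
    current = best = start
    t = t0
    for _ in range(steps):
        candidate = random.choice(neighbours(current))
        d_e = value(current) - value(candidate)       # > 0 means a downhill move
        if d_e <= 0 or random.random() < math.exp(-d_e / t):
            current = candidate
        if value(current) > value(best):
            best = current
        t *= cooling                                  # annealing schedule
    return best

# Toy run on the same one-dimensional landscape used above.
print(simulated_annealing(0,
                          neighbours=lambda x: [x - 1, x + 1],
                          value=lambda x: -(x - 5) ** 2))   # usually -> 5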
Simulated Annealing

[Figure: root A(7) with successors B(3), C(4), D(5) and further successors E, F, G; from a state of value 4, a downhill move is accepted when p = e^(-(4-3)/(k*0.1)) exceeds a random number in (0,1).]
85
• Hill climbing can get stuck in 3 kinds of states:
  – Local maximum: a state that is better than all its neighbours, but not better than some other state farther away.

[Figure: a path from A through successors valued (5), (4), (2) ending at a node of value (4) that beats its own neighbours but not the global maximum G.]
86
• Plateau:
  – A flat area in the search space in which all neighbouring states have the same value.
  – It is not possible to determine the direction in which to move.

[Figure: neighbouring states B, C, D all with value (3).]

87
• Ridge:
  – A special kind of local maximum.
  – An area in the search space that is higher than the surrounding areas and that itself has a slope.
  – The orientation of the high region, compared with the set of available moves and the directions in which they move, makes it impossible to traverse the ridge by a single move.

[Figure: a ridge in the search space; each single move from A(4) steps off the ridge (e.g. to values 3 or 8 on either side), so no single move makes progress toward the goal.]
88
Blocks World Problem

• Initial state: a single stack with A on top (A, H, G, F, E, D, C, B); goal state: a single stack with H on top (H, G, F, E, D, C, B, A).
• Local heuristic: assign +1 for each block that is in the correct location, -1 otherwise.

[Figure: the three successor states a, b, c all receive the same score under this heuristic, h(a) = 4, h(b) = 4, h(c) = 4 — a plateau.]
89
Blocks World Problem

• Global heuristic: assign each block a negative value based on the number of wrong blocks in the structure on which it resides.
• The best node, the one with the highest heuristic value, is chosen.

[Figure: under this heuristic the same three successor states score h(a) = -28, h(b) = -16, h(c) = -15, so they can now be distinguished.]
90
Best First Search
• A* Algorithm
• The A* Algorithm is one of the best and most popular techniques used for pathfinding and graph traversal.
• A lot of games and web-based maps use this algorithm to find the shortest path efficiently.
• It is essentially a best-first search algorithm.
• OR-graph implementation => at each step, either choose path 1 or path 2.
91
Working Principle:
The A* Algorithm works as follows:
• It maintains a tree of paths originating at the start node.
• It extends those paths one edge at a time.
• It continues until its termination criterion is satisfied.
• A* extends the path that minimizes the function
    f(n) = g(n) + h(n)
  where:
  • 'n' is the last node on the path
  • g(n) is the cost of the path from the start node to node 'n'
  • h(n) is a heuristic function that estimates the cost of the cheapest path from node 'n' to the goal node
92
Algorithm:
• The implementation of the A* Algorithm involves maintaining two lists: OPEN and CLOSED.
• OPEN contains those nodes that have been evaluated by the heuristic function but have not yet been expanded into successors.
• CLOSED contains those nodes that have already been visited.

The algorithm is as follows:

Step-01: Define a list OPEN. Initially, OPEN consists solely of a single node, the start node S.   [e.g. OPEN = {a}, f(a) = 0 + 8]
Step-02: If the list is empty, return failure and exit.
Step-03: Remove the node n with the smallest value of f(n) from OPEN and move it to the list CLOSED. If node n is a goal state, return success and exit.   [e.g. OPEN = {}, CLOSED = {a}]
93
Step-04: Expand node n.
Step-05: If any successor of n is the goal node, return success and the solution, by tracing the path from the goal node back to S. Otherwise, go to Step-06.   [e.g. successors b (f = 1+5) and c (f = 1+4)]
Step-06: For each successor node, apply the evaluation function f. If the node has not been in either list, add it to OPEN.   [OPEN = {b, c}, CLOSED = {a}]
Step-07: Go back to Step-02.
94
[Worked trace: a (f = 6) is expanded to b (f = 1+5) and c (f = 1+4); c is expanded next, adding d (f = 2+5) and e (f = 2+7) to OPEN, and so on until the goal is removed from OPEN.]

Two conditions are possible when a successor is generated in the A* algorithm:
• The successor node is already in the OPEN list: if the new path to it is cheaper, update its f value and its parent (e.g. the parent of b may become e rather than a).
• The successor node is already in the CLOSED list: if the new path is cheaper, update its parent (e.g. the parent of c may become b rather than a) and propagate the improvement to its successors.
96
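A hedged Python sketch of A* with OPEN/CLOSED bookkeeping (the graph and heuristic values are illustrative; improved paths are re-inserted into OPEN rather than re-parented, a common simplification):

import heapq

def a_star(start, goal, succ, h):
    """succ(n) yields (child, edge_cost); h(n) estimates the cost to the goal.
    OPEN is a priority queue keyed on f = g + h; CLOSED holds visited nodes."""
    open_list = [(h(start), 0, start, [start])]     # (f, g, node, path)
    closed = set()
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return path, g
        if node in closed:
            continue
        closed.add(node)
        for child, cost in succ(node):
            if child not in closed:
                g2 = g + cost
                heapq.heappush(open_list, (g2 + h(child), g2, child, path + [child]))
    return None, float('inf')

graph = {'a': [('b', 1), ('c', 1)], 'b': [], 'c': [('d', 1), ('e', 1)],
         'd': [('goal', 1)], 'e': [], 'goal': []}
hvals = {'a': 6, 'b': 5, 'c': 4, 'd': 5, 'e': 7, 'goal': 0}
print(a_star('a', 'goal', lambda n: graph[n], lambda n: hvals[n]))
# (['a', 'c', 'd', 'goal'], 3)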
Given an initial state of an 8-puzzle problem and the final state to be reached:

• Find the most cost-effective path to reach the final state from the initial state using the A* Algorithm.
• Consider g(n) = depth of the node and h(n) = number of misplaced tiles.

97
Solution-
•A* Algorithm maintains a tree of paths originating at the initial state.
•It extends those paths one edge at a time.
•It continues until final state is reached.

98
Problem Reduction
• So far, the search strategies discussed were for OR graphs.
  – Here several arcs indicate different ways of solving the problem.
• Another kind of structure is the AND-OR graph (tree).
• Useful for representing the solution of a problem by decomposing it into smaller sub-problems.
Problem Reduction – Contd..
• Each sub-problem is solved and final
solution is obtained by combining
solutions of each sub-problem.
• Decomposition generates arcs that we will
call AND arc.
• One AND arc may point to any number of
successors, all of which must be solved.
• Such structure is called AND–OR graph
rather than simply AND graph.
Example of AND-OR Tree

Goal: Acquire TV
  OR:  Steal TV
  OR:  Earn Money AND Buy TV   (an AND arc: both components must be solved)
AND–OR Graph

• To find a solution in AND–OR graph, we need


an algorithm similar to A*
– with the ability to handle AND arc appropriately.
• In search for AND-OR graph, we will also use
the value of heuristic function f for each
node.
AND–OR Graph Search
• Traverse AND-OR graph, starting from the
initial node and follow the current best
path.
• Accumulate the set of nodes that are on
the best path which have not yet been
expanded.
• Pick up one of these unexpanded nodes
and expand it.
• Add its successors to the graph and
compute f (using only h) for each of them.
AND–OR Graph Search – Contd..

• Change the f estimate of newly expanded


node to reflect the new information
provided by its successors.
– Propagate this change backward through the
graph to the start.
• Mark the best path which could be
different from the current best path.
• Propagation of revised cost in AND-OR
graph was not there in A*.
Example

• Consider AND-OR graph given on next


slide.
– Let us assume that each arc with single
successor will have a cost of 1 and each AND
arc with multiple successor will have a cost of
1 for each of its components for the sake of
simplicity.
 Here the numbers listed in the circular
brackets ( ) are estimated cost and the revised
costs are enclosed in square brackets [ ].
 Thick lines indicate paths from a given node.
[Figure: node A with an OR arc to B and an AND arc to (C and D).
  A: estimated (20) via B and (19) via (C and D); revised [18] via B and [28] via (C and D).
  B: estimated (19), revised [17], with successors E (5) and F (10).
  C: estimated (8), revised [9], with successors G (3) and H (4).
  D: estimated (9), revised [17], with successors I (8) and J (7).
  Thick lines indicate the current best path.]
Explanation

• Initially we start from start node A and


compute heuristic values for each of its
successors, say {B, (C and D)} as {19, (8,
9)}.
• The estimated cost of the path from A to B is 20 (19 + the cost of one arc from A to B), and of the A to (C and D) path is 19 (8 + 9 + the cost of two arcs, A to C and A to D).
• The path from A to (C and D) seems better, so expand this AND path by expanding C to {(G and H)} and D to {(I and J)}.
Contd..

• Now heuristic values of G, H, I and J are 3,


4, 8 and 7 respectively.
• This leads to revised cost of C and D as 9
and 17 respectively.
• These values are propagated up and the
revised costs of path from A through (C
and D) is calculated as 28 (9 + 17 + cost
of arcs A to C and A to D).
• Now the revised cost of this path is 28
instead of earlier estimation of 19 and this
path is no longer a best path.
Contd..
• Then choose path from A to B for
expansion.
• After expansion we see that heuristic
value of node B is 17 thus making cost of
path from A to B to be 18.
• This path is still best path so far, so
further explore path from A to B.
• The process continues until either a solution is found or all paths have led to dead ends, indicating that there is no solution.
Cyclic Graph

• If a graph is cyclic (containing cycle) then


the algorithm discussed earlier does not
operate unless modified as follows:
– If successor is generated and found to be
already in the graph, then
• we must check that the node in the graph is not an
ancestor of the node being expanded.
• If not, then newly discovered path to the node be
entered in the graph.
• We can now state precisely the steps
taken for performing heuristic search of an
AND-OR graph.
Cyclic Graph – Contd..

• Algorithm for searching AND-OR graph is


called AO*
– Here we maintain single structure G,
representing the part of the search graph
explicitly generated so far rather than two lists,
OPEN and CLOSED as in previous algorithms.
• Each node in the graph will
– point both down to its immediate successors
and up to its immediate predecessor.
– have an h value (an estimate of the cost of a
path from current node to a set of solution
nodes) associated with it.
Cyclic Graph – Contd..

• We will not store g (cost from start to


current node) as it is not possible to
compute a single such value since there
may be many paths to the same state.
– The value g is also not necessary because of
the top-down traversing of the best-known
path which guarantees that only nodes on
the best path will ever be considered for
expansion.
– So h will be good estimate for AND/OR graph
search.
The "Solve" labeling Procedure
• A terminal node is labeled as
– "solved" if it is a goal node (representing a
solution of sub-problem)
– "unsolved" otherwise (as we can not further
reduce it)
• A non-terminal AND node labeled as
– "solved" if all of its successors are "solved".
– "unsolved" as soon as one of its successors is
labeled "unsolved".
• A non-terminal OR node is labeled as
– "solved" as soon as one of its successors is
labeled "solved".
– "unsolved" if all its successors are "unsolved".
Example

1. After one cycle:  A (3), with successors B (2), C (1), D (1).
2. After two cycles: expanding B gives E (4) and F (6); B is revised to (5), A to (4), and the best path now runs through C and D.
3. After three cycles: expanding C gives G (2) and H (0); H (0) is SOLVED, C is revised to (2), A to (5), and E (4) is SOLVED on B's side.
4. After four cycles: expanding D gives I (0), which is SOLVED; with C (2) and D (1) both SOLVED, A (5) is labeled SOLVED.
AO* Algorithm
• Let graph G consist initially of the start node. Call it INIT.
• Compute h(INIT).
• Until INIT is SOLVED or h(INIT) > Threshold, or it is unsolvable:
  {1
  – Traverse the graph starting from INIT and follow the current best path.
  – Accumulate the set of nodes that are on the path and have not yet been expanded or labeled as SOLVED.
  – Select one of these unexpanded nodes. Call it NODE and expand it.
  – Generate the successors of NODE. If there are none, then assign Threshold as the value of this NODE; else, for each SUCC that is not also an ancestor of NODE, do the following:
    {2
    • Add SUCC to the graph G and compute h for each.
    • If h(SUCC) = 0 then it is a solution node; label it SOLVED.

Contd..
    • Propagate the newly discovered information up the graph as follows:
      – Initialize S with NODE.
      – Until S is empty:
        {3
        – Select from S a node such that the selected node has no ancestor in G occurring in S /* to avoid cycles */.
        – Call it CURRENT and remove it from S.
        – Compute the cost of each arc emerging from CURRENT:
            cost of an AND arc = (h of each of the nodes at the end of the arc) + (cost of the arc itself)
        – Assign the minimum of these costs as the new h value of CURRENT.
        – Mark the best (minimum-cost) path out of CURRENT.
        – Mark CURRENT as SOLVED if all of the nodes connected to it through the newly marked arc have been labeled SOLVED.
        – If CURRENT has been marked SOLVED, or if the cost of CURRENT was just changed, then the new status must be propagated back up the graph; so add to S all of the ancestors of CURRENT.
        }3
    }2
  }1
Longer Path May Be Better
• Consider another example:

[Figure: a search graph with nodes numbered 1-10 in order of generation; node 4 is unsolvable, and nodes 5-10 lie below node 3. Node 10's expansion regenerates node 5.]
Explanation

• Nodes are numbered in order of their generation.


• Now node 10 is expanded at the next step and
one of its successors is node 5.
• This new path to 5 is longer than the previous
path to 5 going through 3.
• But since the path through 3 cannot lead to a solution (there is no solution for 4), the longer path through 10 is better.
Interaction between Sub goals
• AO* may fail to take into account an
interaction between sub-goals.
[Figure: A (10) has an AND arc to C (5) and D (3); D can be solved via E (2).]

• Assume that both C and E ultimately lead to a solution.
Contd..
• According to the AO* algorithm, both C and D must be solved to solve A.
• The algorithm considers the solution of D as a completely separate process from the solution of C (as if there were no interaction between these two sub-goals).
• Looking just at the alternatives from D, the path through node E is the best, so the algorithm will try to solve E. But C must be solved anyway (to satisfy A), so it would be better to reuse C to satisfy D as well.
• The AO* algorithm does not consider such interactions, so it may find a non-optimal path.
Difference between OR & AND-
OR
• In OR no need to revisit expanded node, but in
AND-OR revisit is possible
• In OR desired path is always the one with
lowest cost. But in AND-OR it is not so.

124
Constraint Satisfaction Problems
• What is a CSP?
– Finite set of variables X1, X2, …, Xn

– Nonempty domain of possible values for each variable


D1, D2, …, Dn

– Finite set of constraints C1, C2, …, Cm


• Each constraint Ci limits the values that variables can take,
• e.g., X1 ≠ X2
– Each constraint Ci is a pair <scope, relation>
• Scope = Tuple of variables that participate in the
constraint.
• Relation = List of allowed combinations of variable values.
May be an explicit list of allowed combinations.
May be an abstract relation allowing membership testing
and listing.
8-Queens

• Variables: queens, one per column: Q1, Q2, ..., Q8
• Domains: row placement, {1, 2, ..., 8}
• Constraints:
  • Qi ≠ Qj for i ≠ j          (no two queens in the same row)
  • |Qi – Qj| ≠ |i – j|        (no two queens on the same diagonal)
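These two constraints translate directly into a pairwise check; a minimal sketch (the 0-based assignment format is my assumption):

def consistent(assignment):
    """assignment[i] = row of the queen in column i (0-based)."""
    for i in range(len(assignment)):
        for j in range(i + 1, len(assignment)):
            if assignment[i] == assignment[j]:                    # same row
                return False
            if abs(assignment[i] - assignment[j]) == abs(i - j):  # same diagonal
                return False
    return True

print(consistent([0, 4, 7, 5, 2, 6, 1, 3]))   # True: a valid 8-queens solution
print(consistent([0, 1]))                     # False: diagonal conflict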
Sudoku as a Constraint Satisfaction Problem (CSP)

[Figure: a 9x9 Sudoku grid with rows labeled A-I and columns labeled 1-9.]

• Variables: 81 variables
  – A1, A2, A3, ..., I7, I8, I9
  – Letters index rows, top to bottom
  – Digits index columns, left to right
• Domains: the nine positive digits
  – A1 ∈ {1, 2, 3, 4, 5, 6, 7, 8, 9}
  – Etc.
• Constraints: 27 Alldiff constraints
  – Alldiff(A1, A2, A3, A4, A5, A6, A7, A8, A9)
  – Etc.
• [Why constraint satisfaction?]
CSP example: map coloring

• Variables: WA, NT, Q, NSW, V, SA, T
• Domains: Di = {red, green, blue}
• Constraints: adjacent regions must have different colors,
  e.g. WA ≠ NT

CSP example: map coloring

• Solutions are assignments satisfying all constraints, e.g.
  {WA=red, NT=green, Q=red, NSW=green, V=red, SA=blue, T=green}
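A hedged backtracking sketch for this CSP (the Australia adjacency list is standard; helper names are mine):

NEIGHBOURS = {'WA': ['NT', 'SA'], 'NT': ['WA', 'SA', 'Q'],
              'SA': ['WA', 'NT', 'Q', 'NSW', 'V'], 'Q': ['NT', 'SA', 'NSW'],
              'NSW': ['Q', 'SA', 'V'], 'V': ['SA', 'NSW'], 'T': []}
COLORS = ['red', 'green', 'blue']

def backtrack(assignment, variables):
    if not variables:                        # every region colored: solution found
        return assignment
    var, rest = variables[0], variables[1:]
    for color in COLORS:
        if all(assignment.get(n) != color for n in NEIGHBOURS[var]):
            assignment[var] = color          # consistent: try it
            result = backtrack(assignment, rest)
            if result:
                return result
            del assignment[var]              # dead end: undo and try the next color
    return None                              # no color works: backtrack

print(backtrack({}, list(NEIGHBOURS)))
# e.g. {'WA': 'red', 'NT': 'green', 'SA': 'blue', 'Q': 'red', 'NSW': 'green', ...}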
Varieties of constraints
• Unary constraints involve a single variable.
  – e.g. SA ≠ green
• Binary constraints involve pairs of variables.
  – e.g. SA ≠ WA
• Higher-order constraints involve 3 or more variables.
  – e.g. professors A, B, and C cannot be on a committee together
  – Can always be represented by multiple binary constraints
• Preferences (soft constraints)
  – e.g. "red is better than green" can often be represented by a cost for each variable assignment
  – a combination of optimization with CSPs
CSP Example: Cryptarithmetic puzzle

    T W O            7 3 4
  + T W O          + 7 3 4
  ---------        ---------
  F O U R          1 4 6 8

Solution: T = 7, W = 3, O = 4, F = 1, U = 6, R = 8

[Figure: the constraint graph for the puzzle, with Alldiff and column-arithmetic (carry) constraints.]
Means-Ends Analysis
• We have studied strategies which can reason either forward or backward, but a mixture of the two directions is appropriate for solving complex and large problems.
• Such a mixed strategy makes it possible to solve the major part of a problem first, and then go back and solve the small problems that arise while combining the big parts of the problem. Such a technique is called Means-Ends Analysis.
• Means-Ends Analysis is a problem-solving technique used in artificial intelligence for limiting search in AI programs.
• It is a mixture of the backward and forward search techniques.
• The MEA technique was first introduced in 1961 by Allen Newell and Herbert A. Simon in their problem-solving computer program, which was named the General Problem Solver (GPS).
• The MEA process is centered on the evaluation of the difference between the current state and the goal state.

133
How Means-Ends Analysis Works:
• The means-ends analysis process can be applied recursively to a problem. It is a strategy to control search in problem solving. The following are the main steps which describe the working of the MEA technique for solving a problem:
• First, evaluate the difference between the initial state and the final state.
• Select the various operators which can be applied to each difference.
• Apply the operator at each difference, which reduces the difference between the current state and the goal state.
134
Operator Subgoaling
• In the MEA process, we detect the differences between the current state and the goal state.
• Once these differences are found, we can apply an operator to reduce them.
• But sometimes an operator cannot be applied to the current state.
• So we create a subproblem of reaching a state in which the operator can be applied. This kind of backward chaining, in which operators are selected and then subgoals are set up to establish the preconditions of the operator, is called Operator Subgoaling.

135
Algorithm for Means-Ends Analysis
• Let us take the current state as CURRENT and the goal state as GOAL. The steps of the MEA algorithm are then:
• Step 1: Compare CURRENT to GOAL; if there are no differences between them, return Success and exit.
• Step 2: Else, select the most significant difference and reduce it by doing the following steps until success or failure occurs:
  – Select a new operator O which is applicable to the current difference; if there is no such operator, then signal failure.
  – Attempt to apply operator O to CURRENT. Make a description of two states:
    i) O-START, a state in which O's preconditions are satisfied.
    ii) O-RESULT, the state that would result if O were applied in O-START.
  – If
      (FIRST-PART <- MEA(CURRENT, O-START))
    and
      (LAST-PART <- MEA(O-RESULT, GOAL))
    are successful, then signal Success and return the result of combining FIRST-PART, O, and LAST-PART.

136
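A heavily simplified, hedged sketch of this recursion in Python (states are sets of facts; the operator structure and examples are my own illustration, and cycle handling is omitted):

def mea(current, goal, operators):
    """Return a list of operator names transforming current into goal, or None.
    Each operator is (name, preconditions, additions, deletions) over sets."""
    if goal <= current:                       # Step 1: no difference remains
        return []
    for name, pre, add, dele in operators:
        if add & (goal - current):            # O reduces some difference
            first = mea(current, pre, operators)        # reach O-START
            if first is None:
                continue
            o_result = (current | pre | add) - dele     # state after applying O
            last = mea(o_result, goal, operators)       # finish from O-RESULT
            if last is not None:
                return first + [name] + last            # FIRST-PART + O + LAST-PART
    return None                               # no operator helps: failure

ops = [('walk',   set(),       {'at_obj'},  set()),
       ('pickup', {'at_obj'},  {'holding'}, set())]
print(mea(set(), {'holding'}, ops))   # ['walk', 'pickup']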
Example: Robot Navigation

[Figure: objects o1 and o2 must be moved from loc1 to loc2.]

Operator          Preconditions                                 Result
PUSH(obj, loc)    At(robot, obj) and Large(obj) and             At(obj, loc) and At(robot, loc)
                  Clear(obj) and ArmEmpty
CARRY(obj, loc)   At(robot, obj) and Small(obj) and             At(obj, loc) and At(robot, loc)
                  Clear(obj) and ArmEmpty
WALK(loc)         none                                          At(robot, loc)
PICKUP(obj)       At(robot, obj)                                Holding(obj)
PUTDOWN(obj)      Holding(obj)                                  not Holding(obj)
PLACE(o1, o2)     At(robot, o2) and Holding(o1)                 On(o1, o2)
137
Difference Table

• It shows which operator is best to reduce each difference:

                    PUSH   CARRY   WALK   PICKUP   PUTDOWN   PLACE
Move object          *       *
Move robot                            *
Clear object                                 *        *
Get obj1 on obj2                                                *
Arm empty                                             *
Holding obj                                  *
138
Example: Robot Navigation (solution)

[Figure: the plan found by means-ends analysis interleaves the operators, e.g. WALK, PICKUP, PUTDOWN, ..., PUSH, ..., WALK, PICKUP, WALK, PUTDOWN, moving o1 and o2 from loc1 to loc2; the operator table from the previous slide is repeated below the diagram.]
139
Intelligent Agents
Agents
• An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators.
• Human agent: eyes, ears, and other organs for sensors; hands, legs, mouth, and other body parts for actuators.
• Robotic agent: cameras and infrared range finders for sensors; various motors for actuators.
Agents and environments

• The agent function maps from percept histories to actions:
    f: P* → A
• The agent program runs on the physical architecture to produce f
• agent = architecture + program
Vacuum-cleaner world

• Percepts: location and contents, e.g., [A,Dirty]


• Actions: Left, Right, Suck, NoOp
Rational agents
• An agent should strive to "do the right thing",
based on what it can perceive and the actions
it can perform. The right action is the one that
will cause the agent to be most successful
• Performance measure: An objective criterion
for success of an agent's behavior
• E.g., performance measure of a vacuum-
cleaner agent could be amount of dirt cleaned
up, amount of time taken, amount of electricity
consumed, amount of noise generated, etc.
Rational agents

• Rational Agent: For each possible percept


sequence, a rational agent should select
an action that is expected to maximize its
performance measure, given the evidence
provided by the percept sequence and
whatever built-in knowledge the agent
has.
Rational agents

•Agents can perform actions in order to modify


future percepts so as to obtain useful information
(information gathering, exploration)
•An agent is autonomous if its behavior is
determined by its own experience (with ability to
learn and adapt)
PEAS
• PEAS: Performance measure, Environment,
Actuators, Sensors
• Must first specify the setting for intelligent
agent design
• Consider, e.g., the task of designing an
automated taxi driver:
– Performance measure
– Environment
– Actuators
– Sensors
PEAS
• Must first specify the setting for intelligent
agent design
• Consider, e.g., the task of designing an
automated taxi driver:
– Performance measure: Safe, fast, legal,
comfortable trip, maximize profits
– Environment: Roads, other traffic, pedestrians,
customers
– Actuators: Steering wheel, accelerator, brake,
signal, horn
– Sensors: Cameras, sonar, speedometer, GPS,
odometer, engine sensors, keyboard
PEAS

• Agent: Medical diagnosis system


• Performance measure: Healthy patient,
minimize costs, lawsuits
• Environment: Patient, hospital, staff
• Actuators: Screen display (questions, tests,
diagnoses, treatments, referrals)
• Sensors: Keyboard (entry of symptoms,
findings, patient's answers)
PEAS

• Agent: Part-picking robot


• Performance measure: Percentage of parts in
correct bins
• Environment: Conveyor belt with parts, bins
• Actuators: Jointed arm and hand
• Sensors: Camera, joint angle sensors
PEAS

• Agent: Interactive English tutor


• Performance measure: Maximize student's
score on test
• Environment: Set of students
• Actuators: Screen display (exercises,
suggestions, corrections)
• Sensors: Keyboard
Environment types
• Fully observable (vs. partially observable): the agent's sensors give it access to the complete state of the environment at each point in time (e.g. the 8-puzzle); otherwise it is partially observable (e.g. taxi driving).
• Deterministic (vs. stochastic): the next state of the environment is completely determined by the current state and the action executed by the agent (e.g. the vacuum cleaner); otherwise it is stochastic (e.g. taxi driving). If the environment is deterministic except for the actions of other agents, the environment is strategic (e.g. chess).
• Episodic (vs. sequential): the agent's experience is divided into atomic "episodes" (each episode consists of the agent perceiving and then performing a single action), and the choice of action in each episode depends only on the episode itself (e.g. spotting defective parts on an assembly line). If actions are performed in a sequence, the environment is sequential (e.g. chess, taxi driving).
Environment types
• Static (vs. dynamic): the environment is unchanged while the agent is deliberating (e.g. the 8-puzzle). The environment is semidynamic if the environment itself does not change with the passage of time but the agent's performance score does (e.g. chess with a clock); taxi driving is dynamic.
• Discrete (vs. continuous): a limited number of distinct, clearly defined percepts and actions (e.g. chess).
• Single agent (vs. multiagent): an agent operating by itself in an environment.
Environment types

                   Chess with a clock   Crossword   Taxi driving
Fully observable   Yes                  Yes         No
Deterministic      Strategic            Yes         No
Episodic           No (sequential)      No          No
Static             Semi                 Yes         No
Discrete           Yes                  Yes         No
Single agent       No                   Yes         No

• The environment type largely determines the agent design.
• The real world is (of course) partially observable, stochastic, sequential, dynamic, continuous, and multi-agent.

Agent functions and programs

• An agent is completely specified by the agent


function mapping percept sequences to
actions
• One agent function (or a small equivalence
class) is rational
• Aim: find a way to implement the rational
agent function concisely
Table-lookup agent

• Uses a percept sequence / action table in memory to find the next action. Implemented as a (large) lookup table.
• Drawbacks:
  – Huge table
  – Takes a long time to build the table
  – No autonomy
  – Even with learning, it needs a long time to learn the table entries
Agent types

Four basic types in order of increasing


generality:
•Simple reflex agents
•Model-based reflex agents
•Goal-based agents
•Utility-based agents

Simple reflex agents

• The agent selects actions on the basis of the current percept only.
  e.g. if the tail-light of the car in front is red, then brake.
• Agents do not have memory of past world states or percepts.
• So, actions depend solely on the current percept.
• Action becomes a "reflex."
• Uses condition-action rules.
Simple reflex agents

• e.g. vacuum cleaner
• Used in a fully observable environment

function SIMPLE-REFLEX-AGENT(percept) returns an action
    state  <- INTERPRET-INPUT(percept)
    rule   <- RULE-MATCH(state, rules)
    action <- RULE-ACTION(rule)
    return action
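A runnable Python rendering of this skeleton for the vacuum world (the rule encoding is my own):

def interpret_input(percept):
    location, status = percept          # e.g. ('A', 'Dirty')
    return (location, status)

RULES = {('A', 'Dirty'): 'Suck', ('B', 'Dirty'): 'Suck',
         ('A', 'Clean'): 'Right', ('B', 'Clean'): 'Left'}

def simple_reflex_agent(percept):
    state = interpret_input(percept)    # condition
    return RULES[state]                 # -> action

print(simple_reflex_agent(('A', 'Dirty')))   # Suck
print(simple_reflex_agent(('A', 'Clean')))   # Right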
Model-based reflex agents

[Figure: model-based reflex agent architecture.]

• Models a partially observable environment.
• Keeps track of the part of the world it can't see now.
• The current percept is combined with the internal state to generate an updated description.
• e.g. car driving: to decide whether to apply the brake, combine the current road conditions with the speed in the current situation.

function MODEL-BASED-REFLEX-AGENT(percept) returns an action
    state  <- UPDATE-STATE(state, action, percept)
    rule   <- RULE-MATCH(state, rules)
    action <- RULE-ACTION(rule)
    return action
Goal-based agents

• The agent keeps track of the world state as well as a set of goals it is trying to achieve, and chooses actions that will (eventually) lead to the goal(s).
• More flexible than reflex agents: may involve search and planning.
Goal-based agents

• Key difference wrt Model-Based Agents:


In addition to state information, have goal
information that describes desirable situations to
be achieved.
• Agents of this kind take future events into consideration.
• What sequence of actions can I take to achieve
certain goals?
• Choose actions so as to (eventually) achieve a (given or
computed) goal.
• Eg. Automated taxi
Utility-based agents

[Figure: utility-based agent architecture — a decision-making module selects the best alternative, e.g. the best path in the travelling salesman problem.]
Utility-based agents

• When there are multiple possible alternatives, how do we decide which one is best?
• Goals are qualitative: a goal specifies a crude distinction between a happy and an unhappy state, but we often need a more general performance measure that describes the "degree of happiness."
• A utility function U: State → R indicates a measure of success or happiness at a given state.
• Important for making tradeoffs: allows decisions comparing the choice between conflicting goals, and the choice between likelihood of success and importance of a goal (if achievement is uncertain).
• Uses decision-theoretic models: e.g., faster vs. safer.


Learning agents

• Performance element: selects external actions.
• Learning element: responsible for making improvements.
• Critic: tells how well the agent is doing with respect to a fixed performance standard.
• Problem generator: suggests actions that lead to new, informative experiences.
A knowledge-based agent

• A knowledge-based agent includes a knowledge


base and an inference system.
• A knowledge base is a set of representations of facts
of the world.
• Each individual representation is called a sentence.
• The sentences are expressed in a knowledge
representation language.
• The agent operates as follows:
1. It TELLs the knowledge base what it perceives.
2. It ASKs the knowledge base what action it should
perform.
3. It performs the chosen action.

167
Architecture of a
knowledge-based agent
• Knowledge Level.
– The most abstract level: describe agent by saying what it
knows.
– Example: A taxi agent might know that the Golden Gate Bridge
connects San Francisco with the Marin County.
• Logical Level.
– The level at which the knowledge is encoded into sentences.
– Example: Links(GoldenGateBridge, SanFrancisco,
MarinCounty).
• Implementation Level.
– The physical representation of the sentences in the logical
level.
– Example: ‘(links goldengatebridge sanfrancisco
marincounty)

168
function KB-AGENT(percept) returns an action
    static: KB, a knowledge base
            t, a counter (time), initially 0
    TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
    action <- ASK(KB, MAKE-ACTION-QUERY(t))
    TELL(KB, MAKE-ACTION-SENTENCE(action, t))
    t <- t + 1
    return action

The agent program does 2 things:
• It TELLs the knowledge base what it perceives.
• It ASKs the knowledge base what action to perform; once an action is chosen, it records that choice with a second TELL and executes the action.
• The second TELL lets the knowledge base know that the action has been executed.
169
Game
Playing
How to play a game
• A way to play such a game is to:
– Consider all the legal moves you can make
– Compute the new position resulting from each move
– Evaluate each resulting position and determine which
is best
– Make that move
– Wait for your opponent to move and repeat
• Key problems are:
– Representing the “board”
– Generating all legal next boards
– Evaluating a position
Evaluation function
• Evaluation function or static evaluator is used
to evaluate the “goodness” of a game position.
– Contrast with heuristic search where the evaluation
function was a non-negative estimate of the cost from the
start node to a goal and passing through the given node
• The zero-sum assumption allows us to use a single
evaluation function to describe the goodness of a
board with respect to both players.
– f(n) >> 0: position n good for me and bad for you
– f(n) << 0: position n bad for me and good for you
– f(n) near 0: position n is a neutral position
– f(n) = +infinity: win for me
– f(n) = -infinity: win for you
Evaluation function examples
• Example of an evaluation function for Tic-Tac-Toe:
f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for
you]
where a 3-length is a complete row, column, or diagonal
• Alan Turing’s function for chess
– f(n) = w(n)/b(n) where w(n) = sum of the point value of
white’s pieces and b(n) = sum of black’s
• Most evaluation functions are specified as a weighted sum of position features:
    f(n) = w1*feat1(n) + w2*feat2(n) + ... + wk*featk(n)
• Example features for chess are piece count, piece placement, squares controlled, etc.
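A hedged sketch of the tic-tac-toe evaluator described above (the board encoding is my assumption):

LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def evaluate(board, me, you):
    """f(n) = (# of 3-lengths still open for me) - (# open for you).
    board is a list of 9 cells holding 'X', 'O' or None."""
    def open_for(opponent):
        # a line is open if the opponent has no mark anywhere on it
        return sum(1 for line in LINES
                   if all(board[i] != opponent for i in line))
    return open_for(you) - open_for(me)

board = ['X', None, None,
         None, 'O', None,
         None, None, None]
print(evaluate(board, 'X', 'O'))   # -1 (O's centre keeps more lines open than X's corner)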
Game trees

• Problem spaces for typical games are


represented as trees
• Root node represents the current
board configuration; player must decide
the best single move to make next
• Static evaluator function rates a board
position. f(board) = real number with
f>0 “white” (me), f<0 for black (you)
• Arcs represent the possible legal moves for a player
• If it is my turn to move, then the root is labeled a "MAX"
node; otherwise it is labeled a "MIN" node, indicating my
opponent's turn.
• Each level of the tree has nodes that are all MAX or all
MIN; nodes at level i are of the opposite kind from those
at level i+1
Minimax procedure
• Create start node as a MAX node with current board
configuration
• Expand nodes down to some depth (a.k.a. ply) of
lookahead in the game
• Apply the evaluation function at each of the leaf nodes
• “Back up” values for each of the non-leaf nodes until a
value is computed for the root node
– At MIN nodes, the backed-up value is the minimum of the values
associated with its children.
– At MAX nodes, the backed-up value is the maximum of the values
associated with its children.
• Pick the operator associated with the child node whose
backed-up value determined the value at the root
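A compact recursive rendering of this procedure (the tree encoding is my own; leaves hold static-evaluator values, as in the figure on the next slide):

def minimax(node, maximizing):
    """node is either a number (leaf value) or a list of child nodes."""
    if isinstance(node, (int, float)):      # leaf: apply the static evaluator
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

tree = [[2, 7], [1, 8]]                 # two MIN nodes over leaves (2,7) and (1,8)
print(minimax(tree, maximizing=True))   # 2: MIN backs up 2 and 1; MAX picks 2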
Minimax Algorithm

[Figure: a two-ply game tree with leaf values 2, 7, 1, 8. The static evaluator scores the leaves; each MIN node backs up the minimum of its children (2 and 1), and the MAX root backs up the maximum, 2 — this is the move selected by minimax.]
Partial Game Tree for Tic-Tac-Toe

• f(n) = +1 if the position is a win for X.
• f(n) = -1 if the position is a win for O.
• f(n) = 0 if the position is a draw.

[Figure: the partial game tree, alternating MAX (X) and MIN (O) levels.]
Minimax Tree

[Figure: a game tree with alternating MAX and MIN node levels; each leaf shows its static f value, and each interior node shows the value computed by minimax.]
Alpha-beta pruning

• We can improve on the performance of the minimax algorithm through alpha-beta pruning.
• Basic idea: "If you have an idea that is surely bad, don't take the time to see how truly awful it is." -- Pat Winston

[Figure: a MAX root (>= 2) over two MIN nodes. The left MIN node evaluates to 2 from leaves 2 and 7. At the right MIN node (<= 1), the first leaf is 1, so we don't need to compute the value of the remaining leaf "?": no matter what it is, it can't affect the value of the root.]
Alpha-beta pruning
• Traverse the search tree in depth-first order
• At each MAX node n, alpha(n) = maximum value
found so far
• At each MIN node n, beta(n) = minimum value found
so far
– Note: The alpha values start at -infinity and only increase, while
beta values start at +infinity and only decrease.
• Beta cutoff: Given a MAX node n, cut off the search
below n (i.e., don’t generate or examine any more of
n’s children) if alpha(n) >= beta(i) for some MIN node
ancestor i of n.
• Alpha cutoff: stop searching below MIN node n if
beta(n) <= alpha(i) for some MAX node ancestor i of n.
Working of Alpha-Beta Pruning:
Let's take an example of a two-player search tree to understand the working of alpha-beta pruning.
Step 1: At the first step, the Max player makes the first move from node A, where α = -∞ and β = +∞; these values of alpha and beta are passed down to node B, where again α = -∞ and β = +∞, and node B passes the same values to its child D.
Step 2: At node D, the value of α is calculated, as it is Max's turn. The value of α is compared with 2 first and then 3, and max(2, 3) = 3 becomes the value of α at node D; the node value is also 3.
181
• Step 3: The algorithm now backtracks to node B, where the value of β changes, as this is Min's turn. β = +∞ is compared with the available subsequent node value, i.e. min(∞, 3) = 3; hence at node B now α = -∞ and β = 3.

182
• Step 4: At node E, Max takes its turn, and the value of alpha changes. The current value of alpha is compared with 5, so max(-∞, 5) = 5; hence at node E, α = 5 and β = 3, where α >= β, so the right successor of E is pruned and the algorithm does not traverse it. The value at node E is 5.

183
• Step 5: At the next step, the algorithm again backtracks the tree, from node B to node A. At node A, the value of alpha is changed; the maximum available value is 3, as max(-∞, 3) = 3, and β = +∞. These two values are now passed to the right successor of A, which is node C.
• At node C, α = 3 and β = +∞, and the same values are passed on to node F.
• Step 6: At node F, the value of α is compared with the left child, which is 0, so max(3, 0) = 3; it is then compared with the right child, which is 1, and max(3, 1) = 3. α remains 3, but the node value of F becomes 1.

184
• Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; here the value of beta changes: it is compared with 1, so min(∞, 1) = 1. Now at C, α = 3 and β = 1, which again satisfies the condition α >= β, so the next child of C, which is G, is pruned, and the algorithm does not compute the entire sub-tree G.

185
• Step 8: C now returns the value 1 to A; the best value for A is max(3, 1) = 3. The final game tree shows which nodes were computed and which were never computed. The optimal value for the maximizer is 3 in this example.

186
Alpha-beta algorithm
function MAX-VALUE (state, α, β)
;; α = best MAX so far; β = best MIN
if TERMINAL-TEST (state) then return UTILITY(state)
v := -∞
for each s in SUCCESSORS (state) do
v := MAX (v, MIN-VALUE (s, α, β))
if v >= β then return v
α := MAX (α, v)
end
return v

function MIN-VALUE (state, α, β)


if TERMINAL-TEST (state) then return UTILITY(state)
v := ∞
for each s in SUCCESSORS (state) do
v := MIN (v, MAX-VALUE (s, α, β))
if v <= α then return v
β := MIN (β, v)
end
return v
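The pseudocode above transcribes almost directly into Python. Here is a hedged version run on the worked example's tree (D = {2, 3} and F = {0, 1} follow the step-by-step trace; the leaves of the pruned branches, 9 under E and the G subtree, are illustrative placeholders):

import math

def max_value(node, alpha, beta):
    if isinstance(node, (int, float)):          # terminal: return its utility
        return node
    v = -math.inf
    for child in node:
        v = max(v, min_value(child, alpha, beta))
        if v >= beta:
            return v                            # beta cutoff: prune remaining children
        alpha = max(alpha, v)
    return v

def min_value(node, alpha, beta):
    if isinstance(node, (int, float)):
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta))
        if v <= alpha:
            return v                            # alpha cutoff: prune remaining children
        beta = min(beta, v)
    return v

# A = MAX over B = MIN(D, E) and C = MIN(F, G).
tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
print(max_value(tree, -math.inf, math.inf))     # 3, matching the worked example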
Effectiveness of alpha-beta
• Alpha-beta is guaranteed to compute the same value for the root node as computed by minimax, with less or equal computation.
• Worst case: no pruning, examining b^d leaf nodes, where each node has b children and a d-ply search is performed.
• Best case: examine only about 2b^(d/2) leaf nodes.
  – The result is that you can search twice as deep as minimax!
• The best case occurs when each player's best move is the first alternative generated.
• In Deep Blue, they found empirically that alpha-beta pruning meant that the average branching factor at each node was about 6 instead of about 35!
