18CSC305J - Artificial Intelligence: Unit - 2
UNIT – 2
Unit 2 List of Topics
Blind strategies BFS
Search Methods
BFS characteristics
Completeness: if the branching factor b is finite and the goal node is at depth d, BFS will eventually find it.
Optimality: BFS is optimal if path cost is a non-decreasing function of depth; otherwise, the shallowest goal node is not necessarily optimal.
Time complexity: 1 + b + b^2 + b^3 + ... + b^d + b(b^d - 1) = O(b^(d+1)).
Space complexity: O(b^(d+1)).
(b = branching factor; d = depth of the goal node)
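The characteristics above can be made concrete with a short sketch. This is an illustrative Python implementation, not taken from the slides; the `successors` callback and the example graph are assumed for demonstration.

```python
from collections import deque

def bfs(start, goal, successors):
    """Breadth-first search: always expand the shallowest unexpanded node.

    `successors` maps a state to its neighbour states (a hypothetical
    problem interface)."""
    frontier = deque([[start]])          # FIFO queue of paths
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path                  # shallowest solution found first
        for nxt in successors(node):
            if nxt not in visited:       # avoid re-expanding states
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None                          # no solution exists

# Usage on a small example graph:
graph = {'S': ['A', 'B'], 'A': ['G'], 'B': ['A'], 'G': []}
print(bfs('S', 'G', graph.__getitem__))  # ['S', 'A', 'G']
```

The FIFO queue is what makes this breadth-first: nodes are expanded in order of increasing depth, which is why the shallowest goal is found first.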
spring 2011
Blind Search : Depth First Search (DFS)
Implementation:
fringe = LIFO stack, i.e., put successors at the front.
Search Methods
DFS characteristics
Small space requirements: only the path to the current node and the siblings of each node on the path are stored.
Backtracking search generates only one successor for each node.
Completeness: no, if the expanded subtree has an infinite depth.
Optimality: no, if a shallower solution exists but a deeper solution, located in a subtree expanded earlier, is found first.
Time complexity: O(b^m).
Space complexity: O(bm) (linear!).
(m = maximum depth of the search tree)
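A minimal recursive sketch of DFS (illustrative, not from the slides; the recursion stack plays the role of the LIFO fringe, and a shared visited set guards against cycles):

```python
def dfs(start, goal, successors, visited=None):
    """Depth-first search: always expand the deepest unexpanded node.
    The cycle check keeps space linear in the path length."""
    if visited is None:
        visited = set()
    if start == goal:
        return [start]
    visited.add(start)
    for nxt in successors(start):
        if nxt not in visited:
            rest = dfs(nxt, goal, successors, visited)
            if rest is not None:
                return [start] + rest    # prepend current node to the path
    return None                          # dead end: backtrack

graph = {'S': ['A', 'B'], 'A': ['C'], 'B': ['G'], 'C': [], 'G': []}
print(dfs('S', 'G', graph.__getitem__))  # ['S', 'B', 'G']
```

Note how the search dives into the A-C subtree first and backtracks before trying B, which is exactly why DFS is neither complete (on infinite subtrees) nor optimal.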
Search Methods
If we fix the depth limit to 2, depth-limited search (DLS) proceeds like DFS but never expands nodes below the limit; it finds the goal node if it lies within the limited search domain of the tree.
Blind Search : Depth Limited Search (DLS)
Blind Search : Iterative Deepening DFS (ID-DFS)
IDS characteristics
Completeness: yes.
Optimality: yes, if step cost = 1.
Time complexity:
(d + 1)b^0 + d·b^1 + (d − 1)b^2 + ... + b^d = O(b^d).
Space complexity: O(bd) (linear).
Conclusion
IDS performs better than BFS because it does not generate nodes beyond the goal depth d.
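Iterative deepening can be sketched as repeated depth-limited search with increasing limits (an illustrative Python version; the graph and `max_depth` cap are assumptions, not from the slides):

```python
def depth_limited(node, goal, successors, limit):
    """DFS that never expands nodes below the depth limit."""
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for nxt in successors(node):
        rest = depth_limited(nxt, goal, successors, limit - 1)
        if rest is not None:
            return [node] + rest
    return None

def iterative_deepening(start, goal, successors, max_depth=50):
    """Run DLS with limits 0, 1, 2, ...; shallow nodes are re-generated
    each round, but the total work is still O(b^d)."""
    for limit in range(max_depth + 1):
        path = depth_limited(start, goal, successors, limit)
        if path is not None:
            return path
    return None

graph = {'S': ['A', 'B'], 'A': ['G'], 'B': [], 'G': []}
print(iterative_deepening('S', 'G', graph.__getitem__))  # ['S', 'A', 'G']
```

Like BFS it returns a shallowest solution, but its memory use is only that of the current DFS path.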
APPLICATION: MAZE GAME
• BFS solution: S-1-2-3-5-8-10-12-14-16-19-G (solution found in 12 steps)
• DFS solution: S-1-2-3-6-5-8-9-10-11-13-16-18-G (solution found in 14 steps)
Blind Search : Uniform Cost Search
The minimum-cost path is S->D->C->G2, and since G2 is a goal node the search terminates.
In this way UCS finds the path with the minimum cumulative cost from the start node to a goal node: S->D->C->G2, with total cost 13.
Uniform Cost Search
Implementation: fringe = priority queue ordered by path cost g(n).
Uniform-cost search expands the node with the smallest path cost g(n).
Equivalent to breadth-first search if all step costs are equal.
Breadth-first is optimal only if the step cost is non-decreasing with depth (e.g. constant); uniform-cost search guarantees optimality for any positive step costs.
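A priority-queue sketch of UCS (illustrative Python; the weighted graph below is an assumption loosely modelled on the slide's S->D->C->G2 example, since the original figure did not survive):

```python
import heapq

def uniform_cost_search(start, goal, successors):
    """Expand the frontier node with the smallest path cost g(n).
    `successors(state)` yields (next_state, step_cost) pairs."""
    frontier = [(0, start, [start])]     # priority queue ordered by g
    best_g = {start: 0}
    while frontier:
        g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path               # cheapest path to the goal
        for nxt, cost in successors(node):
            new_g = g + cost
            if new_g < best_g.get(nxt, float('inf')):
                best_g[nxt] = new_g      # found a cheaper way to nxt
                heapq.heappush(frontier, (new_g, nxt, path + [nxt]))
    return None

graph = {'S': [('A', 5), ('D', 3)], 'D': [('C', 4)], 'A': [('G2', 12)],
         'C': [('G2', 6)], 'G2': []}
print(uniform_cost_search('S', 'G2', graph.__getitem__))  # (13, ['S', 'D', 'C', 'G2'])
```

The goal test happens when a node is popped, not when it is generated; that is what guarantees the returned cost is minimal.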
Summary of Blind Search Algorithms

            BFS                          DFS       IDS
Complete?   Yes                          No        Yes
Optimal?    Yes (non-decreasing cost)    No        Yes (unit step cost)
Time        O(b^(d+1))                   O(b^m)    O(b^d)
Space       O(b^(d+1))                   O(bm)     O(bd)
Generate-and-test
Example - Traveling Salesman Problem (TSP)
• Traveler needs to visit n cities.
• Know the distance between each pair of cities.
• Want to know the shortest route that visits all
the cities once.
• n = 80 will take millions of years to solve exhaustively!
Generate-and-test
TSP Example
(Figure: four cities A, B, C, D with labelled pairwise distances.)
Generate-and-test Example
Systematically enumerate all tours starting from A:
1. A - B - C - D
2. A - B - D - C
3. A - C - B - D
4. A - C - D - B
...
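The enumeration above is exactly brute-force generate-and-test. A sketch in Python follows; the distance values are hypothetical, since the slide's edge weights are not fully recoverable from the figure:

```python
from itertools import permutations

# Hypothetical symmetric distance matrix for four cities.
dist = {('A', 'B'): 6, ('B', 'C'): 3, ('C', 'D'): 4,
        ('A', 'D'): 5, ('A', 'C'): 1, ('B', 'D'): 2}

def d(x, y):
    return dist.get((x, y)) or dist[(y, x)]

def tsp_generate_and_test(cities, home='A'):
    """Generate every tour starting from `home`, test its total length,
    keep the best. O(n!) work, hence hopeless for n around 80."""
    best_tour, best_len = None, float('inf')
    for perm in permutations([c for c in cities if c != home]):
        tour = [home, *perm, home]
        length = sum(d(a, b) for a, b in zip(tour, tour[1:]))
        if length < best_len:
            best_tour, best_len = tour, length
    return best_tour, best_len

tour, length = tsp_generate_and_test('ABCD')
print(tour, length)   # best tour has length 11 under the assumed distances
```

The `permutations` call is the "generate" step and the length comparison is the "test" step; nothing prunes the search, which is the point the slide makes about n = 80.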
Best First Search Algorithms
• Implementation:
Order the nodes in the fringe in increasing order of cost.
• Special cases:
– greedy best-first search
– A* search
Romania with straight-line dist.
Greedy best-first search
A* search example
Try yourself!
(Figure: step costs on the edges of a graph from S to G.)
Straight-line distances to the goal:
h(S-G)=10, h(A-G)=7, h(B-G)=10, h(C-G)=20, h(D-G)=1, h(E-G)=8, h(F-G)=1, h(G-G)=0
The graph shows the step-costs for different paths going from the start (S) to the goal (G), together with the straight-line distances to the goal.
1. Draw the search tree for this problem. Avoid repeated states.
2. Give the order in which the tree is searched (e.g. S-C-B...-G) for A* search.
Use the straight-line dist. as a heuristic function, i.e. h=SLD,
and indicate for each node visited what the value for the evaluation function, f, is.
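The exercise can be checked mechanically with an A* sketch. The heuristic values below are the straight-line distances from the exercise; the edge costs are assumptions, because the original graph figure did not survive extraction:

```python
import heapq

def astar(start, goal, successors, h):
    """A* search: expand the frontier node with the smallest
    f(n) = g(n) + h(n), where h is the heuristic (here, SLD)."""
    frontier = [(h(start), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for nxt, cost in successors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float('inf')):
                best_g[nxt] = ng
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt, path + [nxt]))
    return None

# SLD values from the exercise; edge costs below are ASSUMED.
sld = {'S': 10, 'A': 7, 'B': 10, 'C': 20, 'D': 1, 'E': 8, 'F': 1, 'G': 0}
graph = {'S': [('A', 3), ('B', 2), ('C', 1)], 'A': [('D', 6)],
         'D': [('F', 1)], 'F': [('G', 1)], 'B': [('E', 4)],
         'E': [('G', 8)], 'C': [('G', 20)], 'G': []}
print(astar('S', 'G', graph.__getitem__, sld.__getitem__))
# (11, ['S', 'A', 'D', 'F', 'G'])
```

With an admissible h, the first time G is popped from the priority queue its g value is the optimal path cost.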
Admissible heuristics
Admissible heuristics – 8 Puzzle Problem
Try it out!

START        GOAL
1 2 3        1 2 3
7 8 4        8 _ 4
6 _ 5        7 6 5

(_ marks the blank square.)
Admissible heuristics – 8 Puzzle Problem
Expanding START gives three successors (H = heuristic value):

1 2 3      1 2 3      1 2 3
7 _ 4      7 8 4      7 8 4
6 8 5      _ 6 5      6 5 _
H=5        H=4        H=6
Admissible heuristics
For the 8-puzzle:
• h1(n) = number of misplaced tiles
• h2(n) = total Manhattan distance (i.e., no. of squares from the desired location of each tile)
• h1(S) = ?
• h2(S) = ?
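Both heuristics can be sketched directly in Python. The board encoding (3x3 tuples with 0 for the blank) and the START/GOAL layouts are assumptions for illustration:

```python
# Boards are 3x3 tuples; 0 is the blank (an encoding choice for this sketch).
START = ((1, 2, 3),
         (7, 8, 4),
         (6, 0, 5))
GOAL  = ((1, 2, 3),
         (8, 0, 4),
         (7, 6, 5))

def positions(board):
    """Map each tile value to its (row, col) position."""
    return {v: (r, c) for r, row in enumerate(board) for c, v in enumerate(row)}

def h1(board, goal=GOAL):
    """Number of misplaced tiles (the blank is not counted)."""
    return sum(1 for r in range(3) for c in range(3)
               if board[r][c] != 0 and board[r][c] != goal[r][c])

def h2(board, goal=GOAL):
    """Total Manhattan distance of each tile from its goal square."""
    p, g = positions(board), positions(goal)
    return sum(abs(p[v][0] - g[v][0]) + abs(p[v][1] - g[v][1])
               for v in p if v != 0)

print(h1(START), h2(START))  # 3 3
```

Both are admissible: each move fixes at most one misplaced tile and reduces the total Manhattan distance by at most one, so neither heuristic ever overestimates the remaining moves.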
Evaluation of Search Algorithms
Completeness: Is the algorithm guaranteed to find a solution if one exists?
Optimality: When the algorithm finds a solution, is it the optimal one?
Time complexity: How long does it take to find a solution? (Often measured as the number of nodes generated during search.)
Space complexity: How much memory is needed to perform the search? (Often measured as the maximum number of nodes stored in memory.)
The AO* Algorithm
1. Initialize: Set G* = {s}, f(s) = h(s). If s ∈ T, label s as SOLVED.
2. Terminate: If s is SOLVED, then terminate.
3. Select: Select a non-terminal leaf node n from the marked sub-tree below s in G*.
4. Expand: Make explicit the successors of n. For each successor m not already in G*: set f(m) = h(m); if m is a terminal node, label m as SOLVED.
5. Cost Revision: Call cost-revise(n).
6. Loop: Go to Step 2.
Cost Revision in AO*: cost-revise(n)
1. Create Z = {n}.
2. If Z = { }, return.
3. Select a node m from Z such that m has no descendants in Z.
4. If m is an AND node with successors r1, r2, ..., rk:
   Set f(m) = Σ [ f(ri) + c(m, ri) ]
   Mark the edge to each successor of m.
   If each successor is labeled SOLVED, then label m as SOLVED.
5. If m is an OR node with successors r1, r2, ..., rk:
   Set f(m) = min { f(ri) + c(m, ri) }
   Mark the edge to the best successor of m.
   If the marked successor is labeled SOLVED, label m as SOLVED.
6. If the cost or label of m has changed, then insert those parents of m into Z for which m is a marked successor.
7. Go to Step 2.
Searching OR Graphs
• How does AO* fare when the graph has only OR nodes?
• What are the roles of lower-bound and upper-bound estimates?
  – Pruning criteria: LB > UB

Searching Game Trees
• Consider an OR tree with two types of OR nodes, namely Min nodes and Max nodes
• In Min nodes we select the minimum cost successor
• In Max nodes we select the maximum cost successor
Shallow and Deep Pruning
(Figure: a game tree of Max and Min nodes illustrating a shallow cut-off and a deep cut-off.)
Local search algorithms
• In many optimization problems, the path to the goal is
irrelevant; the goal state itself is the solution
Types of Local Search
• Hill-climbing Search
• Simulated Annealing Search
Hill Climbing
Hill Climbing
Algorithm
1. Evaluate the initial state.
2. Loop until a solution is found or there are
no new operators left to be applied:
− Select and apply a new operator
− Evaluate the new state:
goal → quit
better than current state → new current state
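The loop above can be sketched in a few lines of Python. This is an illustrative steepest-ascent variant (it evaluates all neighbours and takes the best); the toy 1-D objective is an assumption for demonstration:

```python
def hill_climb(state, neighbours, value, max_steps=1000):
    """Hill climbing: repeatedly move to a better neighbour, stopping
    at a (possibly only local) maximum."""
    for _ in range(max_steps):
        nbrs = neighbours(state)
        best = max(nbrs, key=value, default=state)
        if value(best) <= value(state):
            return state                 # no better neighbour: local maximum
        state = best                     # better neighbour becomes current state
    return state

# Toy 1-D landscape whose value peaks at x = 6.
value = lambda x: -(x - 6) ** 2
neighbours = lambda x: [x - 1, x + 1]
print(hill_climb(0, neighbours, value))  # 6
```

On this single-peak landscape the climb always succeeds; the disadvantages discussed below (local maxima, plateaux, ridges) appear as soon as the landscape has more structure.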
Steepest Ascent/Descent Hill Climbing
Examine all neighbours of the current state and move to the best one; e.g., from a current state with value 6 and neighbours valued 4, 10, 3, 2, 8, steepest ascent moves to the neighbour valued 10.
Hill Climbing Example
A local heuristic function
Count +1 for every block that sits on the correct thing. The goal state has
the value +8.
Count -1 for every block that sits on an incorrect thing. In the initial state, blocks C, D, E, F, G, H count +1 each; blocks A and B count -1 each, for a total of +4.
Move 1 gives the value +6 (A is now on the correct support). Moves 2a
and 2b both give +4 (B and H are wrongly situated). This means we have
a local maximum of +6.
Hill Climbing Example
A global heuristic function
Count +N for every block that sits on a correct stack of N things. The goal
state has the value +28.
Count -N for every block that sits on an incorrect stack of N things. That is,
there is a large penalty for blocks tied up in a wrong structure.
In the initial state C, D, E, F, G, H count -1, -2, -3, -4, -5, -6. A counts -7, for a total of -28.
Hill-climbing search
• Problem: depending on initial state, can get stuck in local maxima
Hill Climbing: Disadvantages
Local maximum
A state that is better than all of its
neighbours, but not better than
some other states far away.
Plateau
A flat area of the search space in
which all neighbouring states have
the same value.
Ridge
The orientation of the high region, compared to the set of available moves, makes it impossible to climb up. However, two moves executed serially may increase the height.
Hill Climbing: Disadvantages
Ways Out
Simulated annealing search
• Idea: escape local maxima by allowing some "bad"
moves but gradually decrease their frequency.
moves but gradually decrease their frequency.
Simulated annealing example
1. The objective function to minimize is a simple function of two variables:
min f(x) = (4 - 2.1·x1² + x1⁴/3)·x1² + x1·x2 + (-4 + 4·x2²)·x2²,
with 0 ≤ x1, x2 ≤ 20, each variable encoded in 5 bits, and Tmin = 50.

Iteration trace (T is halved each step; ΔE = f(x') - f(x); accept if ΔE < 0, otherwise compare e^(-ΔE/T) with a random number r):
k=1, T=300: x = (2, 9), f(x) = 25941.73; candidate x' = (0, 8), f(x') = 16128, ΔE = -9813.73 < 0 → accept.
k=2, T=150: x = (0, 8), f(x) = 16128; candidate x' = (1, 12), f(x') = 82382.23, ΔE = 66254.23, e^(-ΔE/T) = 1.494e-192, r = 0.6 → accept.
k=3, T=75: x = (1, 12), f(x) = 82382.23; candidate x' = (0, 14), f(x') = 152880, ΔE = 70497.77, e^(-ΔE/T) ≈ 0, r = 0.7 → accept.
k=4, T=37.5: best f(x) found so far = 16128.
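The trace above can be reproduced in structure with a short Python sketch. The cooling schedule (T = T/2), bounds, and objective follow the example; the random-neighbour choice and seed are assumptions, and this version uses the standard acceptance rule (accept uphill moves with probability e^(-ΔE/T)):

```python
import math
import random

def f(x1, x2):
    # The example's objective: a camelback-style function of two variables.
    return ((4 - 2.1 * x1**2 + x1**4 / 3) * x1**2
            + x1 * x2 + (-4 + 4 * x2**2) * x2**2)

def simulated_annealing(t0=300.0, t_min=50.0, seed=0):
    """Accept any downhill move; accept an uphill move with probability
    exp(-dE/T); then cool the temperature (T = T/2, as in the example)."""
    rng = random.Random(seed)
    x = (rng.randint(0, 20), rng.randint(0, 20))
    t = t0
    while t >= t_min:
        cand = (rng.randint(0, 20), rng.randint(0, 20))  # random neighbour
        d_e = f(*cand) - f(*x)
        if d_e < 0 or rng.random() < math.exp(-d_e / t):
            x = cand                                     # move accepted
        t /= 2
    return x, f(*x)

print(simulated_annealing())
```

Early on, high T makes e^(-ΔE/T) close to 1 so bad moves are often accepted; as T falls the algorithm degenerates into plain hill climbing, which is how it escapes local minima without wandering forever.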
Local Beam Search
Genetic Algorithms
• Search Space
• Crossover
• Mutation
GA can be summarized as:
https://www.geeksforgeeks.org/genetic-algorithms/
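The selection/crossover/mutation loop can be sketched as follows. This is an illustrative toy (the "one-max" fitness, tournament selection, and all parameter values are assumptions, not from the slides):

```python
import random

def genetic_algorithm(fitness, n_bits=8, pop_size=20, generations=50,
                      p_mut=0.02, seed=1):
    """GA loop: tournament selection, single-point crossover, bit-flip
    mutation. Maximizes `fitness` over fixed-length bit strings."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        new_pop = []
        for _ in range(pop_size):
            # Tournament selection: best of 3 random individuals, twice
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_bits)          # single-point crossover
            child = p1[:cut] + p2[cut:]
            for i in range(n_bits):                 # bit-flip mutation
                if rng.random() < p_mut:
                    child[i] = 1 - child[i]
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

best = genetic_algorithm(sum)   # fitness = number of 1-bits ("one-max")
print(best, sum(best))
```

The same skeleton solves harder problems by swapping in a different chromosome encoding and fitness function; only the three GA operators stay fixed.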
Random Number Table for Solving GA Problem
GA Calculation
Tree from Max's perspective
Minimax Algorithm
• Minimax algorithm
– Perfect play for deterministic, 2-player game
– Max tries to maximize its score
– Min tries to minimize Max's score
– Goal: move to the position of highest minimax value
Identify best achievable payoff against best play
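A recursive sketch of the minimax value computation (illustrative Python; the nested-list tree encoding is an assumption, with leaves holding utilities for Max):

```python
def minimax(node, is_max):
    """Perfect play on a game tree given as nested lists: Max picks the
    maximum of its children's values, Min picks the minimum."""
    if not isinstance(node, list):
        return node                      # leaf: utility value for Max
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

# Max to move at the root; each sublist is a Min node over two leaves.
tree = [[3, 9], [0, 7], [2, 6]]
print(minimax(tree, True))  # Min backs up 3, 0, 2; Max chooses 3
```

To pick the actual move (not just the value), take the argmax over the root's children instead of the plain max.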
Minimax Algorithm
(Figure: a game tree with leaf values 2, 7, 1, 8; each MIN node backs up the minimum of its children's values.)
Minimax Algorithm (cont'd)
(Figure: leaves 3, 9, 0, 7, 2, 6; the MIN nodes back up 3, 0, 2, and MAX chooses 3.)
• Limitations
– Not always feasible to traverse entire tree
– Time limitations
• Key Improvement
– Use evaluation function instead of utility
• Evaluation function provides estimate of utility at given
position
Principle
– If a move is determined to be worse than another move already examined, then there is no need for further examination of the node.
α-β Pruning Example
Alpha-Beta Pruning (αβ prune)
• Rules of Thumb
(Figure: worked alpha-beta pruning examples showing β cut-offs.)
The α-β algorithm
Another Example
1. Search below a MIN node may be alpha-pruned if its beta value is ≤ the alpha value of some MAX ancestor.
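The pruning rule can be folded into the minimax recursion. An illustrative Python sketch (same nested-list tree encoding as before, which is an assumption):

```python
def alphabeta(node, is_max, alpha=float('-inf'), beta=float('inf')):
    """Minimax with alpha-beta pruning: stop searching a subtree as soon
    as its value cannot affect the decision (alpha >= beta)."""
    if not isinstance(node, list):
        return node                      # leaf utility
    if is_max:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                    # beta cut-off below a MAX node
        return value
    value = float('inf')
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break                        # alpha cut-off below a MIN node
    return value

tree = [[3, 9], [0, 7], [2, 6]]
print(alphabeta(tree, True))  # 3, same answer as plain minimax
```

Alpha-beta always returns the same value as minimax; the only difference is that whole subtrees are skipped, which in the best case doubles the reachable search depth.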
Game Outcomes
Minimax Criterion
⚫ Look to the “cake cutting problem” to explain
⚫ Cutter – maximize the minimum the Chooser will
leave him
⚫ Chooser – minimize the maximum the Cutter will
get
Cutter \ Chooser                  Choose bigger piece           Choose smaller piece
Cut cake as evenly as possible    Half the cake minus a crumb   Half the cake plus a crumb
⚫ If the upper and lower values are the same, the number is called the
value of the game and an equilibrium or saddle point condition exists
⚫ The value of a game is the average or expected game outcome if the game
is played an infinite number of times
⚫ A saddle point indicates that each player has a pure strategy i.e., the
strategy is followed no matter what the opponent does
Saddle Point
Pure Strategy - Minimax Criterion

                             Y1     Y2    Row minimum
Player X's strategies  X1    10     6     6
                       X2    -12    2     -12
Column maximum               10     6

The maximum of the row minima and the minimum of the column maxima are both 6, so the game has a saddle point with value 6.
Mixed Strategy Game
⚫ When there is no saddle point, players will play each strategy for a
certain percentage of the time
⚫ The most common way to solve a mixed strategy is to use the expected
gain or loss approach
⚫ A player plays each strategy a particular percentage of the time so that the
expected value of the game does not depend upon what the opponent
does
                   Y1 (prob. P)   Y2 (prob. 1-P)   Expected Gain
X1 (prob. Q)       4              2                4P + 2(1-P)
X2 (prob. 1-Q)     1              10               1P + 10(1-P)
Expected gain      4Q + 1(1-Q)    2Q + 10(1-Q)
Mixed Strategy Game: Solving for P & Q
4P + 2(1-P) = 1P + 10(1-P)
⇒ P = 8/11 and 1-P = 3/11
Expected payoff: EP_X = 1(8/11) + 10(3/11) = 38/11 ≈ 3.45

4Q + 1(1-Q) = 2Q + 10(1-Q)
⇒ Q = 9/11 and 1-Q = 2/11
Expected payoff: EP_Y = 38/11 ≈ 3.45
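The expected-gain equalization generalizes to any 2x2 game without a saddle point. A sketch using exact rational arithmetic (the function name and payoff-indexing convention are choices for this illustration):

```python
from fractions import Fraction

def solve_mixed_2x2(a11, a12, a21, a22):
    """P = probability Y plays Y1, Q = probability X plays X1, chosen so
    the opponent's expected payoff is the same whichever row/column is
    played (the expected-gain approach)."""
    denom = a11 - a12 - a21 + a22
    # From a11*P + a12*(1-P) = a21*P + a22*(1-P):
    p = Fraction(a22 - a12, denom)
    # From a11*Q + a21*(1-Q) = a12*Q + a22*(1-Q):
    q = Fraction(a22 - a21, denom)
    value = a11 * p + a12 * (1 - p)     # game value under the mix
    return p, q, value

p, q, v = solve_mixed_2x2(4, 2, 1, 10)
print(p, q, v)  # 8/11 9/11 38/11 (about 3.45)
```

Using `Fraction` avoids the rounding that produced the slide's 3.46 figure: the exact value is 38/11.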
Mixed Strategy Game: Example
• This game can be solved by setting up the mixed strategy table and developing the appropriate equations.
Two-Person Zero-Sum and Constant-Sum
Games
Two-person zero-sum and constant-sum games are played according to
the following basic assumption:
Each player chooses a strategy that enables him/her to do the best he/she
can, given that his/her opponent knows the strategy he/she is following.
Two-Person Zero-Sum and Constant-Sum
Games (Cont)
Step 1 Check for a saddle point. If the game has none, go on to step 2.
Two-Person Zero-Sum and Constant-Sum
Games (Cont)
Let’s take the following example: Two TV channels (1 and 2) are competing
for an audience of 100 viewers. The rule of the game is to simultaneously
announce the type of show the channels will broadcast. Given the payoff
matrix below, what type of show should channel 1 air?
Two-person zero-sum game – Dominance property

Player A \ Player B    B1   B2   B3   B4
A1                     3    5    4    2
A2                     5    6    2    4
A3                     2    1    4    0
A4                     3    3    5    2

Solution: after eliminating dominated rows and columns, the game reduces to

Player A \ Player B    B3   B4
A2                     2    4
A4                     5    2
The Prisoner’s Dilemma
– If only one prisoner turns state’s evidence and testifies against his partner he
will go free while the other will receive a 3 year sentence.
– Each prisoner knows the other has the same offer
– The catch is that if both turn state’s evidence, they each receive a 2 year
sentence
– If both refuse, each will be imprisoned for 1 year on the lesser charge
A game is described by a payoff matrix:

                    Player B
                    Left      Right
Player A  Top       3, 0      0, -4
          Bottom    2, 4      -1, 3
Game Playing: How to solve a situation like this?
• If Player A’s choice is optimal given Player B’s choice, and B’s
choice is optimal given A’s choice, a pair of strategies is a
Nash equilibrium.
• When the other players' choices are revealed, neither player would like to change her behavior.
• If a set of strategies are best responses to each other, the
strategy set is a Nash equilibrium.
Payoff matrix (normal or strategic form)

                    Player B
                    Left      Right
Player A  Top       1, 1      2, 3*
          Bottom    2, 3*     1, 2
Solution

                    Player B
                    Left      Right
Player A  Top       1, -1     -1, 1
          Bottom    -1, 1     1, -1
Nash equilibrium in mixed strategies

                        Prisoner B
                        Confess    Deny
Prisoner A  Confess     -2, -2     0, -4
            Deny        -4, 0      -1, -1
Solution
Confess is a dominant strategy for both. If both
Deny they would be better off. This is the
dilemma.
Nash Equilibrium – To do Problems!

Game A:             COKE
                    L        R
PEPSI     U         6,8*     4,7
          D         7,6      3,7

Game B:             COKE
                    L        R
PEPSI     U         7,6*     5,5
          D         4,5      6,4
GAME PLAYING & MECHANISM DESIGN
Example 1: Mechanism Design – Fair Division of a Cake
• Mother: social planner / mechanism designer
• Kid 1 and Kid 2: rational and intelligent players
GAME PLAYING & MECHANISM DESIGN
• Tenali Rama (Birbal): mechanism designer
• Baby; Mother 1 and Mother 2: rational and intelligent players
17-03-2021 18CSC305J_AI_UNIT3 2
Knowledge Representation & Reasoning
• The second most important concept in AI
• If we are going to act rationally in our environment, then we must have some way of
describing that environment and drawing inferences from that representation.
• How do we describe what we know about the world?
• How do we describe it concisely?
• How do we describe it so that we can get hold of the right piece of knowledge when we need it?
• How do we generate new pieces of knowledge?
• How do we deal with uncertain knowledge?
Knowledge Representation & Reasoning
Knowledge
Declarative Procedural
• Declarative knowledge deals with factual questions (e.g., What is the capital of India?)
• Procedural knowledge deals with “How”
• Procedural knowledge can be embedded in declarative knowledge
Planning
Given a set of goals, construct a sequence of actions that achieves
those goals:
• often very large search space
• but most parts of the world are independent of most other
parts
• often start with goals and connect them to actions
• no necessary connection between order of planning and order
of execution
• what happens if the world changes as we execute the plan
and/or our actions don’t produce the expected results?
Learning
Requirements for a Knowledge-Based Agent
1. "what it already knows" [McCarthy '59]
   A knowledge base of beliefs.
2. "it must first be capable of being told" [McCarthy '59]
   A way to put new beliefs into the knowledge base.
3. "automatically deduces for itself a sufficiently wide class of immediate consequences" [McCarthy '59]
   A reasoning mechanism to derive new beliefs from ones already in the knowledge base.
ARCHITECTURE OF A KNOWLEDGE-BASED AGENT
• Knowledge Level.
• The most abstract level: describe agent by saying what it knows.
• Example: A taxi agent might know that the Golden Gate Bridge connects San
Francisco with the Marin County.
• Logical Level.
• The level at which the knowledge is encoded into sentences.
• Example: Links(GoldenGateBridge, SanFrancisco, MarinCounty).
• Implementation Level.
• The physical representation of the sentences in the logical level.
• Example: ‘(links goldengatebridge sanfrancisco marincounty)
THE WUMPUS WORLD ENVIRONMENT
• The Wumpus computer game
• The agent explores a cave consisting of rooms connected by passageways.
• Lurking somewhere in the cave is the Wumpus, a beast that eats any agent that
enters its room.
• Some rooms contain bottomless pits that trap any agent that wanders into the
room.
• Occasionally, there is a heap of gold in a room.
• The goal is to collect the gold and exit the world without being eaten
A TYPICAL WUMPUS WORLD
• The agent always starts in the
field [1,1].
• The task of the agent is to
find the gold, return to the
field [1,1] and climb out of
the cave.
AGENT IN A WUMPUS WORLD: PERCEPTS
• The agent perceives
• a stench in the square containing the Wumpus and in the adjacent squares (not
diagonally)
• a breeze in the squares adjacent to a pit
• a glitter in the square where the gold is
• a bump, if it walks into a wall
• a woeful scream everywhere in the cave, if the wumpus is killed
• The percepts are given as a five-symbol list. If there is a stench and a breeze, but no
glitter, no bump, and no scream, the percept is
[Stench, Breeze, None, None, None]
WUMPUS WORLD ACTIONS
• go forward
• turn right 90 degrees
• turn left 90 degrees
• grab: Pick up an object that is in the same square as the agent
• shoot: Fire an arrow in a straight line in the direction the agent is facing. The arrow
continues until it either hits and kills the wumpus or hits the outer wall. The agent
has only one arrow, so only the first Shoot action has any effect
• climb is used to leave the cave. This action is only effective in the start square
• die: This action automatically and irretrievably happens if the agent enters a square
with a pit or a live wumpus
ILLUSTRATIVE EXAMPLE: WUMPUS WORLD
•Performance measure
• gold +1000,
• death -1000
(falling into a pit or being eaten by the wumpus)
• -1 per step, -10 for using the arrow
•Environment
• Rooms / squares connected by doors.
• Squares adjacent to wumpus are smelly
• Squares adjacent to pit are breezy
• Glitter iff gold is in the same square
• Shooting kills wumpus if you are facing it
• Shooting uses up the only arrow
• Grabbing picks up gold if in same square
• Releasing drops the gold in same square
• Randomly generated at start of game. Wumpus only senses current room.
•Sensors: Stench, Breeze, Glitter, Bump, Scream [perceptual inputs]
•Actuators: Left turn, Right turn, Forward, Grab, Release, Shoot
WUMPUS WORLD CHARACTERIZATION
Fully Observable: No – only local perception
Discrete: Yes
EXPLORING A WUMPUS WORLD
The knowledge base of the agent
consists of the rules of the
Wumpus world plus the percept
“nothing” in [1,1]
Boolean percept
feature values:
<0, 0, 0, 0, 0>
EXPLORING A WUMPUS WORLD
(Figure: board states at T=0 and T=1; after the percept [Stench, None, None, None, None] the agent marks visited squares V and neighbouring squares P? as possibly unsafe.)
EXPLORING A WUMPUS WORLD
We reasoned about the possible states the Wumpus world can be in,
given our percepts and our knowledge of the rules of the Wumpus
world.
I.e., the content of KB at T=3.
What follows is what holds true in all those worlds that satisfy what is known at time T=3 about the particular Wumpus world we are in.
Example property: P_in_(3,1)
KB entails P_in_(3,1) iff Models(KB) ⊆ Models(P_in_(3,1))
SUMMARY OF KNOWLEDGE BASED AGENTS
• Intelligent agents need knowledge about the world for making good decisions.
• The knowledge of an agent is stored in a knowledge base in the form of sentences in a
knowledge representation language.
• A knowledge-based agent needs a knowledge base and an inference mechanism. It
operates by storing sentences in its knowledge base, inferring new sentences with the
inference mechanism, and using them to deduce which actions to take.
• A representation language is defined by its syntax and semantics, which specify the
structure of sentences and how they relate to the facts of the world.
• The interpretation of a sentence is the fact to which it refers. If this fact is part of the
actual world, then the sentence is true.
Knowledge and Reasoning
Table of Contents
• Knowledge and reasoning – Approaches and issues of knowledge reasoning – Knowledge-based agents
• Logic basics – Logic – Propositional logic: syntax, semantics and inferences – Propositional logic: reasoning patterns
• Unification and resolution – Knowledge representation using rules – Knowledge representation using semantic nets
• Knowledge representation using frames – Inferences
• Uncertain knowledge and reasoning – Methods – Bayesian probability and belief network
• Probabilistic reasoning – Probabilistic reasoning over time
• Other uncertain techniques – Data mining – Fuzzy logic – Dempster-Shafer theory
What is a Logic?
• A language with concrete rules
• No ambiguity in representation (may be other errors!)
• Allows unambiguous communication and processing
• Very unlike natural languages e.g. English
• Many ways to translate between languages
• A statement can be represented in different logics
• And perhaps differently in same logic
• Expressiveness of a logic
• How much can we say in this language?
• Not to be confused with logical reasoning
• Logics are languages, reasoning is a process (may use logic)
Syntax and Semantics
• Syntax
• Rules for constructing legal sentences in the logic
• Which symbols we can use (English: letters, punctuation)
• How we are allowed to combine symbols
• Semantics
• How we interpret (read) sentences in the logic
• Assigns a meaning to each sentence
• Example: “All lecturers are seven foot tall”
• A valid sentence (syntax)
• And we can understand the meaning (semantics)
• This sentence happens to be false (there is a counterexample)
Propositional Logic
• Syntax
• Propositions, e.g. “it is wet”
• Connectives: and, or, not, implies, iff (equivalent)
• A first-order example (beyond propositional logic):
  For all X, if (X is a rose), then there exists Y such that (X has Y) and (Y is a thorn)
Example: FOL Sentence
• “On Mondays and Wednesdays I go to John’s house for dinner”
Truth tables
• Logic, like arithmetic, has operators, which apply to one, two, or more
values (operands)
• A truth table lists the results for each possible arrangement of operands
• Order is important: x op y may or may not give the same result as y op x
• The rows in a truth table list all possible sequences of truth values for n
operands, and specify a result for each sequence
• Hence, there are 2^n rows in a truth table for n operands
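Enumerating all 2^n rows is mechanical, as a short sketch shows (illustrative Python; the `op` callback interface is an assumption):

```python
from itertools import product

def truth_table(op, n=2):
    """Enumerate all 2^n assignments of truth values to n operands and
    apply `op` to each, returning (inputs, result) pairs."""
    return [(values, op(*values))
            for values in product([True, False], repeat=n)]

implies = lambda x, y: (not x) or y      # material implication as not/or
for (x, y), result in truth_table(implies):
    print(x, y, '->', result)
```

Running this prints the four rows of the implication table: only the T, F row yields False, which previews the ¬X ∨ Y equivalence discussed below.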
Unary operators
• There are four possible unary operators:

  X | Constant true (T) | Identity (X) | Negation ¬X | Constant false (F)
  T |        T          |      T       |      F      |         F
  F |        T          |      F       |      T      |         F

• Only the last of these (negation) is widely used (and has a symbol, ¬, for the operation)
Binary operators
• There are sixteen possible binary operators:

  X Y | T T T T T T T T F F F F F F F F
  T F | T T T T F F F F T T T T F F F F
  F T | T T F F T T F F T T F F T T F F
  F F | T F T F T F T F T F T F T F T F
• All these operators have names, but I haven’t tried to fit them in
• Only a few of these operators are normally used in logic
Useful binary operators
• Here are the binary operators that are traditionally used: ∧ (and), ∨ (or), → (implies), ↔ (iff)
• Notice in particular that material implication (→) only approximately means the same as the English word "implies"
• All the other operators can be constructed from a combination of these (along with unary not, ¬)
Logical expressions
• All logical expressions can be computed with some combination of and (∧), or (∨), and not (¬) operators
• For example, logical implication can be computed this way:

  X Y | ¬X | ¬X ∨ Y | X → Y
  T T |  F |   T    |   T
  T F |  F |   F    |   F
  F T |  T |   T    |   T
  F F |  T |   T    |   T

• Notice that X → Y is equivalent to ¬X ∨ Y
Another example
• Exclusive or (xor) is true if exactly one of its operands is true

  X Y | ¬X ¬Y | X∧¬Y  ¬X∧Y | (X∧¬Y)∨(¬X∧Y) | X xor Y
  T T |  F  F |   F     F  |       F       |    F
  T F |  F  T |   T     F  |       T       |    T
  F T |  T  F |   F     T  |       T       |    T
  F F |  T  T |   F     F  |       F       |    F
World
• A world is a collection of propositions and logical expressions relating those propositions
• Example:
  • Propositions: JohnLovesMary, MaryIsFemale, MaryIsRich
  • Expressions: MaryIsFemale ∧ MaryIsRich → JohnLovesMary
• A proposition "says something" about the world, but since it is atomic (you can't look inside it to see component parts), propositions tend to be very specialized and inflexible
Models
A model is an assignment of a truth value to each proposition, for example:
• JohnLovesMary: T, MaryIsFemale: T, MaryIsRich: F
• An expression is satisfiable if there is a model for which the expression is true
• For example, the above model satisfies the expression
  MaryIsFemale ∧ MaryIsRich → JohnLovesMary
• An expression is valid if it is satisfied by every model
• This expression is not valid:
  MaryIsFemale ∧ MaryIsRich → JohnLovesMary
  because it is not satisfied by this model:
  JohnLovesMary: F, MaryIsFemale: T, MaryIsRich: T
• But this expression is valid:
  MaryIsFemale ∧ MaryIsRich → MaryIsFemale
Inference rules in propositional logic
• Here are just a few of the rules you can apply when reasoning in propositional logic:
Implication elimination
• A particularly important rule allows you to get rid of the implication operator, →:
• X → Y ≡ ¬X ∨ Y
• We will use this later on as a necessary tool for simplifying logical expressions
• The symbol ≡ means "is logically equivalent to"
Conjunction elimination
• Another important rule for simplifying logical expressions allows you to get rid of the conjunction (and) operator, ∧:
• This rule simply says that if you have an and operator at the top level of a fact (logical expression), you can break the expression up into two separate facts:
• MaryIsFemale ∧ MaryIsRich
• becomes:
• MaryIsFemale
• MaryIsRich
Inference by computer
• To do inference (reasoning) by computer is basically a search process,
taking logical expressions and applying inference rules to them
• Which logical expressions to use?
• Which inference rules to apply?
• Usually you are trying to “prove” some particular statement
• Example:
  • it_is_raining ∨ it_is_sunny
  • it_is_sunny → I_stay_dry
  • it_is_raining → I_take_umbrella
  • I_take_umbrella → I_stay_dry
• To prove: I_stay_dry
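A minimal sketch of such an inference search, restricted to forward chaining with modus ponens over atomic facts (the rule encoding is an assumption; the disjunctive premise is handled by case analysis outside the chainer):

```python
def forward_chain(facts, rules):
    """Repeatedly apply modus ponens: whenever every premise of a rule
    is known, add its conclusion; stop when nothing new is derivable."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)
                changed = True
    return facts

rules = [(('it_is_sunny',), 'I_stay_dry'),
         (('it_is_raining',), 'I_take_umbrella'),
         (('I_take_umbrella',), 'I_stay_dry')]

# Case analysis on it_is_raining ∨ it_is_sunny: either way, I_stay_dry.
assert 'I_stay_dry' in forward_chain({'it_is_raining'}, rules)
assert 'I_stay_dry' in forward_chain({'it_is_sunny'}, rules)
print("I_stay_dry is provable in both cases")
```

This illustrates the "search process" framing: the chainer blindly generates every derivable fact, whereas backward chaining would start from I_stay_dry and work toward the premises.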
Reasoning Patterns
• Inference in propositional logic is NP-complete!
• However, inference in propositional logic shows monotonicity:
• Adding more rules to a knowledge base does not
affect earlier inferences
Forward and backward reasoning
• Situation: You have a collection of logical expressions (premises), and
you are trying to prove some additional logical expression (the
conclusion)
• You can:
• Do forward reasoning: Start applying inference rules to the logical
expressions you have, and stop if one of your results is the
conclusion you want
• Do backward reasoning: Start from the conclusion you want, and
try to choose inference rules that will get you back to the logical
expressions you have
• With the tools we have discussed so far, neither is feasible
Example
• Given:
  • it_is_raining ∨ it_is_sunny
  • it_is_sunny → I_stay_dry
  • it_is_raining → I_take_umbrella
  • I_take_umbrella → I_stay_dry
• You can conclude (among many others):
  • it_is_sunny ∨ it_is_raining
  • I_take_umbrella ∨ it_is_sunny
  • ¬I_stay_dry → ¬I_take_umbrella
• Etc., etc. ... there are just too many things you can conclude!
Predicate calculus
• Predicate calculus is also known as “First Order Logic” (FOL)
• Predicate calculus includes:
• All of propositional logic
• Logical values true, false
• Variables x, y, a, b,...
• Connectives ∧, ∨, ¬, →, ↔
• Constants KingJohn, 2, Villanova,...
• Predicates Brother, >,...
• Functions Sqrt, MotherOf,...
• Quantifiers ∀, ∃
Constants, functions, and predicates
• A constant represents a “thing”--it has no truth value, and it
does not occur “bare” in a logical expression
• Examples: DavidMatuszek, 5, Earth, goodIdea
• Given zero or more arguments, a function produces a
constant as its value:
• Examples: motherOf(DavidMatuszek), add(2, 2),
thisPlanet()
• A predicate is like a function, but produces a truth value
• Examples: greatInstructor(DavidMatuszek), isPlanet(Earth), greater(3, add(2, 2))
Universal quantification
• The universal quantifier, ∀, is read as “for each”
or “for every”
• Example: ∀x, x² ≥ 0 (for all x, x² is greater than or equal to zero)
• Typically, → is the main connective with ∀:
∀x, at(x,Villanova) → smart(x)
means “Everyone at Villanova is smart”
• Common mistake: using ∧ as the main connective with ∀:
∀x, at(x,Villanova) ∧ smart(x)
means “Everyone is at Villanova and everyone is smart”
• If there are no values satisfying the condition, the result is true
• Example: ∀x, isPersonFromMars(x) → smart(x) is true
Existential quantification
• The existential quantifier, ∃, is read “for some” or “there exists”
• Example: ∃x, x² < 0 (there exists an x such that x² is less than zero)
• Typically, ∧ is the main connective with ∃:
∃x, at(x,Villanova) ∧ smart(x)
means “There is someone who is at Villanova and is smart”
• Common mistake: using → as the main connective with ∃:
∃x, at(x,Villanova) → smart(x)
This is true if there is someone at Villanova who is smart...
...but it is also true if there is someone who is not at Villanova
By the rules of material implication, the result of F → T is T
Properties of quantifiers
• ∀x ∀y is the same as ∀y ∀x
• ∃x ∃y is the same as ∃y ∃x
Parentheses
• Parentheses are often used with quantifiers
• Unfortunately, everyone uses them differently, so don’t be upset at any
usage you see
• Examples:
• (∀x) person(x) → likes(x,iceCream)
• (∀x) (person(x) → likes(x,iceCream))
• (∀x) [ person(x) → likes(x,iceCream) ]
• ∀x, person(x) → likes(x,iceCream)
• ∀x (person(x) → likes(x,iceCream))
• I prefer parentheses that show the scope of the quantifier
∀x (x > 0) ∧ ∃x (x < 0)
More rules
• Now there are numerous additional rules we can apply!
• Here are two exceptionally important rules:
• ¬∀x, p(x) → ∃x, ¬p(x)
“If not every x satisfies p(x), then there exists an x that does not satisfy p(x)”
• ¬∃x, p(x) → ∀x, ¬p(x)
“If there does not exist an x that satisfies p(x), then no x satisfies p(x)”
• In any case, the search space is just too large to be feasible
• This was the case until 1965, when J. A. Robinson discovered resolution
Knowledge and Reasoning
Table of Contents
• Knowledge and reasoning-Approaches and issues of knowledge reasoning-
Knowledge base agents
• Logic Basics-Logic-Propositional logic-syntax ,semantics and inferences-
Propositional logic- Reasoning patterns
• Unification and Resolution-Knowledge representation using rules-Knowledge
representation using semantic nets
• Knowledge representation using frames-Inferences-
• Uncertain Knowledge and reasoning-Methods-Bayesian probability and belief
network
• Probabilistic reasoning-Probabilistic reasoning over time-Probabilistic
reasoning over time
• Other uncertain techniques-Data mining-Fuzzy logic-Dempster-Shafer theory
Logic by computer was infeasible
• Why is logic so hard?
• You start with a large collection of facts (predicates)
• You start with a large collection of possible transformations (rules)
• Some of these rules apply to a single fact to yield a new fact
• Some of these rules apply to a pair of facts to yield a new fact
• So at every step you must:
• Choose some rule to apply
• Choose one or two facts to which you might be able to apply the rule
• If there are n facts
• There are n potential ways to apply a single-operand rule
• There are n * (n - 1) potential ways to apply a two-operand rule
• Add the new fact to your ever-expanding fact base
• The search space is huge!
The magic of resolution
• Here’s how resolution works:
• You transform each of your facts into a particular form, called a clause
(this is the tricky part)
• You apply a single rule, the resolution principle, to a pair of clauses
• Clauses are closed with respect to resolution--that is, when you
resolve two clauses, you get a new clause
• You add the new clause to your fact base
• So the number of facts you have grows linearly
• You still have to choose a pair of facts to resolve
• You never have to choose a rule, because there’s only one
67
17-03-2021 18CSC305J_AI_UNIT3
The fact base
• A fact base is a collection of “facts,” expressed in predicate calculus, that are presumed to be true (valid)
• These facts are implicitly “anded” together
• Example fact base:
• seafood(X) → likes(John, X) (where X is a variable)
• seafood(shrimp)
• pasta(X) → likes(Mary, X) (where X is a different variable)
• pasta(spaghetti)
• That is,
• (seafood(X) → likes(John, X)) ∧ seafood(shrimp)
∧ (pasta(Y) → likes(Mary, Y)) ∧ pasta(spaghetti)
• Notice that we had to change some Xs to Ys
• The scope of a variable is the single fact in which it occurs
Clause form
• A clause is a disjunction ("or") of zero or more literals, some or all of
which may be negated
• Example:
sinks(X) ∨ dissolves(X, water) ∨ ¬denser(X, water)
• Notice that clauses use only “or” and “not”—they do not use “and,”
“implies,” or either of the quantifiers “for all” or “there exists”
• The impressive part is that any predicate calculus expression can be
put into clause form
• Existential quantifiers, ∃, are the trickiest ones
Unification
• From the pair of facts (not yet clauses, just facts):
• seafood(X) → likes(John, X) (where X is a variable)
• seafood(shrimp)
• We ought to be able to conclude
• likes(John, shrimp)
• We can do this by unifying the variable X with the constant shrimp
• This is the same “unification” as is done in Prolog
• This unification turns seafood(X) → likes(John, X) into
seafood(shrimp) → likes(John, shrimp)
• Together with the given fact seafood(shrimp), the final deductive step is easy
The resolution principle
• Here it is:
• From X ∨ someLiterals
and ¬X ∨ someOtherLiterals
----------------------------------------------
conclude: someLiterals ∨ someOtherLiterals
• That’s all there is to it!
• Example:
• broke(Bob) ∨ well-fed(Bob)
¬broke(Bob) ∨ ¬hungry(Bob)
--------------------------------------
well-fed(Bob) ∨ ¬hungry(Bob)
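In code, the resolution principle is a small set operation. The sketch below uses an encoding of my own (not from the slides): a clause is a frozenset of literal strings, with a leading '~' marking negation; for each complementary pair of literals it produces one resolvent.

```python
def resolve(c1, c2):
    """Resolution principle: for each literal X in c1 whose negation is
    in c2, conclude (c1 - {X}) union (c2 - {~X})."""
    def negate(lit):
        return lit[1:] if lit.startswith('~') else '~' + lit
    return {frozenset((c1 - {lit}) | (c2 - {negate(lit)}))
            for lit in c1 if negate(lit) in c2}

# broke(Bob) v well-fed(Bob)   and   ~broke(Bob) v ~hungry(Bob)
r = resolve(frozenset({'broke', 'well-fed'}),
            frozenset({'~broke', '~hungry'}))
print(r == {frozenset({'well-fed', '~hungry'})})  # True
```

Applied to a pair of clauses with two complementary pairs, it returns two separate resolvents, one per pair, never the (incorrect) combined resolvent.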
A common error
• You can only do one resolution at a time
• Example:
• broke(Bob) ∨ well-fed(Bob) ∨ happy(Bob)
¬broke(Bob) ∨ ¬hungry(Bob) ∨ ¬happy(Bob)
• You can resolve on broke to get:
• well-fed(Bob) ∨ happy(Bob) ∨ ¬hungry(Bob) ∨ ¬happy(Bob) ≡ T
• Or you can resolve on happy to get:
• broke(Bob) ∨ well-fed(Bob) ∨ ¬broke(Bob) ∨ ¬hungry(Bob) ≡ T
• Note that both legal resolutions yield a tautology (a trivially true statement, containing X ∨ ¬X), which is correct but useless
• But you cannot resolve on both at once to get:
• well-fed(Bob) ∨ ¬hungry(Bob)
Contradiction
• A special case occurs when the result of a resolution (the resolvent) is
empty, or “NIL”
• Example:
• hungry(Bob)
¬hungry(Bob)
----------------
NIL
• In this case, the fact base is inconsistent
• This will turn out to be a very useful observation in doing resolution
theorem proving
A first example
• “Everywhere that John goes, Rover goes. John is at school.”
• at(John, X) → at(Rover, X) (not yet in clause form)
• at(John, school) (already in clause form)
• We use implication elimination to change the first of these into clause
form:
• ¬at(John, X) ∨ at(Rover, X)
• at(John, school)
• We can resolve these on at(-, -), but to do so we have to unify X with
school; this gives:
• at(Rover, school)
Refutation resolution
• The previous example was easy because it had very few clauses
• When we have a lot of clauses, we want to focus our search on the
thing we would like to prove
• We can do this as follows:
• Assume that our fact base is consistent (we can’t derive NIL)
• Add the negation of the thing we want to prove to the fact base
• Show that the fact base is now inconsistent
• Conclude the thing we want to prove
Example of refutation resolution
• “Everywhere that John goes, Rover goes. John is at school. Prove that Rover is
at school.”
1. ¬at(John, X) ∨ at(Rover, X)
2. at(John, school)
3. ¬at(Rover, school) (this is the added clause: the negated goal)
• Resolve #1 and #3, unifying X with school:
4. ¬at(John, school)
• Resolve #2 and #4:
5. NIL
• Conclude the negation of the added clause: at(Rover, school)
• This seems a roundabout approach for such a simple example, but it works well
for larger problems
A second example
• Start with:
• it_is_raining ∨ it_is_sunny
• it_is_sunny → I_stay_dry
• it_is_raining → I_take_umbrella
• I_take_umbrella → I_stay_dry
• Proof:
• Convert to clause form:
1. it_is_raining ∨ it_is_sunny
2. ¬it_is_sunny ∨ I_stay_dry
3. ¬it_is_raining ∨ I_take_umbrella
4. ¬I_take_umbrella ∨ I_stay_dry
• Prove that I stay dry, by adding the negated goal:
5. ¬I_stay_dry
• Resolve:
6. (5, 2) ¬it_is_sunny
7. (6, 1) it_is_raining
8. (5, 4) ¬I_take_umbrella
9. (8, 3) ¬it_is_raining
10. (9, 7) NIL
▪ Therefore, ¬(¬I_stay_dry)
▪ I_stay_dry
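A refutation proof like this one can be mechanised in a few lines. The following is an illustrative Python prover (the clause encoding and function names are my own, not from the slides): it saturates the clause set under resolution and reports whether the empty clause NIL is derivable.

```python
def negate(lit):
    return lit[1:] if lit.startswith('~') else '~' + lit

def resolvents(c1, c2):
    """All resolvents of two clauses (frozensets of '~'-prefixed literals)."""
    return {frozenset((c1 - {l}) | (c2 - {negate(l)}))
            for l in c1 if negate(l) in c2}

def refutes(clauses):
    """Saturate with resolution; True iff the empty clause (NIL) is
    derived, i.e. the clause set is inconsistent."""
    clauses = set(clauses)
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                for r in resolvents(a, b):
                    if not r:              # NIL: contradiction found
                        return True
                    new.add(r)
        if new <= clauses:                 # saturated: consistent
            return False
        clauses |= new

kb = [frozenset({'rain', 'sunny'}),        # it_is_raining v it_is_sunny
      frozenset({'~sunny', 'dry'}),        # it_is_sunny -> I_stay_dry
      frozenset({'~rain', 'umbrella'}),    # it_is_raining -> I_take_umbrella
      frozenset({'~umbrella', 'dry'})]     # I_take_umbrella -> I_stay_dry

print(refutes(kb + [frozenset({'~dry'})]))  # True: the KB entails I_stay_dry
```

Running `refutes(kb)` without the negated goal returns False: the knowledge base by itself is consistent, as refutation proving requires.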
Converting sentences to CNF
1. Eliminate all ↔ connectives
(P ↔ Q) ⇒ ((P → Q) ∧ (Q → P))
2. Eliminate all → connectives
(P → Q) ⇒ (¬P ∨ Q)
3. Reduce the scope of each negation symbol to a single predicate
¬¬P ⇒ P
¬(P ∧ Q) ⇒ ¬P ∨ ¬Q
¬(P ∨ Q) ⇒ ¬P ∧ ¬Q
¬(∀x)P ⇒ (∃x)¬P
¬(∃x)P ⇒ (∀x)¬P
4. Standardize variables: rename all variables so that each quantifier has its own
unique variable name
Converting sentences to clausal form Skolem
constants and functions
5. Eliminate existential quantification by introducing Skolem
constants/functions
(∃x)P(x) ⇒ P(c)
c is a Skolem constant (a brand-new constant symbol that is not used in any
other sentence)
(∀x)(∃y)P(x,y) ⇒ (∀x)P(x, f(x))
since ∃ is within the scope of a universally quantified variable, use a Skolem
function f to construct a new value that depends on the universally
quantified variable
f must be a brand-new function name not occurring in any other sentence in
the KB.
E.g., (∀x)(∃y)loves(x,y) ⇒ (∀x)loves(x,f(x))
In this case, f(x) specifies the person that x loves
Converting sentences to clausal form
6. Remove universal quantifiers by (1) moving them all to the left end;
(2) making the scope of each the entire sentence; and (3) dropping
the “prefix” part
Ex: (∀x)P(x) ⇒ P(x)
7. Put into conjunctive normal form (conjunction of disjunctions) using
distributive and associative laws
(P ∧ Q) ∨ R ⇒ (P ∨ R) ∧ (Q ∨ R)
(P ∨ Q) ∨ R ⇒ (P ∨ Q ∨ R)
8. Split conjuncts into separate clauses
9. Standardize variables so each clause contains only variable names
that do not occur in any other clause
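Of these steps, the distribution in step 7 is the one that benefits most from a worked mechanisation. Below is a minimal Python sketch (the tuple encoding of formulas is my own; steps 1-6 are assumed already applied, and a negated literal can simply be encoded in its atom string, e.g. '~P'):

```python
# Formulas: atoms are strings; ('and', a, b) and ('or', a, b) are binary nodes.
def to_cnf(f):
    """Step 7 only: push 'or' inside 'and' using the distributive law."""
    if isinstance(f, str):
        return f
    op, a, b = f
    a, b = to_cnf(a), to_cnf(b)
    if op == 'or':
        if isinstance(a, tuple) and a[0] == 'and':   # (P ^ Q) v R
            return ('and', to_cnf(('or', a[1], b)), to_cnf(('or', a[2], b)))
        if isinstance(b, tuple) and b[0] == 'and':   # P v (Q ^ R)
            return ('and', to_cnf(('or', a, b[1])), to_cnf(('or', a, b[2])))
    return (op, a, b)

# (P ^ Q) v R  ->  (P v R) ^ (Q v R)
print(to_cnf(('or', ('and', 'P', 'Q'), 'R')))
# ('and', ('or', 'P', 'R'), ('or', 'Q', 'R'))
```

Step 8 then just walks the resulting tree and splits at each top-level 'and'.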
An example
(∀x)(P(x) → ((∀y)(P(y) → P(f(x,y))) ∧ ¬(∀y)(Q(x,y) → P(y))))
2. Eliminate →
(∀x)(¬P(x) ∨ ((∀y)(¬P(y) ∨ P(f(x,y))) ∧ ¬(∀y)(¬Q(x,y) ∨ P(y))))
3. Reduce scope of negation
(∀x)(¬P(x) ∨ ((∀y)(¬P(y) ∨ P(f(x,y))) ∧ (∃y)(Q(x,y) ∧ ¬P(y))))
4. Standardize variables
(∀x)(¬P(x) ∨ ((∀y)(¬P(y) ∨ P(f(x,y))) ∧ (∃z)(Q(x,z) ∧ ¬P(z))))
5. Eliminate existential quantification
(∀x)(¬P(x) ∨ ((∀y)(¬P(y) ∨ P(f(x,y))) ∧ (Q(x,g(x)) ∧ ¬P(g(x)))))
6. Drop universal quantification symbols
(¬P(x) ∨ ((¬P(y) ∨ P(f(x,y))) ∧ (Q(x,g(x)) ∧ ¬P(g(x)))))
Example
7. Convert to conjunction of disjunctions
(¬P(x) ∨ ¬P(y) ∨ P(f(x,y))) ∧ (¬P(x) ∨ Q(x,g(x)))
∧ (¬P(x) ∨ ¬P(g(x)))
8. Create separate clauses
¬P(x) ∨ ¬P(y) ∨ P(f(x,y))
¬P(x) ∨ Q(x,g(x))
¬P(x) ∨ ¬P(g(x))
9. Standardize variables
¬P(x) ∨ ¬P(y) ∨ P(f(x,y))
¬P(z) ∨ Q(z,g(z))
¬P(w) ∨ ¬P(g(w))
Running example
• All Romans who know Marcus either hate Caesar or
think that anyone who hates anyone is crazy
Step 1: Eliminate implications
• Use the fact that x → y is equivalent to ¬x ∨ y
Step 2: Reduce the scope of ¬
• Reduce the scope of negation to a single term, using:
• ¬(¬p) ≡ p
• ¬(a ∧ b) ≡ (¬a ∨ ¬b)
• ¬(a ∨ b) ≡ (¬a ∧ ¬b)
• ¬∀x, p(x) ≡ ∃x, ¬p(x)
• ¬∃x, p(x) ≡ ∀x, ¬p(x)
Step 3: Standardize variables apart
• ∀x, P(x) ∧ ∃x, Q(x)
becomes
∀x, P(x) ∧ ∃y, Q(y)
• This is just to keep the scopes of variables from
getting confused
• Not necessary in our running example
Step 4: Move quantifiers
• Move all quantifiers to the left, without changing their relative
positions
Step 6: Drop the prefix (quantifiers)
• ∀x, ∀y, ∀z, [ ¬Roman(x) ∨ ¬know(x, Marcus) ]
∨ [hate(x, Caesar) ∨ (¬hate(y, z) ∨ thinkCrazy(x, y))]
• At this point, all the quantifiers are universal quantifiers
• We can just take it for granted that all variables are
universally quantified
• [ ¬Roman(x) ∨ ¬know(x, Marcus) ]
∨ [hate(x, Caesar) ∨ (¬hate(y, z) ∨ thinkCrazy(x, y))]
Step 7: Create a conjunction of disjuncts
Step 8: Create separate clauses
• Every place we have an , we break our expression up
into separate pieces
• Not necessary in our running example
Step 9: Standardize apart
• Rename variables so that no two clauses have the same
variable
• Not necessary in our running example
• Final result:
¬Roman(x) ∨ ¬know(x, Marcus)
∨ hate(x, Caesar) ∨ ¬hate(y, z) ∨ thinkCrazy(x, y)
Resolution in first-order logic
• Given sentences
P1 ∨ ... ∨ Pn
Q1 ∨ ... ∨ Qm
• in conjunctive normal form:
• each Pi and Qi is a literal, i.e., a positive or negated predicate symbol with its
terms,
• if Pj and ¬Qk unify with substitution list θ, then derive the resolvent sentence:
subst(θ, P1 ∨ ... ∨ Pj-1 ∨ Pj+1 ∨ ... ∨ Pn ∨ Q1 ∨ ... ∨ Qk-1 ∨ Qk+1 ∨ ... ∨ Qm)
• Example
• from clause P(x, f(a)) ∨ P(x, f(y)) ∨ Q(y)
• and clause ¬P(z, f(a)) ∨ ¬Q(z)
• derive resolvent P(z, f(y)) ∨ Q(y) ∨ ¬Q(z)
• using θ = {x/z}
Resolution refutation
• Given a consistent set of axioms KB and goal sentence Q, show that KB |= Q
• Proof by contradiction: Add ¬Q to KB and try to prove false,
i.e., (KB |- Q) ↔ (KB ∧ ¬Q |- False)
• Resolution is refutation complete: it can establish that a given sentence
Q is entailed by KB, but can’t (in general) be used to generate all logical
consequences of a set of sentences
• Also, it cannot be used to prove that Q is not entailed by KB.
• Resolution won’t always give an answer since entailment is only
semidecidable
• And you can’t just run two proofs in parallel, one trying to prove Q and the
other trying to prove ¬Q, since KB might not entail either one
Refutation resolution proof tree
[Proof tree, flattened in this extract. Clauses: ¬allergies(w) ∨ sneeze(w); ¬cat(y) ∨ ¬allergic-to-cats(z) ∨ allergies(z); cat(Felix); allergic-to-cats(Lise). Negated query: ¬sneeze(Lise). Resolving with the substitutions w/z, then y/Felix, then z/Lise yields sneeze(Lise), which resolves against the negated query to give false.]
We need answers to the following questions
Unification
• Unification is a “pattern-matching” procedure
• Takes two atomic sentences, called literals, as input
• Returns “Failure” if they do not match and a substitution list, θ, if they do
• That is, unify(p,q) = θ means subst(θ, p) = subst(θ, q) for two atomic
sentences, p and q
• θ is called the most general unifier (mgu)
• All variables in the given two literals are implicitly universally
quantified
• To make literals match, replace (universally quantified) variables by
terms
99
17-03-2021 18CSC305J_AI_UNIT3
Unification algorithm
procedure unify(p, q, θ)
Scan p and q left-to-right and find the first corresponding
terms where p and q “disagree” (i.e., p and q not equal)
If there is no disagreement, return θ (success!)
Let r and s be the terms in p and q, respectively,
where disagreement first occurs
If variable(r) then {
Let θ = union(θ, {r/s})
Return unify(subst(θ, p), subst(θ, q), θ)
} else if variable(s) then {
Let θ = union(θ, {s/r})
Return unify(subst(θ, p), subst(θ, q), θ)
} else return “Failure”
end
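The pseudo-code above translates almost directly into Python. The sketch below uses a term encoding of my own (variables are strings prefixed with '?'; compound terms are tuples of a functor followed by its arguments) and builds in the occurs check:

```python
def is_var(t):
    """Variables are strings starting with '?'."""
    return isinstance(t, str) and t.startswith('?')

def subst(theta, t):
    """Apply substitution list theta to term t, following binding chains."""
    if is_var(t) and t in theta:
        return subst(theta, theta[t])
    if isinstance(t, tuple):
        return tuple(subst(theta, a) for a in t)
    return t

def occurs(v, t):
    """Occurs check: does variable v appear inside term t?"""
    return v == t or (isinstance(t, tuple) and any(occurs(v, a) for a in t))

def unify(p, q, theta=None):
    """Return an mgu of p and q as a dict, or None on failure."""
    theta = {} if theta is None else theta
    p, q = subst(theta, p), subst(theta, q)
    if p == q:
        return theta                       # no disagreement: success
    if is_var(p):
        return None if occurs(p, q) else {**theta, p: q}
    if is_var(q):
        return None if occurs(q, p) else {**theta, q: p}
    if isinstance(p, tuple) and isinstance(q, tuple) and len(p) == len(q):
        for a, b in zip(p, q):             # scan left-to-right
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None                            # constant/constant mismatch

# parents(x, father(x), mother(Bill))  vs  parents(Bill, father(y), z)
print(unify(('parents', '?x', ('father', '?x'), ('mother', 'Bill')),
            ('parents', 'Bill', ('father', '?y'), '?z')))
# {'?x': 'Bill', '?y': 'Bill', '?z': ('mother', 'Bill')}
```

The three examples on the later "Unification examples" slide all reproduce under this sketch, including the final failure case.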
Unification: Remarks
• Unify is a linear-time algorithm that returns the most
general unifier (mgu), i.e., the shortest-length substitution
list that makes the two literals match.
• In general, there is not a unique minimum-length
substitution list, but unify returns one of minimum length
• A variable can never be replaced by a term containing that
variable
Example: x/f(x) is illegal.
• This “occurs check” should be done in the above pseudo-code before making the recursive calls
Unification examples
• Example:
• parents(x, father(x), mother(Bill))
• parents(Bill, father(Bill), y)
• {x/Bill, y/mother(Bill)}
• Example:
• parents(x, father(x), mother(Bill))
• parents(Bill, father(y), z)
• {x/Bill, y/Bill, z/mother(Bill)}
• Example:
• parents(x, father(x), mother(Jane))
• parents(Bill, father(y), mother(y))
• Failure
Resolution example
Practice example : Did Curiosity kill the cat
• Jack owns a dog. Every dog owner is an animal lover. No animal lover
kills an animal. Either Jack or Curiosity killed the cat, who is named
Tuna. Did Curiosity kill the cat?
• These can be represented as follows:
A. (∃x) Dog(x) ∧ Owns(Jack, x)
B. (∀x) ((∃y) Dog(y) ∧ Owns(x, y)) → AnimalLover(x)
C. (∀x) AnimalLover(x) → ((∀y) Animal(y) → ¬Kills(x, y))
D. Kills(Jack, Tuna) ∨ Kills(Curiosity, Tuna)
E. Cat(Tuna)
F. (∀x) Cat(x) → Animal(x)
G. Kills(Curiosity, Tuna) (GOAL)
• Convert to clause form (D is a Skolem constant):
A1. Dog(D)
A2. Owns(Jack, D)
B. ¬Dog(y) ∨ ¬Owns(x, y) ∨ AnimalLover(x)
C. ¬AnimalLover(a) ∨ ¬Animal(b) ∨ ¬Kills(a, b)
D. Kills(Jack, Tuna) ∨ Kills(Curiosity, Tuna)
E. Cat(Tuna)
F. ¬Cat(z) ∨ Animal(z)
• Add the negation of the query:
¬G. ¬Kills(Curiosity, Tuna)
• The resolution refutation proof
R1: ¬G, D, {} ⇒ Kills(Jack, Tuna)
R2: R1, C, {a/Jack, b/Tuna} ⇒ ¬AnimalLover(Jack) ∨ ¬Animal(Tuna)
R3: R2, B, {x/Jack} ⇒ ¬Dog(y) ∨ ¬Owns(Jack, y) ∨ ¬Animal(Tuna)
R4: R3, A1, {y/D} ⇒ ¬Owns(Jack, D) ∨ ¬Animal(Tuna)
R5: R4, A2, {} ⇒ ¬Animal(Tuna)
R6: R5, F, {z/Tuna} ⇒ ¬Cat(Tuna)
R7: R6, E, {} ⇒ FALSE
• The proof tree (abbreviations: K = Kills, AL = AnimalLover, A = Animal, D = Dog, O = Owns, C = Cat, J = Jack, T = Tuna):
¬G + D, {} ⇒ R1: K(J,T)
R1 + C, {a/J, b/T} ⇒ R2: ¬AL(J) ∨ ¬A(T)
R2 + B, {x/J} ⇒ R3: ¬D(y) ∨ ¬O(J,y) ∨ ¬A(T)
R3 + A1, {y/D} ⇒ R4: ¬O(J,D) ∨ ¬A(T)
R4 + A2, {} ⇒ R5: ¬A(T)
R5 + F, {z/T} ⇒ R6: ¬C(T)
R6 + E, {} ⇒ R7: FALSE
Knowledge and Reasoning
Table of Contents
• Knowledge and reasoning-Approaches and issues of knowledge reasoning-
Knowledge base agents
• Logic Basics-Logic-Propositional logic-syntax ,semantics and inferences-
Propositional logic- Reasoning patterns
• Unification and Resolution
• Knowledge representation using rules-Knowledge representation using semantic
nets
• Knowledge representation using frames-Inferences-
• Uncertain Knowledge and reasoning-Methods-Bayesian probability and belief
network
• Probabilistic reasoning-Probabilistic reasoning over time-Probabilistic reasoning
over time
• Other uncertain techniques-Data mining-Fuzzy logic-Dempster-Shafer theory
Production Rules
• Condition-Action Pairs
• IF this condition (or premise or antecedent) occurs,
THEN some action (or result, or conclusion, or
consequence) will (or should) occur
• IF the traffic light is red AND you have stopped,
THEN a right turn is OK
Production Rules
• Each production rule in a knowledge base represents an
autonomous chunk of expertise
• When combined and fed to the inference engine, the set of rules
behaves synergistically
• Rules can be viewed as a simulation of the cognitive behaviour
of human experts
• Rules represent a model of actual human behaviour
• Predominant technique used in expert systems, often in
conjunction with frames
Forms of Rules
• IF premise, THEN conclusion
• IF your income is high, THEN your chance of being
audited by the Inland Revenue is high
• Conclusion, IF premise
• Your chance of being audited is high, IF your income
is high
Forms of Rules
• Inclusion of ELSE
• IF your income is high, OR your deductions are unusual, THEN
your chance of being audited is high, OR ELSE your chance of
being audited is low
• More complex rules
• IF credit rating is high AND salary is more than £30,000, OR
assets are more than £75,000, AND pay history is not "poor,"
THEN approve a loan up to £10,000, and list the loan in category
"B.”
• Action part may have more information: THEN "approve the loan"
and "refer to an agent"
Characteristics of Rules
• First part (the IF part): one or more statements joined by AND; all conditions must be true for the conclusion to be true
• Second part (the THEN part): the conclusion or action
[Figure: production-system architecture. Rules C1→A1, C2→A2, ..., Cn→An are matched against Working Memory, which is fed from the Environment; the matching rules form the Conflict Set, from which Conflict Resolution selects the rule to fire.]
Recognise-Act Cycle
• Patterns in WM matched against production rule conditions
• Matching (activated) rules form the conflict set
• One of the matching rules is selected (conflict resolution) and
fired
• Action of rule is performed
• Contents of WM updated
• Cycle repeats with updated WM
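The cycle above fits in a short loop. In this Python sketch, the rule names and the trivial "first match wins" conflict-resolution strategy are illustrative choices of mine, not from the slides:

```python
# Rules: (name, conditions, action); firing a rule adds one fact to WM.
rules = [
    ("stop-then-turn", {"light-red", "stopped"}, "right-turn-ok"),
    ("red-means-stop", {"light-red"}, "must-stop"),
]

def recognise_act(wm, rules):
    """Match rule conditions against working memory (WM), build the
    conflict set, select one rule, fire it, update WM; repeat until
    no activated rule would add a new fact."""
    fired = []
    while True:
        conflict_set = [r for r in rules
                        if r[1] <= wm and r[2] not in wm]
        if not conflict_set:
            return wm, fired
        name, _conds, action = conflict_set[0]   # conflict resolution
        wm.add(action)                           # act: update WM
        fired.append(name)

wm, fired = recognise_act({"light-red", "stopped"}, rules)
print(fired)  # ['stop-then-turn', 'red-means-stop']
```

Swapping the `conflict_set[0]` selection for an agenda ordered by salience, recency, or specificity gives the strategies discussed on the following slides.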
Conflict Resolution
• Reasoning in a production system can be viewed as a type of
search
• Selection strategy for rules from the conflict set controls
search
• Production system maintains the conflict set as an agenda
• Ordered list of activated rules (those with their conditions
satisfied) which have not yet been executed
• Conflict resolution strategy determines where a newly-
activated rule is inserted
Salience
• Rules may be given a precedence order by assigning a
salience value
• Newly activated rules are placed in the agenda above all rules
of lower salience, and below all rules with higher salience
• Rule with higher salience are executed first
• Conflict resolution strategy applies between rules of the
same salience
• If salience and the conflict resolution strategy can’t
determine which rule is to be executed next, a rule is chosen
at random from the most highly ranked rules
Conflict Resolution Strategies
• Depth-first: newly activated rules placed above other rules in the
agenda
• Breadth-first: newly activated rules placed below other rules
• Specificity: rules ordered by the number of conditions in the LHS
(simple-first or complex-first)
• Least recently fired: fire the rule that was last fired the longest time
ago
• Refraction: don’t fire a rule unless the WM patterns that match its
conditions have been modified
• Recency: rules ordered by the timestamps on the facts that match
their conditions
Salience
• Salience facilitates the modularization of expert systems
in which modules work at different levels of abstraction
• Over-use of salience can complicate a system
• Explicit ordering to rule execution
• Makes behaviour of modified systems less predictable
• Rule of thumb: if two rules have the same salience, are
in the same module, and are activated concurrently,
then the order in which they are executed should not
matter
Common Types of Rules
• Knowledge rules, or declarative rules, state all the facts
and relationships about a problem
• Inference rules, or procedural rules, advise on how to
solve a problem, given that certain facts are known
• Inference rules contain rules about rules (metarules)
• Knowledge rules are stored in the knowledge base
• Inference rules become part of the inference engine
[Figure: a semantic network. Animal (can breathe, can eat, has skin) has is-a links from Ostrich (runs fast, cannot fly, is tall) and Fish (can swim, has fins, has gills); Salmon (swims upstream, is pink, is edible) is-a Fish.]
[Figure: a semantic network of instances. SNOOPY is an instance of BEAGLE (size: small) and LASSIE an instance of COLLIE; both are also instances of FICTIONAL CHARACTER, and SNOOPY is a friend of CHARLIE BROWN.]
Semantic Networks
What does or should a node represent?
• A class of objects?
• An instance of an class?
• The canonical instance of a class?
• The set of all instances of a class?
DOG
Fixed: legs: 4
Default: diet: carnivorous; sound: bark
Variable: size; colour

COLLIE
Fixed: breed of: DOG; type: sheepdog
Default: size: 65cm
Variable: colour
ELEPHANT
subclass: MAMMAL
colour: grey
size: large
Nellie
instance: ELEPHANT
likes: apples
• elephant(clyde)
∴
mammal(clyde)
has_part(clyde, head)
ELEPHANT
subclass: MAMMAL
has_trunk: yes
*colour: grey
*size: large
*furry: no
Clyde
instance: ELEPHANT
colour: pink
owner: Fred
Nellie
instance: ELEPHANT
size: small
Frames (Contd.)
• Can represent subclass and instance relationships (both
sometimes called ISA or “is a”)
• Properties (e.g. colour and size) can be referred to as slots and
slot values (e.g. grey, large) as slot fillers
• Objects can inherit all properties of parent class (therefore
Nellie is grey and large)
• But can inherit properties which are only typical (usually called
default, here starred), and can be overridden
• For example, mammal is typically furry, but this is not so for an
elephant
• Deduction
• Induction
• Abduction
• In real life, it is not always possible to determine the state of the environment as it might not be clear. Due
to partially observable or non-deterministic environments, agents may need to handle uncertainty and deal
with it.
• Uncertain data: Data that is missing, unreliable, inconsistent or noisy
• Uncertain knowledge: When the available knowledge has multiple causes leading to multiple effects or
incomplete knowledge of causality in the domain
• Uncertain knowledge representation: Representations which provide only a restricted model of the real
system, or have limited expressiveness
• Inference: In case of incomplete or default reasoning methods, conclusions drawn might not be completely
accurate. Let’s understand this better with the help of an example.
• IF primary infection is bacteremia
• AND site of infection is sterile
• AND entry point is gastrointestinal tract
• THEN organism is bacteriod (0.7).
• In such uncertain situations, the agent does not guarantee a solution but acts on its own
assumptions and probabilities and gives some degree of belief that it will reach the required
solution.
• For example, in case of medical diagnosis consider the rule Toothache → Cavity. This
is not complete, as not all patients having toothache have cavities. So we can write a
more generalized rule Toothache → Cavity ∨ Gum problems ∨ Abscess… To make this
rule complete, we would have to list all the possible causes of toothache. But this is not
feasible for the following reasons:
• Laziness- It will require a lot of effort to list the complete set of antecedents and
consequents to make the rules complete.
• Theoretical ignorance- Medical science does not have complete theory for the
domain
• Practical ignorance- It might not be practical that all tests have been or can be
conducted for the patients.
• Such uncertain situations can be dealt with using
Probability theory
Truth Maintenance systems
Fuzzy logic.
17-03-2021 18CSC305J_AI_UNIT3 152
Uncertain knowledge and reasoning
Probability
• Probability is the degree of likelihood that an event will occur. It provides a certain degree of belief
in case of uncertain situations. It is defined over a set of events U and assigns value P(e) i.e.
probability of occurrence of event e in the range [0,1]. Here each sentence is labeled with a real
number in the range of 0 to 1, 0 means the sentence is false and 1 means it is true.
• Conditional Probability or Posterior Probability is the probability of event A given that B has
already occurred.
• P(A|B) = (P(B|A) * P(A)) / P(B)
• For example, P(It will rain tomorrow| It is raining today) represents conditional probability of it
raining tomorrow as it is raining today.
• P(A|B) + P(NOT(A)|B) = 1
• Joint probability is the probability of 2 independent events happening simultaneously like rolling
two dice or tossing two coins together. For example, Probability of getting 2 on one dice and 6 on
the other is equal to 1/36. Joint probability has a wide use in various fields such as physics,
astronomy, and comes into play when there are two independent events. The full joint probability
distribution specifies the probability of each complete assignment of values to random variables.
Bayes Theorem
• It is based on the principle that every pair of features being
classified is independent of each other. It calculates probability
P(A|B) where A is class of possible outcomes and B is given
instance which has to be classified.
• P(A|B) = P(B|A) * P(A) / P(B)
• P(A|B) = Probability that A is happening, given that B has
occurred (posterior probability)
• P(A) = prior probability of class
• P(B) = prior probability of predictor
• P(B|A) = likelihood
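Bayes' theorem is one line of arithmetic, so a quick sanity check is easy. The numbers below are illustrative only, not from the slides: a test with likelihood P(B|A) = 0.9, prior P(A) = 0.01, and evidence P(B) = 0.05.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Posterior P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Illustrative numbers: likelihood 0.9, prior 0.01, evidence 0.05
posterior = bayes(0.9, 0.01, 0.05)
print(round(posterior, 4))  # 0.18
```

Note how a strong likelihood (0.9) still yields a modest posterior (0.18) when the prior is small, which is exactly the update behaviour the theorem formalises.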
Uncertain knowledge and reasoning
CONDITIONAL PROBABILITY
• The Bayesian network has mainly two components:
Causal Component
Actual numbers
• Each node in the Bayesian network has a conditional probability
distribution P(Xi | Parents(Xi)), which determines the effect of the
parents on that node.
• Bayesian network is based on Joint probability distribution and
conditional probability. So let's first understand the joint probability
distribution:
Problem:
• Calculate the probability that the alarm has sounded, but there is neither a burglary nor an
earthquake, and David and Sophia both called Harry.
Solution:
• The Bayesian network for the above problem is given below. The network structure shows
that burglary and earthquake are the parent nodes of the alarm, directly affecting the probability
of the alarm going off, while David's and Sophia's calls depend only on the alarm.
• The network represents the assumptions that David and Sophia do not directly perceive the burglary,
do not notice minor earthquakes, and do not confer with each other before calling.
• The conditional distributions for each node are given as conditional probabilities table or CPT.
• Each row in the CPT must sum to 1 because the entries in the row represent an exhaustive
set of cases for the variable.
• In a CPT, a boolean variable with k boolean parents needs 2^k rows of probabilities. Hence, if there are
two parents, the CPT will contain 4 rows of probability values.
Conditional probability table for Sophia Calls:
The conditional probability that Sophia calls depends on its parent node "Alarm."

A      P(S = True)   P(S = False)
True   0.75          0.25
False  0.02          0.98
Bayesian probability and belief network
• From the formula of joint distribution, we can write the problem statement in the form of
probability distribution:
• P(S, D, A, ¬B, ¬E) = P (S|A) *P (D|A)*P (A|¬B ^ ¬E) *P (¬B) *P (¬E).
= 0.75* 0.91* 0.001* 0.998*0.999
= 0.00068045.
Hence, a Bayesian network can answer any query about the domain by using Joint distribution.
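The arithmetic of the query above can be checked directly in Python; the factor values are the ones given on the slide, and `math.prod` simply multiplies them:

```python
from math import prod

# P(S, D, A, ~B, ~E) = P(S|A) * P(D|A) * P(A|~B,~E) * P(~B) * P(~E)
factors = {
    "P(S|A)":     0.75,
    "P(D|A)":     0.91,
    "P(A|~B,~E)": 0.001,
    "P(~B)":      0.998,
    "P(~E)":      0.999,
}
p = prod(factors.values())
print(round(p, 8))  # 0.00068045
```

This is the chain-rule decomposition of the full joint: one factor P(X | Parents(X)) per node, so the whole query never needs the exponentially large joint table.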
• The semantics of Bayesian Network:
• There are two ways to understand the semantics of the Bayesian network, which is given below:
1. To understand the network as the representation of the Joint probability distribution.
• It is helpful to understand how to construct the network.
2. To understand the network as an encoding of a collection of conditional independence
statements.
• It is helpful in designing inference procedure.
Bayes' theorem:
• Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian
reasoning, which determines the probability of an event with uncertain
knowledge.
• In probability theory, it relates the conditional probability and marginal
probabilities of two random events.
• Bayes' theorem was named after the British mathematician Thomas Bayes.
The Bayesian inference is an application of Bayes' theorem, which is
fundamental to Bayesian statistics.
• It is a way to calculate the value of P(B|A) with the knowledge of P(A|B).
• Bayes' theorem allows updating the probability prediction of an event by
observing new information of the real world.
Bayesian Network
When designing a Bayesian Network, we keep
the local probability table at each node.
Bayesian Network - Example
Consider a Bayesian Network as given below: [network figure omitted in this extract]
The updated Bayesian Network is: [figure omitted]
Fuzzy Input
Fuzzy Output
Fuzzification
⚫ Establishes the fact base of the fuzzy system. It identifies the input and output of the
system, defines appropriate IF THEN rules, and uses raw data to derive a
membership function.
⚫ Consider an air conditioning system that determines the best circulation level by
sampling temperature and moisture levels. The inputs are the current temperature
and moisture level. The fuzzy system outputs the best air circulation level: “none”,
“low”, or “high”. The following fuzzy rules are used:
1. If the room is hot, circulate the air a lot.
2. If the room is cool, do not circulate the air.
3. If the room is cool and moist, circulate the air slightly.
⚫ A knowledge engineer determines membership functions that map temperatures
to fuzzy values and map moisture measurements to fuzzy values.
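One such membership function might be sketched as a simple ramp; the 15 °C and 30 °C breakpoints below are illustrative assumptions, not values from the example:

```python
def hot(temp_c, low=15.0, high=30.0):
    """Ramp membership function for "hot": 0 at or below `low`,
    1 at or above `high`, linear in between. Breakpoints are illustrative."""
    if temp_c <= low:
        return 0.0
    if temp_c >= high:
        return 1.0
    return (temp_c - low) / (high - low)

print(hot(25.5))  # → 0.7
```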
Inference
⚫ Evaluates all rules and determines their truth values. If an input does not
precisely correspond to an IF THEN rule, partial matching of the input data is
used to interpolate an answer.
⚫ Continuing the example, suppose that the system has measured temperature
and moisture levels and mapped them to the fuzzy values of .7 and .1
respectively. The system now infers the truth of each fuzzy rule.
⚫ To do this, a simple method called MAX-MIN is used. The MIN step sets the
fuzzy value of the THEN clause to the fuzzy value of the IF clause, taking the
minimum over the terms of a conjunctive IF clause. Thus, the method infers
fuzzy values of 0.7, 0.1, and 0.1 for rules 1, 2, and 3 respectively.
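The MIN step can be sketched as follows; the degree of "cool" is not stated explicitly in the example, so it is assumed to be 0.1, consistent with the inferred rule values:

```python
# Measured fuzzy values from the example.
hot, moist = 0.7, 0.1
cool = 0.1  # assumed, consistent with the example's inferred values

# MIN inference: a rule's truth value is the minimum of its IF-clause terms.
rule1 = hot               # IF hot THEN circulate a lot
rule2 = cool              # IF cool THEN do not circulate
rule3 = min(cool, moist)  # IF cool AND moist THEN circulate slightly

print(rule1, rule2, rule3)  # → 0.7 0.1 0.1
```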
Composition
⚫ Combines all fuzzy conclusions obtained by inference into a single conclusion.
Since different fuzzy rules might have different conclusions, all rules must be considered.
⚫ Continuing the example, each inference suggests a different action
⚫ rule 1 suggests a "high" circulation level
⚫ rule 2 suggests turning off air circulation
⚫ rule 3 suggests a "low" circulation level.
⚫ A simple MAX-MIN method of selection is used where the maximum fuzzy value
of the inferences is used as the final conclusion. So, composition selects a fuzzy
value of 0.7 since this was the highest fuzzy value associated with the inference
conclusions.
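The MAX step can be sketched directly from the inferred values:

```python
# Inferred truth values for rules 1-3 (from the example).
inferences = {"high": 0.7, "none": 0.1, "low": 0.1}

# MAX composition: the conclusion with the largest fuzzy value wins.
conclusion = max(inferences, key=inferences.get)
print(conclusion, inferences[conclusion])  # → high 0.7
```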
Defuzzification
⚫ Converts the fuzzy value obtained from composition into a “crisp” value. This
process is often complex since the fuzzy set might not translate directly into a
crisp value. Defuzzification is necessary, since controllers of physical systems
require discrete signals.
⚫ Continuing the example, composition outputs a fuzzy value of 0.7. This
imprecise value is not directly useful since the air circulation levels are “none”,
“low”, and “high”. The defuzzification process converts the fuzzy output of
0.7 into one of the air circulation levels. In this case it is clear that a fuzzy
output of 0.7 indicates that the circulation should be set to “high”.
Defuzzification
⚫ There are many defuzzification methods. Two of the more common
techniques are the centroid and maximum methods.
⚫ In the centroid method, the crisp value of the output variable is
computed by finding the variable value of the center of gravity of the
membership function for the fuzzy value.
⚫ In the maximum method, one of the variable values at which the fuzzy
subset has its maximum truth value is chosen as the crisp value for the
output variable.
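A discretized centroid computation can be sketched as follows; the sample points and membership values below are illustrative, not from the example:

```python
# Centroid defuzzification over a sampled output universe:
# crisp value = sum(x * mu(x)) / sum(mu(x)).
xs  = [0, 10, 20, 30, 40]        # illustrative output values
mus = [0.0, 0.1, 0.7, 0.7, 0.2]  # illustrative clipped memberships

centroid = sum(x * m for x, m in zip(xs, mus)) / sum(mus)
print(round(centroid, 2))  # → 25.88
```

The maximum method would instead return an x at which mu(x) peaks (here 20 or 30).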
Example: Design of Fuzzy Expert System – Washing
Machine
[Figure: input membership functions; input X1 fires membership degrees 0.7 and 0.2]
Demystifying AI algorithms
DeFuzzification
Output membership functions (washing time Y, in minutes):
Washing Time Long = (Y - 30)/(40 - 30)
Washing Time Medium = (Y - 20)/(30 - 20)
[Figure: output membership functions over washing times 5, 10, 20, 30, 40, 60 minutes]
With rule firing strength (X1 AND X2) = 0.5, set the Medium membership equal to it:
(Y - 20)/(30 - 20) = 0.5, so Y - 20 = 0.5 * 10 = 5
Y = 25 Mins
Knowledge and Reasoning
Table of Contents
• Knowledge and reasoning-Approaches and issues of knowledge reasoning-
Knowledge base agents
• Logic Basics-Logic-Propositional logic-syntax ,semantics and inferences-
Propositional logic- Reasoning patterns
• Unification and Resolution-Knowledge representation using rules-Knowledge
representation using semantic nets
• Knowledge representation using frames-Inferences-
• Uncertain Knowledge and reasoning-Methods-Bayesian probability and belief
network
• Probabilistic reasoning-Probabilistic reasoning over time
• Other uncertain techniques-Data mining-Fuzzy logic-Dempster-Shafer theory
Dempster Shafer Theory
• Dempster-Shafer Theory was developed by Arthur P. Dempster in 1967 and
extended by his student Glenn Shafer in 1976.
The theory was developed for the following reasons:
• Bayesian theory is only concerned about single evidences.
• Bayesian probability cannot describe ignorance.
• DST is an evidence theory, it combines all possible outcomes of the problem.
Hence it is used to solve problems where there may be a chance that a different
evidence will lead to some different result.
The uncertainty in this model is given by:
• All possible outcomes are considered.
• Belief expresses support for a possibility based on the available evidence.
• Plausibility measures how compatible the evidence is with a possible outcome.
For the frame of discernment ϴ = {B, J, S}, the power set is
P(ϴ) = {∅, {B}, {J}, {S}, {B,J}, {B,S}, {S,J}, {B,J,S}}.
Detectives, after examining the crime scene, assign mass probabilities to various
elements of the power set:
Event Mass
No one is guilty 0
B is guilty 0.1
J is guilty 0.2
S is guilty 0.1
Either B or J is guilty 0.1
Either B or S is guilty 0.1
Either S or J is guilty 0.3
One of the 3 is guilty 0.1
Dempster Shafer Problem
Belief in A:
The belief in an element A of the power set is the sum of the masses of
elements which are subsets of A (including A itself)
Ex: Given A= {q1, q2, q3}
Bel(A) = m(q1) + m(q2) + m(q3) + m(q1,q2) + m(q2,q3) + m(q1,q3) + m(q1,q2,q3)
Ex: Given the above mass assignments,
Bel(B) = m(B) =0.1
Bel(B,J) = m(B) + m(J) + m(B,J) = 0.1 + 0.2 + 0.1 = 0.4
RESULT:
A:      {B}   {J}   {S}   {B,J}  {B,S}  {S,J}  {B,J,S}
m(A):   0.1   0.2   0.1   0.1    0.1    0.3    0.1
Bel(A): 0.1   0.2   0.1   0.4    0.3    0.6    1.0
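The belief values can be recomputed from the mass assignments of the crime-scene example; the single-letter set encoding below is my own:

```python
# Mass assignments from the crime-scene example, keyed by suspect sets.
m = {
    frozenset("B"): 0.1, frozenset("J"): 0.2, frozenset("S"): 0.1,
    frozenset("BJ"): 0.1, frozenset("BS"): 0.1, frozenset("SJ"): 0.3,
    frozenset("BJS"): 0.1,
}

def bel(a):
    """Belief in A: sum of masses of all subsets of A that carry mass."""
    a = frozenset(a)
    return round(sum(v for s, v in m.items() if s <= a), 10)

print(bel("B"), bel("BJ"), bel("BJS"))  # → 0.1 0.4 1.0
```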