Module 4
Unification
Forward Chaining
Representation languages are systems used to store and reason about knowledge. They
enable us to describe, infer, and derive conclusions about the world. This section highlights
the transition from propositional logic (covered earlier) to first-order logic (FOL), which is
more expressive and capable of handling complex knowledge.
Programming languages (e.g., C++, Java, Lisp) are often used to represent computational
processes and data. However, they have significant limitations when used as general-purpose
representation languages:
Strengths:
o Statements like World[2,2] ← Pit can describe specific facts, such as a pit
being in square [2,2].
Weaknesses:
o They lack general mechanisms for deriving facts from other facts. This means
updates to data structures are domain-specific and rely on the programmer's
knowledge.
o They cannot easily handle partial information (e.g., uncertainty or disjunction,
like "There is a pit in [2,2] or [3,1]").
These limitations make programming languages less suited for domain-independent
knowledge representation than declarative systems like propositional logic.
Propositional Logic
Propositional logic is a declarative language with key properties that make it useful for
knowledge representation:
1. Declarative Nature:
2. Expressiveness:
o For example: "There is a pit in [2,2] or [3,1]" can be written as P2,2 ∨ P3,1.
3. Compositionality:
o For example: S1,4 ∧ S1,2 means "stench in square [1,4] and square [1,2],"
combining the meanings of S1,4 and S1,2.
Its main weakness is limited expressiveness for environments with many objects or
relationships. For instance, describing rules like "Squares adjacent to pits are breezy"
requires multiple individual sentences, one per square, leading to redundancy.
To address the limitations of propositional logic, first-order logic was developed. FOL
introduces:
Quantifiers: Universal (∀) and existential (∃) quantifiers for concise representation.
o Example: "All squares adjacent to pits are breezy" can be written as a single
quantified sentence.
1. Ambiguity:
o Words can have multiple meanings depending on context. For example,
"spring" could mean a season or a coiled object.
2. Context Dependence:
o Sentences like "Look!" rely heavily on the surrounding context for meaning.
Without storing context, it is difficult to infer their meaning in a knowledge
base.
3. Sapir-Whorf Hypothesis:
Guugu Yimithirr speakers, who lack relative directions (e.g., left, right),
excel at navigating open terrain using absolute directions (e.g., north,
south).
Recent studies suggest that humans process language and store knowledge using nonverbal
representations:
fMRI Studies:
o Brain imaging reveals that words and concepts activate specific neural
patterns.
o Computers trained on fMRI data can predict words people think of,
demonstrating shared representations across individuals.
While formal logic treats different representations of the same knowledge as equivalent,
practical reasoning systems favor certain representations because they are more efficient.
For example:
Succinctness: More concise representations lead to faster reasoning.
Learning: Representation choice influences how effectively systems learn from data.
o While propositional logic is simple and unambiguous, it lacks the richness to
describe relationships or general rules (e.g., "All neighboring squares are
smelly").
o Enhanced logic systems like first-order logic (FOL) build on propositional logic
to address these limitations.
Natural language is expressive but ambiguous. Logic can mimic its expressiveness without
ambiguity. Key elements borrowed include:
Objects:
Relations:
o Examples:
Functions:
o Property: smelly.
First-Order Logic:
Temporal Logic:
Probability Theory:
o Represents beliefs about facts as probabilities (e.g., 0.75 belief that the
wumpus is in [1,3]).
Fuzzy Logic:
Higher-Order Logic:
o Extends first-order logic to treat relations and functions as objects.
Special-Purpose Logics:
o Tailored for specific domains (e.g., temporal logic for reasoning about time).
What is a Model?
A model is a formal structure that represents a possible world and allows us to evaluate the
truth of logical sentences. It links the vocabulary of FOL (symbols) to elements in the
possible world.
Propositional Logic Models: Link proposition symbols to predefined truth values.
FOL Models:
o Include objects in a domain.
Domain of a Model
The domain is the set of objects (also called domain elements) in the model.
Requirement: The domain must be non-empty (there must be at least one object).
Example:
o The domain contains: Richard the Lionheart, King John, their left legs, and a
crown.
Functions are specialized relations that map objects uniquely to other objects.
Functions in FOL must be total, meaning every input has an output.
Syntax of FOL
o Example: LeftLeg.
Arity
Interpretation
A model's interpretation links symbols to specific objects, relations, or functions in
the domain.
Example Interpretation
1. Constant Symbols:
2. Predicate Symbols:
3. Function Symbols:
o LeftLeg → Mapping:
Example:
o There are 25 possible interpretations for the two constants (Richard and
John) over five objects in the domain, since each constant can name any of
the five objects (5 × 5 = 25).
Naming Objects
Example: The crown and left legs may remain unnamed in an interpretation.
Overlapping Names
Example: Both Richard and John could refer to the crown in one interpretation.
Complexity of FOL Models
Example:
FOL requires more advanced techniques than propositional logic for reasoning.
8.2.3 Terms
Definition of a term:
A term in logic is an expression that refers to an object. For example:
o A constant symbol refers to a specific object. (e.g., John refers to King John.)
Complex terms:
A complex term combines a function symbol with a list of arguments.
Example: LeftLeg(John) refers to King John's left leg.
Key points:
Semantics of terms:
The arguments t1, ..., tn refer to objects in the domain of the model.
Examples:
o The relation represented by the predicate holds for the objects referred to
by the terms.
Interpretation:
o A universally quantified sentence is true if the predicate holds for every
object in the domain of discourse.
o Example:
For the domain {Richard, John, Richard's left leg, John's left leg, crown}, the
sentence ∀x King(x) ⇒ Person(x) means:
Objects like legs and crowns are irrelevant since they are not kings
(the premise King(x) is false for them).
o Truth table of ⇒:
This ensures the sentence asserts the conclusion only for objects that
satisfy the premise.
Common Mistake:
o Using ∧ instead of ⇒ asserts that every object in the domain is both a king
and a person (which is clearly incorrect for legs and crowns).
Existential quantification is used to assert that a statement holds true for at least one object
in a given domain. The symbol ∃ (pronounced "there exists") represents this quantifier.
For example:
∃x Crown(x) ∧ OnHead(x, John)
This sentence can be read as: "There exists an x such that x is a crown and x is on John's
head."
Formal Semantics
The meaning of ∃x P (where P is any logical expression) is that P is true in at least one
extended interpretation of the model where x refers to a specific domain element. This
means:
x could refer to "Richard the Lionheart," "King John," "Richard's left leg," "John's left
leg," or "the crown."
Since one valid interpretation exists (the crown is on John's head), the existential statement
is true.
Key Connective
The main logical connective for existential quantifiers is conjunction (∧) because we are
asserting that some x satisfies all parts of the statement.
Using ⇒ (implication) with ∃ often leads to weak or ambiguous statements. For example:
∃x (Crown(x) ⇒ OnHead(x, John)). This translates to:
In practice, this formulation does not effectively express "there exists a crown on John's
head" because it is true whenever any object in the domain is not a crown, regardless of
whether any crown exists.
Logical statements often involve multiple quantifiers, which can be of the same type (e.g.,
all universal or all existential) or mixed.
Here:
∀x ∀y is shorthand for saying the statement applies to all possible combinations of x
and y.
For statements mixing universal and existential quantifiers, order matters.
Example 1:
"Everybody loves somebody."
∀x ∃y Loves(x, y)
This means: For every person x, there exists someone y such that x loves y.
Example 2:
"There is someone who is loved by everyone."
∃y ∀x Loves(x, y)
This means: There exists a person y such that every person x loves y.
The difference:
In Example 1, the quantifiers assert that each person has someone they love.
In Example 2, the quantifiers assert that there is one person loved by all.
Variable Scope
1. ∀x P is equivalent to ¬∃x ¬P:
"For all x, P" is the same as "There does not exist an x for which P is false."
2. ∃x P is equivalent to ¬∀x ¬P:
"There exists an x for which P" is the same as "It is not the case that P is false for all
x."
∀x ¬P(x) ≡ ¬∃x P(x)
¬∀x P(x) ≡ ∃x ¬P(x)
∀x P(x) ≡ ¬∃x ¬P(x)
∃x P(x) ≡ ¬∀x ¬P(x)
These equivalences show that ∀ and ∃ can each be expressed in terms of the other, just as
logical conjunction (∧) and disjunction (∨) are duals.
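These dualities are easy to check mechanically on a finite domain. The following sketch does so in Python; the domain and the predicate P are illustrative assumptions, not from the text.

```python
# Verify the quantifier dualities on a small finite domain, where
# "for all" is all() and "there exists" is any().
domain = ["Richard", "John", "crown"]

def P(x):
    return x != "crown"  # arbitrary example predicate

forall = all(P(x) for x in domain)
exists = any(P(x) for x in domain)

# ∀x P(x) ≡ ¬∃x ¬P(x)
assert forall == (not any(not P(x) for x in domain))
# ∃x P(x) ≡ ¬∀x ¬P(x)
assert exists == (not all(not P(x) for x in domain))
print("dualities hold on this domain")
```

Here `forall` is false (the crown fails P) while `exists` is true, and both duality identities still hold, as they must on any domain.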
8.2.7 Equality
First-order logic (FOL) allows comparisons between terms to determine if they refer to the
same object. This is done using the equality symbol (=). For example:
If Father(John) and Henry both refer to the same object in the model, the statement
Father(John) = Henry is true.
Otherwise, it is false.
To assert that two terms are not the same object, we use negation:
¬(x = y), or the shorthand x ≠ y.
For instance, to express that Richard has at least two brothers, we write:
∃x, y Brother(x, Richard) ∧ Brother(y, Richard) ∧ ¬(x = y)
This ensures:
Sometimes, simple assertions using equality may lead to incorrect interpretations. For
example:
∃x, y (Brother(x, Richard) ∧ Brother(y, Richard)) does not guarantee Richard has two
distinct brothers. If both x and y are assigned the same value (e.g., King John), the
statement can still hold true.
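This pitfall can be demonstrated by model checking over a finite domain. The sketch below assumes a toy model (names and the single-brother fact are illustrative) and shows that without the x ≠ y conjunct the sentence is satisfied by binding x and y to the same object.

```python
# Why ∃x,y Brother(x, Richard) ∧ Brother(y, Richard) does not imply two
# distinct brothers: x and y may bind to the same object.
brothers_of_richard = {"John"}  # toy model: only one actual brother

def Brother(x, r):
    return r == "Richard" and x in brothers_of_richard

domain = {"Richard", "John", "crown"}

# Without x ≠ y, the sentence is satisfied by x = y = John:
weak = any(Brother(x, "Richard") and Brother(y, "Richard")
           for x in domain for y in domain)

# Adding the inequality conjunct makes it fail, since there is only one brother:
strong = any(Brother(x, "Richard") and Brother(y, "Richard") and x != y
             for x in domain for y in domain)

print(weak, strong)  # True False
```

The weak sentence is true even though Richard has only one brother in this model; only the version with ¬(x = y) correctly requires two distinct brothers.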
Each interpretation assigns a unique meaning to predicates, constants, and functions.
Database Semantics
2. Closed-world assumption: Any statement not explicitly true is considered false.
3. Domain closure: The domain contains only the objects explicitly named by constants.
o Example: If John and Geoffrey are the only constants, no unnamed brothers
of Richard exist.
What is a Domain?
A domain refers to a specific part of the world we want to represent using knowledge. For
example:
In each domain, FOL is used to express assertions (facts) and queries about the relationships
and objects within it.
For example:
2. Queries (ASK)
o The answer is true, but it does not specify the exact individuals.
ASKVARS works best with Horn clauses: knowledge bases where queries can always
bind variables to specific values.
Predicates:
o ∀x Male(x) ⇔ ¬Female(x)
4. Parent and Child: Parent and child are inverse relations.
Examples:
Key Differences
Including theorems in the knowledge base can reduce computational cost.
Plain Facts
o Male(Jim)
o Spouse(Jim, Laura)
The theory of natural numbers (non-negative integers) can be built from a few basic axioms,
known as the Peano axioms. These axioms define the natural numbers and the fundamental
operations on them, like addition.
We begin by defining natural numbers using a predicate NatNum, which is true for natural
numbers. We also introduce a constant symbol 0 and a function symbol S, representing the
successor function, which gives the next number.
This recursive definition allows us to generate all natural numbers: 0, S(0), S(S(0)), and so on.
Once we have the basic structure of natural numbers, we can define addition recursively
using the successor function.
Using these axioms, for example, 1 + n is S(n) (the successor of n), 2 + n is S(S(n)), and so on.
In first-order logic, we use prefix notation for functions (e.g., +(m, 0)), but we can use infix
notation (m + 0) as syntactic sugar to make the expressions easier to read.
The recursive axiom expresses that adding the successor of a number is the same as taking
the successor of the sum.
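The Peano construction above can be sketched directly in code. The encoding of numbers as nested tuples and the particular pair of axioms used (m + 0 = m, and m + S(n) = S(m + n)) are one standard formulation, assumed here for illustration.

```python
# Peano arithmetic: 0 is a constant, S builds successors, and addition is
# defined recursively on the second argument.
ZERO = ("0",)

def S(n):
    """Successor function: S(n) is the next natural number after n."""
    return ("S", n)

def plus(m, n):
    if n == ZERO:            # axiom: m + 0 = m
        return m
    _, pred = n              # n = S(pred)
    return S(plus(m, pred))  # axiom: m + S(pred) = S(m + pred)

two = S(S(ZERO))
three = S(S(S(ZERO)))
print(plus(two, three) == S(S(S(S(S(ZERO))))))  # 2 + 3 = 5 → True
```

Each call to `plus` peels one successor off the second argument and wraps one around the result, mirroring how the logical axiom rewrites m + S(n) into S(m + n) until the base case m + 0 is reached.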
Sets in First-Order Logic
Sets are fundamental objects in both mathematics and everyday reasoning. In first-order
logic, we can represent sets, elements, and operations on sets with predicates and
functions.
We need:
A binary predicate s1 ⊆ s2, which tells whether set s1 is a subset of set s2.
We also need functions for common set operations like:
Set Axioms
These axioms define the basic operations on sets: union, intersection, subset, and
equality.
Duplicates allowed: A list can have the same element multiple times.
We can use functions like Nil for the empty list, and Cons to construct lists.
5. Rest: A function to retrieve the remaining elements of a list (after the first element).
Just like with sets, we use syntactic sugar to make list expressions more readable:
The list [A, B, C] is represented as Cons(A, Cons(B, Cons(C, Nil))).
Lists can be axiomatized in a similar manner to sets. We can define recursive properties of
lists, such as how to add an element to the front, and how to manipulate lists through
functions like First, Rest, and Append.
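The Cons/Nil representation and the First, Rest, and Append functions described above can be sketched as follows; the pair-based encoding is an assumption for illustration, though the function names follow the text.

```python
# Lists built from Nil (empty list) and Cons (prepend one element).
Nil = None

def Cons(head, tail):
    return (head, tail)

def First(lst):
    return lst[0]   # the first element

def Rest(lst):
    return lst[1]   # everything after the first element

def Append(xs, ys):
    """Recursively re-Cons the elements of xs onto ys."""
    if xs is Nil:
        return ys
    return Cons(First(xs), Append(Rest(xs), ys))

abc = Cons("A", Cons("B", Cons("C", Nil)))   # the list [A, B, C]
print(First(abc), First(Rest(abc)))  # A B
```

Note how `Append` is itself defined by the same kind of recursion the logical axioms use: appending Nil yields the second list, and appending Cons(x, xs) yields Cons(x, Append(xs, ys)).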
The agent receives percepts in the form of a vector representing sensory information. Each
percept contains elements such as "Stench," "Breeze," "Glitter," etc., and the corresponding
time step when the percept occurred. These percepts are stored as first-order sentences,
with the Percept predicate capturing sensory data and its associated time.
For example:
Percept([Stench, Breeze, Glitter, None, None], 5) means that at time 5, the agent
perceives a stench, a breeze, and glitter.
Actions that the agent can take, like moving (turning, going forward), shooting, grabbing, or
climbing, are represented by terms like:
The Wumpus World agent's reflexive behavior can be defined using first-order logic. For
instance:
If the agent perceives glitter, then the agent should grab the treasure.
This rule means that at any time t, if the agent perceives glitter, the best action is to grab the
treasure.
Squares: The grid's squares are represented as lists of integers. Instead of naming
each square individually (like Square1,2), you can represent it as [x, y].
Pits and Wumpus: Since there are many pits, it's inefficient to name them
individually. Instead, a unary predicate Pit is used to indicate whether a square
contains a pit. The wumpus is a special object represented by the constant Wumpus.
4. Adjacency of Squares
To define which squares are adjacent to each other, we use a complex term. For example:
Adjacent([x, y], [a, b]) means that the square at [x, y] is adjacent to the square at [a,
b]. This is defined by the condition that squares are adjacent if their rows or
columns differ by 1.
The agent's location changes over time, and we represent this using:
∀t At(Wumpus, [2, 2], t), meaning the wumpus is always located at square [2, 2].
∀x, s1, s2, t At(x, s1, t) ∧ At(x, s2, t) ⇒ s1 = s2 ensures that no object can be in
two different squares at the same time.
With first-order logic, the agent can deduce facts about its environment based on the
sensory data it perceives. For example:
If the agent is at a square and perceives a breeze, the square is considered "breezy."
Using the logical representation, the agent can deduce that a breezy square is adjacent to a
pit. The axiom:
8. Succession of States
The state of the agent changes over time, and successor-state axioms allow us to represent
how certain properties evolve, for example, the agent's possession of an arrow.
Using logical inference, the agent can reason about the environment and its actions. By
applying axioms like the ones above, the agent can determine where the pits are, where the
wumpus is, and what actions to take. This allows the agent to make decisions based on
incomplete information.
For instance:
If the agent perceives a breeze, it can infer that one of the adjacent squares contains
a pit.
If the agent perceives glitter, it knows that it has found the treasure and should grab
it.
Knowledge Engineering
A knowledge engineer investigates a domain, identifies key concepts, and formalizes
the representation of objects and their relationships within that domain.
o Determine the range of questions the knowledge base will answer.
The task determines the knowledge required to link specific problems to answers.
Example:
o Specifies what "exists" in the domain (e.g., objects, properties, relationships).
o Does not determine the specific properties of, or relationships between,
these entities.
Example:
Iterative Process:
Example:
∀s Breezy(s) ⇔ ∃r Adjacent(r, s) ∧ Pit(r)
o This step involves writing atomic sentences that describe the problem's initial
conditions.
Example:
At(Agent, [1,1], 0).
Percept([None, None, None, None, None], 0).
6. Pose Queries
Debugging Process:
Examples:
o Missing Axiom:
If an implication is used instead of a biconditional, the agent cannot
prove the absence of wumpuses.
o Incorrect Axiom:
1. Analyzing functionality: For example, does the circuit in Figure 8.6 function as a one-
bit adder? If all inputs are high, what is the output of gate A2?
2. Structural analysis: Questions such as which gates connect to a specific input
terminal, or whether the circuit contains feedback loops.
3. Other detailed analyses: These involve timing delays, circuit area, power
consumption, or production costs, which are beyond the scope of this discussion.
2. Signal flow: Signals flow through wires to input terminals of gates, which transform
them and produce outputs.
3. Gate types: There are four types: AND, OR, XOR, and NOT. Gates have fixed input
and output configurations:
4. Relevance: Analysis focuses on connections between terminals rather than physical
properties like wire paths, size, or color.
o Gates: Identified by predicates like Gate(X1) and type functions like Type(X1)
= XOR.
2. Terminals:
o Input/output terminals for gates/circuits are represented using functions like
In(n, X1) and Out(n, X1).
5. Arity: Specifies the number of input and output terminals, e.g., Arity(c, i, j).
Axioms:
Step 5: Encode the Specific Problem Instance
Circuit Components:
1. Output conditions: Find input combinations where Out(1, C1) = 0 and Out(2, C1) = 1:
∃i1, i2, i3 such that the inputs/outputs satisfy the constraints.
o Solutions: {i1/1, i2/1, i3/0}, {i1/1, i2/0, i3/1}, {i1/0, i2/1, i3/1}.
2. Complete input-output table: ∃i1, i2, i3, o1, o2 with all valid terminal values.
2. Identifying the issue in the XOR axiom where Signal(Out(1, X1)) = 1 ⇔ 1 ≠ 0.
Propositional Inference: Operates directly on propositional logic sentences without
quantifiers.
First-Order Inference: Deals with sentences containing quantifiers like ∀ (universal)
and ∃ (existential).
Conversion from first-order logic to propositional logic is possible, allowing the
reuse of propositional inference methods.
Definition: Universal Instantiation allows substitution of variables with ground terms
(terms without variables) in universally quantified sentences.
Formal Rule:
∀v α ⟹ SUBST({v/g}, α)
Example:
From:
∀x (King(x) ∧ Greedy(x) ⟹ Evil(x))
We can infer:
King(John) ∧ Greedy(John) ⟹ Evil(John)
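The SUBST operation behind Universal Instantiation is just a recursive replacement of a variable by a ground term. The sketch below assumes a nested-tuple encoding of sentences, which is an illustrative choice, not the book's notation.

```python
# SUBST({v/g}, α): replace every occurrence of variable v with ground term g.
def subst(theta, expr):
    """Apply substitution theta (dict: variable -> ground term) to expr."""
    if isinstance(expr, str):
        return theta.get(expr, expr)   # replace variables, keep other symbols
    return tuple(subst(theta, e) for e in expr)

# ∀x King(x) ∧ Greedy(x) ⇒ Evil(x), with x the universally quantified variable:
rule = ("⇒", ("∧", ("King", "x"), ("Greedy", "x")), ("Evil", "x"))

print(subst({"x": "John"}, rule))
# ('⇒', ('∧', ('King', 'John'), ('Greedy', 'John')), ('Evil', 'John'))
```

Applying the same function with {x/Richard} yields the second instantiation used later in the propositionalization example.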
Definition: Existential Instantiation replaces an existentially quantified variable with a new
constant symbol that does not appear elsewhere in the knowledge base.
Formal Rule:
∃v α ⟹ SUBST({v/k}, α)
Example:
From:
∃x (Crown(x) ∧ OnHead(x, John))
We can infer:
Crown(C1) ∧ OnHead(C1, John)
Skolemization
Constraint: The new constant or function must not clash with existing terms in the
knowledge base.
o Universal Instantiation retains the original universally quantified sentence in
the knowledge base.
o Existential Instantiation removes the original sentence from the knowledge
base after instantiation.
Inferential Equivalence
Even though the instantiated knowledge base is not logically equivalent to the
original, it is inferentially equivalent.
1. ∀x (King(x) ∧ Greedy(x) ⟹ Evil(x))
2. King(John)
3. Greedy(John)
4. Brother(Richard, John)
Steps:
1. Apply Universal Instantiation (UI) to the universally quantified sentence (∀x) using
all possible ground terms:
o Substitution {x/John}:
King(John) ∧ Greedy(John) ⟹ Evil(John)
o Substitution {x/Richard}:
King(Richard) ∧ Greedy(Richard) ⟹ Evil(Richard)
2. Replace the universally quantified sentence with these instantiations.
4. Apply any propositional inference algorithm (e.g., resolution) to draw conclusions,
such as Evil(John).
Key Insight: Every first-order knowledge base and query can be propositionalized
while preserving entailment.
1. Generate all instantiations with constant symbols (e.g., Richard and John).
2. Add instantiations involving terms of increasing depth (e.g., Father(Richard),
Father(Father(John)), etc.).
3. Stop when a propositional proof for the entailed sentence is found.
If the knowledge base contains function symbols (e.g., Father), infinitely many
nested terms like Father(Father(John)) can be generated.
Semi-Decidability: Algorithms exist that say "yes" to every entailed sentence but
cannot guarantee "no" for non-entailed sentences.
This is similar to the halting problem: the procedure may loop indefinitely without
determining whether it is stuck or close to a solution.
Implications:
The procedure always terminates when a sentence is entailed, but it does not always
terminate when sentences are not entailed.
5. Historical Context
Alan Turing (1936) and Alonzo Church (1936) proved that the entailment problem
for first-order logic is undecidable in general.
Unification is the process of finding a substitution θ that makes different logical expressions
identical.
Key Points:
The most general unifier (MGU) is the simplest unifier that applies, placing the
fewest constraints on the variables.
Lifting
Lifting raises inference from ground (variable-free) propositional logic to first-order
logic.
Advantages of Lifting
1. Efficiency: Avoids generating unnecessary ground sentences by substituting
variables only as needed.
3. Integration: Lifting can be applied to forward chaining, backward chaining, and
resolution algorithms.
9.2.2 Unification
Unification is a fundamental process in first-order logic, enabling inference rules to work by
finding substitutions that make different logical expressions identical. This process is vital in
inference algorithms like Generalized Modus Ponens and ensures expressions can be
compared or combined effectively.
Definition of Unification
The UNIFY algorithm takes two sentences p and q and returns a substitution θ that makes
them identical, i.e., SUBST(θ, p) = SUBST(θ, q).
The UNIFY algorithm recursively compares two expressions and constructs a substitution θ.
Steps:
2. Match Variables:
o Perform the occur check to ensure bindings are not cyclic (e.g., S(x) cannot
unify with S(S(x))).
Complexity:
The occur check can make the algorithm quadratic in the size of the expressions.
Some systems omit the occur check for efficiency, at the cost of potentially unsound
inferences.
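The recursive structure of UNIFY, including the occur check, can be sketched as follows. The encoding is an assumption for illustration: variables are lowercase strings, constants and predicate symbols are capitalized strings, and compound terms are tuples; this is not the book's exact pseudocode.

```python
# Recursive unification with occur check.
def is_var(t):
    return isinstance(t, str) and t[0].islower()

def occurs(v, t, theta):
    """Occur check: does variable v appear anywhere inside term t?"""
    if t == v:
        return True
    if is_var(t) and t in theta:
        return occurs(v, theta[t], theta)
    if isinstance(t, tuple):
        return any(occurs(v, a, theta) for a in t)
    return False

def unify(x, y, theta=None):
    if theta is None:
        theta = {}
    if theta is False or x == y:
        return theta
    if is_var(x):
        return unify_var(x, y, theta)
    if is_var(y):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for a, b in zip(x, y):          # unify argument lists element-wise
            theta = unify(a, b, theta)
            if theta is False:
                return False
        return theta
    return False                        # mismatched symbols: failure

def unify_var(v, t, theta):
    if v in theta:
        return unify(theta[v], t, theta)
    if is_var(t) and t in theta:
        return unify(v, theta[t], theta)
    if occurs(v, t, theta):             # reject cyclic bindings like x = S(x)
        return False
    new = dict(theta)
    new[v] = t
    return new

# Knows(John, x) unifies with Knows(John, Jane) under {x/Jane}:
print(unify(("Knows", "John", "x"), ("Knows", "John", "Jane")))  # {'x': 'Jane'}
# S(x) cannot unify with S(S(x)): the occur check fails.
print(unify(("S", "x"), ("S", ("S", "x"))))  # False
```

The second call shows exactly the quadratic-cost check discussed above: every binding of a variable scans the candidate term for an occurrence of that variable.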
The simplest approach involves maintaining all sentences in a single list and attempting to
unify q against every sentence in the list. While this method is straightforward and
functional, it is inefficient for large knowledge bases. Improving retrieval efficiency is critical
in practical systems.
Indexing allows unifications to be attempted only with relevant sentences, avoiding
unnecessary comparisons.
Predicate Indexing
o Example: Place all sentences with the predicate Knows in one "bucket" and
those with Brother in another.
For predicates associated with many clauses (e.g., Employs(x, y), with millions of facts), a
single bucket may still require extensive scanning.
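Predicate indexing can be sketched in a few lines: facts go into per-predicate buckets, and a query fetches only the bucket matching its own predicate. The facts shown are illustrative.

```python
from collections import defaultdict

# One "bucket" of facts per predicate symbol.
index = defaultdict(list)

def tell(fact):
    index[fact[0]].append(fact)        # fact[0] is the predicate symbol

def fetch_candidates(query):
    return index[query[0]]             # scan only same-predicate facts

tell(("Knows", "John", "Jane"))
tell(("Knows", "Elizabeth", "John"))
tell(("Brother", "Richard", "John"))

print(fetch_candidates(("Knows", "John", "x")))
# [('Knows', 'John', 'Jane'), ('Knows', 'Elizabeth', 'John')]
```

A query about Brother never touches the Knows bucket, which is the whole point; the enhanced indexing discussed next refines this by also keying on argument values.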
Enhanced Indexing
For maximum flexibility, facts can be indexed under multiple keys, enabling fast responses to
various query types.
Subsumption Lattices
Each sentence in the knowledge base corresponds to a set of queries it can unify with. These
queries form a subsumption lattice:
Lattice Properties
Each child node is derived from its parent by a single substitution.
The most general unifier (MGU) of any two nodes gives their "highest" common
descendant.
Lattice Complexity
When function symbols are present, the number of nodes grows exponentially with
the term size.
Indexing Trade-offs
While indexing improves retrieval efficiency, it also increases storage and maintenance
overhead. Excessive indexing can outweigh its benefits.
1. Fixed Policy: Maintain indices only for combinations like predicate+argument.
2. Adaptive Policy: Dynamically create indices based on query patterns and demands.
1. Small Knowledge Bases: Efficient indexing is generally a solved problem for systems
with a limited number of facts.
2. Large Commercial Databases: Handling billions of facts has driven extensive research
and technological advancements in database indexing.
By balancing retrieval efficiency and indexing costs, systems can achieve optimal
performance tailored to their scale and usage patterns.
First-order definite clauses resemble propositional definite clauses but extend them to first-
order logic. A definite clause is a disjunction of literals of which exactly one is positive.
Features
Variables in the literals are universally quantified (omitting the quantifiers is a
common shorthand).
Not all knowledge bases can be converted to definite clauses, because of the
restriction to exactly one positive literal.
Example Problem
Datalog Knowledge Base
Datalog is closely related to relational databases and is suitable for representing facts and
rules in many practical AI systems.
1. Start with Known Facts: Begin with the initial facts in the knowledge base.
2. Trigger Rules: Check the rules (definite clauses). If the premises of a rule are satisfied
(i.e., all literals in the antecedent are true), apply the rule and add the conclusion
(the consequent) to the knowledge base.
3. Repeat: This process repeats, triggering new rules as the knowledge base expands,
until:
o No new facts are added (the knowledge base reaches a fixed point).
Renaming of Facts:
A fact is considered not new if it is just a renaming of a known fact, i.e., it differs only in the
names of its variables. For example, Likes(x, IceCream) and Likes(y, IceCream) are
renamings of the same fact: both say that everyone likes ice cream, just with
different variables.
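The fire-until-fixed-point loop described in steps 1-3 can be sketched with ground (variable-free) rules; the missile/criminal facts below are illustrative stand-ins for a real rule base.

```python
# Naive forward chaining on ground definite clauses:
# each rule is (set of premises, conclusion).
rules = [
    ({"Missile(M1)"}, "Weapon(M1)"),
    ({"Weapon(M1)", "Sells(West,M1,Nono)"}, "Criminal(West)"),
]
facts = {"Missile(M1)", "Sells(West,M1,Nono)"}

changed = True
while changed:                      # repeat until a fixed point is reached
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)   # rule fires: add the consequent
            changed = True

print("Criminal(West)" in facts)  # True
```

The first pass derives Weapon(M1); the second pass, now that Weapon(M1) is known, derives Criminal(West); the third pass adds nothing, so the loop terminates at the fixed point. The lifted first-order version replaces the subset test with unification against the premises.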
Soundness and Completeness
For general first-order definite clauses with function symbols, there may be infinitely many
new facts generated. In such cases, we rely on Herbrand's theorem to show that the
algorithm will eventually find a proof if one exists. However, if there is no answer, the
algorithm might fail to terminate due to the infinite generation of new facts, as in the case
of the Peano axioms (where natural numbers are represented recursively).
A fixed point is reached when no further facts can be added, indicating that the inference
process has concluded. However, in cases with recursive facts or function symbols, such as
the Peano axioms, forward chaining may not terminate: entailment for definite clauses with
function symbols is semi-decidable, so the system cannot always determine in advance
whether the process will terminate.
1. Pattern Matching: Finding unifiers to match the premises of rules with the facts in
the knowledge base can be computationally expensive.
2. Rechecking Rules: The algorithm checks every rule on each iteration, even if few new
facts are added, leading to redundant work.
3. Irrelevant Facts: The algorithm may generate many irrelevant facts that do not
contribute to answering the query, leading to inefficiency.
Pattern matching involves checking whether the premise of a rule matches any facts in the
knowledge base. The challenge arises when rules have multiple conjuncts, as matching
becomes more computationally expensive. For example:
For the rule Missile(x) ⇒ Weapon(x), we need to find all facts that unify with
Missile(x), which can be done efficiently with an indexed knowledge base.
Conjunct Ordering:
Finding the optimal order in which to check the conjuncts of a rule is called the conjunct
ordering problem. The goal is to check the conjuncts in an order that minimizes the
computational cost. This problem is NP-hard, but heuristics can help. For example, the
minimum-remaining-values (MRV) heuristic used in constraint satisfaction problems
(CSPs) suggests checking the conjunct with the fewest satisfying facts first. In the example
above, if Nono owns many objects but only a few are missiles, it is better to check for
missiles first.
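The heuristic amounts to sorting conjuncts by how many candidate facts each one has, as the sketch below shows; the fact counts are illustrative, not from the text.

```python
# Conjunct ordering: check the conjunct with the fewest matching facts first
# (the MRV idea from CSPs).
facts = {
    "Owns": [("Owns", "Nono", f"O{i}") for i in range(1000)],
    "Missile": [("Missile", "O1")],    # only one of Nono's objects is a missile
}

conjuncts = [("Owns", "Nono", "x"), ("Missile", "x")]

# Order conjuncts by the number of candidate facts for each predicate:
ordered = sorted(conjuncts, key=lambda c: len(facts[c[0]]))
print([c[0] for c in ordered])  # ['Missile', 'Owns']
```

Checking Missile first leaves only one binding for x to verify against Owns, instead of enumerating all 1000 owned objects and testing each for Missile.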
The algorithm rechecks all rules on every iteration, which is inefficient, especially when few
facts are being added. To address this, we can:
Track Changes: Only recheck rules whose premises have changed due to the addition
of new facts.
Rule Activation: Keep track of which rules were activated (i.e., those whose premises
were satisfied) and only check them in subsequent iterations.
The algorithm may generate many facts that are irrelevant to the goal, which can slow down
the process. To mitigate this:
Prune Irrelevant Facts: Use the goal (the query) to filter out facts that are unlikely to
help answer the query.
Selective Rule Application: Apply only those rules that are likely to contribute to the
goal, avoiding unnecessary rule firing.
Matching rules against facts is NP-hard in the general case, as it involves solving constraint
satisfaction problems (CSPs). However, there are ways to make matching more efficient:
Tractable CSPs: For certain types of CSPs, such as those with tree-structured
constraint graphs, matching can be solved in linear time. This is the case when the
constraint graph (representing variables and constraints) forms a tree.
For example, the map-coloring problem can be formulated as a CSP, and if the map is
simplified (by removing certain regions), the CSP may become tractable, allowing efficient
matching of the rule against the facts.
Incremental Matching: Only apply rules incrementally, based on newly added facts,
rather than reapplying all rules in each iteration.
In forward chaining, redundancy can be avoided by ensuring that new facts are only used to
derive new facts. This leads to an incremental forward-chaining algorithm, which improves
efficiency by focusing on rules that are triggered by newly inferred facts.
o Every new fact generated in iteration t must be derived from a fact generated
in iteration t−1. This means that only facts inferred in the previous iteration
are needed to trigger new rules.
o The incremental approach checks a rule only if its premise contains a
conjunct that can unify with a fact generated in the previous iteration,
reducing unnecessary rule matching.
o In traditional forward chaining, redundant rule matching occurs because the
same facts may match the same rules in multiple iterations. With the
incremental method, only the facts from the last iteration are considered for
matching, reducing unnecessary work.
The Rete algorithm improves the matching process by constructing a dataflow network that
keeps track of partial matches. Here's how it works:
Dataflow Network: Each node in the network represents a literal in the rule's
premise. Variable bindings flow through this network and are filtered out when
they fail to match.
Equality Nodes: When two literals share a variable, their bindings are filtered
through an equality node, ensuring that only consistent bindings continue through
the network.
Avoiding Repetition: The network retains partial matches, so when new facts are
added, the matching process can continue from where it left off, avoiding
unnecessary recomputation.
The Rete network optimizes forward chaining by eliminating the need to rebuild partial
matches from scratch. This technique is crucial in systems with large rule sets, as it greatly
reduces computation time.
Production systems, like XCON (a system for designing computer configurations), use the
Rete algorithm to handle large numbers of rules efficiently. These systems rely on rules
(condition-action pairs) that are matched against a working memory of facts, where each
match may trigger further rule applications.
Cognitive architectures, such as ACT and SOAR, also use production systems to model
human reasoning. In these systems:
Productions, representing long-term memory, are matched against the working
memory to perform inferences.
These systems often deal with large numbers of rules and relatively few facts, making
efficient rule matching even more important.
Forward chaining can sometimes generate irrelevant facts, conclusions unrelated to the
goal. This inefficiency can be addressed in several ways:
1. Using Backward Chaining: As discussed in Section 9.4, backward chaining can help
focus the reasoning process on the goal, preventing irrelevant facts from being
generated.
2. Restricting to Relevant Rules: Instead of considering all rules, forward chaining can
be restricted to a subset of rules relevant to the goal, improving efficiency.
3. Magic Sets: In deductive databases, a technique called the magic set approach helps
restrict forward chaining to relevant facts. This approach involves rewriting rules to
include additional conditions (or "magic" facts) that limit the search space.
The magic set approach can be seen as a hybrid between forward inference and backward
preprocessing, making it a powerful optimization technique for large-scale databases.