Forward Chaining and Backward Chaining in AI: Inference Engine
First Order Predicate Logic – Prolog Programming – Unification – Forward Chaining – Backward Chaining – Resolution – Knowledge Representation – Ontological Engineering – Categories and Objects – Events – Mental Events and Mental Objects – Reasoning Systems for Categories – Reasoning with Default Information
In artificial intelligence, forward chaining and backward chaining are important topics. Before studying them, let us first understand where these two terms come from.
Inference engine:
The inference engine is the component of an intelligent system in artificial intelligence that applies logical rules to the knowledge base to infer new information from known facts. The first inference engines were part of expert systems. An inference engine commonly proceeds in one of two modes:
a. Forward chaining
b. Backward chaining
Horn clause and definite clause: Horn clauses and definite clauses are forms of sentences that enable a knowledge base to use a more restricted and efficient inference algorithm. Logical inference algorithms use forward- and backward-chaining approaches, which require the KB to be in the form of first-order definite clauses.
Definite clause: A clause which is a disjunction of literals with exactly one positive literal is known as a definite clause or strict Horn clause.
Horn clause: A clause which is a disjunction of literals with at most one positive literal is known as a Horn clause. Hence all definite clauses are Horn clauses.
Example: (¬p ∨ ¬q ∨ k). It has only one positive literal, k.
It is equivalent to p ∧ q → k.
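As a quick sanity check, the short Python sketch below (illustrative only) enumerates all eight truth assignments and confirms that the clause and the implication agree everywhere:

    from itertools import product

    # Verify that the clause (not p) or (not q) or k is logically equivalent
    # to the implication (p and q) -> k by enumerating every truth assignment.
    for p, q, k in product([False, True], repeat=3):
        clause = (not p) or (not q) or k
        implication = (not (p and q)) or k   # p ∧ q → k rewritten as ¬(p ∧ q) ∨ k
        assert clause == implication

    print("The clause and the implication agree on all 8 assignments.")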
a. Forward Chaining
Forward chaining is also known as forward deduction or the forward reasoning method when using an inference engine. Forward chaining is a form of reasoning which starts with the atomic sentences in the knowledge base and applies inference rules (Modus Ponens) in the forward direction to extract more data until a goal is reached.
The forward-chaining algorithm starts from known facts, triggers all rules whose premises are satisfied, and adds their conclusions to the known facts. This process repeats until the problem is solved.
Properties of Forward-Chaining:
o It is a down-up (bottom-up) approach, as it moves from bottom to top.
o It is a process of reaching a conclusion based on known facts or data, starting from the initial state and reaching the goal state.
o The forward-chaining approach is also called data-driven, as we reach the goal using the available data.
o The forward-chaining approach is commonly used in expert systems (such as CLIPS) and in business and production rule systems.
Example:
"As per the law, it is a crime for an American to sell weapons to hostile nations. Country A, an enemy of
America, has some missiles, and all the missiles were sold to it by Robert, who is an American citizen."
To solve the above problem, first, we will convert all the above facts into first-order definite clauses, and then
we will use a forward-chaining algorithm to reach the goal.
Step-1:
In the first step, we start with the known facts and choose the sentences which do not have implications, such as American(Robert), Enemy(A, America), Owns(A, T1), and Missile(T1).
Step-2:
In the second step, we add the facts that can be inferred from the available facts via rules whose premises are satisfied.
The premises of Rule-(1) are not yet satisfied, so nothing is added from it in the first iteration.
Rule-(4) is satisfied with the substitution {p/T1}, so Sells(Robert, T1, A) is added; it is inferred from the conjunction of facts (2) and (3). Similarly, Rule-(5) is satisfied with {p/T1}, adding Weapon(T1), and Rule-(6) is satisfied with {p/A}, adding Hostile(A).
Step-3:
At step-3, Rule-(1) is satisfied with the substitution {p/Robert, q/T1, r/A}, so we can add Criminal(Robert), which is inferred from all the available facts. Hence we have reached our goal statement.
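The derivation above can be reproduced with a minimal Python sketch of forward chaining. This is a naive propositional version, not the full first-order algorithm: the substitutions from the steps above are already applied, so each rule is just a set of premise facts and a conclusion (the string encodings are our own):

    # Naive forward chaining over ground (already-substituted) definite clauses.
    # Facts are strings; each rule is a (premises, conclusion) pair.
    facts = {"American(Robert)", "Missile(T1)", "Owns(A,T1)", "Enemy(A,America)"}

    rules = [
        ({"American(Robert)", "Weapon(T1)", "Sells(Robert,T1,A)", "Hostile(A)"},
         "Criminal(Robert)"),                                   # Rule (1)
        ({"Missile(T1)", "Owns(A,T1)"}, "Sells(Robert,T1,A)"),  # Rule (4)
        ({"Missile(T1)"}, "Weapon(T1)"),                        # Rule (5)
        ({"Enemy(A,America)"}, "Hostile(A)"),                   # Rule (6)
    ]

    # Data-driven loop: fire every rule whose premises are all known facts,
    # and repeat until no rule adds anything new (a fixed point).
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True

    print("Criminal(Robert)" in facts)   # True: the goal has been derived

The first pass adds Sells(Robert, T1, A), Weapon(T1), and Hostile(A), exactly as in Step-2; the second pass fires Rule-(1), as in Step-3.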
b. Backward Chaining
Backward chaining is also known as backward deduction or the backward reasoning method: it starts with the goal and works backward, chaining through rules to find known facts that support the goal. We will use the same example as above and rewrite all the rules:
o American(p) ∧ Weapon(q) ∧ Sells(p, q, r) ∧ Hostile(r) → Criminal(p) ...(1)
o Owns(A, T1) ...(2)
o Missile(T1) ...(3)
o Missile(p) ∧ Owns(A, p) → Sells(Robert, p, A) ...(4)
o Missile(p) → Weapon(p) ...(5)
o Enemy(p, America) → Hostile(p) ...(6)
o Enemy(A, America) ...(7)
o American(Robert) ...(8)
Backward-Chaining proof:
In backward chaining, we start with our goal predicate, Criminal(Robert), and then work backward through the rules that could establish it.
Step-1:
In the first step, we take the goal fact, infer other facts from it, and at last prove those facts true. Our goal fact is "Robert is a criminal," so its predicate is Criminal(Robert).
Step-2:
In the second step, we infer other facts from the goal fact which satisfy the rules. As we can see in Rule-(1), the goal predicate Criminal(Robert) is present with the substitution {p/Robert}, so we add all the conjunctive facts below the first level and replace p with Robert.
Here we can see that American(Robert) is a known fact, so it is proved.
Step-3:
At step-3, we extract the further fact Missile(q), to which the subgoal Weapon(q) reduces via Rule-(5). Weapon(q) is then proved with the substitution of the constant T1 for q.
Step-4:
At step-4, we can infer the facts Missile(T1) and Owns(A, T1) from Sells(Robert, T1, r), which satisfies Rule-(4) with the substitution of A in place of r. So these two statements are proved.
Step-5:
At step-5, we can infer the fact Enemy(A, America) from Hostile(A), which satisfies Rule-(6). Hence all the statements are proved true using backward chaining.
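The same goal can be proved goal-first with a minimal backward-chaining sketch in Python (again over the ground, already-substituted rules; an illustrative toy, not a full first-order prover):

    # Naive backward chaining: start from the goal and recursively try to
    # prove each premise of a rule whose conclusion matches the goal.
    facts = {"American(Robert)", "Missile(T1)", "Owns(A,T1)", "Enemy(A,America)"}

    rules = [
        ({"American(Robert)", "Weapon(T1)", "Sells(Robert,T1,A)", "Hostile(A)"},
         "Criminal(Robert)"),                                   # Rule (1)
        ({"Missile(T1)", "Owns(A,T1)"}, "Sells(Robert,T1,A)"),  # Rule (4)
        ({"Missile(T1)"}, "Weapon(T1)"),                        # Rule (5)
        ({"Enemy(A,America)"}, "Hostile(A)"),                   # Rule (6)
    ]

    def prove(goal):
        """A goal holds if it is a known fact, or if some rule concludes it
        and every premise of that rule can be proved in turn."""
        if goal in facts:
            return True
        return any(conclusion == goal and all(prove(p) for p in premises)
                   for premises, conclusion in rules)

    print(prove("Criminal(Robert)"))   # True: every subgoal bottoms out in a fact

Note that only the rules relevant to the current goal are ever examined, which is exactly the goal-driven behaviour contrasted with forward chaining in the list below.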
FOLLOWING ARE THE DIFFERENCES BETWEEN FORWARD CHAINING AND BACKWARD CHAINING:
1) Forward chaining, as the name suggests, starts from the known facts and moves forward by applying inference rules to extract more data, continuing until it reaches the goal, whereas backward chaining starts from the goal and moves backward by using inference rules to determine the facts that satisfy the goal.
2) Forward chaining is called a data-driven inference technique, whereas backward chaining is called a goal-driven inference technique.
3) Forward chaining is known as the down-up (bottom-up) approach, whereas backward chaining is known as the top-down approach.
4) Forward chaining uses a breadth-first search strategy, whereas backward chaining uses a depth-first search strategy.
5) Forward and backward chaining both apply the Modus Ponens inference rule.
6) Forward chaining can be used for tasks such as planning, design process monitoring, diagnosis, and classification, whereas backward chaining can be used for classification and diagnosis tasks.
7) Forward chaining can be like an exhaustive search, whereas backward chaining tries to avoid unnecessary paths of reasoning.
8) In forward chaining there can be various ASK questions from the knowledge base, whereas in backward chaining there can be fewer ASK questions.
9) Forward chaining is slow, as it checks all the rules, whereas backward chaining is fast, as it checks only the few required rules.
S.No. | Forward Chaining | Backward Chaining
1 | Forward chaining starts from known facts and applies inference rules to extract more data until it reaches the goal. | Backward chaining starts from the goal and works backward through inference rules to find the required facts that support the goal.
2 | It is a data-driven inference technique. | It is a goal-driven inference technique.
3 | It is known as the down-up (bottom-up) approach. | It is known as the top-down approach.
4 | It uses a breadth-first search strategy. | It uses a depth-first search strategy.
5 | Forward chaining tests all the available rules. | Backward chaining tests only the few required rules.
6 | Forward chaining is suitable for planning, monitoring, control, and interpretation applications. | Backward chaining is suitable for diagnostic, prescription, and debugging applications.
7 | It can be like an exhaustive search. | It tries to avoid unnecessary paths of reasoning.
8 | There can be various ASK questions from the knowledge base. | There can be fewer ASK questions.
9 | Forward chaining is aimed at any conclusion. | Backward chaining is aimed only at the required data.
KNOWLEDGE REPRESENTATION
Knowledge representation involves creating representations that concentrate on general concepts, such as Events, Time, Physical Objects, and Beliefs, which occur in many different domains. Representing these abstract concepts is sometimes called ontological engineering.
1. The general framework of concepts is called an upper ontology because of the convention of drawing graphs with the general concepts at the top and the more specific concepts below them, as in the figure.
2. Two major characteristics of general-purpose ontologies distinguish them from collections of special-
purpose ontologies:
• A general-purpose ontology should be applicable in more or less any special-purpose domain (with the
addition of domain-specific axioms). This means that no representational issue can be finessed or brushed under
the carpet.
• In any sufficiently demanding domain, different areas of knowledge must be unified, because reasoning and problem solving could involve several areas simultaneously. A robot circuit-repair system, for instance, needs to reason about circuits in terms of electrical connectivity and physical layout, and about time, both for circuit-timing analysis and for estimating labor costs.
EXAMPLES:
1. An object is a member of a category.
BB9 ∈ Basketballs
2. A category is a subclass of another category.
Basketballs ⊂ Balls
3. All members of a category have some properties.
(x ∈ Basketballs) ⇒ Spherical(x)
4. Members of a category can be recognized by some properties.
Orange(x) ∧ Round(x) ∧ Diameter(x) = 9.5 inches ∧ x ∈ Balls ⇒ x ∈ Basketballs
5. A category as a whole has some properties.
Dogs ∈ DomesticatedSpecies
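These kinds of statements can be mirrored directly with Python sets (a toy encoding; the extra object Baseball7 and the property table are invented for illustration):

    # A toy set-based encoding of the category statements above.
    Basketballs = {"BB9"}
    Balls = Basketballs | {"Baseball7"}     # Basketballs is a subclass of Balls

    print("BB9" in Basketballs)             # membership: BB9 ∈ Basketballs
    print(Basketballs <= Balls)             # subclass: Basketballs ⊆ Balls

    # All members of a category share a property (x ∈ Basketballs ⇒ Spherical(x)):
    spherical = {"BB9"}                     # the extension of the Spherical property
    assert all(x in spherical for x in Basketballs)

    # A category as a whole can be a member of another category:
    DomesticatedSpecies = {"Dogs"}          # Dogs ∈ DomesticatedSpecies
    print("Dogs" in DomesticatedSpecies)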
PHYSICAL COMPOSITION
It is also useful to define composite objects with definite parts but no particular structure. For example, we might want to say "The apples in this bag weigh two pounds." The temptation would be to ascribe this weight to the set of apples in the bag, but this would be a mistake, because a set is an abstract mathematical concept that has elements but does not have weight; instead, we need a new concept, a bunch of apples.
MEASUREMENTS
We can call this length 1.5 inches or 3.81 centimeters. Thus, the same length has different names in our language. We represent the length with a units function that takes a number as its argument. If the line segment is called L1, we can write
Length(L1) = Inches(1.5) = Centimeters(3.81)
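A quick numeric check that Inches(1.5) and Centimeters(3.81) name the same length (a sketch; mapping every measure to meters is our own arbitrary choice of canonical unit):

    import math

    # Units functions: map a number to a length in a common canonical unit.
    def inches(n):
        return n * 0.0254       # 1 inch = 0.0254 m exactly

    def centimeters(n):
        return n * 0.01         # 1 cm = 0.01 m

    # Length(L1) = Inches(1.5) = Centimeters(3.81): two names, one length.
    assert math.isclose(inches(1.5), centimeters(3.81))
    print(inches(1.5))          # 0.0381 (meters)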
A significant portion of reality seems to defy any obvious individuation, that is, division into distinct objects. We give this portion the generic name stuff.
EXAMPLE: Any part of a butter-object is also butter: b ∈ Butter ∧ PartOf(p, b) ⇒ p ∈ Butter.
PROCESSES
Categories of events with the property that any subinterval of a member is also a member of the same category are called process categories or liquid event categories.
Time intervals
Event calculus opens up the possibility of talking about time and time intervals. We will consider two kinds of time intervals: moments and extended intervals. The distinction is that only moments have zero duration:
i ∈ Moments ⇔ Duration(i) = Seconds(0)
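A minimal sketch of the distinction, assuming an interval is represented as a (begin, end) pair of time points:

    # Moments are exactly the intervals with zero duration.
    def duration(interval):
        begin, end = interval
        return end - begin

    def is_moment(interval):
        return duration(interval) == 0

    print(is_moment((3.0, 3.0)))   # True: a moment
    print(is_moment((1.0, 4.5)))   # False: an extended interval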
MENTAL EVENTS AND MENTAL OBJECTS
A mental event is any event that happens within the mind of a conscious individual. Examples include thoughts, feelings, decisions, dreams, and realizations.
REASONING SYSTEMS FOR CATEGORIES
1. Categories are the primary building blocks of large-scale knowledge representation schemes.
2. There are two closely related families of systems: semantic networks provide graphical aids for visualizing a knowledge base and efficient algorithms for inferring properties of an object on the basis of its category membership, while description logics provide a formal language for constructing and combining category definitions, along with efficient algorithms for deciding subset and superset relationships between categories.
Semantic networks:
1. A semantic network or net is a graph structure for representing knowledge in patterns of interconnected
nodes and arcs.
2. Computer implementations of semantic networks were first developed for artificial intelligence and
machine translation, but earlier versions have long been used in philosophy, psychology, and linguistics.
3. The Giant Global Graph of the Semantic Web is a large semantic network.
4. What is common to all semantic networks is a declarative graphic representation that can be used to
represent knowledge and support automated systems for reasoning about the knowledge.
5. Some versions are highly informal, but others are formally defined systems of logic.
6. A semantic network defines objects in terms of their associations with other objects, e.g., snow, white, snowman, ice, slippery (see the sketch below).
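A small executable sketch of a semantic network, using the snow/snowman associations from point 6 (all node and arc names are illustrative), with property lookup that inherits along is-a arcs:

    # A semantic network as a dictionary of labeled arcs: node -> relation -> value.
    network = {
        "snow":        {"is-a": "substance", "color": "white"},
        "ice":         {"is-a": "substance", "texture": "slippery"},
        "substance":   {"is-a": "physical-object"},
        "snowman":     {"is-a": "snow-object", "made-of": "snow"},
        "snow-object": {"is-a": "physical-object", "melts-above": "0 C"},
    }

    def lookup(node, relation):
        """Find a property, inheriting along is-a arcs when it is not local.
        Assumes the is-a hierarchy is acyclic."""
        while node is not None:
            props = network.get(node, {})
            if relation in props:
                return props[relation]
            node = props.get("is-a")       # climb the generalization hierarchy
        return None

    print(lookup("snow", "color"))           # white (a local arc)
    print(lookup("snowman", "melts-above"))  # 0 C (inherited from snow-object)

The while loop implements exactly the inheritance rule that the definitional networks described next rely on.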
1: Definitional networks:
1. Emphasize the subtype or is-a relation between a concept type and a newly defined subtype.
2. The resulting network, also called a generalization or subsumption hierarchy, supports the rule of inheritance for copying properties defined for a supertype to all of its subtypes.
3. Since definitions are true by definition, the information in these networks is often assumed to be necessarily true.
2 : Assertional networks:
1. These are designed to assert propositions. Unlike the information in a definitional network, the information in an assertional network is assumed to be contingently true unless it is explicitly marked with a modal operator.
2. Some assertional networks have been proposed as models of the conceptual structures underlying
natural language semantics.
3 : Implicational networks:
1. Use implication as the primary relation for connecting nodes. They may be used to represent patterns of
beliefs, causality, or inferences.
2. Because implicational networks emphasize implication, they are capable of expressing all the Boolean connectives by allowing a conjunction of inputs to a propositional node and a disjunction of outputs.
4 : Executable networks:
1. Include some mechanism, such as marker passing or attached procedures, which can perform inferences,
pass messages, or search for patterns and associations.
2. Executable semantic networks contain mechanisms that can cause some change to the network itself.
5 : Learning networks:
1. These build or extend their representations by acquiring knowledge from examples. The new knowledge may change the old network by adding and deleting nodes and arcs or by modifying numerical values, called weights, associated with the nodes and arcs.
2. The purpose of learning, both from a natural and AI standpoint, is to create modifications that enable the
system to respond more effectively within its environment.
6 : Hybrid networks:
1. Combine two or more of the previous techniques, either in a single network or in separate, but closely
interacting networks.
2. Systems are usually called hybrids if their component languages have different syntax. The most widely used hybrid of multiple network notations is the Unified Modeling Language (UML), which was designed by three authors who merged their competing notations.
Description logics
Description logics are notations that are designed to make it easier to describe definitions and properties of
categories. Description logic systems evolved from semantic networks in response to pressure to formalize what
the networks mean while retaining the emphasis on taxonomic structure as an organizing principle.
The principal inference tasks for description logics are subsumption (checking whether one category is a subset of another by comparing their definitions) and classification (checking whether an object belongs to a category). Some systems also include checking the consistency of a category definition: whether the membership criteria are logically satisfiable.
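As a tiny illustrative fragment, a category definition can be treated as a conjunction of atomic concepts, so that subsumption reduces to a subset test (a sketch with invented concept names; real description logics are far richer):

    # Each category is a conjunction of atomic concepts, encoded as a frozenset.
    Adult    = frozenset({"Person", "Adult"})
    Bachelor = frozenset({"Person", "Adult", "Male", "Unmarried"})

    def subsumes(general, specific):
        # general subsumes specific when every conjunct of the general
        # definition also appears in the more specific one.
        return general <= specific

    print(subsumes(Adult, Bachelor))    # True: every Bachelor is an Adult
    print(subsumes(Bachelor, Adult))    # False

    # Classification: does an object's description place it in a category?
    john = frozenset({"Person", "Adult", "Male", "Unmarried", "Tall"})
    print(subsumes(Bachelor, john))     # True: john is classified as a Bachelor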
TRUTH MAINTENANCE SYSTEMS
Truth Maintenance Systems (TMS), also called Reason Maintenance Systems, are used within problem-solving systems, in conjunction with inference engines (IE) such as rule-based inference systems, to manage the inferential dependencies among the sentences in the knowledge base. Among the tasks of a TMS are to:
1. Provide justifications for conclusions
2. Recognize inconsistencies
o The IE may tell the TMS that some sentences are contradictory. Then, if on the basis of other IE commands and of inferences we find that all those sentences are believed true, the TMS reports to the IE that a contradiction has arisen. For instance, in the ABC example, the statement that either Abbott, Babbitt, or Cabot is guilty, together with the statements that Abbott is not guilty, Babbitt is not guilty, and Cabot is not guilty, forms a contradiction.
The IE can eliminate an inconsistency by determining the assumptions used and changing them
appropriately, or by presenting the contradictory set of sentences to the users and asking them to
choose which sentence(s) to retract.
The TMS maintains the following information with each sentence node:
• a sentence
• a label expressing the current belief in the sentence; it is IN for sentences that are believed, and OUT for
sentences that are not believed.
• a list of the justification nodes that support it
• a list of the justification nodes supported by it
• an indication of whether the node is an assumption, contradiction, or premise.
The TMS maintains very little information with justification nodes: only a label with value IN or OUT, depending on whether or not we believe the justification to be valid.
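This bookkeeping maps naturally onto a small data structure; the sketch below (field names are our own) shows the per-node state of a justification-based TMS (JTMS):

    from dataclasses import dataclass, field

    IN, OUT = "IN", "OUT"            # the two belief labels used by the TMS

    @dataclass
    class Justification:
        antecedents: list            # the sentence nodes this justification rests on
        label: str = OUT             # IN if we currently believe it to be valid

    @dataclass
    class SentenceNode:
        sentence: str                                     # the sentence itself
        label: str = OUT                                  # IN = believed, OUT = not
        supported_by: list = field(default_factory=list)  # justifications for it
        supports: list = field(default_factory=list)      # justifications it feeds
        kind: str = "derived"        # or "assumption", "contradiction", "premise"

    abbott = SentenceNode("Abbott is not guilty", label=IN, kind="assumption")
    abbott.supports.append(Justification(antecedents=[abbott], label=IN))
    print(abbott.sentence, abbott.label, abbott.kind)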
A logic-based TMS (LTMS) is more powerful than a JTMS in that it recognizes the propositional semantics of sentences, i.e., it understands the relations between p and ¬p, between p and q and p ∧ q, and so on.