Block 2
ARTIFICIAL INTELLIGENCE - KNOWLEDGE
REPRESENTATION
Unit 5 First Order Logic
Unit 6 Rule based Systems and other formalism
Unit 7 Probabilistic Reasoning
Unit 8 Fuzzy and Rough Set
PROGRAMME DESIGN COMMITTEE
Prof. (Retd.) S.K. Gupta , IIT, Delhi Sh. Akshay Kumar, Associate Professor, SOCIS, IGNOU
Prof. Ela Kumar, IGDTUW, Delhi Dr. P. Venkata Suresh, Associate Professor, SOCIS,
Prof. T.V. Vijay Kumar JNU, New Delhi IGNOU
Prof. Gayatri Dhingra, GVMITM, Sonipat Dr. V.V. Subrahmanyam, Associate Professor, SOCIS,
IGNOU
Mr. Milind Mahajan,. Impressico Business Solutions,
New Delhi Sh. M.P. Mishra, Assistant Professor, SOCIS, IGNOU
Sh. Shashi Bhushan Sharma, Associate Professor, Dr. Sudhansh Sharma, Assistant Professor, SOCIS,
SOCIS, IGNOU IGNOU
SOCIS FACULTY
Prof. P. Venkata Suresh, Director, SOCIS, IGNOU Prof. V.V. Subrahmanyam, SOCIS, IGNOU
Prof. Sandeep Singh Rawat, SOCIS, IGNOU Prof. Divakar Yadav, SOCIS, IGNOU
Dr. Akshay Kumar, Associate Professor, SOCIS, IGNOU Dr.Sudhansh Sharma,Assistant Professor,SOCIS, IGNOU
Dr. M.P. Mishra, Associate Professor, SOCIS, IGNOU
PREPARATION TEAM
Dr. Sudhansh Sharma, (Writer- Unit 5,6) Prof. Ela Kumar (Content Editor)
Assistant Professor, SOCIS, IGNOU Department of Computers & Engg. IGDTUW, Delhi
(Writer Unit 5, 6)-(Partially Adapted from MCSE003
Prof. Parmod Kumar (Language Editor)
SOH, IGNOU, New Delhi
Dr. Manish Kumar, (Writer- Unit 7)
Assistant Professor, SOCIS, IGNOU
(Writer Unit 7 - Partially adapted from AST-01)
COURSE COORDINATOR
Dr. Sudhansh Sharma
Assistant Professor, SOCIS, IGNOU
PRINT PRODUCTION
Mr. Sanjay Aggarwal
Assistant Registrar,MPDD, IGNOU, New Delhi
August, 2023
© Indira Gandhi National Open University, 2023
ISBN: 978-93-5568-926-9
All rights reserved. No part of this work may be reproduced in any form, by mimeograph or any other means, without
permission in writing from the Indira Gandhi National Open University.
BLOCK INTRODUCTION
Block 2, titled “Artificial Intelligence - Knowledge Representation”,
comprises four units, as follows:
• Unit-5 First Order Logic
• Unit-6 Rule based Systems and other formalism
• Unit-7 Probabilistic Reasoning
• Unit-8 Fuzzy and Rough Set
In Unit 5, “First Order Logic”, we extend your understanding of Predicate
and Propositional Logic (discussed in Unit-4) to First Order Predicate Logic
(FOPL), which will help you to solve a larger class of problems. In both of
these units, we illustrate through a number of examples how the tools and
techniques of PL and FOPL are used in solving problems of our everyday
experience. We discuss the framework of PL and FOPL, along with additional
tools and techniques for solving problems, in the form of some basic inference
rules and the resolution method.
Unit-6, “Rule based Systems and other formalism”, covers rule based systems
and other formalisms, including forward chaining systems, backward chaining
systems, conflict resolution, and knowledge representation techniques, viz.
frames and scripts.
The problem with PL and FOPL systems taken together is that these systems
assume knowledge of the problem domain as essentially precise, complete and
consistent. However, in the real world, knowledge of the problem domains, in
general, is imprecise, incomplete and inconsistent. The question of “How to
address the imprecise, incomplete and inconsistent knowledge?” is answered in
Unit-7, “Probabilistic Reasoning”, and Unit-8, “Fuzzy and Rough Set”.
UNIT 5 FIRST ORDER LOGIC
Structure
5.0 Introduction
5.1 Objectives
5.2 Syntax of First Order Predicate Logic(FOPL)
5.3 Interpretations in FOPL
5.4 Semantics of Quantifiers
5.5 Inference & Entailment in FOPL
5.6 Conversion to clausal form
5.7 Resolution & Unification
5.8 Summary
5.9 Solutions/Answers
5.10 Further Readings
5.0 INTRODUCTION
In the previous unit, we discussed how propositional logic helps us in solving
problems. However, one of the major problems with propositional logic is
that, sometimes, it is unable to capture even an elementary type of reasoning
or argument, such as the one represented by the following statements:
Every man is mortal.
Raman is a man.
Hence, he is mortal.
The above reasoning is intuitively correct. Suppose, however, that we attempt
to simulate the reasoning through Propositional Logic and, for this purpose,
use the symbols P, Q and R to denote the statements given above:
P: Every man is mortal,
Q: Raman is a man,
R: Raman is mortal.
Once the English statements of the argument are symbolised so that the tools
of propositional logic can be applied, we have just the three symbols P, Q and
R, with apparently no link or connection to the original statements or to each
other. The connections which would have helped in solving the problem become
invisible. In Propositional Logic, there is no way to conclude the symbol R
from the symbols P and Q. However, as we mentioned earlier, in a natural
language the conclusion of the statement denoted by R from the statements
denoted by P and Q is obvious. Therefore, we search for some symbolic system
of reasoning that helps us in discussing argument forms of the above-mentioned
type, in addition to those forms which can be discussed within the framework
of propositional logic. First Order Predicate Logic (FOPL) is the most
well-known symbolic system for the purpose.
The symbolic system of FOPL does not treat an atomic statement as an
indivisible unit. FOPL not only divides an atomic statement into subject and
predicate, but also considers still deeper structures of an atomic statement
in order to handle a larger class of arguments. How and to what extent FOPL
symbolizes and establishes validity/invalidity and consistency/inconsistency
of arguments is the subject matter of this unit.
5.1 OBJECTIVES
After studying this unit, you should be able to:
• explain why FOPL is required over and above PL;
• define, and give appropriate examples for, each of the new concepts required
for FOPL including those of quantifier, variable, constant, term, free and
bound occurrences of variables, closed and open wff;
• check consistency/validity, if any, of closed formulas;
• reduce a given formula of FOPL to normal forms: Prenex Normal Form
(PNF) and (Skolem) Standard Form, and convert it to the clausal form;
• use the tools and techniques of FOPL, developed in the unit, to solve
problems requiring logical reasoning; and
• perform unification and resolution.
(∃ x ∈ N) (x − 1/2 = 0), which is read as ‘There exists an x in N for which
x − 1/2 = 0.’
An example of the use of the universal quantifier is (∀ x ∉ N) (x² > x), which
is read as ‘for every x not in N, x² > x.’ Of course, this is a false statement,
because there is at least one x ∉ N, x ∈ R, for which it is false (for example,
x = 1/2).
As you have already read in the example of a child in the class,
(∀ x ∈ U) p(x) is logically equivalent to ~(∃ x ∈ U) (~p(x)). Therefore,
~(∀ x ∈ U) p(x) ≡ ~~(∃ x ∈ U) (~p(x)) ≡ (∃ x ∈ U) (~p(x)).
This is one of the rules for negation that relate ∀ and ∃. The two rules are
~(∀ x ∈ U) p(x) ≡ (∃ x ∈ U) (~p(x)), and
~(∃ x ∈ U) p(x) ≡ (∀ x ∈ U) (~p(x)),
where U is the set of values that x can take.
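The two negation rules can be spot-checked on a small finite set. The sketch
below is only an illustration: the set U and the predicate p are assumptions
chosen for the example, and a check on one finite set does not prove the rules
in general.

```python
# Finite stand-in for the set U (an assumption made for this illustration).
U = range(-3, 4)

def p(x):
    # Hypothetical predicate: "x squared is greater than x".
    return x * x > x

# ~(∀x ∈ U) p(x)  compared with  (∃x ∈ U) ~p(x)
lhs_1 = not all(p(x) for x in U)
rhs_1 = any(not p(x) for x in U)

# ~(∃x ∈ U) p(x)  compared with  (∀x ∈ U) ~p(x)
lhs_2 = not any(p(x) for x in U)
rhs_2 = all(not p(x) for x in U)

print(lhs_1 == rhs_1, lhs_2 == rhs_2)
```

Here `all` plays the role of ∀ and `any` the role of ∃ over the finite set.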
5.3 INTERPRETATIONS IN FOPL
In order to have a glimpse at how FOPL extends propositional logic, let us
again discuss the earlier argument.
Every man is mortal. Raman is a man.
Hence, he is mortal.
In order to derive the validity of the above simple argument, instead of
looking at an atomic statement as indivisible, we begin by dividing each
statement into subject and predicate. The two predicates which occur in the
above argument are:
‘is mortal’ and ‘is man’.
Let us use the notation
IL: is_mortal and
IN: is_man.
In view of the notation, the argument on paraphrasing becomes:
For all x, if IN (x) then IL (x).
IN (Raman).
Hence, IL (Raman).
More generally, relations of the form greater-than (x, y) denoting the phrase
‘x is greater than y’, is_brother_of (x, y) denoting ‘x is brother of y’,
Between (x, y, z) denoting the phrase ‘x lies between y and z’, and is_tall (x)
denoting ‘x is tall’ are some examples of predicates. The variables x, y, z
etc. which appear in a predicate are called parameters of the predicate.
The parameters may be given appropriate values such that, after each variable
has been replaced by one of its possible values, the predicate becomes a
statement, of which we can say whether it is ‘True’ or ‘False’.
For example, for the predicate greater-than (x, y), if x is given the value 3
then we obtain greater-than (3, y), for which it is still not possible to tell
whether it is True or False. Hence, ‘greater-than (3, y)’ is also a predicate.
Further, if the variable y is given the value 5 then we get greater-than (3, 5)
which, as we know, is False. Hence, it is possible to give its truth-value,
which is False in this case. Thus, from the predicate greater-than (x, y), we
get the statement greater-than (3, 5) by assigning the value 3 to the variable
x and 5 to the variable y. These values 3 and 5 are called parametric values
or arguments of the predicate greater-than. (Please note that ‘argument of a
function/predicate’ is a mathematical concept, different from a logical
argument.)
Similarly, we can represent the phrase x likes y by the predicate LIKE (x, y).
Then Ram likes Mohan can be represented by the statement LIKE (RAM,
MOHAN).
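The passage from a predicate to a statement can be sketched in Python, under
the assumption that a predicate is modelled as a Boolean-valued function. The
function names and the small fact base LIKES are hypothetical and serve only
as an illustration.

```python
from functools import partial

# Illustrative fact base for LIKE; the single pair comes from the text's example.
LIKES = {("Ram", "Mohan")}

def greater_than(x, y):
    """Two-place predicate: True exactly when x is greater than y."""
    return x > y

def like(x, y):
    """Two-place predicate backed by the small fact base above."""
    return (x, y) in LIKES

# Fixing only x = 3 leaves a one-place predicate in y, not yet a statement.
greater_than_3 = partial(greater_than, 3)

print(greater_than_3(5))      # the statement greater-than (3, 5)
print(like("Ram", "Mohan"))   # the statement LIKE (RAM, MOHAN)
```

Only when every parameter has received a value does the call return a truth
value, which mirrors the distinction between a predicate and a statement.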
Function symbols can also be used in first-order logic. For example, we can
use product (x, y) to denote x * y and father (x) to mean the ‘father of x’.
The statement ‘Mohan’s father loves Mohan’ can be symbolised as LOVE (father
(Mohan), Mohan). Thus, we need not know the name of Mohan’s father and can
still talk about him. A function serves such a role.
We may note that LIKE (Ram, Mohan) and LOVE (father (Mohan),Mohan) are
atoms or atomic statements of PL, in the sense that, one can associate a truth-
value True or False with each of these, and each of these does not involve a
logical operator like ~, ∧, ∨, → or ↔.
Summarizing the above discussion, LIKE (Ram, Mohan) and LOVE (father (Mohan),
Mohan) are atoms; GREATER, LOVE and LIKE are predicate symbols; x and y are
variables; 3, Ram and Mohan are constants; and father and product are function
symbols.
From the above discussion we learned the following concepts of symbols.
i) Individual symbols or constant symbols: These are usually names of
objects, such as Ram, Mohan, numbers like 3, 5 etc.
ii) Variable symbols: These are usually lowercase unsubscripted or
subscripted letters, like x, y, z, x3.
iii) Function symbols: These are usually lowercase letters like f, g, h,….or
strings of lowercase letters such as father and product.
iv) Predicate symbols: These are usually uppercase letters like P, Q, R,….or
strings of lowercase letters such as greater-than, is_tall etc.
A function symbol or predicate symbol takes a fixed number of arguments. If
a function symbol f takes n arguments, f is called an n-place function symbol.
Similarly, if a predicate symbol Q takes m arguments, Q is called an m-place
predicate symbol. For example, father is a one-place function symbol, and
GREATER and LIKE are two-place predicate symbols. However, father_of in
father_of (x, y) is a two-place predicate symbol.
The symbolic representation of an argument of a function or a predicate is
called a term where a term is defined recursively as follows:
i) A variable is a term.
ii) A constant is a term.
iii) If f is an n-place function symbol, and t1….tn are terms, then f(t1,….,tn) is
a term.
iv) Any term can be generated only by the application of the rules given above.
For example, since y and 3 are both terms and plus is a two-place function
symbol, plus (y, 3) is a term according to the above definition.
Furthermore, we can see that plus (plus (y, 3), y) and father (father (Mohan))
are also terms; the former denotes (y + 3) + y and the latter denotes the
grandfather of Mohan.
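The recursive definition of a term maps naturally onto a small recursive data
type. The following is a minimal sketch, not part of the original unit; the
class names Var, Const and Func and the helper show are assumptions introduced
for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Var:            # rule (i): a variable is a term
    name: str

@dataclass(frozen=True)
class Const:          # rule (ii): a constant is a term
    name: str

@dataclass(frozen=True)
class Func:           # rule (iii): f(t1, ..., tn) is a term if each ti is a term
    symbol: str
    args: Tuple["Term", ...]

Term = Union[Var, Const, Func]

def show(t: Term) -> str:
    """Render a term in the notation used in the text."""
    if isinstance(t, (Var, Const)):
        return t.name
    return f"{t.symbol}({', '.join(show(a) for a in t.args)})"

y, three, mohan = Var("y"), Const("3"), Const("Mohan")
t1 = Func("plus", (Func("plus", (y, three)), y))
t2 = Func("father", (Func("father", (mohan,)),))

print(show(t1))
print(show(t2))
```

Because Func holds a tuple of terms, nesting to any depth is allowed, exactly
as rule (iii) permits.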
A predicate can be thought of as a function that maps a list of constant
arguments to T or F. For example, GREATER is a predicate with GREATER (5, 2)
as T, but GREATER (1, 3) as F.
We already know that in PL, an atom or atomic statement is an indivisible unit
for representing and validating arguments. Atoms in PL are generally denoted
by symbols like P, Q and R. In FOPL, however, an atom is defined as follows.
Definition: An atom is
(i) either an atom of Propositional Logic, or
(ii) an expression P (t1,….,tn), obtained from an n-place predicate symbol P
and terms t1,….,tn.
Once the atoms are defined, we can build complex formulas of FOPL by using the
logical connectives defined in Propositional Logic, which are assumed to have
the same meaning in FOPL. Two special symbols, ∀ and ∃, are used to denote
quantification in FOPL. The symbols ∀ and ∃ are called, respectively, the
universal quantifier and the existential quantifier. For a variable x, (∀x) is
read as ‘for all x’, and (∃x) is read as ‘there exists an x’. Next, we consider
some examples to illustrate the concepts discussed above.
In order to symbolize the following statements:
i) There exists a number that is rational.
ii) Every rational number is a real number
iii) For every number x, there exists a number y, which is greater than x.
let us denote ‘x is a rational number’ by Q(x), ‘x is a real number’ by R(x),
and ‘x is less than y’ by LESS(x, y). Then the above statements may be
symbolized, respectively, as
(i) (∃x) Q(x)
(ii) (∀x) (Q(x) → R(x))
(iii) (∀x) (∃y) LESS(x, y).
Each of the expressions (i), (ii), and (iii) is called a formula or a
well-formed formula or wff.
Next, we discuss three new concepts, viz. the scope of an occurrence of a
quantifier, a bound occurrence of a variable, and a free occurrence of a
variable.
Before discussing these concepts, we should know the difference between a
variable and an occurrence of a variable in a quantified expression.
The variable x has THREE occurrences in the formula
(∃x) Q(x) → P(x, y).
Also, the variable y has only one occurrence and the variable z has zero
occurrence in the above formula. Next, we define the three concepts mentioned
above.
The scope of an occurrence of a quantifier is the smallest complete formula
following the quantifier, sometimes delimited by a pair of parentheses. For
example, Q(x) is the scope of (∃x) in the formula
(∃x) Q(x) → P(x, y).
But the scope of (∃x) in the formula (∃x) (Q(x) → P(x, y)) is (Q(x) → P(x, y)).
Further, in the formula
(∃x) (P(x) → Q(x, y)) ∧ (∃x) (P(x) → R(x, 3)),
the scope of the first occurrence of (∃x) is the formula (P(x) → Q(x, y)) and
the scope of the second occurrence of (∃x) is the formula
(P(x) → R(x, 3)).
As another example, the scope of the only occurrence of the quantifier (∀y) in
(∃x) ((P(x) → Q(x)) ↔ (∀y) (Q(x) → R(y))) is (Q(x) → R(y)). But the scope
of the only occurrence of the existential quantifier (∃x) in the same formula
is the formula:
((P(x) → Q(x)) ↔ (∀y) (Q(x) → R(y)))
An occurrence of a variable in a formula is bound if and only if the occurrence
is within the scope of a quantifier employing the variable, or is the occurrence
in that quantifier. An occurrence of a variable in a formula is free if and only if
this occurrence of the variable is not bound.
Thus, in the formula (∃x) P(x, y) → Q(x), there are three occurrences of x, of
which the first two are bound, whereas the last occurrence of x is free,
because the scope of (∃x) in the above formula is P(x, y). The only occurrence
of y in the formula is free. Thus, x is both a bound and a free variable in the
above formula, and y is only a free variable in the formula. So far, we have
talked of an occurrence of a variable as free or bound. Now, we talk of a
variable itself as free or bound. A variable is free in a formula if at least
one occurrence of it is free in the formula. A variable is bound in a formula
if at least one occurrence of it is bound.
It may be noted that a variable can be both free and bound in a formula. In
order to further elucidate the concepts of scope, free and bound occurrences of
a variable, we consider a similar but different formula for the purpose:
(∃x) (P(x, y) → Q(x)).
In this formula, the scope of the only occurrence of the quantifier (∃x) is the
whole of the rest of the formula, viz. the scope of (∃x) in the given formula
is (P(x, y) → Q(x)).
Also, all three occurrences of the variable x are bound. The only occurrence of
y is free.
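The two formulas just compared can be used to sketch how the free variables of
a formula may be computed. This is a minimal illustration under stated
assumptions: formulas are encoded as nested tuples, the set VARIABLES tells
variables apart from constants, and the function name free_vars is
hypothetical.

```python
# Assumed set of variable names; any other argument name is treated as a constant.
VARIABLES = {"x", "y", "z"}

def free_vars(f):
    """Return the set of variables that have at least one free occurrence in f."""
    tag = f[0]
    if tag == "atom":                       # ("atom", predicate, (arg, ...))
        return {a for a in f[2] if a in VARIABLES}
    if tag == "not":                        # ("not", formula)
        return free_vars(f[1])
    if tag in ("and", "or", "implies", "iff"):
        return free_vars(f[1]) | free_vars(f[2])
    if tag in ("forall", "exists"):         # (quantifier, variable, scope)
        return free_vars(f[2]) - {f[1]}     # the quantifier binds its variable
    raise ValueError(f"unknown formula tag: {tag}")

P_xy = ("atom", "P", ("x", "y"))
Q_x = ("atom", "Q", ("x",))

f1 = ("implies", ("exists", "x", P_xy), Q_x)   # (∃x) P(x, y) → Q(x)
f2 = ("exists", "x", ("implies", P_xy, Q_x))   # (∃x) (P(x, y) → Q(x))

print(sorted(free_vars(f1)))
print(sorted(free_vars(f2)))
```

In f1 the last occurrence of x lies outside the scope of (∃x), so x is reported
as free; in f2 only y is free.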
Remarks: It may be noted that a bound variable x is just a placeholder or a
dummy variable, in the sense that all occurrences of a bound variable x may
be replaced by another variable, say y, which does not occur in the formula.
However, once x is replaced by y, then y becomes bound. For example, (∀x) f(x)
is the same as (∀y) f(y). It is something like
∫₁² x² dx = ∫₁² y² dy = 2³/3 − 1³/3 = 7/3.
Replacing a bound variable x by another variable y under the restrictions
mentioned above is called renaming of the variable x.
Having defined an atomic formula of FOPL, next, we consider the definition
of a general formula formally in terms of atoms, logical connectives, and
quantifiers.
Definition: A well-formed formula (wff), or just a formula, in FOPL is defined
recursively as follows:
i) An atom or atomic formula is a wff.
ii) If E and G are wffs, then each of ~ (E), (E ∨ G), (E ∧ G), (E → G), (E ↔ G)
is a wff.
iii) If E is a wff and x is a free variable in E, then (∀x)E and (∃x)E are wffs.
iv) A wff can be obtained only by applications of (i), (ii), and (iii) given above.
We may drop pairs of parentheses by agreeing that quantifiers have the
least scope. For example, (∃x) P(x, y) → Q(x) stands for
((∃x) P(x, y)) → Q(x)
We may note the following two cases of translation:
(i) ‘for all x, P(x) is Q(x)’ is translated as
(∀x) (P(x) → Q(x))
(the other possibility, (∀x) P(x) → Q(x), is not a correct translation).
(ii) ‘for some x, P(x) is Q(x)’ is translated as (∃x) (P(x) ∧ Q(x))
(the other possibility, (∀x) P(x) ∧ Q(x), is not a correct translation).
Example
Translate the statement: Every man is mortal. Raman is a man. Therefore,
Raman is mortal.
As discussed earlier, let us denote “x is a man” by MAN (x), and “x is mortal”
by MORTAL (x). Then “every man is mortal” can be represented by
(∀x) (MAN(x) → MORTAL(x)),
“Raman is a man” by
MAN(Raman),
and “Raman is mortal” by
MORTAL(Raman).
The whole argument can now be represented by
((∀x) (MAN(x) → MORTAL(x)) ∧ MAN(Raman)) → MORTAL(Raman)
as a single statement.
In order to further explain symbolisation let us recall the axioms of natural
numbers:
(1) For every number, there is one and only one immediate successor,
(2) There is no number for which 0 is the immediate successor.
(3) For every number other than 0, there is one and only one immediate
predecessor.
Let the immediate successor and predecessor of x, respectively be denoted by
f(x) and g(x).
Let E (x, y) denote x is equal to y. Then the axioms of natural numbers are
represented respectively by the formulas:
(i) (∀x) (∃y) (E(y, f(x)) ∧ (∀z) (E(z, f(x)) → E(y, z)))
(ii) ~ ((∃x) E(0, f(x))) and
(iii) (∀x) (~ E(x, 0) → (∃y) (E(y, g(x)) ∧ (∀z) (E(z, g(x)) → E(y, z)))).
From the semantics (meaning or interpretation) point of view, the wffs of
FOPL may be divided into two categories:
(i) wffs in each of which all occurrences of variables are bound;
(ii) wffs in each of which at least one occurrence of a variable is free.
The wffs of FOPL in which there is no free occurrence of a variable are like
wffs of PL, in the sense that we can call each of them True, False, consistent,
inconsistent, valid, invalid etc. Each such formula is called a closed formula.
However, when a wff involves a free occurrence, it is not possible to call
such a wff True, False etc. Each such formula is called an open formula.
For example, each of the formulas greater (x, y), greater (x, 3), (∀y) greater
(x, y) has a free occurrence of the variable x. Hence, each is an open formula.
Each of the formulas (∀x) (∃y) greater (x, y), (∀y) greater (y, 1), greater
(9, 2) does not have a free occurrence of any variable. Therefore, each of
these formulas is a closed formula.
Next, we discuss some equivalences and inequalities.
The following equivalences hold for any two formulas P(x) and Q(x):
(i) (∀x) P(x) ∧ (∀x) Q(x) = (∀x) (P(x) ∧ Q(x))
(ii) (∃x) P(x) ∨ (∃x) Q(x) = (∃x) (P(x) ∨ Q(x))
But the following inequalities hold, in general:
(iii) (∀x) (P(x) ∨ Q(x)) ≠ (∀x) P(x) ∨ (∀x) Q(x)
(iv) (∃x) (P(x) ∧ Q(x)) ≠ (∃x) P(x) ∧ (∃x) Q(x)
We justify (iii) & (iv) below:
Let P(x): x is an odd natural number,
Q(x): x is an even natural number.
Then the L.H.S. of (iii) above states that every natural number is either odd
or even, which is correct. But the R.H.S. of (iii) states that every natural
number is odd or every natural number is even, which is not correct.
Next, the L.H.S. of (iv) states that there is a natural number which is both
even and odd, which is not correct. However, the R.H.S. of (iv) says that there
is a natural number which is odd and there is a natural number which is even,
which is correct.
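The justification of (iii) and (iv) can be replayed on an initial segment of
the natural numbers. The sketch below assumes the finite range 1 to 20 as a
stand-in for the natural numbers; the odd/even predicates are those of the
text.

```python
# Finite stand-in for the natural numbers (an assumption for the illustration).
N = range(1, 21)

def P(x):
    return x % 2 == 1      # x is an odd natural number

def Q(x):
    return x % 2 == 0      # x is an even natural number

lhs_iii = all(P(x) or Q(x) for x in N)                    # (∀x) (P(x) ∨ Q(x))
rhs_iii = all(P(x) for x in N) or all(Q(x) for x in N)    # (∀x) P(x) ∨ (∀x) Q(x)

lhs_iv = any(P(x) and Q(x) for x in N)                    # (∃x) (P(x) ∧ Q(x))
rhs_iv = any(P(x) for x in N) and any(Q(x) for x in N)    # (∃x) P(x) ∧ (∃x) Q(x)

print(lhs_iii, rhs_iii)
print(lhs_iv, rhs_iv)
```

The two sides of each pair differ for this choice of P and Q, which is all that
is needed to show that the corresponding formulas are not equivalent.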
Equivalences involving Negation of Quantifiers
(v) ~ (∀x) P(x) = (∃x) ~ P(x)
(vi) ~ (∃x) P(x) = (∀x) ~ P(x)
Examples: For each of the following closed formulas, prove the stated property:
(i) (∀x) P(x) ∧ (∃y) ~ P(y) is inconsistent.
(ii) (∀x) P(x) → (∃y) P(y) is valid.
Solution: (i) Consider
(∀x) P(x) ∧ (∃y) ~ P(y)
= (∀x) P(x) ∧ ~ (∀y) P(y) (taking negation out)
But we know that a bound variable is a dummy, and can be replaced uniformly
throughout its scope by another variable. Hence, denoting the formula by R,
R = (∀x) P(x) ∧ ~ (∀x) P(x)
Each conjunct of the formula is either True or False and, hence, can be thought
of as a formula of PL instead of a formula of FOPL. Let us replace (∀x) P(x)
by Q, a formula of PL. Then
R = Q ∧ ~ Q = False
Hence, the proof.
(ii) Consider
(∀x) P(x) → (∃y) P(y)
Replacing ‘→’ we get
= ~ (∀x) P(x) ∨ (∃y) P(y)
= (∃x) ~ P(x) ∨ (∃y) P(y)
= (∃x) ~ P(x) ∨ (∃x) P(x) (renaming y as x in the second disjunct)
In other words,
= (∃x) (~ P(x) ∨ P(x)) (using equivalence (ii) above)
The last formula states that there is at least one element, say b, for which
~ P(b) ∨ P(b) holds, i.e., for b, either P(b) is False or P(b) is True.
But, as P is a predicate symbol and b is a constant, ~ P(b) ∨ P(b) must be True.
Hence, the proof.
Check Your Progress 1
Ex. 1 Let P(x) and Q(x) represent “x is a rational number” and “x is a real
number,” respectively. Symbolize the following sentences:
(i) Every rational number is a real number.
(ii) Some real numbers are rational numbers.
(iii) Not every real number is a rational number.
Ex. 2 Let C(x) mean “x is a used-car dealer,” and H(x) mean “x is honest.”
Translate each of the following into English:
(i) (∃x)C(x)
(ii) (∃x) H(x)
(iii) (∃x) (C(x) → ~ H(x))
(iv) (∃x) (C(x) ∧ H(x))
(v) (∃x) (H(x) → C(x)).
Ex. 3 Prove the following:
(i) P(a) → ~ ((∃x) P(x)) is consistent.
(ii) (∀x) P(x) ∨ ((∃y) ~ P(y)) is valid.
5.5 INFERENCING & ENTAILMENT IN FOPL
(i) Universal Instantiation Rule (U.I.)
(∀x) P(x)
----------
P(a)
(ii) Universal Generalisation Rule (U.G.)
P(a), for all a
----------
(∀x) P(x)
The rule says that if it is known that for all constants a, the statement P(a)
is True, then we can, instead, use the formula (∀x) P(x).
The rule associates with a set of formulas P(a), for all a, of Propositional
Logic, a formula (∀x) P(x) of FOPL.
Before using the rule, we must ensure that P(a) is True for all a. Otherwise,
it may lead to wrong conclusions.
(iii) Existential Instantiation Rule (E.I.)
(∃x) P(x)
----------
P(a)
The rule says that if the Truth of (∃x) P(x) is known, then we can assume the
Truth of P(a) for some fixed a. The rule, again, associates a formula P(a) of
Propositional Logic to a formula (∃x) P(x) of FOPL.
An inappropriate application of this rule may lead to wrong conclusions. The
source of possible errors lies in the fact that the choice of ‘a’ in the rule
is not arbitrary and cannot be known at the time of deducing P(a) from
(∃x) P(x).
(iv) Existential Generalisation Rule (E.G.)
P(a)
----------
(∃x) P(x)
The rule states that if P(a), a formula of Propositional Logic, is True, then
(∃x) P(x), a formula of FOPL, may be assumed to be True.
The Universal Generalisation (U.G.) and Existential Instantiation (E.I.)
rules should be applied with utmost care; the other two rules may be applied
whenever it appears to be appropriate.
The purpose of the two rules, viz.,
(i) Universal Instantiation Rule (U.I.)
(iii) Existential Instantiation Rule (E.I.)
is to associate formulas of Propositional Logic (PL) to formulas of FOPL in
such a manner that the validity of arguments is not disturbed by these
associations. Once we get formulas of PL, any of the eight rules of inference
of PL may be used to validate conclusions and solve problems requiring logical
reasoning for their solutions.
The purpose of the other Quantification rules, viz. for generalisation, i.e.,
(ii) P(a), for all a
----------
(∀x) P(x)
(iv) P(a)
----------
(∃x) P(x)
(i) To conclude F(a) ∧ G(a) → H(a) ∧ I(a)
from (∀x) (F(x) ∧ G(x)) → H(x) ∧ I(x)
using Universal Instantiation (U.I.)
The above inference or conclusion is incorrect in view of the fact that the
scope of the universal quantifier is only the formula F(x) ∧ G(x), and not the
whole of the formula.
F (a) ∧ G (a) → H ( x) ∧ I ( x)
(iii) To conclude ~ F(a) for an arbitrary a, from ~ (∀x) F(x) using U.I.
The conclusion is incorrect, because actually
~ (∀x) F(x) = (∃x) ~ F(x)
Thus, the inference is not a case of U.I., but of Existential Instantiation
(E.I.). Further, as per the restrictions, we cannot say for which a, ~ F(a) is
True. Of course, ~ F(a) is True for some constant a, but not necessarily for a
pre-assigned constant a.
(iv) to conclude ( ( F (b) ∧ G (b) → H (c) )
Therefore, (i), (ii) and (iii) can be symbolized as:
(xvii) M(b)
Using conjunction for (viii), (ix) and (xvii) we get
(xviii) J(b) ∧C(b) ∧M(b)
From (xviii), through Existential Generalization we get the required (iv), i.e.
(∃x) (J(x) ∧C(x) ∧M(x))
Remark: It may be noted that the order of occurrence of quantifiers is not, in
general, commutative, i.e.,
(Q1x) (Q2y) ≠ (Q2y) (Q1x)
For example,
(∀x) (∃y) F(x, y) ≠ (∃y) (∀x) F(x, y) (A)
The occurrence of (∃y) on the L.H.S. depends on x, i.e., y on the L.H.S. is a
function of x. However, the occurrence of (∃y) on the R.H.S. is independent of
x; hence, y on the R.H.S. is not a function of x.
For example, if we take F(x, y) to mean:
y and x are integers such that y > x,
then the L.H.S. of (A) above states: For each x there is a y such that y > x.
The statement is true in the domain of integers.
On the other hand, the R.H.S. of (A) above states that: There is an integer y
which is greater than x, for all x.
This statement is not true in the domain of integers.
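The effect of swapping the two quantifiers can also be observed on a finite
domain. Since a finite set of integers has a largest element, the relation
y > x does not work there; the sketch below therefore assumes a different,
hypothetical relation F(x, y), namely ‘y is the successor of x modulo 5’, on
the domain {0, 1, 2, 3, 4}.

```python
# Hypothetical finite domain and relation, chosen only for illustration.
D = range(5)

def F(x, y):
    return y == (x + 1) % 5   # y is the successor of x, wrapping around at 5

# (∀x) (∃y) F(x, y): the witness y may depend on x.
forall_exists = all(any(F(x, y) for y in D) for x in D)

# (∃y) (∀x) F(x, y): a single y must work for every x.
exists_forall = any(all(F(x, y) for x in D) for y in D)

print(forall_exists, exists_forall)
```

Every x has its own successor, so the first formula holds, but no single y is
the successor of every x, so the second does not.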
When logical statements are interconnected in such a manner that one is a
consequence of the others, we speak of logical consequence (also called
entailment). Logical consequence is a fundamental concept in logical
reasoning; it describes the relationship between statements that holds when
one statement logically follows from one or more other statements.
A valid logical argument is one in which the conclusion is entailed by the
premises, because the conclusion is the consequence of the premises. The
philosophical analysis of logical consequence involves the questions: In what
sense does a conclusion follow from its premises? and What does it mean for
a conclusion to be a consequence of premises? All of philosophical logic is
meant to provide accounts of the nature of logical consequence and the nature
of logical truth.
Logical consequence is necessary and formal, and is explained by way of formal
proofs and models of interpretation. A sentence is said to be a logical
consequence of a set of sentences, for a given language, if and only if, using
only logic (i.e., without regard to any personal interpretations of the
sentences), the sentence must be true if every sentence in the set is true.
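The definition can be illustrated by a brute-force search for a
counterexample to the argument about Raman used earlier in this unit. The
sketch below is only an illustration under stated assumptions: it inspects
interpretations over one small domain (with a hypothetical second individual,
Sita), which demonstrates the idea but is not a proof of entailment over all
possible domains.

```python
from itertools import product

# Small domain of individuals; "Sita" is a hypothetical second individual.
DOMAIN = ["Raman", "Sita"]

def interpretations():
    """Yield every assignment of True/False to MAN and MORTAL over DOMAIN."""
    for man_bits in product([False, True], repeat=len(DOMAIN)):
        for mortal_bits in product([False, True], repeat=len(DOMAIN)):
            yield dict(zip(DOMAIN, man_bits)), dict(zip(DOMAIN, mortal_bits))

def premises_hold(man, mortal):
    # Premise 1: (∀x) (MAN(x) → MORTAL(x)); premise 2: MAN(Raman).
    every_man_is_mortal = all((not man[d]) or mortal[d] for d in DOMAIN)
    return every_man_is_mortal and man["Raman"]

# A counterexample makes both premises true and the conclusion MORTAL(Raman) false.
counterexamples = [
    (man, mortal)
    for man, mortal in interpretations()
    if premises_hold(man, mortal) and not mortal["Raman"]
]

print(len(counterexamples))
```

No interpretation in this search makes the premises true and the conclusion
false, which is what the definition of logical consequence requires.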
5.6 CONVERSION TO CLAUSAL FORM
In order to facilitate problem solving through Propositional Logic, we discussed
two normal forms, viz. the conjunctive normal form (CNF) and the disjunctive
normal form (DNF). In FOPL, there is a normal form called the prenex normal
form. Further, a statement in Prenex Normal Form is required to be Skolemized
to get the clausal form, which can be used for the purpose of Resolution.
So, the first step towards the clausal form is to obtain the Prenex Normal
Form (PNF), and the second step is Skolemization, which will be discussed
after PNF.
Prenex Normal Form (PNF): In a broad sense, it relates to re-alignment of the
quantifiers, i.e., bringing all the quantifiers to the beginning of the
expression. Skolemization then replaces the existentially quantified variables
with constants and functions, and the universal quantifiers are dropped, in
order to bring the statement into the clausal form.
The use of a prenex normal form of a formula simplifies the proof procedures,
to be discussed.
Definition A formula G in FOPL is said to be in a prenex normal form if and
only if the formula G is in the form
(Q1x1)….(Qn xn) P
where each (Qixi), for i = 1, ….,n, is either (∀xi) or (∃xi), and P is a quantifier
free formula. The expression (Q1x1)….(Qn xn) is called the prefix and P is
called the matrix of the formula G.
Examples of some formulas in prenex normal form:
(i) (∃x) (∀y) (R(x, y) ∨ Q(y)), (∀x) (∀y) (~ P(x, y) → S(y)),
(ii) (∀x) (∀y) (∃z) (P(x, y) → R (z)).
Next, we consider a method of transforming a given formula into a prenex
normal form. For this, first we discuss equivalence of formulas in FOPL. Let
us recall that two formulas E and G are equivalent, denoted by E = G, if and
only if the truth values of E and G are identical under every interpretation.
The pairs of equivalent formulas given in the Table of Equivalent Formulas of
the previous unit are still valid, as these are quantifier-free formulas of
FOPL. However, there are pairs of equivalent formulas of FOPL that contain
quantifiers. Next, we discuss these additional pairs of equivalent formulas.
We introduce some notation specific to FOPL: the symbol G denotes a formula
that does not contain any free occurrence of the variable x, and Q denotes a
quantifier which is either ∀ or ∃.
In the rest of the discussion of FOPL, P[x] is used to denote the fact that x
is a free variable in the formula P, for example, P[x] = (∀y) P(x, y).
Similarly, R[x, y] denotes that the variables x and y occur as free variables
in the formula R. Some of these equivalences we have discussed earlier.
Then, the following laws involving quantifiers hold good in FOPL:
(i) ( Qx ) P [ x] ∨ G = ( Qx ) ( P [x ] ∨ G).
(ii) ( Qx ) P [x ] ∧ G = ( Qx ) ( P [x] ∧ G).
In the above two formulas, Q may be either ∀ or ∃.
(iii) ~ (( ∀x ) P [ x ]) = (∃x ) ( ~ P [ x ] ).
(iv) ~ (( ∃x) P [ x ] ) = ( ∀x ) ( ~ P [ x ]).
(v) (∀x) P [x] ∧ (∀x) H [x] = (∀x) (P [x] ∧ H [x]).
(vi) (∃x) P [x] ∨ (∃x) H [x] = (∃x) (P [x] ∨ H [x]).
That is, the universal quantifier ∀ distributes over ∧ and the existential
quantifier ∃ distributes over ∨.
But we must be careful, because the following pairs are not equivalent (we have
already mentioned these inequalities):
(vii) (∀x) P [x] ∨ (∀x) H [x] ≠ (∀x) (P [x] ∨ H [x]) and
(viii) (∃x) P [x] ∧ (∃x) H [x] ≠ (∃x) (P [x] ∧ H [x])
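Laws (v) to (viii) can be checked mechanically on a small finite domain. The following is a minimal sketch in Python, assuming a two-element domain and unary predicates P and H (the names DOMAIN, all_predicates and agree_everywhere are illustrative only). Agreement on a finite domain is merely evidence for (v) and (vi), whereas a single disagreement is a genuine counterexample, which is what (vii) and (viii) need.

```python
from itertools import product

DOMAIN = [0, 1]  # assumed two-element domain; enough to expose counterexamples


def all_predicates(domain):
    """Yield every unary predicate on the domain as a dict: element -> bool."""
    for values in product([False, True], repeat=len(domain)):
        yield dict(zip(domain, values))


def agree_everywhere(lhs, rhs):
    """True if lhs and rhs have the same truth value for all predicates P, H."""
    return all(lhs(P, H) == rhs(P, H)
               for P in all_predicates(DOMAIN)
               for H in all_predicates(DOMAIN))


def forall(pred):
    return all(pred(x) for x in DOMAIN)


def exists(pred):
    return any(pred(x) for x in DOMAIN)


# (v)   (∀x) P[x] ∧ (∀x) H[x]  versus  (∀x) (P[x] ∧ H[x])
law_v = agree_everywhere(
    lambda P, H: forall(lambda x: P[x]) and forall(lambda x: H[x]),
    lambda P, H: forall(lambda x: P[x] and H[x]))
# (vi)  (∃x) P[x] ∨ (∃x) H[x]  versus  (∃x) (P[x] ∨ H[x])
law_vi = agree_everywhere(
    lambda P, H: exists(lambda x: P[x]) or exists(lambda x: H[x]),
    lambda P, H: exists(lambda x: P[x] or H[x]))
# (vii) (∀x) P[x] ∨ (∀x) H[x]  versus  (∀x) (P[x] ∨ H[x])
law_vii = agree_everywhere(
    lambda P, H: forall(lambda x: P[x]) or forall(lambda x: H[x]),
    lambda P, H: forall(lambda x: P[x] or H[x]))
# (viii) (∃x) P[x] ∧ (∃x) H[x]  versus  (∃x) (P[x] ∧ H[x])
law_viii = agree_everywhere(
    lambda P, H: exists(lambda x: P[x]) and exists(lambda x: H[x]),
    lambda P, H: exists(lambda x: P[x] and H[x]))

print(law_v, law_vi, law_vii, law_viii)  # True True False False
```

The counterexample found for both (vii) and (viii) is the pair of predicates in which P holds only for one element and H holds only for the other.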
Steps for Transforming an FOPL Formula into Prenex Normal Form
Step 1 Remove the connectives ‘↔’ and ‘→’ using the equivalences
P ↔ G = (P → G) ∧ (G → P)
P → G = ~ P ∨ G
Step 2 Use the following equivalence to remove double negations
~ ( ~ P) = P
Step 3 Apply De Morgan’s laws in order to bring the negation signs immediately
before atoms.
~ (P ∨ G) = ~ P ∧ ~ G
~ (P ∧ G) = ~ P ∨ ~ G
and the quantification laws
~ ((∀x) P[x]) = (∃x) (~P[x])
~ ((∃x) P[x]) = (∀x) (~P[x])
Step 4 Rename bound variables if necessary.
Step 5 Bring quantifiers to the left before any predicate symbol appears in the
formula. This is achieved by using (i) to (vi) discussed above.
We have already discussed that, if all occurrences of a bound variable are
replaced uniformly throughout by another variable not occurring in the formula,
then the equivalence is preserved. Also, we mentioned under (vii) that ∀ does
not distribute over ∨ and under (viii) that ∃ does not distribute over ∧. In such
cases, in order to bring quantifiers to the left of the rest of the formula, we may
have to first rename one of the bound variables; say, x may be renamed as z,
where z does not occur either free or bound in the other component formulas.
And then we may use the following equivalences.
(Q1 x) P[x] ∨ (Q2 x) H[x] = (Q1 x) (Q2 z) (P[x] ∨ H[z])
(Q3 x) P[x] ∧ (Q4 x) H[x] = (Q3 x) (Q4 z) (P[x] ∧ H[z])
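The five steps can be sketched as a small program. The following Python sketch is only an illustration under stated assumptions: formulas are nested tuples such as ("forall", "x", body), ("imp", f, g) and ("atom", "Q", ("x",)); terms are plain variable names; ‘↔’ is assumed to have been expanded already; and fresh bound names x1, x2, ... are assumed not to clash with any free variable. The function names nnf, rename, pull and prenex are hypothetical.

```python
from itertools import count


def nnf(f, neg=False):
    """Steps 1-3: remove '->' and push '~' inwards until it precedes atoms."""
    op = f[0]
    if op == "atom":
        return ("not", f) if neg else f
    if op == "not":
        return nnf(f[1], not neg)            # ~(~P) = P is handled by the flag
    if op == "imp":                          # P -> G  =  ~P v G
        return nnf(("or", ("not", f[1]), f[2]), neg)
    if op in ("and", "or"):
        if neg:                              # De Morgan's laws
            op = "or" if op == "and" else "and"
        return (op, nnf(f[1], neg), nnf(f[2], neg))
    if neg:                                  # ~(∀x)P = (∃x)~P and ~(∃x)P = (∀x)~P
        op = "exists" if op == "forall" else "forall"
    return (op, f[1], nnf(f[2], neg))


def rename(f, env, fresh):
    """Step 4: give every bound variable its own fresh name."""
    op = f[0]
    if op == "atom":
        return ("atom", f[1], tuple(env.get(v, v) for v in f[2]))
    if op == "not":
        return ("not", rename(f[1], env, fresh))
    if op in ("and", "or"):
        return (op, rename(f[1], env, fresh), rename(f[2], env, fresh))
    new = f"{f[1]}{next(fresh)}"
    return (op, new, rename(f[2], {**env, f[1]: new}, fresh))


def pull(f):
    """Step 5: split a renamed formula into (prefix, matrix)."""
    op = f[0]
    if op in ("forall", "exists"):
        prefix, matrix = pull(f[2])
        return [(op, f[1])] + prefix, matrix
    if op in ("and", "or"):
        p1, m1 = pull(f[1])
        p2, m2 = pull(f[2])
        return p1 + p2, (op, m1, m2)
    return [], f                             # an atom or a negated atom


def prenex(f):
    return pull(rename(nnf(f), {}, count(1)))


# (∀x) (Q(x) → (∃x) R(x, y))
formula = ("forall", "x", ("imp", ("atom", "Q", ("x",)),
                           ("exists", "x", ("atom", "R", ("x", "y")))))
prefix, matrix = prenex(formula)
print(prefix)  # [('forall', 'x1'), ('exists', 'x2')]
print(matrix)  # ('or', ('not', ('atom', 'Q', ('x1',))), ('atom', 'R', ('x2', 'y')))
```

The result reads (∀x1) (∃x2) (~ Q(x1) ∨ R(x2, y)). Because the sketch renames every bound variable, it never merges two quantifiers into one, so it may return a longer prefix than a hand derivation that uses laws (v) and (vi).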
Example: Transform the following formulas into prenex normal forms:
(i) (∀x) (Q(x) → (∃x) R (x, y))
(ii) (∃x) (~ (∃y) Q(x, y) → ((∃z) R(z) → S (x)))
(iii) (∀x) (∀y) ((∃z) Q(x, y, z) ∧ ((∃u) R (x, u) → (∃v) R (y, v))).
Part (i)
Step 1: By removing ‘→’, we get
(∀x) (~ Q (x) ∨ (∃x) R (x, y))
Step 2: By renaming x as z in (∃x) R (x, y) the formula becomes
(∀x) (~ Q (x) ∨ (∃z) R (z, y))
Step 3: As ~ Q(x) does not involve z, we get
(∀x) (∃z) (~ Q (x) ∨ R (z, y))
Part (ii)
(∃x) (~ (∃y) Q (x, y) → ((∃z) R (z) → S (x)))
Step 1: Removing the outer ‘→’, we get
(∃x) (~ (~ ((∃y) Q (x, y))) ∨ ((∃z) R (z) → S (x)))
Step 2: Removing the inner ‘→’, and simplifying ~ (~ ( )), we get
(∃x) ((∃y) Q (x, y) ∨ (~ ((∃z) R(z)) ∨ S (x)))
Step 3: Taking ‘~’ inner most, we get
(∃x) ((∃y) Q (x, y) ∨ ((∀z) ~ R(z) ∨ S(x)))
As the first component formula Q (x, y) does not involve z, S(x) involves
neither y nor z, and ~ R(z) does not involve y, we may take out (∃y)
and (∀z), so that we get
(∃x) (∃y) (∀z) (Q (x, y) ∨ (~ R(z) ∨ S (x))), which is the required formula in
prenex normal form.
Part (iii)
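(∀x) (∀y) ((∃z) Q (x, y, z) ∧ ((∃u) R (x, u) → (∃v) R (y, v)))
Step 1: Removing ‘→’, we get
(∀x) (∀y) ((∃z) Q (x, y, z) ∧ (~ ((∃u) R (x, u)) ∨ (∃v) R (y, v)))
Step 2: Taking ‘~’ inner most, we get
(∀x) (∀y) ((∃z) Q (x, y, z) ∧ ((∀u) ~ R (x, u) ∨ (∃v) R (y, v)))
Step 3: As z, u and v are distinct and each occurs only within its own scope,
we may take out (∃z), (∀u) and (∃v), so that we get
(∀x) (∀y) (∃z) (∀u) (∃v) (Q (x, y, z) ∧ (~ R (x, u) ∨ R (y, v))), which is the
required formula in prenex normal form.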
(∀y1) (∀y2) (∀y3) (P (c, f(y1), g(y1, y2), y1, y2) ∧ Q (h (y1, y2), j (y1, y2, y3))).
Check Your Progress -2
Ex: 4 (i) Transform the formula (∀x) P(x) → (∃x) Q(x) into prenex normal
form.
(ii) Obtain a prenex normal form for the formula
(∀x) (∀y) ((∃z) (P(x, y) ∧ P(y, z)) → (∃u) Q (x, y, u))
Ex 5. Obtain a (Skolem) standard form for each of the following formulas:
(i) (∃x) (∀y) (∀v) (∃z) (∀w) (∃u) P (x, y, z, u, v, w)
(ii) (∀x) (∃y) (∃z) ((P (x, y) ∨ ~ Q (x, z)) → R (x, y, z))
5.8 SUMMARY
In this unit, initially, we discuss how PL is inadequate for solving even simple
problems, so that some extension of PL, or some other formal inferencing
system, is required to compensate for the inadequacy. First Order Predicate
Logic (FOPL) is such an extension of PL, and it is discussed in the unit.
Next, syntax of proper structure of a formula of FOPL is discussed. In this
respect, a number of new concepts including those of quantifier, variable,
constant, term, free and bound occurrences of variables; closed and open wff,
consistency/validity of wffs etc. are introduced.
Next, two normal forms viz. Prenex Normal Form (PNF) and Skolem Standard
Normal Form are introduced. Finally, tools and techniques developed in the
unit, are used to solve problems involving logical reasoning.
5.9 SOLUTIONS/ANSWERS
Check Your Progress - 1
Ex. 1 (i) (∀x) (P (x) → Q(x))
(ii) (∃x) (P(x) ∧ Q(x))
(iii) ~ (∀x) ( Q (x) → P(x))
Ex. 2
(i) There is (at least) one (person) who is a used-car dealer.
(ii) There is (at least) one (person) who is honest.
(iii) All used-car dealers are dishonest.
(iv) (At least) one used-car dealer is honest.
(v) There is at least one thing in the universe, (for which it can be said
that) if that something is Honest then that something is a used-car
dealer
Note: the above translation is not the same as: Someone honest is a
used-car dealer.
Ex 3: (i) After removal of ‘→’ we get the given formula
= ~ P(a) ∨ ~ (( ∃x) P(x))
= ~ P(a) ∨ (∀x) (~ P(x))
Now P(a) is an atom in PL which may assume any value T or F. On taking P(a)
as F the given formula becomes T, hence, consistent.
(ii) The formula can be written as
(∀x) P(x) ∨ ~ (∀x) (P(x)), by taking the negation outside the second disjunct
and then renaming.
The formula (∀x) P(x), being closed, is either T or F and hence can be treated
as a formula of PL.
Let (∀x) P(x) be denoted by Q. Then the given formula may be denoted by
Q ∨ ~ Q = True (always). Therefore, the formula is valid.
Check Your Progress - 2
Ex: 4 (i) (∀x) P(x) → (∃x) Q(x)
= ~ ((∀x) P(x)) ∨ (∃x) Q(x) (by removing the connective →)
= (∃x) (~P(x)) ∨ (∃x) Q(x) (by taking ‘~’ inside)
= (∃x) (~P(x) ∨ Q(x)) (by distributivity of ∃x over ∨)
Therefore, a prenex normal form of (∀x) P(x) → (∃x) Q(x) is (∃x) (~P(x) ∨
Q(x)).
(ii) (∀x) (∀y) ((∃z) (P(x, y) ∧ P(y, z)) → (∃u) Q (x, y, u))
= (∀x) (∀y) (~ ((∃z) (P(x, y) ∧ P(y, z)))
∨ (∃u) Q (x, y, u)) (removing the connective →)
= (∀x) (∀y) ((∀z) (~P(x, y) ∨ ~ P(y, z))
∨ (∃u) Q(x, y, u)) (using De Morgan’s laws)
= (∀x) (∀y) (∀z) (∃u) (~P(x, y)
∨ ~ P(y, z) ∨ Q (x, y, u)) (as z and u do not occur in the rest of the
formula outside their respective scopes)
Therefore, we obtain the last formula as a prenex normal form of the first
formula.
Ex 5 (i) In the given formula, (∃x) is not preceded by any universal quantifier.
Therefore, we replace the variable x by a (Skolem) constant c in the formula and
drop (∃x).
Next, the existential quantifier (∃z) is preceded by two universal quantifiers,
viz., (∀y) and (∀v). We replace the variable z in the formula by some function,
say, f(y, v), and drop (∃z). Finally, the existential quantifier (∃u) is preceded by
three universal quantifiers, viz., (∀y), (∀v) and (∀w). Thus, we replace the
variable u in the formula by some function g(y, v, w) and drop the quantifier
(∃u). Finally, we obtain the standard form for the given formula as
(∀y) (∀v) (∀w) P(c, y, f(y, v), g(y, v, w), v, w)
(ii) First of all, we reduce the matrix to CNF.
(P (x, y) ∨ ~ Q (x, z)) → R (x, y, z)
= ~ (P (x, y) ∨ ~ Q (x, z)) ∨ R (x, y, z)
= (~ P (x, y) ∧ Q (x, z)) ∨ R (x, y, z)
= (~ P (x, y) ∨ R (x, y, z)) ∧ (Q (x, z) ∨ R (x, y, z))
Next, in the formula, there are two existential quantifiers, viz., (∃y) and (∃z).
Each of these is preceded by the only universal quantifier, viz., (∀x).
Thus, each of the variables y and z is replaced by a function of x. But the two
functions of x for y and z must be different functions. Let us assume that the
variable y is replaced in the formula by f(x) and the variable z is replaced by
g(x). Thus the initially given formula, after dropping the existential quantifiers,
is in the standard form:
(∀x) ((~ P (x, f(x)) ∨ R (x, f(x), g(x))) ∧ (Q (x, g(x)) ∨ R (x, f(x), g(x))))
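The replacement of existential variables used in Ex 5 can be sketched in a few lines of Python. This is an illustrative sketch, not a complete Skolemization procedure: it works only on the prefix, and the function name skolemize and the pools of names for Skolem constants and functions are assumptions. The order of arguments inside a Skolem function is immaterial.

```python
def skolemize(prefix):
    """Map each existential variable of a prenex prefix to its Skolem term.

    prefix is a list of (quantifier, variable) pairs read from left to right.
    Returns the remaining universal variables and the substitution.
    """
    universals = []                  # universal variables seen so far
    substitution = {}
    constants = iter("cde")          # assumed pool of Skolem constant names
    functions = iter("fghjk")        # assumed pool of Skolem function names
    for quantifier, variable in prefix:
        if quantifier == "forall":
            universals.append(variable)
        elif not universals:         # no universal quantifier precedes it
            substitution[variable] = next(constants)
        else:                        # depends on every preceding universal variable
            substitution[variable] = f"{next(functions)}({', '.join(universals)})"
    return universals, substitution


# Ex 5 (i): (∃x) (∀y) (∀v) (∃z) (∀w) (∃u) P(x, y, z, u, v, w)
prefix_i = [("exists", "x"), ("forall", "y"), ("forall", "v"),
            ("exists", "z"), ("forall", "w"), ("exists", "u")]
print(skolemize(prefix_i))
# (['y', 'v', 'w'], {'x': 'c', 'z': 'f(y, v)', 'u': 'g(y, v, w)'})

# Ex 5 (ii): (∀x) (∃y) (∃z) ...
prefix_ii = [("forall", "x"), ("exists", "y"), ("exists", "z")]
print(skolemize(prefix_ii))
# (['x'], {'y': 'f(x)', 'z': 'g(x)'})
```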
Check Your Progress - 3
Ex 6 : Refer to section 5.7
Ex 7 : Refer to section 5.7
UNIT 6 RULE BASED SYSTEMS AND OTHER FORMALISM
Structure
6.0 Introduction
6.1 Objectives
6.2 Rule Based Systems
6.2.1 Forward chaining
6.0 INTRODUCTION
Computer Science is the study of how to create models that can be represented
in and executed by some computing equipment. In this respect, the task of a
computer scientist is to create, in addition to a model of the problem domain,
a model of an expert of the domain as a problem solver, i.e., someone highly
skilled in solving problems from the domain under consideration. The field
concerned with this task is that of Expert Systems.
First of all we must understand that an expert system is nothing but a computer
program, or a set of computer programs, which contains the knowledge and some
inference capability of an expert, most generally a human expert, in a particular
domain. An expert system is supposed to be capable of reaching some conclusion
based on the inputs provided: the system already contains some pre-existing
information, which is processed together with the inputs to infer a conclusion.
Expert systems belong to the branch of Computer Science called Artificial
Intelligence.
Taking into consideration all the points discussed above, one of the many
possible definitions of an Expert System is: “An Expert System is a computer
program that possesses or represents knowledge in a particular domain, and has
the capability of processing, manipulating or reasoning with this knowledge with
a view to solving a problem or achieving some specific goal.”
Artificial Intelligence programs written to achieve expert-level competence in
solving problems of different domains are, more generally, called knowledge
based systems. A knowledge-based system is any system which performs a job
or task by applying rules of thumb to a symbolic representation of knowledge,
instead of employing mostly algorithmic or statistical methods. Often the term
expert systems is reserved for programs whose knowledge base contains the
knowledge used by human experts, in contrast to knowledge gathered from
textbooks or non-experts. But more often than not, the two terms, expert systems
and knowledge-based systems, are taken as synonyms. Together they represent
the most widespread type of AI application.
One of the underlying assumptions in Artificial Intelligence is that intelligent
behaviour can be achieved through the manipulation of symbol structures
(representing bits of knowledge). One of the main issues in AI is to find appropriate
representation of problem elements and available actions as symbol structures
so that the representation can be used to intelligently solve problems. In AI, an
important criterion for knowledge representation schemes or languages is that
they should support inference. For intelligent action, the inferencing capability
is essential in view of the fact that we can’t represent explicitly everything that
the system might ever need to know–some things have to be left implicit, to be
inferred/deduced by the system as and when needed in problem solving.
In general, a good knowledge representation scheme should have the following
features:
• It should allow us to express the knowledge we wish to represent in the
language. For example, the mathematical statement: Every symmetric and
transitive relation on a domain, need not be reflexive is not expressible in
First Order Logic.
• It should allow new knowledge to be inferred from a basic set of facts, as
discussed above.
• It should have well-defined syntax and semantics.
Building an expert system is known as knowledge engineering and its
practitioners are called knowledge engineers. It is the job of the knowledge
engineer to make sure that the computer has all the knowledge needed to solve a
problem. The knowledge engineer must choose one or more forms in which to
represent the required knowledge, i.e., s/he must choose one or more knowledge
representation schemes.
A number of knowledge representation schemes exist. Some popular knowledge
representation schemes are:
First order logic,
Semantic networks,
Frames,
Scripts and,
Rule-based systems.
As predicate logic has already been discussed, we will discuss the remaining
knowledge representation schemes in this unit.
6.1 OBJECTIVES
After going through this unit, you should be able to:
• Understand the basics of expert systems
• Understand the basics of knowledge based systems
• Discuss the various knowledge representation schemes like rule based
systems, semantic nets, frames, and scripts
Although, in principle, the same set of rules can be used for both forward and
backward chaining, in practice we may choose to write the rules slightly
differently for backward chaining. In backward chaining we are concerned
with matching the conclusion of a rule against some goal that we are trying to
prove. So the ‘then’ or consequent part of the rule is usually not expressed as
an action to take (e.g., add/delete), but as a state which will be true if the
premises are true.
To learn more, let us take a different example in which we use backward
chaining (The system is used to identify an animal based on its properties stored
in the working memory):
Example
1. Let us assume that the working memory initially contains the following
facts:
(has-hair raja) representing the fact “raja has hair”
(big-mouth raja) representing the fact “raja has a big mouth”
(long-pointed-teeth raja) representing the fact “raja has long pointed
teeth”
(claws raja) representing the fact “raja has claws”
Let the existing set of rules be:
1. IF (gives-milk X)
THEN (mammal X)
2. IF (has-hair X)
THEN (mammal X)
3. IF (mammal X) AND (eats-meat X)
THEN (carnivorous X)
4. IF (mammal X) AND (long-pointed-teeth X) AND (claws X)
THEN (carnivorous X)
5. IF (mammal X) AND (does-not-eat-meat X)
THEN (herbivorous X)
6. IF (carnivorous X) AND (dark-spots X)
THEN (cheetah X)
7. IF (herbivorous X) AND (long-legs X) AND (long-neck X) AND (dark-spots X)
THEN (giraffe X)
8. IF (carnivorous X) AND (big-mouth X)
THEN (lion X)
9. IF (herbivorous X) AND (long-trunk X) AND (big-size X)
THEN (elephant X)
10. IF (herbivorous X) AND (white-color X) AND (black-stripes X)
THEN (zebra X)
Now to start the process of inference through backward chaining, the rule
based system will first form a hypothesis and then it will use the antecedent
– consequent rules (previously called condition – action rules) to work
backward toward hypothesis supporting assertions or facts.
Let us take the initial hypothesis that “raja is a lion” and then reason about
whether this hypothesis is viable using backward chaining approach explained
below :
The system searches a rule, which has the initial hypothesis in the consequent
part that someone i.e., raja is a lion, which it finds in rule 8.
The system moves from consequent to antecedent part of rule 8 and it finds
the first condition i.e., the first part of antecedent which says that “raja must
be a carnivorous”.
Next the system searches for a rule whose consequent part declares that
someone i.e., “raja is a carnivorous”, two rules are found i.e., rule 3 and rule
4. We assume that the system tries rule 3 first.
To satisfy the consequent part of rule 3 which now has become the system’s
new hypothesis, the system moves to the first part of antecedent which says
that X i.e., raja has to be mammal.
So a new sub-goal is created in which the system has to check that “raja
is a mammal”. It does so by hypothesizing it and tries to find a rule having
a consequent that someone or X is a mammal. Again the system finds two
rules, rule 1 and rule 2. Let us assume that the system tries rule 1 first.
In rule 1, the system now moves to the first antecedent part which says that
X i.e., raja must give milk for it to be a mammal. The system cannot establish
this, because this hypothesis is neither supported by any of the rules nor
found among the existing facts in the working memory. So the system
abandons rule 1 and tries to use rule 2 to establish that “raja is a
mammal”.
In rule 2, it moves to the antecedent which says that X i.e., raja must have
hair for it to be a mammal. The system already knows this as it is one of the
facts in working memory. So the antecedent part of rule 2 is satisfied and so
the consequent that “raja is a mammal” is established.
Now the system backtracks to rule 3, whose first antecedent part is
satisfied. In the second condition of the antecedent it finds its new sub-goal
and in turn forms a new hypothesis that X i.e., raja eats meat.
The system tries to find a supporting rule or an assertion in the working
memory which says that “raja eats meat” but it finds none. So the system
abandons rule 3 and tries to use rule 4 to establish that “raja is carnivorous”.
In rule 4, the first part of the antecedent says that raja must be a mammal for
it to be carnivorous. The system already knows that “raja is a mammal”
because it was already established when trying to satisfy the antecedents in
rule 3.
The system now moves to second part of antecedent in rule 4 and finds
a new sub-goal in which the system must check that X i.e., raja has long-
pointed-teeth which now becomes the new hypothesis. This is already
established as “ raja has long-pointed-teeth” is one of the assertions of the
working memory.
In the third part of the antecedent in rule 4, the system’s new hypothesis is
that “raja has claws”. This is also already established because it is also one of
the assertions in the working memory.
Now as all the parts of the antecedent in rule 4 are established so its
consequent i.e., “raja is carnivorous” is established.
The system now backtracks to rule 8 where in the second part of the
antecedent says that X i.e., raja must have a big-mouth which now becomes
the new hypothesis. This is already established because the system has an
assertion that “raja has a big mouth”.
Now as the whole antecedent of rule 8 is satisfied so the system concludes
that “raja is a lion”.
We have seen that the system was able to work backward through the antecedent
– consequent rules, using desired conclusions to decide which assertions it
should look for, and ultimately establishing the initial hypothesis.
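The walk-through above can be reproduced with a short recursive procedure. The following Python sketch makes some simplifying assumptions: since every rule talks about the same individual X, the variable is dropped and each fact is a plain string about raja; rules are tried in their numbered order; and established sub-goals are remembered in the fact set. The name prove and the trace messages are illustrative.

```python
# (antecedents, consequent) pairs, in the order of rules 1 to 10
RULES = [
    (["gives-milk"], "mammal"),
    (["has-hair"], "mammal"),
    (["mammal", "eats-meat"], "carnivorous"),
    (["mammal", "long-pointed-teeth", "claws"], "carnivorous"),
    (["mammal", "does-not-eat-meat"], "herbivorous"),
    (["carnivorous", "dark-spots"], "cheetah"),
    (["herbivorous", "long-legs", "long-neck", "dark-spots"], "giraffe"),
    (["carnivorous", "big-mouth"], "lion"),
    (["herbivorous", "long-trunk", "big-size"], "elephant"),
    (["herbivorous", "white-color", "black-stripes"], "zebra"),
]

# Initial working memory: what is known about raja
FACTS = {"has-hair", "big-mouth", "long-pointed-teeth", "claws"}


def prove(goal, facts, rules, trace):
    """Backward chaining: try to establish goal from facts using rules."""
    if goal in facts:
        return True
    for number, (antecedents, consequent) in enumerate(rules, start=1):
        if consequent != goal:
            continue
        trace.append(f"try rule {number} for {goal}")
        # Each antecedent becomes a new sub-goal (hypothesis)
        if all(prove(a, facts, rules, trace) for a in antecedents):
            facts.add(goal)  # remember the established sub-goal
            trace.append(f"rule {number} establishes {goal}")
            return True
    return False


trace = []
print(prove("lion", set(FACTS), RULES, trace))  # True
for step in trace:
    print(step)
```

The trace lists rule 8, then rule 3, rule 1 and rule 2 for the sub-goal mammal, then rule 4, and finally the success of rule 8, which is the same order as in the description above. The sketch assumes that the rules contain no cycles; otherwise the recursion would not terminate.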
How to choose the type of chaining among forward or backward chaining
for a given problem ?
Many of the rule based deduction systems can chain either forward or backward,
but which of these approaches is better for a given problem is the point of
discussion.
First, let us learn some basic things about rules, i.e., how a rule relates its
inputs (i.e., facts) to its outputs (i.e., conclusions). Whenever a particular set
of facts can lead to many conclusions, the rule set is said to have a high degree
of fan out, and it is a strong candidate for backward chaining. On the other
hand, whenever the rules are such that a particular hypothesis can lead to many
questions that must be answered for the hypothesis to be established, the rule
set is said to have a high degree of fan in, and it is a strong candidate for
forward chaining.
To summarize, the following points should help in choosing the type of chaining
for reasoning purpose :
• If the set of facts that we already have, or may establish, can lead to
a large number of conclusions or outputs, but the number of ways or input
paths to reach the particular conclusion in which we are interested is small,
then the degree of fan out is more than the degree of fan in. In such a case,
backward chaining is the preferred choice.
• But if the number of ways or input paths to reach the particular conclusion
in which we are interested is large, but the number of conclusions that we
can reach using the facts through that rule is small, then the degree of fan in
is more than the degree of fan out. In such a case, forward chaining is the
preferred choice.
Where the degrees of fan out and fan in are approximately the same, if not
many facts are available and the problem is to check whether one of the many
possible conclusions is true, backward chaining is the preferred choice.
To understand, let us take the following example in which we use forward
chaining:
Example
Let us assume that the working memory initially contains the following facts :
(day monday)
(at-home ram)
(does-not-like ram)
Let the existing set of rules be:
R1 : IF (day monday)
THEN ADD to working memory the fact : (working-with ram)
R2 : IF (day monday)
THEN ADD to working memory the fact : (talking-to ram)
R3 : IF (talking-to X) AND (working-with X)
THEN ADD to working memory the fact : (busy-at-work X)
R4 : IF (busy-at-work X) OR (at-office X)
THEN ADD to working memory the fact : (not-at-home X)
R5 : IF (not-at-home X)
THEN DELETE from working memory the fact : (at-home X)
R6 : IF (working-with X)
THEN DELETE from working memory the fact : (does-not-like X)
Now to start the process of inference through forward chaining, the rule
based system will first search for all the rules whose antecedent parts are
satisfied by the current set of facts in the working memory. In this example,
we can see that the rules R1 and R2 are satisfied, so the system will choose one
of them using its conflict resolution strategies. Let rule R1 be chosen. So
(working-with ram) is added to the working memory, which now looks like:
(working-with ram)
(day monday)
(at-home ram)
(does-not-like ram)
Now this cycle begins again. The system looks for rules that are satisfied and
finds rules R2 and R6. Let the system choose rule R2. So now (talking-to ram) is
added to the working memory, which now contains the following:
(talking-to ram)
(working-with ram)
(day monday)
(at-home ram)
(does-not-like ram)
Now in the next cycle, rule R3 fires, so now (busy-at-work ram) is added to
working memory, which now looks like:
(busy-at-work ram)
(talking-to ram)
(working-with ram)
(day monday)
(at-home ram)
(does-not-like ram)
Now the antecedent parts of rules R4 and R6 are satisfied. Let rule R4 fire, so
(not-at-home ram) is added to the working memory, which now looks like:
(not-at-home ram)
(busy-at-work ram)
(talking-to ram)
(working-with ram)
(day monday)
(at-home ram)
(does-not-like ram)
In the next cycle, rule R5 fires so (at-home ram) is removed from the working
memory :
(not-at-home ram)
(busy-at-work ram)
(talking-to ram)
(working-with ram)
(day monday)
(does-not-like ram)
Forward chaining will continue like this. But we have to keep one thing in
mind: the order in which the rules fire is important. A change in the order of
rule firing may result in a different working memory.
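The recognise-act cycle of this example can be sketched in Python. Several assumptions are made here and should be read as such: facts are (predicate, argument) pairs; X is the only variable; the conflict resolution strategy is simply "fire the first rule, in the listed order, whose action would change the working memory"; and R5 is taken to delete (at-home X), as the walk-through describes. The names bindings and forward_chain are illustrative.

```python
FACTS = {("day", "monday"), ("at-home", "ram"), ("does-not-like", "ram")}

# (name, conditions, "all" for AND / "any" for OR, action, fact pattern)
RULES = [
    ("R1", [("day", "monday")], "all", "add", ("working-with", "ram")),
    ("R2", [("day", "monday")], "all", "add", ("talking-to", "ram")),
    ("R3", [("talking-to", "X"), ("working-with", "X")], "all",
     "add", ("busy-at-work", "X")),
    ("R4", [("busy-at-work", "X"), ("at-office", "X")], "any",
     "add", ("not-at-home", "X")),
    ("R5", [("not-at-home", "X")], "all", "delete", ("at-home", "X")),
    ("R6", [("working-with", "X")], "all", "delete", ("does-not-like", "X")),
]


def bindings(conditions, mode, memory):
    """Yield each value of X for which the antecedent is satisfied."""
    candidates = sorted({arg for _, arg in memory})
    for x in candidates:
        held = [(pred, x if arg == "X" else arg) in memory
                for pred, arg in conditions]
        if any(held) if mode == "any" else all(held):
            yield x


def forward_chain(memory, rules):
    """Fire rules until no rule can change the working memory."""
    memory = set(memory)
    fired = []
    while True:
        for name, conditions, mode, action, (pred, arg) in rules:
            changed = False
            for x in list(bindings(conditions, mode, memory)):
                fact = (pred, x if arg == "X" else arg)
                if action == "add" and fact not in memory:
                    memory.add(fact)
                    changed = True
                elif action == "delete" and fact in memory:
                    memory.remove(fact)
                    changed = True
            if changed:
                fired.append(name)
                break            # start a new cycle after each firing
        else:
            return memory, fired  # no rule changed anything: stop


memory, fired = forward_chain(FACTS, RULES)
print(fired)           # ['R1', 'R2', 'R3', 'R4', 'R5', 'R6']
print(sorted(memory))
```

With this strategy R6 fires last and removes (does-not-like ram). A different conflict resolution strategy would fire the rules in a different order, and a rule set whose ADD and DELETE actions undo each other could keep such a loop running forever.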
Check your Progress - 1
Exercise 1: In the “Animal Identifier System” discussed above use forward
chaining to try to identify the animal called “raja”.
[Figure: semantic network with the nodes struck, strike, last week, Mohan,
knife, garden, Nita and sharp, connected by arcs labelled past of, time, agent,
instrument, place, object and property of.]
The two most important relations between concepts are (i) subclass relation
between a class and its superclass, and (ii) instance relation between an object
and its class. Other relations may be has-part, color etc. As mentioned earlier,
relations are indicated by labeled arcs.
As information in semantic networks is clustered together through
relational links, the knowledge required for the performance of some task
is generally available within a short spatial span of the semantic network. This
type of knowledge organisation, in some way, resembles the way knowledge is
stored and retrieved by human beings.
Subclass and instance relations allow us to use inheritance to infer new facts/
relations from the explicitly represented ones. We have already mentioned that
the graphical portrayal of knowledge in semantic networks, being visual, is
easier for human beings to comprehend than other representation schemes.
This fact helps human beings to guide the expert system, whenever required.
This is perhaps the reason for the popularity of semantic networks.
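Inference through subclass and instance arcs can be sketched with a list of labelled arcs. In the following Python sketch the network itself (Raja, Lion, Mammal and their properties) is made-up illustrative data, and the function name lookup is an assumption; each node is assumed to have at most one instance or subclass arc leaving it.

```python
# Each arc is (source node, label, target node); the data is illustrative only
ARCS = [
    ("Raja", "instance", "Lion"),
    ("Lion", "subclass", "Mammal"),
    ("Mammal", "subclass", "Animal"),
    ("Mammal", "has-part", "hair"),
    ("Lion", "color", "yellow"),
]


def lookup(node, relation, arcs):
    """Follow instance/subclass arcs upwards until the relation is found."""
    while node is not None:
        for source, label, target in arcs:
            if source == node and label == relation:
                return target
        # Not stated explicitly here: move to the class or superclass
        node = next((target for source, label, target in arcs
                     if source == node and label in ("instance", "subclass")),
                    None)
    return None


print(lookup("Raja", "color", ARCS))     # yellow (inherited from Lion)
print(lookup("Raja", "has-part", ARCS))  # hair (inherited from Mammal)
print(lookup("Raja", "wings", ARCS))     # None
```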
Check Your Progress – 2
Exercise 2: Draw a semantic network for the following English statement:
Mohan struck Nita and Nita’s mother struck Mohan.
6.4 FRAMES
Frames are a variant of semantic networks and are one of the popular ways of
representing non-procedural knowledge in an expert system. In a frame, all the
information relevant to a particular concept is stored in a single complex entity,
called a frame. Frames resemble the record data structure. Frames support
inheritance. They are often used to capture knowledge about typical objects or
events, such as a car, or even a mathematical object like a rectangle. As
mentioned earlier, a frame is a structured object, and different names like
Schema, Script, Prototype, and even Object are used instead of frame in
computer science literature.
We may represent some knowledge about a lion in frames as follows:
Mammal :
    subclass : Animal
    warm_blooded : yes
Lion :
    subclass : Mammal
    eating-habit : carnivorous
    size : medium
Raja :
    instance : Lion
    colour : dull-Yellow
    owner : Amar Circus
Sheru :
    instance : Lion
    size : small
A particular frame (such as Lion) has a number of attributes or slots such as
eating-habit and size. Each of these slots may be filled with particular values,
such as the eating-habit for lion may be filled up as carnivorous.
Sometimes a slot contains additional information such as how to apply or use
the slot values. Typically, a slot contains information such as (attribute, value)
pairs, default values, conditions for filling a slot, pointers to other related
frames, and also procedures that are activated when needed for different
purposes.
In the case of frame representation of knowledge, inheritance is simple if an
object has a single parent class, and if each slot takes a single value. For example,
if a mammal is warm blooded then automatically a lion being a mammal will
also be warm blooded.
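Single inheritance over the lion frames shown earlier can be sketched with dictionaries. The following Python sketch assumes that each frame has at most one parent, given by its instance or subclass slot, and that each slot holds a single value; the function name get_slot is illustrative.

```python
FRAMES = {
    "Mammal": {"subclass": "Animal", "warm_blooded": "yes"},
    "Lion": {"subclass": "Mammal", "eating-habit": "carnivorous",
             "size": "medium"},
    "Raja": {"instance": "Lion", "colour": "dull-Yellow",
             "owner": "Amar Circus"},
    "Sheru": {"instance": "Lion", "size": "small"},
}


def get_slot(frame_name, slot, frames=FRAMES):
    """Return the slot value, inheriting from the parent frame if needed."""
    while frame_name in frames:
        frame = frames[frame_name]
        if slot in frame:
            return frame[slot]   # a locally filled slot overrides the parent
        frame_name = frame.get("instance") or frame.get("subclass")
    return None                  # no frame in the chain fills the slot


print(get_slot("Raja", "size"))           # medium (inherited from Lion)
print(get_slot("Sheru", "size"))          # small (overrides the Lion value)
print(get_slot("Raja", "warm_blooded"))   # yes (inherited from Mammal)
```

A value stored in the frame itself takes precedence over an inherited one, which is why Sheru is small although lions are medium by default.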
But in case of multiple inheritance i.e., in case of an object having more than
one parent class, we have to decide which parent to inherit from. For example,
a lion may inherit from “wild animals” or “circus animals”. In general, both the
slots and slot values may themselves be frames and so on.
Frame systems are fairly complex and sophisticated knowledge
representation tools. This representation has become so popular that special
high level frame based representation languages have been developed. Most of
these languages use LISP as the host language. It is also possible to represent
frame-like structures using object-oriented programming languages or
object-oriented extensions to the programming language LISP.
Check Your Progress – 3
Exercise 3: Define a frame for the entity date, which consists of day, month and
year, each of which is a number with the well-known restrictions. Also, a
procedure named compute-day-of-week is already defined.
6.5 SCRIPTS
A script is a structured representation describing a stereotyped sequence of
events in a particular context.
Scripts are used in natural language understanding systems to organize a
knowledge base in terms of the situations that the system should understand.
Scripts use a frame-like structure to represent commonly occurring experiences
like going to the movies, eating in a restaurant, shopping in a supermarket, or
visiting an ophthalmologist.
Thus, a script is a structure that prescribes a set of circumstances that could be
expected to follow on from one another.
Scripts are beneficial because:
• Events tend to occur in known runs or patterns.
• A causal relationship between events exists.
• An entry condition exists which allows an event to take place.
• Prerequisites exist upon events taking place.
Components of a script
The components of a script include:
• Entry condition: These are the basic conditions which must be fulfilled
before events in the script can occur.
• Results: Conditions that will be true after the events in the script have
occurred.
• Props: Slots representing objects involved in events.
• Roles: These are the actions that the individual participants perform.
• Track: Variations on the script. Different tracks may share components of
the same script.
• Scenes: The sequence of events that occur.
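The components listed above map naturally onto a record-like structure. The following Python sketch is only an illustration: the class name Script, its methods, and all the restaurant values are hypothetical, and the run method simply adds the results to the current state without retracting anything.

```python
from dataclasses import dataclass, field


@dataclass
class Script:
    name: str
    track: str                     # the variation of the script
    entry_conditions: set
    results: set
    props: list
    roles: dict                    # participant -> actions performed
    scenes: list = field(default_factory=list)

    def applicable(self, state):
        """A script can start only when all entry conditions hold."""
        return self.entry_conditions <= set(state)

    def run(self, state):
        """Return the state after the script: old state plus the results."""
        if not self.applicable(state):
            raise ValueError("entry conditions not met")
        return set(state) | self.results


# Illustrative values only
restaurant = Script(
    name="restaurant",
    track="coffee shop",
    entry_conditions={"customer is hungry", "customer has money"},
    results={"customer has eaten", "customer has less money"},
    props=["table", "menu", "food", "bill"],
    roles={"customer": "orders, eats, pays", "waiter": "takes order, serves"},
    scenes=["entering", "ordering", "eating", "leaving"],
)

print(restaurant.applicable({"customer is hungry"}))  # False
print(restaurant.applicable({"customer is hungry", "customer has money"}))  # True
```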
Describing a script, special symbols of actions are used. These are:
6.6 SUMMARY
This unit discussed the various knowledge representation mechanisms used in
Artificial Intelligence. The unit began with a discussion of Rule Based Systems
and the related concepts of forward chaining and backward chaining; later, the
concept of conflict resolution was discussed. The unit also discussed other
techniques of knowledge representation, such as semantic nets, frames and
scripts, along with relevant examples for each.
6.7 SOLUTIONS/ANSWERS
Check Your Progress – 1
Exercise 1: Refer to section 6.2
Check Your Progress – 2
Exercise 2: Refer to section 6.3
Check Your Progress – 3
Exercise 3: Refer to section 6.4
UNIT 7 PROBABILISTIC REASONING
Structure
7.0 Introduction
7.1 Objectives
7.2 Reasoning with uncertain information
7.3 Review of Probability Theory
7.4 Introduction to Bayesian Theory
7.5 Bayes Networks
7.6 Probabilistic Inference
7.7 Basic idea of Inferencing with Bayes Networks
7.8 Other Paradigm of Uncertain Reasoning
7.9 Dempster-Shafer Theory
7.10 Summary
7.11 Solutions/ Answers
7.12 Further Readings
7.0 INTRODUCTION
This unit is dedicated to probability theory and its usage in decision making for
various problems. In contrast to classical decision making based on propositions
that are simply True or False, here the probability that a proposition is true is
used for making decisions. The inclusion of such a probabilistic approach is
quite relevant, since uncertainties are common in the real world.
As we know, the probability of an uncertain event I is basically the
measure of the degree of likelihood of the occurrence of I. Let the set
of all possible outcomes be represented as the sample space S. The measure of
probability is a function P() mapping each event E_i ⊆ S to some real number
and satisfying a few conditions such as:
(i) 0 ≤ P(I) ≤ 1 for any event I ⊆ S,
(ii) P(S) = 1, which represents a certain outcome, and
(iii) if E_i ∩ E_j = ϕ for all i ≠ j (i.e., the E_i are mutually exclusive), then
P(E_1 ∪ E_2 ∪ ...) = P(E_1) + P(E_2) + ...
Using the above mentioned three conditions, we can derive the basic laws
of probability. It is also to be noted that these three conditions alone are not
enough to compute the probability of an outcome. This additionally requires
the collection of experimental data for estimating the underlying distribution.
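The three conditions can be checked on a concrete probability measure. The following Python sketch assumes, purely as an example, the roll of a fair die, so that every outcome is equally likely; the names S, P, even and one are illustrative.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # sample space of one roll of a fair die (assumed)


def P(event):
    """Probability of an event (a subset of S) with equally likely outcomes."""
    return Fraction(len(event), len(S))


even = frozenset({2, 4, 6})
one = frozenset({1})

# (i) every probability lies between 0 and 1
cond_i = all(0 <= P(e) <= 1 for e in [frozenset(), even, one, S])
# (ii) the whole sample space is a certain outcome
cond_ii = P(S) == 1
# (iii) additivity for mutually exclusive events
cond_iii = even.isdisjoint(one) and P(even | one) == P(even) + P(one)

print(cond_i, cond_ii, cond_iii)  # True True True
print(P(even | one))              # 2/3
```

Here the distribution is assumed to be uniform; in practice, as noted above, the distribution has to be estimated from data.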
7.1 OBJECTIVES
After going through this unit, you should be able to:
• Understand the role of probabilistic reasoning in AI
• Understand the Concept of Bayesian theory and Bayesian networks
• Perform probabilistic inference through Bayesian Networks
• Understand other paradigms of uncertain reasoning and Dempster-Shafer
Theory
The above relations between events can be best viewed through a Venn diagram.
A rectangle is drawn to represent the sample space ζ. All the sample points are
represented within the rectangle by means of points. An event is represented
by the region enclosed by a closed curve containing all the sample points
leading to that event. The space inside the rectangle but outside the closed curve
representing E represents the complementary event Ec (see Fig. 1(a) above).
Similarly, in Fig. 1(b), the space inside the curve represented by the broken line
represents the event E ∪ F and the shaded portion represents E ∩ F.
As is clear by now, the outcome of a random experiment being uncertain, none
of the various events associated with a sample space can be predicted with
certainty before the underlying experiment is performed and the outcome of
it is noted. However, some events may intuitively seem to be more likely than
the rest. For example, talking about human beings, the event that a person will
live 20 years seems to be more likely compared to the event that the person
will live 200 years. Such thoughts motivate us to explore if one can construct
a scale of measurement to distinguish between likelihoods of various events.
Towards this, a small but extremely significant fact comes to our help. Before
we elaborate on this, we need a couple of definitions.
Consider an event E associated with a random experiment; suppose the
experiment is repeated n times under identical conditions and suppose the event
E (which is not likely to occur with every performance of the experiment)
occurs fn(E) times in these n repetitions. Then, fn(E) is called the frequency
of the event E in n repetitions of the experiment and rn(E) = fn(E)/n is called
the relative frequency of the event E in n repetitions of the experiment. Let us
consider the following example.
Example 4: Consider the experiment of throwing a coin. Suppose we repeat the
process of throwing a coin 5 times and suppose the frequencies of occurrence of
head are tabulated below in Table-1:
Example 7: Suppose a pair of balanced dice A and B are rolled simultaneously
so that each of the 36 possible outcomes is equally likely to occur and hence has
probability 1/36. Let E be the event that the sum of the two scores is 10 or more
and F be the event that exactly one of the two scores is 5.
Then E = {(4,6), (5,5), (5,6), (6,4), (6,5), (6,6)} so that P(E) = 6/36 = 1/6.
Also, F = {(1,5), (2,5), (3,5), (4,5), (6,5), (5,1), (5,2), (5,3), (5,4), (5,6)}.
Now suppose we are told that the event F has taken place (note that this is only
partial information relating to the outcome of the experiment). Since each of the
outcomes originally had the same probability of occurring, they should still have
equal probabilities. Thus, given that exactly one of the two scores is 5, each of
the 10 outcomes of event F has probability 1/10, while the probability of each of
the remaining 26 points in the sample space is 0.
In the light of the information that the event F has taken place the sample points
(4,6), (6,4), (5,5) and (6,6) in the event E must not have materialized. One
of the two sample points (5,6) or (6,5) must have materialized. Therefore the
probability of E would no longer be 1/6. Since all the 10 sample points in F are
equally likely, the revised probability of E given the occurrence of F, which
occurs through the materialization of one of the two sample points (6,5) or (5,6)
should be 2/10 = 1/5.
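The counting argument above can be checked by enumerating the sample space. The following is a minimal Python sketch; the names `space`, `prob` and `p_E_given_F` are illustrative choices, not part of the unit.

```python
from fractions import Fraction
from itertools import product

# Sample space: the 36 equally likely outcomes (score on A, score on B).
space = list(product(range(1, 7), repeat=2))

# E: the sum of the two scores is 10 or more.
E = {(a, b) for a, b in space if a + b >= 10}
# F: exactly one of the two scores is 5.
F = {(a, b) for a, b in space if (a == 5) != (b == 5)}

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(space))

p_E = prob(E)
p_F = prob(F)
# Revised probability of E once F is known: count E's points inside F.
p_E_given_F = Fraction(len(E & F), len(F))

print(p_E, p_F, p_E_given_F)
```

Exact fractions are used so that the printed values can be compared directly with the hand computation.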
The probability just obtained is called the conditional probability that E occurs
given that F has occurred and is denoted by P(E|F). We shall now derive a
general formula for calculating P(E|F).
Consider the following probability table:
Table 2
Events    E    Ec
F         p    q
Fc        r    s
In Table 2, P(E ∩ F) = p, P(Ec ∩ F) = q, P(E ∩ Fc) = r and P(Ec ∩ Fc) = s and
hence, P(E) = P((E ∩ F) ∪ (E ∩ Fc)) = P(E ∩ F) + P(E ∩ Fc) = p + r and similarly,
P(F) = P(E ∩ F) + P(Ec ∩ F) = p + q.
Now suppose that the underlying random experiment is being repeated a large
number of times, say N times. Thus, taking a cue from the long term relative
frequency interpretation of probability, the approximate number of times the
event F is expected to take place will be NP(F) = N(p + q). Under the condition
that the event F has taken place, the number of times the event E is expected
to take place would be NP(E ∩ F) as both E and F must occur simultaneously.
Thus, the long term relative frequency of E under the condition of occurrence
of F, i.e. the probability of occurrence of E under the condition of occurrence of
F, should be NP(E ∩ F)/NP(F) = P(E ∩ F)/P(F). This is the proportion of times
E occurs out of the repetitions where F takes place. With the above background,
we are now ready to define formally the conditional probability of an event
given another.
Definition: Let E and F be two events from a sample space ζ. The conditional
probability of the event E given the event F, denoted by P(E|F), is defined as
P(E|F) = P(E ∩ F)/P(F), whenever P(F) > 0.
When P(F) = 0, we say that P(E|F) is undefined. From the definition, we can
also write
P(E ∩ F) = P(E|F)P(F).
Referring back to Example 7, we see that P(E) = 6/36 and P(F) = 10/36; since
E ∩ F = {(5,6), (6,5)}, P(E ∩ F) = 2/36 and P(E|F) = (2/36)/(10/36) = 2/10 = 1/5,
which is the same as that obtained in Example 7. This result can be generalized
to k events E1, E2, ..., Ek, where k > 2. And now an exercise for you.
Check Your Progress 2
Problem-1: In a class, three students each tossed a coin 3 times.
Write down all the possible outcomes which can be obtained in this experiment.
Problem-2: In Problem 1, what is the probability of getting two or more heads
at a time? Also write the probability of getting three tails at a time.
Problem-3: In Problem 1, calculate the relative frequency of tail, rn(T).
Figure 2. (a-d)
The source of information can be an entity or person giving some relevant state
information. Here the information source is an unbiased source of information.
The information received from such sources is combined to provide more
reliable information for further use. The D-S theory models are able to handle
varying precision in the information, and hence no additional
assumptions are needed to represent the information.
to the D-S theory. The uncertainty interval shows the range in which the true
probability may be found. It is calculated as the difference between the
plausibility and belief levels, i.e. Pl(A) - Bel(A).
B∩A=∅
B ∩ B = B ≠ ∅ hence m1 (B) = 0.2
B ∩ θ = B ≠ ∅ hence m1 (θ) = 0.4
Pl1(B) = m1 (B) + m1 (θ) = 0.2 + 0.4 = 0.6
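The belief, plausibility and uncertainty interval described above can be sketched in Python as follows. This is only an illustration: the frame θ = {A, B} and the value m1(A) = 0.4 are assumptions (the latter chosen so that the masses sum to 1), while m1(B) = 0.2 and m1(θ) = 0.4 are taken from the computation above. The function names are hypothetical.

```python
# Frame of discernment θ = {A, B}; focal elements are stored as frozensets.
A = frozenset({"A"})
B = frozenset({"B"})
THETA = frozenset({"A", "B"})

# m1(B) and m1(θ) come from the text; m1(A) = 0.4 is an assumption
# made so that the masses add up to 1.
m1 = {A: 0.4, B: 0.2, THETA: 0.4}

def belief(m, hypothesis):
    """Bel(H): total mass of the focal elements contained in H."""
    return sum(mass for s, mass in m.items() if s <= hypothesis)

def plausibility(m, hypothesis):
    """Pl(H): total mass of the focal elements that intersect H."""
    return sum(mass for s, mass in m.items() if s & hypothesis)

bel_B = belief(m1, B)
pl_B = plausibility(m1, B)
interval_width = pl_B - bel_B  # Pl(B) - Bel(B)

print(round(bel_B, 2), round(pl_B, 2), round(interval_width, 2))
```

Only B and θ intersect B, so the plausibility sum reproduces Pl1(B) = 0.2 + 0.4 as computed by hand.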
7.10 SUMMARY
This unit discussed reasoning with uncertain information, which involved a
review of probability theory and an introduction to Bayesian theory. The unit
also covered the concept of Bayesian networks, which are later used for the
purpose of inferencing. Finally, the unit discussed other paradigms of
uncertain reasoning, including the Dempster-Shafer theory.
7.11 SOLUTIONS/ANSWERS
Check Your Progress- 1
Problem-1. In each of the following exercises, an experiment is described.
Specify the relevant sample spaces:
a) A machine manufactures a certain item. An item produced by the machine
is tested to determine whether or not it is defective.
b) An urn contains six balls, which are colored differently. A ball is drawn
from the urn and its color is noted.
c) An urn contains ten cards numbered 1 through 10. A card is drawn, its
number noted and the card is replaced. Another card is drawn and its
number is noted.
Solution - *Please refer to section 7.3 to answer these problems.
Problem 2. Suppose a six-faced die is thrown twice. Describe each of the
following events:
i) The maximum score is 6.
ii) The total score is 9.
iii) Each throw results in an even score.
iv) Each throw results in an even score larger than 2.
v) The scores on the two throws differ by at least 2.
Solution - *Please refer to section 7.3 to answer these problems.
Check Your Progress 2
Problem-1: In a class, three students each tossed a coin 3 times.
Write down all the possible outcomes which can be obtained in this experiment.
Solution - *Please refer to example 4 and section 7.3 to solve these problems
Problem-2: In Problem 1, what is the probability of getting two or more heads
at a time? Also write the probability of getting three tails at a time.
Solution - *Please refer to example 4 and section 7.3 to solve these problems
Problem-3: In Problem 1, calculate the relative frequency of tail, rn(T).
Solution - *Please refer to example 4 and section 7.3 to solve these problems
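As a hedged illustration of the relative frequency asked for in Problem-3, the sketch below computes rn(E) = fn(E)/n for a recorded sequence of tosses. The list `tosses` is a hypothetical record, not data from this unit, and `relative_frequency` is an illustrative name.

```python
from fractions import Fraction

def relative_frequency(outcomes, event):
    """Return r_n(E) = f_n(E) / n for a list of observed outcomes."""
    n = len(outcomes)
    f_n = sum(1 for o in outcomes if o in event)  # frequency f_n(E)
    return Fraction(f_n, n)

# Hypothetical record of 5 coin tosses (illustrative values only).
tosses = ["H", "T", "H", "H", "T"]

r_head = relative_frequency(tosses, {"H"})
r_tail = relative_frequency(tosses, {"T"})
print(r_head, r_tail)
```

Since head and tail are complementary outcomes, the two relative frequencies always add up to 1.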
Check Your Progress 3
Problem-1. Differentiate between joint, marginal and conditional probability
with an example of each.
Solution - *Please refer to section 7.9 and example 10 to answer these problems.
Problem-2. Explain Dempster-Shafer theory with a suitable example.
Solution - *Please refer to section 7.9 and example 10 to answer these problems.
Problem-3. What are the different types of evidence? Give a suitable example of
each.
Solution - *Please refer to section 7.9 and example 10 to answer these problems.
UNIT 8 FUZZY AND ROUGH SETS
Structure
8.0 Introduction
8.1 Objectives
8.2 Fuzzy Systems
8.3 Introduction to Fuzzy Sets
8.4 Fuzzy Set Representation
8.5 Fuzzy Reasoning
8.6 Fuzzy Inference
8.7 Rough Set Theory
8.8 Summary
8.9 Solutions/ Answers
8.10 Further Readings
8.0 INTRODUCTION
In the earlier units, we discussed PL and FOPL systems for making inferences
and solving problems requiring logical reasoning. However, these systems
assume that the domain of the problems under consideration is complete, precise
and consistent. But, in the real world, the knowledge of the problem domains is
generally neither precise nor consistent and is hardly complete.
In this unit, we discuss a number of techniques and formal systems that attempt
to handle some of these blemishes. To begin with, we discuss the fuzzy systems
that attempt to handle imprecision in knowledge bases, especially that due to
the use of natural language words like hot, good, tall etc.
Then, we discuss non-monotonic systems, which deal with indefiniteness of
knowledge in the knowledge bases. The significance of these systems lies in the
fact that most of the statements in the knowledge bases are actually based on
beliefs of the concerned persons or actors. These beliefs get revised as better
evidence for some other beliefs becomes available, where the later beliefs may
be in conflict with the earlier beliefs. In such cases, the earlier beliefs may
have to be temporarily suspended or permanently excluded from further
consideration.
Subsequently, we will discuss two formal systems that attempt to handle
incompleteness of the available information. These systems are called Default
Reasoning Systems and Closed World Assumption Systems. Finally, we
discuss some inference rules, viz., the abductive inference rule and the
inductive inference rule, that, though not deductive, are quite useful in solving
problems arising out of everyday experience.
8.1 OBJECTIVES
Sweet Milk: Add small sugar cubes, one at a time, to a glass of milk, and go on
adding up to, say, 100 small cubes.
Initially, without sugar, we may take milk as not sweet. However, with the
addition of each small sugar cube, the sweetness gradually increases. It
is not possible to say that after addition of 100 small cubes of sugar, the milk
becomes sweet, and, till addition of 99 small cubes, it was not sweet.
Pool, Pond, Lake, Sea, Ocean: for different sized water bodies, we cannot say
when exactly a pool becomes a pond, when exactly a pond becomes a lake and
so on.
One of the reasons for our inability to associate one of the two truth values
with statements describing everyday situations is the use of natural language
words like hot, good, beautiful etc. Each of these words
does not denote something constant, but is a sort of linguistic variable. The
context of a particular usage of such a word may delimit the scope of the word
as a linguistic variable. The range of values, in some cases, for some phrases or
words, may be very large, as can be seen through the following three statements:
• Dinosaurs ruled the earth for a long period (millions of years).
• It has not rained for a long period (say about six months).
• I had to wait for the doctor for a long period (about six hours).
Fuzzy theory provides means to handle such situations. Fuzzy theory may
be thought of as a technique of providing ‘continuization’ to the otherwise binary
disciplines like Set Theory, PL and FOPL.
Further, we explain how, using fuzzy concepts and rules in situations like the
ones quoted below, we human beings solve problems despite ambiguity in
language.
Let us recall the case of crossing a road discussed in Unit 1 of Block 1. We
mentioned that a step by step method of crossing a road may consist of
(i) Knowing (exactly) the distances of various vehicles from the path to be
followed to cross over.
(ii) Knowing the velocities and accelerations of the various vehicles moving on
the road within a distance of, say, one kilometer.
(iii) Using Newton’s Laws of motion and their derivatives like s = ut + (1/2)at²,
and calculating the time that would be taken by each of the various vehicles to
reach the path intended to be followed to cross over.
(iv) Adjusting dynamically our speeds on the path so that no collision takes
place with any of the vehicles moving on the road.
But we know that human beings not only do not follow the above precise
method but cannot follow it. We human beings
feel more comfortable with fuzziness than with precision. We feel comfortable
if the instruction for crossing a road is given as follows:
Look on both your left hand and right hand sides, particularly, in the beginning,
to your right hand side. If there is no vehicle within reasonable distance, then
attempt to cross the road. You may have to retreat while crossing, from
somewhere on the road. Then, try again.
The above instruction has a number of words like left, right (it may be 45° to
the right or 90° to the right) and reasonable, each of which does not have a
definite meaning. But we feel more comfortable with it than with the earlier
instruction involving precise terms.
Let us consider another example of our being more comfortable with imprecision
than with precision. The statement: ‘The sky is densely clouded’ is more
comprehensible to human beings than the statement: ‘The cloud cover of the sky
is 93.5 %’.
This is because we, the human beings, are still better than
computers in qualitative reasoning. Because of better qualitative reasoning
capabilities
• just by looking at the eyes only and/or nose only, we may recognize a
person.
• just by taking and feeling a small number of grains from a bowl of cooking rice,
we can tell whether the rice is properly cooked or not.
• just by looking at a few buildings, we can identify a locality or a city.
Achieving Human Capability
In order that computers achieve human capability in solving such problems,
computers must be able to solve problems for which only incomplete and/or
imprecise information/knowledge is available.
Modelling of Solutions and Data/Information/Knowledge
We know that for any problem, the plan of the proposed solution and the relevant
information are fed into the computer in a form acceptable to the computer.
However, the problems to be solved with the help of computers are, in the
first place, felt by the human beings. And then, the plan of the solution is also
prepared by human beings.
It is conveyed to the computer mainly for execution, because computers have
much better executional speed.
Summarizing the discussion, we conclude the following facts:
(i) We, the human beings, sense problems, desire the problems to be solved
and express the problems and the plan of a solution using imprecise words
of a natural language.
(ii) We use computers to solve the problems, because of their executional
power.
(iii) Computers function better when the information is given to the computer
in terms of mathematical entities like numbers, sets, relations, functions,
vectors, matrices, graphs, arrays, trees, records, etc., and when the steps of
solution are generally precise, involving no ambiguity.
In order to meet the mutually conflicting requirements:
(i) Imprecision of natural language, with which human beings are
comfortable, where human beings feel a problem and plan its solution, and
(ii) Precision of a formal system, with which computers operate efficiently,
where computers execute the solution, generally planned by human beings,
a new formal system, viz. the Fuzzy system based on the concept of ‘Fuzzy’,
was suggested for the first time in 1965 by L. Zadeh.
In order to initiate the study of Fuzzy systems, we quote two statements to recall
the difference between a precise statement and an imprecise statement.
A precise Statement is of the form: ‘If income is more than 2.5 lakhs then tax is
10% of the taxable income’.
An imprecise statement may be of the form: ‘If the forecast about the rain
being slightly less than previous year is believed, then there is around 30%
probability that economy may suffer heavily’.
The concept of ‘Fuzzy’, when applied as a prefix/adjective to
mathematical entities like set, relation, function, tree, etc., helps us
in modelling imprecise data, information or knowledge through
mathematical tools.
Crisp Set/Relation vs. Fuzzy Set/Relation: In order to differentiate the sets
normally used so far from the fuzzy sets to be introduced soon, we call the
ordinary sets crisp sets.
Next, we explain how fuzzy sets are defined, using mathematical entities,
to capture imprecise concepts, through an example of the concept: tall.
In the Indian context, we may say that a male adult is
(i) definitely tall if his height > 6 feet,
(ii) not at all tall if his height < 5 feet,
(iii) a little bit tall if his height = 5' 2”,
(iv) slightly tall if his height = 5' 6”, and
(v) reasonably tall if his height = 5' 9”, etc.
The next step is to model ‘definitely tall’, ‘not at all tall’, ‘little bit tall’,
‘slightly tall’, ‘reasonably tall’ etc. in terms of mathematical entities, e.g.,
numbers, sets etc.
In modelling a vague concept like ‘tall’ through fuzzy sets, the numbers in
the closed interval [0, 1] of reals may be used on the following lines:
(i) ‘Definitely tall’ may be represented as ‘tallness having value 1’.
(ii) ‘Not at all tall’ may be represented as ‘tallness having value 0’.
Other adjectives/adverbs may have values between 0 and 1 as follows:
(iii) ‘A little bit tall’ may be represented as ‘tallness having value say .2’.
(iv) ‘Slightly tall’ may be represented as ‘tallness having value say .4’.
(v) ‘Reasonably tall’ may be represented as ‘tallness having value say .7’.
and so on.
Similarly, the values of other concepts or, rather, other linguistic variables like
sweet, good, beautiful, etc. may be considered in terms of real numbers
between 0 and 1.
Coming back to the imprecise concept of tall, let us think of five male persons
of an organisation, viz., Mohan, Sohan, John, Abdul, Abrahm, with heights
5' 2”, 6' 4”, 5' 9”, 4' 8”, 5' 6” respectively.
Then, had we talked only of the crisp set of tall persons, we would have denoted the
Set of tall persons in the organisation = {Sohan}
But a fuzzy set representing tall persons includes all the persons along with
their respective degrees of tallness. Thus, in terms of fuzzy sets, we write:
Tall = {Mohan/.2; Sohan/1; John/.7; Abdul/0; Abrahm/.4}.
The values .2, 1, .7, 0, .4 are called membership values or degrees.
Note: Those elements which have value 0 may be dropped, e.g.
Tall may also be written as Tall = {Mohan/.2; Sohan/1; John/.7; Abrahm/.4},
neglecting Abdul, with associated degree zero.
If we define Short/Diminutive as the exact opposite of Tall, we may say
Short = {Mohan/.8; Sohan/0; John/.3; Abdul/1; Abrahm/.6}
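The fuzzy sets Tall and Short above can be sketched in Python as dictionaries from elements to degrees. In this sketch the anchor degrees come from the text, but the linear interpolation between them, the treatment of exactly 5 feet and 6 feet, and the names `ANCHORS` and `tallness` are assumptions made only for illustration.

```python
# Anchor points (height in inches, degree of tallness) taken from the text:
# below 5'0" -> 0, 5'2" -> .2, 5'6" -> .4, 5'9" -> .7, above 6'0" -> 1.
ANCHORS = [(60, 0.0), (62, 0.2), (66, 0.4), (69, 0.7), (72, 1.0)]

def tallness(height_in):
    """Degree of tallness; linear interpolation between anchors is an assumption."""
    if height_in <= ANCHORS[0][0]:
        return 0.0
    if height_in >= ANCHORS[-1][0]:
        return 1.0
    for (h0, d0), (h1, d1) in zip(ANCHORS, ANCHORS[1:]):
        if h0 <= height_in <= h1:
            return round(d0 + (d1 - d0) * (height_in - h0) / (h1 - h0), 2)

# Heights of the five persons, converted to inches.
heights = {"Mohan": 62, "Sohan": 76, "John": 69, "Abdul": 56, "Abrahm": 66}

tall = {name: tallness(h) for name, h in heights.items()}
# Short is taken as the exact opposite of Tall: degree 1 - degree in Tall.
short = {name: round(1 - d, 2) for name, d in tall.items()}

print(tall)
print(short)
```

Rounding to two decimals keeps the printed degrees free of floating-point noise.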
(ii) Subset
Consider sets A = {1, 2, 3, 4, 5, 6, 7}
B = {4, 1, 3, 5}
C = {4, 8}
Then B is a subset of A, denoted by B ⊂ A. Also C is not a subset of A, denoted
by C ⊄ A.
(iii) Belongs to/is a member of
If A = {1, 4, 3, 5}
Then each of 1, 4, 3 and 5 is called an element or member of A and the fact that
1 is a member of A is denoted by 1 ∈ A.
Corresponding Definitions/ concepts for Fuzzy Sets
In order to define for fuzzy sets, the concepts corresponding to the concepts
of Equality of Sets, Subset and Membership of a Set considered so far only for
crisp sets, first we illustrate the concepts through an example:
Let X be the set on which fuzzy sets are to be defined, e.g.,
X = {Mohan, Sohan, John, Abdul, Abrahm}.
Then X is called the Universal Set.
Note: In every fuzzy set, all the elements of X appear with their corresponding
membership values from 0 to 1.
(i) Degree of Membership: In respect of fuzzy sets, we do not speak of just
‘membership’, but speak of ‘degree of membership’.
In the set
A = {Mohan/.2; Sohan/1; John/.7; Abrahm/.4},
degree (Mohan) = .2, degree (John) = .7
(ii) Equality of Fuzzy Sets: Let A, B and C be fuzzy sets defined on X as
follows:
Let A = {Mohan/.2; Sohan/1; John/.7; Abrahm/.4}
B = {Abrahm/.4, Mohan/.2; Sohan/1; John/.7}.
Then, as the degrees of each element in the two sets are equal, we say fuzzy set A
equals fuzzy set B, denoted as A = B.
However, if C = {Abrahm/.2, Mohan/.4; Sohan/1; John/.7}, then
A ≠ C.
(iii) Subset/Superset
Intuitively, we know
(i) The set of ‘Very Tall’ people should be a subset of the set of Tall people.
A = {Mohan/1; Sohan/0; John/1; Abdul/1; Abrahm/0}, in which degree of
each member of the crisp set is taken as one and degree of each element of the
universal set which does not appear in the set A is taken as zero.
However, conversely, a fuzzy set may not be written as a crisp set. Let C be
a fuzzy set denoting Educated People, where degree of education is defined
as follows:
degree of education (Ph.D. holders) = 1
degree of education (Masters degree holders) = 0.85
degree of education (Bachelors degree holders) = .6
degree of education (10 + 2 level) = 0.4
degree of education (8th Standard) = 0.1
degree of education (less than 8th) = 0.
Let C = {Mohan/.85; Sohan/.4; John/.6; Abdul/1; Abrahm/0}.
Then, we cannot think of C as a crisp set.
Next, we define some more concepts in respect of fuzzy sets.
Definition: Support set of a Fuzzy Set, say C, is a crisp set, say D, containing
all the elements of the universe X for which degree of membership in Fuzzy set
is positive. Let us consider again
C = {Mohan/.85; Sohan/.4; John/.6; Abdul/1; Abrahm/0}.
Support of C = D = {Mohan, Sohan, John, Abdul}, where the element Abrahm
does not belong to D, because, it has degree 0 in C.
Definition: Fuzzy Singleton is a fuzzy set in which there is exactly one element
which has positive membership value.
Example:
Let us define a fuzzy set OLD on universal set X in which degree of OLD is
zero if a person in X is below 20 years and Degree of Old is .2 if a person is
between 20 and 25 years and further suppose that
Old = C = {Mohan/0; Sohan/0; John/.2; Abdul/0; Abrahm/0},
then support of old = {John} and hence old is a fuzzy singleton.
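The two definitions above can be sketched in Python, again representing a fuzzy set as a dictionary from elements to degrees. The function names `support` and `is_fuzzy_singleton` are illustrative choices.

```python
def support(fuzzy_set):
    """Crisp set of the elements whose degree of membership is positive."""
    return {x for x, degree in fuzzy_set.items() if degree > 0}

def is_fuzzy_singleton(fuzzy_set):
    """True when exactly one element has a positive membership value."""
    return len(support(fuzzy_set)) == 1

# The fuzzy sets C (Educated People) and Old from the text.
C = {"Mohan": 0.85, "Sohan": 0.4, "John": 0.6, "Abdul": 1.0, "Abrahm": 0.0}
old = {"Mohan": 0.0, "Sohan": 0.0, "John": 0.2, "Abdul": 0.0, "Abrahm": 0.0}

print(support(C))
print(support(old), is_fuzzy_singleton(old))
```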
Check Your Progress - 1
Ex. 1: Discuss equality and subset relationship for the following fuzzy sets
defined on the Universal set X = { a, b , c, d, e}
A = {a/.3, b/.6, c/.4, d/0, e/.7}
B = {a/.4, b/.8, c/.9, d/.4, e/.7}
C = {a/.3, b/.7, c/.3, d/.2, e/.6}
8.4 FUZZY SET REPRESENTATION
Idempotence
A∩A=A
A∪A=A
Identity
A ∪ φ = A        A ∩ U = A
A ∩ φ = φ        A ∪ U = U
where φ: empty fuzzy set = {x/0 with x ∈ U} and U: universe = {x/1 with x ∈ U}
Check Your Progress - 2
Ex. 2: For the following fuzzy sets
A = {a/.5, b/.6, c/.3, d/0, e/.9} and
B = { a/.3, b/.7, c/.6, d/.3, e/.6},
find the fuzzy sets A ∩ B, A ∪ B and (A ∩ B)'
Next, we discuss three operations, viz., concentration, dilation and normalization,
that are relevant only to fuzzy sets and cannot be discussed for (crisp) sets.
(1) Concentration of a fuzzy set A is defined as
CON (A) = {x/(mA(x))² | x ∈ U}
Example:
If A = {Mohan/.5; Sohan/.9; John/.7; Abdul/0; Abrahm/.2}
then
CON (A) = {Mohan/.25; Sohan/.81; John/.49; Abdul/0; Abrahm/.04}.
In respect of concentration, it may be noted that the associated values, being
between 0 and 1, become smaller on squaring. In other words, the values
concentrate towards zero. This fact may be used for giving increased emphasis
on a concept. If Brightness of articles is being discussed, then Very bright may
be obtained in terms of CON (Bright).
(2) Dilation (opposite of Concentration) of a fuzzy set A is defined as
DIL (A) = {x/√(mA(x)) | x ∈ U}
Example:
If A = {Mohan/.5; Sohan/.9; John/.7; Abdul/0; Abrahm/.2}
then
DIL (A) = {Mohan/.71; Sohan/.95; John/.84; Abdul/0; Abrahm/.45}
The associated values, that are between 0 and 1, on taking square-root get
increased, e.g., if the value associated with x was .01 before dilation, then the
value associated with x after dilation becomes .1, i.e., ten times the original
value.
This fact may be used for decreased emphasis. For example, if colour say
‘yellow’ has been considered already, then ‘light yellow’ may be considered in
terms of already discussed ‘yellow’ through Dilation.
(3) Normalization of a fuzzy set A is defined as
NORM (A) = {x/(mA(x)/Max) | x ∈ U},
where Max is the maximum membership value in A. NORM (A) is a fuzzy set in
which membership values are obtained by dividing the values of the membership
function of A by the maximum membership value.
The resulting fuzzy set, called the normal, (or normalized) fuzzy set, has the
maximum of membership function value of 1.
Example:
If A ={Mohan/.5; Sohan/.9; John/.7; Abdul/0; Abrahm/.2}
Norm (A) = {Mohan/(.5 ÷ .9 ≈ .56); Sohan/1; John/(.7 ÷ .9 ≈ .78); Abdul/0;
Abrahm/(.2 ÷ .9 ≈ .22)}
Note: If one of the members has value 1, then Norm (A) = A.
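The three operations can be sketched in Python as follows, with a fuzzy set represented as a dictionary from elements to degrees. The function names and the rounding to two decimals are illustrative choices, and returning an all-zero set unchanged in `norm` is an assumption not covered by the text.

```python
import math

def con(a):
    """Concentration: square every membership value."""
    return {x: round(m ** 2, 2) for x, m in a.items()}

def dil(a):
    """Dilation: take the square root of every membership value."""
    return {x: round(math.sqrt(m), 2) for x, m in a.items()}

def norm(a):
    """Normalization: divide every value by the maximum membership value."""
    peak = max(a.values())
    if peak == 0:
        return dict(a)  # assumption: an all-zero set is returned unchanged
    return {x: round(m / peak, 2) for x, m in a.items()}

A = {"Mohan": 0.5, "Sohan": 0.9, "John": 0.7, "Abdul": 0.0, "Abrahm": 0.2}

print(con(A))
print(dil(A))
print(norm(A))
```

Concentration lowers every value strictly between 0 and 1, dilation raises it, and normalization rescales the set so that its largest value becomes 1.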
Relation & Fuzzy Relation
We know from our earlier background in Mathematics that a relation from a set
A to a set B is a subset of A × B.
For example, the relation of father may be written as {(Dasrath, Ram), …},
which is a subset of A × B, where A and B are sets of persons living or dead.
The relation of Age may be written as
{(Mohan, 43.7), (Sohan, 25.6), …},
where A is set of living persons and B is set of numbers denoting years.
Fuzzy Relation
In fuzzy sets, every element of the universal set occurs with some degree of
membership. A fuzzy relation may be defined in different ways. One way of
defining fuzzy relation is to assume the underlying sets as crisp sets. We will
discuss only this case.
Thus, a relation from A to B, where we assume A and B as crisp sets, is
a fuzzy set, in which with each element of A × B is associated a degree of
membership between zero and one.
For example:
We may define the relation of UNCLE as follows:
(i) x is an UNCLE of y with degree 1 if x is a brother of the mother or father
of y,
(ii) x is an UNCLE of y with degree .7 if x is a brother of an UNCLE of y, and
x is not covered above,
(iii) x is an UNCLE of y with degree .6 if x is the son of an UNCLE of the mother
or father of y.
Now suppose
Ram is UNCLE of Mohan with degree 1, Majid is UNCLE of Abdul with
degree .7, Peter is UNCLE of John with degree .7 and Ram is UNCLE of John
with degree .4.
Then the relation of UNCLE can be written as a set of ordered triples as
follows:
{(Ram, Mohan, 1), (Majid, Abdul, .7), (Peter, John, .7), (Ram, John, .4)}.
As in the case of ordinary relations, we can use matrices and graphs to represent
fuzzy relations, e.g., the relation of UNCLE discussed above may be
graphically denoted as a fuzzy graph with the following weighted edges:
Ram → Mohan (1), Ram → John (.4), Majid → Abdul (.7), Peter → John (.7)
Fuzzy Graph
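The matrix representation mentioned above can be sketched in Python. The relation is stored as a dictionary from ordered pairs to degrees; treating every pair that is not listed as having degree 0 is an assumption, and the names `uncle`, `degree` and `matrix` are illustrative.

```python
# Fuzzy relation UNCLE: ordered pair (x, y) -> degree to which x is an uncle of y.
uncle = {
    ("Ram", "Mohan"): 1.0,
    ("Majid", "Abdul"): 0.7,
    ("Peter", "John"): 0.7,
    ("Ram", "John"): 0.4,
}

def degree(relation, x, y):
    """Degree of (x, y) in the relation; unlisted pairs are assumed to be 0."""
    return relation.get((x, y), 0.0)

# Matrix form: one row per possible uncle, one column per possible nephew.
rows = sorted({x for x, _ in uncle})
cols = sorted({y for _, y in uncle})
matrix = [[degree(uncle, x, y) for y in cols] for x in rows]

print(cols)
for name, row in zip(rows, matrix):
    print(name, row)
```

Each non-zero entry of the matrix corresponds to one weighted edge of the fuzzy graph.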
Fuzzy Reasoning
In the rest of this section, we just have a fleeting glance at Fuzzy Reasoning.
Let us recall the well-known Crisp Reasoning Operators
(i) AND
(ii) OR
(iii) NOT
(iv) IF P THEN Q
(v) P IF AND ONLY IF Q
Corresponding to each of these operators, there is a fuzzy operator discussed and
defined below. For this purpose, we assume that P and Q are fuzzy propositions
with associated degrees, respectively, deg (P) and deg (Q) between 0 and 1.
The deg (P) = 0 denotes P is False and deg (P) =1 denotes P is True.
Then the operators are defined as follows:
(i) Fuzzy AND, to be denoted by ∧, is defined as follows:
For given fuzzy propositions P and Q, the expression P ∧ Q denotes a fuzzy
proposition with deg (P ∧ Q) = min (deg (P), deg (Q))
Example: Let P: Mohan is tall with deg (P) = .7
Q: Mohan is educated with deg (Q) = .4
Then P ∧ Q denotes: ‘Mohan is tall and educated’ with degree min {.7, .4}
= .4
(ii) Fuzzy OR to be denoted by ∨, is defined as follows:
For given fuzzy propositions P and Q, P ∨ Q is a fuzzy proposition with
Deg (P ∨ Q) = max (deg (P), deg (Q))
Example: Let P: Mohan is tall with deg (P) = .7
Q: Mohan is educated with deg (Q) = .4
Then P ∨ Q denotes: ‘Mohan is tall or educated’ with degree max {.7, .4}
= .7
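The two operators defined above can be sketched in Python as follows; the function names `fuzzy_and` and `fuzzy_or` are illustrative.

```python
def fuzzy_and(deg_p, deg_q):
    """deg(P ∧ Q) = min(deg(P), deg(Q))."""
    return min(deg_p, deg_q)

def fuzzy_or(deg_p, deg_q):
    """deg(P ∨ Q) = max(deg(P), deg(Q))."""
    return max(deg_p, deg_q)

deg_tall = 0.7      # P: Mohan is tall
deg_educated = 0.4  # Q: Mohan is educated

print(fuzzy_and(deg_tall, deg_educated))  # 'Mohan is tall and educated'
print(fuzzy_or(deg_tall, deg_educated))   # 'Mohan is tall or educated'
```

When both degrees are restricted to 0 and 1, min and max reduce to the crisp AND and OR.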
Therefore, Tweety can fly long distances.
However, later on, we come to know that Tweety is actually a hen and a hen
cannot fly long distances. Therefore, we have to revise our belief that Tweety
can fly over long distances.
This type of situation is not handled by any monotonic reasoning system,
including PL and FOPL. It is appropriately handled by Non-Monotonic
Reasoning Systems, which are discussed next.
A non-monotonic reasoning system is one which allows retracting of old
knowledge due to discovery of new facts which contradict or invalidate a part
of the current knowledge base. Such systems also take care that retracting of
a fact may necessitate a chain of retractions from the knowledge base or even
reintroduction of earlier retracted facts into the K.B. Thus, chain shrinkage and
chain expansion of a K.B., and reintroduction of earlier retracted facts, are part
of a non-monotonic reasoning system.
To meet the requirement for reasoning in the real world, we need
non-monotonic reasoning systems in addition to the monotonic ones. This is
true especially in view of the fact that it is not reasonable to expect that all the
knowledge needed for a set of tasks could be acquired, validated, and loaded
into the system just at the outset. In general, initial knowledge is an incomplete
set of partially true facts. The set may also be redundant and may contain
inconsistencies and other sources of uncertainty.
Major components of a Non-Monotonic reasoning system
A typical non-monotonic reasoning system (NMRS) consists
of the following three major components:
(1) Knowledge base (KB),
(2) Inference Engine (IE),
(3) Truth-Maintenance System (TMS).
The KB contains information, facts, rules, procedures etc. relevant to the type
of problems that are expected to be solved by the system. The component
IE of NMRS gets facts from KB to draw new inferences and sends the new
facts discovered by it (i.e., IE) to KB. The component TMS, after addition of
new facts to KB, either from the environment or through the user or through
IE, checks for validity of the KB. It may happen that a new fact from the
environment or inferred by the IE conflicts with or contradicts some of the facts
already in the KB. In other words, an inconsistency may arise. In case of
inconsistencies, TMS retracts some facts from KB. Also, it may lead to a chain
of retractions which may require interactions between KB and TMS. Also,
some new fact, either from the environment or from IE, may invalidate some
earlier retractions, requiring reintroduction of earlier retracted facts. This may
lead to a chain of reintroductions. These retractions and reintroductions are taken
care of by TMS. The IE is completely relieved of this responsibility. The main job
of IE is to conclude new facts when it is supplied a set of facts.
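The chain of retractions handled by the TMS can be illustrated with a toy Python sketch. Everything in it is hypothetical: the class `TinyTMS`, its methods, and the way the Tweety facts are phrased are assumptions made for illustration, and reintroduction of retracted facts is not modelled.

```python
class TinyTMS:
    """Toy truth-maintenance bookkeeping; a hypothetical illustration only."""

    def __init__(self):
        self.kb = set()    # facts currently believed
        self.reasons = {}  # fact -> set of facts it was derived from

    def add(self, fact, because=()):
        """Record a fact together with the facts that justify it."""
        self.kb.add(fact)
        self.reasons[fact] = set(because)

    def retract(self, fact):
        """Remove a fact and, in a chain, every fact that depended on it."""
        if fact not in self.kb:
            return
        self.kb.remove(fact)
        for other, why in list(self.reasons.items()):
            if fact in why:
                self.retract(other)

tms = TinyTMS()
tms.add("Tweety is a bird")
tms.add("Tweety is a typical bird")  # default assumption
tms.add("Tweety can fly long distances",
        because=["Tweety is a bird", "Tweety is a typical bird"])

# A new fact arrives that invalidates the default assumption.
tms.add("Tweety is a hen")
tms.retract("Tweety is a typical bird")

print(sorted(tms.kb))
```

Retracting the default assumption also removes the conclusion derived from it, while the facts that do not depend on it stay in the KB.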
(Figure: components of an NMRS: IE, TMS and KB)
8.8 SUMMARY
In this unit, Fuzzy Systems were discussed along with an introduction to
Fuzzy Sets and their representation. Later, a conceptual understanding of
Fuzzy Reasoning was built, and the same was used to perform Fuzzy Inference.
Finally, the unit discussed the concept of Rough Set Theory.
8.9 SOLUTIONS/ANSWERS
Check Your Progress - 1
Ex. 1: Discuss equality and subset relationship for the following fuzzy sets
defined on the Universal set X = { a, b , c, d, e}
A = {a/.3, b/.6, c/.4, d/0, e/.7}; B = {a/.4, b/.8, c/.9, d/.4, e/.7};
C = {a/.3, b/.7, c/.3, d/.2, e/.6}
SOLUTION: Both A and C are subsets of the fuzzy set B, because deg (x in
A ) ≤ deg (x in B) for all x ∈ X
Similarly degree (x in C) ≤ degree (x in B) for all x ∈ X
Further, A is not a subset of C, because, deg (c in A) = .4 > .3 = degree (c in C)
Also, C is not a subset of A, because, degree (b in C) = .7 > .6 = degree (b in A)
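The comparisons in this solution can be checked with a short Python sketch. A fuzzy set is represented as a dictionary from elements to degrees; the function names are illustrative, and an element missing from a dictionary is assumed to have degree 0.

```python
def is_subset(a, b):
    """A is a subset of B when deg(x in A) <= deg(x in B) for every x."""
    return all(a.get(x, 0) <= b.get(x, 0) for x in set(a) | set(b))

def is_equal(a, b):
    """A equals B when every element has the same degree in both sets."""
    return all(a.get(x, 0) == b.get(x, 0) for x in set(a) | set(b))

A = {"a": 0.3, "b": 0.6, "c": 0.4, "d": 0.0, "e": 0.7}
B = {"a": 0.4, "b": 0.8, "c": 0.9, "d": 0.4, "e": 0.7}
C = {"a": 0.3, "b": 0.7, "c": 0.3, "d": 0.2, "e": 0.6}

print(is_subset(A, B), is_subset(C, B))  # both A and C against B
print(is_subset(A, C), is_subset(C, A))  # A and C against each other
print(is_equal(A, B), is_equal(A, C), is_equal(B, C))
```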
Check Your Progress - 2
Ex. 2: For the following fuzzy sets A = {a/.5, b/.6, c/.3, d/0, e/.9} and B = { a/.3,
b/.7, c/.6, d/.3, e/.6}, find the fuzzy sets A ∩ B, A ∪ B and (A ∩ B)'
Solution : A ∩ B = {a/.3, b/.6, c/.3, d/0, e/.6},
where degree (x in A ∩ B) = min { degree (x in A), degree (x in B)}.
A ∪ B = {a/.5, b/.7, c/.6, d/.3, e/.9},
where degree (x in A ∪ B) = max {degree (x in A), degree (x in B)}.
The fuzzy set (A ∩ B)′ is obtained from A ∩ B, by the rule:
degree (x in (A ∩ B)′) = 1 − degree (x in A ∩ B).
Hence
(A ∩ B)′ = {a/.7, b/.4, c/.7, d/1, e/.4}
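The min, max and 1 − degree rules used in this solution can be sketched in Python as follows. The function names are illustrative, both sets are assumed to be defined on the same universal set, and the rounding to two decimals only removes floating-point noise.

```python
def intersection(a, b):
    """degree(x in A ∩ B) = min(degree(x in A), degree(x in B))."""
    return {x: min(a[x], b[x]) for x in a}

def union(a, b):
    """degree(x in A ∪ B) = max(degree(x in A), degree(x in B))."""
    return {x: max(a[x], b[x]) for x in a}

def complement(a):
    """degree(x in A') = 1 - degree(x in A)."""
    return {x: round(1 - m, 2) for x, m in a.items()}

A = {"a": 0.5, "b": 0.6, "c": 0.3, "d": 0.0, "e": 0.9}
B = {"a": 0.3, "b": 0.7, "c": 0.6, "d": 0.3, "e": 0.6}

a_and_b = intersection(A, B)
a_or_b = union(A, B)
not_a_and_b = complement(a_and_b)

print(a_and_b)
print(a_or_b)
print(not_a_and_b)
```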