Artificial Intelligence MCSE-003
1.0 Introduction
1.1 Objectives
1.2 Some Simple Definition of A.I.
1.3 Definition by Elaine Rich
1.4 Definition by Buchanan and Shortliffe
1.5 Another Definition by Elaine Rich
1.6 Definition by Barr and Feigenbaum
1.7 Definition by Shalkoff
1.8 Summary
1.9 Further Readings/References
1.0 INTRODUCTION
In this unit, we discuss intelligence, both machine and human. However, as our
subject matter in the course is machine intelligence, or artificial intelligence, our
discussion of the subject matter is mainly from the point of view of machine
intelligence. Machine intelligence is popularly known as Artificial Intelligence and is
generally referred to by its abbreviation viz. AI. We also shall use the name AI for
the discipline throughout. The style of discussion in this unit is to start with a
definition of AI by some pioneer in the field, and then elaborate the ideas involved in
the definition. Further, while elaborating the ideas involved in the definition, we
introduce a number of relevant new ideas, concepts and definitions to be used later. In
this process, we have introduced and/or explained the following:
Introduction to A.I
1.1 OBJECTIVES
Before looking at what A.I. is in the experts’ opinions, which involve technical terms
needing some explanation, we state below three simple definitions from a completely
non-specialist point of view:
2. A.I. is the study of making computer models of human intelligence; and finally
3. A.I. is the study concerned with building machines that simulate human
behaviour.
In order to form a better and more concrete opinion about what AI is and what its
subject-matter is, we consider definitions suggested by leading writers and pioneer
contributors to the development of A.I. We supplement these definitions with comments
to facilitate the understanding of the underlying ideas and of the technical terms
involved in the definitions.
Definition 1: The first definition we consider is by Elaine Rich, the author of the
book entitled ‘Artificial Intelligence’[1]. It states: Artificial Intelligence is the
study of how to make computers do things, at which, at the moment, people are
better.
Comment 1, Definition 1: Implicit in Rich’s definition is the idea that there are
mental tasks that computers can do better than human beings and, vice versa, there are
tasks which at the moment human beings can do better than computers. It is well
known that computers are better than human beings in the matter of
• numerical computation,
• information storage, and
• repetitive tasks.
On the other hand, at the moment, human beings are much better than machines in
the matter of
• understanding, including the capability of explaining,
• predicting the behaviour and structure of a system,
• common-sense reasoning,
• drawing conclusions when the available information is incomplete,
inconsistent or even both, and
• visual understanding and speech understanding, which require
simultaneous availability (availability in parallel) of a large amount of information.
In essence, it is found that computers are better than human beings in tasks
requiring sequential but fast computations, whereas human beings are better than
computers in tasks requiring essentially parallel processing. In order to clarify
what it means for a problem to essentially require parallel processing for its solution,
we consider the following problem:
Figure 1.1
We are given a paper with some letter, say, C written on it and a card-board with a
pin-hole in it. The card board is placed on the paper in such a manner that the letter is
fully covered by the card board as shown in Figure 1.1. We are allowed to look at the
paper only through the pin-hole in the card-board. The problem is to tell correctly the
letter written on the paper by just looking through the pin-hole. As the information
about the black and white pixels is not available simultaneously, it is not possible to
figure out the letter written on the paper. Figuring out the letter on the paper
requires simultaneous availability of the whole of the grey-level information of all the
points constituting the letter and its surroundings on the paper. The grey-level
information of the surroundings of the letter provides the context in which to interpret
the letter.
We consider another example that shows the significance of contextual information or
knowledge and its simultaneous availability for visual understanding. From the
following picture, we can conclude that one of the curved lines represents a river and
other curved lines represent sides of the hills only on the basis of the simultaneous
availability of information of the pixels.
Contextual information plays a very important role not only in the visual
understanding but also in the language and speech understanding. In case of speech
understanding, consider the following example, in which the word ‘with’ has a
number of meanings (or connotations) each being determined by the context.
Further, the phrase ‘for a long time’ may stand for a few hours to millions of years,
but again determined by the context, as explained below.
Comment 3, Definition 1: The definition is rather weak in the sense that it fails to
include some areas of potentially large importance viz, problems that can be solved at
present neither by human beings nor by computers. Also, it may be noted that, by
and by, if computer systems become so powerful that there is no problem left, which
human beings can solve better than computers, then nothing is left of AI according to
this definition.
Definition 2: AI is the branch of computer science that deals with symbolic rather
than numeric processing and non-algorithmic methods including the rules of
thumb or heuristics instead of algorithms as techniques for solving problems.
On the other hand, even a non-digital character sequence say ‘ABC’ may represent a
number, for example, in hexadecimal number system. Also, words of English (or any
other) language when considered lexicographically ordered, acquire some numeric
attributes.
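The two observations above can be checked with a small Python sketch. The word list used here is an illustrative assumption and is not taken from the text; the function names are hypothetical as well.

```python
# Reading the character sequence 'ABC' as a hexadecimal numeral.
def hex_value(word: str) -> int:
    # int(text, 16) interprets the characters as base-16 digits
    return int(word, 16)

# Lexicographic ordering gives each word a numeric attribute: its rank.
def lexicographic_ranks(words):
    return {word: rank for rank, word in enumerate(sorted(words), start=1)}

print(hex_value("ABC"))  # 10*256 + 11*16 + 12 = 2748
print(lexicographic_ranks(["cat", "apple", "bat"]))
```

The same sequence of characters is thus treated as a number in one context and as an ordered symbol in another, which is the point made in the paragraph above.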
The conclusion we draw from the above discussion, is that a word as a sequence of
characters (including digits) may denote a number or a symbol (henceforth, a symbol
stands for non-numeric symbol) depending upon the context in which it is used.
And the context is determined by the nature of the problem under consideration. If
the problem can be solved using only numerical aspects of the objects in the domain
and environment of the problem, then we have the advantage of having built-in
relations (like less than, equal to etc.) and the built-in operations (like +, -, * etc.) that
can be readily used without having to define these relations and operations explicitly.
But, unfortunately, most of the problems we encounter for our day-to-day survival or
even for our intellectual pursuits involve not only quantitative but also qualitative
aspects of the objects of the problem domain. In order to solve these problems, we use
common-sense reasoning, exploit our capability for visual and linguistic
understanding, and try to get meaning out of the incomplete and even inconsistent
information that is available, in addition to using a number of other known and unknown
mechanisms. Qualitative aspects, their ideal representations, and the relations and
operations involving these aspects are generally different for different types of
problems. Hence, it is impossible to capture, in general, the relevant relations and
operations for all types of problems and then define these as built-in operations of
the machine, because there are potentially infinitely many types of problems that we
encounter and try to solve.
This discussion explains the basic difference between numeric processing and
(non-numeric) symbolic processing. Summarizing, numeric processing involves
only a small number of well-defined relations and operations having universally
accepted meanings, and hence, these relations and operations can be incorporated as a
part of a computer system. On the other hand, in symbolic processing the relations
and operations required to solve a problem depend upon the problem under
consideration, and hence, have to be defined explicitly along with or as a part of
the programs constituting the solutions of the problems.
They proved that even though a problem may be expressed precisely or formally (i.e.,
in terms of mathematical entities like sets, relations, functions etc.), it need not
yield to an algorithmic solution. A problem which has at least one algorithmic
solution is called a solvable problem. They further proved that out of even solvable
problems, only a small fraction can be solved if only feasible amount of resources
like, time and space are used. Informally, feasible amount of resources means that
the requirement for resources does not increase too rapidly with the increase in size of
the problem. The notion of the size of a problem will be defined formally later on
(under comment 1 on Definition 3). However, an intuitive idea about the concept of
the size of a problem and its role in estimating the resource requirement for solving
the problem can be had through the simple problem of calculating income tax for each
of the tax-payers. The requirement of resources like, time and computing equipment
for 1000 tax-payers would be much less, as compared to the requirement of resources
for computing income-tax for one million tax payers. In this problem, n, the number of
tax-payers for whom the income-tax is to be calculated, may be taken as size of the
problem.
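To get a feel for what it means for resource requirements to grow 'too rapidly' with the size n, the following Python sketch compares three hypothetical step counts. The cost functions are illustrative assumptions, not measurements: the linear one loosely models the income-tax example, in which each tax-payer is processed once, while the cubic and exponential ones are included only for contrast.

```python
# Hypothetical step counts as functions of the problem size n.
def linear_steps(n: int) -> int:
    return n          # e.g., one tax computation per tax-payer

def cubic_steps(n: int) -> int:
    return n ** 3

def exponential_steps(n: int) -> int:
    return 2 ** n

for n in (10, 20, 40):
    print(n, linear_steps(n), cubic_steps(n), exponential_steps(n))
```

Under these assumptions, going from 1000 to one million tax-payers multiplies the linear cost by 1000, whereas the exponential cost is already far beyond the cubic cost at n = 40.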
This limitation and other difficulties with algorithmic solutions have given impetus to
efforts for finding non-algorithmic solutions of problems. The Neural Network
approach to solving many difficult problems is a well-known alternative to
algorithmic methods of solving problems. In AI, there are mainly two approaches to
solving problems which are generally difficult to solve with algorithmic methods. One
approach is the Neural approach, mentioned just above. The other approach is called
the symbolic approach. The symbolic approach cannot be said to be non-algorithmic. The
main difference between the symbolic approach of AI and the algorithmic approach is that
the symbolic approach of AI emphasizes exploitation of the knowledge of the domain and
the environment of the problem under consideration. Some of this knowledge is in the
form of rules of thumb, generally called heuristics in AI.
Consider the problem of crossing from one side over to the other side of a busy road
on which a number of vehicles are moving at different velocities. A step-by-step (i.e.,
algorithmic) method of solving this problem may consist of:
(i) Knowing (exactly) the distances of various vehicles from the path to be
followed to cross over.
(ii) Knowing the velocities and accelerations of the various vehicles moving on the
road within a distance of, say, one kilometer.
(iii) Using Newton’s Laws of motion and their derivatives like $s = ut + \frac{1}{2}at^2$, and
calculating the times that would be taken by each of the various vehicles to
reach the path intended to be followed to cross over.
(iv) Adjusting dynamically our speeds on the path so that no collision takes place
with any of the vehicles moving on the road.
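Steps (i) to (iii) above can be sketched in Python as follows. This is only a minimal illustration under simplifying assumptions: each vehicle is described by a hypothetical triple (distance s, velocity u, acceleration a) measured along its own line of travel, the acceleration is constant, and the function names are our own.

```python
import math

def time_to_reach(s: float, u: float, a: float) -> float:
    """Smallest t >= 0 with s = u*t + 0.5*a*t**2 (math.inf if never reached)."""
    if s <= 0:
        return 0.0                      # already at the crossing path
    if a == 0:
        return s / u if u > 0 else math.inf
    disc = u * u + 2 * a * s            # discriminant of 0.5*a*t^2 + u*t - s = 0
    if disc < 0:
        return math.inf                 # vehicle stops before reaching the path
    root = math.sqrt(disc)
    times = [t for t in ((-u + root) / a, (-u - root) / a) if t >= 0]
    return min(times) if times else math.inf

def safe_to_cross(vehicles, crossing_time: float) -> bool:
    """True if every vehicle needs longer than crossing_time to reach the path."""
    return all(time_to_reach(s, u, a) > crossing_time for s, u, a in vehicles)

# Two vehicles: 100 m away at 10 m/s, and 50 m away at 2 m/s (no acceleration).
print(safe_to_cross([(100, 10, 0), (50, 2, 0)], crossing_time=8.0))
```

Even this small sketch needs exact values of s, u and a for every vehicle, which is precisely the information a pedestrian does not have.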
The above is a systematic step-by-step method, i.e., an algorithm, of crossing the road
that may ensure no collision with any vehicle. But, how many of us can follow it?
Hardly anybody! First of all, it is practically impossible to measure distances,
velocities and accelerations of various vehicles on the road, even within a radius of
one kilometer. Secondly, even if we assume theoretically that it is possible to measure
distances, velocities and accelerations of various vehicles and to calculate safe timings
to cross the road, we would not like or care to follow the above-mentioned algorithm,
because our past experience, our sense of survival and other built-in mechanisms have
allowed us, in the past, to cross over safely without following any systematic method.
All of us just guess the distances of the vehicles, safe enough to cross over, and then
actually cross over at an appropriate time. Not even one in 1000, on average, gets
hurt when crossing a road using only guesses in a crowded city like Delhi, where
movement of vehicles is one of the most chaotic and unruly in the whole world.
However, this is not to deny that once in a while the guess is incorrect, and someone
or other gets hurt or is even killed almost every day.
Each one of us, every day, comes across hundreds of problems similar to the one of
crossing a road. And, for each such problem, one uses a good guess and one
generally is able to solve the problem satisfactorily each time, though the solutions
may not be the best possible ones. And, once in a while, we even fail to get any
solution using the guess. However, if we insist on only following a systematic
step-by-step method that guarantees the best possible solution for each problem, then
we would hardly be able to make any progress in our day-to-day business of even
mere survival.
The essence of the above discussion is that while attempting solutions of many of the
problems, it is not only desirable but almost essential that for each of such problems
we follow some good guess instead of following a step-by-step systematic method
that guarantees the best solution. In A.I, these guesses are called heuristics. In later
chapters, we discuss heuristics in detail. However, for the time being, we state that
heuristics are good guesses, possibly based on past experience, judgement, intuition
or hunches, which lead us most of the time to reasonably good solutions, though these
guesses do not guarantee the best solutions or even any solution for every instance of
the problem under consideration.
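The trade-off described above can be seen in a small Python sketch. The coin-changing problem and the denominations used here are illustrative assumptions, not taken from the text: the heuristic 'always take the largest coin that fits' is compared with a systematic method that examines every sub-amount.

```python
def greedy_coins(amount: int, denominations):
    """Heuristic: repeatedly take the largest coin that still fits."""
    count = 0
    for coin in sorted(denominations, reverse=True):
        count += amount // coin
        amount %= coin
    return count if amount == 0 else None    # None: the guess found no solution

def fewest_coins(amount: int, denominations):
    """Systematic method: dynamic programming over every sub-amount."""
    inf = float("inf")
    best = [0] + [inf] * amount
    for target in range(1, amount + 1):
        for coin in denominations:
            if coin <= target and best[target - coin] + 1 < best[target]:
                best[target] = best[target - coin] + 1
    return best[amount] if best[amount] != inf else None

print(greedy_coins(30, [1, 5, 10]), fewest_coins(30, [1, 5, 10]))  # heuristic is optimal here
print(greedy_coins(6, [1, 3, 4]), fewest_coins(6, [1, 3, 4]))      # heuristic is worse: 3 vs 2
print(greedy_coins(6, [3, 4]), fewest_coins(6, [3, 4]))            # heuristic fails: None vs 2
```

The three calls show the three situations mentioned in the text: the guess gives the best solution, a reasonable but not the best solution, and no solution at all.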
1.5 ANOTHER DEFINITION BY ELAINE RICH
The next definition, again by Elaine Rich [1], is more technical and involves some
concepts from the Theory of Computation. It states:
Definition 3: Artificial Intelligence is the study of techniques for solving
exponentially hard problems in polynomial time exploiting knowledge about the
problem domain.
The study of computers is partly engineering in nature, in the sense that we design and
implement or produce computer solutions for different types of problems. Hence
these products, i.e., solutions, need to be evaluated vis-a-vis problem specifications
and other measures like efficiency in respect of time and space requirements of the
solutions. In order to measure the efficiency of a suggested computer solution of a
problem, the earlier mentioned logicians/mathematicians suggested the concepts of
time complexity and space complexity for the solutions and even for the problems.
The basic idea behind these complexity measures is that all the operations that a
computer (present or future generations) can execute, may be thought of as composed
of a small number of basic operations. These basic operations can be easily compared
for their relative requirements for time and space. For the basic operation say O1,
which is expected to take minimum time (or space) among all the basic operations, the
time (or space) complexity is assigned the number one. For any other basic operation,
complexity is a positive number depending upon the expected relative requirement for
time (or space) for the operation as compared to that for the operation O1. For other
computer operations, time/space complexity may be computed from those for the
basic operations. Also from these complexities, we can compute the complexities of
the programs using the size of the input data as an additional parameter. For example,
to multiply two n × n matrices we require n^3 multiplications and (n^3 − n^2) additions.
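The operation counts for matrix multiplication can be verified by instrumenting the usual row-by-column method. The following Python sketch (function name and test matrices are our own assumptions) counts every multiplication and addition it performs.

```python
def multiply_counting(a, b):
    """Multiply two n x n matrices, counting scalar multiplications and additions."""
    n = len(a)
    mults = adds = 0
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = a[i][0] * b[0][j]        # first product needs no addition
            mults += 1
            for k in range(1, n):
                total += a[i][k] * b[k][j]   # one multiplication, one addition
                mults += 1
                adds += 1
            c[i][j] = total
    return c, mults, adds

product, mults, adds = multiply_counting([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product, mults, adds)  # [[19, 22], [43, 50]] 8 4
```

Each of the n^2 entries of the product needs n multiplications and (n − 1) additions, which gives n^3 and n^3 − n^2 respectively; for n = 2 this is 8 and 4.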
Further, if Ms X also knows the house number in Hauz Khas, then there is hardly any
search required and X can directly reach Y’s residence. Next, consider just the opposite
situation so far as availability of knowledge is concerned. Suppose X does not even know
that Y lives in Delhi. We can easily guess the plight of X when she, following a
step-by-step method, is required to search, possibly all over the world, for the
residence of Y.
Fischler and Firschein in their book ‘Intelligence: The Eye, the Brain and the
Computer’ [9] on Page 4 state that they expect an intelligent agent to be able to:
They further state that there are a number of human attributes that are related
to the concept of intelligence, but are normally considered distinct from it:
• Awareness (consciousness)
• Aesthetic appreciation (art, music)
• Emotion (anger, sorrow, pain, pleasure, love, hate)
• Sensory acuteness
• Muscular coordination (motor skills)
Next, we discuss ‘intelligence’ from more fundamental level. The ideas explained
below are based on the Information Transfer Model of scientific phenomena due to
Norbert Wiener (1894-1964). Norbert Wiener, an intellectual prodigy and author of
the famous book entitled Cybernetics [14], suggested the Transfer of Information
model to be a better model than the prevailing model based on Transfer of Energy for
explanation of a number of scientific phenomena. Through Wiener’s theory, a new
discipline, also called Cybernetics, was born.
However, our discussion is mainly based on ideas explained in the book ‘Beyond
Information’ by Tom Stonier [10]. According to the ideas explained by Stonier,
there are four fundamental properties of the universe viz. energy, matter,
information and evolution (or change). The cardinal role of information in the
universal scheme of things can be judged from the following argument: every
entity, from nucleons up to the whole of the universe, is known to us as an
organised system of simpler objects, e.g., fundamental particles organise into
nucleons, nucleons organise to form atomic nuclei, which along with electrons
organise into atoms and so on. Molecules, polymers, membranes, organs,
living beings, societies, planets, planetary systems, galaxies … and finally the whole
universe, each is known as an organised system of some simpler objects. An
organisation builds upon pre-existing organisations. Thus an organised system is
recursively obtained (or defined) as an interdependent assembly of elements
and/or organised systems. And it is ‘information’ that is exchanged between
components of an organised system to effect their interdependence and to maintain
the integrity of the system as long as the system survives against the fourth
fundamental property of the universe, i.e., evolution or change. Gravitational pull,
now an established entity, is just an information processing activity. Thus
‘information’ is no more or no less an abstract concept than ‘energy’ or ‘matter’.
What mass is to matter and heat is to energy, organisation is to
information. Each of the former is a visible and measurable form of the
corresponding latter. The more the mass, the more the matter in a system; the more the
heat, the more the capacity to do work, i.e., energy in the system; similarly, the higher
the degree (or the more the complexity) of the organization (in terms of underlying
organizations of the components and their components and so on, and in terms of the
number and levels of interactions and relations between components at a particular
level), the higher is the information content of the system.
Remark 2: Information organises not only matter and energy but itself as well.
Evolution leads to discontinuities, i.e., to something which is qualitatively different
from the earlier existing entities. And intelligence is the phenomenon which has
evolved out of information but which is qualitatively different from information.
In the similar manner, we consider a finite set of attributes and degrees for each
attribute for organizations, i.e., information processing systems, which allow us to
categorise systems as intelligent or otherwise in such a way that the systems which are
generally considered as intelligent are categorised as intelligent and further whatever
systems are generally considered as non-intelligent are categorized as non-intelligent.
As evolution has taken place over billions of years, the divergence among information
processing systems intelligence-wise must be potentially infinite. Thus any
categorization based on only a finite number of attributes would always be incomplete
and leave a large number of cases ‘uncategorisable’. To begin with, we start with a
working definition of intelligence and then later expand on it:
The above principle fits best, at least, in the limiting cases: At one extreme is a cube
of sugar dissolving in a cup of tea. Although highly organised, the cube is totally
controlled by environmental elements and hence, according to the above principle, it
has zero intelligence. This is exactly what we also feel. At the other extreme is a
technologically advanced human society which can divert the waters of rivers to
irrigate plains to provide an assured supply of food to its population. Thus the
intelligence measure of a technologically advanced society as a whole is, according to
the above principle, quite high. This conclusion of the above principle is in
consonance with what we also feel.
Fischler and Firschein [9] on Page 4 state: Intelligence involves learning capability
and goal-oriented behaviour. Additional attributes of intelligence include reasoning,
common-sense, planning, perception, creativity, memory retention & recall.
Shanks [11] on Page 49 observes: The simplest and perhaps safest definition of
intelligence is the ability to react to something new in a non-programmed way. The
ability to be surprised or to think for oneself is really what we mean by intelligence.
In order to explain the concept of A.I. through ‘Definition 4’, we discussed the
concept of intelligence itself as a phenomenon. Next, we quote another definition of
A.I., again based on the concept of intelligence but given from an engineering point
of view by another pioneer in the field, viz. Shalkoff, a Professor of Electrical
Engineering.
1.7 DEFINITION BY SHALKOFF
In view of the fact that A.I. is partly an engineering discipline according to the above
definition, let us recall what is meant by the concept of engineering.
Again, in the light of the definition of Engineering given above, a part of the
definition by Shalkoff may be paraphrased as ‘…through application of A.I., products
are obtained that exhibit intelligent behaviour….’ This paraphrased part of the
definition by Shalkoff raises another issue: how to judge/evaluate whether a product
obtained through an application of A.I. is actually intelligent.
The issue of testing whether an A.I. product is intelligent was considered by the
pioneers themselves, including Alan Turing, the best-known name among the
pioneers. In honour of Turing, the most prestigious award for contributions to the field
of computer science has been instituted and is given annually.
Turing suggested a test, which is well known as Turing Test, for testing whether a
product has intelligence. An outline of the Turing test is given below.
For the purpose of the test, there are three rooms. In one of the rooms is a computer
system claimed to have embedded intelligence. In the other two rooms, two persons
are sitting, one in each room. The role of one of the persons, whom we call A, is to put
questions to the computer and to the other person, called B, without knowing to
whom a particular question is being directed, and, of course, with the specific purpose
of identifying the computer. On the other hand, the computer would answer in such a
way that its identity is not revealed to A.
The communication among the three is only through computer terminals so that
identity of the computer or the person B can be known only on the basis of quality of
responses as intelligent or otherwise, and not just on the basis of other human or
machine characteristics. If A is not able to determine the identity of the computer, then
the computer is considered intelligent. More appropriately, if the computer is able to
conceal its identity from A, then the computer is considered intelligent.
We may note here that, in order to be called intelligent, the computer should be clever
enough not to answer too quickly, at least not within a fraction of a second, even
if it can, say, to a question involving finding the product of two numbers, each of
more than 20 digits.
Objections to Turing Test: There have been a number of objections to the Turing
test as a test of intelligence of a machine. One of the best-known objections is
called the Chinese Room Test, proposed by John Searle. The essence of the Chinese
Room Test, which we explain below, is that a system, say A, successfully convincing
others that it possesses the qualities of another system, say B, does not imply that
the system A actually possesses the qualities of B. For example, the capability of
a male human to convince others of being a woman does not give the male the
quality of bearing a child like a woman.
The scenario for the Chinese Room Test consists of a single room with two windows.
In the room a scholar on Shakespeare, knowing English, but not knowing Chinese, is
sitting with a sort of encyclopedia on Shakespeare. The encyclopedia is printed in
such a way that for each pair of facing pages, one page is written in Chinese
characters and the other page is translation in English of the contents of the facing
page in Chinese. Through one of the windows questions on Shakespeare’s literature in
Chinese characters are sent to the person sitting inside. The person looks through the
encyclopedia and on finding in the encyclopedia the exact copy of the sequence of
characters sent in, reads its translation in English, thinks of its answer and writes the
answer in English for his/her own understanding, finds the corresponding sequence of
Chinese characters in the encyclopedia, and sends the sequence of Chinese characters
through the other window. Now, Searle says that though the scholar successfully
behaves as if s/he knows Chinese, by assumption s/he does not. Just from the fact
that a system is able to simulate a quality, it cannot be inferred that the system
possesses the quality.
1.8 SUMMARY
This is an introductory unit to the course. The unit gives a bird’s eye view of the
whole of the course of Artificial Intelligence. The approach, in the unit, is to start with
a definition by some pioneer in A.I. In the process of discussion of the definition, a
number of relevant new concepts are gradually built up and discussed.
In Section 1.4, we discuss the differences (i) between number and symbol, (ii)
between algorithmic and non-algorithmic methods of solving problems.
In Section 1.5, another definition by Elaine Rich, as given below, is discussed:
Artificial Intelligence is the study of techniques for solving exponentially hard
problems in polynomial time exploiting knowledge about the problem domain.
In Section 1.6, we discuss the following definition of A.I. by Barr & Feigenbaum:
Artificial Intelligence is the part of computer science concerned with designing
intelligent computer systems, i.e., systems that exhibit the characteristics we
associate with intelligence in human behaviour.
UNIT 2 THE PROPOSITIONAL LOGIC
2.0 Introduction
2.1 Objectives
2.2 Logical Study of Valid and Sound Arguments
2.3 Non-Logical Operators
2.4 Syntax of Propositional Logic
2.5 Semantics/Meaning in Propositional Logic
2.6 Interpretations of Formulas
2.7 Validity and Inconsistency of Propositions
2.8 Equivalent forms in the Propositional Logic (PL)
2.9 Normal Forms
2.10 Logical Deduction
2.11 Applications
2.12 Summary
2.13 Solutions/Answers
2.14 Further Readings
2.0 INTRODUCTION
Symbolic logic may be thought of as a formal language for representing facts about
objects and relationships between objects of a problem domain along with a precise
inferencing mechanism for reasoning and deduction. An inferencing mechanism
derives the knowledge, which is not explicitly/directly available in the knowledge
base, but can be logically inferred from what is given in the knowledge base.
The reason why the subject-matter of the study is called Symbolic Logic is that
symbols are used to denote facts about objects of the domain and relationships
between these objects. Then the symbolic representations and not the original facts
and relationships are manipulated in order to make conclusions or to solve problems.
Also, we mentioned that a Symbolic Logic, apart from having other characteristics, is
a formal language. As a formal language, there must be clearly stated unambiguous
rules for defining various constituents or constructs, viz. alphabet set, words, phrases,
sentences etc. of the language and also for associating meaning to each of these
constituents.
The study of Symbolic Logic is significant, especially for academic pursuits, in view
of the fact that it is not only descriptive (i.e., it tells how human beings reason)
but it is also normative (i.e., it tells how human beings should reason).
In this unit, we shall first study the simplest form of symbolic logic, viz, the
Propositional Logic (PL). In the next unit, we consider a more general form of logic
called the First-Order Predicate Logic (FOPL). Subsequently, we shall consider other
symbolic systems including Fuzzy systems and some Non-monotonic systems.
For a given declarative sentence, its being ‘True’ or ‘False’ is called its Truth-value.
Thus, truth-value of (i) above is False and that of (ii) is True.
On the other hand, none of the following sentences can be assigned a truth-value, and
hence none of these, is a statement or a proposition:
(i) Who was the first Prime Minister of India? (Interrogative sentence)
(ii) Please, give me that book. (Imperative sentence)
(iii) Ram must exercise regularly. (Imperative, rather Deontic)
(iv) Hurrah! We have won the trophy. (Exclamatory sentence)
The symbols, such as P, Q, and R, that are used to denote propositions, are called
atomic formulas, or atoms. As discussed earlier, in this case, the truth-value of P is
False, the truth-value of Q is True and the truth-value of R, though not known yet, is
exactly one of ‘True’ or ‘False’, depending on whether Ram is actually a Ph. D or
not.
At this stage, it may be noted that once symbols are used in place of given statements
in, say, English, then the propositional system, and, in general, a symbolic system, is
aware only of symbolic representations and the associated truth values. The system
operates only on these representations and, except for a possible final translation, is
not aware of the original statements, generally given in some natural language, say,
English.
We can build, from atoms, more complex propositions, sometimes called compound
propositions, by using logical connectives.
(i) Sun rises in the east and the sky is clear, and
(ii) If it is hot then it shall rain.
The logical connectives in the above two propositions are “and” and “if…then”. In the
propositional logic, five logical operators or connectives, viz., ~ (not), ∧ (and), ∨
(or), → (if… then), and ↔ (if and only if), are used. These five logical connectives can
be used to build compound propositions from given atomic formulas. More generally,
they can be used to construct more complicated compound propositions from
compound propositions by applying the connectives repeatedly. For example, if each
of the letters P, Q, C is used as a symbol for the corresponding statement, as follows:
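Independently of any particular statements, the five connectives named above can be sketched as Python functions on truth-values. The function names are our own assumptions; the definitions follow the standard truth tables of propositional logic.

```python
from itertools import product

def NOT(p: bool) -> bool:          # ~P
    return not p

def AND(p: bool, q: bool) -> bool:  # P ∧ Q
    return p and q

def OR(p: bool, q: bool) -> bool:   # P ∨ Q
    return p or q

def IMPLIES(p: bool, q: bool) -> bool:  # P → Q is False only when P is True and Q is False
    return (not p) or q

def IFF(p: bool, q: bool) -> bool:  # P ↔ Q is True when both have the same truth-value
    return p == q

# Print the truth table for all four assignments of truth-values to P and Q.
for p, q in product([True, False], repeat=2):
    print(p, q, NOT(p), AND(p, q), OR(p, q), IMPLIES(p, q), IFF(p, q))
```

Compound propositions are built by nesting these calls, e.g., IMPLIES(AND(p, q), OR(p, q)), which mirrors the repeated application of connectives described above.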
2.1 OBJECTIVES
Valid Argument: A valid argument is one in which it would be contradictory for the
premises to be true but the conclusion false.
(This argument is invalid, because despite not having overslept, one may be late
because of some other engagements or laziness.)
(i) If we are close to the top of Mt. Everest then we have a magnificent view.
(ii) We are having a magnificent view.
Therefore,
(iii) We are near the top of Mt. Everest.
(This argument is invalid because we may have a magnificent view even if we are not
close to the top of Mt. Everest. The two given statements do not falsify this claim.)
We have already discussed invalidity of some arguments, but the invalidity above was
based on our intuition. However, intuition may also lead us to an incorrect conclusion.
To be sure about the validity of our argument, we need some formal method. In
Section 1.5, we discuss how a Truth table (a formal tool) can be used to establish the
validity/invalidity of an argument.
Sound Argument
We may note that, in the case of a valid argument, it is not required that the
premises/axioms or assumed statements must be True. The assumptions may not be
True, and still the argument may be valid. For example, the following argument is
valid, but its premises and conclusion both are false:
Example of Invalid Argument
I (i) If you overslept, you are late.
(ii) You are late.
Therefore, you overslept.
II (i) If you are in Delhi, you are in India.
(ii) You are in India.
Therefore, you are in Delhi. (Invalid argument, though the conclusion may be True.)
(Though the word and joins the two names Ram and Mohan, the statement cannot be equivalently
broken into two statements, viz. (i) Ram is a friend (ii) Mohan is a friend.)
(iii) Mohan drove a car to reach home, met with an accident and got slightly injured.
(Here, the word ‘and’ is used not in a logical sense but in the temporal sense of
‘and then’, because statement (iii) has a different sense from statement (iv) given
below.)
(iv) Mohan met with an accident, got slightly injured and drove a car to reach home.
Thus from the above statements, it can be seen that the natural language word and
may have many senses, both logical and non-logical. Similarly, the words since,
hence and because are frequently used in arguments to establish some facts. But as
shown from the following two arguments, their use in logical arguments is risky in
the sense that some of the arguments involving any of these words may lead to
incorrect conclusions:
Argument (1): Using the word because, we get a correct conclusion from
True statements.
Let
Q: Congress party and its allies commanded majority in Indian Parliament in the year
2006 (True statement)
Argument (2):
In the following, using the word because, we get an incorrect/false conclusion from
True statements.
Let
However, to say
P because R, i.e., to say
Dr. Man Mohan Singh was Prime Minister of India in 2006, because Chirapoonji, a
town in north-east India, received maximum average rainfall in the world during
1901-2000,
is at least incorrect, if not ludicrous.
Thus from two True statements, P and R and by using connective ‘because’, in this
case, the conclusion is incorrect.
Thus, by using connective because, in one argument we get a correct conclusion from
two True statements and, on the other hand, we get an incorrect conclusion from True
statements.
1. An atom is a wff.
2. If A is a wff, then (~A) is a wff.
3. If A and B are wffs, then each of (A ∧ B), (A ∨ B), (A → B), and (A ↔ B) is a
wff.
4. Any wff is obtained only by applying the above rules.
From the above recursive definition of a wff, it is not difficult to see that the expression
((P → (Q ∧ (~ R))) ↔ S) is a wff, because, to begin with, each of P, Q, (~ R) and
S is, by definition, a wff. Then, by recursive application, the expression (Q ∧ (~ R))
is a wff. Again, by another recursive application, the expression (P → (Q ∧ (~ R)))
is a wff. And, finally, the expression given initially is a wff.
Further, it is easy to see that according to the recursive definition of a wff, each of the
expressions: (P → (Q ∧ )) and (P ( Q ∧ R )) is not a wff.
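The recursive definition above can be sketched as a small recursive check. This is only an illustration under an assumed encoding that is not part of the text: an atom is a string, ("not", A) stands for (~A), and (op, A, B) stands for (A op B), where op is one of the hypothetical names "and", "or", "implies" and "iff".

```python
# Assumed encoding (not from the text): an atom is a string such as "P";
# ("not", A) stands for (~A); (op, A, B) stands for (A op B).
BINARY = {"and", "or", "implies", "iff"}

def is_wff(f):
    """Return True if f can be built using rules 1-3 of the definition."""
    if isinstance(f, str):                      # rule 1: an atom is a wff
        return f.isalnum()
    if isinstance(f, tuple) and len(f) == 2:    # rule 2: (~A)
        return f[0] == "not" and is_wff(f[1])
    if isinstance(f, tuple) and len(f) == 3:    # rule 3: (A op B)
        return f[0] in BINARY and is_wff(f[1]) and is_wff(f[2])
    return False                                # rule 4: nothing else is a wff

# ((P -> (Q and (~R))) <-> S), the wff discussed above
good = ("iff", ("implies", "P", ("and", "Q", ("not", "R"))), "S")
# (P -> (Q and )), where the conjunction lacks its second operand
bad = ("implies", "P", ("and", "Q"))
print(is_wff(good), is_wff(bad))
```

The check mirrors the definition: each branch corresponds to one rule, and the final `return False` plays the role of rule 4.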
For example: Let us be given the wff P → Q ∧ ~ R without parentheses. Then, among
the operators appearing in the wff, the operator ‘~’ has the highest priority. Therefore, ~ R is
replaced by (~ R). The equivalent expression becomes P → Q ∧ (~ R). Next, out of the
two operators, viz. ‘→’ and ‘∧’, the operator ‘∧’ has higher priority. Therefore, by
applying parentheses appropriately, the new expression becomes P → (Q ∧ (~ R)).
Finally, only one operator is left. Hence, the fully parenthesized expression becomes
(P → (Q ∧ (~ R))).
Next, we define the rules of finding the truth value or meaning of a wff, when truth
values of the atoms appearing in the wff are known or given.
1. The wff ~ A is True when A is False, and ~ A is False when A is True. The wff
~ A is called the negation of A.
2. The wff (A ∧ B) is True if A and B are both True; otherwise, the wff (A ∧ B) is
False. The wff (A ∧ B) is called the conjunction of A and B.
3. The wff (A ∨ B) is True if at least one of A and B is True; otherwise, (A ∨ B) is
False. (A ∨ B) is called the disjunction of A and B.
4. The wff (A → B) is False if A is True and B is False; otherwise, (A → B) is True.
The wff (A → B) is read as “If A, then B,” or “A implies B.” The symbol ‘→’ is
called implication.
5. The wff (A ↔ B) is True whenever A and B have the same truth values;
otherwise (A ↔ B) is False. The wff (A ↔ B) is read as “A if and only if B.”
Table 1.5
A B ~A (A ∧ B) (A ∨ B) (A → B) (A ↔ B)
(i) T T F T T T T
(ii) T F F F T F F
(iii) F T T F T T F
(iv) F F T F F T T
This table shall be used to evaluate the truth value of a wff in terms of the truth
values of the atoms occurring in the formula.
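As a rough illustration, the five rules above can be written as Python expressions and Table 1.5 regenerated from them. The helper names `implies`, `iff` and `connective_table` are our own choices for this sketch.

```python
from itertools import product

def implies(a, b):
    # Rule 4: False only when a is True and b is False
    return (not a) or b

def iff(a, b):
    # Rule 5: True whenever both sides have the same truth value
    return a == b

def connective_table():
    """Rows (A, B, ~A, A and B, A or B, A -> B, A <-> B) in the order of Table 1.5."""
    rows = []
    for a, b in product([True, False], repeat=2):
        rows.append((a, b, not a, a and b, a or b, implies(a, b), iff(a, b)))
    return rows

for row in connective_table():
    print(" ".join("T" if value else "F" for value in row))
```

Each printed line corresponds to one of the rows (i) to (iv) of the table.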
Now, we discuss the issue, raised in Section 1.2, of how to check validity/invalidity of
an argument through formal means.
Validity through Truth-Table
S L S→L ~L ~S
F F T T T
F T T F T
T F F T F
T T T F F
There is only one row, viz., the first row, in which both the premises, viz. S → L and ~ L,
are True. But in this case the conclusion represented by ~ S is also True. Hence, the
argument is valid.
S L (S → L) ~ S ~L
F F T T T
F T T T F
T F F F T
T T T F F
The invalidity of the argument is established because, for validity, the last column must
contain True in those rows for which all axioms/premises are True. But in the second
row, both S → L and ~ S are True but ~ L is False.
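The truth-table test used in the two tables above can be sketched in code: an argument is valid exactly when no row makes every premise True and the conclusion False. The function name `is_valid_argument` and the use of lambdas for the formulas are assumptions made for this illustration.

```python
from itertools import product

def is_valid_argument(premises, conclusion, n_atoms):
    """Valid iff no assignment makes all premises True and the conclusion False."""
    for values in product([False, True], repeat=n_atoms):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

def s_implies_l(s, l):
    return (not s) or l

# Premises S -> L and ~L, conclusion ~S (the first table above)
first = is_valid_argument([s_implies_l, lambda s, l: not l], lambda s, l: not s, 2)
# Premises S -> L and ~S, conclusion ~L (the second table above)
second = is_valid_argument([s_implies_l, lambda s, l: not s], lambda s, l: not l, 2)
print(first, second)
```

The second argument fails on the row S = False, L = True, which is the second row of the second table.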
Ex. 2: Let
P : He needs a doctor, Q : He needs a lawyer,
R : He has an accident, S : He is sick,
U : He is injured.
State the following formulas in English.
a) (S → P) ∧ (R → Q) b) P → (S ∨ U)
c) (P ∧ Q) → R d) (P ∧ Q) ↔ (S ∧ U)
In order to find the truth value of a given formula G, the truth values for the atoms of
the formula are either given or assumed. The set of initially given/assumed values of
all the atomic formulas occurring in a formula, say G, is called an interpretation of
the formula G. Suppose that A and B are two atoms and that the truth values of A and
B are F and T, respectively. Then, according to the third row of Table 1.5, we find
that the truth values of (~A), (A ∧ B),
(A ∨ B), (A → B), and (A ↔ B) are T, F, T, T and F, respectively. By developing a
Truth-table of any formula, its truth value can be evaluated in terms of its
interpretation, i.e., in terms of the truth values associated with the constituent atoms.
Example
G : ((A ∧ B) → (R ↔ (~ S))).
(Please note that the string, in this case G, before the symbol ‘:’, is the name of the
formula, i.e., the name of the string of symbols after ‘:’. Thus, G is the name of the
formula ((A ∧ B) → (R ↔ (~ S))).)
The atoms in this formula are A, B, R and S. Suppose the truth values of A, B, R, and
S are given as T, F, T and T, respectively. Then (in the following and elsewhere also,
if there is no possibility of confusion, we use T for ‘True’ and F for ‘False’.)
• (A ∧ B) is F since B is F;
• (~S) is F since S is T;
• (R ↔ (~ S)) is F since R is T and (~S) is F; and hence,
• (A ∧ B) → (R ↔ (~S)) is T since (A ∧ B) is F (and (R ↔ (~S)) is F, which
does not matter).
Note: In view of the fact that when ( A ∧ B) is F, the truth-value of
(A ∧ B) → Any Formula
must be T and, hence, we need not compute the value of (R ↔ (~ S)).
Therefore, the formula G is T if A, B, R, and S are assigned truth values T, F, T and T,
respectively.
The above procedure may be repeated to find truth value of any formula from any
interpretation, i.e., from any assignment to the atomic formulas occurring in the given
formula.
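The step-by-step procedure above amounts to a recursive evaluation of the formula. The following is a minimal sketch, assuming formulas are encoded as nested tuples (a convention of ours, not of the text) and an interpretation is a dictionary from atom names to truth values.

```python
def evaluate(f, interp):
    """Truth value of formula f under the interpretation interp (atom -> bool)."""
    if isinstance(f, str):                  # an atom: look up its assigned value
        return interp[f]
    op = f[0]
    if op == "not":
        return not evaluate(f[1], interp)
    a, b = evaluate(f[1], interp), evaluate(f[2], interp)
    if op == "and":
        return a and b
    if op == "or":
        return a or b
    if op == "implies":
        return (not a) or b
    if op == "iff":
        return a == b
    raise ValueError(f"unknown connective: {op}")

# G : ((A and B) -> (R <-> (~S)))
G = ("implies", ("and", "A", "B"), ("iff", "R", ("not", "S")))
print(evaluate(G, {"A": True, "B": False, "R": True, "S": True}))
```

For the interpretation used in the worked example (A, B, R, S as T, F, T, T), the call returns True, in agreement with the computation above.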
Table 1.6 Truth Table of (A ∧ B) → (R ↔ (~S))
A B R S ~S (A ∧ B) (R ↔ (~S)) (A ∧ B) → (R ↔ (~S))
T T T T F T F F
T T T F T T T T
T T F T F T T T
T T F F T T F F
T F T T F F F T
T F T F T F T T
T F F T F F T T
T F F F T F F T
F T T T F F F T
F T T F T F T T
F T F T F F T T
F T F F T F F T
F F T T F F F T
F F T F T F T T
F F F T F F T T
F F F F T F F T
A table, such as given above, that displays the truth values of a formula G for all
possible assignments of truth values to atoms occurring in G is called a Truth table
of G.
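A table of this kind can be generated mechanically by enumerating all 2^n assignments. The sketch below writes G directly as a Python function; the helper name `truth_table` is hypothetical.

```python
from itertools import product

def truth_table(formula, atoms):
    """Return (interpretation, value) pairs for all 2**len(atoms) assignments."""
    table = []
    for values in product([True, False], repeat=len(atoms)):
        interp = dict(zip(atoms, values))
        table.append((interp, formula(**interp)))
    return table

def G(A, B, R, S):
    # (A and B) -> (R <-> (~S)), with the implication written as (~X) or Y
    return (not (A and B)) or (R == (not S))

rows = truth_table(G, ["A", "B", "R", "S"])
for interp, value in rows:
    print(" ".join("T" if interp[a] else "F" for a in "ABRS"), "T" if value else "F")
```

The rows come out in the same order as Table 1.6, starting from the assignment in which every atom is True.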
NOTATION: If A1, …, An are all the atoms in a formula, it may be more convenient to
represent an interpretation by a set {m1, …, mn}, where mi is either Ai or ~Ai. mi is
written as Ai if T is assigned to Ai, and mi is written as ~Ai if F is assigned to Ai.
For example, the set {A, ~B, ~R,S} represents an interpretation of a formula in which
A, B, R, and S are the only atoms and which are, respectively, assigned T, F, F, and T.
We will use the notation throughout.
It may be noted that in Section 1.2, we discussed the concept of a valid argument. Here,
we study formulas or propositions. Next, we shall consider wffs that are True under
all possible interpretations and wffs that are False under all possible interpretations.
Example
G : (((A → B) ∧ A) → B).
The formula G has 2^2 = 4 possible interpretations, in view of the fact that it has two atoms,
viz. A and B. It can be easily seen from the following table that the wff G is True
under all its interpretations. Such a wff, which is True under all interpretations, is
called a valid formula (or a tautology).
Truth Table of (((A → B) ∧ A) → B)
A B (A → B) (A → B) ∧ A ((A → B) ∧ A) → B
T T T T T
T F F F T
F T T F T
F F T F T
G : ((A → B) ∧ (A ∧ ~ B))
The truth table of the formula G given below shows that G is False under all its
interpretations. Such a formula which is False under all interpretations is called an
inconsistent formula (or a contradiction).
Definition: A formula is said to be valid if and only if it is True under all its
interpretations. A valid formula is also called a Tautology. A formula is said to be
invalid if and only if it is not valid, i.e., if there is at least one interpretation for
which the formula has the truth value False.
Examples:
(i) A Valid Formula:
(a) Even True is a wff which is always True and, hence, True is a valid formula.
(b) G1: A ∨ (~A) is True for all its interpretations. As G1 has only one atom, viz. A,
it has only two interpretations. Let one interpretation of G1 be: A is
True. Then G1 assumes the value (True ∨ (~ True)) = True. The other
interpretation of G1 is: A is False. Then G1 assumes the value (False ∨ (~ False)) =
True.
(ii) Consistent (True for at least one interpretation) but not valid Formula (i.e. is
invalid, i.e., False for at least one interpretation):
(a) The simplest example of such a formula is the formula G2: A. Then, for the
assignment A as True, G2 is True. Therefore G2 is consistent. On the other
hand, the interpretation of G2 with A as False, makes G2 false. Therefore, G2:
A is not valid.
(b) Both G3 : A ∨ B and G4 : A ∧ B are consistent but not valid. Both G3 and G4
are True under the assignment A as True and B as True. On the other hand,
both are False under the interpretation A as False and B as False.
(iii) Invalid (False for at least one interpretation) but not inconsistent (not False
for all interpretations): Any one of the examples in (ii) above.
(iv) Inconsistent Formula (False for all interpretations):
(a) Even ‘False’ is a wff which is always False and, hence, is inconsistent.
(b) G5 : A ∧ (~A) is False, for all interpretations of G5. Actually, there are only
two interpretations of G5. One is : A is True. The other is : A is False. In both
cases G5 is False.
It will be shown later that the proof of the validity or inconsistency of a formula is a
very important problem. In the propositional logic, since the number of interpretations
of a formula is finite, one can always decide whether or not a formula in the
propositional logic is valid (inconsistent) by exhaustively examining all of its possible
interpretations.
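The exhaustive examination described above can be sketched as follows. The function name `classify` and the three labels it returns are choices made for this illustration; formulas are given as Python functions of their atoms.

```python
from itertools import product

def classify(formula, n_atoms):
    """Examine all 2**n_atoms interpretations and classify the formula."""
    values = [formula(*v) for v in product([False, True], repeat=n_atoms)]
    if all(values):
        return "valid"
    if not any(values):
        return "inconsistent"
    return "consistent but not valid"

print(classify(lambda a: a or (not a), 1))      # G1: A or ~A
print(classify(lambda a: a, 1))                 # G2: A
print(classify(lambda a, b: a and b, 2))        # G4: A and B
print(classify(lambda a: a and (not a), 1))     # G5: A and ~A
```

Since a formula with n atoms has 2^n interpretations, the loop always terminates, although the number of rows doubles with each additional atom.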
Ex. 4: For each of the following formulas, determine whether it is valid, inconsistent,
consistent or some combination of these.
(i) E: ~ (~A) → B
(ii) G: (A → B) → (~ B → ~ A)
(iii) H: (A ∨ ~ A) → (A ∧ B ) ∧ ( ~ A)
(iv) J: (A ∧ B) ∧ (~ A) → ( B ∨ ~ B)
Example
Table of Equivalences of PL
(1.1) E ↔ G = (E → G) ∧ (G → E)
(1.2) E → G = ~ E ∨ G
(1.3)(a) E ∨ G = G ∨ E; (b) E ∧ G = G ∧ E
(1.4)(a) (E ∨ G) ∨ H = E ∨ (G ∨ H); (b) (E ∧ G) ∧ H = E ∧ (G ∧ H)
(1.5)(a) E ∨ (G ∧ H) = (E ∨ G) ∧ (E ∨ H); (b) E ∧ (G ∨ H) = (E ∧ G) ∨ (E ∧ H)
(1.6)(a) E ∨ False = E; (b) E ∧ True = E
(1.7)(a) E ∨ True = True (b) E ∧ False = False
(1.8)(a) E ∨ ~ E = True; (b) E ∧ ~ E = False
(1.9) ~ (~ E) = E
(1.10)(a) ~ (E ∨ G) = ~ E ∧ ~ G; (b) ~ (E ∧ G) = ~ E ∨ ~ G
In the table given above, True denotes the fact that the wff is True under all
interpretations and False denotes the wff that is False under all interpretations.
Laws (1.3a), (1.3b) are often called commutative laws; (1.4a), (1.4b), associative
laws; (1.5a), (1.5b), distributive laws; and (1.10a), (1.10b), De Morgan’s laws.
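Each law in the table can be confirmed by comparing the two sides under every interpretation. The following is a minimal sketch for a few of the laws; the helper names `equivalent` and `imp` are our own.

```python
from itertools import product

def equivalent(f, g, n_atoms):
    """True if f and g have the same truth value under every interpretation."""
    return all(f(*v) == g(*v) for v in product([False, True], repeat=n_atoms))

def imp(a, b):
    # Implication taken from its truth table: False only for (True, False)
    return not (a and (not b))

checks = {
    "1.1": equivalent(lambda e, g: e == g, lambda e, g: imp(e, g) and imp(g, e), 2),
    "1.2": equivalent(imp, lambda e, g: (not e) or g, 2),
    "1.5a": equivalent(lambda e, g, h: e or (g and h),
                       lambda e, g, h: (e or g) and (e or h), 3),
    "1.5b": equivalent(lambda e, g, h: e and (g or h),
                       lambda e, g, h: (e and g) or (e and h), 3),
    "1.10a": equivalent(lambda e, g: not (e or g), lambda e, g: (not e) and (not g), 2),
    "1.10b": equivalent(lambda e, g: not (e and g), lambda e, g: (not e) or (not g), 2),
}
print(checks)
```

The same helper can be used to reject a proposed law: for instance, E → G and G → E are not equivalent, and `equivalent` reports this.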
(i) (~ A ∨ B) ∨ (A ∧ ~ B ∨ C)
(ii) ( A → B) ∧ ( ~ B ∧ ~ A)
Using the table of equivalent formulas given above, any Propositional Logic
formula can be transformed into CNF as well as DNF.
Step 1: Use the equivalences to remove the logical operators ‘↔’ and ‘→’:
(i) E ↔ G = (E → G) ∧ (G → E)
(ii) E → G = ~ E ∨ G
(iii) ~ (~E) = E
(v) ~(E ∨ G) = ~ E ∧ ~ G
(vi) ~(E ∧ G) = ~ E ∨ ~ G
(vii) E ∨ (G ∧ H) = (E ∨ G) ∧ (E ∨ H)
(viii) E ∧ (G ∨ H) = (E ∧ G) ∨ (E ∧ H)
Example
= (A ∧ B) ∨ (A ∧ (~ C)) (using E ∧ (F ∨ G) = (E ∧ F) ∨ (E ∧ G))
However, if we are to obtain the CNF of ~ (A → (~ B ∧ C)), in the last but one step, we
obtain
~ (A → (~ B ∧ C)) = A ∧ (B ∨ ~ C), which is in CNF, because each of A and
(B ∨ ~ C) is a disjunction of literals.
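The transformation into CNF can be sketched as three passes over a formula: removing ‘↔’ and ‘→’, moving negations inward, and distributing ∨ over ∧. The nested-tuple encoding and the function names below are assumptions made for this illustration; the sketch does not simplify or reorder the result.

```python
def eliminate(f):
    """Remove 'iff' and 'implies' using E <-> G = (E -> G) and (G -> E), E -> G = ~E or G."""
    if isinstance(f, str):
        return f
    if f[0] == "not":
        return ("not", eliminate(f[1]))
    a, b = eliminate(f[1]), eliminate(f[2])
    if f[0] == "implies":
        return ("or", ("not", a), b)
    if f[0] == "iff":
        return ("and", ("or", ("not", a), b), ("or", ("not", b), a))
    return (f[0], a, b)

def push_not(f):
    """Move negations inward using ~(~E) = E and De Morgan's laws."""
    if isinstance(f, str):
        return f
    if f[0] == "not":
        g = f[1]
        if isinstance(g, str):
            return f                          # a negated atom stays as it is
        if g[0] == "not":
            return push_not(g[1])             # double negation
        dual = "or" if g[0] == "and" else "and"
        return (dual, push_not(("not", g[1])), push_not(("not", g[2])))
    return (f[0], push_not(f[1]), push_not(f[2]))

def distribute(f):
    """Distribute 'or' over 'and' so that the result is a conjunction of disjunctions."""
    if isinstance(f, str) or f[0] == "not":
        return f
    a, b = distribute(f[1]), distribute(f[2])
    if f[0] == "and":
        return ("and", a, b)
    if not isinstance(a, str) and a[0] == "and":
        return ("and", distribute(("or", a[1], b)), distribute(("or", a[2], b)))
    if not isinstance(b, str) and b[0] == "and":
        return ("and", distribute(("or", a, b[1])), distribute(("or", a, b[2])))
    return ("or", a, b)

def to_cnf(f):
    return distribute(push_not(eliminate(f)))

# ~(A -> (~B and C)), the formula discussed above
formula = ("not", ("implies", "A", ("and", ("not", "B"), "C")))
print(to_cnf(formula))
```

For the formula above, the result is the tuple encoding of A ∧ (B ∨ ~ C). A DNF version would differ only in the last pass, distributing ∧ over ∨ instead.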
Consider
D → (A → (B ∧ C)) (using E → F = ~ E ∨ F for the inner implication)
(i) ~ (A ∨ ~ B) ∧ (S → T) (ii) (A → B) → R
(ii) (~A ∧ B) ∨ (A ∧ ~ B)
(i) (A→ B) → (A ∧ B) = (~ A → B) ∧ (B → A)
(ii) A ∧ B ∧ (~A ∨ ~ B) = ~ A ∧ ~ B ∧ (A ∨ B)
Next, we state without proof two very useful theorems for establishing logical
derivations:
The above two theorems are very useful. They show that proving a particular
formula as a logical consequence of a finite set of formulas is equivalent to
proving that a certain single but related formula is valid or inconsistent.
Note: The significance of the above two theorems lies in the fact that logical consequence
relates two formulas, whereas validity/inconsistency is only about one formula. Also,
there are a number of well-known methods, including the truth-table method, for
establishing inconsistency/validity of a formula. Thus, to show that a formula G logically follows
from a given set of formulas, we check the validity of a single formula. And, for checking
the validity of a single formula, we already have some methods, including the Truth-table
method.
Definition: If the formula G is a logical consequence of the formulas E1, …, En, then the
single formula ((E1 ∧ … ∧ En) → G) is called a theorem, and G is also called the
conclusion of the theorem.
According to the second method, using Theorem 1, we should show that the formula
(E1 ∧ E2 ∧ … ∧ En) → G
is valid, i.e., True for each of its interpretations. Again, validity can be shown either
through a truth table or otherwise.
The last of the three methods uses Theorem 2. According to this method, in order to
show G as a logical consequence of E1, E2, …, En, it should be established that the
formula (E1 ∧ E2 ∧ … ∧ En ∧ ~ G) is inconsistent, i.e., is False under all its
interpretations. Next, we apply these methods through an example.
E1 : (A → B), E2 : ~B , G : ~ A
Method 1: From the following Table, it is clear that whenever E1: A → B and
E2: ~ B both are simultaneously True, (which is true only in the last row of the table)
then G: ~ A is also True. Hence, the proof.
A B A→B ~B ~A
T T T F F
T F F T F
F T T F T
F F T T T
((A → B) ∧ ~ B) → ~ A = ~ ((A → B) ∧ ~ B) ∨ ~ A (using E → F = (~ E ∨ F))
= ~ ((~ A ∨ B) ∧ ~ B) ∨ ~ A
= ~ ((~ A ∧ ~ B) ∨ (B ∧ ~ B)) ∨ ~ A
= ~ ((~ A ∧ ~ B) ∨ False) ∨ ~ A
= ~ (~ A ∧ ~ B) ∨ ~ A
= (A ∨ B) ∨ ~ A (using De Morgan’s Laws)
= (B ∨ A) ∨ ~ A
= B ∨ (A ∨ ~ A)
= B ∨ True
= True (always)
Thus, ((A → B) ∧ ~ B) → ~ A is valid.
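The three methods of establishing a logical consequence can be compared side by side on the example E1 : (A → B), E2 : ~ B, G : ~ A. The sketch below simply enumerates the four interpretations; the variable names are our own.

```python
from itertools import product

def imp(a, b):
    return (not a) or b

premises = [lambda a, b: imp(a, b), lambda a, b: not b]   # E1 and E2
conclusion = lambda a, b: not a                            # G
rows = list(product([False, True], repeat=2))

# Method 1: G is True in every row in which all the premises are True.
method1 = all(conclusion(*r) for r in rows if all(p(*r) for p in premises))
# Method 2: the single formula (E1 and E2) -> G is valid.
method2 = all(imp(all(p(*r) for p in premises), conclusion(*r)) for r in rows)
# Method 3: the single formula E1 and E2 and ~G is inconsistent.
method3 = not any(all(p(*r) for p in premises) and not conclusion(*r) for r in rows)
print(method1, method2, method3)
```

All three tests agree, as they must, since they are different statements of the same condition.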
2.11 APPLICATIONS
Example
Suppose the stock prices go down if the interest rate goes up. Suppose also that most
people are unhappy when stock prices go down. Assume that the interest rate goes up.
Show that we can conclude that most people are unhappy.
To show the above conclusion, let us denote the statements as follows:
(1′) A→ S
(2′) S→ U
(3′) A
(4′) U. (to conclude)
In order to establish the conclusion, we should show that (4′) is a logical consequence
of (1′), (2′) and (3′). For this purpose, we show that (4′) is true whenever (1′) ∧ (2′) ∧
(3′) is true.
= (((A ∧ ~ A) ∨ (A ∧ S)) ∧ (~ S ∨ U)) (by using associative laws and then using distributivity of ‘A ∧’ over the disjunction (~ A ∨ S))
= ((False ∨ (A ∧ S)) ∧ (~ S ∨ U))
= (A ∧ S) ∧ (~ S ∨ U) (using False ∨ E = E)
= (A ∧ S ∧ ~ S) ∨ (A ∧ S ∧ U)
= (A ∧ False) ∨ (A ∧ S ∧ U)
= False ∨ (A ∧ S ∧ U) (using A ∧ False = False)
= A ∧ S ∧ U
Therefore, if ((A → S) ∧ (S → U) ∧ A) is true, then (A ∧ S ∧ U) is true. Since,
whenever (A ∧ S ∧ U) is true, each of A, S, and U is true, we conclude that U is true. Hence,
U is a logical consequence of (1′), (2′) and (3′) given above.
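The algebraic reduction above can be cross-checked by brute force. In the sketch below we read A, S and U, as the premises suggest, as ‘the interest rate goes up’, ‘the stock prices go down’ and ‘most people are unhappy’; this reading, and the function names, are assumptions of the illustration.

```python
from itertools import product

def imp(a, b):
    return (not a) or b

def premises(a, s, u):
    # (A -> S) and (S -> U) and A
    return imp(a, s) and imp(s, u) and a

rows = list(product([False, True], repeat=3))
# The conjunction of the premises has the same truth value as A and S and U.
same_as_conjunction = all(premises(a, s, u) == (a and s and u) for a, s, u in rows)
# U is True in every interpretation in which all the premises are True.
u_follows = all(u for a, s, u in rows if premises(a, s, u))
print(same_as_conjunction, u_follows)
```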
Ex. 10: Given that if the Parliament refuses to enact new laws, then the strike will not
be over unless it lasts more than one year and the president of the firm resigns, will
the strike not be over if the Parliament refuses to act and the strike just starts?
2.12 SUMMARY
In this unit, to begin with, we discuss what Symbolic Logic is and why it is
important to study it. The subject matter of symbolic logic consists of arguments,
where an argument consists of a number of statements, one of which is called
the conclusion and is supposed to be logically drawn from the others. Each one of the
others is called a premise. To be more specific, the subject of Symbolic Logic is the
study of how to develop tools and techniques to draw correct conclusions from a given
set of premises or to verify whether a conclusion is correct or not. A conclusion is
correct in the sense: whenever all the premises are True, the conclusion is
necessarily True. An argument with a correct conclusion is called a valid argument.
Next, a sound argument is defined as a valid argument in which the premises also have to
be True (in some world).
In this unit, we study only a specific branch of symbolic logic, viz. Propositional
Logic (PL).
Next, we discuss how a statement, also called a well-formed formula (wff) and also a
Proposition, which is the basic unit of an argument in PL, is appropriately denoted
and how it is interpreted, i.e., how a wff is given meaning. The meaning of a wff in
PL is only in terms of True or False. The wffs are classified as valid, invalid,
consistent and inconsistent.
Then tools and techniques in the form of Truth-table, logical deduction, normal forms,
etc. are discussed to test these properties of wffs and also to test validity of arguments.
Finally, these concepts, tools and techniques of PL are
applied to solve problems that involve logical reasoning.
2.13 SOLUTIONS/ANSWERS
Ex. 1
(a) Let H: He campaigns hard ; E: He will be elected
Then the statement becomes the formula:
H → E
(b) Let H: The Humidity is high, RTY: It will rain today
RTW: It will rain tomorrow.
Then
H → RTY ∨ RTW
(c) Let C: Cancer will be cured
D: Cancer’s cause will be determined
F: A new drug for cancer will be found
Then the statement becomes the formula:
(~ C) ∨ (D ∧ F). This formula may also be written as:
C → D ∧ F
(d) Let C: One has courage
S: One has skill
M: One climbs mountain
Then the statement becomes the formula:
M → C ∧ S
Ex 2: (a) If he is sick then he needs a doctor, but, if he has an accident then he needs a
lawyer
(b) If one requires a doctor then one must be either sick or injured.
(c) If he needs both a doctor and a lawyer then he has an accident.
(d) One requires a doctor and also a lawyer if and only if one is sick and also
injured.
Ex. 3:
(i) Truth table of the formula: P: (~ A ∨ B) ∧ ( ~ (A ∧ ~ B)) is as given below.
A B ~A ~B ~A ∨ B A∧~B ~ (A ∧ ~B) P
T T F F T F T T
T F F T F T F F
F T T F T F T T
F F T T T F T T
Ex. 4:
(i) Consistent but not valid, because, for B as T and A as F, the formula
is T. But, for A as T and B as F, the formula is F.
(ii) It can be easily seen that ~ B → ~ A has the same truth-value as (A → B) for any
interpretation. Therefore, instead of the given formula, we may consider
the formula
(A → B) → (A → B)
which can be further written as P → P, writing (A → B) as P. Further, P → P
can be written as ~ P ∨ P, which is T under every interpretation, whatever the
truth value of P may be.
Therefore, the formula is valid (and, hence, also consistent).
(iii) For all truth assignments to A and B, the L.H.S. of the formula is always T
and the R.H.S. is always F. Hence, the formula is inconsistent, i.e., always F.
(iv) The L.H.S. of the given formula is F under all interpretations. Hence, the
formula is T under all interpretations. Therefore, the formula is valid.
Ex. 6:
(i) Using distributive law in the last formula of 5 (ii) above, we get
(A ∨ R) ∧ (~ B ∨ R)
which is the required CNF
= (( A ∧ B) ∧ (~ A ∨ ~ B )),
using left distributivity and commutativity of ∧ we get
= (( A ∧ B) ∧ ~ A) ∨ (( A ∧ B) ∧ ~ B)
Using associativity of ∧ and using A ∧ ~ A = F = B ∧ ~ B
= (B ∧ F) ∨ ( A ∧ F)
Using A ∧ F = F = B ∧ F
= F
Ex. 9: (i) From the following table, ((A → B) ∧ ~ B ∧ A) being False for all
interpretations, is inconsistent.
Truth Table of (A → B) ∧ ~ B ∧ A
A B A→B ~B (A → B) ∧ ~ B ∧ A
T T T F F
T F F T F
F T T F F
F F T T F
(A → B) ∧ ~ B ∧ A = (~ A ∨ B) ∧ ( ~ B ∧ A )
= (~ A ∧ ~ B ∧ A) ∨ (B ∧ ~ B ∧ A) (Distributive Law)
= (~ A ∧ A ∧ ~ B) ∨ (F ∧ A)
= False ∨ False = False
Thus (A → B) ∧ ~ B ∧ A is inconsistent.
Ex. 10:
Let us symbolize the statements in the problem stated above as follows:
A: The Parliament refuses to act.
B: The strike is over.
R: The president of the firm resigns.
S: The strike lasts more than one year.
Then the facts and the question to be answered in the problem can be symbolized as:
E1: (A → (~ B ∨ (R ∧ S))) represents the statement ‘If the Parliament refuses to enact
new laws, then the strike will not be over unless it lasts more than one year and the
president of the firm resigns.’
E2: A represents the statement ‘The Parliament refuses to act.’
E3: ~ S represents the statement ‘The strike just starts.’
Ex. 10: We solve the problem by showing that the formula P: ((A → (~ B ∨ (R ∧ S)))
∧ A ∧ ~ S) → ~ B is valid by two methods: (i) by reducing to CNF/DNF
(ii) by constructing truth-table of the formula.
Method (ii)
The solution of the problem lies in showing that ~ B logically follows from E1, E2, and
E3. This is equivalent to showing that P: ((A → (~ B ∨ (R ∧ S))) ∧ A ∧ ~ S) → ~ B is
a valid formula. The truth values of the above formula under all the interpretations are
shown in the table given below.
A B R S ~B ~ B ∨ (R ∧ S)
T T T T F T
T T T F F F
T T F T F F
T T F F F F
T F T T T T
T F T F T T
T F F T T T
T F F F T T
F T T T F T
F T T F F F
F T F T F F
F T F F F F
F F T T T T
F F T F T T
F F F T T T
F F F F T T
A B R S E1 E2 E3 ~B ~B ∨ (R ∧ S) E1 (E1 ∧ E2 ∧ E3) → ~B
T T T T T T F F T T T
T T T F F T T F F F T
T T F T F T F F F F T
T T F F F T T F F F T
T F T T T T F T T T T
T F T F T T T T T T T
T F F T T T F T T T T
T F F F T T T T T T T
F T T T T F F F T T T
F T T F T F T F F T T
F T F T T F F F F T T
F T F F T F T F F T T
F F T T T F F T T T T
F F T F T F T T T T T
F F F T T F F T T T T
F F F F T F T T T T T
Under all interpretations, the formula is True. Hence, the formula P is a valid formula, and ~ B is
a logical consequence of E1, E2 and E3. Hence, ‘The strike will not be over’ is a
valid conclusion.
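The sixteen rows of the table can also be checked mechanically. The following sketch writes P as a Python function of A, B, R and S, with E1, E2 and E3 as in the symbolization above; the function names are our own.

```python
from itertools import product

def imp(a, b):
    return (not a) or b

def P(a, b, r, s):
    e1 = imp(a, (not b) or (r and s))    # E1: A -> (~B or (R and S))
    e2 = a                               # E2: A
    e3 = not s                           # E3: ~S
    return imp(e1 and e2 and e3, not b)  # (E1 and E2 and E3) -> ~B

rows = list(product([True, False], repeat=4))
print(all(P(*row) for row in rows))
```

The check confirms that P holds under every one of the 2^4 = 16 interpretations.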
UNIT 1 THE FIRST-ORDER PREDICATE
LOGIC (FOPL)
Structure
1.0 Introduction
1.1 Objectives
1.2 Syntax of Predicate Logic
1.3 Prenex Normal Form (PNF)
1.4 (Skolem) Standard Form
1.5 Applications of FOPL
1.6 Summary
1.7 Solutions/Answers
1.8 Further Readings
1.0 INTRODUCTION
Once the statements in the argument in English are symbolised to apply the tools of
propositional logic, we just have three symbols P, Q and R available with us and
apparently no link or connection to the original statements or to each other. The
connections, which would have helped in solving the problem, become invisible. In
Propositional Logic, there is no way to conclude the symbol R from the symbols P
and Q. However, as we mentioned earlier, even in a natural language, the conclusion
of the statement denoted by R from the statements denoted by P and Q is obvious.
Therefore, we search for some symbolic system of reasoning that helps us in
discussing argument forms of the above-mentioned type, in addition to those forms
which can be discussed within the framework of propositional logic. First Order
Predicate Logic (FOPL) is the most well-known symbolic system for the purpose.
The symbolic system of FOPL does not treat an atomic statement as an indivisible unit.
Rather, FOPL not only treats an atomic statement as divisible into subject and predicate,
but also considers even deeper structures of an atomic statement in order to
handle a larger class of arguments. How and to what extent FOPL symbolizes and
establishes validity/invalidity and consistency/inconsistency of arguments is the
subject matter of this unit.
In addition to the baggage of concepts of propositional logic, FOPL has the
following additional concepts: terms, predicates and quantifiers. These concepts
will be introduced at appropriate places.
In order to have a glimpse at how FOPL extends propositional logic, let us again
discuss the earlier argument.
More generally, relations of the form greater-than (x, y) denoting the phrase ‘x is
greater than y’, is_brother_of (x, y) denoting ‘x is brother of y’, Between (x, y, z)
denoting the phrase ‘x lies between y and z’, and is_tall (x) denoting ‘x is tall’ are
some examples of predicates. The variables x, y, z, etc., which appear in a predicate
are called parameters of the predicate.
The parameters may be given some appropriate values such that after substitution of
appropriate value from all possible values of each of the variables, the predicates
become statements, for each of which we can say whether it is ‘True’ or it is ‘False’.
For example, for the predicate greater-than (x, y), if x is given value 3 then we obtain
greater-than (3, y), for which still it is not possible to tell whether it is True or False.
Hence, ‘greater-than (3, y)’ is also a predicate. Further, if the variable y is given value
5 then we get greater-than (3, 5) which, as we know, is False. Hence, it is possible to
give its Truth-value, which is False in this case. Thus, from the predicate greater-than
(x, y), we get the statement greater-than (3, 5) by assigning values 3 to the variable x
and 5 to the variable y. These values 3 and 5 are called parametric values or
arguments of the predicate greater-than.
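The passage from a predicate to a statement can be mimicked in code. In the sketch below, a Python function plays the role of the predicate greater-than, and fixing only the first parameter (here with `functools.partial`, a choice made for this illustration) still leaves a predicate in y.

```python
from functools import partial

def greater_than(x, y):
    """The predicate greater-than(x, y); a statement only when both values are given."""
    return x > y

# greater-than(3, y): one parameter fixed, still a predicate in y
greater_than_3 = partial(greater_than, 3)
print(callable(greater_than_3))

# greater-than(3, 5): both parameters fixed, so a truth value is available
print(greater_than_3(5))
```

The first line of output only says that a parameter is still missing; the second gives the truth value of the statement greater-than (3, 5), which is False.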
Similarly, we can represent the phrase x likes y by the predicate LIKE (x, y). Then
Ram likes Mohan can be represented by the statement LIKE (RAM, MOHAN).
Also function symbols can be used in the first-order logic. For example, we can use
product (x, y) to denote x ∗ y and father (x) to mean the ‘father of x’. The statement:
Mohan’s father loves Mohan can be symbolised as LOVE (father (Mohan), Mohan).
Thus, we need not know name of father of Mohan and still we can talk about him. A
function serves such a role.
We may note that LIKE (Ram, Mohan) and LOVE (father (Mohan), Mohan) are atoms
or atomic statements of PL, in the sense that one can associate a truth-value True or
False with each of these, and each of these does not involve a logical operator like ~,
∧, ∨, → or ↔.
Summarizing the above discussion, LIKE (Ram, Mohan) and LOVE (father
(Mohan), Mohan) are atoms, whereas greater-than, LOVE and LIKE are predicate
symbols; x and y are variables; 3, Ram and Mohan are constants; and father and
product are function symbols.
1.1 OBJECTIVES
(i) How analysis of an atomic statement of PL can and should be carried out.
(ii) What are the new concepts and terms that are required to discuss the subject
matter of FOPL.
(iii) How (i) and (ii) above will prove useful in solving problems using FOPL over and
above the set of problems solvable using only PL.
Also, in the introduction to the previous unit, we mentioned that a symbolic logic is a
formal language and hence, all the rules for building constructs of the language must
be specified clearly and unambiguously.
Next, we discuss how various constructs are built up from the alphabet.
For this purpose, from the discussion in the Introduction, we need at least the
following concepts.
iii) Function symbols: These are usually lowercase letters like f, g, h,….or strings
of lowercase letters such as father and product.
iv) Predicate symbols: These are usually uppercase letters like P, Q, R,….or
strings of lowercase letters such as greater-than, is_tall etc.
i) A variable is a term.
ii) A constant is a term.
iii) If f is an n-place function symbol, and t1….tn are terms, then f(t1,….,tn) is a term.
iv) Any term can be generated only by the application of the rules given above.
For example: Since, y and 3 are both terms and plus is a two-place function symbol,
plus (y, 3) is a term according to the above definition.
Furthermore, we can see that plus (plus (y, 3), y) and father (father (Mohan)) are also
terms; the former denotes (y + 3) + y and the latter denotes the grandfather of Mohan.
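The recursive definition of a term can be sketched as a recursive check. The sets of variables, constants and function symbols below are illustrative assumptions drawn from the examples in this unit, and a term f(t1, …, tn) is encoded as the tuple (f, t1, …, tn).

```python
# Assumed alphabet for the illustration (not a fixed part of FOPL)
VARIABLES = {"x", "y", "z"}
CONSTANTS = {"3", "Ram", "Mohan"}
FUNCTIONS = {"plus": 2, "product": 2, "father": 1}   # symbol -> number of places

def is_term(t):
    """Return True if t is a term according to rules (i)-(iii)."""
    if isinstance(t, str):                       # rules (i) and (ii)
        return t in VARIABLES or t in CONSTANTS
    if isinstance(t, tuple) and t:               # rule (iii): f(t1, ..., tn)
        name, args = t[0], t[1:]
        return FUNCTIONS.get(name) == len(args) and all(is_term(a) for a in args)
    return False                                 # rule (iv): nothing else is a term

print(is_term(("plus", ("plus", "y", "3"), "y")))   # plus(plus(y, 3), y)
print(is_term(("father", ("father", "Mohan"))))     # father(father(Mohan))
print(is_term(("father", "Mohan", "Ram")))          # wrong number of arguments
```

The third call is rejected because father is a one-place function symbol.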
We already know that in PL, an atom or atomic statement is an indivisible unit for
representing and validating arguments. Atoms in PL are denoted generally by symbols
like P, Q, and R etc. But in FOPL,
Definition: An Atom is
Once the atoms are defined, by using the logical connectives defined in Propositional
Logic, and assuming that they have similar meaning in FOPL, we can build complex
formulas of FOPL. Two special symbols, ∀ and ∃, are used to denote quantifications in
FOPL. The symbols ∀ and ∃ are called, respectively, the universal quantifier and the
existential quantifier. For a variable x, (∀x) is read as for all x, and (∃x) is read as
there exists an x. Next, we consider some examples to illustrate the concepts discussed
above.
let us denote x is a rational number by Q(x), x is a real number by R(x), and x is less
than y by LESS(x, y). Then the above statements may be symbolized respectively, as
Each of the expressions (i), (ii), and (iii) is called a formula or a well-formed formula
or wff.
Next, we discuss three new concepts, viz. Scope of occurrence of a quantified variable,
Bound occurrence of a quantifier variable or quantifier and Free occurrence of a
variable.
Also, the variable y has only one occurrence and the variable z has zero occurrence in
the above formula. Next, we define the three concepts mentioned above.
Thus, in the formula (∃x) P(x, y) → Q (x), there are three occurrences of x, out of
which the first two occurrences of x are bound, whereas the last occurrence of x is free,
because the scope of (∃x) in the above formula is P(x, y). The only occurrence of y in the
formula is free. Thus, x is both a bound and a free variable in the above formula and y
is only a free variable in the formula. So far, we talked of an occurrence of a variable
as free or bound. Now, we talk of a variable itself as free or bound. A variable is free
in a formula if at least one occurrence of it is free in the formula. A variable is bound
in a formula if at least one occurrence of it is bound.
It may be noted that a variable can be both free and bound in a formula. In order to
further elucidate the concepts of scope, free and bound occurrences of a variable, we
consider a similar but different formula for the purpose:
(∃x) (P(x, y) → Q(x)).
In this formula, the scope of the only occurrence of the quantifier (∃x) is the whole of the
rest of the formula, viz. the scope of (∃x) in the given formula is (P(x, y) → Q (x)).
Also, all three occurrences of the variable x are bound. The only occurrence of y is free.
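The two formulas discussed above can be used to check the definitions of free and bound variables mechanically. The following Python sketch is only an illustration under stated assumptions: formulas are encoded as nested tuples (an encoding chosen here, not taken from the text), every argument of a predicate is assumed to be a variable, and, as in the text, the variable written inside a quantifier counts as a bound occurrence.

```python
def free_vars(f):
    """Variables having at least one free occurrence in formula f."""
    tag = f[0]
    if tag == "pred":                     # ("pred", name, (arg, ...))
        return set(f[2])
    if tag == "not":                      # ("not", formula)
        return free_vars(f[1])
    if tag in ("and", "or", "imp"):       # (connective, left, right)
        return free_vars(f[1]) | free_vars(f[2])
    if tag in ("forall", "exists"):       # (quantifier, variable, scope)
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"unknown node: {tag}")


def bound_vars(f):
    """Variables having at least one bound occurrence in formula f."""
    tag = f[0]
    if tag == "pred":
        return set()
    if tag == "not":
        return bound_vars(f[1])
    if tag in ("and", "or", "imp"):
        return bound_vars(f[1]) | bound_vars(f[2])
    if tag in ("forall", "exists"):
        return bound_vars(f[2]) | {f[1]}
    raise ValueError(f"unknown node: {tag}")


P_xy = ("pred", "P", ("x", "y"))
Q_x = ("pred", "Q", ("x",))

formula_1 = ("imp", ("exists", "x", P_xy), Q_x)   # (∃x) P(x, y) → Q(x)
formula_2 = ("exists", "x", ("imp", P_xy, Q_x))   # (∃x) (P(x, y) → Q(x))

free_1, bound_1 = free_vars(formula_1), bound_vars(formula_1)
free_2, bound_2 = free_vars(formula_2), bound_vars(formula_2)
```

For formula_1 the variable x appears in both sets, matching the observation that a variable can be both free and bound; for formula_2 only y is free.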
Remarks: It may be noted that a bound variable x is just a place holder or a dummy
variable in the sense that all occurrences of a bound variable x may be replaced by
another variable, say y, which does not occur in the formula. However, once x is
replaced by y, then y becomes bound. For example, (∀x) (f (x)) is the same as (∀y) f (y).
It is something like

$$\int_1^2 x^2\,dx = \int_1^2 y^2\,dy = \frac{2^3}{3} - \frac{1^3}{3} = \frac{7}{3}$$

Replacing a bound variable x by another variable y under the restrictions mentioned
above is called Renaming of a variable x.
We may drop pairs of parentheses by agreeing that quantifiers have the least
scope. For example, (∃x) P(x, y) → Q(x) stands for
((∃x) P(x, y)) → Q(x)
Example
Translate the statement: Every man is mortal. Raman is a man. Therefore, Raman is
mortal.
(1) For every number, there is one and only one immediate successor,
(2) There is no number for which 0 is the immediate successor.
(3) For every number other than 0, there is one and only one immediate
predecessor.
Let the immediate successor and predecessor of x, respectively be denoted by f(x) and
g(x).
Let E (x, y) denote x is equal to y. Then the axioms of natural numbers are represented
respectively by the formulas:
(i) (∀x) (∃y) (E(y, f(x)) ∧ (∀z) (E(z, f(x)) → E(y, z)))
(ii) ~ ((∃x) E(0, f(x))) and
(iii) (∀x) (~ E(x, 0) → (∃y) (E(y, g(x)) ∧ (∀z) (E(z, g(x)) → E(y, z)))).
From the semantics (for meaning or interpretation) point of view, the wff of FOPL
may be divided into two categories, each consisting of
The wffs of FOPL in which there is no occurrence of a free variable are like wffs of
PL in the sense that we can call each of the wffs True, False, consistent,
inconsistent, valid, invalid etc. Each such formula is called a closed formula.
However, when a wff involves a free occurrence of a variable, it is not possible to call such a
wff True, False etc. Each such formula is
called an open formula.
For example: Each of the formulas: greater (x, y), greater (x, 3), (∀y) greater (x, y)
has one free occurrence of variable x. Hence, each is an open formula.
Each of the formulas: (∀x) (∃y) greater (x, y), (∀y) greater (y, 1), greater (9, 2), does
not have free occurrence of any variable. Therefore each of these formulas is a closed
formula.
The following equivalences hold for any two formulas P(x) and Q(x):
(i) (∀x) P(x) ∧ (∀x) Q(x) = (∀x) (P(x) ∧ Q(x))
(ii) (∃x) P(x) ∨ (∃x) Q(x) = (∃x) (P(x) ∨ Q(x))
But the following inequalities hold, in general:
(iii) (∀x) (P(x) ∨ Q(x)) ≠ (∀x) P(x) ∨ (∀x) Q(x)
(iv) (∃x) (P(x) ∧ Q(x)) ≠ (∃x) P(x) ∧ (∃x) Q(x)
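The four statements above can be explored by brute force over a small finite domain. The Python sketch below is an illustration only: the two-element DOMAIN and the helper names are assumptions made here. Agreement on one finite domain does not prove (i) and (ii) in general, but a single disagreement is enough to establish the inequalities (iii) and (iv).

```python
from itertools import product

DOMAIN = [0, 1]   # assumed two-element domain


def all_predicates(domain):
    """Yield every one-place predicate on the domain as a dict element -> bool."""
    for values in product([False, True], repeat=len(domain)):
        yield dict(zip(domain, values))


def same_for_all(lhs, rhs):
    """True if lhs and rhs agree for every choice of predicates P and Q."""
    return all(lhs(P, Q) == rhs(P, Q)
               for P in all_predicates(DOMAIN)
               for Q in all_predicates(DOMAIN))


eq_i = same_for_all(
    lambda P, Q: all(P[x] for x in DOMAIN) and all(Q[x] for x in DOMAIN),
    lambda P, Q: all(P[x] and Q[x] for x in DOMAIN))

eq_ii = same_for_all(
    lambda P, Q: any(P[x] for x in DOMAIN) or any(Q[x] for x in DOMAIN),
    lambda P, Q: any(P[x] or Q[x] for x in DOMAIN))

eq_iii = same_for_all(
    lambda P, Q: all(P[x] or Q[x] for x in DOMAIN),
    lambda P, Q: all(P[x] for x in DOMAIN) or all(Q[x] for x in DOMAIN))

eq_iv = same_for_all(
    lambda P, Q: any(P[x] and Q[x] for x in DOMAIN),
    lambda P, Q: any(P[x] for x in DOMAIN) and any(Q[x] for x in DOMAIN))
```

With P true only of 0 and Q true only of 1, the two sides of (iii) differ, and so do the two sides of (iv); hence eq_iii and eq_iv come out False while eq_i and eq_ii come out True.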
R = Q ∧ ~ Q = False
Hence, the proof.
(ii) Consider
(∀x) P(x) → (∃y) P(y)
Replacing ‘→’ we get
= ~ (∀x) P(x) ∨ (∃y) P(y)
= (∃x) ~ P(x) ∨ (∃y) P(y)
= (∃x) ~ P(x) ∨ (∃x) P(x) (renaming y as x in the second disjunct)
= (∃x) (~ P(x) ∨ P(x)) (since ∃ distributes over ∨)
The last formula states: there is at least one element, say b, for which ~ P(b) ∨ P(b) holds, i.e.,
for b, either P(b) is False or P(b) is True.
But, as P is a predicate symbol and b is a constant, ~ P(b) ∨ P(b) must be True. Hence,
the proof.
Ex. 1 Let P(x) and Q(x) represent “x is a rational number” and “x is a real number,”
respectively. Symbolize the following sentences:
Ex. 2 Let C(x) mean “x is a used-car dealer,” and H(x) mean “x is honest.” Translate
each of the following into English:
(i) (∃x)C(x)
(ii) (∃x) H(x)
(iii) (∀x) (C(x) → ~ H(x))
(iv) (∃x) (C(x) ∧ H(x))
(v) (∃x) (H(x) → C(x)).
In order to facilitate problem solving through PL, we discussed two normal forms, viz,
the conjunctive normal form CNF and the disjunctive normal form DNF. In FOPL,
there is a normal form called the prenex normal form. The use of a prenex normal
form of a formula simplifies the proof procedures, to be discussed.
Definition A formula G in FOPL is said to be in a prenex normal form if and only if
the formula G is in the form
(Q1x1)….(Qn xn) P
where each (Qixi), for i = 1, ….,n, is either (∀xi) or (∃xi), and P is a quantifier free
formula. The expression (Q1x1)….(Qn xn) is called the prefix and P is called the
matrix of the formula G.
In the rest of the discussion of FOPL, P[x] is used to denote the fact that x is a free
variable in the formula P, for example, P[x] = (∀y) P (x, y). Similarly, R [x, y]
denotes that variables x and y occur as free variables in the formula R. Some of the
following equivalences we have discussed earlier.
(iii) ~ (( ∀x ) P [ x ]) = (∃x ) ( ~ P [ x ] ).
(iv) ~ (( ∃x) P [ x ] ) = ( ∀x ) ( ~ P [ x ]).
(v) (∀x) P [x] ∧ (∀x) H [x] = (∀x) (P [x] ∧ H [x]).
(vi) (∃x) P [x] ∨ (∃x) H [x] = (∃x) (P [x] ∨ H [x]).
That is, the universal quantifier ∀ and the existential quantifier ∃ can be distributed
respectively over ∧ and ∨.
But we must be careful about (we have already mentioned these inequalities)
(vii) (∀x) P [x] ∨ (∀x) H [x] ≠ (∀x) (P [x] ∨ H [x]) and
(viii) (∃x ) P [x] ∧ (∃x) H [x] ≠ (∃x) (P [x] ∧ H [x])
Step 1 Remove the connectives ‘↔’ and ‘→’ using the equivalences
P ↔ G = (P → G) ∧ ( G → P)
P→ G = ~ P ∨ G
Step 3 Apply De Morgan’s laws in order to bring the negation signs immediately
before atoms.
~ (P ∨ G) = ~ P ∧ ~ G
~ (P ∧ G) = ~ P ∨ ~ G
Step 5 Bring quantifiers to the left before any predicate symbol appears in the
formula. This is achieved by using (i) to (vi) discussed above.
We have already discussed that, if all occurrences of a bound variable are replaced
uniformly throughout by another variable not occurring in the formula, then the
equivalence is preserved. Also, we mentioned under (vii) that ∀ does not distribute
over ∨ and under (viii) that ∃ does not distribute over ∧. In such cases, in order to
bring quantifiers to the left of the rest of the formula, we may first have to rename one
of the bound variables; say, x may be renamed as z, where z does not occur either as free or
bound in the other component formulas. And then we may use the following
equivalences.
Part (i)
Step 1: By removing ‘→’, we get
(∀x) (~ Q (x) ∨ (∃x) R (x, y))
Step 2: By renaming x as z in (∃x) R (x, y) the formula becomes
(∀x) (~ Q (x) ∨ (∃z) R (z, y))
Step 3: As ~ Q(x) does not involve z, we get
(∀x) (∃z) (~ Q (x) ∨ R (z, y))
Part (ii)
(∃x) (~ (∃y) Q (x, y) → ((∃z) R (z) → S (x)))
Step 1: Removing outer ‘→’ we get
(∃x) (~ (~ ((∃y) Q (x, y))) ∨ (( ∃z) R (z) → S (x)))
Step 2: Removing inner ‘→’ , and simplifying ~ (~ ( ) ) we get
(∃x) ((∃y) Q (x, y) ∨ (~ ( (∃z) R(z)) ∨ S (x)))
Step 3: Taking ‘~’ inner most, we get
(∃x) ((∃y) Q (x, y) ∨ ((∀z) ~ R(z) ∨ S(x)))
As the first component formula Q (x, y) does not involve z, S(x) does not involve
y or z, and ~ R(z) does not involve y, we may take out (∃y) and (∀z) so
that we get
(∃x) (∃y) (∀z) (Q (x, y) ∨ (~ R(z) ∨ S (x))), which is the required formula in prenex
normal form.
Part (iii)
(∀x) (∀y) ((∃z) Q (x, y, z) ∧ ((∃u) R (x, u) → (∃v) R (y, v)))
Step 1: Removing ‘→’, we get
(∀x) (∀y) ((∃z) Q (x, y, z) ∧ (~ ((∃u) R (x, u)) ∨ (∃v) R (y, v)))
Step 2: Taking ‘~’ inside, we get
(∀x) (∀y) ((∃z) Q (x, y, z) ∧ ((∀u) ~ R (x, u) ∨ (∃v) R (y, v)))
Step 3: As each of the variables z, u and v occurs only within the scope of its own
quantifier, we can take all quantifiers outside, preserving
the order of their occurrences. Thus we get
(∀x) (∀y) (∃z) (∀u) (∃v) (Q (x, y, z) ∧ (~ R (x, u) ∨ R (y, v)))
Ex: 4 (i) Transform the formula (∀x) P(x) → (∃x) Q(x) into prenex normal form.
A further refinement of Prenex Normal Form (PNF) called (Skolem) Standard Form,
is the basis of problem solving through Resolution Method. The Resolution Method
will be discussed in the next unit of the block.
The Standard Form of a formula of FOPL is obtained through the following three
steps:
(1) Convert the given formula to Prenex Normal Form (PNF).
(2) Convert the matrix of the PNF, i.e., the quantifier-free part of the PNF, into
conjunctive normal form.
(3) Skolemisation: Eliminate the existential quantifiers using skolem constants and
functions.
Skolem Function
We mentioned earlier that, in general, (∃x) (∀y) P(x,y) ≠ (∀y) (∃x) P(x,y) …….(1)
For example, if P(x,y) stands for the relation ‘x>y’ in the set of integers, then the
L.H.S. of the inequality (1) above states: some (fixed) integer (x) is greater than all
integers (y). This statement is False.
On the other hand, R.H.S. of the inequality (1) states: for each integer y, there is an
integer x so that x>y. This statement is True.
The difference in meaning of the two sides of the inequality arises because of the fact
that on L.H.S. x in (∃x) is independent of y in (∀y) whereas on R.H.S. x is dependent
on y. In other words, x on L.H.S. of the inequality can be replaced by some constant,
say ‘c’, whereas on the right hand side x is some function, say, f(y) of y.
Therefore, the two parts of the inequality (1) above may be written as
L.H.S. of (1) = (∃x) (∀y) P (x,y) = (∀y) P(c,y),
dropping (∃x) because there is no x appearing in (∀y) P(c,y), and
R.H.S. of (1) = (∀y) (∃x) P (x,y) = (∀y) P(f(y),y).
The above argument, in essence, explains what is meant by each of the terms viz.
skolem constant, skolem function and skolemisation.
The constants and functions which replace existential variables are respectively
called skolem constants and skolem functions. The process of replacing all
existential variables by skolem constants and functions is called skolemisation.
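The role of a skolem function as a witness that depends on y can be seen on a small finite example. The Python sketch below is illustrative only: instead of ‘x>y’ on the integers, it assumes a three-element domain and a hypothetical relation P(x, y), "x is the cyclic successor of y", chosen so that both quantifier orders can be evaluated exhaustively.

```python
DOMAIN = [0, 1, 2]   # assumed finite domain


def P(x, y):
    # Hypothetical relation: x is the cyclic successor of y.
    return x == (y + 1) % 3


# (∃x) (∀y) P(x, y): one fixed x must work for every y.
exists_forall = any(all(P(x, y) for y in DOMAIN) for x in DOMAIN)

# (∀y) (∃x) P(x, y): the witness x may change with y.
forall_exists = all(any(P(x, y) for x in DOMAIN) for y in DOMAIN)


def f(y):
    # Skolem function: returns, for each y, a witness x depending on y.
    return (y + 1) % 3


# (∀y) P(f(y), y): the existential quantifier has been replaced by f.
skolemised = all(P(f(y), y) for y in DOMAIN)
```

No single constant can serve as the witness here, so exists_forall is False, whereas forall_exists and its skolemised form are both True.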
We explain, through examples, the skolemisation process after PNF and CNF have
already been obtained.
(i) (∃x1) (∃x2) (∀y1) (∀y2)(∃x3)(∀y3) P(x1, x2, x3, y1, y2, y3)
(ii) (∃x1)(∀y1)(∃x2)(∀y2) (∃x3)P(x1, x2, x3, y1, y2 )∧(∃x1)(∀y3)( ∃x2) (∀y4)Q(x1, x2,
y3, y4)
Solution (ii) As a first step we must bring all the quantifications in the beginning of
the formula through Prenex Normal Form reduction. Also,
Then the existential variable x1 is independent of all the universal quantifiers. Hence,
x1 may be replaced by a constant, say ‘c’. Next, x2 is preceded by the universal
quantifier (∀y1); hence, x2 may be replaced by f (y1). The existential variable x3 is
preceded by the universal quantifiers (∀y1) and (∀y2). Hence x3 may be replaced by g
(y1, y2). The existential variable x5 is again preceded by the universal quantifiers (∀y1) and
(∀y2). In other words, x5 is also a function of y1 and y2. But we have to use a different
function symbol, say h, and replace x5 by h (y1, y2). Similarly x6 may be replaced by
j (y1, y2, y3).
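The rule used above, that each existential variable is replaced by a term depending on exactly the universal variables to its left, can be sketched in Python. This is a minimal illustration under assumptions: the prefix is given as a list of (quantifier, variable) pairs, and the generated names c1, c2, … and f1, f2, … are hypothetical fresh symbols, not the symbols c, f, g, h, j used in the text.

```python
from itertools import count


def skolemize_prefix(prefix):
    """Assign a skolem term to every existential variable of a prenex prefix.

    prefix: list of (quantifier, variable) pairs read left to right, with
            quantifier equal to "forall" or "exists".
    Returns (universal variables kept in the prefix, substitution dict).
    """
    universals = []
    substitution = {}
    const_ids, func_ids = count(1), count(1)
    for quantifier, var in prefix:
        if quantifier == "forall":
            universals.append(var)
        elif quantifier == "exists":
            if universals:
                # Depends on every universal variable to its left.
                args = ", ".join(universals)
                substitution[var] = f"f{next(func_ids)}({args})"
            else:
                # No universal quantifier to the left: a skolem constant.
                substitution[var] = f"c{next(const_ids)}"
        else:
            raise ValueError(f"unknown quantifier: {quantifier}")
    return universals, substitution


# Prefix of formula (i): (∃x1) (∃x2) (∀y1) (∀y2) (∃x3) (∀y3)
prefix_i = [("exists", "x1"), ("exists", "x2"), ("forall", "y1"),
            ("forall", "y2"), ("exists", "x3"), ("forall", "y3")]
kept, subst = skolemize_prefix(prefix_i)
```

For the prefix of formula (i), x1 and x2 receive distinct constants and x3 receives a two-place function of y1 and y2, while (∀y1), (∀y2) and (∀y3) remain in the prefix.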
We have developed tools of FOPL for solving problems requiring logical reasoning.
Now, we attempt to solve the problem mentioned in the introduction of the unit to show
the insufficiency of Propositional Logic.
Example: Every man is mortal. Raman is a man. Show that Raman is mortal. The
problem can be symbolized as:
(i) (∀x) (MAN(x) → MORTAL (x)).
(ii) MAN (Raman).
To show
(iii) MORTAL (Raman)
Solution:
Ex: 6 No used-car dealer buys a used car for his family. Some people who buy used
cars for their families are absolutely dishonest. Conclude that some absolutely
dishonest people are not used-car dealers.
Ex: 7 Some patients like all doctors. No patient likes any quack. Therefore, no doctor
is a quack.
The disjunctions (~ P(x, f(x)) ∨ R(x, f(x), g(x))) and (Q(x, g(x)) ∨ R(x, f(x), g(x))) of
the standard form
(~ P(x, f(x)) ∨ R(x, f(x), g(x))) ∧ (Q(x, g(x)) ∨ R(x, f(x), g(x))) …….(1)
are clauses. A set S of clauses is regarded as a conjunction of all clauses in S,
with the condition that every variable that occurs in S is considered governed by a
universal quantifier. By this convention, a standard form can be simply represented
by a set of clauses.
For example, the standard form (1) mentioned above can be
represented by the set
{~ P(x, f(x)) ∨ R(x, f(x), g(x)), Q(x, g(x)) ∨ R(x, f(x), g(x))}.
Example:
As the variables x, y and z do not occur anywhere else, except within their respective
scopes, the quantifiers may be taken to the beginning of the formula
without any changes. Hence, we get
(∀x) (∀y) (∃z) (P (x) ∧ ~ Q (y , z))
which is the required standard form.
Ex: 12 Conclude that some of the officials were drug pushers where we know the
following
(i) The customs officials searched everyone who entered this country who was
not a VIP.
(ii) Some of the drug pushers entered this country and they were only searched
by drug pushers.
(iii) No drug pusher was a VIP.
(iv) Some of the officials were drug pushers.
Ex: 13 From the given statement: Everyone who saves money earns interest, conclude
that if there is no interest, then nobody saves money.
1.6 SUMMARY
In this unit, initially, we discuss how the inadequacy of PL to solve even simple problems
requires some extension of PL, or some other formal inferencing system, so as to
compensate for the inadequacy. First Order Predicate Logic (FOPL) is such an
extension of PL that is discussed in the unit.
Next, the syntax, or proper structure, of a formula of FOPL is discussed. In this respect, a
number of new concepts including those of quantifier, variable, constant, term, free
and bound occurrences of variables; closed and open wff, consistency/validity of wffs
etc. are introduced.
Next, two normal forms viz. Prenex Normal Form (PNF) and Skolem Standard
Normal Form are introduced. Finally, tools and techniques developed in the unit are
used to solve problems involving logical reasoning.
1.7 SOLUTIONS/ANSWERS
Ex. 2
(i) There is (at least) one (person) who is a used-car dealer.
(ii) There is (at least) one (person) who is honest.
(iii) All used-car dealers are dishonest.
(iv) (At least) one used-car dealer is honest.
(v) There is at least one thing in the universe (for which it can be said that) if
that something is honest then that something is a used-car dealer.
Note: the above translation is not the same as: Someone honest is a used-car
dealer.
Ex: 4 (i) (∀x) P(x) → (∃x) Q(x) = ~ ((∀x) P(x)) ∨ (∃x) Q(x) (by removing the
connective→)
= (∃x) (~P(x)) ∨ (∃x) Q(x) (by taking ‘~’ inside)
Therefore, a prenex normal form of (∀x) P(x) → (∃x) Q(x) is (∃x) (~P(x) ∨ Q(x)).
(ii) (∀x) (∀y) ((∃z) (P(x, z) ∧ P(y, z)) → (∃u) Q (x, y, u))
= (∀x) (∀y) (~ ((∃z) (P(x, z) ∧ P(y, z)))
∨ (∃u) Q (x, y, u)) (removing the connective →)
= (∀x) (∀y) ((∀z) (~P(x, z) ∨ ~ P(y, z))
∨ (∃u) Q(x, y, u)) (using De Morgan’s Laws)
= (∀x) (∀y) (∀z) (∃u) (~P(x, z)
∨ ~ P(y, z) ∨ Q (x, y, u)) (as z and u do not occur in the rest of
the formula except their respective
scopes)
Therefore, we obtain the last formula as a prenex normal form of the first formula.
Ex 5 (i) In the given formula (∃x) is not preceded by any universal quantification.
Therefore, we replace the variable x by a (skolem) constant c in the formula and drop
(∃x).
Next, the existential quantifier (∃z) is preceded by two universal quantifiers viz., (∀v) and
(∀y). We replace the variable z in the formula by some function, say, f (v, y) and drop
(∃z). Finally, the existential quantifier (∃u) is preceded by three universal quantifiers, viz.,
(∀y), (∀v) and (∀w). Thus, we replace in the formula the variable u by some function
g(y, v, w) and drop the quantifier (∃u). Finally, we obtain the standard form for the
given formula as
(∀y) (∀v) (∀w) P(c, y, f(v, y), g(y, v, w), v, w)
Next, in the formula, there are two existential quantifiers, viz., (∃y) and (∃z). Each of
these is preceded by the only universal quantifier, viz. (∀x).
Thus, each variable y and z is replaced by a function of x. But the two functions of x
for y and z must be different functions. Let us assume, variable, y is replaced in the
formula by f(x) and the variable z is replaced by g(x). Thus the initially given formula,
after dropping of existential quantifiers is in the standard form:
(∀x) ((~ P(x, f(x)) ∨ R(x, f(x), g(x))) ∧ (Q(x, g(x)) ∨ R(x, f(x), g(x))))
Ex: 6 Let
(i) U(x), denote x is a used-car dealer,
(ii) B(x) denote x buys a used car for his family, and
(iii) D(x) denote x is absolutely dishonest,
Ex: 7
Let us use the notation for the predicates of the problem as follows:
P(x) : x is a patient,
D(x): x is a doctor,
Q(x): x is a quack,
L(x, y): x likes y.
By Universal Instantiation of (ii) with x as c, we get (vii) below. This is allowed because the fact in (ii) is true for all
values of x, and hence for the already considered value c also; this type of association of an
already used value c may not be allowed in Existential Instantiation.
(vii) P (c) → (∀y) (Q (y) → ~ L (c, y))
Using Modus Ponens with (v) and (vii) we get
(viii) (∀y) (Q(y) → ~ L (c, y))
As (∀y) is the quantifier appearing in both (vi) and (viii),
we can say that for an arbitrary a, we have
(ix) D (a) → L (c, a) for every a (from (vi)) and
(x) Q (a) → ~ L (c, a) for every a (from (viii))
Using the equivalence P → Q = ~ Q → ~ P, we get from (x):
(xi) L (c, a) → ~ Q (a)
Ex: 9 As a first step, the matrix is transformed into the following conjunctive normal
form:
(∀x) (∃y) (∃z) ((~P(x, y) ∨ R(x, y, z)) ∧ (Q(x, z) ∨ R(x, y, z))).
As the existential variables y and z are both preceded by (∀x), the variables y
and z are replaced by one-place functions f(x) and g(x) respectively.
The clauses which are obtained after reducing to standard form are:
When symbolized, the known facts become:
The negation of (ii) becomes (iii) ~ ((∃x) I (x) → (∀x) (∀y) (S(x, y) → ~M(y)))
UNIT 2 DEDUCTIVE INFERENCE RULES AND METHODS
Structure
2.0 Introduction
2.1 Objectives
2.2 Basic Inference Rules and Application in PL
2.3 Basic Inference Rules and Application in FOPL
2.4 Resolution Method in PL
2.5 Resolution Method in FOPL
2.6 Summary
2.7 Solutions/Answers
2.8 Further Readings
2.0 INTRODUCTION
In Section 2.2, we introduce eight inference rules for drawing valid conclusions in PL.
Next, in Section 2.3, we introduce four quantification rules, so that all the twelve
inference rules are used to validate conclusions in FOPL. The methods of drawing
valid conclusions discussed so far are cases of an approach of drawing valid
conclusions called the natural deduction approach of making inferences, in which the
reasoning system initiates the reasoning process from the axioms, uses inferencing rules
and, if the conclusion can be validly drawn, then ultimately reaches the intended
conclusion.
On the other hand, there is another approach, called the Refutation approach, of drawing
valid conclusions. According to this approach, the negation of the intended conclusion is
taken as an additional axiom. If the conclusion can be validly drawn from the axioms,
then through application of inference rules, a contradiction is encountered, i.e., two
formulas which are mutual negations are encountered during the process of making
inference.
The Resolution method is a single-rule refutation method. The Resolution method and its
applications for PL are discussed in Section 2.4. The Resolution Method and its
applications for FOPL are discussed in Section 2.5.
2.1 OBJECTIVES
In this section, we study a method which uses a number of rules of inference for
drawing valid conclusions, and later we study Resolution Method for establishing
validity of arguments.
We introduce eight rules of inference. Each of these rules has a specific name. In
order to familiarize ourselves with
(i) what a rule of inference is,
(ii) how a rule is represented, and
(iii) how a rule of inference helps us in solving problems.
Notation for M.P.:
P → Q, P
--------
Q
The rule states that if formulas P and P → Q (of either propositional logic or
predicate logic) are True then we can assume the Truth of Q.
The assumption is based on the fact that through truth-table method or otherwise we
can show that if P and P → Q , each is assigned truth value T then Q must have truth
value T.
P Q P→Q
T T T
T F F
F T T
F F T
From the above table, we can see that P and P → Q both are True only in the first
row and in the first row Q, the formula which is inferred, is also True.
The same is the reason for allowing use of other rules of inference in deducing
new facts.
P → Q, ~ Q
----------
~ P
The rule states that if P → Q is True but Q, the consequent of P → Q, is False, then the
antecedent P of P → Q is also False.
The validity of the rule may again be established through truth-table as follows:
P Q P→Q
T T T
T F F
F T T
F F T
In the above table P → Q is T and Q is False simultaneously only in the last row and
in this row P, the formula which is inferred, is False.
Note: The validity of the rest of the rules will not be established. However, it is
desirable that the students verify the validity of the other inference rules also through
Truth-Table or otherwise.
P → Q, Q → R
------------
P → R
The rule states that if we assume that both the formulas P → Q and Q → R are True
then we may assume P → R is also True.
(i)
P ∧ Q
-----
P
and (ii)
P ∧ Q
-----
Q
The rule says that if P ∧ Q is True then P can be assumed to be True ( and similarly
Q may be assumed to be True.)
Some of us may be surprised at the mention of the rule, thinking that if P∧ Q is True
then P must be True. The symbol ∧ is generally read as ‘and’. But the significance of
the rule is that ‘ ∧ ’ is merely a symbol and its meaning in the sense of ‘and’ comes
only through this rule of inference.
Rule 5 Conjunction (Conj.)
P, Q
-----
P ∧ Q
The rule states that if formulas P and Q are simultaneously True then the formula
P ∧ Q can be assumed to be True.
(i)
P ∨ Q, ~ P
----------
Q
and (ii)
P ∨ Q, ~ Q
----------
P
The two rules above state that if it is given that (a) P ∨ Q is True and (b) one of P or
Q is False, then the other must be True.
(i)
P
-----
P ∨ Q
and (ii)
Q
-----
P ∨ Q
The rules state that if one of P and Q is assumed to be True, then we can assume
P ∨ Q to be True.
P → Q, R → S, P ∨ R
-------------------
Q ∨ S
The rule states that if both the formulas P → Q and R → S are assumed to be True and
if P ∨ R, i.e., the disjunction of the antecedents, is assumed to be True, then we can assume the
Truth of Q ∨ S, which is the disjunction of the consequents.
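The truth-table argument used for the first two rules extends to all eight rules and can be carried out mechanically. The Python sketch below is one way to do so and is not taken from the text: the helper names and the dictionary RULES are assumptions, and the rule names used as keys are the standard textbook names. A rule is accepted if its conclusion is True in every row in which all of its premises are True.

```python
from itertools import product


def implies(a, b):
    """Truth value of a → b."""
    return (not a) or b


def is_valid(premises, conclusion, n_vars):
    """True if the conclusion holds in every truth-table row in which
    all premises hold."""
    for row in product([True, False], repeat=n_vars):
        if all(p(*row) for p in premises) and not conclusion(*row):
            return False
    return True


# Each entry: (list of premises, conclusion, number of propositional variables).
RULES = {
    "modus ponens": ([lambda p, q: implies(p, q), lambda p, q: p],
                     lambda p, q: q, 2),
    "modus tollens": ([lambda p, q: implies(p, q), lambda p, q: not q],
                      lambda p, q: not p, 2),
    "hypothetical syllogism": ([lambda p, q, r: implies(p, q),
                                lambda p, q, r: implies(q, r)],
                               lambda p, q, r: implies(p, r), 3),
    "simplification": ([lambda p, q: p and q], lambda p, q: p, 2),
    "conjunction": ([lambda p, q: p, lambda p, q: q],
                    lambda p, q: p and q, 2),
    "disjunctive syllogism": ([lambda p, q: p or q, lambda p, q: not p],
                              lambda p, q: q, 2),
    "addition": ([lambda p, q: p], lambda p, q: p or q, 2),
    "constructive dilemma": ([lambda p, q, r, s: implies(p, q),
                              lambda p, q, r, s: implies(r, s),
                              lambda p, q, r, s: p or r],
                             lambda p, q, r, s: q or s, 4),
}

results = {name: is_valid(*rule) for name, rule in RULES.items()}

# A tempting but invalid pattern, for contrast: from P → Q and Q, infer P.
affirming_the_consequent = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: q], lambda p, q: p, 2)
```

Every entry of results is True, whereas affirming_the_consequent is False because the row with P False and Q True satisfies both premises but not the conclusion.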
Example: Symbolize and construct a proof for the following valid argument using
rules of inference:
(i) If you smoke or drink too much then you do not sleep well, and if you do not sleep
well or do not eat well then you feel rotten, (ii) If you feel rotten, you do not exercise
well and do not study enough, (iii) You do smoke too much, therefore, (iv) You do
not study enough.
(ii) R → (~ X ∧ ~ T)
(iii) S
Through simplification of (i), i.e., by using the rule that infers P from P ∧ Q, we get
(v) S ∨ D → ~ W
Using Add on (iii), i.e., by using the rule that infers S ∨ D from S, we get
(vi) S ∨ D
(vii) ~ W
(viii) (~ W ∨ ~ E) → R
(ix) ~ W ∨ ~ E
(x) R
(xi) ~ X ∧ ~ T
(xii) ~ T
Example: Symbolize and construct a proof for the following valid argument: (i) If the
Bible is literally true then the Earth was created in six days, (ii) If the Earth was
created in six days then carbon dating techniques are useless and scientists are frauds,
(iii) Scientists are not frauds, (iv) The Bible is literally true, therefore, (v) God does
not exist.
Therefore the statements in the given arguments are symbolically represented as :
(i) B → E
(ii) E → C ∧ S
(iii) ~ S
(iv) B
(v) ~ G (to show)
Remarks: In the above deduction, (iii) and (viii) are contradicting each other. In
general, if in the process of derivation, we encounter two statements (like S and ~S)
which contradict each other, then we can deduce any statement even if the statement
can never be True in any sense. Thus, if both S and ~ S have already occurred in the
process of derivation, then we can assume the truth of any statement. For example, we
can assume the truth of the statement: ‘Moon is made of green cheese’
Thus, once we encounter S, where ~S has already occurred, use Addition rule to get
S ∨ NON-SENSE from S. Then use D.S. on S ∨ NON-SENSE and ~S, we get NON-
SENSE.
Ex.2 Using propositional logic, show that, if the following statements are
assumed to be true:
2.3 BASIC INFERENCING RULES AND APPLICATIONS IN FOPL
In the previous section, we discussed eight inferencing rules of Propositional Logic (PL)
and further discussed applications of these rules in exhibiting validity/invalidity of
arguments in PL. In this section, the earlier eight rules are extended to include four
more rules involving quantifiers for inferencing. Each of the new rules, is called a
Quantifier Rule. The extended set of 12 rules is then used for validating arguments in
First Order Predicate Logic (FOPL).
Before introducing and discussing the Quantifier rules, we briefly discuss why, at all,
these rules are required. For this purpose, let us recall the argument discussed earlier,
which Propositional Logic could not handle:
(∀x) P(x)
---------  (U.I.)
P(a)
The rule states that if (∀x) P(x) is True, then we can assume P(a) as True for any constant
a (where a constant a is like Raman). It can be easily seen that the rule associates a
formula P(a) of Propositional Logic to a formula (∀x) P(x) of FOPL. The
significance of the rule lies in the fact that once we obtain a formula like P(a), then
the reasoning process of Propositional Logic may be used. The rule may be used
whenever its application seems to be appropriate.
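How instantiating a universally quantified formula hands the problem over to propositional reasoning can be sketched for the argument about Raman. The Python code below is an illustration under assumptions: formulas are encoded as nested tuples (an encoding chosen here), and the function names substitute, universal_instantiation and modus_ponens are hypothetical helpers, not part of the text.

```python
def substitute(formula, var, const):
    """Replace every free occurrence of var in formula by const."""
    tag = formula[0]
    if tag == "pred":                     # ("pred", name, (arg, ...))
        return ("pred", formula[1],
                tuple(const if a == var else a for a in formula[2]))
    if tag == "not":
        return ("not", substitute(formula[1], var, const))
    if tag in ("and", "or", "imp"):
        return (tag,
                substitute(formula[1], var, const),
                substitute(formula[2], var, const))
    if tag in ("forall", "exists"):
        if formula[1] == var:             # var is bound again here: stop
            return formula
        return (tag, formula[1], substitute(formula[2], var, const))
    raise ValueError(f"unknown node: {tag}")


def universal_instantiation(formula, const):
    """From (∀x) P(x) infer P(const)."""
    if formula[0] != "forall":
        raise ValueError("the quantifier must govern the whole formula")
    _, var, body = formula
    return substitute(body, var, const)


def modus_ponens(implication, antecedent):
    """From P → Q and P infer Q."""
    if implication[0] == "imp" and implication[1] == antecedent:
        return implication[2]
    raise ValueError("modus ponens does not apply")


# (∀x) (MAN(x) → MORTAL(x)) and MAN(Raman)
premise_1 = ("forall", "x", ("imp", ("pred", "MAN", ("x",)),
                             ("pred", "MORTAL", ("x",))))
premise_2 = ("pred", "MAN", ("Raman",))

step = universal_instantiation(premise_1, "Raman")   # MAN(Raman) → MORTAL(Raman)
conclusion = modus_ponens(step, premise_2)           # MORTAL(Raman)
```

The check in universal_instantiation reflects a restriction on the rule: it applies only when the universal quantifier governs the whole formula, so a formula whose outermost symbol is a connective is rejected.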
The rule says that if it is known that for all constants a, the statement P(a) is True,
then we can, instead, use the formula (∀x) P(x).
The rule associates with a set of formulas P(a) for all a of Propositional Logic, a
formula (∀x) P(x) of FOPL.
Before using the rule, we must ensure that P(a) is True for all a;
otherwise it may lead to wrong conclusions.
(∃x) P(x)
---------  (E.I.)
P(a)
The rule says that if the Truth of (∃x) P(x) is known then we can assume the Truth of
P(a) for some fixed a. The rule, again, associates a formula P(a) of Propositional
Logic to a formula (∃x) P(x) of FOPL.
An inappropriate application of this rule may lead to wrong conclusions. The source
of possible errors lies in the fact that the choice ‘a’ in the rule is not arbitrary and
cannot be known at the time of deducing P(a) from (∃x) P(x).
P(a)
---------  (E.G.)
(∃x) P(x)
The rule states that if P(a), a formula of Propositional Logic, is True, then
(∃x) P(x), a formula of FOPL, may be assumed to be True.
The Universal Generalisation (U.G.) and Existential Instantiation (E.I.) rules should be
applied with utmost care; however, the other two rules may be applied whenever it
appears to be appropriate.
The purpose of the other Quantification rules, viz. those for generalisation, i.e.,
(ii)
P(a), for all a
---------------
(∀x) P(x)
and (iv)
P(a)
---------
(∃x) P(x)
is that the conclusion to be drawn in FOPL is not generally a formula of PL but a
formula of FOPL. However, while making inference, we may be first associating
formulas of PL with formulas of FOPL and then use inference rules of PL to conclude
formulas in PL. But the conclusion to be made in the problem may correspond to a
formula of FOPL. These two generalisation rules help us in associating formulas of
FOPL with formulas of PL.
Example: Tell, supported with reasons, which one of the following is a correct
inference and which one is not a correct inference.
(i) To conclude F(a) ∧ G(a) → H(a) ∧ I(a)
from (∀x) (F(x) ∧ G(x)) → H(x) ∧ I(x)
using Universal Instantiation (U.I.)
The above inference or conclusion is incorrect in view of the fact that the scope of
universal quantification is only the formula F(x) ∧ G(x) and not the whole of the
formula.
F(a) ∧ G(a) → H(x) ∧ I(x)
(iii) To conclude ~ F(a) for an arbitrary a, from ~ (∀x) F(x) using U.I.
Thus, the inference is not a case of U.I., but of Existential Instantiation (E.I.)
Further, as per restrictions, we cannot say for which a, ~ F(a) is True. Of course,
~ F(x) is true for some constant, but not necessarily for a pre-assigned constant a.
The reason being that the constant to be substituted for x cannot be assumed to be the
same constant b, being given in advance, as an argument of F. However,
to conclude (F(b) ∧ G(a)) → H(c)
from (∃x) ((F(b) ∧ G(x)) → H(c)) is correct.
Ex. 3: Tell for each of the following along with appropriate reasoning, whether it is a
case of correct/incorrect reasoning.
(i) To conclude
F (a) ∧ G (a) by applying E.I. to
(∃x) F(x) ∧ (∃x) G (x)
Steps for using Predicate Calculus as a Language for Representing Knowledge and
for Reasoning:
Step 1: Conceptualisation: First of all, all the relevant entities and the relations that
exist between these entities are explicitly enumerated. Some of the implicit facts like,
‘a person dead once is dead for ever’ have to be explicated.
Example: Symbolize the following and then construct a proof for the argument:
(i) Anyone who repairs his own car is highly skilled and saves a lot of money on
repairs
(ii) Some people who repair their own cars have menial jobs. Therefore,
(iii) Some people with menial jobs are highly skilled.
P(x) : x is a person
S(x) : x saves money on repairs
M(x) : x has a menial job
R(x) : x repairs his own car
H(x) : x is highly skilled.
From (ii) using Existential Instantiation (E.I), we get, for some fixed a
(iv) R(a) ∧ M(a)
Then by simplification rule of Propositional Logic, we get
(v) R(a)
From (i), using Universal Instantiation (U.I.), we get
(vi) R(a) → H(a) ∧ S(a)
Using modus ponens w.r.t. (v) and (vi) we get
(vii) H(a) ∧ S(a)
By specialisation of (vii) we get
(viii) H(a)
By specialisation of (iv) we get
(ix) M(a)
By conjunction of (viii) and (ix) we get
M(a) ∧ H(a)
By Existential Generalisation, we get
(∃x) (M(x) ∧ H(x))
Example:
(i) Some juveniles who commit minor offences are thrown into prison, and any
juvenile thrown into prison is exposed to all sorts of hardened criminals.
(ii) A juvenile who is exposed to all sorts of hardened criminals will become bitter
and learn more techniques for committing crimes.
(iii) Any individual who learns more techniques for committing crimes is a menace
to society, if he is bitter.
(iv) Therefore, some juveniles who commit minor offences will be menaces to the
society.
(viii) J(b)
(ix) C(b) and
(x) P(b)
then, L.H.S of (A) above states: For each x there is a y such that y>x.
The statement is true in the domain of real numbers.
On the other hand, R.H.S of (A) above states that: There is an integer y which is
greater than x, for all x.
2.4 RESOLUTION METHOD IN PL
Basically, there are two different approaches for proving a theorem or for making a
valid deduction from a given set of axioms:
i) natural deduction
ii) refutation method
In the natural deduction approach, one starts with the set of axioms, uses some
rules of inference and arrives at a conclusion. This approach closely resembles the
intuitive reasoning of human beings.
On the other hand, in a refutation method, one starts with the negation of the
conclusion to be drawn and derives a contradiction or FALSE. Because we derive a
contradiction after having assumed the conclusion to be false, the
assumption that the conclusion is wrong is itself wrong. Hence, the argument of
resolution method leads to the validity of the conclusion.
i) Truth-table construction
ii) Use of inference rules,
and follow, directly or indirectly, natural deduction approach.
In this section, we discuss how the resolution method is applied in solving problems
using only Propositional Logic (PL). The general resolution method for FOPL is
discussed in the next section.
The resolution method in PL is applied only after converting the given statements or
wffs into clausal form. A clausal form of a wff is obtained by first converting the
wff into its equivalent Conjunctive Normal Form (CNF). We already know that a
clause is a formula (only) of the form:
A1 ∨ A2 ∨……..∨ An ,
where Ai is either an atomic formula or negation of an atomic formula.
$$\frac{P,\; P \rightarrow Q}{Q} \quad \text{when written in the equivalent form} \quad \frac{P,\; \sim P \vee Q}{Q}$$
(replacing P → Q by ~ P ∨ Q).
This simple special case of the general resolution principle, to be discussed soon, states
that if the two formulas P and ~ P ∨ Q are given to be True, then we can conclude Q to
be True.
Example: Let C1 : Q ∨ R and C2: ~ Q ∨ S be two given clauses, so that, one of the
literals i.e., Q occurs in one of the clauses (in this case C1) and its negation (~ Q)
occurs in the other clause C2. Then application of resolution method in this case tells
us to take disjunction of the remaining parts of the given clause C1 and C2, i.e., to take
C3 : R ∨ S as deduction from C1 and C2. Then C3 is called a resolvent of C1 and C2.
The two literals Q and (~ Q) which occur in two different clauses are called
complementary literals.
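The resolution step in the example above can be sketched in Python as follows. This is only an illustrative sketch: representing a literal as a string with a leading '~' for negation, representing a clause as a frozenset of literals, and the helper names negate and resolvents are all assumptions made here, not notation from the text.

```python
def negate(literal):
    """Return the complementary literal: Q <-> ~Q."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def resolvents(c1, c2):
    """Return all resolvents of two clauses given as frozensets of literals."""
    result = set()
    for lit in c1:
        if negate(lit) in c2:
            # Drop the complementary pair and take the disjunction of the rest.
            result.add(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return result


C1 = frozenset({"Q", "R"})     # Q ∨ R
C2 = frozenset({"~Q", "S"})    # ~Q ∨ S
C3 = resolvents(C1, C2)        # contains the single resolvent R ∨ S
```

Resolving the unit clauses P and ~ P with the same function yields the empty clause, which plays the role of FALSE.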
In this case, the complementary pair of literals, viz. Q and ~ Q, occurs in the two clauses C1
and C2.
Hence, the resolution method states:
Conclude C3: ~ S ∨ R ∨ (~ P)
Note: We could have obtained the resolvent FALSE from only two clauses, viz., C2
and C3. Thus, out of the given four clauses, even the set of only the two clauses C2 and C3
is unsatisfiable. Also, a superset of any unsatisfiable set is unsatisfiable.
Example: Show that the set of clauses:
C1: R ∨ S
C2: ~ S ∨ W
C3: ~ R ∨ S
C4: ~ W is unsatisfiable.
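One way to check such a claim mechanically is to keep resolving pairs of clauses until either FALSE (the empty clause) appears or no new clause can be produced. The sketch below is an assumption-laden illustration rather than the procedure of the text: the string encoding of literals, the function names and the exhaustive "saturation" strategy are all choices made here for simplicity.

```python
from itertools import combinations


def negate(literal):
    """Return the complementary literal: S <-> ~S."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def resolvents(c1, c2):
    """All resolvents of two clauses (frozensets of literals)."""
    out = set()
    for lit in c1:
        if negate(lit) in c2:
            out.add(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out


def is_unsatisfiable(clauses):
    """Saturate under resolution; True as soon as the empty clause is derived."""
    known = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolvents(c1, c2):
                if not r:            # empty clause = FALSE
                    return True
                new.add(r)
        if new <= known:             # nothing new: FALSE cannot be derived
            return False
        known |= new


clauses = [
    frozenset({"R", "S"}),     # C1: R ∨ S
    frozenset({"~S", "W"}),    # C2: ~S ∨ W
    frozenset({"~R", "S"}),    # C3: ~R ∨ S
    frozenset({"~W"}),         # C4: ~W
]
unsat = is_unsatisfiable(clauses)
```

A human proof is shorter: C1 and C3 give S, S and C2 give W, and W and C4 give FALSE.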
After symbolizing the problem under consideration, add the negation of the wff which
represents conclusion, as an additional premise. From this enhanced set of
premises/axioms, derive FALSE or contradiction. If we are able to conclude FALSE,
then the conclusion, that was required to be drawn, is valid and problem is solved.
However, if, despite all efforts, we are not able to derive FALSE, then we cannot say
whether the conclusion is valid or invalid. Hence, the problem with the given axioms and
the conclusion is not solvable.
Let us now apply Resolution Method for the problems considered earlier.
Example: Suppose the stock prices go down if the interest rate goes up. Suppose also
that most people are unhappy when stock prices go down. Assume that the interest
rate goes up. Show that we can conclude that most people are unhappy.
(1′) A→ S
(2′) S→ U
(3′) A
(4′) U. (to conclude)
Then from (i) and (iii), through resolution, we get the clause
(v) S.
From (ii) and (iv), through resolution, we get the clause
(vi) ~ S
From (vi) and (v), through resolution we get,
(vii) FALSE
We might have observed from the above solution using resolution method, that clausal
conversion is a major time-consuming step after translation to wffs. Generally, once
the clausal form is obtained, proof, at least, by a human being can be easily visualised.
Ex. 5:Given that if the Parliament refuses to enact new laws, then the strike will not
be over unless it lasts more than one year and the president of the firm resigns, will
the strike not be over if the Parliament refuses to act and the strike just starts?
In the beginning of the previous section, we mentioned that resolution method for
FOPL requires discussion of a number of complex new concepts. Also, in Block 2, we
discussed (Skolem) Standard Form and also discussed how to obtain Standard Form
for a given formula of FOPL. In this section, we introduce two new, and again
complex, concepts, viz., substitution and unification.
The complexity of the resolution method for FOPL mainly results from the fact that a
clause in FOPL is generally of the form : P(x) ∨ Q ( f(x), x, y) ∨….., in which the
variables x, y, z, may assume any one of the values of their domain.
Thus, the atomic formula (∀x) P(x), which after dropping of universal quantifier, is
written as just P(x) stands for P(a1) ∧ P(a2)… ∧ P(an) where the set {a1 a2…, an} is
assumed here to be domain (x).
Example: Let us consider our old problem:
To conclude
(i) Raman is mortal
MORTAL (Raman)
from
In the above x varies over the set of human beings including Raman. Hence, one
special instance of (iv) becomes
(a) MAN(Raman) and MORTAL(Raman) do not contain any variables, and, hence,
their truth or falsity can be determined directly. Hence, each of them is like a formula of PL.
A term or formula which does not contain any variable is called a ground term or
ground formula.
(b) Treating MAN (Raman) as formula of PL and using resolution method on (v) and
(vi), we conclude
In order to make MAN (x) and MAN (Raman) identical, we observed that one of
the possible values of x is Raman. Hence, we replaced x by one of its
possible values: Raman.
This replacement of a variable like x, by a term (which may be another variable also)
which is one of the possible values of x, is called substitution. The substitution, in
this case is denoted formally as {Raman/x}
Substitution, in general, is notationally of the form {t1 / x1 , t2 / x2 , …, tm / xm } where
x1, x2, …, xm are variables and t1, t2, …, tm are terms, and ti replaces the variable xi in
some expression.
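Applying a substitution can be sketched in a few lines of Python. The encoding is an assumption made only for this illustration: a compound expression such as MAN(x) is written as the tuple ("MAN", "x"), variables and constants are plain strings, and a substitution {t1/x1, …, tm/xm} is a dictionary mapping each variable xi to its term ti. The function name substitute is likewise hypothetical.

```python
def substitute(expr, theta):
    """Apply substitution theta (a dict {variable: term}) to an expression."""
    if isinstance(expr, tuple):
        # expr is (symbol, arg1, arg2, ...): substitute inside every argument.
        return (expr[0],) + tuple(substitute(arg, theta) for arg in expr[1:])
    # A string that is a key of theta is a variable to be replaced.
    return theta.get(expr, expr)


theta = {"x": "Raman"}                       # the substitution {Raman/x}
result = substitute(("MAN", "x"), theta)     # MAN(x) becomes MAN(Raman)
```

Note that the replacement is carried out at every occurrence of the variable, including occurrences nested inside function terms such as f(x).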
Example: (i) Assume Lord Krishna is loved by everyone who loves someone (ii) Also
assume that no one loves nobody. Deduce Lord Krishna is loved by everyone.
LK : Lord Krishna
so that we get
Thus, to solve the problem, we have the following standard form formulas for
resolution:
The possibilities exist because for each possibility pair, the predicate Love occurs in
complemented form in the respective pair.
For this purpose we attempt to make the two formulas Love(x, f(x)) and Love (a, LK)
identical, through unification involving substitutions. We start from the left, matching
the two formulas, term by term. The first place where matching may fail is where ‘x’
occurs in one formula and ‘a’ occurs in the other formula. As one of these happens
to be a variable, the substitution {a/x} can be used to unify the portions so far.
Next, we attempt unification of (vi) Love (a, LK) with Love (x, LK) of (iv).
Then the first term-by-term possible disagreement occurs when the corresponding terms
are ‘a’ and ‘x’ respectively. As one of these is a variable, the substitution {a/x}
unifies the parts of the formulas so far. Next, the two occurrences of LK, one each in
the two formulas, match. Hence, the whole of each of the two formulas can be unified
through the substitution {a/x}. Though the unification has been attempted in
corresponding smaller parts, the substitution has to be carried out in the whole of the
formula, in this case in the whole of (iv). Thus, after substitution, (iv) becomes
(viii) ~ Love (a, y) ∨ Love (a, L K)
resolving (viii) with (vi) we get
In order to resolve (v) and (ix), we attempt to unify Love (x, f(x)) of (v) with Love (a,
y) of (ix). The term-by-term matching leads to possible disagreement of a of (ix) with
x of (v). As one of these is a variable, the substitution {a/x} will unify the
portions considered so far. Next, possible disagreement may occur with f (x) of (v)
and y of (ix). As one of these is a variable, viz. y, we can unify the two
terms through the substitution {f(x)/y}, which becomes {f(a)/y} once x has been
replaced by a. Thus, the complete substitution {a/x, f(a)/y}
is required to match the formulas.
Making the substitutions, we get
(v) becomes Love (a, f(a))
and (ix) becomes ~ Love (a, f(a))
Resolving these formulas we get False. Hence, the proof.
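The term-by-term matching used in this example can be sketched as a small unification routine. Everything about the encoding is an assumption made for illustration: expressions are tuples such as ("Love", "x", ("f", "x")), the names in VARIABLES are treated as variables and every other string (a, LK) as a constant, and the function names are hypothetical. The occurs check is a standard safeguard that is not discussed in the text.

```python
VARIABLES = {"u", "v", "w", "x", "y", "z"}   # assumed naming convention


def is_var(t):
    return isinstance(t, str) and t in VARIABLES


def walk(t, theta):
    """Apply theta repeatedly until no bound variable remains in t."""
    if is_var(t):
        return walk(theta[t], theta) if t in theta else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(walk(a, theta) for a in t[1:])
    return t


def occurs(var, t):
    """True if variable var appears anywhere inside term t."""
    if t == var:
        return True
    return isinstance(t, tuple) and any(occurs(var, a) for a in t[1:])


def unify(s, t, theta=None):
    """Return a substitution (dict) unifying s and t, or None if none exists."""
    theta = {} if theta is None else theta
    s, t = walk(s, theta), walk(t, theta)
    if s == t:
        return theta
    if is_var(s):
        return None if occurs(s, t) else {**theta, s: t}
    if is_var(t):
        return unify(t, s, theta)
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        for a, b in zip(s[1:], t[1:]):       # match argument by argument
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None                              # two different non-variables


def mgu(s, t):
    """Unify s and t and return the substitution with all terms fully expanded."""
    theta = unify(s, t)
    if theta is None:
        return None
    return {v: walk(term, theta) for v, term in theta.items()}


love = mgu(("Love", "x", ("f", "x")), ("Love", "a", "y"))
```

Here love maps x to a and y to f(a); the second binding is f(a) rather than f(x) because the earlier replacement of x by a is carried through the whole formula.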
Solution: As two literals with predicate Q occur and are mutually negated in (i) and
(ii), there is a possibility of resolution of ~ Q (x, z, x) from (i) with Q (w, h (v,
v), w) of (ii). We attempt to unify Q (x, z, x) and Q (w, h (v, v), w), if possible, by
finding an appropriate substitution. The first terms x and w of the two are variables,
hence unifiable with either of the substitutions {x/w} or {w/x}. Let us take {w/x}.
The next pair of terms from the two formulas, viz. z and h(v, v), are also unifiable,
because one of the terms is a variable, and the required substitution for unification is
{h(v, v)/z}.
The next pair of terms at corresponding positions is again {w, x}, for which we have
determined the substitution {w/x}. Thus, the substitution {w/x, h(v, v)/z} unifies the
two formulas. Using the substitutions, (i) and (ii) become respectively
(iii) ~ Q (w, h (v, v), w) ∨ Q (w, h (v, v), w)
(iv) Q (w, h (v, v), w)
Resolving, we get
Q (w, h (v, v), w),
which is the required resolvent.
2.6 SUMMARY
In this unit, eight basic rules of inference for PL and four rules involving quantifiers
for inferencing in FOPL are introduced in Section 2.2 and Section 2.3 respectively,
and then these rules are used in solving problems. Further, a new method of drawing
inferences, called the Resolution method and based on the refutation approach, is discussed in the
next two Sections. In Section 2.4, the Resolution method for PL is introduced and applied
in solving problems involving PL reasoning. In Section 2.5, the Resolution method for
FOPL is introduced and used for solving problems involving FOPL reasoning.
Problems with FOPL as a system of knowledge representation and reasoning:
FOPL is not capable of easily representing some kinds of information including
information pieces involving.
A statement such as ‘a relation which is symmetric and transitive may not be reflexive’ is not expressible
in FOPL. A relation in FOPL can only be constant, and not a variable. Only in second
and higher order logics may the relations be variable. These types of logic are not
within the scope of the course.
2.7 SOLUTIONS/ANSWERS
Ex. 1: Assuming the statements (i), (ii) and (iii) given above to be True, we are required to
show the truth of (iv).
The first step is to mark the logical operators, if any, in the statements of the
argument/problem under consideration.
In the above-mentioned problem, statement (i) does not contain any logical operator.
Each of the statements (ii) and (iii) contains the logical operator ‘If….then….’
The next step is to use symbols, P, Q, R, for atomic formulas occurring in the
problem. The symbols are generally mnemonic, i.e., names used to help memory.
Let us denote the atomic statements in the argument given above as follows:
Then the given statements in English, become respectively the following formulas of
PL:
(i) M
(ii) TG → GU
(iii) GU → ~ M
(iv) ~ TG (To show)
(v) M → ~ GU
(vi) ~ GU
(vii) ~ TG
The formula (vii) is the same as formula (iv) which was required to be proved.
Using these symbols, the Statements (i) to (iv) become the formulas (i) to (iv) of PL as
given below:
(i) ML
(ii) ML → SG
(iii) SG → TG and
(iv) TG
Applying Modus Ponens to formulae (i) and (ii), we get the formula
(v) SG
Applying Modus Ponens to formulae (iii) and (v), we get the formula
(vi) TG
But formula (vi) is the same as (iv), which is required to be established. Hence the
proof.
Ex. 3: (i) Concluding F (a) ∧ G (a) from (∃x) F ( x) ∧ (∃x) G ( x) is incorrect, because,
as mentioned earlier also, the given Quantified Formula may be equivalently written
as (∃x) F ( x) ∧ (∃y ) G ( y ) . And in the case of each existential quantification, we can
not assign an already-used constant. Therefore, a correct conclusion may be of the
form
F (a) ∧ G (b)
Then, the statement (i), (ii) and (iii) can be equivalently expressed as formulas of
FOPL
Ex. 5: Let us symbolize the statements in the problem given above as follows:
A: The Parliament refuses to act.
B: The strike is over.
R: The president of the firm resigns.
S: The strike lasts more than one year.
Then the facts and the question to be answered can be symbolized as:
E1: (A → (~ B ∨ (R ∧ S))) represents the statement: If the Parliament refuses to enact
new laws, then the strike will not be over unless it lasts more than one year and the
president of the firm resigns.
(Note: P unless Q = P ∨ Q)
E2: A represents the statement: The Parliament refuses to act, and
We get the axioms, including the negation of the conclusion, in the clausal form
as
E11: (~ A ∨ ~ B ∨ R)
E12: (~ A ∨ ~ B ∨ S)
E2: A
E3: ~ S
E5: ~ (~ B) = B
By resolving E2 with E12, we get the resolvent
E6: ~ B ∨ S
By resolving E5 with E6, we get the resolvent as
E7 : S
By resolving E7 with E3, we get the resolvent as
E8 : FALSE
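The refutation above can be cross-checked by the truth-table approach mentioned earlier in the unit: if no assignment of truth values makes E1, E2 and E3 true while B is also true, then ~ B follows from the premises. The sketch below enumerates all sixteen assignments; the helper names implies and entails_not_b are hypothetical and introduced only for this check.

```python
from itertools import product


def implies(p, q):
    """Material implication: p → q is the same as ~p ∨ q."""
    return (not p) or q


def entails_not_b():
    """True if every model of E1, E2 and E3 makes B false."""
    for a, b, r, s in product([True, False], repeat=4):
        e1 = implies(a, (not b) or (r and s))   # E1: A → (~B ∨ (R ∧ S))
        e2 = a                                  # E2: A
        e3 = not s                              # E3: ~S
        if e1 and e2 and e3 and b:
            return False    # counter-model: premises true, yet strike is over
    return True


answer = entails_not_b()
```

Both methods agree: under the given premises the strike will not be over.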
In the similar manner (i) and (iii) are not unifiable as the second terms f (y, z) and g (h
(k (u))) are such that none is a variable.
Ex. 7: The predicate symbols (each being Q) match. Hence, we may proceed. Next,
the first two terms viz. f (a) and x, are not identical. However, as one of these terms is
a variable viz. ‘x’, hence, the corresponding terms are unifiable with substitution
{f (a)/x}.
Next, the two terms g (x) and y, one from each of the formula at corresponding
positions, are again unifiable by the substitution { g(x)/y}.
Hence, the required substitution is {f (a)/x, g (f (a))/y}, where the substitution {f (a)/x}
is applied in g (x)/y to get the substitution {g (f (a))/y}.
Therefore the two formulas are unifiable and after unification the formulas become
Q (f(a), g (f (a)))
2.8 FURTHER READINGS
Knowledge Representations
UNIT 3 SYSTEMS FOR IMPRECISE/INCOMPLETE KNOWLEDGE
Structure
3.0 Introduction
3.1 Objectives
3.2 Fuzzy Systems
3.3 Relations on Fuzzy Sets
3.4 Operations on Fuzzy Sets
3.5 Operations Unique to Fuzzy Sets
3.6 Non-Monotonic Reasoning Systems
3.7 Default Reasoning Systems
3.8 Closed World Assumption Systems
3.9 Other Non-Deductive Systems
3.10 Summary
3.11 Solutions/Answers
3.12 Further Readings
3.0 INTRODUCTION
In the earlier units of the block, we discussed PL and FOPL systems for making
inferences and solving problems requiring logical reasoning. However, these systems
assume that the domain of the problems under consideration is complete, precise and
consistent. But, in the real world, the knowledge of the problem domains is generally
neither precise nor consistent and is hardly complete.
In this unit, we discuss a number of techniques and formal systems that attempt to
handle some of these blemishes. To begin with, in Sections 3.2 to 3.5, we discuss
fuzzy systems that attempt to handle imprecision in knowledge bases, especially that due
to the use of natural language words like hot, good, tall etc.
In Sections 3.7 and 3.8, we discuss two formal systems that attempt to handle
incompleteness of the available information. These systems are called Default
Reasoning Systems and Closed World Assumption Systems. Finally, we discuss
some inference rules, viz., the abductive inference rule and the inductive inference rule, that,
though not deductive, are quite useful in solving problems arising out of
everyday experience.
3.1 OBJECTIVES
In symbolic logic systems like PL and FOPL, which we have studied so far, any
(closed) formula has a truth-value which must be binary, viz., True or False.
However, in our everyday experience, we encounter problems whose descriptions
involve words because of which it is not possible to assign a truth value, True or
False, to the statements describing the situation. For example, consider the statement:
If the water is too hot, add normal water to make it comfortable for taking a bath.
In the above statement, for a number of words/phrases including ‘too hot’, ‘add’,
‘comfortable’ etc., it is not possible to tell when exactly water is too hot, when water
is (at) normal (temperature), when exactly water is comfortable for taking a bath.
For example, we cannot tell the temperature T such that for water at temperature T or
less, truth value False can be associated with the statement ‘Water is too hot’ and at
the same time truth-value True can be associated with the same statement ‘Water is
too hot’ when the temperature of the water is, say, at degree T + 1, T + 2, etc.
Healthy Person: we cannot even enumerate all the parameters that determine health.
Further, it is even more difficult to tell for what value of a particular parameter, one is
healthy or otherwise.
Old/young person: It is not possible to tell exactly up to what age one is
young and that, by the addition of just one day to the age, one becomes old. We age gradually.
Aging is a continuous process.
Sweet Milk: Add small sugar cubes, one at a time, to a glass of milk, and go on adding
up to, say, 100 small cubes.
Initially, without sugar, we may take milk as not sweet. However, with the addition of
each small sugar cube, the sweetness gradually increases. It is not
possible to say that after the addition of 100 small cubes of sugar, the milk becomes
sweet, and, till the addition of 99 small cubes, it was not sweet.
Pool, Pond, Lake,….., Sea, Ocean: for different sized water bodies, we can not say
when exactly a pool becomes a pond, when exactly a pond becomes a lake and so on.
One of the reasons for this type of problem, i.e., our inability to associate one of the
two truth values with statements describing everyday situations, is the use of
natural language words like hot, good, beautiful etc. Each of these words does not
denote something constant, but is a sort of linguistic variable. The context of a
particular usage of such a word may delimit the scope of the word as a linguistic
variable. The range of values, in some cases, for some phrases or words, may be very
large as can be seen through the following three statements:
• Dinosaurs ruled the earth for a long period (about millions of years)
• It has not rained for a long period (say about six months).
• I had to wait for the doctor for a long period (about six hours).
Fuzzy theory provides means to handle such situations. A Fuzzy theory may be
thought of as a technique of providing ‘continuization’ to the otherwise binary
disciplines like Set Theory, PL and FOPL.
Further, we explain how, using fuzzy concepts and rules in situations like the ones
quoted below, we, the human beings, solve problems despite ambiguity in language.
(i) Knowing (exactly) the distances of various vehicles from the path to be
followed to cross over.
(ii) Knowing the velocities and accelerations of the various vehicles moving on the
road within a distance of, say, one kilometer.
(iii) Using Newton’s Laws of motion and their derivatives like $s = ut + \frac{1}{2}at^2$, and
calculating the time that would be taken by each of the various vehicles to reach
the path intended to be followed to cross over.
(iv) Adjusting dynamically our speeds on the path so that no collision takes place
with any of the vehicle moving on the road.
But we know that human beings not only do not follow the above precise method but
cannot follow it. We, the human beings, feel more
comfortable with fuzziness than with precision. We feel comfortable if the instruction
for crossing a road is given as follows:
Look on both your left hand and right hand sides, particularly in the beginning, to
your right hand side. If there is no vehicle within reasonable distance, then attempt to
cross the road. You may have to retreat back while crossing, from somewhere on the
road. Then, try again.
The above instruction has a number of words like left, right (it may be 45° to the right or
90° to the right) and reasonable, each of which does not have a definite meaning. But we
feel more comfortable with it than with the earlier instruction involving precise terms.
Let us consider another example of our being more comfortable with imprecision than
with precision. The statement: ‘The sky is densely clouded’ is more comprehensible to
human beings than the statement: ‘The cloud cover of the sky is 93.5 %’.
This is because of the fact that we, the human beings, are still better than computers
in qualitative reasoning. Because of better qualitative reasoning capabilities
• just by looking at the eyes only and/or nose only, we may recognize a person.
• just by taking and feeling a small number of grains from a cooking rice bowl, we
can tell whether the rice is properly cooked or not.
• just by looking at few buildings, we can identify a locality or a city.
We know that for any problem, the plan of the proposed solution and the relevant
information is fed in the computer in a form acceptable to the computer.
However, the problems to be solved with the help of computers are, in the first place,
felt by the human beings. And then, the plan of the solution is also prepared by human
beings.
It is conveyed to the computer mainly for execution, because computers have much
better executional speed.
(i) We, the human beings, sense problems, desire the problems to be solved and
express the problems and the plan of a solution using imprecise words of a natural
language.
(ii) We use computers to solve the problems, because of their executional power.
(iii) Computers function better, when the information is given to the computer in
terms of mathematical entities like numbers, sets, relations, functions, vectors,
matrices, graphs, arrays, trees, records, etc., and when the steps of solution are
generally precise, involving no ambiguity.
(i) Imprecision of natural language, with which the human beings are comfortable,
where human beings feel a problem and plan its solution.
(ii) Precision of a formal system, with which computers operate efficiently, where
computers execute the solution, generally planned by human beings
a new formal system viz. Fuzzy system based on the concept of ‘Fuzzy’ was
suggested for the first time in 1965 by L. Zadeh.
In order to initiate the study of Fuzzy systems, we quote two statements to recall the
difference between a precise statement and an imprecise statement.
A precise Statement is of the form: ‘If income is more than 2.5 lakhs then tax is 10%
of the taxable income’.
An imprecise statement may be of the form: ‘If the forecast about the rain being
slightly less than previous year is believed, then there is around 30% probability that
economy may suffer heavily’.
Next, we explain, how the fuzzy sets are defined, using mathematical entities, to
capture imprecise concepts, through an example of the concept : tall.
Next step is to model ‘definitely tall’ ‘not at all tall’, ‘little bit tall’, ‘slightly tall’
‘reasonably Tall’ etc. in terms of mathematical entities, e.g., numbers; sets etc.
In modelling a vague concept like ‘tall’ through fuzzy sets, the numbers in the
closed interval [0, 1] of reals may be used on the following lines:
(iii) ‘A little bit tall’ may be represented as ‘tallness having value say .2’.
(iv) ‘Slightly tall’ may be represented as ‘tallness having value say .4’.
(v) ‘Reasonably tall’ may be represented as ‘tallness having value say .7’.
and so on.
Similarly, the values of other concepts or, rather, other linguistic variables like
sweet, good, beautiful, etc. may be considered in terms of real numbers between
0 and 1.
Coming back to the imprecise concept of tall, let us think of five male persons of an
organisation, viz., Mohan, Sohan, John, Abdul, Abrahm, with heights 5' 2”, 6' 4”,
5' 9”, 4' 8”, 5' 6” respectively.
Then had we talked only of crisp set of tall persons, we would have denoted the
But a fuzzy set representing tall persons includes all the persons along with their
respective degrees of tallness. Thus, in terms of fuzzy sets, we write:
3.3 RELATIONS ON FUZZY SETS
In the case of Crisp sets, we have the concepts of Equality of sets, Subset of a set, and
Member of a set, as illustrated by the following examples:
In order to define for fuzzy sets, the concepts corresponding to the concepts of
Equality of Sets, Subset and Membership of a Set considered so far only for crisp sets,
first we illustrate the concepts through an example:
Note: In every fuzzy set, all the elements of X with their corresponding memberships
values from 0 to 1, appear.
For (ii) Equality of Fuzzy sets: Let A, B and C be fuzzy sets defined on X as
follows:
Let A = {Mohan/.2; Sohan/1; John/.7; Abrahm/.4}
B = {Abrahm/.4, Mohan/.2; Sohan/1; John/.7}.
Then, as the degrees of each element in the two sets are equal, we say fuzzy set A equals
fuzzy set B, denoted as A = B.
However, if C = {Abrahm/.2, Mohan/.4; Sohan/1; John/.7}, then
A ≠ C.
(iii) Subset/Superset
Intuitively, we know
(i) The Set of ‘Very Tall’ people should be a subset of the set of Tall
people.
(ii) If the degree of ‘tallness’ of a person is, say, .5, then the degree of ‘Very
Tallness’ for the person should be less, say .3.
then, in view of the fact that for each element, degree in A is greater than or equal to
degree in B, B is a subset of A denoted as B ⊂ A.
However, degree (Mohan) = .3 in C and degree (Mohan) = .2 in A;
therefore, C is not a subset of A.
On the other hand, degree (John) = .5 in C and degree (John) = .7 in A;
therefore, A is also not a subset of C.
such that 0 ≤ vi, wi ≤ 1.
Then fuzzy set A equals fuzzy set B, denoted as A = B, if and only if
vi = wi for all i = 1, 2, …, n.
Further, B is a subset of A, denoted as B ⊂ A, if and only if wi ≤ vi for all i.
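The equality and subset tests of this section can be sketched in Python by storing a fuzzy set as a dictionary from element to degree. The function names fuzzy_equal and fuzzy_subset are hypothetical, and treating an element that is not listed as having degree 0 is an assumption of this sketch. The sets A, B and C are the ones used in the equality example above; VERY_TALL is invented here purely to show a subset.

```python
def fuzzy_equal(a, b):
    """A = B when every element has the same degree in both sets."""
    universe = set(a) | set(b)
    return all(a.get(x, 0) == b.get(x, 0) for x in universe)


def fuzzy_subset(b, a):
    """B ⊂ A when degree in B <= degree in A for every element."""
    universe = set(a) | set(b)
    return all(b.get(x, 0) <= a.get(x, 0) for x in universe)


A = {"Mohan": .2, "Sohan": 1, "John": .7, "Abrahm": .4}
B = {"Abrahm": .4, "Mohan": .2, "Sohan": 1, "John": .7}
C = {"Abrahm": .2, "Mohan": .4, "Sohan": 1, "John": .7}

# Hypothetical degrees, chosen only so that each is below the degree in A.
VERY_TALL = {"Mohan": .1, "Sohan": 1, "John": .5, "Abrahm": .2}
```

With these values, A equals B (the order of listing does not matter), A differs from C, and neither of A and C is a subset of the other.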
However, conversely, a fuzzy set may not be written as a crisp set. Let C be a fuzzy
set denoting Educated People, where degree of education is defined as follows:
degree of education (Ph.D. holders) = 1
degree of education (Masters degree holders) = 0.85
degree of education (Bachelors degree holders) = .6
degree of education (10 + 2 level) = 0.4
degree of education (8th Standard) = 0.1
degree of education (less than 8th) = 0.
Let C = {Mohan/.85; Sohan/.4; John/.6; Abdul/1; Abrahm/0}.
Definition: Support set of a Fuzzy Set, say C, is a crisp set, say D, containing all the
elements of the universe X for which degree of membership in Fuzzy set is positive.
Let us consider again
Definition: Fuzzy Singleton is a fuzzy set in which there is exactly one element
which has positive membership value.
Example:
Let us define a fuzzy set OLD on universal set X in which degree of OLD is zero if a
person in X is below 20 years and Degree of Old is .2 if a person is between 20 and 25
years and further suppose that
Old = C = {Mohan/0; Sohan/0; John/.2; Abdul/0; Abrahm/0},
then support of old = {John} and hence old is a fuzzy singleton.
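The two definitions above translate directly into code. In this sketch a fuzzy set is a dictionary from element to degree, and the function names support and is_fuzzy_singleton are hypothetical; the data are the sets Old and C (Educated People) given in the text.

```python
def support(fuzzy_set):
    """Crisp set of all elements whose degree of membership is positive."""
    return {x for x, degree in fuzzy_set.items() if degree > 0}


def is_fuzzy_singleton(fuzzy_set):
    """True when exactly one element has a positive membership value."""
    return len(support(fuzzy_set)) == 1


OLD = {"Mohan": 0, "Sohan": 0, "John": .2, "Abdul": 0, "Abrahm": 0}
EDUCATED = {"Mohan": .85, "Sohan": .4, "John": .6, "Abdul": 1, "Abrahm": 0}

old_support = support(OLD)
educated_support = support(EDUCATED)
```

The support of Old contains only John, so Old is a fuzzy singleton, whereas the support of the Educated People set contains four persons.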
Ex. 1: Discuss equality and subset relationship for the following fuzzy sets defined on
the Universal set X = { a, b , c, d, e}
A = { a/.3, b/.6, c/.4 d/0, e/.7}
B = {a/.4, b/.8, c/.9, d/.4, e/.7}
C = {a/.3, b/.7, c/.3, d/.2, e/.6}
The concepts of Union, intersection and complementation for crisp sets may be
extended to FUZZY sets after observing that for crisp sets A and B, we have
(i) A ∪ B is the smallest subset of X containing both A and B.
(ii) A ∩ B is the largest subset of X contained in both A and B.
(iii) The complement A' is such that
(2) While taking union of Crisp sets, members of both sets are included, and none
else. However, in each Fuzzy set, all members of the universal set occur but their
degrees determine the level of membership in the fuzzy set.
The Union of two fuzzy sets A and B, is the set C with the same universe as that of A
and B such that, the degree of an element of C is equal to the MAXIMUM of degrees
of the element, in the two fuzzy sets.
(if Universe A ≠ Universe B, then take Universe C as the union of the universe A and
universe B)
The Intersection C of two fuzzy sets A and B is the fuzzy set in which, the degree
of an element of C is equal to the MINIMUM of degrees in the two fuzzy sets.
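The MAXIMUM and MINIMUM rules can be sketched as follows, with a fuzzy set stored as a dictionary from element to degree. The function names are hypothetical, and giving degree 0 to an element missing from one of the sets is an assumption made so that differing universes can be merged. The data are the sets A and B of Ex. 1.

```python
def fuzzy_union(a, b):
    """Degree in A ∪ B is the maximum of the degrees in A and in B."""
    universe = set(a) | set(b)
    return {x: max(a.get(x, 0), b.get(x, 0)) for x in universe}


def fuzzy_intersection(a, b):
    """Degree in A ∩ B is the minimum of the degrees in A and in B."""
    universe = set(a) | set(b)
    return {x: min(a.get(x, 0), b.get(x, 0)) for x in universe}


A = {"a": .3, "b": .6, "c": .4, "d": 0, "e": .7}
B = {"a": .4, "b": .8, "c": .9, "d": .4, "e": .7}

union_ab = fuzzy_union(A, B)
intersection_ab = fuzzy_intersection(A, B)
```

Because every degree in A is at most the corresponding degree in B, the union here coincides with B and the intersection with A.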
Example:
Then
Commutativity
(i) A ∪ B = B ∪ A
(ii) A ∩ B = B ∩ A
We prove only (i) above just to explain, how the involved equalities, may be proved in
general.
Let U = {x1, x2, …, xn} be the universe for fuzzy sets A and B.
If y ∈ A ∪ B, then y is of the form {xi/di} for some i
y ∈ A ∪ B ⇒ y = {xi/ei} as member of A and
y = {xi/fi} as member of B and
di = max {ei, fi} = max {fi, ei}
⇒ y ∈ B ∪ A.
DeMorgan’s Laws
(A')' = A
Idempotence
A∩A=A
A∪A=A
Identity
A ∪ U = U        A ∩ U = A
A ∪ φ = A        φ ∩ A = φ
where
φ : empty fuzzy set = {x/0 with x∈U}
and
U: universe = {x/1 with x∈U}
Next, we discuss three operations, viz., concentration, dilation and normalization, that
are relevant only to fuzzy sets and can not be discussed for (crisp) sets.
In respect of concentration, it may be noted that the associated values being between 0
and 1, on squaring, become smaller. In other words, the values concentrate towards
zero. This fact may be used for giving increased emphasis on a concept. If Brightness
of articles is being discussed, then Very bright may be obtained in terms of
CON. (Bright).
Example:
The associated values, that are between 0 and 1, on taking square-root get increased,
e.g., if the value associated with x was .01 before dilation, then the value associated
with x after dilation becomes .1, i.e., ten times of the original value.
This fact may be used for decreased emphasis. For example, if colour say ‘yellow’ has
been considered already, then ‘light yellow’ may be considered in terms of already
discussed ‘yellow’ through Dilation.
$$\mathrm{NORM}(A) = \left\{ x \Big/ \left( \frac{m_A(x)}{\mathrm{Max}} \right) \;\middle|\; x \in U \right\}.$$
NORM (A) is a fuzzy set in which membership values are obtained by dividing
values of the membership function of A by the maximum membership value.
The resulting fuzzy set, called the normal (or normalized) fuzzy set, has the
maximum membership function value of 1.
Example:
Norm (A) = {Mohan/(.5 ÷ .9 = .55…); Sohan/1; John/(.7 ÷ .9 = .77…); Abdul/0;
Abrahm/(.2 ÷ .9 = .22…)}
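The three operations of this section can be sketched together in Python: concentration squares each degree, dilation takes its square root, and normalization divides by the largest degree. The function names are hypothetical. The set A below is read off from the divisions shown in the normalization example (each degree is divided by .9, the degree of Sohan), and the sketch assumes that at least one degree is positive.

```python
import math


def concentration(a):
    """CON(A): square every membership value (values move towards 0)."""
    return {x: d ** 2 for x, d in a.items()}


def dilation(a):
    """DIL(A): take the square root of every membership value."""
    return {x: math.sqrt(d) for x, d in a.items()}


def normalization(a):
    """NORM(A): divide every membership value by the maximum value."""
    peak = max(a.values())          # assumed to be positive
    return {x: d / peak for x, d in a.items()}


A = {"Mohan": .5, "Sohan": .9, "John": .7, "Abdul": 0, "Abrahm": .2}
norm_a = normalization(A)
```

After normalization the largest degree is exactly 1, and a degree of .01 becomes .1 under dilation, as stated in the text.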
We know from our earlier background in Mathematics that a relation from a set A to a
set B is a subset of A x B.
For example, the relation of father may be written as {(Dasrath, Ram), …}, which is
a subset of A × B, where A and B are sets of persons living or dead.
Fuzzy Relation
In fuzzy sets, every element of the universal set occurs with some degree of
membership. A fuzzy relation may be defined in different ways. One way of
defining fuzzy relation is to assume the underlying sets as crisp sets. We will discuss
only this case.
For example:
Now suppose
Ram is UNCLE of Mohan with degree 1, Majid is UNCLE of Abdul with degree .7
and Peter is UNCLE of John with degree .7. Ram is UNCLE of John with degree .4.
Then the relation of UNCLE can be written as a set of ordered-triples as follows:
{(Ram, Mohan, 1), (Majid, Abdul, .7), (Peter, John, .7), (Ram, John, .4)}.
As in the case of ordinary relations, we can use matrices and graphs to represent
FUZZY relations, e.g., the relation of UNCLE discussed above, may be graphically
denoted as
[Figure: Fuzzy Graph of the UNCLE relation, with edges Ram → Mohan (1), Ram → John (.4), Majid → Abdul (.7) and Peter → John (.7)]
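The matrix representation mentioned above can be sketched by storing the fuzzy relation as a dictionary from ordered pairs to degrees. The name relation_matrix is hypothetical, rows and columns are simply sorted alphabetically, and giving degree 0 to every pair that is not listed is an assumption of this sketch.

```python
UNCLE = {
    ("Ram", "Mohan"): 1,
    ("Majid", "Abdul"): .7,
    ("Peter", "John"): .7,
    ("Ram", "John"): .4,
}


def relation_matrix(relation):
    """Return (row labels, column labels, matrix of degrees) for a fuzzy relation."""
    rows = sorted({a for a, _ in relation})
    cols = sorted({b for _, b in relation})
    # Pairs that are not listed are taken to be related with degree 0.
    matrix = [[relation.get((r, c), 0) for c in cols] for r in rows]
    return rows, cols, matrix


rows, cols, matrix = relation_matrix(UNCLE)
```

Each row of the matrix corresponds to an uncle and each column to a nephew, so the row for Ram carries the two non-zero entries .4 (John) and 1 (Mohan).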
Fuzzy Reasoning
In the rest of this section, we have just a fleeting glance at Fuzzy Reasoning.
Let us recall the well-known Crisp Reasoning Operators
(i) AND
(ii) OR
(iii) NOT
(iv) IF P THEN Q
(v) P IF AND ONLY IF Q
The deg (P) = 0 denotes P is False and deg (P) =1 denotes P is True.
Monotonic Reasoning: The conclusions drawn in PL and FOPL are obtained only through
(valid) deductive methods. When some axiom is added to a PL or an FOPL system,
then, through deduction, we can draw more conclusions. Hence, more additional facts
become available in the knowledge base with the addition of each axiom. Adding of
axioms to the knowledge base increases the amount of knowledge contained in the
knowledge base. Therefore, the set of facts through inferences in such systems can
only grow larger with addition of each axiomatic fact. Adding of new facts can not
reduce the size of K.B. Thus, amount of knowledge monotonically increases with the
number of independent premises due to new facts that become available.
However, in everyday life, many times in the light of new facts that become available,
we may have to revise our earlier knowledge. For example, we consider a sort of
deductive argument in FOPL:
However, later on, we come to know that Tweety is actually a hen and a hen cannot
fly long distances. Therefore, we have to revise our belief that Tweety can fly over
long distances.
This type of situation is not handled by any monotonic reasoning system, including PL
and FOPL. It is appropriately handled by Non-Monotonic Reasoning Systems,
which are discussed next.
To meet the requirement for reasoning in the real world, we need non-monotonic
reasoning systems also, in addition to the monotonic ones. This is true especially in
view of the fact that it is not reasonable to expect that all the knowledge needed for a
set of tasks could be acquired, validated, and loaded into the system just at the outset.
In general, initial knowledge is an incomplete set of partially true facts. The set may
also be redundant and may contain inconsistencies and other sources of uncertainty.
The knowledge base (KB) contains information, facts, rules, procedures etc. relevant to the type of
problems that are expected to be solved by the system. The inference engine (IE) of a
non-monotonic reasoning system (NMRS) gets facts from KB to draw new inferences and sends
the new facts discovered by it to KB. The truth maintenance system (TMS), after the addition
of new facts to KB, either from the environment or through the user or through IE, checks the
validity of the KB. It may happen that a new fact from the environment or inferred by the IE
conflicts with or contradicts some of the facts already in the KB. In other words, an
inconsistency may arise. In case of inconsistencies, TMS retracts some facts from
KB. This may lead to a chain of retractions, which may require interactions
between KB and TMS. Also, some new fact, either from the environment or from IE,
may invalidate some earlier retractions, requiring reintroduction of earlier retracted
facts. This may lead to a chain of reintroductions. These retractions and reintroductions
are taken care of by TMS. The IE is completely relieved of this responsibility. The main
job of IE is to conclude new facts when it is supplied a set of facts.
[Figure: components of an NMRS, showing IE and TMS, each interacting with KB]
Let us assume KB has two facts P and ~ Q → ~ P and a rule called Modus Tollens.
When IE is supplied these knowledge items, it concludes Q and sends Q to KB.
However, through interaction with the environment, KB is later supplied with the
information that ~ P is more appropriate than P. Then TMS, on the addition of ~ P to
KB, finds that KB is no longer consistent, at least with P. The knowledge that ~ P is
more appropriate suggests that P be retracted. Further, Q was concluded assuming P
as True. But, in the new situation in which P is assumed to be not appropriate, Q also
becomes inappropriate. P and Q are not deleted from KB, but are just marked as
dormant or ineffective. This is done in view of the fact that later on, if again, it is
found appropriate to include P or Q or both, then, instead of requiring some
mechanism for adding P and Q, we just remove marks that made these dormant.
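The idea of marking facts as dormant, instead of deleting them, can be sketched as follows. This is a minimal illustration and not an actual TMS: the entry layout (fact, status, supporting facts) and the names retract and active-facts are assumptions made for the example.

```lisp
;; Each KB entry is (fact status supporting-facts);
;; status is :active or :dormant.
(defparameter *tms-kb*
  (list (list 'p :active '())
        (list 'q :active '(p))))   ; Q was concluded assuming P

(defun retract (fact kb)
  "Mark FACT as dormant, then do the same for every entry that depends on it."
  (let ((entry (assoc fact kb)))
    (when (and entry (eq (second entry) :active))
      (setf (second entry) :dormant)
      (dolist (other kb)
        (when (member fact (third other))
          (retract (first other) kb)))))
  kb)

(defun active-facts (kb)
  "Return the facts that are not marked dormant."
  (mapcar #'first (remove :dormant kb :key #'second)))

(retract 'p *tms-kb*)      ; P becomes dormant, and so does Q
(active-facts *tms-kb*)    ; no active facts remain
```

Reintroducing a fact would, under the same assumptions, amount to setting its status back to :active.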
In the previous section, we discussed uncertainty due to beliefs (which are not
necessarily facts) where beliefs are changeable. Here, we discuss another form of
uncertainty, which occurs as a result of incompleteness of the available knowledge at a
particular point of time.
Whenever, for any entity relevant to the application, information is not in the KB, then
a default value for that type of entity, is assumed and is assigned to the entity. The
default assignment is not arbitrary but is based on experiments, observations or some
other rational grounds. However, the default value assigned to the entity is withdrawn if some
information contradictory to the assumed or default value becomes available.
The advantage of this type of reasoning system is that we need not store all facts
regarding a situation. Reiter has given one theory of default reasoning, which is
expressed as
$$\frac{a(x) : M\,b_1(x), \ldots, M\,b_k(x)}{C(x)} \qquad \text{(A)}$$
The inference rule (A) states that if a(x) is true and none of the conditions
b_1(x), …, b_k(x) is in conflict or contradiction with the K.B., then we can deduce the
statement C(x).
Suppose we have
$$\frac{Bird(x) : M\,fly(x)}{Fly(x)} \qquad \text{(i)}$$
M fly (x) stands for a statement of the form ‘KB does not have any statement of the
form that says x does not have wings etc., because of which x may not be able to fly’.
In other words, Bird (x) : M fly (x) may be taken to stand for the statement ‘x is a
normal bird and the normality of x is not contradicted by other facts and rules in the
KB’; in that case we can assume that x can fly. Combining rule (i) with Bird (Tweety), we
conclude that if KB does not have any facts and rules from which it can be inferred that
Tweety cannot fly, then Tweety can fly.
Further, suppose KB also contains
$$\frac{Adult(x) : M\,drive(x)}{Drive(x)}$$
If a person x is an adult and the knowledge base contains no fact (e.g., x is blind, or
x has lost both hands in an accident) which makes x incapable of driving, then it is
assumed that x can drive.
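A default rule of this kind can be sketched in LISP as shown below. The sketch makes strong simplifying assumptions: the consistency check M fly(x) is reduced to looking for an explicit blocking fact of the hypothetical form (cannot-fly x), and the bird pingu and the function names are invented for the illustration.

```lisp
;; Facts are lists such as (bird tweety) or (cannot-fly pingu).
(defparameter *bird-kb*
  '((bird tweety) (bird pingu) (cannot-fly pingu)))

(defun known-p (fact kb)
  "True when FACT is literally present in KB."
  (member fact kb :test #'equal))

(defun flies-by-default-p (x kb)
  "Conclude that X flies when X is a bird and KB holds no blocking fact."
  (and (known-p (list 'bird x) kb)
       (not (known-p (list 'cannot-fly x) kb))))

(flies-by-default-p 'tweety *bird-kb*)   ; default applies
(flies-by-default-p 'pingu *bird-kb*)    ; default blocked
;; Adding contrary information withdraws the earlier conclusion.
(flies-by-default-p 'tweety (cons '(cannot-fly tweety) *bird-kb*))
```

The last call shows the non-monotonic character of the rule: a larger KB supports fewer conclusions.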
This mechanism is useful in applications where most of the facts are known and
therefore it is reasonable to assume that if a proposition cannot be proved, then it is
FALSE. This is called the Closed World Assumption (CWA) with negation as failure.
This means that if a ground atom P(a) is not provable, then we assume ~ P(a). A predicate like
LESS (x, y) becomes a ground atom when the variables x and y are replaced by
constants, say x by 2 and y by 3, so that we get the ground atom LESS (2, 3).
A KB is complete if for each ground atom P(a), either P(a) or ~ P(a) can be proved.
By the use of CWA any incomplete KB becomes complete by the addition of the
meta rule:
The above KB is incomplete as we cannot say anything about Q(b) (or ~ Q(b)) from
the given KB.
Remarks: In general, a KB augmented by CWA need not be consistent, i.e.,
it may contain two mutually conflicting wffs. For example, suppose our KB contains
only P(a) ∨ Q(b).
(Note: from P (a) ∨ Q (b), we cannot conclude either of P (a) and Q (b) with
definiteness.)
As neither P(a) nor Q(b) is provable, we add ~ P(a) and ~ Q(b) by using
CWA.
But then the set of P(a) ∨ Q(b), ~ P(a) and ~ Q(b) is inconsistent.
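The closed world assumption for ground atoms can be sketched as below. This is a deliberately simplified illustration: "provable" is reduced to "stored in the KB", the KB holds only ground atoms, and the names *ground-kb*, provable-p and cwa-truth are hypothetical.

```lisp
;; A KB consisting only of ground atoms such as LESS (2, 3).
(defparameter *ground-kb* '((less 2 3) (less 1 2)))

(defun provable-p (ground-atom kb)
  "Simplification: a ground atom counts as provable when it is stored in KB."
  (member ground-atom kb :test #'equal))

(defun cwa-truth (ground-atom kb)
  "Under CWA, an atom that cannot be proved is assumed to be false."
  (if (provable-p ground-atom kb) 'true 'false))

(cwa-truth '(less 2 3) *ground-kb*)   ; stored in the KB
(cwa-truth '(less 3 2) *ground-kb*)   ; not provable, so assumed false
```

A KB containing a disjunction such as P(a) ∨ Q(b) cannot be handled by this lookup, which is exactly the case where CWA leads to the inconsistency described above.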
PL and FOPL are deductive inferencing systems, i.e., the conclusions drawn are
invariably true whenever the premises are true. However, due to the limitations of these
systems for making inferences, as discussed earlier, we must have other systems
of inference. In addition to Default Reasoning systems and Closed World Assumption
systems, we have the following useful reasoning systems:
Abduction Rule
$$\frac{P \to Q,\quad Q}{P}$$
Note that the abductive inference rule is different from the Modus Ponens inference rule in
that, in the abductive inference rule, the consequent of P → Q, i.e., Q, is assumed to be
given as True and the antecedent of P → Q, i.e., P, is inferred.
The doctor then attempts to diagnose the disease (i.e., P) from symptoms. However, it
should be noted that the conclusion of the disease from the symptoms may not always
be correct. Abductive reasoning often leads to correct conclusions, but the
conclusions may also be incorrect. In other words, abductive reasoning is not a valid
form of reasoning.
The inductive inference rule
$$\frac{P(a_1),\ P(a_2),\ \ldots,\ P(a_n)}{(\forall x)\, P(x)}$$
states that from n instances P(a_i) of a predicate/property P(x), we infer that P(x) is
True for all x.
However, all the rules discussed under Propositional Logic (PL) and FOPL, including
Modus Ponens etc., are deductive, i.e., they lead to irrefutable conclusions.
3.10 SUMMARY
In this unit, we briefly discussed some formal systems which take care of at least one
of the blemishes in the knowledge base, namely, inconsistency, imprecision and
incompleteness of the knowledge base. In Sections 4.2 to 4.5, we discuss Fuzzy
systems, which attempt to handle imprecision due to the use of natural-language words
having multiple meanings. The words appear in the description of the problems
to be solved by man-machine systems.
In Section 4.6, we briefly discuss non-monotonic (formal) systems, which mainly deal
with problems involving beliefs, instead of facts. A belief may be revised by the
believer, when strong evidence becomes available for the revision of the belief.
In Sections 4.7 and 4.8, we discuss two formal systems which attempt to deal with
incompleteness of the available knowledge of the problem domain. Default reasoning
systems discussed in Section 4.7, attempt to handle the problem of incompleteness of
knowledge, through assumption of default values for the missing values. The default
values may be withdrawn, in case some knowledge contrary to the default values,
becomes available.
On the other hand, another formal system, viz., the closed world assumption system
discussed in Section 4.8, assumes that if the truth of a statement is not available in the
knowledge base, then the statement is false. Finally, in Section 4.9, we discuss
some inference rules, namely, abductive and inductive rules, which, though not
deductive, prove quite useful in everyday problems, particularly in diagnostic
problems.
3.11 SOLUTIONS/ANSWERS
Ex. 1: Both A and C are subsets of the fuzzy set B, because deg (x in A) ≤ deg (x
in B) for all x ∈ X
UNIT 1 A.I. LANGUAGES-1: LISP
1.0 INTRODUCTION
‘LISP has jokingly been called ‘the most intelligent way to misuse a computer’. I think
that description is a great compliment, because it transmits the full flavour of
liberation; it has assisted a number of most gifted fellow humans in thinking
previously impossible thoughts’
Edsger W. Dijkstra

The task of solving problems using a computer as a tool is, in general, quite a
comprehensive task. Ever since computers came into use for solving problems, it has been
found that the solving of problems can be facilitated by using an appropriate
style/paradigm for a given type of problem and using a language designed and
developed according to the basic principles of the style.

Some of the well-known programming languages like C support the imperative style of
programming for solving problems with the help of a computer. The major feature of the
imperative style is that the proposed solution is expressed in terms of variables,
declarations, expressions and commands. Declarations assign names to locations in
the memory and associate types with the values. Commands may be thought of as
names for actions, that are required to be executed by a computing system, mainly to
change values stored in the memory locations. Commands are generally executed in
the order, from top to bottom, as these appear in the program, though through
conditional and unconditional jumps, the flow of execution can be changed. One very
important concept in the imperative style of programming is that of the state (of memory),
i.e., the set of values assigned to various locations in the memory at a particular point
of time.
1.1 OBJECTIVES
LISP has been one of the most popular languages for AI applications. LISP was
specifically designed for A.I. applications, and we have already mentioned that AI
applications involve symbolic processing, instead of mere numeric processing. Also,
AI systems are generally large and complex. Their development requires that the
implementation language and support environment provide flexibility, rapid
prototyping and good debugging tools. On all these counts, LISP wins over all other
programming languages. Hence, the language was used in its earliest applications for
writing programs which performed symbolic differentiation, integration and
mathematical theorem proving. Later applications, mainly written in LISP, include
expert systems and programs for common sense reasoning, natural language
interfaces, education and intelligent support systems, learning, speech and vision.
*LISP is based on a formal system called λ-calculus (Lambda-calculus), originally proposed by Alonzo Church,
who later developed it along with Stephen Kleene as a foundation for Mathematics.
LISP has been found quite useful for the purpose of systems programming to the
extent that LISP machines have been developed in which whole of the programming
from top to bottom is in LISP. In LISP machines, which are personal computers, the
operating system, the user utility programs, the compilers, and the interpreters are all
written in LISP.
(ii) The above example also shows that prefix notation is used in LISP, i.e., the
operator ‘*’ comes before operands x and y. The prefix notation has the
advantage that the operators (like +, −, * etc.) can be easily located in an
expression.
Product = x * y;
Difference = z − u
Result = product + sum
It is easily seen that the emphasis is on assigning values to the variables, viz,
product, difference and result.
(iii) The language LISP allows programs to be used as data and vice-versa.
LISP has one main data structure, viz., the list, in addition to the elementary
data types: number and symbol. All expressions, whether data or programs,
are mainly expressed in terms of lists.
The main advantage of this property of LISP is that declarative knowledge, i.e.,
information about properties of an object, can be easily integrated with procedural
knowledge, i.e., information about what actions are to be performed. The facility of
uniform representation in LISP is also useful in the sense that it allows us to write
• LISP programs which can modify other programs (written in any language)
including themselves.
• LISP programs that can write entirely new LISP programs.
• AI programs that learn new tasks.
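The following short sketch illustrates how the same list can be handled both as data and as a program. The names *expr*, make-adder-definition and add-five are hypothetical and chosen only for this illustration.

```lisp
;; An expression stored as ordinary list data.
(defparameter *expr* (list '* 3 7))

(first *expr*)    ; the operator, obtained by plain list access
(eval *expr*)     ; the same list treated as a program

;; A program that writes a new program: build a DEFUN form as a list,
;; then evaluate it so that the new function becomes available.
(defun make-adder-definition (name increment)
  (list 'defun name '(n) (list '+ 'n increment)))

(make-adder-definition 'add-five 5)         ; the list (DEFUN ADD-FIVE (N) (+ N 5))
(eval (make-adder-definition 'add-five 5))  ; defines the function ADD-FIVE
(funcall 'add-five 10)                      ; calls the generated function
```

The generated definition is an ordinary list until eval is applied to it, which is what makes program-writing programs straightforward in LISP.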
(iv) LISP is a highly modular language and hence suitable for development of
large software.
(v) The LISP environment provides a facility called Trace by using which
programs written in LISP can easily keep track of the various instructions that
have been executed, the number of times each has been executed and the
order in which the instructions have been executed.
(vi) LISP, being based on the mathematical discipline of λ-calculus, is the most
well-defined of all the programming languages. Hence, programs in LISP
are more reliable. The well-definedness of a language is a very important
issue, as can be seen from the fact that a small error in a FORTRAN
program of the type quoted below, together with the FORTRAN environment’s
inability to detect it, is reported to have led to the loss of a spaceship.
In FORTRAN, the statement
DO 12 I = 1,5
denotes the beginning of a loop, whereas the statement
DO 12 I = 1.5
is an assignment statement. Through the execution of the second statement,
the value 1.5 is assigned to the variable DO12I, because blanks are ignored at
all places in FORTRAN.
(vii) Comments in LISP are introduced by the semicolon character, i.e., ‘;’. The
sequence of characters on a given line after a semicolon is treated as a comment;
a line whose first symbol is a semicolon is therefore a comment in its entirety.
(viii) The types of the variables are not required to be declared in the beginning,
as is done in imperative languages like FORTRAN or C. Also, a variable,
say x, can assume values of any type within the
same program or procedure.
can be written in any one of the following (or even other) formats:
( x (                 ( x ( y
  y z )       or            z
  u                       )
)                       u )
(x) We should note that the two parentheses, viz., the left parenthesis denoted by
‘(’ and the right parenthesis denoted by ‘)’, are the two most important
characters in LISP and must be used very carefully. They are used to
denote lists, and in any valid LISP expression there is a right parenthesis
for each left parenthesis. In the light of the above fact, the expressions
x y x ) or ( f g
are not valid.
Next most important character in LISP is quote, the role of which is explained
after some time under (ii) of evaluation of S-expr.
(xi) Separator. Blanks are used to separate S-exprs, i.e., any valid Lisp entity
or object, specially numbers and symbols. Lists are always separated
automatically from other S-exprs. Comma is used for special purposes
and not as a separator.
( 3 a ( c d ) )
( this is a list of only symbols )
( )
But the following expressions, one on each of the next three lines, are
not valid lists.
( b a 3
this − is − not − a − list
) a b c (
Notation for a valid object in LISP is called an S-expression or just S-expr (shorthand
for Symbolic Expression). Even the objects themselves are sometimes referred to as
S-exprs. The main data types of LISP objects and their interrelationships are shown
in the following diagram:
[Figure: diagram of the LISP data types S-expr, atom, list, string, symbol, number and character]
Note that 3+11 is not a number but a symbol. The reason for this is that any
arithmetic expression involving at least one operator, when represented in LISP, has
to be a list with the first and the last characters as ‘(’ and ‘)’ respectively, and the first
element within the list being necessarily an operator. If we intend to represent the
arithmetic expression 3 + 11, which is equivalent to 14, then it is represented as
( + 3 11 ).
String: A sequence of characters enclosed within double quotes is a string, e.g.,
“abc 2 @ string ”
is a string.
Initially LISP was designed as an interpreted language, though later compiler based
versions also became available. In the interpreted mode, the prompt visible on the
screen is the symbol ‘→‘ , which, of course, may be changed.
→ ( + x ( * y z ) )
→ ’Colour
; where colour is a symbol we get respectively the
; following situations after printing
→ colour
(ii) For evaluating a symbol, say colour, some value or binding must have
been associated with it at some stage before the current evaluation. The evaluation
of the symbol then returns the associated value or binding. Suppose that, at some
earlier stage, the symbol colour was given the value RED. Then, if we give the input colour,
i.e., if we have
→ colour
; then RED is returned, i.e., we get after evaluation
→ RED
(iii) Evaluating a number: A number evaluates to itself
For example if the input is given as
→ 41
;then after evaluation the number 41 is returned.
→ ( + ( * x 3 ) ( - y 1 ) )
Next, assume sum-sq is a function defined in a program which returns the sum of the
squares of its arguments, and again suppose x is bound to 2 and y is bound to 5. Then
suppose the following expression is given as input:
→ ( sum-sq ( * x 3 ) ( - y 1 ) )
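The evaluation of this input can be traced with the sketch below. The definition of sum-sq is not given in the text; the two-argument version shown here is an assumption that matches the description "the sum of the squares of its arguments".

```lisp
;; Assumed definition: the text only describes what sum-sq returns.
(defun sum-sq (a b)
  "Return the sum of the squares of A and B."
  (+ (* a a) (* b b)))

(let ((x 2) (y 5))
  ;; The parameters are evaluated first: (* x 3) gives 6 and (- y 1) gives 4.
  ;; The function is then applied to the arguments 6 and 4.
  (sum-sq (* x 3) (- y 1)))   ; 36 + 16 = 52
```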
with < special-word > being a special word in LISP (to be discussed) and each < param-i >
an S-expr, is called a special form. The evaluation of a special form depends upon the
special word. The parameters < param-i > may or may not be evaluated, depending
upon the < special-word >.
Some of the well-known special words are: defun, cond, do, quote. We shall discuss
evaluation of these special forms at appropriate place. The special form
(quote x) is just equivalent to ’x and hence evaluates to x.
→ (quote (* 3 7 ) );
evaluates to
→ (* 3 7 )
We have already mentioned that the action of return includes printing of the
returned value. The action of return also means that the returned value can
be utilised in further processing, e.g., the S-expr
( + ( setq x 7 ) ( setq y 3 ) )
not only binds x to 7 and y to 3, but the values 7 and 3 respectively are also used
in further evaluation, so that
( + ( setq x 7 ) ( setq y 3 ) )
returns 10.
Sometimes the pairs of arguments to several occurrences of SETQ are run together
and given to a single SETQ. In such a situation, odd-numbered arguments are not
evaluated and even-numbered arguments are evaluated. Further, each even-numbered
value is bound to the immediately preceding odd-numbered argument, e.g., the S-expr
( setq x ’( 1 2 ) y 7 z 11 )
binds x to the list ( 1 2 ), y to 7 and z to 11.
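Both uses of SETQ can be checked with the small sketch below. The wrapper function setq-demo and the use of local variables introduced by let are assumptions made so that the example is self-contained.

```lisp
(defun setq-demo ()
  "Collect the values produced by the two SETQ examples."
  (let (x y z)
    (list (+ (setq x 7) (setq y 3))   ; each SETQ returns the value it binds
          (setq x '(1 2) y 7 z 11)    ; one SETQ with three pairs; returns the last value
          x y z)))                    ; the bindings after the second SETQ

(setq-demo)   ; => (10 11 (1 2) 7 11)
```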
1.5 EVALUATION OF PRIMITIVE FUNCTIONS
LISP has a number of basic functions. In this section, we discuss how S-exprs
involving these primitive functions are evaluated.
We have already mentioned that LISP uses prefix notation for representing functional
expressions.
We have already mentioned that the programming language LISP is mainly designed
for symbolic processing though it may be used for numeric purposes also. Symbolic
processing in LISP is mainly about manipulating lists. Here we consider main list
processing operations.
(i) car: It takes a list as an argument and returns the first element of the list.
( car ’( d b c ) ) returns the element d, and ( car ’( ( d b ) c ) ) returns ( d b ). We should
note that the argument has to be a quoted list. This is in consonance with our earlier
statement that for all functions denoted by an atom, the parameters are evaluated to return
arguments for the function.
Further
(ii) cdr (pronounced as ‘KUDDR’) also takes a list as its argument and returns a list
obtained from the given list by deleting its first element e.g.
(IS AN AI LANGUAGE ).
Also if the statement ( setq x ’( a b c ) ) is followed by the statement ( cdr x ) then the
value returned is ( b c ).
But note (cdr ’x) returns error, because, ’x evaluates to x and not to the list (a b c) and
the cdr of a symbol, in this case x, is not defined.
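LISP provides facilities to simplify notation for a sequence of cars and cdrs, e.g., the
S-expr
( car ( car ( cdr ( car ( cdr ( cdr given-list ) ) ) ) ) ),
where given-list is bound to some valid list, can be simplified to
( caadaddr given-list )
or to any other S-expr like
( caar ( cdadr ( cdr given-list ) ) ).
Note that some dialects limit the length of such abbreviations; Common Lisp, for instance,
predefines only the combinations with up to four a’s and d’s, so the last form is the one
that works there.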
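The equivalence of the long and the abbreviated forms can be checked on a sample list. The list bound to *given-list* below is an arbitrary value chosen for the illustration.

```lisp
;; An arbitrary nested list chosen so that every car and cdr is defined.
(defparameter *given-list* '(a b (c ((d e) f) g)))

;; The long form from the text.
(car (car (cdr (car (cdr (cdr *given-list*))))))   ; => (D E)

;; The same selection written with predefined abbreviations.
(caar (cdadr (cdr *given-list*)))                   ; => (D E)
```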
Remark: The functions car and cdr take things apart. Next we describe three
functions cons, list and append which put things together.
(iii) Cons: takes two arguments, the first may be any (valid) S-expr but the second
must be a list. Then cons returns a new list in which the first argument is the first
element in the returned list followed by the elements of the list given as second
argument, preserving the earlier order of occurrence in the second argument.
(iv) list: may take any number of parameters, each of which is an S-expr. It evaluates
the parameters, and the arguments so obtained are grouped into a list in the
same order as the corresponding parameters are given initially, e.g., if the inputs
( setq x ’( a b ) ) and ( setq y ’intelligence ) are followed by the input S-expr
( list ’x y x ’knowledge ’y ), then on evaluation of the last S-expr we get the list
( x intelligence ( a b ) knowledge y )
(v) Append: the parameters of the function append cannot be arbitrary S-exprs but
must evaluate to lists. It removes the outer parentheses from the arguments obtained by
evaluating the parameters and puts all the LISP objects so obtained into a list.
For example, if the S-exprs ( setq x ’( a b ) ) and ( setq y ’( c d ) ) are followed by
( append x y ’( ( a b ) ) )
then the last S-expr on evaluation returns ( a b c d ( a b ) ). If we try to evaluate
( append ’x y ), then an error is returned, because the value of ’x is the symbol x and a
symbol cannot be an argument of append.
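The three functions that put things together can be compared side by side. In the sketch below the wrapper putting-together-demo and the variable w (used for the list ( c d ) so that y can keep the value intelligence) are assumptions made for the illustration.

```lisp
(defun putting-together-demo ()
  "Return the results of cons, list and append on the values used in the text."
  (let ((x '(a b))
        (y 'intelligence)
        (w '(c d)))
    (list (cons 'z x)                    ; => (Z A B)
          (list 'x y x 'knowledge 'y)    ; => (X INTELLIGENCE (A B) KNOWLEDGE Y)
          (append x w '((a b))))))       ; => (A B C D (A B))

(putting-together-demo)
```

Note how list keeps ( a b ) as a single element, whereas append splices the elements a and b into the result.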
Next, we define some more built-in list processing functions, which are quite useful in
writing LISP programs for solving problems requiring symbolic processing.
(vi) Reverse: takes a list as its argument and reverses the top level elements of the
argument e.g.
( reverse ’ ( a b ( a b ) ( c d ) ) )
returns
( ( c d ) ( a b ) b a )
(vii) Length: again takes a list as its argument and returns the number of the top level
elements, e.g.,
( length ’ (a ( a b ) ( c d ) e ) ) returns the number 4.
(viii) Last: again takes a list as argument and returns the last top-level element of the
list, e.g.,
(ix) Subst (stands for substitution): takes three arguments, such that each occurrence
of the second argument in the third argument is replaced by the first argument.
The second argument must be an atom, e.g.,
( subst ’A ’B ’( D B A ) )
returns
( D A A )
Also
( subst ’( A B ) ’C ’( D B A C ) )
returns
( D B A ( A B ) )
(x) eval: In some situations, we may need another evaluation, in addition to the
evaluation provided by read −eval−print loop. The function eval is explained with
following examples:
→ ( setq x ’y )
→ ( setq y ’z ) ; then x evaluates to y but ( eval x ) evaluates to z.
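The list functions (vi) to (x) can be tried out as below. The sketch assumes Common Lisp, where last returns a list holding the last element and not the element itself; the global variables *x* and *y* stand in for the x and y of the text and are named differently only to follow the usual convention for global variables.

```lisp
(reverse '(a b (a b) (c d)))     ; => ((C D) (A B) B A)
(length '(a (a b) (c d) e))      ; => 4
(last '(a b (c d)))              ; => ((C D)), a list holding the last element
(car (last '(a b (c d))))        ; => (C D), the last element itself
(subst 'a 'b '(d b a))           ; => (D A A)
(subst '(a b) 'c '(d b a c))     ; => (D B A (A B))

;; A second round of evaluation with EVAL.
(defparameter *x* '*y*)
(defparameter *y* 'z)
*x*          ; => *Y*
(eval *x*)   ; => Z
```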
Predicates are functions which return nil or t depending upon the values of their
arguments. The evaluation by some important predicates is explained in the following
table:
Further, if we have ( setq fifty 50 ) before the next S-expr, then
( numberp fifty )      t
But
( numberp ’fifty )     nil
( zerop 0 )            t
( zerop x )            t
But
( zerop ’x )           nil
(Note: in Common Lisp, zerop accepts only numbers, so ( zerop ’x ) signals an error
there instead of returning nil.)
The predicate ‘null’: null returns t if its argument is nil else returns nil, e.g.,
( null ( ) ) returns t
( null ’man ) returns nil
( null ’ (a b ) ) returns nil
The predicate ‘member’: The predicate ‘member’ has a little different behaviour. It
may not return t and/or nil. The predicate member tests an atom for the membership
of a list. If an atom is not a member of the list then it returns nil, else it returns the
portion of the list starting with the atom in the list up to the last element of the list. For
example, if we define
(setq last-alphabet ’( u v w x y z ) )
; then
( member ’a last-alphabet ) returns nil
( member ’w last-alphabet ) returns ( w x y z )
However, the predicate member tests the atom only for top membership of the
argument list. Hence,
( member ’w ’( u v ( w x y ) z ) ) returns nil,
because w is not a member of the list given by the second argument, but w is a
member of a member, viz of (w x y) of the list given by the second argument.
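The behaviour of these predicates can be collected in one sketch. The wrapper predicate-demo is a hypothetical name, and local variables are used in place of the global setq bindings of the text.

```lisp
(defun predicate-demo ()
  "Return the values of the predicate calls discussed above, in order."
  (let ((fifty 50)
        (last-alphabet '(u v w x y z)))
    (list (numberp fifty)                  ; T: FIFTY evaluates to the number 50
          (numberp 'fifty)                 ; NIL: the quoted symbol is not a number
          (zerop 0)                        ; T
          (null '())                       ; T
          (null '(a b))                    ; NIL
          (member 'a last-alphabet)        ; NIL
          (member 'w last-alphabet)        ; (W X Y Z)
          (member 'w '(u v (w x y) z)))))  ; NIL: only the top level is searched

(predicate-demo)
```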
The predicate eql: We considered two forms of predicates for equality, viz., ‘equal’
and ‘=’. We consider another predicate eql for equality. The predicate eql checks the
equality of the internal structure of its arguments, i.e., whether they occupy the same
memory cells. If the structures of the arguments are
identical, it returns t else nil. In order to explain the behaviour of eql, we need the
following additional information:
Each time we use the function list, even with the same elements, it takes new memory
cells and creates the list. Thus the two S-exprs
( setq list1 ( list ’x ’y ’z ) )
( setq list2 ( list ’x ’y ’z ) )
create two lists, viz., list1 and list2, which are internally different though each of them
consists of the same three elements x, y and z. However, if we further have
( setq list3 list1 )
then list3 and list1 point to the same memory locations and hence
( eql list1 list2 ) returns nil
( eql list1 list3 ) returns t
( eql list2 list3 ) returns nil
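The three comparisons can be reproduced with the sketch below, which also adds equal for contrast. The wrapper eql-demo is a hypothetical name, and the way list1, list2 and list3 are created is an assumption consistent with the description above.

```lisp
(defun eql-demo ()
  "Compare freshly built lists with EQL and with EQUAL."
  (let* ((list1 (list 'x 'y 'z))
         (list2 (list 'x 'y 'z))    ; same elements, but new memory cells
         (list3 list1))             ; the very same cells as list1
    (list (eql list1 list2)         ; NIL
          (eql list1 list3)         ; T
          (eql list2 list3)         ; NIL
          (equal list1 list2))))    ; T: EQUAL compares the contents

(eql-demo)   ; => (NIL T NIL T)
```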
The main differences in the behaviour of the logical operators in LISP from their
behaviour in Boolean Algebra or in some other programming languages are:
i) The operators AND and OR may take one, two or more than two arguments.
ii) The values operated on by logical operators in LISP are not exactly true or false;
the values are nil and non-nil, i.e., in LISP any S-expr which is not nil has the
same logical status as that of true in Boolean Algebra. Hence, the modified definitions of
the logical operators are:
AND: The arguments of AND are evaluated from left to right. If, at any stage, an
argument evaluates to nil, then nil is returned and the remaining arguments
are not evaluated. However, if none of the
arguments evaluates to nil, then the value of the last argument is returned.
OR: The arguments of OR are evaluated from left to right until some value
returned is non-nil; that value is then returned as the value of the application of
OR and the rest of the arguments are not evaluated. However, if every argument
evaluates to nil, then nil is returned.
Examples:
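A few cases that follow directly from the definitions above can be sketched as below. The calls to error are there only to show that the remaining arguments are never evaluated; the symbol first-hit is an arbitrary non-nil value.

```lisp
(and 1 'a '(b c))                                 ; => (B C), the value of the last argument
(and 1 nil (error "this form is not evaluated"))  ; => NIL, evaluation stops at the first nil
(or nil nil 'first-hit (error "not evaluated"))   ; => FIRST-HIT, the first non-nil value
(or nil nil)                                      ; => NIL, every argument is nil
```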
So far we have discussed only built-in functions including predicates and relations.
The special word DEFUN allows us to write our own functions and build our own
programs. To build up highly complex programs, we need
( defun < function-name > < parameter-list > < function-body > )
where < function-name > is a symbol which names the function being defined,
< parameter-list > is a list of distinct symbols, which forms the list of parameters to the
function, and < function-body > is the sequence of S-exprs which describes (or
denotes) the desired computation.
( defun sumcube ( x y )
  ( + ( * x x x ) ( * y y y ) ) )
Note 1: We have already mentioned that in interpreter mode, every S-expr, which
appears after the LISP prompt ’→’ is read, evaluated and printed. The execution of an
S-expr that defines a function, returns the name of the function. The name, acting as
a symbol, has a value obtained through execution, associated with it and the
associated value can be used in further processing. For example,
Note 2: Applying a function to its arguments is termed as making a function call. For
the above definition of the function sumcube, the following sequence
→ (setq x 3 ) ; returns 3
→ ( sumcube 2 x) ; returns 2 * 2 * 2 + 3 * 3 * 3, i.e. 35
(cond
( < test-1 > < S-expr > < S-expr > … < S-expr > )
( < test-2 > < S-expr > < S-expr > … < S-expr > )
( < test-n > < S-expr > < S-expr > … < S-expr > )
)
Each list of the form ( < test-i > < S-expr > . . . < S-expr > ) in the above is called a
clause.
The COND form is evaluated according to the following rule:
Evaluate the first test, viz., < test-1 >, in clause 1. If it evaluates to non-nil, then the
remaining < S-expr >’s in the clause are evaluated in order from left to right and
the value of the whole COND is the same as the value of the last S-expr in clause 1.
If < test-1 > evaluates to nil, the same sequence of steps is repeated for the second clause,
and so on, until some < test-i > evaluates to non-nil, in which case the sequence of steps
described above for the non-nil case is followed. However, if all the < test-i > evaluate
to nil, then the COND form evaluates to nil.
A special case of the COND form is the one in which < test-i > is the first test which
evaluates to non-nil and the corresponding ith clause has no other S-expr,
i.e., the ith clause is of the form ( < test-i > ).
In this case, the value of < test-i >, which is assumed to be non-nil, is returned.
Comment (a): The example given above graphically demonstrates a style of placing
corresponding parentheses in the code. This is allowed as LISP code is format free.
Comment (b): We also know that t evaluates to itself and hence is non-nil. Therefore,
the two occurrences of the clause ( t z ) state that in case ( > x y ) is true but ( > x z )
is false then z is returned. Similarly, if ( > x y ) is false then the next clause as given
below is executed
(
( (>y z) y)
( t z )
)
In this clause, first of all, the condition ( > y z ) is evaluated. If its value happens to
be nil, then the condition t in the next sub-clause ( t z ) is tested. As t always evaluates
to non-nil, z is returned.
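The example that these comments refer to is not reproduced in this text. A function consistent with the description, returning the largest of three numbers by means of nested COND forms, might look like the sketch below; the name largest-of-three and the exact layout are assumptions.

```lisp
(defun largest-of-three (x y z)
  "Return the largest of the numbers X, Y and Z using nested COND forms."
  (cond
    ((> x y)
     (cond
       ((> x z) x)     ; x exceeds both y and z
       (t z)))         ; x > y, but x does not exceed z
    (t
     (cond
       ((> y z) y)     ; y is at least x and exceeds z
       (t z)))))       ; otherwise z is returned

(largest-of-three 3 1 2)   ; => 3
(largest-of-three 3 1 5)   ; => 5
(largest-of-three 1 4 2)   ; => 4
```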
The special form Do provides the power of iteration to LISP. We may recall from
our earlier studies that iterative constructs are very useful, specially for denoting long
sequences of actions by shorter code.
Also, LET is a special form useful in LISP because it facilitates the creation of local
variables and often yields code which is both compact and efficient. Let us first
discuss the special form Do in detail.
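( do ( ( var-1 init-1 step-1 )      ; variables with initial and step values
       ...
       ( var-n init-n step-n ) )
     ( end-test                     ; test which decides when the loop
       end-form-1                   ; is to be terminated
       ...
       end-form-n return-value )
     body-1                         ; body of Do-loop
     body-2
     ... )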
Step 1: Variables var-i are initially bound to init-i for all i in parallel
Step 2: If the S-expr viz end-test is present, it is examined. If end-test evaluates to nil
then the following sub-steps are followed:
(i) Each of body-j is evaluated, and if in body-j any S-expr of the form (return value)
is encountered, do is exited and its value is the value in return value.
(ii) We should note that only utility of the body is for exiting or for side-effects.
(iii) Next iteration starts (only if end-test evaluated as nil) with binding of each of the
var-i to the value of step-i. If step-i is omitted, the var-i is left unchanged.
(iv) Repeat the whole of Step 2 if end-test again evaluates to nil; else go to Step 3.
The next two steps are executed when end-test evaluates to non-nil.
Step 3: Each of end-form-i is evaluated, the utility of which is for exiting or side-effects.
Step 4: return-value is evaluated and its value is returned as the value of Do.
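The steps above can be illustrated by a small sketch. The function below is only an assumed reconstruction: the name print-beginning-integers is taken from the call shown in the text, but the body (a single loop variable i, PRIN1 as the loop body, and the symbol done as return value) is our own illustration and not necessarily the original definition.

```lisp
;; Hypothetical sketch: prints the integers 1..n using DO.
(defun print-beginning-integers (n)
  (do ((i 1 (+ i 1)))      ; var-1 is i, init-1 is 1, step-1 is (+ i 1)
      ((> i n) 'done)      ; end-test, followed by the return-value
    (prin1 i)))            ; body: evaluated only for its side-effect
```

Here the body is useful only because it prints; the value of the whole Do form comes from the return-value position.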
→ ( print-beginning-integers 12 )
123456789101112 done
; here done is used to indicate the successful
As we have already mentioned, LET is used for creating local variables. The
purpose and use of LET may be explained through a simple example:
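( defun explain-let ( x y )
   ( let
      (
         ( x 1 )
         ( y 2 )
      )
      ( princ "x=" ) ( princ x ) ; inside LET, x is the local variable
      ( terpri )
      ( princ "y=" ) ( princ y ) ; inside LET, y is the local variable
      ( terpri )
   )
   ( princ "x=" ) ( princ x ) ; outside LET, x is the parameter again
   ( terpri )
   ( princ "y=" ) ( princ y ) ; outside LET, y is the parameter again
   ( terpri )
)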
The printing command ( terpri ) asks the printer to leave the line and start
printing on the next line.
x=1
y=2
(∵ through LET, we define a local block in which x is 1 and y is 2. But, once execution
exits the block, x and y again have the values passed as arguments.)
x=8
y=9
Remarks: As in Do, the values of the variables within the LET structure are bound in
parallel. If we wish to bind values in sequential order, then we use LET*.
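The difference between parallel and sequential binding can be seen in a minimal sketch. The function names let-parallel and let-sequential are illustrative only and do not come from the text.

```lisp
;; Both functions receive an outer x as their parameter.
(defun let-parallel (x)
  (let ((x 1)
        (y (+ x 1)))     ; this x is still the parameter x
    (list x y)))

(defun let-sequential (x)
  (let* ((x 1)
         (y (+ x 1)))    ; this x is the x bound on the previous line
    (list x y)))
```

With these definitions, ( let-parallel 10 ) gives ( 1 11 ), whereas ( let-sequential 10 ) gives ( 1 2 ).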
Ex 5: Write a LISP program expo to compute i raised to the power j, where i and j are
natural numbers.
1.11 INPUT / OUTPUT PRIMITIVES
In this section, some frequently used I/O primitives are introduced through examples:
(i) The read statement for input is explained through the following:
→ ( + 7 (read) );
→ 8 ; the value given by the user;
→ 15 ; the value returned by the system.
(ii) The primitive TERPRI directs the printer to start on a new line.
(iii) The primitive PRINT takes one argument. It prints its argument in the same
form in which it is received and also returns the argument as value of the print
statement.
Example :
One of the occurrences of (X Y Z U) is for the value returned by the function print
and the other occurrence is printed by the print statement.
Another Example:
→ ( print "good morning" ) ; on execution, the statement gives the following two
; lines
"good morning"
"good morning"
(iv) The primitive prin1 is the same as print except that the new-line character and
the space are not provided.
Example
(v) The primitive princ is the same as prin1 except that it does not print the
double-quotation marks at the beginning and end of its argument, if given. The
princ statement eliminates the double quotes in what is printed, but the returned value still has
the double quotes.
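The behaviour of the three printing primitives can be compared with a small sketch. The helper show-output is a hypothetical name introduced here for illustration; it simply captures, as a string, whatever the given primitive writes.

```lisp
;; Hypothetical helper: returns exactly what PRINTER writes for OBJECT.
(defun show-output (printer object)
  (with-output-to-string (*standard-output*)
    (funcall printer object)))
```

For the string "good morning", ( show-output #'princ "good morning" ) yields the text without quotation marks, ( show-output #'prin1 "good morning" ) keeps the quotation marks, and ( show-output #'print "good morning" ) additionally starts with a new line and ends with a space.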
(vi) Formatted Printing through the primitive FORMAT. LISP allows us to have
cleaner output through the primitive FORMAT which has the following syntax.
( format < destination > < string > arg1 arg2 …… )
Here < destination > specifies where the output is to be directed, e.g., to the printer or
to the monitor or some other external file. Default value is generally the monitor. The
word < string > in the format clause indicates the desired output string which is mixed
up with format directives. The format directives specify how each argument is to be
represented. The order of occurrence of the directives is the same as the order in
which the arguments are to be printed. The character ∼ is used before each directive to
identify the directives. Most common directives are:
Next, field widths for appropriate argument values are specified by an integer between
the tilde (i.e., ∼) and the directive, e.g., ∼3D indicates that the integer field width is 3.
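A minimal sketch of FORMAT with directives and a field width follows. The function names format-row and format-sum are illustrative assumptions. The destination NIL is used here only so that FORMAT returns the resulting string and the result is easy to inspect; the destination T would send the same text to the screen.

```lisp
;; ~A prints any object, ~D prints an integer, ~3D uses a field of width 3.
(defun format-row (name quantity)
  (format nil "~A:~3D" name quantity))

(defun format-sum (a b)
  (format nil "~D + ~D = ~D" a b (+ a b)))
```

The directives are consumed in the same order as the arguments, so in format-sum the three occurrences of ~D correspond to a, b and their sum respectively.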
Recursive function: is a function which calls itself repeatedly, but each call with
simpler arguments than the arguments used by the preceding call.
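A standard illustration of this idea is the factorial function. The definition below is a sketch written for this purpose; only the name fact is taken from the text, and the body is assumed to be the usual one.

```lisp
;; Each call uses the simpler argument (- n 1); the case n = 0 stops the recursion.
(defun fact (n)
  (cond ((zerop n) 1)
        (t (* n (fact (- n 1))))))
```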
FACT
→ ( fact 5 )
120
Another simple example is given below to explain recursion in LISP:
We define our own function LEN that returns the number of top-level elements in a
given list, say L:
( defun LEN ( L )
   ( cond
      ( ( null L ) 0 ) ; the empty list has no elements
      ( t ( + 1 ( LEN ( cdr L ) ) ) ) ; one more than the length of the rest
   )
)
Ex 6: Write a function deep-length that counts the number of atoms (not necessarily
distinct) in a given list. The atoms may be in a list which is a member of another list
which at some level occurs as an element of the given list. For example, for the list
L = ( 1 ( 2 ( 3 4 )) ( 5 ))
length L is three. However, deep-length of L is five.
Association lists are useful tools to associate attributes and their values with objects.
For example, to describe a particular book viz. LISP by Winston & Horn published by
Addison-Wesley Publishing Company in 1984, we may use the representation:
The value of the attribute title is a list viz. (LISP second edition) whereas the value of
the attribute year is 1984. The values of the attributes may be any S-expr, e.g.,
number, symbol or list. Formally we define.
Association List is a list of embedded sublists, in which first element of each sublist
is a key. In the example of book given above, the symbols title, author, year and
publisher are keys.
For a given key, ASSOC looks down the sublists (each sublist representing a key and the
value associated with that key), starting from the first sublist in the list, and matches the
car of each sublist with the key given as first argument of ASSOC. If the two do not
match, ASSOC goes further down to the next sublist. However, if the key and the car of
the sublist match, then the whole of the sublist is returned.
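The search performed by ASSOC can be sketched as follows. The variable name *book* and the exact shape of the association list are assumptions made for illustration, based on the keys title, author, year and publisher mentioned in the text.

```lisp
;; Assumed representation: each sublist is (key value).
(defparameter *book*
  '((title (LISP second edition))
    (author (Winston Horn))
    (year 1984)
    (publisher Addison-Wesley)))

;; Returns the whole matching sublist, or NIL when no key matches.
(defun book-entry (key)
  (assoc key *book*))
```

For instance, ( book-entry 'year ) returns the sublist ( year 1984 ), and a key that does not occur, such as price, gives nil.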
Another way of associating properties and their associated values to objects in LISP,
is through property lists. Considering again the earlier example of book with title as
LISP, author as Winston & Horn, Publisher as Addison-Wesley, year as 1984, we can
put this information in the database using the above-mentioned primitives as follows:
→ (putprop ’book ’LISP ’Title ) ; putprop returns the attribute values LISP
Addison-wesley
( putprop < object – name – symbol > < attribute – value > < attribute – name > )
where < object – name – symbol > and < attribute – name > must be symbols and
< attribute – value > may be any S – expr.
Newer versions of COMMON LISP avoid PUTPROP and instead use SETF.
The primitive SETF : SETF is like SETQ. However, SETF is more general than
SETQ. The primitive SETF also takes two arguments, but the first argument is
allowed to be an access function in addition to being an atom. Access functions
include car, cdr and get. The second argument to SETF is the value, as in the case of
SETQ. The above-mentioned LISP statements using putprop can equivalently be
replaced by the following statements using SETF and GET :
The general format for associating values to attributes of an object using SETF and
GET is
( SETF ( GET < object – name –symbol > < attribute – name > )
< attribute – value > )
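The general format above can be tried out with a short sketch. The object name book and the attribute values follow the running example of the text; the helper book-property is a hypothetical name introduced only for illustration.

```lisp
;; Store attribute values on the property list of the symbol BOOK.
(setf (get 'book 'title) 'LISP)
(setf (get 'book 'publisher) 'Addison-Wesley)
(setf (get 'book 'year) 1984)

;; Retrieve the value of an attribute; NIL is returned if none was stored.
(defun book-property (attribute-name)
  (get 'book attribute-name))
```

A later ( setf ( get 'book 'year ) 1988 ) would simply overwrite the value 1984.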
SETF can also be used to replace values of car or cdr of a list as follows:
→ ( SETQ L ' ( x y z ) )
( x y z )
→ ( SETF ( Car L ) 'a ) ; L now becomes
( a y z )
→ ( SETF ( Cdr L ) ' ( u v w ) ) ; L now becomes
( a u v w )
In general, ( SETF ( car < list > ) < expr > ) replaces the car of < list > by < expr >.
This usage of SETF allows us to change the values of attributes, whenever required.
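The use of SETF on car and cdr can be checked with the following sketch. The function name setf-demo is illustrative. The list is built with LIST rather than written as a quoted constant, since modifying a quoted constant is not safe in COMMON LISP.

```lisp
(defun setf-demo ()
  (let ((l (list 'x 'y 'z)))          ; a fresh list (X Y Z)
    (setf (car l) 'a)                 ; l is now (A Y Z)
    (setf (cdr l) (list 'u 'v 'w))    ; l is now (A U V W)
    l))
```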
The general format of GET, in order to find the value of the attribute having name
< attribute – name > of the object having name < object – name >, is :
( GET < object – name > < attribute – name > )
If there is no value in the data-base for < attribute –name > and < object – name >, the
value nil is returned.
Note : When more than one SETF or PUTPROP are used to give different values to
the same attribute of a given object then the effect of only the latest remains. Earlier
values are overwritten. In order to change values of attributes, we write another
statement using SETF or PUTPROP. Thus, in continuation of our example about the
book entitled LISP by Winston & Horn, if in addition to the earlier statement, we give
the following statement:
It may be noted that if the &optional keyword had not been available, or had we
defined exponentiate by replacing the parameter-list ( n &optional m ) by ( n m ),
then a call that omits m, in the expectation that m will be taken as 10, would have resulted in an error.
→ ( our-sum-3 5 6 )
11
→ ( our-sum-3 5 6 7 )
18
→ ( our-sum-3 5 6 8 9 )
ERROR
The above ERROR occurred because, for a correct response by our-sum-3, the minimum
number of arguments in this case must be 2 (i.e., the number of parameters before
&optional) and the maximum number, in this case, must be 3 (i.e., the number of all parameters
before or after &optional), but in the last call to our-sum-3 we supplied four
arguments.
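A definition consistent with the calls above might look like the following sketch. Only the name our-sum-3 and its behaviour (sum of 2 or 3 arguments) come from the text; the parameter names and the default value 0 for the optional parameter are assumptions.

```lisp
;; Two required parameters and one optional parameter defaulting to 0.
(defun our-sum-3 (a b &optional (c 0))
  (+ a b c))
```

With two required parameters and one optional parameter, calls with fewer than 2 or more than 3 arguments are rejected by the LISP system.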
Now, it is not always possible to remember the exact number of optional parameters
and hence not always possible to check erroneous function calls. To remedy this
situation, LISP provides the keyword &REST, which is followed by exactly one
parameter, say ‘remaining’. If m denotes the number of parameters before &optional
and n the number of parameters after &optional but before &rest, then whenever k
arguments are supplied and k > m + n, all the remaining ( k – ( m + n ) )
arguments are grouped into a list and bound to the parameter that follows &rest. In order to illustrate the ideas
explained above, let us define a function, say special-sum-3, as follows.
→ (special-sum-3 5 7)
12
→ ( special-sum-3 5 7 9 )
21
→ ( special-sum-3 5 7 9 11)
21 ( 11 )
→ ( special-sum-3 5 7 9 11 12 13 )
21 ( 11 12 13 )
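One way to obtain behaviour of this kind is sketched below. This is an assumption, not the original definition: the sketch returns the sum and the list of left-over arguments as two values, whereas the way the original function displayed the left-over list is not known. The parameter names are also illustrative.

```lisp
;; m = 2 required parameters, n = 1 optional parameter,
;; and REMAINING collects any further arguments into a list.
(defun special-sum-3 (a b &optional (c 0) &rest remaining)
  (values (+ a b c) remaining))
```

Unlike our-sum-3, this function never signals an error for extra arguments; they are simply gathered in remaining.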
When a function is to be called only once in a program then we may not like to give a
name to the function in the definition of the function. In such a situation, instead of
the keyword DEFUN we use the keyword LAMBDA. Rest of the definition of the
function remains the same as it would have been under DEFUN. Suppose we need to
compute (x² – y²)²; the following LAMBDA expression will accomplish the task:
→ ( LAMBDA ( X Y )
     ( * ( - ( * X X ) ( * Y Y ) )
         ( - ( * X X ) ( * Y Y ) )
     )
  )
→ ( ( LAMBDA ( X Y )
        ( * ( - ( * X X ) ( * Y Y ) )
            ( - ( * X X ) ( * Y Y ) )
        )
    )
    3 4 ) ; returns
49
APPLY takes two arguments, each of which is evaluated. The first argument, which
is either a function-name or LAMBDA expression, is applied to second argument
which is a list. FUNCALL is similar to the function APPLY with the difference that
arguments are supplied without boundary parentheses of a list. Function-names are
preferably quoted with #' instead of just quote.
Examples:
→ ( APPLY #'* '( 2 3 ) )
6
→ ( FUNCALL #'* 2 3 )
6
For the earlier defined function our-sum-3, which returns the sum of its arguments
whether 2 or 3 arguments are supplied, let us consider
→ ( APPLY #'our-sum-3 '( 4 5 ) )
9
→ ( FUNCALL #'our-sum-3 4 5 6 )
15
→ ( APPLY #'( LAMBDA ( X Y ) ( * ( + X X ) ( + Y Y ) ) ) '( 3 4 ) )
48
→ ( FUNCALL #'( LAMBDA ( X Y ) ( * ( + X X ) ( + Y Y ) ) ) 3 4 )
48
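The difference between the two primitives lies only in how the arguments are handed over, as the following sketch shows. The names double-product and apply-vs-funcall are illustrative and do not appear in the text.

```lisp
(defun double-product (x y)
  (* (+ x x) (+ y y)))

;; APPLY takes the arguments packed in one list;
;; FUNCALL takes them one by one, without the enclosing list.
(defun apply-vs-funcall ()
  (list (apply #'double-product '(3 4))
        (funcall #'double-product 3 4)
        (apply #'* '(2 3))
        (funcall (lambda (x y) (* (+ x x) (+ y y))) 3 4)))
```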
The Backquote facility : A backquoted expression is just like a quoted expression and evaluates
to itself, with the following difference: those subexpressions of the expression that
are preceded by a comma, or by a comma followed by the symbol @, are evaluated
and substituted appropriately before the result is returned. A subexpression preceded
by ,@ must evaluate to a list, whose elements are spliced into the result. Let A be bound to
'( 3 x 4 ); then
→ `( A B C ) ; evaluates to
( A B C )
→ `( ,A B C ) ; evaluates to
( ( 3 x 4 ) B C )
→ `( A ,A B ,@A C ) ; evaluates to
( A ( 3 x 4 ) B 3 x 4 C )
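The three cases can be reproduced with a short sketch, in which the value of A is passed as a parameter. The function name backquote-demo is an illustrative assumption.

```lisp
;; Returns the three backquoted forms of the text for a given value of A.
(defun backquote-demo (a)
  (list `(a b c)              ; nothing is evaluated
        `(,a b c)             ; ,a is replaced by the value of a
        `(a ,a b ,@a c)))     ; ,@a splices the elements of a into the list
```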
In the contexts in which a symbol has or is expected to have some object or S – expr
associated with it, it is called a variable. The symbol book1 becomes a variable,
when associated with the object which represents a book entitled LISP, authored by
Winston & Horn in the year 1984. The association between the symbol and the object
may be achieved through the LISP statement:
The associated object may be referred to as the value of the symbol. The variable may
be considered as the ordered pair: (symbol, value).
Also, a symbol used in the parameter list of a function definition, though it does not
have any associated value or object at the time of definition, is still a variable because
it is expected to be associated with some object at the time of application of the
function.
Bound & Free Variables: A symbol that appears in the parameter list of a procedure
is called a bound variable w.r.t. the procedure. A symbol that is used in a procedure but does not appear in its
parameter list is called a free variable w.r.t. the procedure.
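A small sketch makes the distinction concrete. The names *rate* and add-rate are hypothetical and are introduced only for this illustration.

```lisp
(defparameter *rate* 10)

;; x appears in the parameter list, so it is bound w.r.t. add-rate;
;; *rate* does not, so it is free w.r.t. add-rate.
(defun add-rate (x)
  (+ x *rate*))
```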
Representation of Symbols: In LISP environment, the link between a symbol and the
associated object is unique and is achieved through the following mechanism.
LISP system maintains a Symbol Table (in some part of the memory) in which each
symbol, when encountered for the first time, is entered alongwith some starting
address, say 3000, of some location in the memory where the associated object is
stored. We may note that some of the components of the object may change over
time, e.g., if copies of book1 are printed again in the year 1988. In such a
case, the component ‘ (printing first 1984)’ of the object is replaced by ‘ ( printing
second 1988)’. However, the entry (book1 3000) remains unchanged. Next time when
book1 occurs in a program, the LISP system searches through its possible occurrences
in the symbol table and on finding it there, does not attempt to associate with it
another address or location in memory. Further, the statements like,
will associate address 3000 ( i.e. the address associated with book1 ) with symbols
book2 and book3 as their address parts of (symbol, address) pairs. Thus, we may also
say that a variable is an ordered pair ( symbol, pointer ).
The Predicate EQ : EQ returns t if and only if its arguments are one and the same object in memory, i.e., their internal structures
are identical. Hence, continuing with the earlier discussion, the value t is returned in
all the following three cases (the symbols are not quoted, because it is the associated objects that are compared):
( EQ book1 book1 )
( EQ book1 book2 )
( EQ book2 book3 )
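The behaviour of EQ can be contrasted with that of EQUAL in a short sketch. The variable names and the contents of the list are assumptions made for illustration.

```lisp
(defparameter *book1* (list 'title 'lisp 'year 1984))
(defparameter *book2* *book1*)                  ; same object as *book1*
(defparameter *book-copy* (copy-list *book1*))  ; a new object with equal contents

;; Returns (eq-result equal-result) for two objects.
(defun compare-objects (a b)
  (list (eq a b) (equal a b)))
```

Two variables that point to the same address are EQ, whereas a copy stored elsewhere in memory is only EQUAL.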
2. The list ( ( A B ) C ) is represented by the cons cell structure:
[Figure: cons-cell diagram for the list ( ( A B ) C )]
3. The list ( A ( B C ) ) is represented by the cons cell structure:
[Figure: cons-cell diagram for the list ( A ( B C ) )]
Remarks: The name cons in the cons-cell structures is justified on the
following grounds:
Let L1 be a list and A be an atom and we have the LISP statement ( setq L ( Cons 'A L1 ) );
then L is represented by adding one cons cell in the memory as shown below:
[Figure: one new cons cell whose CAR holds A and whose CDR points to the structure for L1]
Also if L2 is another list then, on the command ( setq M ( Cons L2 L1 ) ), the resultant list
M is obtained by adding one cons cell in memory as shown below:
[Figure: M, one new cons cell whose CAR points to the structure for L2 and whose CDR points to the structure for L1]
It can be easily seen that using the Cons cells representation for lists, operation like
CAR, CDR, ATOM etc can be efficiently implemented.
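That CONS adds just one cell, and that CAR and CDR merely follow the two pointers of a cell, can be checked with the following sketch. The variable names are illustrative.

```lisp
(defparameter *l1* (list 'b 'c))
(defparameter *l* (cons 'a *l1*))   ; one new cons cell in front of *l1*

;; T when the CDR of the new list is the very same structure as *l1*.
(defun shares-tail-p ()
  (eq (cdr *l*) *l1*))
```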
The cons-cell structure suggests that a cons-cell in memory may represent a LISP
object in which the CDR need not be a list but may be an atom or a symbol.
Dotted Pair: A dotted pair is a LISP data structure like a list, with the difference that
the CDR of a dotted pair may also be an atom. A dotted pair with CAR as symbol A and
CDR as symbol B is denoted by (A . B), with spaces around the dot on both sides. Thus, the
cons-box representation for the dotted pair (A . B) is
[Figure: a single cons cell with A in its CAR part and B in its CDR part]
Now we explain the two primitives. Both RPLACA and RPLACD take two
arguments. For RPLACA, first argument is a non-empty list say bound to a symbol,
say, X and second is an arbitrary LISP object say bound to a symbol, say, Y. Then
(RPLACA X Y ) replaces ( Car X ) by Y in the given list and the resulting list is still
bound to X. For RPLACD, the first argument is again a non-empty list bound to the
symbol, say, X and the second argument also must be a list, say, bound to Y then
(RPLACD X Y ) replaces ( CDR X ) by Y and the resulting list is still bound to X.
Examples:
→ ( SETQ X '( ( a b ) c d ) ) ; returns
( ( a b ) c d )
→ ( SETQ Y 3 ) ; returns
3
→ ( RPLACA X Y ) ; returns
( 3 c d ) ; Further if we give
→ ( SETQ Z ' ( 3 7 9 ) ) ; returns
( 3 7 9 )
→ ( SETQ U '( ( c e f ) g ) ) ; returns
( ( c e f ) g )
→ ( RPLACA Z U ) ; returns
( ( ( c e f ) g ) 7 9 ) ; this list is bound to Z
→ ( RPLACD Z V ) ; with V bound to ( ( ( a b ) c ) ), returns
( ( ( c e f ) g ) ( ( a b ) c ) )
The above two primitives viz RPLACA and RPLACD can be obtained from SETF as
follows:
( RPLACA L S ) is same as ( SETF ( CAR L ) S )
and ( RPLACD L S ) is same as (SETF ( CDR L ) S )
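The equivalence can be checked on two identical fresh lists, as in the sketch below. The function name rplac-demo is illustrative. Note that RPLACA and RPLACD return the modified cons cell, whereas SETF returns the new value; the effect on the list is the same.

```lisp
(defun rplac-demo ()
  (let ((x (list 'a 'b 'c))
        (y (list 'a 'b 'c)))
    (rplaca x 'p)                 ; x is now (P B C)
    (setf (car y) 'p)             ; y is now (P B C)
    (rplacd x (list 'q))          ; x is now (P Q)
    (setf (cdr y) (list 'q))      ; y is now (P Q)
    (list x y)))
```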
In general we can make changes to lists in arbitrary positions instead of just to the
( CAR L ) and ( CDR L ), as follows:
Example :
→ ( SETQ X ' ( ( a b ) ( c ( d e ) f ) ) ) ; returns
( ( a b ) ( c ( d e ) f ) )
→ ( RPLACA ( Cdadr X ) 'p ) ; the list bound to X becomes
( ( a b ) ( c p f ) )
→ ( SETQ Z '( ( a b ) ( c d e ) ) )
( ( a b ) ( c d e ) )
For example to create an array structure named Matrix–3 with dimensionality 3 and
<dim–1>as 2, <dim–2> as 2 and <dim–3> as 3 of integers, the following LISP
statement is used:
The above statement creates an array Matrix–3 of 12 elements. The slots in Matrix–3
are empty and we will discuss how to fill values in the slots. The 12 slots in Matrix-3
are referred to as
The primitive AREF is used to refer to a particular slot in the array, e.g.,
( AREF Matrix–3 1 0 2) refers to the slot Matrix–3 ( 1, 0, 2)
The above value-assigning statement can be easily used to give value say integer g to
( i, j, k )th element of Matrix–3 as
In order to retrieve values from any slot, say ( i , j , k) of Matrix–3 we use the
following LISP statement:
→ (AREF Matrix–3 i j k)
; the value g is returned
; if g is the value stored at
; Matrix–3 ( i, j, k ) then g is returned
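The creation, filling and retrieval steps can be put together in a sketch. This is an illustration under assumptions: the variable name *matrix-3*, the use of MAKE-ARRAY with :initial-element 0, and the stored value 42 are ours and are not taken from the text.

```lisp
;; A 2 x 2 x 3 array, i.e. 12 slots, indexed from 0 in each dimension.
(defparameter *matrix-3* (make-array '(2 2 3) :initial-element 0))

;; Store a value in slot (i, j, k) and read it back with AREF.
(defun store-and-fetch (i j k value)
  (setf (aref *matrix-3* i j k) value)
  (aref *matrix-3* i j k))
```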
Note that the two-character sequence viz. #\ is used preceding a character to indicate
that what follows is to be interpreted as a character.
The structure created by the above type of LISP statement will be named
< structure-name > and will have <slot-i>’s as slots. We may recall that an ‘<entity>’
enclosed between angular brackets indicates a place-holder for the entity, to be suitably
replaced. The above type of LISP statement automatically generates the keyword
constructor called MAKE-<structure-name> and also automatically creates the
selector functions <structure-name>-<slot-i> for each i.
For this purpose, a node of binary tree will have three slots: left-tree, value, right-tree.
The left-tree and right-tree are pointers.
Using the above description let us create the following binary tree to be called T1.
The children and grand-children nodes are named as T2, T3 and T4. And a diagonally
crossed cell indicates nil pointer.
[Figure: binary tree T1 with value 3 at the root; its children are T2 (value 2) and T3 (value 5); T4 (value 7) is a child of T3]
The following sequence of LISP statements create the tree shown above:
In the above, the symbol #S indicates the fact that the part following # is a structure.
In order to access the values of :left-tree, :right-tree or :value, the selectors bin-tree-left-tree,
bin-tree-right-tree and bin-tree-value respectively are used, for example
Also, if we give the name of a structure then the whole of the structure would be
available, e.g.
Further, in order to change or even create value of any component, we use the
primitive SETF. For example, if we wish to change the :value component of T2 to –5 we
can use
[Figure: tree T1 after the change, with value –5 at T2 and value 5 at T3]
The new shape of tree T1 after the sequence of two changes mentioned above is like:
[Figure: tree T1 with value 3 at the root; both children are shown as T3 (value 5), each with child T4 (value 7)]
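The structure facilities discussed above can be sketched as follows. The structure name bin-tree and its slots left-tree, value and right-tree follow the selector names used in the text; the variable names, and the choice of attaching T4 as the right child of T3, are assumptions made for illustration.

```lisp
;; DEFSTRUCT generates the constructor MAKE-BIN-TREE and the selectors
;; BIN-TREE-LEFT-TREE, BIN-TREE-VALUE and BIN-TREE-RIGHT-TREE.
(defstruct bin-tree
  (left-tree nil)
  (value nil)
  (right-tree nil))

(defparameter *t4* (make-bin-tree :value 7))
(defparameter *t3* (make-bin-tree :value 5 :right-tree *t4*))
(defparameter *t2* (make-bin-tree :value 2))
(defparameter *t1* (make-bin-tree :left-tree *t2* :value 3 :right-tree *t3*))

;; Change the :value component of T2 to -5, using SETF with a selector.
(setf (bin-tree-value *t2*) -5)
```

Because T1 holds a pointer to T2 rather than a copy of it, the change is also visible through ( bin-tree-value ( bin-tree-left-tree *t1* ) ).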
1.18 SUMMARY
In order to define objects in terms of their attributes and attribute values, association
lists and property lists are used in LISP. These concepts are discussed in Section 1.13.
Some more general and robust facilities in LISP for defining and applying functions, in
the form of Lambda Expression, Apply, Funcall and Mapcar, are discussed in
Section 1.14. The representation of symbols and associated/represented objects in
LISP environment and representation of operations on such representations are
discussed in the next three sections.
1.19 SOLUTIONS/ANSWERS
Ex 1: (i) The variable x is bound to 5 and the variable y is bound to 7. Further the
value ( 5 + 5 ) ∗ ( 7 + 7) is evaluated to 140
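Ex 3: ( defun safe-divide ( X Y ) ; the name safe-divide is only a placeholder; no function name survives in the text
   ( cond ( ( = Y 0 ) 'infinity )
          ( t ( / X Y ) ) ) )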
Ex 4:
Stepwise explanation
Ex 5:
( defun expo ( i j )
   ( do
      (
         ( answer i ( * i answer ) )
         ; initially answer is i and is
         ; multiplied in each iteration by i
         ( power j ( - power 1 ) )
         ( counter ( - j 1 ) ( - counter 1 ) )
         ; initially power is j and in each iteration power is reduced by 1.
         ; counter is an auxiliary variable
      )
      ( ( zerop counter ) answer )
   )
)
→ ( expo 2 3 )
8
Remarks: The clause ( power j ( - power 1 ) ) is actually not required and can be deleted
without affecting the overall (final) result. It has been introduced to explain an
important point about the do-loop. We may be
tempted to write the above function expo by replacing the clause ( counter ( - j 1 )
( - counter 1 ) ) by ( counter ( - j 1 ) ( - power 1 ) ), i.e., replacing the last occurrence of
counter by power, because power is also being computed in the earlier clause. But this
replacement will be wrong and will lead to an incorrect result, because in a Do
loop all the variables, viz. answer, power and counter in the above example, are
computed in parallel, using values from the previous iteration/initialization. The value
newly computed for a variable in the current iteration is not available to the other clauses. Therefore, if ‘power’ replaces ‘counter’
then the previous value of power would be used, whereas we require
the current value.
In many situations, we need sequential computation of the variables in the loop. For
this purpose LISP provides do*. Now, the function expo above may be
rewritten using the earlier computed values of power as is given below:
( defun expo ( i j )
   ( do* (
            ( answer i ( * i answer ) )
            ( power j ( - power 1 ) )
            ( counter ( - power 1 )
                      ( - power 1 )
            )
         )
         ( ( zerop counter ) answer )
   )
)
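The point of the remarks above can be verified by placing the three variants side by side. The names expo-do, expo-do-wrong and expo-do* are introduced here only to keep the variants apart; they are not names used in the text.

```lisp
;; Parallel updating with DO: counter has its own step form.
(defun expo-do (i j)
  (do ((answer i (* i answer))
       (counter (- j 1) (- counter 1)))
      ((zerop counter) answer)))

;; The tempting but wrong variant: with DO, (- power 1) uses the value
;; of power from the previous iteration, so the loop runs once too often.
(defun expo-do-wrong (i j)
  (do ((answer i (* i answer))
       (power j (- power 1))
       (counter (- j 1) (- power 1)))
      ((zerop counter) answer)))

;; Sequential updating with DO*: (- power 1) uses the value of power
;; computed just before, in the same iteration.
(defun expo-do* (i j)
  (do* ((answer i (* i answer))
        (power j (- power 1))
        (counter (- power 1) (- power 1)))
       ((zerop counter) answer)))
```

For i = 2 and j = 3, the first and third variants give 8, whereas the wrong variant gives 16.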
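Ex 6: ( defun deep-length ( L )
   ( cond
      ( ( null L ) 0 )
      ; if the first element is itself a list, count the atoms inside it as well
      ( ( listp ( car L ) ) ( + ( deep-length ( car L ) ) ( deep-length ( cdr L ) ) ) )
      ; otherwise the first element is an atom and counts as one
      ( t ( + 1 ( deep-length ( cdr L ) ) ) )
   )
)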
[Figure: cons-cell diagram containing the atoms B, C and D]
Ex 8:
(i) ( ( u v w ) ( s ( a b c ) m ) )
(ii) ( ( u v w ) ( s ( t u ) a b c ) )
UNIT 2 A. I. LANGUAGES-2: PROLOG
Structure
2.0 Introduction
2.1 Objectives
2.2 Foundations of Prolog
2.3 Notations in Prolog for Building Blocks
2.4 How Prolog System Solves Problems
2.5 Back Tracking
2.6 Data Types and Structures in Prolog
2.7 Operations on Lists in Prolog
2.8 The Equality Predicate ‘=’
2.9 Arithmetic in Prolog
2.10 The Operator Cut
2.11 Cut and Fail
2.12 Summary
2.13 Solutions/Answers
2.14 Further Readings
2.0 INTRODUCTION
We mentioned in the previous unit that there are different styles of problem solving
with the help of a computer. For a given type of application, some style is more
appropriate than others. Further, for each style, some programming languages have
been developed to support it. In this context, we have already discussed two styles of
solving problems viz imperative style and functional style. Imperative style is
supported by a number of languages including C and FORTRAN. Functional style is
supported by, among others, the language LISP. The language LISP is more
appropriate for A. I. applications.
There is another style viz declarative style of problem solving. A declarative style is
non-procedural in the sense that a program written according to this style does not
state exactly how the computational process is to be carried out. Rather, a program
consists of mainly a number of declarations representing relevant facts and rules
concerning the problem domain. The solution to be discovered is also expressed as a
question to be answered or, to be more precise, a goal to be achieved. This
question/goal also forms a part of the PROLOG program that is intended to solve the
problem under consideration. The main technique, in this style, based on resolution
method suggested by Robinson (1965), is that of matching goals (to be discussed)
with facts and rules. The matching process generates new facts and goals. The
matching process is repeated for the whole set of goals, facts and rules, including the
newly generated ones. The process terminates when either all the initial goals,
along with the new goals generated later, are satisfied or when it may be judged or proved
that the goals in the original question are not satisfiable.
2.1 OBJECTIVES
(i) The facts and rules are represented using a syntax similar to that of predicate
logic. These facts and rules constitute what is called PROLOG
database/knowledge base.
(ii) The process of problem solving through PROLOG is carried out mainly using an
in-built inferencing mechanism based on Robinson’s resolution method.
Through the inferencing mechanism, in the process of meeting the initially given
goal(s), new facts and goals are generated which also become part of the
PROLOG database.
There is some minor difference between the predicate logic notation and the notation
used in PROLOG. The formula of predicate logic P ∧ Q ∧ R → S is written as
S :- P, Q, R.
where the symbol obtained from writing ‘:’ (colon) followed by ‘-’ (hyphen) is
read as ‘if’. Further, the conjunction symbol is replaced by ‘,’ (comma).
Repeating, the symbol ‘:-’ is read as ‘if’ and the comma on the R.H.S. stands for
conjunction.
Summarizing, the predicate logic clause P ∧ Q ∧ R → S is equivalently
represented in PROLOG as S :- P, Q, R. (Note the full stop at the end.)
Problem solving style using PROLOG requires statements of the relevant facts
that are true in the problem domain and rules that are valid, again, in the domain
of the problem under consideration.
1. Mohan is tall.
In this case is_tall is a property and if Mohan is actually tall (by some
criteria), then the fact may be stated as
is_tall (mohan).
(The reason for starting the name Mohan with lower-case ‘m’ and not with
upper-case letter ‘M’ is that any sequence of letters starting with an upper-case
letter is treated, in PROLOG, as a variable, whereas names like Mohan
denote a particular person and, hence, denote a constant. Details are given later
on.)
7. Mohan is tall and Aslam is richer than John
may be stated, using conjunct notations of predicate logic, as
is_tall (mohan) ∧ is_richer (aslam, john).
The above statements show that facts are constituted of the following types of
entities viz
(i) Objects (or rather object names) like, Mohan, Ram, Gold etc. In
PROLOG names of the objects are called atoms.
(ii) Numerals like 14, 36.3 etc. Numerals are, of course, also names, namely those
of numbers. However, numerals are generally called numbers, though
slightly incorrectly.
Also, a constant is either an atom or a number, and
(iii) Predicates or relation names like father, parent, is_precious, is_richer
and is_greater etc. In a PROLOG statement.
(iv) Variables
which are better stated in general terms, instead of being stated through an innumerable
number of facts in which x’s are given specific names of persons, y’s specific names
of their respective fathers and z’s specific names of respective grand-fathers.
From the above discussion, it is now clear that knowledge of a problem domain
can be stated in terms of facts and rules. Further, facts and rules can be stated in
terms of
(i) Atoms (which represent object names) like, Mohan, Ram, Gold etc.
(ii) numbers
(iii) Variables like X, Y and Z. A variable may be thought of as something that
stands for some object from a set, but it is not known for which particular
object.
(iv) Predicates or relation names like father, parent, is_precious, is_richer and
is_greater etc. and
(v) Comments: A string of characters generally enclosed between the pair
of signs, viz, ‘/*’ and ‘*/’, denotes a comment in PROLOG. The comments are
ignored by the PROLOG system.
Out of the eight examples of statements which were considered a while ago, the first
six do not involve any of the logical operators, viz,
~ (negation), ∧ (conjunction), ∨ (disjunction) → (implication) and ↔ (bi-
implication).
Such statements which do not contain any logical operators, are called atomic
formulae. In PROLOG, atomic formulae are called structures. In general, a
structure is of the form:
functor (parameter list)
where functor is a predicate and parameter list is a list of atoms, numbers, variables and
even other structures. We will discuss structures again under data structures of
PROLOG.
Terms: functors, structures and constants including numbers are called terms.
In a logical language, the atoms, numbers, predicates, variables and atomic formulas
are basic building blocks for expression of facts and rules constituting knowledge of
the domain under consideration. And, as mentioned earlier, in logic programming
style of problem solving, this knowledge plays very important role in solving
problems.
In any written language, whether natural or formal, the various linguistic constructs
like words, expressions, statements etc. are formed from the elements of a set of
characters. This set is called the alphabet set of the language.
For each type of terms, viz, numbers, atoms, variables and structures, there are
different rules to build the type of terms. Next we discuss these rules.
Numbers: How numbers are represented in PROLOG is illustrated through the
following examples of representation of numbers:
8 7 –3.58 0 87.6e2 35.03e–12
In the above, the “e” notation is used to denote a power of 10. For example, 87.6e2
denotes the number 87.6 × 10² or 8760. The term 35.03e–12 denotes 35.03 × 10⁻¹².
Atom: an atom is represented by
(i) either a string in which the first symbol is a lower-case letter and other
characters in the string are letters, digits and underscores (but no sign
character)
(ii) a string or a sequence of characters from the alphabet set (including sign
characters) enclosed between apostrophes.
(iii) all special symbols like “?-” and “:-” are also atoms in PROLOG
Examples of Atoms
(i) circle (ii) b (iii) =(equal to sign) (iv) _(underscore)
(v) ‘→’ (vi) _beta (vii) mohan (viii) 3
(ix) abdul_kalam(uses underscore)
(x)‘abdul-kalam’
(uses hyphen. A hyphen is not allowed to be a part of an unquoted atom, but within single quotes
all characters of the alphabet are allowed)
(xi) ‘Anand Prakash’
(the blank symbol is not allowed within an unquoted atom, but the whole sequence of
characters in ‘Anand Prakash’, including the single quotes, represents a single atom.)
The following are not atoms
(i) 3mohan (starts with a number)
(ii) abdul-kalam (has hyphen in-between)
(iii) Abdul_kalam (starts with a capital letter)
Structure: As mentioned earlier, a structure is of the form:
Predicate (parameter list)
where parameter list consists of atoms and variables separated by commas.
Term: We have already mentioned that a term is either a constant or a structure. And,
we have also already discussed representations of constants and structures.
Fact with Compound Statement: For expressing facts and relations, the syntax of
PROLOG does not allow arbitrary clauses but only Horn Clauses. However, in
PROLOG goals may be conjuncted. And, we know a Horn Clause can have at most
one positive literal. Therefore, the following single (non-atomic) statement of
predicate calculus
is_tall (mohan).
is_richer (aslam, john).
In the rule
a4:- a1, a2, a3, a5.
a4 is the Head of the rule and the R.H.S of ‘:-’ , i.e., ‘a1, a2, a3, a5’ is the body of the
rule.
The Head of a rule represents a goal to be achieved and the Body represents one or
more subgoals (each represented by a single (atomic) structure), each of which
must be achieved for the goal represented by the Head to be considered
achieved.
Query/Question Statements:
We mentioned earlier that for representing facts and rules in PROLOG, we are
restricted to using only Horn Clauses. However, the solving of a problem using a
PROLOG system, requires asking questions, any one of which may be an atomic goal
or may be a conjunct of more than one atomic goal. In the latter case, the conjunct of
atomic goals representing the (composite) question is represented in Prolog by
writing the atomic goals separated by commas.
For example, if we want to know whether ‘Mohan is tall and Aslam is richer than
John?’ then the question may be stated in PROLOG as
?- is_tall (mohan), is_richer (aslam, john).
A PROLOG program consists of a finite sequence of facts, rules and a query or goal
statement.
In order to discuss solutions of problems with a PROLOG system, in addition to what
we have discussed so far we need to discuss in some detail the representation of
arithmetic facts, rules and goals. Also, we need to discuss in some detail data
structures in PROLOG. These topics will be taken up later on. Next, we discuss how
a PROLOG system solves a problem.
We have also mentioned earlier that the solution of a problem through a PROLOG system
depends on
(i) the contents of the PROLOG database, i.e., the facts and rules given to the system, and
(ii) the PROLOG inferencing system, which mainly consists of three mechanisms, viz.,
(a) Backtracking,
(b) Unification,
(c) Resolution.
If a PROLOG database does not contain sufficient relevant facts and rules in respect
of a particular query, then the PROLOG system will say ‘fail’ or ‘no’, even if the facts
in the query are true in everyday life.
For example: Suppose that ‘Sita is a sister of Mohan’ is a fact in the real world.
However, suppose that the database contains none of the following:
(i) is_sister (sita, mohan).
(ii) parents (sita, m, f).
(iii) parents (mohan, m, f).
(iv) mother (sita, m) and mother (mohan, m).
(v) father (sita, f) and father (mohan, f).
More generally, if we are not given any set of statements or relations from which we
can conclude that Sita is a sister of Mohan, then the PROLOG system would answer ‘no’ to the
query:
?- is_sister (sita, mohan).
Further, even if all the statements given under (ii) to (v) above are in the database, but
the following rule
(vi) is_sister (X, Y):- female (X),
parents (X, M, F), parents (Y, M, F).
is not given in the PROLOG database, then also the system would answer with something
like: “Sita is a sister of Mohan” is not true.
Further, even if the statements given under (ii) and (iii) and even rule (vi) are in the
database, but some statement equivalent to the fact
female (sita).
is not given, then also the system would answer with something like: “Sita is a sister of
Mohan” is not true.
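The effect of a missing fact can be tried out on a small database. The sketch below is only an illustration: it contains statements (ii) and (iii) and rule (vi) from the example above and deliberately leaves out female (sita). It is written in standard PROLOG syntax, which does not allow a space between a functor and its opening parenthesis. The dynamic declaration is an addition made here so that a call to female with no facts simply fails instead of raising an error in some systems.

```prolog
% Statements (ii) and (iii) and rule (vi); female(sita) is deliberately absent.
:- dynamic female/1.   % lets female/1 be called (and fail) with no facts present

parents(sita, m, f).
parents(mohan, m, f).

is_sister(X, Y) :-
    female(X),
    parents(X, M, F),
    parents(Y, M, F).

% ?- is_sister(sita, mohan).
% fails, because female(sita) cannot be proved from the database.
%
% ?- assertz(female(sita)), is_sister(sita, mohan).
% succeeds once the missing fact has been added.
```

The system answers only from what the database contains; whether the statement is true in the real world plays no part.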
Let us assume the PROLOG database is complete in the sense that the database
contains all the required facts and rules about the real world that are sufficient to
answer any relevant query. Then, we discuss how the PROLOG system solves a
problem under consideration. In order to explain the in-built mechanism of the PROLOG
system, we start with the particular database given below and some queries, and then
generalize the results of our discussion.
Given Database
(i) female (sita).
(ii) female (zarina).
(iii) female (sabina).
(iv) female (jane).
(v) is_sister (sita, sarita).
(vi) is_sister (anita, anil).
(vii) parents (sita, luxmi, raj).
(viii) parents (sarita, luxmi, raj).
(ix) parents (sabina, roshnara, kasim).
(x) parents (isaac, mary, albert).
(xi) parents (aslam, roshnara, kasim).
(xii) parents (zarina, roshnara, kasim).
(xiii) father (jane, albert).
(xiv) father (john, albert).
(xv) father (phillips, albert).
(xvi) mother (jane, mary).
(xvii) mother (john, mary).
(xviii) mother (phillips, ann).
(xix) is_sister (X, Y):- female (X), parents (X, M, F), parents (Y, M, F).
(xx) is_sister (X, Y):- female (X), mother (X, M), mother (Y, M),
father (X, F), father (Y, F).
(jane is a half sister of phillips and, according to the definition in the PROLOG
database, jane is not a sister of phillips.)
But to explain some important points we consider rule (xx) and do not consider the
last rule.
Query No. 1 For the database given above, consider the query:
?- is_sister (anita, anil).
The PROLOG system starts from the top of the database and attempts to match the functor
is_sister of the query with the functor female of the first statement. The two do not match
(both being constants, they must be identical for matching). Then the PROLOG system passes
to the next statement and attempts to match is_sister with the functor, also female, of
the second statement in the database. In this case also, the corresponding functors do
not match. Similarly, the functor in the query does not match the functor in each of the
next two statements.
The PROLOG system passes to the next (i.e., fifth) statement. The functors in the query
and the fifth statement are identical. Hence, the system attempts to match, one
by one, the rest of the parts of the query with the corresponding parts of the fifth
statement. The first argument, viz., anita, of the query does not match the first
argument, viz., sita, of the fifth statement (being constants, they are required to be
identical for matching).
Hence, the PROLOG system passes to sixth statement in the database. The various
components of the query match (in this case are actually identical) to the
corresponding components of the sixth statement. Hence, the query is answered as
‘yes’.
Query No. 2 For the same database, consider the query:
?- is_sister (sabina, aslam).
The PROLOG system attempts to match the functor is_sister of the query with the
functors, one by one, of the facts and rules of the database. The first possibility of a
match is in Fact (v), which also has the functor is_sister. However, the first argument of
the functor in Fact (v) is sita, which does not match sabina, the corresponding
argument of the query. Hence, the PROLOG system attempts to match with Fact (vi), which also
does not match the query. The PROLOG system is not able to find any appropriate
match up to Fact (xviii).
However, the functor is_sister in the query matches the functor is_sister of the
L. H. S. of the Rule (xix). In other words, PROLOG system attempts to match the
constant sabina given in the query with the variable X given in the rule (xix). Here
matching takes a different meaning called unification.
Unification of two terms, out of which at least one is a variable (i.e., the first letter
of the term is an upper case letter), is defined as follows:
(i) If the term other than the variable term is a constant, then the constant value is
temporarily associated with the variable. And further throughout the statement or
rule, the variable is temporarily replaced by the constant.
(ii) If both the terms are variables, then both are temporarily made synonym in the
sense any value associated with one variable will be assumed to be associated
temporarily with the other variable.
In this case, the variable X is temporarily associated with the constant sabina for all
occurrences of X in rule (xix).
Next, the PROLOG system attempts to match the second arguments of the functor is_sister of
the query and of the L. H. S. of rule (xix). Again, this matching is a case of unification, with
Y being temporarily associated with the constant aslam throughout rule (xix).
And, we know the symbol ‘:-’ stands for ‘if ’. Thus, in order to satisfy the fact (or to
answer whether) sabina is a sister of aslam, the PROLOG system needs to check three
subgoals, viz., female (sabina), parents (sabina, M, F) and parents (aslam, M, F).
Before starting to work on these three subgoals, the PROLOG system marks the
rule (xix) for future reference or for backtracking (to be explained) to some
earlier fact or rule.
The first subgoal: female (sabina) is trivially matched with fact (iii).
In view of the fact that except sabina the other two arguments in the next (sub)goal viz
parents (sabina, M, F) are variables, the subgoal parents (sabina, M, F) is actually a
sort of question of the form:
Who are the parents of sabina?
For the satisfaction of this subgoal, the PROLOG system again starts the exercise of
matching/unification from the top of the database, i.e, from the first statement in the
database. The possibilities are with facts (vii), (viii),….., (xii), in which the functor
parents of the subgoal occurs as the functor of the facts (vii), (viii),….., (xii).
However, the first arguments of the functors in (vii) and (viii) do not match the first
argument of parents in the subgoal. Hence the PROLOG system proceeds further to
Fact (ix) for matching. At this stage the subgoal is
parents(sabina, M, F).
and the fact (ix) is
parents (sabina, roshnara, kasim).
As explained earlier, M and F, being upper case letters, represent variables. Hence, the
exercise of matching becomes an exercise of unification. In the way we have
explained earlier, the constant roshnara gets temporarily (for the discussion of rule
(xix) only) associated with the variable M and the constant kasim gets associated
with the variable F. As a consequence, the next goal parents (aslam, M, F) becomes
the (sub)goal parents (aslam, roshnara, kasim). To satisfy this subgoal, the PROLOG
system again starts the process of matching from the first statement in the PROLOG
database. Ultimately, after failing to match the first ten facts in the database, the
subgoal is satisfied by matching Fact (xi). As all three subgoals of rule (xix) are
now satisfied, the query is answered as ‘yes’.
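For readers who wish to try the two queries on a PROLOG system, the database can be typed in roughly as follows. This is a sketch under a few assumptions: the code uses standard syntax (no space between a functor and its opening parenthesis, and all constants in lower case), rules (xix) and (xx) are written as they are used in the discussion and in the answers to the exercises, and the discontiguous directive is added only because the clauses of is_sister are not placed together.

```prolog
:- discontiguous is_sister/2.   % facts (v), (vi) and rules (xix), (xx) are apart

female(sita).                         % (i)
female(zarina).                       % (ii)
female(sabina).                       % (iii)
female(jane).                         % (iv)
is_sister(sita, sarita).              % (v)
is_sister(anita, anil).               % (vi)
parents(sita, luxmi, raj).            % (vii)
parents(sarita, luxmi, raj).          % (viii)
parents(sabina, roshnara, kasim).     % (ix)
parents(isaac, mary, albert).         % (x)
parents(aslam, roshnara, kasim).      % (xi)
parents(zarina, roshnara, kasim).     % (xii)
father(jane, albert).                 % (xiii)
father(john, albert).                 % (xiv)
father(phillips, albert).             % (xv)
mother(jane, mary).                   % (xvi)
mother(john, mary).                   % (xvii)
mother(phillips, ann).                % (xviii)

% Rule (xix): same parents, given through parents/3.
is_sister(X, Y) :- female(X), parents(X, M, F), parents(Y, M, F).

% Rule (xx): same mother and same father, given through mother/2 and father/2.
is_sister(X, Y) :- female(X), mother(X, M), mother(Y, M), father(X, F), father(Y, F).

% ?- is_sister(anita, anil).     Query No. 1: answered through Fact (vi)
% ?- is_sister(sabina, aslam).   Query No. 2: answered through rule (xix)
```

Both queries succeed. The first needs only a fact, whereas the second needs rule (xix) together with Facts (iii), (ix) and (xi).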
Remarks: In respect of the above database, the relation of is_sister is not reflexive
(i.e., is_sister (X, X) is not true for all members of the database), i.e., no male is his
own sister. However, for the set of all females, the relation is reflexive.
Remarks: In respect of the above database, the relation of is_sister is also not
symmetric (i.e., if is_sister (X, Y) is true then it is not necessary that is_sister (Y, X)
must be true). Any female X may be a sister of a male Y, but the reverse is not true,
because the condition female (Y) will not be satisfied. But the relation is symmetric
within the set of all females.
Remarks: However, the relation of is_sister is (fully) transitive (i.e., if is_sister (X, Y)
is true and if is_sister (Y, Z) is true, then is_sister (X, Z) must be true).
Ex 1: Query No. 3 With the same database, discuss how the following query is
answered by the Prolog system:
?- is_sister (aslam, sabina).
Ex 2: Query No. 4 With the same database, discuss how the following query is
answered by the Prolog system:
?- is_sister (jane, john).
Ex 3: Query No. 5 With the same database, discuss how the following query is
answered by the Prolog system:
?- is_sister (jane, phillips).
2.5 BACKTRACKING
Consider the query:
?- is_sister (sabina, Y), is_sister (Y, zarina).
The query, when translated into English, becomes: ‘Find the names (denoted by Y) of all
those persons for whom sabina is a sister and who, further, are sisters of
zarina.’
In order to answer the above query, the first solution which the PROLOG system
comes with after searching the database is : ‘Associate sabina with Y’, i.e., sabina is
one of the possible answers to the query (which is a conjunct of two propositions). In
other words, associate Y with sabina, which, according to the facts and rules given in
the database, satisfies is_sister (sabina, sabina) and is_sister (sabina, zarina).
If we are interested in more than one answer, which is possible in this case, then,
after the answer ‘sabina’, the user should type the symbol ‘;’ (i.e., type a semi-colon).
Typing a semi-colon followed by return serves as a direction to the PROLOG system to
search the database for an alternative solution, resuming from the point marked most
recently.
In order to prevent the PROLOG system from attempting to find the same
answer: sabina again, the system puts markers – one on Rule (xix) and after that
on Fact (ix) (in that order).
Once the instruction from user through semi-colon is received to find another solution,
the system proceeds to satisfy Rule (xix) from Fact (x) (including) onwards. Then
through Fact (xi), for the first occurrence of Y, aslam is associated. The variable X is
already associated with constant sabina. Then aslam replaces Y in the rule (xix).
Next, PROLOG system attempts to satisfy the second subgoal which, at present, is of
the form:
is_sister (aslam, zarina).
For satisfying this goal, the PROLOG system searches the database from the top
again. Again rule (xix) is to be used. For satisfying L.H.S. of (xix) the subgoal on
R.H.S. of (xix) need to be satisfied. The first subgoal on R. H. S. of (xix) is to satisfy
the fact: female (aslam), which is not satisfied.
At this stage, the PROLOG system goes back to the association i.e. aslam to Y, and
removes this association of Y with aslam. Next, the PROLOG system attempts to
associate some other value to Y, further from the point where Y was associated with
aslam. This is what is meant by Backtracking.
Next, through Fact (xii), zarina is associated with Y, and sabina is already associated
with X. Thus, the subgoal to be searched becomes is_sister (zarina, zarina). This
goal can be satisfied, because the three subgoals for this goal, viz., female (zarina),
parents (zarina, M, F) (where zarina is associated with X) and parents (zarina, M, F)
(where zarina is associated with Y), can easily be seen to be satisfiable from the
database. Hence, the second answer which the system gives is zarina.
Again, the user may seek another answer to the query by typing ‘;’. The system
has already marked Fact (xii) in the database while finding the answer zarina.
Therefore, for another answer, the PROLOG system starts its search from the next statement,
i.e., Fact (xiii). It can easily be seen that the PROLOG system will not find
any more answers. Hence, it returns ‘fail’ or ‘NO’.
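The sequence of answers obtained by typing ‘;’ repeatedly can also be collected in one step. The sketch below assumes only the part of the database that matters for this query, together with rule (xix); the predicate all_answers is a hypothetical helper introduced here, and findall is the standard predicate that gathers every solution found through backtracking.

```prolog
female(zarina).
female(sabina).
parents(sabina, roshnara, kasim).    % Fact (ix)
parents(aslam, roshnara, kasim).     % Fact (xi)
parents(zarina, roshnara, kasim).    % Fact (xii)

% Rule (xix)
is_sister(X, Y) :- female(X), parents(X, M, F), parents(Y, M, F).

% Collects, in order, every answer that typing ';' would produce.
all_answers(Ys) :-
    findall(Y, (is_sister(sabina, Y), is_sister(Y, zarina)), Ys).

% ?- all_answers(Ys).
% Ys = [sabina, zarina].
```

The candidate aslam is generated for Y between the two answers but is discarded, because female(aslam) fails and the system backtracks.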
We have already discussed the concepts of atom and number. These are the only two
elementary data types in PROLOG. We also mentioned how atoms and numbers are
represented in PROLOG.
We next discuss how symbolic data is structured out of the two elementary data types
viz. atoms and numbers. PROLOG has mainly two data structures:
(i) List
(ii) Structure.
Structure
Ex 4: Give the information about the book: Computer Networks, Fourth Edition by
Andrew S. Tanenbaum published by Prentice Hall PTR in the year 2003 as a structure
in PROLOG. Further, write a query in PROLOG to know the name of the author,
assuming the author’s name is not given
In general, the structure of a sentence of the form given above may be expressed in
PROLOG as
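As an illustration, the book of Ex 4 could be recorded as a structure along the following lines. This is one possible sketch, not the only answer: the functor book and the component names title, edition, author, publisher and year are choices made here, and text containing capitals or spaces is written in single quotes so that it is read as an atom and not as a variable.

```prolog
% One possible representation; functor and component names are illustrative.
book(title('Computer Networks'),
     edition(fourth),
     author('Andrew S. Tanenbaum'),
     publisher('Prentice Hall PTR'),
     year(2003)).

% Asking for the author; the anonymous variable '_' stands for components
% whose values are not of interest.
% ?- book(title('Computer Networks'), _, author(Name), _, _).
% Name = 'Andrew S. Tanenbaum'.
```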
List: The list is a common data structure which is built-in in almost all programming
languages, specially programming languages meant for non-numeric processing.
Informally, list is an ordered sequence of elements and can have any length (this is
how a list differs from the data structure array; array has a fixed length). The
elements of a list may be any terms (i.e., constants, variables and structures) and even
other lists. Thus, list is a recursive concept.
(Note that the last two are different lists, and also note that the last list has two
elements, viz., 1 and [2,3].)
provided X + Y is defined. For example, if X and Y are numbers, then, as we shall see
later, X + Y is a valid expression.
Also, [_ , _, Y] is a list of three elements, of which the first two are ‘don’t care’ or
anonymous variables.
I. The operation denoted by the vertical bar, i.e., ‘|’, is used to associate the Head and Tail
of a list with two variables, as described below:
If [X | Y] = [a, b, c],
then X is associated with the element a, the first element of the list, and Y is
associated with the list [b, c], obtained from the given list by removing its first
element.
If [X | Y] = [[a, b], c],
then X is associated with the list [a, b], the first element of the list on the R.H.S., and
Y is associated with [c].
If [X | Y] = [[a, b, c]], then
X = [a, b, c] and Y = [ ]
(vi) The operator ‘|’ is not defined for the list [ ], having no elements.
(vii) If [a + b, c + d] = [X, Y],
then X is associated with a + b and Y is associated with c + d.
(Note the difference in the response because the vertical bar is replaced by a comma.)
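The behaviour of the vertical bar can be checked directly on a PROLOG system. The following sketch gathers a few such checks into one hypothetical predicate, list_demo, introduced here only for illustration; the operator == tests that a variable has received exactly the stated value, and \+ succeeds when the goal after it fails.

```prolog
% Each line is one use of the list notation; list_demo succeeds only if all hold.
list_demo :-
    [X1 | Y1] = [a, b, c],      X1 == a,         Y1 == [b, c],
    [X2 | Y2] = [[a, b], c],    X2 == [a, b],    Y2 == [c],
    [X3 | Y3] = [[a, b, c]],    X3 == [a, b, c], Y3 == [],
    \+ [_ | _] = [],                   % the empty list has no Head and no Tail
    [a + b, c + d] = [X4, Y4],  X4 == a + b,     Y4 == c + d.

% ?- list_demo.
% succeeds.
```

Note that a + b is not evaluated here; X4 is associated with the structure a + b itself.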
II. Member Function: The function is used to determine whether a given argument X
is a member of a given list L. Though the member function is a built-in function in
almost every implementation of PROLOG, yet the following simple (recursive)
program in PROLOG for member achieves the required effect:
member (X, [X | _ ]). (i)
member (X, [_ | Y]):- member (X, Y). (ii)
Definition of member under (i) above says that if first argument of member is Head of
the second argument, then predicate member is true. If not so, then, go to definition
under (ii). The definition of member under (ii) above says that, in order to find out
whether first argument X is a member of second argument then find out whether X is
a member of the list obtained from the second argument by deleting the first element
of the second argument. From the above definition, it is clear that the case
member (X, [ ]) is not a part of the definition. Hence, the system returns FALSE.
Next, we discuss an example to explain how the PROLOG system responds to queries
involving the member function.
Example 3: Consider the query
?- member (pascal, [prolog, fortran, pascal, cobol]).
The PROLOG system first attempts to verify (i) in the definition of ‘member’, i.e., the
system matches pascal with the Head of the given list [prolog, fortran, pascal, cobol],
i.e., with prolog. The two constants are not identical; hence, (i) fails. Therefore, the
PROLOG system uses rule (ii) of the definition to solve the problem. According to
rule (ii), the system attempts to check whether pascal is a member of the tail [fortran,
pascal, cobol] of the given list.
Again, fact (i) of the definition is applied to the new list, i.e., [fortran, pascal, cobol], to
check whether pascal belongs to it. Here pascal does not match the Head, i.e., fortran, of
the new list. Hence, rule (ii) is applied to the new list. By rule (ii), pascal is a member
of the list [fortran, pascal, cobol] if pascal is a member of the tail [pascal, cobol] of
the current list.
Again, fact (i) is applied to check whether pascal is a member of the current list
[pascal, cobol]. According to fact (i), for pascal to be a member of the current list,
pascal should be the Head of the current list, which is actually true. Hence, the
PROLOG system returns ‘yes’.
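Goal   X        [_ | Y]                               Comment
1.     pascal   [prolog | [fortran, pascal, cobol]]   pascal is not identical to the Head of the list. Hence apply rule (ii).
2.     pascal   [fortran | [pascal, cobol]]           pascal is not identical to the Head of the list. Hence apply rule (ii).
3.     pascal   [pascal | [cobol]]                    pascal is identical to the Head of the list. Fact (i) is satisfied and the system returns ‘yes’.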
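The trace can be repeated on a PROLOG system with the two-clause definition described above. In the sketch below the predicate is named my_member, a hypothetical name chosen here so that it does not clash with the member predicate already supplied by most systems.

```prolog
% my_member/2 follows the two-clause definition of member.
my_member(X, [X | _]).                        % (i)  X is the Head of the list
my_member(X, [_ | Y]) :- my_member(X, Y).     % (ii) otherwise search the Tail

% ?- my_member(pascal, [prolog, fortran, pascal, cobol]).   succeeds
% ?- my_member(pascal, [prolog, fortran, cobol]).           fails
% ?- my_member(pascal, []).                                 fails
```

The last query fails because neither clause has the empty list as its second argument, which is the case noted in the explanation of the definition.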
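III. Append Function: The append function takes two lists and returns a single list such that the elements of
the first list followed by the elements of the second list constitute the elements of the
returned list. For example, appending [c, fortran] to [prolog, lisp] returns the list
[prolog, lisp, c, fortran]. The following (recursive) PROLOG program defines append:
append ([ ], X, X). (i)
append ([Head | Tail], Y, [Head | Z]):- append (Tail, Y, Z). (ii)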
Explanation: The above PROLOG program states that append takes three arguments,
each of which should be a list. The result of appending the elements of the lists in the
first and second arguments is stored in the list represented by the third argument.
Further, by (i) of the definition of append, the result of appending a list X in the
second argument to an empty list in the first argument is the same list X as given in the
second argument, but written in the third argument.
However, if the first argument is not [ ], then rule (ii) above applies.
According to rule (ii) above, the Head of the list in the first argument becomes the Head of the
resultant list (i.e., of the third argument) and the tail of the resultant list is obtained by
appending the tail of the first argument to the list given as the second argument of
append. Executing append according to the above definition is a recursive process. In
each successive step, the size of the list in the first argument is reduced by one and
finally the list in the first argument becomes [ ]. In the last case, fact (i) is applicable
and the process terminates.
Next, we explain how the PROLOG system responds to a query involving append, say,
?- append ([prolog, lisp], [c, fortran], X).
As the first list is not [ ], the system associates prolog with Head and [lisp] with
Tail. Then, by rule (ii), prolog becomes the Head of the resultant list and after that the
PROLOG system attempts to append the list [lisp] with [c, fortran], and the result will
form the tail of the originally required resultant list. Next, lisp is the Head of the new first
argument and [ ] is the tail of the new first argument. However, the second argument
remains unchanged. The result of the second append is a list whose first element is
prolog, second element is lisp and the rest of the elements of the resultant list will be
obtained by appending [ ] to [c, fortran]. In this case, however, fact (i) is applicable,
which returns [c, fortran] as the result of appending [ ] to [c, fortran], and which will
form the tail of the resultant list. Finally, the resultant list is [prolog, lisp, c, fortran].
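The recursive process described above may be tried with the following sketch. It assumes the usual two-clause formulation of append that matches the explanation (a fact for the empty first list and a rule that moves the Head across); the name my_append is hypothetical and is used only to avoid a clash with the append predicate supplied by most systems. The language name is written as c in lower case, since a name beginning with a capital letter would be read as a variable.

```prolog
my_append([], X, X).                                                 % (i)
my_append([Head | Tail], Y, [Head | Z]) :- my_append(Tail, Y, Z).    % (ii)

% ?- my_append([prolog, lisp], [c, fortran], X).
% X = [prolog, lisp, c, fortran].
%
% The same program also splits a list when the third argument is given:
% ?- my_append(X, Y, [a, b]).
% X = [], Y = [a, b] ;  X = [a], Y = [b] ;  X = [a, b], Y = [].
```

The second query shows why append can serve as the basis for other list predicates: on backtracking it enumerates every way of splitting the given list into two parts.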
IV. Prefix Function: prefix (X, Z) returns true if X is a sublist of Z such that X is
either [ ] or X consists of consecutive elements of the list Z, starting from
the first element of Z. Otherwise, prefix (X, Z) returns ‘No’ or ‘False’.
Further explain, justify and trace your program with an example.
The PROLOG program prefix is just a one-statement program:
prefix (X, Z):- append (X, Y, Z).
returns yes. The processing for the response, by the PROLOG system may be
described as follows:
Let us consider another query:
?- prefix (X, [a, b, c]).
In response to the query, the PROLOG system first returns X as [ ]. Then, if the user
types ‘;’ to find another answer, the PROLOG system returns X as the list [a]. If
yet another answer is required, then the PROLOG system returns X as the list [a, b]. If
still another answer is required, the answer [a, b, c] is returned. Finally, if still another
response is required by the user, then the system responds with a ‘No’.
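The successive answers to this query can be collected at once, as in the sketch below. The names my_append, my_prefix and all_prefixes are hypothetical names used here to avoid clashes with predicates supplied by a PROLOG system, and the anonymous variable replaces the variable Y of the one-statement program, because its value is never used.

```prolog
my_append([], X, X).
my_append([Head | Tail], Y, [Head | Z]) :- my_append(Tail, Y, Z).

% X is a prefix of Z if something can be appended to X to give Z.
my_prefix(X, Z) :- my_append(X, _, Z).

% Collects the answers in the order in which ';' would produce them.
all_prefixes(Z, Xs) :- findall(X, my_prefix(X, Z), Xs).

% ?- all_prefixes([a, b, c], Xs).
% Xs = [[], [a], [a, b], [a, b, c]].
```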
Let Term1 and Term2 be two terms of PROLOG, where a term may be a constant, a
variable or a structure. Then the success or failure of the goal
?-Term1 = Term2.
is discussed below:
Case I: Both Term1 and Term2 are constants and, further, if both are identical then
the goal succeeds, and, if not identical, then the goal fails.
Examples: The query
?-mohan = mohan succeeds
?-1287 = 1287 succeeds
?-program = programme fails
?- 1287 =1289 fails.
Case II: One of the terms is a variable and if the variable is not instantiated then the
goal always succeeds
For example
?-brother (mohan, sita) = X.
Then the goal succeeds and the variable X is instantiated to the term
brother (mohan, sita)
Case III: One of the terms is a variable and the variable is instantiated:
Then apply ‘=’ recursively to the case which is obtained by replacing the
variable by its instantiation. For example, consider the query
?- X = tree.
If X is already instantiated to tree, then replace X by tree in the given query, which
takes the new form:
?- tree = tree.
On encountering this new query, the PROLOG system returns success. However, if X is
already instantiated to any other constant, say flower, then the new query becomes:
?- flower = tree.
Applying Case I above, we get Fail.
Case IV: If both terms are variables, say, the query is of the form:
?- X = Y.
then
(a) if both are uninstantiated, the query succeeds; but, if during further processing
one of X or Y gets instantiated to some term, then the other variable also gets
instantiated to the same term.
(b) if one or both are instantiated, then replace X and/or Y by their respective
instantiations to generate a new query and apply ‘=’ to the new query.
Case V: Both terms are structures: Two structures satisfy ‘=’ if they have the same
functor and the same number of components and, further, if the definition of ‘=’ is
applied recursively to the corresponding components, then each pair of corresponding
components satisfies ‘=’.
For example:
Then the above goal succeeds and X is instantiated to sita and Y is instantiated to
mohan
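The five cases can be checked on a PROLOG system with queries such as those gathered below. This is a sketch only: the predicate unify_demo and the structure sister used for Case V are hypothetical examples chosen here, and \+ succeeds exactly when the goal following it fails.

```prolog
unify_demo :-
    mohan = mohan,                            % Case I: identical constants
    \+ program = programme,                   % Case I: different constants
    brother(mohan, sita) = X,                 % Case II: X is uninstantiated
    X == brother(mohan, sita),
    Y = tree, Y = tree, \+ Y = flower,        % Case III: Y already instantiated
    P = Q, P = leaf, Q == leaf,               % Case IV: P and Q co-refer
    sister(S, mohan) = sister(sita, B),       % Case V: component by component
    S == sita, B == mohan,
    \+ f(A, A) = f(a, b).                     % one variable cannot take two values

% ?- unify_demo.
% succeeds.
```

The last line illustrates a consequence of Case V and Case III together: once A is instantiated to a through the first components, the second components require a = b, which fails.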
The language PROLOG has been designed mainly for symbolic processing for A.I.
applications. However, some facilities for numeric processing are also included in
PROLOG. For this purpose, the arithmetic operations, viz.,
+ (addition), - (subtraction), * (multiplication), / (division) and ^
(exponentiation)
are built-in and are used in infix notation.
The ‘is’ operator
It may be noted that in PROLOG, the expression ‘3 + 7’ is not the same as the
number 10. The operation of ‘+’ does not execute automatically. For the purpose of
execution of an operation, PROLOG provides the operator ‘is’.
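The difference between ‘=’ and ‘is’ can be seen with the following sketch. The predicate is_demo is a hypothetical name used only for this illustration; =:= is the standard operator that evaluates both sides and compares the resulting numbers.

```prolog
is_demo :-
    X = 3 + 7,            % '=' only associates X with the structure 3 + 7
    X == 3 + 7,
    \+ X == 10,           % X is not the number 10
    Y is 3 + 7,           % 'is' evaluates the expression on its right
    Y =:= 10,
    Z is 2 * 3 - 4 / 2,   % usual precedence: (2 * 3) - (4 / 2)
    Z =:= 4.

% ?- is_demo.
% succeeds.
```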
Note: In the above definition, each of Number, New, Partial and Result, denotes a
variable.
Explanation of PROLOG program for factorial: If the given number is 0, then its
factorial is 1. Further, result of computation for factorial of any number Number will
be associated with the variable Result and the goal can be achieved through the
following four subgoals:
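A definition that is consistent with the note and the explanation above may be sketched as follows. It is an assumption about the form of the program, not a reproduction of it: the predicate name factorial and the exact subgoals are chosen here, while the variable names Number, New, Partial and Result are those mentioned in the note.

```prolog
factorial(0, 1).                   % the factorial of 0 is 1
factorial(Number, Result) :-
    Number > 0,                    % subgoal 1: the number is positive
    New is Number - 1,             % subgoal 2: compute Number - 1
    factorial(New, Partial),       % subgoal 3: factorial of Number - 1
    Result is Number * Partial.    % subgoal 4: multiply to get the result

% ?- factorial(5, F).
% F = 120.
```

The test Number > 0 keeps the rule from being applied to 0 or to negative numbers, so the recursion always ends at the fact for 0.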
In some situations, as discussed in the following example, if the goal is met once, the
problem is taken as solved, without any need for further iterations.
However, we know that the PROLOG system attempts to re-satisfy a goal, even after
satisfying the goal once. For example, for the following one-statement PROLOG
program
prefix (X, Z):- append (X, Y, Z).
if the query is
?- prefix (X, [a, b]).
then first X is associated with [ ], i.e., X = [ ] is given as an answer to the query.
In the next iteration, the PROLOG system associates the list [a] with X, i.e., X = [a].
In the still next iteration, the PROLOG system responds with
X = [a, b].
And finally, the system responds with a ‘no’.
Consider a rule of the form
trial:- a, b, c, !, d, e, f, g.
The above rule says that the predicate trial succeeds if the subgoals on the R.H.S. succeed. The
PROLOG system may backtrack among the subgoals a, b and c as long as it is required
by the system to answer the query, for the predicates a, b and c. The change in the
behaviour of the PROLOG system due to the presence of ‘!’ occurs only after the subgoal
c succeeds and the PROLOG system encounters the cut symbol ‘!’. The PROLOG system
always succeeds on the cut operator represented by ‘!’. And hence the PROLOG
system attempts to satisfy the subgoal d. Further, if d succeeds then backtracking may
occur among d, e, f and g.
In the process, it is possible that several backtrackings may occur among a, b and c
and several (independent) backtrackings may occur among d, e, f and g. However,
once d fails and the system moves back to the operator ! from the right, no more attempts will be
made to re-satisfy c on the left of the operator !, and the goal trial fails.
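The effect of the cut on backtracking can be observed with a small sketch. The facts a and d and the two-argument rule trial below are hypothetical, chosen here so that the answers on each side of the cut are visible; the rule has the same shape as the one discussed, with one subgoal before ‘!’ and one after it.

```prolog
a(1).  a(2).
d(p).  d(q).

% Once a(X) has succeeded and '!' is crossed, the choice X = 1 is frozen.
trial(X, Y) :- a(X), !, d(Y).

trial_answers(L) :- findall(X-Y, trial(X, Y), L).

% ?- trial_answers(L).
% L = [1-p, 1-q].
```

Backtracking still takes place to the right of the cut, which gives both p and q for Y, but the alternative a(2) to the left of the cut is never tried.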
Example 6: We define below the membership for a set (in a set, each element occurs
only once). Hence, once a particular element is found to occur, there is no need
for further testing for the occurrence of the element in the set. The predicate member, in
this sense, may be defined as
member (X, [X | _ ]):- !. (i)
member (X, [_ | Y]):- member (X, Y). (ii)
The statement (i) above says that if the element X is the Head of the list (i.e., X is the
first element of the list) then the R.H.S. must succeed. However, the R.H.S. consists of
only the cut symbol ‘!’, which always succeeds. Further, once the cut symbol has been
crossed, the PROLOG system is not allowed to go back to its left. Hence, the
PROLOG system, after succeeding on ‘!’, does not search for any further occurrence of
X in the list, because of the
restriction that the system is not allowed to backtrack past ‘!’.
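The difference made by the cut in statement (i) can be measured by counting the answers produced with and without it. In the sketch below, set_member, list_member and count_solutions are hypothetical names introduced for the comparison; findall and length are standard predicates.

```prolog
% Version with cut, as in Example 6.
set_member(X, [X | _]) :- !.
set_member(X, [_ | Y]) :- set_member(X, Y).

% Version without cut, for comparison.
list_member(X, [X | _]).
list_member(X, [_ | Y]) :- list_member(X, Y).

% N is the number of ways in which Goal can be satisfied.
count_solutions(Goal, N) :- findall(x, Goal, L), length(L, N).

% ?- count_solutions(set_member(a, [a, b, a]), N).     N = 1
% ?- count_solutions(list_member(a, [a, b, a]), N).    N = 2
```

With the cut, the search stops at the first occurrence of a; without it, the system goes on to find the second occurrence as well.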
Example 7: One of the possible PROLOG programs for finding the maximum of two
given numbers is the following two-statement program:
max (X, Y, X):- X >= Y. (i)
/* third argument is for the variable supposed to be associated with the result */
max (X, Y, Y):- X < Y. (ii)
But we know that if X >= Y then X is the maximum of the two, otherwise Y is the
maximum. In other words, the comparison X < Y is not required if the first rule fails.
This wastage of effort may be saved by using the following program for maximum,
instead of the one given through (i) and (ii) above.
The more efficient program for the required solution is
max (X, Y, X):- X >= Y, !.
max (X, Y, Y).
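The two programs may be compared with the following sketch, in which max_plain and max_cut are hypothetical names given here to the two versions so that both can be loaded together.

```prolog
% Two-test version: statements (i) and (ii) of Example 7.
max_plain(X, Y, X) :- X >= Y.
max_plain(X, Y, Y) :- X < Y.

% Version with cut: the second comparison is not made.
max_cut(X, Y, X) :- X >= Y, !.
max_cut(_, Y, Y).

% ?- max_plain(3, 5, M).    M = 5
% ?- max_cut(3, 5, M).      M = 5
% ?- max_cut(7, 2, M).      M = 7
```

One point of caution about the version with cut: it gives the intended answer when the third argument is an uninstantiated variable, as in the queries above. If the third argument is already given, as in max_cut(7, 2, 2), the Head of the first rule does not match, the second rule is used and the query succeeds, whereas max_plain(7, 2, 2) fails as expected.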
Example: In order to explain the use of cut, we write a program to find
factorial (N) using cut as follows:
fact (N, 1):- N =< 1, !.
fact (N, F):- M is N - 1,
fact (M, F1),
F is F1 * N.
Before discussing the cut and fail combination, let us first discuss the predicate fail.
The predicate fail, wherever it occurs, always fails and initiates backtracking. For
example, let number denote a predicate such that number (X) is true whenever X is a
number, else it fails. Further, suppose sum (X, Y, Z) associates with Z the sum of the
numbers X and Y. We define a function add (X, Y, Z) which is similar to sum (X, Y,
Z), except that the function add first verifies that each of X and Y is a number.
Then add can be written as
add (X, Y, Z):- number (X), number (Y), sum (X, Y, Z).
add (X, Y, Z):- fail.
Consider the following outline of a program for computing tax, in which the first rule
is intended to ensure that no tax is calculated for a foreigner:
tax (Name, Income, Tax):- foreigner (Name), fail.
tax (Name, Income, Tax):-……….
tax (Name, Income, Tax):-……….
Except for the first one, all other rules are left unspecified. However, these other
rules do not involve the predicate foreigner but may depend upon income,
concession/rebate etc.
Now suppose alberts is a foreigner. Then, according to the first rule, foreigner
(alberts) succeeds and then the predicate fail makes the goal on the L.H.S. not succeed.
Hence the PROLOG system backtracks and attempts the second and later rules, which do
not involve foreigner.
Hence, some of the later goals may succeed, despite the fact that the tax for alberts
should not be calculated. However, the error can be rectified by using cut. The
following program outline gives the correct result in the case of foreigners:
tax (Name, Income, Tax):- foreigner (Name), !, fail.
tax (Name, Income, Tax):-……….
tax (Name, Income, Tax):-……….
.
.
.
Because of the presence of cut in the first rule before fail, if foreigner (Name)
succeeds, no backtracking takes place after returning to cut from the execution of fail.
Hence, if foreigner (Name) succeeds, the program execution stops after returning back
to ‘!’, and the goal fails without any of the later rules being tried.
However, if foreigner (Name) fails, then processing by the PROLOG system continues
from the second rule onwards, according to the normal rules of PROLOG execution.
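The behaviour with and without the cut can be compared with the sketch below. Everything beyond the first rule is an assumption made for illustration: the second rule, which charges a flat ten per cent of the income, is hypothetical, and the names tax_nocut and mohan are used here only to show the two outlines side by side.

```prolog
foreigner(alberts).

% Without cut: after 'fail', the system backtracks into the second rule.
tax_nocut(Name, _, _) :- foreigner(Name), fail.
tax_nocut(_, Income, Tax) :- Tax is Income * 10 // 100.   % hypothetical flat rule

% With cut before fail: no later rule is tried for a foreigner.
tax(Name, _, _) :- foreigner(Name), !, fail.
tax(_, Income, Tax) :- Tax is Income * 10 // 100.         % hypothetical flat rule

% ?- tax_nocut(alberts, 1000, T).    succeeds with T = 100 (the unwanted answer)
% ?- tax(alberts, 1000, T).          fails
% ?- tax(mohan, 1000, T).            succeeds with T = 100
```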
2.12 SUMMARY
In Section 2.2 the basic concepts of PROLOG, including those of atom, predicate,
variable, atomic formula, goal and fact etc. are discussed and defined in English. The
syntax for various elements and constructs of PROLOG including for the above-
mentioned concepts and also for the concepts of structure, term, rule, query, goal,
subgoal and program etc are (formally) defined in Section 2.3.
The concept of database is defined in Section 2.4. However, Section 2.4 is mainly
devoted to explaining how a PROLOG system solves a problem where the problem is
defined to the system in the way discussed above. In the process of explanation,
important concepts like matching and unification are introduced and defined.
We know built-in data types and data structures play an important role in any
programming environment, which is generally based on some problem solving
style/paradigm, including the imperative style, which is the dominant style of solving
problems using computers. Data types and data structures play a significant role in
problem solving in the PROLOG environment also and are discussed in Section 2.6.
Prolog has basically two (elementary) data types viz atom and number. Also,
PROLOG has mainly two data structures viz List and Structure.
The operations available in PROLOG, for data structure List are discussed in Section
2.7. The predicate equal (denoted as ‘=’) which plays an important role in PROLOG,
specially in matching and unification, is discussed in Section 2.8.
PROLOG is mainly a symbol processing language. But even
for problems whose solutions require predominantly symbol processing, some
numeric processing may also be required. For this purpose, the arithmetic facilities
built into PROLOG are discussed and defined in Section 2.9. Recursion and iteration
are among the major mechanisms of the PROLOG problem solving process. However, in
order to check undesirable repetitions (through recursion or iteration), the PROLOG
system provides two very important predicates, viz., Cut and Fail. These predicates
are discussed in Sections 2.10 and 2.11.
2.13 SOLUTIONS/ANSWERS
Ex 1: The goal fails, because, after matching with either rule (xix) or rule (xx) of the
query, one of the subgoals generated is female (aslam). But this sub-goal can not be
satisfied by the facts and other rules in the database.
Ex 2: As in the previous query, the PROLOG system, after attempting to match and
then rejecting the earlier facts and rules, reaches rule (xix). After satisfying
female (jane), the system is required to satisfy two subgoals, which, after appropriate
unification, become
parents (jane, M, F).
parents (john, M, F).
But the first of these fails, as there is no fact with functor parents for jane, and
hence the whole of rule (xix) is taken as not
matching and is abandoned for the next rule/fact in the database. The next rule (xx)
is_sister (X, Y):- female (X), mother (X, M), mother (Y, M),
father (X, F), father (Y, F).
has the functor is_sister in its Head, which matches the functor of the goal. As
explained earlier, matching becomes the exercise of unification, and the goal reduces to
the subgoals
female (jane).
mother (jane, M).
mother (john, M).
father (jane, F).
father (john, F).
The first of these subgoals is easily satisfied by Fact (iv) of the database. The second of
these subgoals, viz., mother (jane, M), is a question of the form: Who is jane’s
mother?
Fact (xvi) in the database tells us that mary is mother of jane, and hence, the variable
M temporarily (for the whole of rule (xx)) gets associated with the constant mary.
Thus, the next subgoal ‘mother (john, M).’ becomes the goal ‘mother (john, mary).’
which is given as Fact (xvii) and hence is satisfied.
The fourth subgoal, father (jane, F), is equivalent to the question: ‘Who is jane’s father?’
Fact (xiii) in the database associates the constant albert with the variable F for the
whole of rule (xx). In the light of this association, the last subgoal becomes father
(john, albert). But this goal is given as Fact (xiv) in the database and hence is
satisfied. The PROLOG system answers the query with ‘yes’.
female (jane).
mother (jane, M).
mother (phillips, M).
father (jane, F).
father (phillips, F).
As in the case of the previous query, while satisfying the subgoal mother (jane, M), the
variable M is associated with mary for all the subgoals. Hence the subgoal mother
(phillips, M) becomes mother (phillips, mary). But there is no fact that matches this
subgoal. Hence the query fails.
The PROLOG system responds with ‘No’.
Goal   X        [_ | Y]      Comment
5.     pascal   [cobol |]    Fact (i) is not satisfied. Hence apply rule (ii).
6.     pascal   [ ]          Cannot be satisfied. The system says the goal cannot be satisfied.
Explanation The only statement in the program, states that the statement: ‘The list Y
is a suffix of list Z’ is true if there is a list X to which if Y is appended then the list Z is
obtained.
Example query
Ex 8: (i) succeeds because, first of all, the two variables X and Z become co-referred
variables. Then Z gets instantiated to c, and hence X also gets instantiated to c.
(iii) also fails, because the variable X gets instantiated to the constant a, but then the
system cannot instantiate the second occurrence of X to some other constant, in
this particular case, to the constant b.
(iv) succeeds because the R.H.S. is a variable. The variable Noun is associated with the
constant noun (alpha)
(v) fails, both sides of ‘=’ are constants but not identical:
1. Clocksin, W.F. & Mellish, C.S. : Programming in Prolog (Fifth Edition), Springer
(2003).
2. Clocksin, W.F. & Mellish, C.S. : Programming in Prolog (Third Edition), Narosa
Publishing House (1981).
3. Tucker, A. & Noonan, R. : Programming Languages: Principles and Paradigms,
Tata McGraw-Hill Publishing Company Limited (2002).
4. Sebesta, R. W. : Concepts of Programming Languages, Pearson Education Asia
(2002).
Applications of Artificial Intelligence
UNIT 1 EXPERT SYSTEMS
Structure
1.0 Introduction
1.1 Objectives
1.2 An Introduction to Expert Systems
1.3 Concept of Planning, Representing and using Domain Knowledge
1.4 Knowledge Representation Schemes
1.4.1 Semantic Networks
1.4.2 Frames
1.4.3 Proposition and Predicate Logic
1.4.4 Rule Based Systems
1.4.4.1 Forward Chaining Systems
1.4.4.2 Backward Chaining Systems
1.4.4.3 Probability Certainty Factors in Rule Based Systems
1.5 Examples of Expert Systems: MYCIN, COMPASS
1.6 Expert System Building Tools
1.6.1 Expert System Shells
1.6.1.1 Knowledge Base
1.6.1.2 Knowledge Acquisition Subsystem (Example: COMPASS)
1.6.1.3 Inference Engine
1.6.1.4 Explanation Sub-system (Example: MYCIN)
1.6.1.5 User Interface
1.6.1.6 An Expert System shell: EMYCIN
1.7 Some Application Areas of Expert Systems
1.8 Summary
1.9 Solutions/Answers
1.10 Further Readings
1.0 INTRODUCTION
Computer Science is the study of how to create models that can be represented in and
executed by some computing equipment. New types of problems are being solved
almost every day by developing relevant models of the domains of the problems
under consideration. One of the major problems which humanity encounters is the
scarcity of human experts to handle problems from various domains of human
experience, including health, education, economic welfare, natural resources and
the environment. In this respect, the task for a computer scientist is to create, in
addition to a model of the problem domain, a model of an expert of the domain as a
problem solver who is highly skilled in solving problems from the domain under
consideration. The field of Expert Systems is concerned with creating such models.
The task includes activities related to eliciting information from the experts in the
domain about how they solve the problems, and activities related to codifying that
information, generally in the form of rules. This unit discusses such issues in the
design and development of an expert system.
1.1 OBJECTIVES
First of all, we must understand that an expert system is nothing but a computer
program, or a set of computer programs, which contains the knowledge and some
inference capability of an expert, most generally a human expert, in a particular
domain. As an expert system is supposed to be able to reach some conclusion based
on the inputs provided, the information it already contains and its processing
capability, expert systems belong to the branch of Computer Science called
Artificial Intelligence.
Merely possessing an algorithm for solving a problem is not sufficient for a program to
be termed an expert system; it must also possess knowledge. That is, if there is an expert
system for a particular domain or area and it is fed a number of questions
regarding that domain, then sooner or later we can expect these questions to be
answered. So we can say that the knowledge contained in an expert system must
contribute towards solving the problems for which it has been designed.
Taking into consideration all the points which have been discussed above, let us try to
give one of the many possible definitions of an Expert System:
An expert system may or may not provide the complete expertise or functionality of a
human expert but it must be able to assist a human expert in fast decision making. The
program might interact with a human expert or with a customer directly.
From our everyday experience, we know that in order to solve difficult problems, we
need to do some sort of planning. Informally, we can say that Planning is the process
that exploits the structure of the problem under consideration to design a sequence
of actions that solves it.
The knowledge of nature and structure of the problem domain is essential for planning
a solution of the problem under consideration. For the purpose of planning, the
problem environments are divided into two categories, viz., classical planning
environments and non-classical planning environments. The classical planning
environments/domains are fully observable, deterministic, finite, static and discrete.
On the other hand, non-classical planning environments may be only partially
observable and/or stochastic. In this unit, we discuss planning only for classical
environments.
• It should allow us to express the knowledge we wish to represent in the language.
For example, the mathematical statement: Every symmetric and transitive
relation on a domain need not be reflexive, is not expressible in First Order
Logic.
• It should allow new knowledge to be inferred from a basic set of facts, as
discussed above.
• It should have well-defined syntax and semantics.
Semantic networks,
Frames,
First order logic, and,
Rule-based systems.
As semantic networks, frames and predicate logic have been discussed in previous
blocks, we will discuss these only briefly here. We will discuss rule-based systems
in detail.
For example, the fact (a piece of knowledge): Mohan struck Nita in the garden with
a sharp knife last week, is represented by the semantic network shown in Figure 4.1.
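[Figure 4.1: Semantic network. The node struck is linked to the node strike by the arc
past of. The node strike has the arcs time (last week), agent (Mohan), instrument
(knife), place (garden) and object (Nita). The nodes knife and sharp are linked by the
arc property of.]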
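The network for the sentence about Mohan and Nita can be sketched as a list of labelled arcs. This is only an illustrative sketch: the triple layout, the names ARCS and targets, and the direction chosen for each arc (in particular for property of) are assumptions read off the figure, not part of the original text.

```python
# Semantic network for: "Mohan struck Nita in the garden with a sharp knife last week".
# Each labelled arc is stored as a (source, relation, target) triple.
# Arc directions are an assumption made for this sketch.
ARCS = [
    ("struck", "past of", "strike"),
    ("strike", "agent", "Mohan"),
    ("strike", "object", "Nita"),
    ("strike", "place", "garden"),
    ("strike", "instrument", "knife"),
    ("strike", "time", "last week"),
    ("sharp", "property of", "knife"),
]


def targets(source, relation, arcs=ARCS):
    """Return every node reached from `source` by an arc labelled `relation`."""
    return [t for (s, r, t) in arcs if s == source and r == relation]


def sources(relation, target, arcs=ARCS):
    """Return every node that points at `target` through an arc labelled `relation`."""
    return [s for (s, r, t) in arcs if r == relation and t == target]


print(targets("strike", "agent"))        # who struck?
print(targets("strike", "instrument"))   # with what?
print(sources("property of", "knife"))   # what is known about the knife?
```

Answering a question about the stored fact then amounts to following the arc with the right label from the right node.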
The two most important relations between concepts are (i) subclass relation between a
class and its superclass, and (ii) instance relation between an object and its class.
Other relations may be has-part, color etc. As mentioned earlier, relations are
indicated by labeled arcs.
1.4.2 Frames
Frames are a variant of semantic networks and are one of the popular ways of
representing non-procedural knowledge in an expert system. In a frame, all the
information relevant to a particular concept is stored in a single complex entity, called
a frame. Frames resemble the record data structure. Frames support inheritance. They
are often used to capture knowledge about typical objects or events, such as a car, or
even a mathematical object like a rectangle. As mentioned earlier, a frame is a
structured object, and different names like Schema, Script, Prototype, and even Object
are used instead of frame in computer science literature.
Mammal :
    subclass : Animal
    warm_blooded : yes
Lion :
    subclass : Mammal
    eating-habit : carnivorous
    size : medium
Raja :
    instance : Lion
    colour : dull-yellow
    owner : Amar Circus
Sheru :
    instance : Lion
    size : small
A particular frame (such as Lion) has a number of attributes or slots such as
eating-habit and size. Each of these slots may be filled with particular values, such as the
eating-habit for lion may be filled up as carnivorous.
Sometimes a slot contains additional information such as how to apply or use the slot
values. Typically, a slot contains information such as (attribute, value) pairs, default
values, conditions for filling a slot, pointers to other related frames, and also
procedures that are activated when needed for different purposes.
But in case of multiple inheritance i.e., in case of an object having more than one
parent class, we have to decide which parent to inherit from. For example, a lion may
inherit from “wild animals” or “circus animals”. In general, both the slots and slot
values may themselves be frames and so on.
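The frames listed earlier (Mammal, Lion, Raja, Sheru) and slot inheritance can be sketched as nested dictionaries. This is a minimal sketch under stated assumptions: the names FRAMES and get_slot are hypothetical, and only single inheritance is handled, so the multiple-inheritance question raised above is deliberately left out.

```python
# Each frame is a dictionary of slots; "instance" and "subclass" link to a parent frame.
FRAMES = {
    "Mammal": {"subclass": "Animal", "warm_blooded": "yes"},
    "Lion": {"subclass": "Mammal", "eating-habit": "carnivorous", "size": "medium"},
    "Raja": {"instance": "Lion", "colour": "dull-yellow", "owner": "Amar Circus"},
    "Sheru": {"instance": "Lion", "size": "small"},
}


def get_slot(frame_name, slot, frames=FRAMES):
    """Look up a slot, climbing instance/subclass links until a value is found."""
    while frame_name in frames:
        frame = frames[frame_name]
        if slot in frame:
            return frame[slot]
        # Follow the single parent link (instance first, then subclass).
        frame_name = frame.get("instance") or frame.get("subclass")
    return None  # the slot is not defined anywhere along the chain


print(get_slot("Raja", "size"))          # inherited from Lion
print(get_slot("Sheru", "size"))         # Sheru's own value overrides Lion's
print(get_slot("Raja", "warm_blooded"))  # inherited from Mammal through Lion
```

A value stored in the frame itself takes priority over an inherited one, which is how Sheru can be small although lions are medium by default.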
Exercise 2: Define a frame for the entity date, which consists of day, month and year,
each of which is a number with the well-known restrictions. Assume that a procedure
named compute-day-of-week is already defined.
On the other hand, none of the following sentences can be assigned a truth-value, and
hence none of these, is a statement or proposition:
Generally, the following steps are followed for solving problems using
propositional logic, where the problems are expressed in English and are such that
these can be solved using propositional logic (PL):
b) Next, some of the rules of inference in PL, including the ones mentioned
below, are used to solve the problem, if at all the problem under
consideration is solvable by PL.
Solution:
Let us denote the atomic statements in the argument given above as follows:
M: Matter always existed
TG: There is God
GU: God created the universe.
Then the given statements in English, become respectively the formulae of PL:
(i) M
(ii) TG→GU
(iii) GU→ ∼ M
(iv) To show ∼ TG
Applying transposition to (iii) we get
(v) M→ ∼GU
using (i) and (v) and applying Modus Ponens, we get
(vi) ∼GU
Again, applying transposition to (ii) we get
(vii) ∼GU→ ∼TG
Applying Modus Ponens to (vi) and (vii) we get
(viii) ∼TG
The formula (viii) is the same as formula (iv) which was required to be proved.
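The derivation above can be cross-checked by brute force: an argument in PL is valid exactly when the conclusion is true in every truth assignment that makes all the premises true. The sketch below is only a check of that kind, not the inference-rule method used in the solution; the helper names implies and entails are hypothetical.

```python
from itertools import product


def implies(p, q):
    """Truth value of the material implication p -> q."""
    return (not p) or q


def entails(premises, conclusion, n_vars):
    """True if the conclusion holds in every assignment satisfying all premises."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # found a counterexample assignment
    return True


# Variables, in order: M, TG, GU
premises = [
    lambda m, tg, gu: m,                    # (i)   M
    lambda m, tg, gu: implies(tg, gu),      # (ii)  TG -> GU
    lambda m, tg, gu: implies(gu, not m),   # (iii) GU -> ~M
]
conclusion = lambda m, tg, gu: not tg       # (iv)  ~TG

print(entails(premises, conclusion, 3))
```

With three atomic statements only eight assignments have to be examined, so the check is immediate; the number of assignments doubles with every additional atomic statement.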
Exercise 3: Using propositional logic, show that, if the following statements are
assumed to be true:
(i) There is a moral law.
(ii) If there is a moral law, then someone gave it.
(iii) If someone gave the moral law, then there is God.
then the following statement is also true:
(iv) There is God.
(i) All children more than 12 years old must exercise regularly.
(ii) Ram is more than 12 years old.
Now, these statements should be sufficient to allow us to conclude: Ram must
exercise regularly. However, in propositional logic, each of the above statements is
indecomposable and may be denoted by P and Q respectively. Further, whatever is
said inside each statement is presumed to be not visible. Therefore, if we use the
language of propositional logic, we are just given two symbols, viz., P and Q,
representing the two given statements. However, from just the two
propositional formulae P and Q, it is not possible to conclude the above-mentioned
statement, viz., Ram must exercise regularly. To draw the desired conclusion with a
valid inference rule, it is necessary to use some other language, such as some
extension of propositional logic.
Predicate Logic, and more specifically, First Order Predicate Logic (FOPL) is an
extension of propositional logic, which was developed to extend the expressiveness of
propositional logic. In addition to just propositions of propositional logic, the
predicate logic uses predicates, functions, and variables together with variable
quantifiers (Universal and Existential quantifiers) to express knowledge.
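As a sketch of how FOPL recovers the conclusion about Ram that propositional logic could not reach, the two statements can be written with predicates. The symbols are assumptions introduced only for this illustration: O(x) stands for "x is a child more than 12 years old", E(x) for "x must exercise regularly", and r denotes Ram. Reading statement (ii) as O(r) assumes that Ram is a child.

```latex
\begin{align*}
&(1)\quad (\forall x)\,\bigl(O(x) \rightarrow E(x)\bigr) && \text{statement (i)}\\
&(2)\quad O(r) && \text{statement (ii)}\\
&(3)\quad O(r) \rightarrow E(r) && \text{universal instantiation of (1) with } x = r\\
&(4)\quad E(r) && \text{Modus Ponens on (2) and (3)}
\end{align*}
```

The step from (1) to (3) is exactly what propositional logic cannot express, because it requires looking inside statement (i) and replacing the quantified variable by a particular individual.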
We have already defined the structure of formulae of FOPL and have also explained
the procedure for finding the meaning of formulae in FOPL. Although we have already
explained how to solve problems using FOPL, we consider one example below to
recall the procedure.
In this context, we may recall the inference rules of FOPL. The inference rules of PL
including Modus Ponens, Chain Rule and Rule of Transposition are valid in FOPL
also after suitable modifications by which formulae of PL are replaced by formulae of
FOPL.
In addition to these inference rules, the following four inference rules of FOPL, that
will be called Q1, Q2, Q3 and Q4, have no corresponding rules in PL. In the following F
denotes a predicate and x a variable/parameter:
Q1: $\dfrac{\sim(\exists x)\,F(x)}{(\forall x)\,\sim F(x)}$ and $\dfrac{(\forall x)\,\sim F(x)}{\sim(\exists x)\,F(x)}$

Q2: $\dfrac{\sim(\forall x)\,F(x)}{(\exists x)\,\sim F(x)}$ and $\dfrac{(\exists x)\,\sim F(x)}{\sim(\forall x)\,F(x)}$

Q3: $\dfrac{(\forall x)\,F(x)}{F(a)}$, where a is (any) arbitrary element of the domain of F
The rule Q3 is called universal instantiation.

Q3′: $\dfrac{F(a),\ \text{for arbitrary } a}{(\forall x)\,F(x)}$

Q4: $\dfrac{(\exists x)\,F(x)}{F(a)}$, where a is a particular (not arbitrary) constant.
Step 1: Conceptualization: First of all, all the relevant entities and the relations that
exist between these entities are explicitly enumerated. Some of the implicit facts like
‘a person dead once is dead forever’ have to be explicated.
Step 2: Nomenclature and Translation: Appropriate names are given to the objects and
relations, and then the given English sentences are translated into formulae in
FOPL.
Appropriate names are essential in order to guide a reasoning system based on FOPL.
It is well-established that no reasoning system is complete. In other words, a reasoning
system may need help in arriving at the desired conclusion.
While solving problems with FOPL, the proof technique generally used is proof by
contradiction. Under this technique, the negation of what is to be proved is also taken
as one of the assumptions. Then, from the given assumptions along with the new
assumption, we derive a contradiction, i.e., using inference rules, we derive a
statement which is the negation of either an assumption or some earlier
derived formula.
Next, we give an example to illustrate how FOPL can be used to solve problems
expressed in English and which are solvable by using FOPL. However, the proposed
solution does not use the above-mentioned method of contradiction.
Example
Solution:
For translating the given statements (i), (ii) & (iii), let us use the notation:
To prove
(iii) (∀x) ( F(x)→ ∼ C(x) )
Before we try to answer the above question, let us review some of the properties of
logic reasoning systems including predicate calculus. We must remember that three
important properties of any logical reasoning system are soundness, completeness and
tractability. To be confident that an inferred conclusion is true we require soundness.
To be confident that inference will eventually produce any true conclusion, we require
completeness. To be confident that inference is feasible, we require tractability.
But the situation is worse than this, as even on problems for which resolution
refutation terminates, the procedure is NP-hard – as is any sound and complete
inference procedure for the first-order predicate calculus i.e., it may take
exponentially large time to reach a conclusion.
Researchers in Artificial Intelligence have suggested various ways of dealing with this:
First, they say that we should not insist on the soundness of inference
rules. This means that occasionally our
rules might prove an “untrue formula”.
Second, they say that we should not insist on completeness, i.e., we should
allow the use of procedures that are not guaranteed to find proofs of true formulas.
Third, they suggest that we could use a language that is less expressive than the
predicate calculus, for example, a language in which everything is expressed using
only Horn clauses (Horn clauses are those which have at most one positive literal).
Rather than representing knowledge in a declarative and somewhat static way (as a set
of statements, each of which is true), rule-based systems represent knowledge in terms
of a set of rules each of which specifies the conclusion that could be reached or
derived under given conditions or in different situations. A rule-based system consists
of
(i) Rule base, which is a set of IF-THEN rules,
(ii) A set of facts, and
(iii) An interpreter of the facts and rules, i.e., a mechanism which decides
which rule to apply based on the set of available facts. The interpreter also
initiates the action suggested by the rule selected for application.
A Rule-base may be of the form:
R1: If A is an animal and A barks, then A is a dog
F1: Rocky is an animal
F2: Rocky barks
The rule-interpreter, after scanning the above rule-base may conclude: Rocky is a dog.
After this interpretation, the rule-base becomes
In a forward chaining system, we start with the initial facts and keep using the rules
to draw new intermediate conclusions (or take certain actions) given those facts. The
process terminates when the final conclusion is established. In a backward chaining
system, we start with some goal statements, which are intended to be established, and
keep looking for rules that would allow us to conclude them, setting new sub-goals in the
process of reaching the ultimate goal. In the next round, the subgoals become the new
goals to be established. The process terminates when all the subgoals
are given facts. Forward chaining systems are primarily data-driven, while backward
chaining systems are goal-driven. We will discuss each in detail.
Advantages of Rule-base
Disadvantages
The main problem with rule-based systems is that when the rule-base grows and
becomes very large, it becomes difficult to check whether a new rule intended to be
added is redundant, i.e., whether it is already covered by some of the earlier rules. Still
worse, as the rule-base grows, checking the consistency of the rule-base also becomes
quite difficult. A rule-base is inconsistent when two rules have similar conditions but
their actions conflict with each other.
Let us first define working memory, before we study forward and backward chaining
systems.
In a forward chaining system, the facts in the system are represented in a working
memory which is continually updated; depending on the rule which is currently
being applied, the number of facts may either increase or decrease. Rules in the
system represent possible actions to be taken when specified conditions hold on items
in the working memory; they are sometimes called condition-action or
antecedent-consequent rules. The conditions are usually patterns that must match items in the
working memory, while the actions usually involve adding or deleting items from the
working memory. So we can say that a forward chaining system proceeds forward,
beginning with facts, chaining through rules, and finally establishing the goal.
Forward chaining systems usually represent rules in standard implicational form, with
an antecedent or condition part consisting of positive literals, and a consequent or
conclusion part consisting of a positive literal.
The interpreter controls the application of the rules, given the working memory, thus
controlling the system’s activity. It is based on a cycle of activity sometimes known as
a recognize-act cycle. The system first finds all the rules whose condition
parts are satisfied, i.e., those rules which are applicable given the current state of
working memory. (A rule is applicable if each of the literals in its antecedent, i.e., the
condition part, can be unified with a corresponding fact using consistent substitutions.
This restricted form of unification is called pattern matching.) It then selects one rule and
performs the actions in the action part of the rule, which may involve adding or
deleting facts. The actions result in a new, i.e., updated, working memory, and
the cycle starts again. (When more than one rule is applicable, some sort of
external conflict resolution scheme is used to decide which rule will be applied. But
when there are large numbers of rules and facts, the number of unifications that
must be tried becomes prohibitive.) This cycle is repeated until either
no rule fires or the required goal is reached.
Rule-based systems vary greatly in their details and syntax. Let us take the
following example, in which we use forward chaining:
Example
Let us assume that the working memory initially contains the following facts :
(day monday)
(at-home ram)
(does-not-like ram)
R1 : IF (day monday)
THEN ADD to working memory the fact : (working-with ram)
R2 : IF (day monday)
THEN ADD to working memory the fact : (talking-to ram)
R3 : IF (talking-to X) AND (working-with X)
THEN ADD to working memory the fact : (busy-at-work X)
R4 : IF (busy-at-work X) OR (at-office X)
THEN ADD to working memory the fact : (not-at-home X)
R5 : IF (not-at-home X)
THEN DELETE from working memory the fact : (at-home X)
R6 : IF (working-with X)
THEN DELETE from working memory the fact : (does-not-like X)
Now, to start the process of inference through forward chaining, the rule-based
system first searches for all the rules whose antecedent parts are satisfied by the
current set of facts in the working memory. In this example, we can see
that rules R1 and R2 are satisfied, so the system will choose one of them using its
conflict resolution strategy. Suppose rule R1 is chosen. So (working-with ram) is
added to the working memory (after substituting “ram” in place of X). The working
memory now looks like:
(working-with ram)
(day monday)
(at-home ram)
(does-not-like ram)
Now the cycle begins again: the system looks for rules that are satisfied and finds rules
R2 and R6. Suppose the system chooses rule R2. So now (talking-to ram) is added to
working memory, which now contains the following:
(talking-to ram)
(working-with ram)
(day monday)
(at-home ram)
(does-not-like ram)
Now in the next cycle, rule R3 fires, so now (busy-at-work ram) is added to working
memory, which now looks like:
(busy-at-work ram)
(talking-to ram)
(working-with ram)
(day monday)
(at-home ram)
(does-not-like ram)
Now the antecedent parts of rules R4 and R6 are satisfied. Suppose rule R4 fires, so
(not-at-home ram) is added to working memory, which now looks like:
(not-at-home ram)
(busy-at-work ram)
(talking-to ram)
(working-with ram)
(day monday)
(at-home ram)
(does-not-like ram)
In the next cycle, rule R5 fires so (at-home ram) is removed from the working
memory :
(not-at-home ram)
(busy-at-work ram)
(talking-to ram)
(working-with ram)
(day monday)
(does-not-like ram)
Forward chaining will continue like this. One thing to keep in mind is that
the order in which rules fire is important. A change in the firing order
may result in a different working memory.
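The recognize-act cycle traced above can be sketched in a few lines. This is a minimal sketch, not the unit's own interpreter: the tuple encoding of facts and rules, the function names, and the conflict resolution strategy (fire the first rule, in listed order, whose action would actually change working memory) are all assumptions. R5 is taken to delete (at-home X), as the trace describes.

```python
# Facts are tuples of strings; a term starting with an upper-case letter is a variable.
INITIAL = {("day", "monday"), ("at-home", "ram"), ("does-not-like", "ram")}

# (name, "all" = AND / "any" = OR, conditions, action, fact pattern)
RULES = [
    ("R1", "all", [("day", "monday")], "add", ("working-with", "ram")),
    ("R2", "all", [("day", "monday")], "add", ("talking-to", "ram")),
    ("R3", "all", [("talking-to", "X"), ("working-with", "X")], "add", ("busy-at-work", "X")),
    ("R4", "any", [("busy-at-work", "X"), ("at-office", "X")], "add", ("not-at-home", "X")),
    ("R5", "all", [("not-at-home", "X")], "delete", ("at-home", "X")),
    ("R6", "all", [("working-with", "X")], "delete", ("does-not-like", "X")),
]


def match(pattern, fact, binding):
    """Match one pattern against one fact; return the extended binding or None."""
    if len(pattern) != len(fact):
        return None
    binding = dict(binding)
    for p, f in zip(pattern, fact):
        if p[0].isupper():                      # variable
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:                            # constants must be identical
            return None
    return binding


def satisfy(conditions, memory, binding):
    """Yield every binding under which all the conditions hold (a conjunction)."""
    if not conditions:
        yield binding
        return
    for fact in sorted(memory):
        extended = match(conditions[0], fact, binding)
        if extended is not None:
            yield from satisfy(conditions[1:], memory, extended)


def rule_bindings(mode, conditions, memory):
    """Bindings for an AND rule, or for each alternative of an OR rule."""
    if mode == "all":
        yield from satisfy(conditions, memory, {})
    else:
        for condition in conditions:
            yield from satisfy([condition], memory, {})


def forward_chain(memory, rules=RULES):
    """Run recognize-act cycles until no rule changes working memory."""
    memory, fired = set(memory), []
    while True:
        for name, mode, conditions, action, pattern in rules:
            changed = False
            for binding in rule_bindings(mode, conditions, memory):
                fact = tuple(binding.get(term, term) for term in pattern)
                if action == "add" and fact not in memory:
                    memory.add(fact)
                    changed = True
                elif action == "delete" and fact in memory:
                    memory.remove(fact)
                    changed = True
                if changed:
                    break
            if changed:
                fired.append(name)
                break                           # start the next cycle
        else:
            return memory, fired                # no rule changed anything: stop


final_memory, firing_order = forward_chain(INITIAL)
print(firing_order)
print(sorted(final_memory))
```

Under this particular strategy the rules fire in the order R1, R2, R3, R4, R5, R6. Reordering RULES, or choosing a different conflict resolution strategy, changes the firing order and hence the intermediate states of working memory.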
Some of the conflict resolution strategies which are used to decide which rule to fire
are given below:
These strategies may help in getting reasonable behavior from a forward chaining
system, but the most important thing is how we write the rules. They
should be carefully constructed, with the preconditions specifying as precisely as
possible when different rules should fire. Otherwise we will have little idea of, or control
over, what will happen.
In forward chaining systems, we have seen how rule-based systems are used to draw
new conclusions from existing data and then add these conclusions to a working
memory. The forward chaining approach is most useful when we know all the
initial facts, but we don’t have much idea what the conclusion might be.
If we know what the conclusion would be, or have some specific hypothesis to test,
forward chaining systems may be inefficient. In forward chaining, we keep on moving
ahead until no more rules apply or we have added our hypothesis to the working
memory. But in the process the system is likely to do a lot of irrelevant
work, adding uninteresting conclusions to working memory. Suppose that, in the
example discussed before, we want to find out whether “ram is at
home”. We could repeatedly fire rules, updating the working memory and checking each
time whether (at-home ram) is found in the new working memory. But maybe we
had a whole batch of rules for drawing conclusions about what happens when I’m
working, or what happens on Monday. We really don’t care about this, so we would rather
draw only the conclusions that are relevant to the goal.
This can be done by backward chaining from the goal state, or from some hypothesized
state that we are interested in. This is essentially how Prolog works. Given a goal state
to try and prove, for example (at-home ram), the system will first check to see if the
goal matches the initial facts given. If it does, then that goal succeeds. If it doesn’t, the
system will look for rules whose conclusions, i.e., actions, match the goal. One such
rule will be chosen, and the system will then try to prove any facts in the
preconditions of the rule using the same procedure, setting these as new goals to
prove. We should note that a backward chaining system does not need to update
a working memory. Instead, it needs to keep track of the goals it must prove in order to
establish its main hypothesis. So we can say that in a backward chaining system, the reasoning
proceeds “backward”, beginning with the goal to be established, chaining
through rules, and finally anchoring in facts.
In principle, the same set of rules can be used for both forward and backward
chaining. In practice, however, we may choose to write the
rules slightly differently for backward chaining. In backward chaining we are concerned
with matching the conclusion of a rule against some goal that we are trying to prove. So the
‘then’ or consequent part of the rule is usually not expressed as an action to take (e.g.,
add/delete), but as a state which will be true if the premises are true.
To learn more, let us take a different example in which we use backward chaining
(The system is used to identify an animal based on its properties stored in the working
memory):
Example
1. Let us assume that the working memory initially contains the following facts:
1. IF (gives-milk X)
THEN (mammal X)
2. IF (has-hair X)
THEN (mammal X)
Now, to start the process of inference through backward chaining, the rule-based
system first forms a hypothesis and then uses the antecedent-consequent
rules (previously called condition-action rules) to work backward
toward hypothesis-supporting assertions or facts.
Let us take the initial hypothesis that “raja is a lion” and then reason
about whether this hypothesis is viable using backward chaining approach explained
below :
• The system searches for a rule which has the initial hypothesis, that someone, i.e.,
raja, is a lion, in its consequent part. It finds this in rule 8.
• The system moves from the consequent to the antecedent part of rule 8 and finds the
first condition, i.e., the first part of the antecedent, which says that “raja must be
carnivorous”.
• Next, the system searches for a rule whose consequent part declares that someone,
i.e., raja, is carnivorous. Two rules are found, i.e., rule 3 and rule 4. We assume
that the system tries rule 3 first.
• To satisfy the consequent part of rule 3, which has now become the system’s new
hypothesis, the system moves to the first part of the antecedent, which says that X, i.e.,
raja, has to be a mammal.
• So a new sub-goal is created in which the system has to check that “raja is a
mammal”. It does so by hypothesizing it and trying to find a rule having a
consequent that someone or X is a mammal. Again the system finds two rules,
rule 1 and rule 2. Let us assume that the system tries rule 1 first.
• In rule 1, the system now moves to the first antecedent part, which says that X,
i.e., raja, must give milk for it to be a mammal. The system cannot establish this,
because this hypothesis is neither supported by any of the rules nor
found among the existing facts in the working memory. So the system abandons
rule 1 and tries to use rule 2 to establish that “raja is a mammal”.
• In rule 2, it moves to the antecedent, which says that X, i.e., raja, must have hair for
it to be a mammal. The system already knows this, as it is one of the facts in
working memory. So the antecedent part of rule 2 is satisfied, and so the
consequent that “raja is a mammal” is established.
• Now the system backtracks to rule 3, whose first antecedent part is satisfied. In the
second condition of the antecedent, it finds its new sub-goal and in turn forms a new
hypothesis that X, i.e., raja, eats meat.
• The system tries to find a supporting rule or an assertion in the working memory
which says that “raja eats meat”, but it finds none. So the system abandons rule
3 and tries to use rule 4 to establish that “raja is carnivorous”.
• In rule 4, the first part of the antecedent says that raja must be a mammal for it to be
carnivorous. The system already knows that “raja is a mammal”, because it was
established when trying to satisfy the antecedents of rule 3.
• The system now moves to the second part of the antecedent in rule 4 and finds a new
sub-goal in which the system must check that X, i.e., raja, has long-pointed-teeth,
which now becomes the new hypothesis. This is already established, as “raja has
long-pointed-teeth” is one of the assertions in the working memory.
• In the third part of the antecedent in rule 4, the system’s new hypothesis is that “raja has
claws”. This also is already established, because it is one of the assertions in the
working memory.
• Now, as all the parts of the antecedent in rule 4 are established, its consequent,
i.e., “raja is carnivorous”, is established.
• The system now backtracks to rule 8, where the second part of the antecedent
says that X, i.e., raja, must have a big-mouth, which now becomes the new
hypothesis. This is already established, because the system has an assertion that
“raja has a big mouth”.
• Now, as the whole antecedent of rule 8 is satisfied, the system concludes that
“raja is a lion”.
We have seen that the system was able to work backward through the
antecedent-consequent rules, using desired conclusions to decide what assertions it should
look for, and ultimately establishing the initial hypothesis.
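The walkthrough above can be sketched as a short recursive procedure. This is a sketch under stated assumptions: rules 1 and 2 are as listed in the example, while rules 3, 4 and 8 and the contents of working memory are reconstructed from the walkthrough itself; the names RULES, FACTS and prove are hypothetical. All properties refer to the single individual raja, so no variable binding is needed.

```python
# (rule number, antecedent parts, consequent); rules are tried in this order.
RULES = [
    (1, ["gives-milk"], "mammal"),
    (2, ["has-hair"], "mammal"),
    (3, ["mammal", "eats-meat"], "carnivorous"),                    # reconstructed
    (4, ["mammal", "long-pointed-teeth", "claws"], "carnivorous"),  # reconstructed
    (8, ["carnivorous", "big-mouth"], "lion"),                      # reconstructed
]

# Assertions about raja assumed to be in working memory (taken from the walkthrough).
FACTS = {"has-hair", "long-pointed-teeth", "claws", "big-mouth"}


def prove(goal, facts=FACTS, rules=RULES, trace=None):
    """Backward chaining: a goal holds if it is a known fact, or if every
    antecedent part of some rule that concludes it can itself be proved."""
    if goal in facts:
        return True
    for number, antecedents, consequent in rules:
        if consequent != goal:
            continue
        # all() stops at the first antecedent that fails, so the rule is abandoned.
        if all(prove(part, facts, rules, trace) for part in antecedents):
            if trace is not None:
                trace.append(number)   # record each rule that succeeds
            return True
    return False


used = []
print(prove("lion", trace=used))
print(used)
```

Unlike the system described above, this sketch does not remember that “raja is a mammal” was already established while trying rule 3, so it proves that sub-goal again inside rule 4; caching established goals would remove the repeated work.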
How to choose the type of chaining among forward or backward chaining for a
given problem ?
Many of the rule based deduction systems can chain either forward or backward, but
which of these approaches is better for a given problem is the point of discussion.
First, let us learn some basic things about rules, i.e., how a rule relates its inputs
(i.e., facts) to its outputs (i.e., conclusions). Whenever a particular set of facts
can lead to many conclusions, the rules are said to have a high degree of fan out, and are a
strong candidate for backward chaining. On the other hand,
whenever the rules are such that a particular hypothesis can lead to many questions for
the hypothesis to be established, the rules are said to have a high degree of fan in, and a
high degree of fan in is a strong candidate for forward chaining.
To summarize, the following points should help in choosing the type of chaining for
reasoning:
• If the set of facts that we already have, or may establish, can lead to a large
number of conclusions or outputs, but the number of ways or input paths to reach
the particular conclusion in which we are interested is small, then the degree of
fan out is more than the degree of fan in. In such a case, backward chaining is the
preferred choice.
• But, if the number of ways or input paths to reach the particular conclusion in
which we are interested is large, but the number of conclusions that we can reach
using the facts through the rules is small, then the degree of fan in is more than
the degree of fan out. In such a case, forward chaining is the preferred choice.
• Where the degrees of fan out and fan in are approximately the same,
if not many facts are available and the problem is to check whether one of the
many possible conclusions is true, backward chaining is the preferred choice.
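The three points above amount to a small decision procedure, which may be sketched as follows. The function name, the idea of summarizing fan out and fan in as simple counts, and the tolerance used for “approximately the same” are all assumptions made for illustration.

```python
def preferred_chaining(fan_out, fan_in, few_facts_one_goal=False, tolerance=1):
    """Suggest a chaining direction from rough fan-out and fan-in counts."""
    if abs(fan_out - fan_in) <= tolerance:
        # Roughly equal degrees: the points above settle only the case of
        # few available facts and a single conclusion to be checked.
        return "backward" if few_facts_one_goal else "either"
    return "backward" if fan_out > fan_in else "forward"


print(preferred_chaining(fan_out=12, fan_in=2))  # backward
print(preferred_chaining(fan_out=2, fan_in=9))   # forward
```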
Rule based systems usually work in domains where conclusions are rarely certain,
even when we are careful enough to try and include everything we can think of in the
antecedent or condition parts of rules.
Sources of Uncertainty
• The theory of the domain may be vague or incomplete, so the methods to generate
exact or accurate knowledge are not known.
• Case data may be imprecise or unreliable, and evidence may be missing or in
conflict.
Even when methods to generate exact knowledge are known, they may be
impractical due to lack of data, imprecision of data or problems related to data
collection.
So developers of rule-based deduction systems often build some sort of certainty or
probability computing procedure over and above the normal condition-action format
of rules. Certainty computing procedures attach a probability between 0 and 1, called
a certainty factor, to each assertion or fact. Each certainty factor reflects how certain
an assertion is: a certainty factor of 0 indicates that the assertion is definitely false
and a certainty factor of 1 indicates that the assertion is definitely true.
Example 1: In the example discussed above the assertion (ram at-home) may have a
certainty factor, say 0.7 attached to it.
Example 2: In MYCIN, a rule-based expert system (which we will discuss later), a
rule in which statements linking evidence to hypotheses are expressed as decision
criteria may look like:
For a detailed discussion of certainty factors, the reader may refer to probability theory,
fuzzy sets, possibility theory, Dempster-Shafer theory, etc.
Exercise 4
In the “Animal Identifier System” discussed above use forward chaining to try to
identify the animal called “raja”.
1.5 EXAMPLES OF EXPERT SYSTEMS: MYCIN,
COMPASS
The first expert system we choose as an example is MYCIN, which is one of the earliest
developed expert systems. As another example of an expert system, we briefly
discuss COMPASS.
Like everyone else, we are also tempted to discuss MYCIN, one of the earliest
expert systems, designed at Stanford University in the 1970s.
MYCIN’s job was to diagnose and recommend treatment for certain blood infections.
A proper diagnosis requires growing cultures of the infecting organism,
which is a very time-consuming process, and sometimes the patient is in a critical state. So,
doctors have to come up with quick guesses about likely problems from the available
data, and use these guesses to provide a treatment in which the drugs given should
deal with any of the likely problems.
So MYCIN was developed in order to explore how human experts make these rough
(but important) guesses based on partial information. Sometimes the problem takes
another shape: an expert doctor may not be available at every time and place. In that
situation also, an expert system like MYCIN would be handy.
MYCIN represented its knowledge as a set of IF-THEN rules with certainty factors.
A MYCIN rule could look like:
IF the infection is primary-bacteremia
AND the site of the culture is one of the sterile sites
AND the suspected portal of entry is the gastrointestinal tract
THEN there is suggestive evidence (0.8) that bacteroid infection occurred.
The 0.8 is the certainty that the conclusion will be true given the evidence. If the
evidence is uncertain, the certainties of the pieces of evidence are combined with
the certainty of the rule to give the certainty of the conclusion.
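The text does not spell out how the certainties are combined. A minimal sketch is given below, assuming the scheme commonly described for MYCIN-style rules: the certainty of a conjunction (AND) of evidence is the minimum of the individual certainties, and this is multiplied by the certainty of the rule. The function names and the evidence values are illustrative assumptions.

```python
def premise_certainty(evidence_cfs):
    """Certainty of an AND of several pieces of evidence: the weakest one."""
    return min(evidence_cfs)


def conclusion_certainty(evidence_cfs, rule_cf):
    """Scale the certainty of the premise by the certainty of the rule."""
    return premise_certainty(evidence_cfs) * rule_cf


# Three antecedents held with certainties 1.0, 0.9 and 0.6 (made-up values),
# used with a rule whose certainty is 0.8 as in the rule quoted above.
cf = conclusion_certainty([1.0, 0.9, 0.6], 0.8)
print(round(cf, 2))  # 0.48
```

When all the evidence is fully certain (every value is 1), the conclusion simply takes the certainty of the rule, 0.8.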
MYCIN was written in Lisp, and its rules are formally represented as Lisp
expressions. The action part of a rule could just be a conclusion about the problem
being solved, or it could be another Lisp expression.
One of the strategies used by MYCIN is to first ask the user a number of predefined
questions that are most common and which allow the system to rule out totally
unlikely diagnoses. Once these questions have been asked, the system can then focus
on particular and more specific possible blood disorders. It then uses the backward
chaining approach to try to prove each one. This strategy avoids a lot of
unnecessary search, and is similar to the way a doctor tries to diagnose a patient.
The other strategies are related to the sequence in which rules are invoked. One
strategy is simple: given a possible rule to use, MYCIN first checks all the
antecedents of the rule to see if any are known to be false. If so, then there is no point
in using the rule. The other strategies are mainly related to the certainty factors.
MYCIN first tries the rules that have a greater degree of certainty in their conclusions, and
abandons its search once the certainties involved fall below a minimum threshold, say,
0.2.
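The two rule-invocation strategies just described may be sketched as a single rule-ordering function. The dictionary layout of a rule, the names used and the sample rules are assumptions made for illustration; only the threshold value 0.2 comes from the text.

```python
THRESHOLD = 0.2  # minimum certainty, the value quoted in the text


def rules_to_try(rules, known):
    """Yield rules in decreasing order of certainty, skipping hopeless ones.

    rules: list of dicts with keys "name", "if" (antecedents) and "cf".
    known: dict mapping an antecedent to True or False; unknown ones are absent.
    """
    for rule in sorted(rules, key=lambda r: r["cf"], reverse=True):
        if rule["cf"] < THRESHOLD:
            break  # every remaining rule is weaker still, so stop searching
        if any(known.get(a) is False for a in rule["if"]):
            continue  # an antecedent is known to be false: no point using it
        yield rule


rules = [
    {"name": "A", "if": ["p", "q"], "cf": 0.6},
    {"name": "B", "if": ["r"], "cf": 0.9},
    {"name": "C", "if": ["p"], "cf": 0.1},
    {"name": "D", "if": ["s"], "cf": 0.4},
]
known = {"p": True, "s": False}
print([r["name"] for r in rules_to_try(rules, known)])  # ['B', 'A']
```

Rule D is skipped because its antecedent s is known to be false, and rule C is never reached because its certainty lies below the threshold.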
There are three main stages in the interaction with MYCIN. In the first stage, initial
data about the case is gathered so that the system can come up with a broad diagnosis. In
the second stage, more directed questions are asked to test specific hypotheses. At the end of
this stage, it proposes a diagnosis. In the third stage, it asks questions to determine an
appropriate treatment, on the basis of the diagnosis and facts related to the patient.
After that, it recommends some treatment. At any stage, the user can ask why a
question was asked or how a conclusion was reached, and if a particular treatment is
recommended, the user can ask if alternative treatments are possible.
MYCIN has been popular in expert systems research, but it also had a number of
problems or shortcomings, because of which a number of its derivatives like
NEOMYCIN were developed.
COMPASS is an expert system which checks error messages derived from the
switch’s self-test routines, looks for open circuits, reduces the time of operation of
components, etc. To find the cause of a switch problem, it looks at a series of such
messages and then uses its expertise. The system can suggest the running of additional
tests, or the replacement of a particular component, for example, a relay or a printed
circuit board.
As expertise in this area was scarce, it was a fit case for taking the help of an expert
system like COMPASS (we will discuss later how knowledge acquisition is done in
COMPASS).
Expert system tools are designed to provide an environment for development of expert
systems mainly through the approach of prototyping.
Software tools for development of expert systems mainly fall into the following
categories:
• Expert System Shells: These are basically a set of programs and, to be more
specific, abstractions over one or more application programs. One of the major
examples is EMYCIN, which is the rule interpreter of the famous expert system
called MYCIN (a medical diagnostic system). EMYCIN also includes related
data structures like knowledge tables and an indexing mechanism over these tables.
Some recent versions of EMYCIN, like M.4, are very sophisticated shells which
combine the backward chaining of EMYCIN with frame-like data structures.
Expert system shells are basically used to allow non-programmers
to take advantage of already developed templates or shells, which have
evolved through the efforts of pioneers in programming who have solved
similar problems before. The core components of an expert system are the knowledge
base and the reasoning engine.
As we can see in the figure above, the shell includes the inference engine, a
knowledge acquisition subsystem, an explanation subsystem and a user interface.
When faced with a new problem in any given domain, we can find a shell which can
provide the right support for that problem, so all we need is the knowledge of an
expert. There are many commercial expert system shells available now, each one
adequate for a different range of problems. Taking the help of expert system shells to
develop expert systems greatly reduces the cost and the time of development as
compared to developing the expert system from scratch.
Let us now discuss the components of a generic expert system shell. We will discuss:
• Knowledge Base
• Knowledge Acquisition Subsystem
• Inference Engine
• Explanation Subsystem
• User Interface
Some of the reasons behind the difficulty in collecting information are given below:
o Different domains have their own terminology, and it is very difficult for experts
to communicate their knowledge exactly in ordinary language.
o Capturing the facts and principles alone is not sufficient to solve problems. For
example, experts in particular domains know which information is important for
specific judgments, which information sources are reliable and how problems
can be simplified, and this knowledge is based on personal experience. Capturing such
knowledge is very difficult.
The idea of automated knowledge capturing has also been gaining momentum. In fact,
“machine learning” has been one of the important research areas for some time now. The
goal is that a computing system or machine could be enabled to learn in order to solve
problems the way humans do.
As we already know, COMPASS is an expert system which was built for proper
maintenance and troubleshooting of a telephone company’s switches.
Now, for knowledge acquisition, knowledge is elicited from a human expert. An
expert explains the problem solving technique and a knowledge engineer then
converts it into an if-then rule. The human expert then checks whether the rule has the
correct logic, and if a change is needed, the knowledge engineer reformulates the
rule.
Sometimes, it is easier to troubleshoot the rules with pencil and paper (i.e., hand
simulation), at least the first time, than to implement the rules directly and change
them again and again.
An inference engine is used to perform reasoning with both the expert knowledge
(which is extracted from an expert, most commonly a human expert) and data
which is specific to the problem being solved. Expert knowledge is mostly in the form
of a set of IF-THEN rules. The case specific data includes the data provided by the
user and also partial conclusions (along with their certainty factors) based on this data.
In a normal forward chaining rule-based system, the case specific data consists of the elements
in the working memory.
Developing expert systems involves knowing how knowledge is accessed and used
during the search for a solution. Knowledge about what is known, and when and
how to use it, is commonly called meta-knowledge. In solving problems, a certain
level of planning, scheduling and controlling is required regarding what questions are to
be asked and when, what is to be checked and so on.
Different strategies for using domain-specific knowledge have great effects on the
performance characteristics of programs, and also on the way in which a program
searches for a solution among possible alternatives. Most knowledge
representation schemes are used under a variety of reasoning methods, and research is
going on in this area.
An explanation subsystem allows the program to explain its reasoning to the user. The
explanation can range from how the final or intermediate solutions were arrived at to
justifying the need for additional data.
(i) Proper use of knowledge: There must be some means of satisfying
knowledge engineers that the knowledge is applied properly, even at the time of
development of a prototype.
(ii) Correctness of conclusions: Users need to satisfy themselves that the
conclusions produced by the system are correct.
(iii) Execution trace: In order to judge that the knowledge elicitation is proceeding
smoothly and successfully, a complete trace of program execution is required.
(iv) Knowledge of program behavior: For proper maintenance and debugging, the
knowledge of program behavior is necessary for the programmers.
(v) Suitability of reasoning approach: Explanation subsystems are necessary to
ensure that reasoning technique applied is suitable to the particular domain.
Explanation in expert systems deals with the issue of control because the reasoning
steps used by a program will depend on how it searches for a solution.
Example:
Suppose there is a simple rule based system to diagnose the problems in a car.
R4: IF fuel-tank-has-petrol
THEN engine-gets-petrol
Explanation subsystems allow the user to ask why it asked a particular question, and
how it reached some conclusion. These questions are answered by referring to the
system goals, the rules being used, and any existing problem data.
Now let us focus on the example given above to see the explanation facilities
provided, which involve a dialogue involving why and how questions and their
explanations.
System : Does the engine turn over?
User : No
Providing such an explanation facility involves stating what rules are used in reaching
conclusions, and using these records to compose explanations like the ones above.
Giving simple explanations like those above is not very difficult, and is quite useful.
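A minimal sketch of such a facility is given below: the system keeps a record of which rule produced each conclusion, and composes a “how” explanation from that record. Only rule R4 is taken from the example; the rule named Rx, the fact names other than those in R4, and the functions run and how are hypothetical.

```python
RULES = {
    # R4 is taken from the example above; Rx is a hypothetical extra rule.
    "R4": (["fuel-tank-has-petrol"], "engine-gets-petrol"),
    "Rx": (["engine-gets-petrol", "engine-turns-over"], "problem-with-spark-plugs"),
}


def run(rules, facts):
    """Apply rules until nothing new follows, recording each rule that fires."""
    known = set(facts)
    record = {}  # conclusion -> (rule name, antecedents used)
    changed = True
    while changed:
        changed = False
        for name, (antecedents, consequent) in rules.items():
            if consequent not in known and all(a in known for a in antecedents):
                known.add(consequent)
                record[consequent] = (name, antecedents)
                changed = True
    return known, record


def how(fact, record):
    """Compose a 'how' explanation from the recorded rule applications."""
    if fact not in record:
        return f"{fact} was given by the user."
    name, antecedents = record[fact]
    return f"{fact} was concluded by {name} because " + " and ".join(antecedents) + "."


known, record = run(RULES, ["fuel-tank-has-petrol", "engine-turns-over"])
print(how("engine-gets-petrol", record))
```

The explanation only restates the rule that fired, which is exactly the kind of “surface” explanation that a simple record of rule applications can support.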
Explanation facilities in expert systems are sometimes not used, and sometimes they
are not easily accepted by their users. There are a number of reasons for this. One
reason is that the explanations just reference the “surface” knowledge encoded in the
rules, rather than providing the “deep” knowledge about the domain which originally
motivated the rules but which is usually not represented. So, the system will say that it
concluded something because of rule 5, but not explain what rule 5 intends to say. In
the example given above, maybe the user needs to understand that both the lights and
the starter use the battery, which is the underlying purpose of the second rule in this
example. Another reason for the frequent failure of explanation facilities is the fact
that, if the user fails to understand or accept the explanation, the system cannot
re-explain in another way (as people can). Explanation generation is a fairly large area of
research, concerned with effective communication, i.e., how to present things so that
people are really satisfied with the explanation, and what implications this has
for how we represent the underlying knowledge.
MYCIN is one of the first popular expert systems made for the purpose of medical
diagnosis. Let us have a look at how the explanation subsystem in MYCIN works :
To explain the reasoning for deciding on a particular medical parameter’s or
symptom’s value, it retrieves a set of rules and their conclusions. It allows the user to
ask questions or queries during a consultation.
To answer the questions, the system relies on its ability to display a rule invoked at any
point during the consultation, and also on recording the order of rule invocations and
associating them with particular events (like particular questions).
As the system uses backward chaining, most of the questions belong to the “Why” or
“How” category. To answer “Why” questions, the system looks up the hierarchy (i.e.,
tree) of rules to see which goals the system is trying to achieve, and to answer “How”
questions, the system must look down the hierarchy (i.e., tree) to find out which
sub-goals were satisfied to achieve the goal.
We can see that the explanation process is essentially a search problem requiring tree
traversal.
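The two directions of traversal can be sketched on a small goal tree. Here a “Why” question looks up the tree, from a sub-goal to the goals it serves, and a “How” question looks down the tree, from a goal to the sub-goals beneath it. The tree below reuses goal names from the animal identifier example; its exact shape and the function names are assumptions made for illustration.

```python
# Goal tree: each goal maps to the sub-goals that were used to establish it.
SUBGOALS = {
    "lion": ["carnivorous", "big-mouth"],
    "carnivorous": ["mammal", "long-pointed-teeth", "claws"],
    "mammal": ["has-hair"],
}


def why(goal):
    """Look UP the tree: the chain of goals that the given sub-goal serves."""
    chain = []
    current = goal
    while True:
        parent = next((g for g, subs in SUBGOALS.items() if current in subs), None)
        if parent is None:
            return chain
        chain.append(parent)
        current = parent


def how(goal):
    """Look DOWN the tree: every sub-goal beneath the goal, depth first."""
    found = []
    for sub in SUBGOALS.get(goal, []):
        found.append(sub)
        found.extend(how(sub))
    return found


print(why("has-hair"))     # ['mammal', 'carnivorous', 'lion']
print(how("carnivorous"))  # ['mammal', 'has-hair', 'long-pointed-teeth', 'claws']
```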
As MYCIN keeps track of the goal to sub-goal sequence of the computation, it can
answer questions like:
“Why did you ask whether the stain of the organism is gram negative?”
In response to this, the system would quote the rule which states “gram negative
staining”, possibly in conjunction with other conditions, and would state what it was
trying to achieve.
In its reply, MYCIN would state the rules that were applied in reaching this
conclusion, their degree of certainty, what the last question asked was, etc.
Thus we can see that because of the backward chaining approach, the system is able to
answer “Why” and “How” questions satisfactorily. But the rule application would not
be easy if the chains of reasoning are long.
The user interface is used to communicate with the user. It is generally not a part of
the expert system technology, and was not given much attention in the past. However,
it is now widely accepted that the user interface can make a critical difference in the
utility of a system regardless of the system’s performance.
Now, as an example, let us discuss an expert system shell called EMYCIN.
• It also included a knowledge editor (a program) called TEIRESIAS whose job was
to provide help for the development and maintenance of large knowledge bases.
• It also has a user interface which allows the user to communicate with the system
smoothly.
• Diagnosis and Troubleshooting: This class comprises systems that deduce faults
and suggest corrective actions for a malfunctioning device or process. Medical
diagnosis was one of the first knowledge areas to which ES technology was
applied, but the use of expert systems for troubleshooting and diagnosis of engineered systems
has become common.
• Planning and Scheduling: Systems that fall into this class analyze a set of one or
more potentially complex and interacting goals in order to determine a set of
actions to achieve those goals. This class of expert systems has great commercial
potential. Examples include scheduling of flights and personnel, manufacturing
process planning, etc.
• Process Monitoring and Control: Systems falling in this class analyze real-
time data from physical devices with the goal of noticing errors, predicting
trends, and controlling for both optimality and failure correction. Examples of
real-time systems that actively monitor processes are found in the steel making
and oil refining industries.
• Financial Decision Making: The financial services industry has also been using
expert system techniques. Expert systems belonging to this category act as
advisors, risk analyzers etc.
1.8 SUMMARY
In this unit, we have discussed various issues in respect of expert systems. To begin
with, in section 1.2 we define the concept of ‘expert system’. In view of the fact that
an expert system contains knowledge of a specific domain, in section 1.4, we discuss
various schemes for representing the knowledge of a domain. Section 1.5 contains
examples of some well-known expert systems. Tools for building expert systems are
discussed in section 1.6. Finally, applications of expert systems are explained in
section 1.7.
1.9 SOLUTIONS/ANSWERS
Ex. 1) In this case, it is not the same ‘strike’ action but two strike actions which are
involved in the sentence. Therefore, we use ‘strike’ to denote a generic action
of striking whereas ‘strike-1’ and ‘strike-2’ are its instances or members.
Thus, we get the semantic network.
[Figure: semantic network for Ex. 1; recoverable labels: Mother-of, past of, struck]
Ex. 2) (date
(day (integer (1...31)))
(month (integer (1...12)))
(year (integer (1...10000)))
(day-of-the-week (set (Mon Tue Wed Thu Fri Sat Sun)))
(procedure (compute day-of-the-week (day month year))))
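The frame above may be mirrored in Python as a small class whose slots are checked against the stated ranges and whose attached procedure computes the day of the week. This is only a sketch: the class name DateFrame is an assumption, and because the standard datetime module supports years only up to 9999, the procedure fails for the year 10000 even though the frame allows it.

```python
from dataclasses import dataclass
from datetime import date

DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


@dataclass
class DateFrame:
    day: int
    month: int
    year: int

    def __post_init__(self):
        # Slot restrictions copied from the frame definition.
        if not 1 <= self.day <= 31:
            raise ValueError("day must lie in 1...31")
        if not 1 <= self.month <= 12:
            raise ValueError("month must lie in 1...12")
        if not 1 <= self.year <= 10000:
            raise ValueError("year must lie in 1...10000")

    @property
    def day_of_the_week(self):
        """Attached procedure: compute the slot from day, month and year."""
        # date.weekday() returns 0 for Monday, matching the order of DAYS.
        return DAYS[date(self.year, self.month, self.day).weekday()]


print(DateFrame(day=1, month=1, year=2000).day_of_the_week)  # Sat
```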
There are problems in expressing certain kinds of knowledge when either semantic
networks or frames are used for knowledge representation. For example, it is difficult
although not impossible to express disjunctions and hence implications, negations,
and general non-taxonomic knowledge (i.e., non-hierarchical knowledge) in these
representations.
Applying Modus Ponens to formulae (i) and (ii) we get the formula
(v) SG
Applying Modus Ponens to (v) and (iii), we get
(vi) TG
But formula (vi) is the same as (iv), which is required to be established. Hence the
proof.
We start with one of the assertions about “raja” from the working memory.
The working memory is now updated by adding the assertion (mammal raja).
So the working memory now looks like:
(mammal raja)
(has-hair raja)
(big-mouth raja)
(long-pointed-teeth raja)
(claws raja)
Now we try to match assertion (mammal raja) to the antecedent part of a rule. The
first rule whose antecedent part supports the assertion is rule 3. So the control moves
to the second part of rule 3, which says that (eats-meat X), but this is not found in any
of the assertions present in working memory, so rule 3 fails.
Now, the system tries to find another rule which matches the assertion (mammal raja). It
finds rule 4, whose first part of the antecedent supports this. So the control moves to the
second part of the antecedent in rule 4, which says that something, i.e., X, must have
long pointed teeth. This fact is present in the working memory in the form of the assertion
(long-pointed-teeth raja), so the control now moves to the third part of the antecedent of rule
4, i.e., X must have claws. We can see that this is supported by the
assertion (claws raja) in the working memory. Now, as the whole antecedent of rule 4
is satisfied, the consequent of rule 4 is established and the working memory is
updated by the addition of the assertion (carnivorous raja), after substituting “raja” in
place of X.
(carnivorous raja)
(mammal raja)
(has-hair raja)
(big-mouth raja)
(long-pointed-teeth raja)
(claws raja)
Now in the next step, the system tries to match the assertion (carnivorous raja) with
the antecedent of one of the rules. The first rule whose antecedent part matches this
assertion is rule 6. Now, as the first part of the antecedent in rule 6 matches the
assertion, the control moves to the second part of the antecedent, i.e., X has dark spots.
There is no assertion in the working memory which supports this, so rule 6 is aborted.
The system now tries the next rule which matches the assertion
(carnivorous raja). It finds rule 8, whose first part of the antecedent matches the
assertion. So the control moves to the second part of the antecedent of rule 8, which
says that something, i.e., X, must have a big mouth. Now this is already present in the
working memory in the form of the assertion (big-mouth raja), so the second part and
ultimately the whole antecedent of rule 8 is satisfied.
And so the consequent part of rule 8 is established and the working memory is
updated by the addition of the assertion (lion raja), after substituting “raja” in place of
X.
(lion raja)
(carnivorous raja)
(mammal raja)
(has-hair raja)
(big-mouth raja)
(long-pointed-teeth raja)
(claws raja)
Hence, as the goal to be achieved i.e., “raja is a lion” is now part of the working
memory in the form of assertion (lion raja), so the goal is established and processing
stops.
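The forward chaining steps of this solution can be sketched as follows. Rule numbers 3, 4 and 8 follow the solution; rule 1 (has-hair implies mammal), the tuple layout and the function name forward_chain are assumptions. Rule 6 is left out because it never fires for raja.

```python
# Rules as (number, antecedents, consequent); rule 1 is an assumption.
RULES = [
    (1, ["has-hair"], "mammal"),
    (3, ["mammal", "eats-meat"], "carnivorous"),
    (4, ["mammal", "long-pointed-teeth", "claws"], "carnivorous"),
    (8, ["carnivorous", "big-mouth"], "lion"),
]


def forward_chain(facts, rules, goal):
    """Fire rules on the working memory until the goal appears or nothing fires."""
    memory = list(facts)
    fired = []
    changed = True
    while changed and goal not in memory:
        changed = False
        for number, antecedents, consequent in rules:
            if consequent not in memory and all(a in memory for a in antecedents):
                memory.insert(0, consequent)  # newest assertion goes on top
                fired.append(number)
                changed = True
                break  # rescan the rules from the first one
    return memory, fired


memory, fired = forward_chain(
    ["has-hair", "big-mouth", "long-pointed-teeth", "claws"], RULES, "lion"
)
print(fired)   # [1, 4, 8]
print(memory)  # ['lion', 'carnivorous', 'mammal', 'has-hair', ...]
```

The final contents of memory appear in the same order as the last working memory listing above, with the predicate names standing for assertions about raja.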
UNIT 2 INTELLIGENT AGENTS
Structure
2.0 Introduction
2.1 Objectives
2.2 Definitions
2.3 Agents and Rationality
2.3.1 Rationality vs. Omniscience
2.3.2 Autonomy and learning capability of the agent
2.3.3 Example: A boundary following robot
2.4 Task Environment of Agents
2.4.1 PEAS (Performance, Environment, Actuators, Sensors)
2.4.2 Example: An Automated Public Road Transport Driver
2.4.3 Different Types of Task Environments
2.4.3.1 Fully Observable vs. Partially Observable Environment
2.4.3.2 Static vs. Dynamic Environment
2.4.3.3 Deterministic vs. Stochastic Environment
2.4.3.4 Episodic vs. Sequential Environment
2.4.3.5 Single agent vs. Multi-agent Environment
2.4.3.6 Discrete vs. Continuous Environment
2.4.4 Some Examples of Task Environments
2.4.4.1 Crossword Puzzle
2.4.4.2 Medical Diagnosis
2.4.4.3 Playing Tic-tac-toe
2.4.4.4 Playing Chess
2.4.4.5 Automobile Driver Agent
2.5 The Structure of Agents
2.5.1 SR (Simple Reflex) Agents
2.5.2 Model Based Reflex Agents
2.5.3 Goal-based Agents
2.5.4 Utility-based Agents
2.5.5 Learning Agents
2.6 Different Forms of Learning in Agents
2.7 Summary
2.8 Solutions/Answers
2.9 Further Readings
2.0 INTRODUCTION
Since time immemorial, we, the human beings, have always toyed with the idea of
having some sort of slaves or agents, which would act as per our command,
irrespective of the shape they take, as long as they do the job for which they have been
designed or acquired. With the passage of time, human beings have developed
different kinds of machines, where each machine has been intended to perform
a specific operation or a set of operations. However, with the development of the
computer, human aspirations have increased manifold, as it has allowed us to think
of and actually implement non-human agents, which would show some level of
independence and intelligence. A robot, one of the most popular non-human agents, is
a machine capable of perceiving the environment it is in, and further capable of taking
some action or of performing some job either on its own or after taking some
command.
Despite their perceived benefits, the dominant public image of artificially
embodied intelligent machines is more of potentially dangerous than of potentially
beneficial machines for the human race. Mankind is worried about the potentially
dangerous capabilities of the robots to be designed and developed in the future.
Actually, any technology is a double-edged sword. For example, the Internet along
with the World Wide Web, on one hand, allows us to acquire information at the click of a
button, but, at the same time, the Internet provides an environment in which a number
of children (and, of course, adults also) become addicted to downloading
pornographic material. Similarly, the development of non-human agents might not
be without its tradeoffs. For example, the more intelligent the robots that are designed
and developed, the greater are the chances of a robot pursuing its own agenda rather than its
master’s, and the greater are the chances of a robot even attempting to destroy others to
become more successful. Some intellectuals even think that, in the not very distant future,
there might be robots capable of enslaving human beings, though designed and
developed by human beings themselves. Such concerns are not baseless. However,
the (software) agents developed till today and the ones to be developed in the near
future are expected to have very limited capabilities to match the kind of intelligence
required for such behaviour.
In respect of the design and development of intelligent agents, with the passage of
time, the momentum seems to have shifted from hardware to software, the latter being
thought of as a major source of intelligence. But, obviously, some sort of hardware is
essentially needed as a home for the intelligent agent.
2.1 OBJECTIVES
2.2 DEFINITIONS
An agent may be thought of as an entity that acts, generally on behalf of someone
else. More precisely, an agent is an entity that perceives its environment through
sensors and acts on the environment through actuators. Some experts in the field
require an agent to be additionally autonomous and goal directed also.
A percept may be thought of as an input to the agent through its sensors, over a unit
of time, sufficient to make some sense of the input.
Percept sequence is a sequence of percepts, generally long enough to allow the agent
to initiate some action.
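The notions of percept, percept sequence and action can be put together in a very small sketch. The class Agent, the function thermostat_policy and the percept and action names are all hypothetical; the sketch only shows that an agent stores its percept sequence and chooses an action on the basis of it.

```python
class Agent:
    """An agent that maps its percept sequence to an action."""

    def __init__(self, policy):
        self.policy = policy        # function: percept sequence -> action
        self.percept_sequence = []  # everything perceived so far

    def step(self, percept):
        """Receive one percept through the sensors and return an action."""
        self.percept_sequence.append(percept)
        return self.policy(self.percept_sequence)


def thermostat_policy(percepts):
    # A simple policy that looks only at the latest percept.
    return "cool" if percepts[-1] == "hot" else "wait"


agent = Agent(thermostat_policy)
print(agent.step("hot"))       # cool
print(agent.step("mild"))      # wait
print(agent.percept_sequence)  # ['hot', 'mild']
```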
In order to further have an idea about what a computer agent is, let us consider one of
the first definitions of agent, which was coined by John McCarthy and his friends at
MIT.
A software agent is a system which, when given a goal to be achieved, could carry
out the details of the appropriate (computer) operations and further, in case it gets
stuck, it can ask for advice and can receive it from humans, may even evaluate the
appropriateness of the advice and then act suitably.
As the concept of (software) agent is of relatively recent origin, different pioneers and
other experts have been conceiving and using the term in different ways. There are
two distinct but related approaches for defining an agent. The first approach treats an
agent as an ascription i.e., the perception of a person (which includes expectations
and points of view) whereas the other approach defines an agent on the basis of the
description of the properties that the agent to be designed is expected to possess.
Let us first discuss the definition of agent according to first approach. Among the
people who consider an agent as an ascription, a popular slogan is “Agent is that
agent does”. In everyday context, an agent is expected to act on behalf of someone
to carry out a particular task, which has been delegated to it. But to perform its task
successfully, the agent must have knowledge about the domain in which it is operating
and also about the properties of its current user in question. In the course of normal
life, we hire different agents for different jobs based on the required expertise for each
job. Similarly, a non-human intelligent agent is also embedded with the required
expertise of the domain as per the requirements of the job under consideration. For
example, a football-playing agent would be different from an email-managing
agent, although both will have the common attribute of modeling their user.
According to the second approach, an agent is defined as an entity, which functions
continuously and autonomously, in a particular environment, which may have other
agents also. By continuity and autonomy of an agent, it is meant that the agent
must be able to carry out its job in a flexible and intelligent fashion and further is
expected to adapt to the changes in its environment without requiring constant human
guidance or intervention. Ideally, an agent that functions continuously in an
environment over a long period of time would also learn from its experience. In
addition, we expect an agent, which lives in a multi-agent environment, to be able to
communicate and cooperate with them, and perhaps move from place to place in
doing so.
According to the second approach to defining agent, an agent is supposed to
possess some or all of the following properties:
Reactivity: The ability of sensing the environment and then acting accordingly.
Autonomy: The ability of moving towards its goal, changing its moves or
strategy, if required, without much human intervention.
Communicating ability: The ability to communicate with other agents and
humans.
Ability to coexist by cooperating: The ability to work in a multi-agent
environment to achieve a common goal.
Ability to adapt to a new situation: Ability to learn, change and adapt to the
situations in the world around it.
Ability to draw inferences: The ability to infer or conclude facts, which may be
useful, but are not available directly.
Temporal continuity: The ability to work over long periods of time.
Personality: Ability to impersonate or simulate someone, on whose behalf the
agent is acting.
Mobility: Ability to move from one environment to another.
Intelligent Agents
2.3 AGENTS AND RATIONALITY
Further, a rational agent is an agent that acts in a manner that achieves the best
outcome in an environment with certain outcomes. In an uncertain environment, a
rational agent, through its actions, attempts to achieve the best expected outcome.
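The idea of the best expected outcome can be illustrated by a minimal sketch. The action names, probabilities and values below are hypothetical and are chosen only for illustration; the functions expected_value and rational_choice are our own names and not part of any standard library.

```python
def expected_value(outcomes):
    """outcomes: list of (probability, value) pairs for a single action."""
    return sum(p * v for p, v in outcomes)


def rational_choice(actions):
    """Return the name of the action whose expected value is highest."""
    return max(actions, key=lambda name: expected_value(actions[name]))


# Hypothetical numbers, used only to illustrate the calculation.
actions = {
    "safe_route": [(1.0, 5.0)],                 # a certain outcome
    "risky_route": [(0.5, 12.0), (0.5, 0.0)],   # an uncertain outcome
}
best = rational_choice(actions)
```

Here the certain action has expected value 5 and the uncertain one has expected value 6, so the sketch selects the uncertain action even though it may turn out badly on a particular occasion.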
It may be noted that correct inferencing is one of the several possible mechanisms for
achieving rationality. However, sometimes a rational action is also possible without
inferencing. For example, withdrawing one’s hand on unintentionally touching a very
hot utensil is an example of rationality based on reflex action rather than on
inferencing.
We discuss the concepts of rationality and rational agent in some more detail.
Attempting always to take the correct action, possibly but not necessarily involving
logical reasoning, is only one part of being rational. Further, if a perfectly correct
action or inference is not possible, then taking an approximately correct but optimal
action under the circumstances is also a part of rationality.
Like other attributes, we need some performance measure (i.e., the criteria to judge
the performance or success in respect of the task to be performed) to judge rationality.
A good performance measure for rationality must be:
• objective in nature;
• measurable or observable; and
• decided by the designer of the agent, keeping in mind the problem or the set of
problems that the agent is designed to handle.
From the previous discussion, we know that a rational agent should take the action
that would maximize its performance measure, on the basis of its knowledge of the
world around it and the percept sequence.
The basic difference between being rational and being omniscient is that
rationality deals with trying to maximize the output on the basis of current input,
environmental conditions, available actions and past experience whereas being
omniscient means having knowledge of everything, including knowledge of the future
i.e., what will be the output or outcome of its action. Obviously being omniscient is
next to impossible.
In this context, let us consider the following scenario: Sohan is going to the nearby
grocery shop. As he passes through a crossing, a police party suddenly arrives there
chasing a terrorist and attempts to shoot the terrorist, but the bullet unfortunately
hits Sohan and he is injured.
Now the question is: Is Sohan irrational in moving through that place? The answer is
no, because the human agent Sohan has no idea, nor is he expected to have an idea, of
what is going to happen at that place in the near future. Obviously, Sohan is not
omniscient but, on the basis of this incident, he cannot be said to be irrational.
Autonomy means the dependence of the agent on its own percepts (what it perceives
or receives from the environment through its senses) rather than on the prior
knowledge of its designer. In other words, an agent is autonomous if it is capable of
learning from its experience and does not have to depend upon its prior knowledge,
which may be incomplete, incorrect or both. The greater the autonomy, the more
flexible and more rational the agent is expected to be. So we can say that a rational
agent should be autonomous, because it should be able to learn to compensate for the
incomplete or incorrect prior knowledge provided by the designer. In the initial stage
of its operations, the agent may not, rather should not, have complete autonomy. This
is desirable in view of the fact that, in the initial stage, the agent is yet to acquire
knowledge of the environment and should use the experience and knowledge of the
designer. But, as the agent gains experience of its environment, its behavior should
become more and more independent of the prior knowledge provided to it by its
designer.
Learning means the agent can update its knowledge based on its experience and
changes in the environment, for the purpose of taking better actions in future.
Although the designer might feed some prior knowledge of the environment into the
agent, the environment might, as mentioned earlier, change with the passage of time.
Therefore, feeding complete knowledge into the agent at the beginning of the
operations is neither possible nor desirable. Obviously, there could be some extreme
cases in which the environment is static, i.e., does not change with time, but such
cases are rare. In such rare cases, however, giving complete information about the
environment may be desirable. So, in general, the agent should have the learning
capability to update its knowledge with the passage of time.
Let us consider the first example of an agent, which is a robot that follows a
boundary wall.
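[Figure 2.1: A robot in a room enclosed by an outer boundary. The eight cells
surrounding the robot’s cell are labelled clockwise from the top-left as C1, C2, C3
(top row), C4 (right), C5, C6, C7 (bottom row, from right to left) and C8 (left). The
figure marks the robot positions, such as X, referred to below.]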
The sensory input about the surrounding cells can be represented in the form of an
8-tuple vector say <c1,c2,c3,c4,c5,c6,c7,c8> where each ci is a binary number i.e., ci
is either a 0 for representing a free cell or a 1 for representing the fact that the cell is
occupied by some object. If, at some stage, the robot is in the cell represented by X,
as shown in Figure 2.1, then the sense vector would be <0,0,0,0,0,0,0,1,1>. It may be
noted that the corner and the boundary cells may not have exactly eight surrounding
cells. However, for such cells, each missing neighbouring cell may be treated as
occupied. For example, for the cell Y of Figure 2.1, the neighbouring cells to the left
of the cell Y do not exist. However, we assume these cells exist but are occupied.
Thus, if the robot is in position Y of Figure 2.1, then the sense vector for the robot is
<1,0,0,0,0,0,1,1>.
Also, the robot performs only one type of action, i.e., it can move into one of the free
cells adjacent to it. In case the robot attempts to move to a cell which is not free, the
action will have no effect, i.e., the robot will not move.
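The construction of the sense vector can be sketched as follows. This is a minimal illustration which assumes that the room is stored as a list of rows of 0s (free) and 1s (occupied) and that the cells c1 to c8 are numbered clockwise from the top-left neighbour, as in the C1 to C8 layout of the figure. The names NEIGHBOUR_OFFSETS, sense_vector and room are our own.

```python
# (row, column) offsets of the cells c1..c8, clockwise from the top-left.
NEIGHBOUR_OFFSETS = [
    (-1, -1), (-1, 0), (-1, 1),   # c1, c2, c3 (row above)
    (0, 1),                       # c4 (right)
    (1, 1), (1, 0), (1, -1),      # c5, c6, c7 (row below, right to left)
    (0, -1),                      # c8 (left)
]


def sense_vector(grid, row, col):
    """Return the 8-tuple <c1,...,c8> for a robot at (row, col).

    A neighbouring cell that lies outside the room is treated as occupied.
    """
    vector = []
    for d_row, d_col in NEIGHBOUR_OFFSETS:
        r, c = row + d_row, col + d_col
        inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
        vector.append(grid[r][c] if inside else 1)
    return tuple(vector)


# A hypothetical empty 3 x 3 room, used only for illustration.
room = [
    [0, 0, 0],
    [0, 0, 0],
    [0, 0, 0],
]
```

For a robot on the left edge of this room, sense_vector(room, 1, 0) gives (1, 0, 0, 0, 0, 0, 1, 1), which agrees with the sense vector stated for position Y.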
On the basis of the properties of the environment in which the robot is operating, it is
the job of the designer to develop an algorithm for the process that converts the
robot’s percept sequences (which in this case are the sensory inputs c1 to c8) into
corresponding actions. Let us divide the task of determining the action the agent
should take on the basis of the sensory inputs into two phases, viz., the perception
phase and the action phase.
Perception Phase:
There is an enclosed space, which we call a room. The room is thought of as divided
into cells. A cell has just enough space to accommodate the robot. Further, at
any time, the robot, except during a transition, cannot lie across two cells.
[Figure: the robot’s cell A and its neighbouring cells B, C and D.]
For example, if A is the cell currently occupied by the robot and the cell B is occupied
by some object, but cells C and D are free and, further, the signal on the right is green,
then the robot moves to cell C. However, if both B and C are occupied, then the robot
moves to cell D. These environmental conditions are determined in terms of what we
call features. For the current problem, let us assume that there are 4 binary-valued
features of the agent, namely f1, f2, f3 and f4, and that these features are used to
calculate the action to be taken by the agent.
The rules for assigning binary values to features may be assumed to be as given
below:
IF c2 = 1 OR c3 = 1 THEN f1 = 1 ELSE f1 = 0.
IF c4 = 1 OR c5 = 1 THEN f2 = 1 ELSE f2 = 0.
IF c6 = 1 OR c7 = 1 THEN f3 = 1 ELSE f3 = 0.
IF c8 = 1 OR c1 = 1 THEN f4 = 1 ELSE f4 = 0.
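The feature rules can be sketched as a small function. The sketch assumes, following the pattern of the rules for f3 and f4, that each feature takes the value 1 when at least one of its two cells is occupied. The function name features is our own.

```python
def features(sense):
    """Compute (f1, f2, f3, f4) from the sense vector <c1,...,c8>.

    Assumption: a feature is 1 when either of its two cells is occupied.
    """
    c1, c2, c3, c4, c5, c6, c7, c8 = sense
    f1 = 1 if (c2 == 1 or c3 == 1) else 0
    f2 = 1 if (c4 == 1 or c5 == 1) else 0
    f3 = 1 if (c6 == 1 or c7 == 1) else 0
    f4 = 1 if (c8 == 1 or c1 == 1) else 0
    return (f1, f2, f3, f4)
```

For example, for the sense vector (1, 0, 0, 0, 0, 0, 1, 1) the function returns (0, 0, 1, 1), because only the cells c1, c7 and c8 are occupied.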
Action Phase:
After calculating the values of the feature set of the robot, a pre-defined function is
used to determine the boundary following action of the robot (i.e., the movement to
one of the surrounding cells). In addition to the specified actions determined by the
features and the pre-defined function, there may be a default action of movement to a
surrounding free cell. The default action may be executed when none of the specified
actions is possible.
Action 2: IF f1 = 1 AND f2 = 0
THEN move to the topmost free surrounding cell on the right. If no such cell
is free, then stop.
Action 3: IF f2 = 1 AND f3 = 0
THEN move down to the left-most free cell. If no such cell is
free, then attempt Action 2.
Action 4: IF f3 = 1 AND f4 = 0
THEN move to the bottom-most free cell on the left. If no
such surrounding cell is free, then move to Action 1.
We can see that the combinations of the features values serve as conditions under
which the robot would perform its actions.
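One possible reading of Actions 2 to 4 is sketched below. The grouping of cells (c3, c4, c5 on the right; c7, c6, c5 below; c7, c8, c1 on the left) and the order in which free cells are tried are our assumptions, as are all the names used. Action 1 is not specified in the rules given above, so the sketch only names it and does not implement it.

```python
# Candidate cells for each action, in the order in which they are tried.
# The groupings are an assumed reading of the rule text.
RIGHT_TOP_TO_BOTTOM = (3, 4, 5)    # Action 2: topmost free cell on the right
BOTTOM_LEFT_TO_RIGHT = (7, 6, 5)   # Action 3: left-most free cell below
LEFT_BOTTOM_TO_TOP = (7, 8, 1)     # Action 4: bottom-most free cell on the left


def first_free(sense, candidates):
    """Return the number of the first free candidate cell, or None."""
    for n in candidates:
        if sense[n - 1] == 0:
            return n
    return None


def action_2(sense):
    cell = first_free(sense, RIGHT_TOP_TO_BOTTOM)
    return f"move to c{cell}" if cell else "stop"


def choose_action(sense, feats):
    """Select an action from the sense vector and the features (f1..f4)."""
    f1, f2, f3, f4 = feats
    if f1 == 1 and f2 == 0:
        return action_2(sense)
    if f2 == 1 and f3 == 0:
        cell = first_free(sense, BOTTOM_LEFT_TO_RIGHT)
        return f"move to c{cell}" if cell else action_2(sense)
    if f3 == 1 and f4 == 0:
        cell = first_free(sense, LEFT_BOTTOM_TO_TOP)
        # Action 1 is not given in the text, so it is only named here.
        return f"move to c{cell}" if cell else "Action 1 (not specified)"
    # Default action: move to any free surrounding cell, if there is one.
    cell = first_free(sense, range(1, 9))
    return f"move to c{cell}" if cell else "stop"
```

The ordered list of conditions makes it explicit that each combination of feature values acts as the condition part of a rule, with the default action tried last.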
Next, we enumerate below some of the possible parameters for performance measures
in case of such a simple agent.
• The number of times the agent has to reverse its earlier courses of action (more
reversals indicate lower performance).
• The maximum of the average of distances from the walls (the smaller the maximum,
the higher the performance).
• The boundary distance traversed in a particular time period (the more distance
covered, the better the performance).
In the next section, we discuss some of the commonly used terms in context of the
Agents.
Exercise 1
On the basis of the figure given below, find the sense vector of the robot if:
(a) It starts from the location L1
(b) It starts from the location L2
The robot can sense whether the 8 cells adjacent to it are free for it to move into
one of them. If the location of the robot is such that some of the surrounding cells
do not exist, then these cells are assumed to exist and further assumed to be
occupied.
[Figure: a room with an outer boundary, showing the neighbouring-cell layout C1, C2,
C3 (top row), C8 and C4 (left and right), C7, C6, C5 (bottom row), and the two robot
locations L1 and L2.]
Task environments or problem environments are the environments, which include all
the elements involved in the problems for which agents are thought of as solutions.
Task environments will vary with every new task or problem for which an agent is
being designed. Specifying the task environment is a long process which involves
looking at different measures or parameters. Next we discuss a standard set of
measures or parameters for specifying a task environment under the heading PEAS.
For designing an agent, the first requirement is to specify the task environment to the
maximum extent possible. The task environment for an agent that solves one type of
problem may be described by four major parameters, namely, performance
(which is actually the expected performance), environment (i.e., the world around the
agent), actuators (which include the entities through which the agent may perform
actions) and sensors (which describe the different entities through which the agent
will gather information about the environment).
We must remember that the environment or the world around the agent is extremely
uncertain or open ended. There are unlimited combinations of possibilities of the
environment situations, which such an agent could face. Let us enumerate some of the
possibilities or circumstances which an agent might face:
Variety of roads e.g., from 12-lane express-ways, freeways to dusty rural bumpy
roads; different road rules including the ones requiring left-hand drive in some
parts of the world and right-hand drive in other parts.
The degree of knowledge of various places through which and to which driving is
to be done.
Various kinds of passengers, from the highly cultured to almost ruffians, etc.
All kinds of other traffic, possibly including heavy vehicles, ultra modern cars,
three-wheelers and even bullock carts.
Actuators:
Sensors:
The agent acting as automated public road transport driver must have some way of
sensing the world around it i.e., the traffic around it, the distance between the
automobile and the automobiles ahead of it and its speed, the speeds of neighbouring
vehicles, the condition of the road, any turn ahead etc. It may use sensors like
odometer, speedometer, sensors telling the different parameters of the engine, Global
Positioning System (GPS) to understand its current location and the path ahead. Also,
there should be some sort of sensors to calculate its distance from other vehicles etc.
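A PEAS description can be recorded as a simple data structure, as in the sketch below. The environment and sensor entries are taken from the discussion above; the performance and actuator entries are illustrative assumptions of ours, since they are not listed in the text. The class name TaskEnvironment is also our own.

```python
from dataclasses import dataclass, field


@dataclass
class TaskEnvironment:
    """A PEAS description: Performance, Environment, Actuators, Sensors."""
    performance: list = field(default_factory=list)
    environment: list = field(default_factory=list)
    actuators: list = field(default_factory=list)
    sensors: list = field(default_factory=list)


driver = TaskEnvironment(
    # Illustrative assumptions, not taken from the text.
    performance=["safe trip", "passenger comfort"],
    # Taken from the possibilities enumerated in the text.
    environment=["roads of many kinds", "passengers", "other traffic"],
    # Illustrative assumptions, not taken from the text.
    actuators=["steering", "accelerator", "brake"],
    # Taken from the description of sensors in the text.
    sensors=["odometer", "speedometer", "engine sensors", "GPS",
             "distance sensors"],
)
```

Writing the four parameters down in this way makes it easy to check that none of them has been left unspecified before the design of the agent begins.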
We must remember that the example agent considered above, the automated public
road transport driver, is quite difficult to implement. However, there are many other
agents which operate in comparatively simpler and less dynamic environments, e.g., a
game-playing robot, an assembly-line robot controller, and an image-processing
agent.
If the agent knows everything about its environment or the world in which it exists,
through its sensors, then we say that the environment is fully observable. It would be
quite convenient for the agent if the environment is a fully observable one, because
the agent will have a complete picture of what is happening around it before taking
any action. The agent in a fully observable environment will have a complete idea of
the world around it at all times, and hence it need not maintain an internal state to
remember and store what is going on around it. Fully observable environments are
found rarely in reality.
A partially observable environment is one in which the agent can sense some of the
aspects or features of the world around it, but not all, for any number of reasons
including the limitations of its sensors. Normally, there are greater chances of
finding a partially observable environment in reality. In the first example, viz., a
boundary following robot, which we have studied, the agent or robot has a limited
view of the environment around it, i.e., the agent has information only about the eight
cells immediately adjacent to it, in respect of whether each of these is occupied or free.
It has no idea of how many non-moving objects there are within the room. Nor does it
have an idea of its distance from the boundary wall. Also, in the second example, viz.,
an automated public road transport driver, the driver or the agent has a very
restricted idea of the environment around it. It operates in a very dynamic
environment, i.e., an environment that changes very frequently. Further, it has no idea
of the number of vehicles and their positions on the road, the action the driver of the
vehicle ahead is expected to take, the location of the various potholes it is going to
face in the next one kilometer, and so on. So we can see that in both of these examples,
the environment is partially observable.
A static environment is an environment which does not change while the agent is
operating. Implementing an agent in a static environment is relatively simple, as
the agent does not have to keep track of the changes in the environment around it.
Also, the time taken by an agent to perform its operations has no effect on the
environment, because the environment is static. The task environment of the
“boundary following robot” is static because no changes are possible in the
environment around it, irrespective of how long the robot might operate. Similarly,
the environment of an agent solving a crossword puzzle is static.
A dynamic environment on the other hand may change with time on its own or
because of the agent’s actions. Implementing an agent in a dynamic environment is
quite complex as the agent has to keep track of the changing environment around it.
The task environment of the “Automated public road transport driver” is an
example of a dynamic environment because the whole environment around the
agent keeps on changing continuously.
In case an environment does not change by itself but an agent’s performance changes
the environment, then the environment is said to be semi-dynamic.
A deterministic environment means that the current state and the current set of
actions performed by the agent completely determine the next state of the agent’s
environment; otherwise, the environment is said to be stochastic. The decision in
respect of an environment being stochastic or deterministic is taken from the point of
view of the agent. If the environment is simple and fully observable, then there are
greater chances of its being deterministic. However, if the environment is partially
observable and complex, then it may be stochastic to the agent. For example, the
boundary following robot exists in a deterministic environment, whereas an automated
road transport driver agent exists in a stochastic environment. In the latter case, the
agent has no prior knowledge and also it cannot predict the behavior of the traffic
around it.
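The difference between the two kinds of environment can be sketched with two transition functions. This is only an illustration: the one-dimensional position, the function names and the parameter slip_probability are hypothetical.

```python
import random


def deterministic_step(position, move):
    """The next state is fixed completely by the current state and action."""
    return position + move


def stochastic_step(position, move, rng, slip_probability=0.2):
    """With probability slip_probability the move has no effect, so the
    next state is not fixed by the current state and action alone."""
    if rng.random() < slip_probability:
        return position
    return position + move


rng = random.Random(0)  # a seeded generator, so that runs can be repeated
next_position = stochastic_step(3, 1, rng)
```

Calling deterministic_step(3, 1) always gives 4, whereas stochastic_step(3, 1, rng) may give either 3 or 4, depending on the random draw.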
In sequential environments there are no episodes, and the previous actions could
affect the current decision or action and, further, the current action could affect
the future actions or decisions. The task environment of “An automated public road
transport driver” is an example of a sequential environment, as any decision taken or
action performed by the driver agent may have long-term consequences.
If there is a possibility of other agents also existing in addition to the agent being
described then the task environment is said to be multi-agent environment. In a
multi-agent environment, the scenario becomes quite complex as the agent has to keep
track of the behavior and the actions of the other agents also. There can be two general
scenarios, one in which all the agents are working together to achieve some common
goal, i.e., to perform a collective action. Such a type of environment is called
cooperative multi-agent environment.
Also, it may happen that all or some of the agents existing in the environment are
competing with each other to achieve something. For example, in a tic-tac-toe game or
in a game of chess, both of which are two-agent games, each player tries to win by
trying to predict the possible moves of the other agent. Such an environment is called
a competitive multi-agent environment.
The words discrete and continuous are related to the way time is treated, i.e., whether
time is divided into slots (i.e., discrete quantities) or it is treated as a continuous
quantity (continuous).
As we have already discussed, the environment is single-agent and static, which makes
it simple, as there are no other players, i.e., agents, and the environment does not
change. But the environment is sequential in nature, as the current decision will affect
all the future decisions also. On the simpler side, the environment of a crossword
puzzle is fully observable, as it does not require remembering of facts or
decisions taken earlier, and also the environment is deterministic, as the next state of
the environment fully depends on the current state and the current action taken by the
agent. Time can be divided into discrete quanta between moves, and so we can say
that the environment is discrete in nature.
The first problem in the task environment of an agent performing a medical diagnosis
is to decide whether the environment is to be treated as a single-agent or multi-agent
one. If the other entities, like the patients or staff members, also have to be viewed as
agents (obviously based on whether they are trying to maximize some performance
measure, e.g., profitability), then it will be a multi-agent environment; otherwise it
will be a single-agent environment. To keep the discussion simple, in this case, let us
choose the environment as a single-agent one.
We can very well perceive that the task environment is partially observable as all the
facts and information needed may not be available readily and hence, need to be
remembered or retrieved. The environment is stochastic in nature, as the next state of
the environment may not be fully determined only by the current state and current
action. Diagnosing a disease is stochastic rather than deterministic.
Also the task environment is partially episodic in nature and partially sequential. The
environment is episodic because, each patient is diagnosed, irrespective of the
diagnoses of earlier patients. The environment is sequential, in view of the fact that
for a particular patient, earlier treatment decisions are also taken into consideration in
deciding further treatment. Furthermore, the environment may be treated as
continuous as there seems to be no benefit of dividing the time in discrete slots.
2.4.4.4 Playing Chess
We have already discussed one subclass of a general driver agent in detail, viz., an
automated public road transport driver which may include a bus driver, taxi driver or
auto driver etc. A driver agent might also be christened as a cruise control agent,
which may include other types of transport like water transport or air transport also
with some variations in their environments.
Coming back to an automated public road transport driver, we can see that this is one
of the most complex environments discussed so far. So the environment is multi-
agent and partially observable as it is not possible to see and assume what all other
agents i.e., other drivers are doing or thinking. Also the environment is fully dynamic,
as it is changing all the time as the location of the agent changes and also the locations
of the other agents change. The environment is stochastic in nature as anything
unpredictable can happen, which may not be perceived exactly as a result of the
current state and the actions of various agents. The environment is sequential as any
decision taken by the driver might affect or change the whole future course of actions.
Also the time is continuous in nature although the sensors of the driver agent might
work in discrete mode.
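The properties of the task environments discussed so far can be collected in a small table, as sketched below. The property values are those stated in the discussion above; the dictionary keys ("agents", "observable", and so on) and the function name are our own, and None marks a property that the discussion does not state for that environment.

```python
# Property values as stated in the text; None means "not stated there".
ENVIRONMENTS = {
    "crossword puzzle": {
        "agents": "single", "observable": "fully", "change": "static",
        "outcome": "deterministic", "history": "sequential",
        "time": "discrete",
    },
    "medical diagnosis": {
        "agents": "single", "observable": "partially", "change": None,
        "outcome": "stochastic", "history": "episodic and sequential",
        "time": "continuous",
    },
    "automated public road transport driver": {
        "agents": "multi", "observable": "partially", "change": "dynamic",
        "outcome": "stochastic", "history": "sequential",
        "time": "continuous",
    },
}


def environments_with(prop, value):
    """Return the names of the environments whose property has the value."""
    return sorted(name for name, props in ENVIRONMENTS.items()
                  if props[prop] == value)
```

For example, environments_with("time", "discrete") picks out only the crossword puzzle, which reflects how much simpler its environment is than the other two.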
Also, the agent program and its architecture are related in the sense that for a different
agent architecture a different type of agent program is required and vice-versa. For
example, in case of a boundary following robot, if the robot does not have the
capability of sensing adjacent cells to the right, then the agent program for the robot
has to be changed.
Next, we discuss different categories of agents, which are differentiated from each
other on the basis of their agent programs. The capability to write efficient agent
programs is the key to success in developing efficient rational agents. Although the
table-driven approach to designing agents (in which an agent acts on the basis of the
set of all possible percepts by storing these percepts in tables) is possible, the
approach of developing equivalent agent programs is found to be much more efficient.
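The table-driven approach can be sketched as follows. The percept names, the actions and the fallback action "no-op" are hypothetical; the point of the sketch is only that the table must contain one entry for every percept sequence the agent may encounter.

```python
def make_table_driven_agent(table, default="no-op"):
    """Return an agent function that looks up the whole percept sequence
    received so far in a table of actions."""
    percepts = []

    def agent(percept):
        percepts.append(percept)
        # The key is the complete percept sequence, not just the last percept.
        return table.get(tuple(percepts), default)

    return agent


# A hypothetical table covering only two percept sequences.
table = {
    ("clear",): "forward",
    ("clear", "obstacle"): "turn",
}
agent = make_table_driven_agent(table)
```

Since the table is indexed by entire percept sequences, its size grows very quickly with the length of the agent’s life, which is one reason why equivalent agent programs are preferred.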
Next we discuss some of the general categories of agents based on their agent
programs:
These are the agents or machines that have no internal state (i.e., they don’t remember
anything) and simply react to the current percepts in their environments. An
interesting set of agents can be built whose behaviour can be captured in the form of a
simple set of functions of their sensory inputs. One of the earliest implemented agents
of this category was called Machina Speculatrix. This was a device with wheels, a
motor, photo cells and vacuum tubes, and was designed to move in the direction of
light of less intensity and to avoid the direction of bright light. A boundary following
robot is also a simple reflex (SR) agent. For an automobile-driving agent also, some
aspects of its behavior, like applying brakes immediately on observing either the
vehicle immediately ahead applying brakes or a human being suddenly coming just in
front of the automobile, show the simple reflex capability of the agent. Such a simple
reflex action in the agent program of the agent can be implemented with the help of
simple condition-action rules.
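The condition-action rules of the braking example can be sketched as below. The percept is assumed to be a dictionary of observations; the key names and the default action are hypothetical.

```python
# Each rule is a (condition, action) pair tested against the current percept.
RULES = [
    (lambda p: p.get("vehicle_ahead_braking", False), "apply brakes"),
    (lambda p: p.get("pedestrian_in_front", False), "apply brakes"),
]


def simple_reflex_agent(percept, rules=RULES, default="keep driving"):
    """Return the action of the first rule whose condition holds for the
    current percept. No earlier percept is remembered."""
    for condition, action in rules:
        if condition(percept):
            return action
    return default
```

The function uses only the current percept, so two identical percepts always produce the same action, whatever happened before.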
Although the implementation of SR agents is simple, on the negative side agents of
this type have very limited intelligence, because they do not store or remember
anything. As a consequence, they cannot make use of any previous experience. In
summary, they do not learn. Also, they are capable of operating correctly only if
the environment is fully observable.
Simple Reflex agents are not capable of handling task environments that are not fully
observable. In order to handle such environments properly, in addition to reflex
capabilities, the agent should maintain some sort of internal state in the form of a
function of the sequence of percepts received up to the time of action by the agent.
Using the percept sequence, the internal state is determined in such a manner that it
reflects some of the aspects of the unobservable environment. Further, in order to
reflect properly the unobserved environment, the agent is expected to have a model of
the task environment encoded in the agent’s program, where the model has the
knowledge about:
(i) the process by which the task environment evolves independently of the agent and
(ii) the effects that the actions of the agent have on the environment.
Thus, in order to handle properly the partial observability of the environment, the
agent should have a model of the task environment in addition to reflex capabilities.
Such agents are called Model-based Reflex Agents.
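A model-based reflex agent can be sketched as below. The example is hypothetical: the agent perceives only whether there is a wall directly ahead, and it uses a one-line model of the effect of its own action (moving forward advances it by one cell) to keep an estimate x of how far it has travelled, something it cannot observe directly. All names are our own.

```python
class ModelBasedReflexAgent:
    """Keeps an internal state that is updated from the previous state, the
    last action and the current percept, then applies condition-action
    rules to that state."""

    def __init__(self, update_state, rules, default_action):
        self.state = {}
        self.last_action = None
        self.update_state = update_state
        self.rules = rules
        self.default_action = default_action

    def act(self, percept):
        self.state = self.update_state(self.state, self.last_action, percept)
        action = self.default_action
        for condition, rule_action in self.rules:
            if condition(self.state):
                action = rule_action
                break
        self.last_action = action
        return action


def update_state(state, last_action, percept):
    new_state = dict(state)
    new_state.setdefault("x", 0)
    if last_action == "forward":
        new_state["x"] += 1          # model of the effect of the agent's action
    new_state["wall_ahead"] = (percept == "wall")
    return new_state


robot = ModelBasedReflexAgent(
    update_state,
    rules=[(lambda s: s["wall_ahead"], "turn")],
    default_action="forward",
)
actions = [robot.act(p) for p in ["free", "free", "wall"]]
```

After these three percepts the agent has moved forward twice and then turned, and its internal state records x = 2 although no percept ever reported the distance travelled.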
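[Figure: A model-based reflex agent. Percepts from the task environment reach the
agent through its sensors. The internal state, the current state of the world, the effect
of the agent’s actions and the rules about the evolution of the world feed the action
rules (the logic for the action of the agent), which produce actions carried out on the
task environment through the actuators.]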
In order to design an appropriate agent for a particular type of task, we know that the
nature of the task environment plays an important role. Also, it is desirable that the
complexity of the agent should be the minimum that is just sufficient to handle the
task in a particular environment. In this regard, first we discussed the simplest type of
agents, viz., Simple Reflex Agents. The action of this type of agent is decided by the
current percept only. Next, we discussed the Model-Based Reflex Agents, for which
an action is decided by taking into consideration not only the latest percept, but the
whole percept history summarized in the form of internal state. Also, the action for
this type of agent is decided by taking into consideration the knowledge of the task
environment, represented by a model of the environment and encoded into the agent’s
program. However, in respect of a number of tasks, even this much knowledge may
not be sufficient for appropriate action. For example, when we are going from city A
to city B, in order to take appropriate action, it is not enough to know the summary of
actions and path which has taken us to some city C between A and B. We also have to
remember the goal of reaching city B.
Goal based agents are driven by the goal they want to achieve, i.e., their actions are
based on the information regarding their goal, in addition to, of course, other
information in the current state. This goal information is also a part of the current
state description and it describes everything that is desirable to achieve the goal. As
mentioned earlier, an example of a goal-based agent is an agent that is required to find
the path to reach a city. In such a case, if the agent is an automobile driver agent, and
if the road is splitting ahead into two roads then the agent has to decide which way to
go to achieve its goal of reaching its destination. Further, if there is a crossing ahead
then the agent has to decide, whether to go straight, to go to the left or to go to the
right. In order to achieve its goal, the agent needs some information regarding the goal
which describes the desirable events and situations to reach the goal. The agent
program would then use this goal information to decide the set of actions to take in
order to reach its goal.
Another desirable capability which a good goal based agent should have is that if an
agent finds that a part of the sequence of the previous steps has taken the agent away
from its goal then it should be able to retract and start its actions from a point which
may take the agent toward the goal.
As goal-based agents may have to reason before they take an action, these
agents might be slower than other types of agents, but they will be more flexible in
taking actions, as their decisions are based on acquired knowledge which can also be
modified. Hence, as compared to SR agents, which may require rewriting of all the
condition-action rules in case of a change in the environment, goal-based agents can
adapt easily when there is any change in their goals.
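The way a goal-based agent uses its goal to decide which road to take can be sketched with a simple search. The sketch uses breadth-first search, which is our choice of technique and not one prescribed above; the road map is hypothetical.

```python
from collections import deque


def plan_route(roads, start, goal):
    """Breadth-first search for a sequence of cities from start to goal.

    roads maps each city to the list of cities directly reachable from it.
    Returns the route as a list of cities, or None if the goal is
    unreachable.
    """
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        city = path[-1]
        if city == goal:
            return path
        for next_city in roads.get(city, []):
            if next_city not in visited:
                visited.add(next_city)
                frontier.append(path + [next_city])
    return None


# A hypothetical road map: the road from C splits towards D and towards B.
roads = {"A": ["C"], "C": ["D", "B"], "D": [], "B": []}
route = plan_route(roads, "A", "B")
```

If the goal changes from B to D, only the argument goal changes and the same program finds the new route, which illustrates the flexibility of goal-based agents.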
[Figure: A goal-based agent. Percepts from the task environment reach the agent
through its sensors. The internal state, the current state of the world, the effect of the
agent’s actions and the rules about the evolution of the world determine the new state
after the action of the agent; this, together with the goals to be achieved, determines
the action to be taken, which is carried out on the task environment through the
actuators.]
A goal-based agent’s success or failure is judged in terms of its capability for
achieving or not achieving its goal. A goal-based agent, for a given pair of
environment state and possible input, only knows whether the pair will lead to the
goal state or not. Such an agent will not be able to decide in which direction to
proceed when there is more than one conflicting goal. Also, in a goal-based agent,
there is no concept of partial success or somewhat satisfactory success. Further, if
there is more than one method of achieving a goal, then no mechanism is
incorporated in a goal-based agent for choosing, out of the available methods, the one
which is faster and more efficient in reaching its goal.
A more general way to judge the success or happiness of an agent may be to assign
to each state a number as an approximate measure of its success in reaching
the goal from that state. If the agent is embedded with the capability of
assigning such numbers to states, then it can choose, out of the states reachable in the
next move, the state with the highest assigned number, indicating possibly the best
chance of reaching the goal.
This will allow the goal to be achieved more efficiently. Such an agent will be more
useful, i.e., will have more utility. A utility-based agent uses a utility function, which
maps each of the world states of the agent to some degree of success. If it is possible
to define the utility function accurately, then the agent will be able to reach the goal
quite efficiently. Also, a utility-based agent is able to make decisions in case of
conflicting goals, generally choosing the goal with the higher success rating or value.
Further, in environments with multiple goals, the utility-based agent quite likely
chooses, out of the multiple goals, the goal with the least cost or the higher utility.
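The choice made by a utility-based agent can be sketched in a few lines. The states and the utility numbers below are hypothetical and serve only to show the selection step.

```python
def choose_next_state(reachable_states, utility):
    """Return the reachable state to which the utility function assigns
    the highest number."""
    return max(reachable_states, key=utility)


# Hypothetical utilities of the states reachable in the next move.
utilities = {"highway": 0.9, "city road": 0.6, "rural road": 0.3}
best_state = choose_next_state(utilities.keys(), utilities.get)
```

In contrast with a goal-based agent, which can only tell whether a state leads to the goal or not, the numbers allow the agent to rank the alternatives against one another.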
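[Figure: A utility-based agent. Percepts from the task environment reach the agent
through its sensors. The internal state, the current state, the effect of actions and how
the world evolves independently determine the new state consequent upon an action;
the utility gives the degree of success achieved consequent upon an action, from which
the desirable action is chosen and carried out on the task environment through the
actuators.]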
It is not possible to encode in advance all the knowledge required by a rational agent
for optimal performance during its lifetime. This is especially true of real-life, and
not just theoretical, environments. These environments are dynamic in the sense that
the environmental conditions change, not only due to the actions of the agents under
consideration, but due to other environmental factors also. For example, all of a
sudden a pedestrian comes just in front of the moving vehicle, even when there is a
green signal for the vehicle. In a multi-agent environment, all the possible decisions
and actions an agent is required to take are generally unpredictable, in view of the
decisions taken and actions performed simultaneously by other agents. Hence, the
ability of an agent to succeed in an uncertain and unknown environment depends
on its learning capability, i.e., its capability to change appropriately its knowledge
of the environment. For an agent with learning capability, some initial knowledge is
coded in the agent program and, after the agent starts operating, it learns from its
actions, the evolving environment, the actions of its competitors or adversaries, etc.,
so as to improve its performance in an ever-changing environment. If an appropriate
learning component is incorporated in the agent, then the knowledge of the agent
gradually increases after each action, starting from its initial knowledge which was
manually coded into it at the start.
(i) Learning Component: It is the component of the agent, which on the basis of the
percepts and the feedback from the environment, gradually improves the
performance of the agent.
(ii) Performance Component: It is the component from which all actions originate on
the basis of external percepts and the knowledge provided by the learning
component.
The design of the learning component and the design of the performance component are
very much related to each other, because a learning component is of no use unless
the performance component can be designed to convert the newly acquired knowledge
into better, more useful actions.
(iii) Critic Component: This component finds out how well the agent is doing with
respect to a certain fixed performance standard and it is also responsible for any
future modifications in the performance component. The critic is necessary to
judge the agent’s success with respect to the chosen performance standard,
especially in a dynamic environment. For example, in order to check whether a
certain job is accomplished, the critic will not depend on external percepts only,
but will also compare the current state with the state that indicates the
completion of that task.
[Figure: A Learning agent. Labels: Fixed Performance Standard, Percepts, Sensors,
Critic Component, feedback, Learning Component, Performance Element, indicate changes,
knowledge, learning goals, Problem Generator Component, Actuators, Actions. The AGENT
interacts with the TASK ENVIRONMENT.]
The purpose of embedding learning capability in an agent is that it should not depend
totally on the knowledge initially encoded in it and on the external percepts for its
actions. The agent learns by evaluating its own decisions and/or making observations
of new situations it encounters in the ever-changing environment.
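The interplay of the learning, performance and critic components may be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the class LearningAgent, its method names, the running-average update and the toy environment are all hypothetical choices made for illustration, and the problem generator component is omitted for brevity.

```python
class LearningAgent:
    """Toy learning agent; names and update rule are illustrative assumptions."""

    def __init__(self, actions, standard):
        self.standard = standard                    # fixed performance standard
        self.knowledge = {a: 0.0 for a in actions}  # initial, manually coded knowledge
        self.counts = {a: 0 for a in actions}

    def performance_component(self):
        # Pick the action currently believed to be best (first one wins ties).
        return max(self.knowledge, key=self.knowledge.get)

    def critic(self, outcome):
        # Feedback: how far the observed outcome is from the fixed standard.
        return outcome - self.standard

    def learning_component(self, action, feedback):
        # Keep a running average of the feedback received for each action.
        self.counts[action] += 1
        n = self.counts[action]
        self.knowledge[action] += (feedback - self.knowledge[action]) / n

    def step(self, environment):
        action = self.performance_component()
        outcome = environment(action)
        self.learning_component(action, self.critic(outcome))
        return action


def environment(action):
    # Hypothetical environment: action "b" performs better than action "a".
    return {"a": 0.2, "b": 0.9}[action]


agent = LearningAgent(["a", "b"], standard=0.5)
history = [agent.step(environment) for _ in range(3)]
```

In this run the agent first tries "a", receives negative feedback from the critic because the outcome falls below the standard, and then switches to "b" for the remaining steps.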
There may be various criteria for developing learning taxonomies. The criteria may be
based on –
• The type of knowledge learnt, e.g., concepts, problem-solving or game playing,
• The type of representation used, e.g., predicate calculus, rules or frames,
• The area of application, e.g., medical diagnosis, scheduling or prediction.
Rote learning is the simplest form of learning, which involves the least amount of
inferencing. In this form of learning, the knowledge is simply copied into the knowledge
base. This is the kind of learning involved in memorizing multiplication tables.
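Rote learning can be pictured as nothing more than storing facts and looking them up again. The following is a minimal sketch; the names knowledge_base, rote_learn and recall are hypothetical, and the products are computed in the loop only to stand in for the facts a teacher would supply.

```python
knowledge_base = {}


def rote_learn(fact, value):
    """Store the fact verbatim; no inference or generalisation takes place."""
    knowledge_base[fact] = value


def recall(fact):
    """Return the memorized value, or None if the fact was never stored."""
    return knowledge_base.get(fact)


# Memorize the multiplication tables from 1 x 1 up to 10 x 10.
for a in range(1, 11):
    for b in range(1, 11):
        rote_learn((a, b), a * b)

known = recall((7, 8))     # a memorized fact
unknown = recall((12, 3))  # outside the memorized tables, so nothing is recalled
```

The second lookup fails because a rote learner cannot answer anything it has not explicitly stored.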
The next type of learning is learning through instruction. This type of learning involves
more inferencing because the knowledge, in order to be operational, has to be
integrated into the existing knowledge base. This is the type of learning that is involved
when a pupil learns from a teacher.
For instance, in the case of the concept of cow as discussed above, we may find a
black cow, or we may find a three-legged cow that has lost one leg in an accident, or a
single-horned cow.
Supervised Learning: It involves learning a function from the given examples of its
inputs and outputs. Some of the examples of this type of learning are:
Unsupervised Learning: In this type of learning, the pairs of inputs and corresponding
expected outputs are not available. Hence, the learning system, on its own, has to find
relevant properties from the otherwise unknown objects. For example, finding the shortest
path, without any prior knowledge, between two cities in a totally unknown country.
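One common way in which a system finds structure without expected outputs is by grouping similar inputs. The sketch below is an assumption-laden illustration, not a method given in this unit: it splits a non-empty list of numbers into two groups by repeatedly assigning each value to the nearer of two centres (a one-dimensional, two-cluster form of k-means), and the function name two_means and the sample data are hypothetical.

```python
def two_means(values, iterations=10):
    """Split a non-empty list of numbers into two groups without any labels."""
    low, high = min(values), max(values)  # initial centres: the two extremes
    near_low, near_high = list(values), []
    for _ in range(iterations):
        # Assign every value to the nearer centre (ties go to the low centre).
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each centre to the mean of the values assigned to it.
        low = sum(near_low) / len(near_low)
        if near_high:
            high = sum(near_high) / len(near_high)
    return near_low, near_high


readings = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
small, large = two_means(readings)
```

No reading was ever labelled as "small" or "large"; the two groups emerge only from the distances between the values.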
In this kind of task environment, where the agent learns from rewards, an action policy
is needed to maximize reward. But sometimes, in the case of ongoing and non-terminating
tasks, the future reward might be infinite, so it is difficult to decide how to maximize
it. In such a scenario, one way of proceeding is to discount future rewards by a certain
factor, i.e., the agent may prefer rewards in the immediate future to those in the
distant future.
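The effect of discounting can be checked with a small computation. In the sketch below, the discount factor gamma (a number between 0 and 1) and the function name discounted_return are assumptions introduced for illustration; the reward received t steps in the future is weighted by gamma raised to the power t.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards in which the reward t steps ahead is weighted by gamma**t."""
    total = 0.0
    # Work backwards so that each earlier step discounts everything after it.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


short_run = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25
long_run = discounted_return([1.0] * 1000, gamma=0.9)      # close to 1 / (1 - 0.9)
```

Although the undiscounted sum of a reward of 1 per step grows without bound, the discounted sum stays finite and approaches 1 / (1 - gamma), so different policies can still be compared.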
2.7 SUMMARY
In this unit, the concept of ‘Intelligent Agent’ is introduced and various issues
about intelligent agents are discussed. Some definitions in this context are given in
section 2.2. The next section discusses the concept of rationality in the context of agents.
Section 2.4 discusses various types of task environments for agents. The next section
discusses the structure of an agent. Finally, section 2.6 discusses various forms of learning
in an agent.
2.8 SOLUTIONS/ANSWERS
Ex. 1) For L1
The left upper cell is unoccupied; therefore, C1=C2=C3=C4=0 and C8=0.
However, as the cells C5, C6 and C7 do not exist, they are assumed to be occupied.
Thus, the sense-vector is (0, 0, 0, 0, 1, 1, 1, 0)
For L2
C1=1=C8 and C2=C3=C4=C5=C6=C7=0
Thus, the sense-vector is (1, 0, 0, 0, 0, 0, 0, 1)