Unit 1 Introduction To Intelligence and Artificial Intelligence
1.0 Introduction
1.1 Objectives
1.2 Some Simple Definition of A.I.
1.3 Definition by Elaine Rich
1.4 Definition by Buchanan and Shortliffe
1.5 Another Definition by Elaine Rich
1.6 Definition by Barr and Feigenbaum
1.7 Definition by Shalkoff
1.8 Summary
1.9 Further Readings/References
1.0 INTRODUCTION
In this unit, we discuss intelligence, both machine and human. However, as the
subject matter of the course is machine intelligence, our discussion is mainly
from the point of view of machine intelligence. Machine intelligence is popularly
known as Artificial Intelligence and is generally referred to by its abbreviation, AI.
We shall also use the name AI for the discipline throughout. The style of discussion
in this unit is to start with a definition of AI by some pioneer in the field, and then
elaborate the ideas involved in the definition. While elaborating these ideas, we
introduce a number of relevant new ideas, concepts and definitions to be used later. In
this process, we have introduced and/or explained the following:
Introduction to A.I
1.1 OBJECTIVES
Before looking at what A.I. is in the opinion of experts, whose definitions involve
technical terms needing some explanation, we state below three simple definitions
from a completely non-specialist point of view:
2. A.I. is the study of making computer models of human intelligence; and finally
3. A.I. is the study concerned with building machines that simulate human
behaviour.
In order to form a better and more concrete idea of what AI is and what its subject
matter covers, we consider definitions suggested by leading writers and pioneer
contributors to the development of A.I. We supplement these definitions with
comments to facilitate the understanding of the underlying ideas and of the technical
terms involved in the definitions.
Definition 1: The first definition we consider is by Elaine Rich, the author of the
book entitled ‘Artificial Intelligence’ [1]. It states: Artificial Intelligence is the
study of how to make computers do things, at which, at the moment, people are
better.
Comment 1, Definition 1: Implicit in Rich’s definition is the idea that there are
mental tasks that computers can do better than human beings and, conversely, tasks
that at the moment human beings can do better than computers. It is well known
that computers are better than human beings in the matter of
numerical computation,
information storage, and
repetitive tasks.
On the other hand, at the moment, human beings are much better than machines in
the matter of
understanding, including the capability of explaining,
predicting the behaviour and structure of a system,
common-sense reasoning,
drawing conclusions when available information is incomplete, inconsistent or
both, and
visual understanding and speech understanding, which require simultaneous
availability (availability in parallel) of a large amount of information.
In essence, computers are found to be better than human beings in tasks
requiring sequential but fast computations, whereas human beings are better than
computers in tasks requiring essentially parallel processing. To clarify what it
means for a problem to essentially require parallel processing for its solution, we
consider the following problem:
Figure 1.1
We are given a paper with some letter, say C, written on it and a card-board with a
pin-hole in it. The card-board is placed on the paper in such a manner that the letter is
fully covered by the card-board, as shown in Figure 1.1. We are allowed to look at the
paper only through the pin-hole in the card-board. The problem is to tell correctly the
letter written on the paper by just looking through the pin-hole. As the information
about the black and white pixels is not available simultaneously, it is not possible to
figure out the letter written on the paper. Figuring out the letter requires
simultaneous availability of the whole of the grey-level information of all the
points constituting the letter and its surroundings on the paper. The grey-level
information of the surroundings of the letter provides the context in which to interpret
the letter.
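The pin-hole problem can be imitated with a small sketch. The two 5 x 5 bitmaps below are hypothetical and are not taken from Figure 1.1; they only serve to show that a single glimpse through the pin-hole is usually consistent with more than one letter, whereas the whole picture, seen at once, is not.

```python
# Hypothetical 5x5 bitmaps: '1' is black ink, '0' is white paper.
LETTER_C = [
    "01110",
    "10000",
    "10000",
    "10000",
    "01110",
]
LETTER_O = [
    "01110",
    "10001",
    "10001",
    "10001",
    "01110",
]
TEMPLATES = {"C": LETTER_C, "O": LETTER_O}


def letters_consistent_with_pixel(row, col, value):
    """Letters whose bitmap shows `value` at the single pin-hole position."""
    return sorted(
        name for name, bitmap in TEMPLATES.items() if bitmap[row][col] == value
    )


def letters_consistent_with_image(image):
    """Letters whose bitmap matches the whole image, seen all at once."""
    return sorted(name for name, bitmap in TEMPLATES.items() if bitmap == image)


# Number of pin-hole positions at which the two letters look identical.
agreeing_positions = sum(
    LETTER_C[r][c] == LETTER_O[r][c] for r in range(5) for c in range(5)
)

print(letters_consistent_with_pixel(0, 1, "1"))  # one glimpse: both letters fit
print(letters_consistent_with_image(LETTER_C))   # whole picture: only one fits
print(agreeing_positions, "of 25 positions cannot tell the letters apart")
```

In this toy setting, 22 of the 25 positions give no clue about which letter is on the paper; only the pattern as a whole settles the question.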
We consider another example that shows the significance of contextual information or
knowledge and its simultaneous availability for visual understanding. From the
following picture, we can conclude that one of the curved lines represents a river and
the other curved lines represent sides of the hills only on the basis of the simultaneous
availability of information of the pixels.
Contextual information plays a very important role not only in visual
understanding but also in language and speech understanding. In the case of speech
understanding, consider the following example, in which the word ‘with’ has a
number of meanings (or connotations), each being determined by the context.
Further, the phrase ‘for a long time’ may stand for anything from a few hours to
millions of years, again determined by the context, as explained below.
Comment 3, Definition 1: The definition is rather weak in the sense that it fails to
include some areas of potentially large importance, viz. problems that can be solved at
present neither by human beings nor by computers. Also, it may be noted that if, by
and by, computer systems become so powerful that there is no problem left which
human beings can solve better than computers, then nothing is left of AI according to
this definition.
Definition 2: AI is the branch of computer science that deals with symbolic rather
than numeric processing and non-algorithmic methods including the rules of
thumb or heuristics instead of algorithms as techniques for solving problems.
On the other hand, even a non-digital character sequence say ‘ABC’ may represent a
number, for example, in hexadecimal number system. Also, words of English (or any
other) language when considered lexicographically ordered, acquire some numeric
attributes.
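A minimal sketch of this point in Python: the same character sequence is read either as a symbol or as a number, depending on how we choose to interpret it. The word list is a made-up illustration.

```python
word = "ABC"

# Read as a symbol: just a sequence of characters.
is_the_symbol_abc = (word == "ABC")

# Read as a hexadecimal numeral: A=10, B=11, C=12.
as_number = int(word, 16)  # 10*256 + 11*16 + 12 = 2748

# Words acquire a numeric attribute (a rank) once they are
# lexicographically ordered.
words = ["symbol", "number", "context"]
ordered = sorted(words)
rank = {w: position for position, w in enumerate(ordered)}

print(is_the_symbol_abc, as_number)
print(ordered)
print(rank["number"])
```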
The conclusion we draw from the above discussion, is that a word as a sequence of
characters (including digits) may denote a number or a symbol (henceforth, a symbol
stands for non-numeric symbol) depending upon the context in which it is used.
And the context is determined by the nature of the problem under consideration. If
the problem can be solved using only numerical aspects of the objects in the domain
and environment of the problem, then we have the advantage of having built-in
relations (like less than, equal to etc.) and the built-in operations (like +, -, * etc.) that
can be readily used without having to define these relations and operations explicitly.
But, unfortunately, most of the problems we encounter for our day-to-day survival, or
even for our intellectual pursuits, involve not only quantitative but also qualitative
aspects of the objects of the problem domain. In order to solve these problems, we use
common-sense reasoning, exploit our capability for visual and linguistic
understanding, and try to get meaning out of the incomplete and even inconsistent
information that is available, in addition to a number of other known and unknown
mechanisms.
Qualitative aspects, their ideal representations, and the relations and operations
defined on these aspects are generally different for different types of problems.
Hence, it is impossible to capture, in general, the relevant relations and operations for
all types of problems and then to define these as built-in operations of the machine,
because there are potentially infinitely many types of problems that we encounter and
try to solve.
This discussion explains the basic difference between numeric processing and
(non-numeric) symbolic processing. Summarizing, numeric processing involves
only a small number of well-defined relations and operations having universally
accepted meanings, and hence these relations and operations can be incorporated as a
part of a computer system. On the other hand, in symbolic processing the relations
and operations required to solve a problem depend upon the problem under
consideration, and hence have to be defined explicitly along with, or as a part of, the
programs constituting the solutions of the problems.
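The contrast can be seen in a short sketch. The numeric relations and operations come with the language, while the symbolic relation has to be supplied by the program. The family facts and the names `PARENT_OF`, `is_parent` and `is_grandparent` are hypothetical, chosen only for illustration.

```python
# Numeric processing: relations (<, ==) and operations (+, *) are built in.
numeric_result = (3 < 5) and (3 + 5 == 8)

# Symbolic processing: the relation 'parent of' has no built-in meaning,
# so the program must define it explicitly (hypothetical facts).
PARENT_OF = {("asha", "ravi"), ("ravi", "meena")}


def is_parent(x, y):
    """True if the fact 'x is a parent of y' has been stated."""
    return (x, y) in PARENT_OF


def is_grandparent(x, z):
    """A derived relation, also defined by the program, not by the machine."""
    people = {person for pair in PARENT_OF for person in pair}
    return any(is_parent(x, y) and is_parent(y, z) for y in people)


print(numeric_result)
print(is_grandparent("asha", "meena"))
print(is_grandparent("ravi", "meena"))
```

A different problem domain would need a different set of facts and relations, which is why such definitions travel with the program rather than with the machine.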
They proved that even though a problem may be expressed precisely or formally (i.e.,
in terms of mathematical entities like sets, relations, functions etc.), it need not
yield to an algorithmic solution. A problem which has at least one algorithmic
solution is called a solvable problem. They further proved that, even among solvable
problems, only a small fraction can be solved using a feasible amount of resources
such as time and space. Informally, a feasible amount of resources means that
the requirement for resources does not increase too rapidly with the increase in the
size of the problem. The notion of the size of a problem will be defined formally later
on (under Comment 1 on Definition 3). However, an intuitive idea of the size of a
problem, and of its role in estimating the resources required to solve the problem, can
be had through the simple problem of calculating income tax for each of a number of
tax-payers. The resources, like time and computing equipment, required for 1000
tax-payers would be much less than those required for computing income tax for one
million tax-payers. In this problem, n, the number of tax-payers for whom the income
tax is to be calculated, may be taken as the size of the problem.
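As a rough illustration of how resource requirements depend on the size n, the sketch below counts steps rather than measuring time. It assumes one fixed-cost calculation per tax-payer, so the tax problem grows linearly with n. For contrast it also shows a requirement that doubles with every unit increase in n; this second function is our own example of growth that is too rapid and is not part of the tax problem.

```python
def tax_steps(n):
    """Steps for n tax-payers, assuming one fixed-cost calculation each."""
    return n


def doubling_steps(n):
    """A requirement that doubles whenever n grows by one, i.e. 2**n."""
    return 2 ** n


# Going from 1000 to one million tax-payers multiplies the work by 1000.
growth_factor = tax_steps(1_000_000) // tax_steps(1_000)
print(growth_factor)

# Linear growth stays manageable; doubling growth does not.
for n in (10, 20, 40):
    print(n, tax_steps(n), doubling_steps(n))
```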
This limitation and other difficulties with algorithmic solutions have given impetus to
efforts to find non-algorithmic solutions of problems. The Neural Network
approach to solving many difficult problems is a well-known alternative to
algorithmic methods. In AI, there are mainly two approaches to solving problems
that are generally difficult to solve with algorithmic methods. One is the neural
approach, mentioned just above. The other is called the symbolic approach. The
symbolic approach cannot be said to be non-algorithmic. The main difference between
the symbolic approach of AI and the algorithmic approach is that the symbolic
approach emphasizes exploitation of the knowledge of the domain and the
environment of the problem under consideration. Some of this knowledge is in the
form of rules of thumb, generally called heuristics in AI.
Consider the problem of crossing from one side over to the other side of a busy road
on which a number of vehicles are moving at different velocities. A step-by-step (i.e.,
algorithmic) method of solving this problem may consist of:
(i) Knowing (exactly) the distances of various vehicles from the path to be
followed to cross over.
(ii) Knowing the velocities and accelerations of the various vehicles moving on the
road within a distance of, say, one kilometer.
(iii) Using Newton’s laws of motion and their derivatives like $s = ut + \frac{1}{2}at^{2}$, and
calculating the times that would be taken by each of the various vehicles to
reach the path intended to be followed to cross over.
(iv) Adjusting dynamically our speeds on the path so that no collision takes place
with any of the vehicles moving on the road.
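Steps (i) to (iii) can be sketched in code, assuming, unrealistically, that the distance, velocity and acceleration of each vehicle are known exactly and that the acceleration stays constant. The function names, the safety margin and the sample vehicles are our own illustrative choices. Each arrival time is obtained by solving $s = ut + \frac{1}{2}at^{2}$ for the smallest positive t.

```python
import math


def time_to_reach(distance, velocity, acceleration):
    """Smallest t > 0 with distance = velocity*t + 0.5*acceleration*t**2.

    Returns None if the vehicle never reaches the crossing path.
    """
    if distance == 0:
        return 0.0
    if acceleration == 0:
        return distance / velocity if velocity > 0 else None
    discriminant = velocity ** 2 + 2 * acceleration * distance
    if discriminant < 0:
        return None  # a braking vehicle stops before reaching the path
    root = math.sqrt(discriminant)
    candidates = [
        (-velocity + root) / acceleration,
        (-velocity - root) / acceleration,
    ]
    positive = [t for t in candidates if t > 0]
    return min(positive) if positive else None


def safe_to_cross(vehicles, crossing_time, margin=1.0):
    """True if every vehicle arrives later than crossing_time + margin."""
    for distance, velocity, acceleration in vehicles:
        arrival = time_to_reach(distance, velocity, acceleration)
        if arrival is not None and arrival <= crossing_time + margin:
            return False
    return True


# Hypothetical vehicles: (distance in m, velocity in m/s, acceleration in m/s^2).
vehicles = [(100, 10, 0), (60, 5, 1)]
print(safe_to_cross(vehicles, crossing_time=5))
print(safe_to_cross(vehicles, crossing_time=7))
```

Even this simplified version needs exact measurements for every vehicle, which is what makes the method impractical for a pedestrian.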
The above is a systematic step-by-step method, i.e., an algorithm, for crossing the road
that may ensure no collision with any vehicle. But how many of us can follow it?
Hardly anybody! First of all, it is practically impossible to measure the distances,
velocities and accelerations of the various vehicles on the road, even within a radius of
one kilometer. Secondly, even if we assume that it is theoretically possible to measure
these quantities and to calculate safe timings to cross the road, we would not care to
follow the above-mentioned algorithm, because our past experience, our sense of
survival and other built-in mechanisms have allowed us, in the past, to cross over
safely without following any systematic method. All of us just guess whether the
vehicles are far enough away to cross over safely, and then actually cross over at an
appropriate time. Not even one in 1000, on average, gets hurt when crossing a road
using only guesses, even in a crowded city like Delhi, where the movement of vehicles
is among the most chaotic and unruly in the whole world. However, this is not to deny
that once in a while the guess is incorrect, and almost every day someone or other gets
hurt or even killed.
Each one of us, every day, comes across hundreds of problems similar to that of
crossing a road. For each such problem one uses a good guess, and one is
generally able to solve the problem satisfactorily each time, though the solutions
may not be the best possible ones. Once in a while, we even fail to get any
solution using the guess. However, if we insisted on following only a systematic
step-by-step method that guarantees the best possible solution for each problem, then
we would hardly be able to make any progress in our day-to-day business of even
mere survival.
The essence of the above discussion is that while attempting solutions of many of the
problems, it is not only desirable but almost essential that for each of such problems
we follow some good guess instead of following a step-by-step systematic method
that guarantees the best solution. In A.I, these guesses are called heuristics. In later
chapters, we discuss heuristics in detail. However, for the time being, we state that
heuristics are good guesses, possibly based on past experience, judgement, intuition
or hunches, which lead us most of the time to reasonably good solutions, though these
guesses do not guarantee the best solutions or even any solution for every instance of
the problem under consideration.
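The behaviour of a heuristic can be seen in a small example of our own choosing, which is not taken from the text: making up an amount from coins. The greedy rule of thumb "always take the largest coin that fits" is quick and often good, but it guarantees neither the best solution nor any solution. The exhaustive method is systematic and finds a solution with the fewest coins whenever one exists, at the cost of trying many combinations.

```python
from itertools import combinations_with_replacement


def greedy_change(amount, coins):
    """Heuristic: repeatedly take the largest coin that still fits."""
    chosen = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            chosen.append(coin)
    return chosen if amount == 0 else None


def exhaustive_change(amount, coins):
    """Systematic method: try coin multisets in order of increasing size."""
    ordered = sorted(coins, reverse=True)
    for count in range(1, amount // min(coins) + 1):
        for combo in combinations_with_replacement(ordered, count):
            if sum(combo) == amount:
                return list(combo)
    return None


# The guess works, but not optimally: three coins instead of two.
print(greedy_change(6, [1, 3, 4]), exhaustive_change(6, [1, 3, 4]))

# The guess fails outright although a solution exists.
print(greedy_change(6, [3, 4]), exhaustive_change(6, [3, 4]))
```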
1.5 ANOTHER DEFINITION BY ELAINE RICH
The next definition, again by Elaine Rich [1], is more technical and involves some
concepts from the Theory of Computation. It states:
Definition 3: Artificial Intelligence is the study of techniques for solving
exponentially hard problems in polynomial time exploiting knowledge about the
problem domain.
As computer study is partly engineering in nature, in the sense that we design and
implement or produce computer solutions for different types of problems, these
products, i.e., solutions, need to be evaluated vis-a-vis problem specifications
and other measures like efficiency in respect of the time and space requirements of
the solutions. In order to measure the efficiency of a suggested computer solution of a
problem, the earlier mentioned logicians/mathematicians suggested the concepts of
time complexity and space complexity for the solutions and even for the problems.
The basic idea behind these complexity measures is that all the operations that a
computer (of present or future generations) can execute may be thought of as composed
of a small number of basic operations. These basic operations can be easily compared
for their relative requirements for time and space. For the basic operation, say $O_1$,
which is expected to take minimum time (or space) among all the basic operations, the
time (or space) complexity is assigned the number one. For any other basic operation,
complexity is a positive number depending upon the expected relative requirement for
time (or space) for the operation as compared to that for the operation $O_1$. For other
computer operations, time/space complexity may be computed from those for the
basic operations. Also, from these complexities, we can compute the complexities of
the programs using the size of the input data as an additional parameter. For example,
to multiply two $n \times n$ matrices we require $n^3$ multiplications and $(n^3 - n^2)$ additions.
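The counts quoted above for the straightforward row-by-column method can be checked with a short sketch that multiplies two square matrices and tallies the basic operations. Each of the n x n entries of the product needs n multiplications and n - 1 additions. The function name is our own.

```python
def multiply_and_count(a, b):
    """Multiply two n x n matrices (lists of lists), counting basic operations.

    Returns (product, number_of_multiplications, number_of_additions).
    """
    n = len(a)
    product = [[0] * n for _ in range(n)]
    multiplications = 0
    additions = 0
    for i in range(n):
        for j in range(n):
            total = a[i][0] * b[0][j]
            multiplications += 1
            for k in range(1, n):
                total += a[i][k] * b[k][j]
                multiplications += 1
                additions += 1
            product[i][j] = total
    return product, multiplications, additions


for n in (2, 3, 4):
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    _, mults, adds = multiply_and_count(identity, identity)
    print(n, mults == n ** 3, adds == n ** 3 - n ** 2)
```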
Further, if Ms X also knows the house number in Hauz Khas, then there is hardly any
search required and X can directly reach Y’s residence. Next, consider just the opposite
situation so far as availability of knowledge is concerned. Suppose X does not even
know that Y lives in Delhi. We can easily guess the plight of X when she, following a
step-by-step method, is required to search, possibly all over the world, for the
residence of Y.
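The effect of knowledge on the amount of search can be imitated with a toy directory. All entries other than Delhi and Hauz Khas are invented for illustration, and the function `houses_to_visit` is hypothetical. It counts how many doors X has to knock on when she checks, one by one, only those houses that are consistent with what she already knows.

```python
# Hypothetical directory entries: (city, locality, house number, resident).
DIRECTORY = [
    ("Mumbai", "Bandra", 12, "P"),
    ("Delhi", "Rohini", 7, "Q"),
    ("Delhi", "Hauz Khas", 3, "R"),
    ("Delhi", "Hauz Khas", 42, "Y"),
    ("London", "Camden", 5, "S"),
]
FIELD_INDEX = {"city": 0, "locality": 1, "house": 2}


def houses_to_visit(resident, **known):
    """Number of houses visited, in directory order, before finding `resident`.

    `known` holds what the searcher already knows, e.g. city="Delhi".
    Only houses consistent with that knowledge are visited.
    """
    candidates = [
        record
        for record in DIRECTORY
        if all(record[FIELD_INDEX[key]] == value for key, value in known.items())
    ]
    for visits, record in enumerate(candidates, start=1):
        if record[3] == resident:
            return visits
    return None


print(houses_to_visit("Y"))
print(houses_to_visit("Y", city="Delhi"))
print(houses_to_visit("Y", city="Delhi", locality="Hauz Khas"))
print(houses_to_visit("Y", city="Delhi", locality="Hauz Khas", house=42))
```

Each additional piece of knowledge shrinks the set of candidates; with the house number known, a single visit suffices.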
Fischler and Firschein in their book ‘Intelligence: The Eye, the Brain and the
Computer’ [9] on Page 4 state that they expect an intelligent agent to be able to:
They further state that there are a number of human attributes that are related
to the concept of intelligence, but are normally considered distinct from it:
Awareness (consciousness)
Aesthetic appreciation (art, music)
Emotion (anger, sorrow, pain, pleasure, love, hate)
Sensory acuteness
Muscular coordination (motor skills)
Next, we discuss ‘intelligence’ at a more fundamental level. The ideas explained
below are based on the Information Transfer Model of scientific phenomena due to
Norbert Wiener (1894-1964). Norbert Wiener, an intellectual prodigy and author of
the famous book entitled Cybernetics [14], suggested the Transfer of Information
model to be a better model than the prevailing model based on Transfer of Energy for
the explanation of a number of scientific phenomena. Through Wiener’s theory, a new
discipline was born, also called Cybernetics.
However, our discussion is mainly based on ideas explained in the book ‘Beyond
Information’ by Tom Stonier [10]. According to the ideas explained in Stonier,
there are four fundamental properties of the universe, viz. energy, matter,
information and evolution (or change). The cardinal importance of information in the
universal scheme of things can be judged from the following argument: every
entity, from nucleons up to the whole of the universe, is known to us as an
organised system of simpler objects, e.g., fundamental particles organise into
nucleons, nucleons organise to form atomic nuclei, which along with electrons
organise into atoms, and so on. Molecules, polymers, membranes, organs,
living beings, societies, planets, planetary systems, galaxies … and finally the whole
universe, each is known as an organised system of some simpler objects. An
organisation builds upon pre-existing organisations. Thus an organised system is
recursively obtained (or defined) as an interdependent assembly of elements
and/or organised systems. And it is ‘information’ that is exchanged between
components of an organised system to effect their interdependence and to maintain
the integrity of the system as long as the system survives against the fourth
fundamental property of the universe, i.e., evolution or change. Gravitational pull,
now an established entity, is just an information processing activity. Thus
‘information’ is no more and no less an abstract concept than ‘energy’ or ‘matter’.
What mass is to matter and heat is to energy, organisation is to
information. Each of the former is a visible and measurable form of the
corresponding latter. The more the mass, the more the matter in a system; the more
the heat, the more the capacity to do work, i.e., energy in the system; similarly, the
higher the degree (or the greater the complexity) of the organization (in terms of the
underlying organizations of the components and their components and so on, and in
terms of the number and levels of interactions and relations between components at a
particular level), the higher is the information content of the system.
Remark 2: Information organises not only matter and energy but itself as well.
Evolution leads to discontinuities, i.e., to something which is qualitatively different
from the earlier existing entities. And intelligence is the phenomenon which has
evolved out of information but which is qualitatively different from information.
In a similar manner, we consider a finite set of attributes, and degrees for each
attribute, for organizations, i.e., information processing systems, which allow us to
categorise systems as intelligent or otherwise, in such a way that the systems
generally considered intelligent are categorised as intelligent and the systems
generally considered non-intelligent are categorised as non-intelligent.
As evolution has been at work for billions of years, the divergence among information
processing systems, intelligence-wise, must be potentially infinite. Thus any
categorization based on only a finite number of attributes would always be incomplete
and leave a large number of cases ‘uncategorisable’. To begin with, we start with a
working definition of intelligence and then later expand on it:
The above principle fits best, at least, in the limiting cases. At one extreme is a cube
of sugar dissolving in a cup of tea. Although highly organised, the cube is totally
controlled by environmental elements and hence, according to the above principle, it
has zero intelligence. This is exactly what we also feel. At the other extreme is a
technologically advanced human society, which can divert the waters of rivers to
irrigate plains to provide an assured supply of food to its population. Thus the
intelligence measure of a technologically advanced society as a whole is, according to
the above principle, quite high. This conclusion is in consonance with what we also
feel.
Fischler and Firschein [9] on Page 4 state: Intelligence involves learning capability
and goal-oriented behaviour. Additional attributes of intelligence include reasoning,
common-sense, planning, perception, creativity, memory retention & recall.
Shanks [11] on Page 49 observes: The simplest and perhaps safest definition of
intelligence is the ability to react to something new in a non-programmed way. The
ability to be surprised or to think for oneself is really what we mean by intelligence.
In order to explain the concept of A.I. through ‘Definition 4’, we discussed the
concept of intelligence itself as a phenomenon. Next, we quote another definition of
A.I., again based on the concept of intelligence but given from an engineering point
of view by another pioneer in the field, viz. Shalkoff, a Professor of Electrical
Engineering.
1.7 DEFINITION BY SHALKOFF
In view of the fact that A.I. is partly an engineering discipline according to the above
definition, let us recall what is meant by the concept engineering.
Again, in the light of the definition of Engineering given above, a part of the
definition by Shalkoff may be paraphrased as ‘…through application of A.I., products
are obtained that exhibit intelligent behaviour….’ This paraphrased part of the
definition by Shalkoff raises another issue: how to judge/evaluate whether a product
obtained through an application of A.I. is actually intelligent.
The issue of testing whether an A.I. product is intelligent was considered by the
pioneers themselves, including Alan Turing, the best-known name among the
pioneers. In honour of Turing, the most prestigious award for contributions to the field
of computer science, the Turing Award, has been instituted and is given annually.
Turing suggested a test, which is well known as Turing Test, for testing whether a
product has intelligence. An outline of the Turing test is given below.
For the purpose of the test, there are three rooms. In one of the rooms is a computer
system claimed to have embedded intelligence. In the other two rooms, two persons
are sitting, one in each room. The role of one of the persons, whom we call A, is to put
questions to the computer and to the other person, whom we call B, without knowing to
whom a particular question is being directed, and, of course, with the specific purpose
of identifying the computer. On the other hand, the computer would answer in such a
way that its identity is not revealed to A.
The communication among the three is only through computer terminals, so that the
identity of the computer or of the person B can be known only on the basis of the
quality of responses as intelligent or otherwise, and not on the basis of other human or
machine characteristics. If A is not able to tell which of the two is the computer, then
the computer is deemed intelligent. More appropriately, if the computer is able to
conceal its identity from A, then the computer is deemed intelligent.
We may note here that, in order to be called intelligent, the computer should be clever
enough not to give an answer too quickly, at least not within a fraction of a second, even
if it can, say, to a question involving finding the product of two numbers each of
more than 20 digits.
Objections to Turing Test: There have been a number of objections to the Turing
test as a test of intelligence of a machine. One of the most well-known objections is
called the Chinese Room Test, proposed by John Searle. The essence of the Chinese
Room Test, which we explain below, is that a system, say A, successfully convincing
others that it possesses the qualities of another system, say B, does not imply that
A actually possesses the qualities of B. For example, the capability of a male human
to convince others of being a woman does not give the male the
quality of bearing a child like a woman.
The scenario for the Chinese Room Test consists of a single room with two windows.
In the room a scholar on Shakespeare, knowing English but not knowing Chinese, is
sitting with a sort of encyclopedia on Shakespeare. The encyclopedia is printed in
such a way that for each pair of facing pages, one page is written in Chinese
characters and the other page is a translation in English of the contents of the facing
page in Chinese. Through one of the windows, questions on Shakespeare’s literature in
Chinese characters are sent to the person sitting inside. The person looks through the
encyclopedia and, on finding in the encyclopedia the exact copy of the sequence of
characters sent in, reads its translation in English, thinks of its answer and writes the
answer in English for his/her own understanding, finds the corresponding sequence of
Chinese characters in the encyclopedia, and sends the sequence of Chinese characters
through the other window. Now, Searle says that, though the scholar successfully
behaves as if s/he knows Chinese, by assumption this is not so. Just from the fact
that a system is able to simulate a quality, it cannot be inferred that the system
possesses the quality.
1.8 SUMMARY
This is an introductory unit to the course. The unit gives a bird’s eye view of the
whole of the course on Artificial Intelligence. The approach, in the unit, is to start with
a definition by some pioneer in A.I. In the process of discussing the definition, a
number of relevant new concepts are gradually built up and discussed.
In Section 1.4, we discuss the differences (i) between number and symbol, and (ii)
between algorithmic and non-algorithmic methods of solving problems.
In Section 1.5, another definition by Elaine Rich, as given below, is discussed:
Artificial Intelligence is the study of techniques for solving exponentially hard
problems in polynomial time exploiting knowledge about the problem domain.
In Section 1.6, we discuss the following definition of A.I. by Barr & Feigenbaum:
Artificial Intelligence is the part of computer science concerned with designing
intelligent computer systems, i.e., systems that exhibit the characteristics we
associate with intelligence in human behaviour.