AI Unit - 1
BY
PRASHU
• Thinking rationally" means thinking based on reason
rather than emotion. It can also mean the ability to
draw sensible conclusions from facts, logic, and data
• KEY TAKEAWAYS
• Expected utility refers to the utility of an entity or aggregate economy
over a future period of time, given unknowable circumstances.
• Expected utility theory is used as a tool for analyzing situations in
which individuals must make a decision without knowing the
outcomes that may result from that decision.
• The expected utility theory was first posited by Daniel Bernoulli who
used it to solve the St. Petersburg Paradox.
• Expected utility is also used to evaluate situations without immediate
payback, such as purchasing insurance.
• Maximizing your expected utility means choosing the
option that has the highest average utility, where
average utility is the sum of all utilities weighted by
their probabilities. This theory is used to understand
decisions made in risky situations
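• As a concrete illustration of the definition above, the short Python sketch below computes the expected utility of two hypothetical options and picks the one with the higher value; the option names, probabilities, and utilities are invented for the example.
```python
# Minimal sketch: expected utility = sum of utilities weighted by their probabilities.
# The options, probabilities, and utility values below are hypothetical.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one option."""
    return sum(p * u for p, u in outcomes)

options = {
    # A risky gamble: 50% chance of utility 100, 50% chance of utility 0.
    "gamble": [(0.5, 100), (0.5, 0)],
    # A sure thing: utility 60 with certainty.
    "sure_thing": [(1.0, 60)],
}

for name, outcomes in options.items():
    print(name, expected_utility(outcomes))       # gamble 50.0, sure_thing 60.0

# Maximizing expected utility means choosing the option with the highest value.
print(max(options, key=lambda name: expected_utility(options[name])))   # sure_thing
```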
• Homo sapiens—man the wise
AI
• For thousands of years, we have tried to understand how we think; that is, how a mere
handful of matter can perceive, understand, predict, and manipulate a world far larger
and more complicated than itself. The field of artificial intelligence, or AI, goes
further still: it attempts not just to understand but also to build intelligent entities.
• Definitions of AI vary along two dimensions. Some are concerned with thought processes and reasoning, whereas others address behavior.
• Some measure success in terms of fidelity to human performance, whereas others measure against an ideal performance measure, called rationality.
• The interdisciplinary field of cognitive science brings together computer models from
AI and experimental techniques from psychology to construct precise and testable
theories of the human mind
• The main difference between artificial intelligence (AI) and cognitive
science is that AI is a technology that aims to simulate human
intelligence, while cognitive science is the study of the human mind:
• Artificial intelligence (AI)
• AI is a technology that allows machines to simulate human
intelligence, such as learning, problem solving, and decision
making. AI can be used to create systems that can see and identify
objects, understand human language, and perform specific tasks.
• Cognitive science
• Cognitive science is the study of the human mind and brain, and how
it represents and manipulates knowledge. Cognitive science is an
interdisciplinary field that includes philosophy, psychology,
neuroscience, linguistics, and anthropology
• The purpose of AI is to think on its own and make decisions independently, whereas the purpose of Cognitive Computing is to simulate and assist human thinking and decision-making.
Thinking rationally: The “laws of thought” approach
• The Greek philosopher Aristotle was one of the first to attempt to codify “right thinking,” that is, irrefutable reasoning processes. His syllogisms provided patterns for argument structures that always yielded correct conclusions when given correct premises—for example, “Socrates is a man; all men are mortal; therefore, Socrates is mortal.” These laws of thought were supposed to govern the operation of the mind; their study initiated the field called logic.
• Logicians in the 19th century developed a precise notation for statements about all
kinds of objects in the world and the relations among them.
• By 1965, programs existed that could, in principle, solve any solvable problem
described in logical notation
• The so-called logicist tradition within artificial intelligence hopes to build on such
programs to create intelligent systems
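• To make the “laws of thought” idea concrete, here is a minimal sketch (not any historical program) that mechanically derives a conclusion from given premises using the classic syllogism pattern; the encoding of facts and rules is a simplification assumed for illustration.
```python
# Minimal forward-chaining sketch of Aristotle's syllogism:
# "Socrates is a man; all men are mortal; therefore, Socrates is mortal."
# The representation of facts and rules here is a simplified assumption.

facts = {("man", "Socrates")}          # premise: Socrates is a man
rules = [("man", "mortal")]            # premise: for every X, man(X) implies mortal(X)

changed = True
while changed:                         # apply rules until nothing new can be derived
    changed = False
    for premise, conclusion in rules:
        for predicate, subject in list(facts):
            if predicate == premise and (conclusion, subject) not in facts:
                facts.add((conclusion, subject))
                changed = True

print(("mortal", "Socrates") in facts)   # True: the conclusion follows mechanically
```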
Acting rationally: The rational agent approach
• An agent is just something that acts (agent comes from the Latin agere, to do). Of course, all computer programs do something, but computer agents are expected to do more: operate autonomously, perceive their environment, persist over a prolonged time period, adapt to change, and create and pursue goals.
• A rational agent is one that acts so as to achieve the best outcome or, when there is uncertainty, the best expected outcome.
• In the “laws of thought” approach to AI, the emphasis was on correct inferences. Making correct inferences is sometimes part of being a rational agent, because one way to act rationally is to reason logically to the conclusion that a given action will achieve one’s goals and then to act on that conclusion.
• On the other hand, correct inference is not all of rationality; in some situations, there is no provably correct thing to do, but something must still be done. There are also ways of acting rationally that cannot be said to involve inference. For example, recoiling from a hot stove is a reflex action that is usually more successful than a slower action taken after careful deliberation.
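• The hot-stove example is essentially a condition-action rule; a minimal sketch of such a simple reflex agent is shown below (the percept format and temperature threshold are assumptions made for the illustration).
```python
# Minimal simple-reflex-agent sketch: act rationally without any inference,
# like recoiling from a hot stove. Percept format and threshold are assumed.

def reflex_agent(percept):
    """percept: dict such as {"surface_temp_c": 95.0}"""
    if percept.get("surface_temp_c", 20.0) > 60.0:   # hard-coded condition-action rule
        return "withdraw_hand"
    return "no_op"

print(reflex_agent({"surface_temp_c": 95.0}))   # withdraw_hand
print(reflex_agent({"surface_temp_c": 22.0}))   # no_op
```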
• All the skills needed for the Turing Test also allow an agent to act rationally. Knowledge representation and reasoning enable agents to reach good decisions. We need to be able to generate comprehensible sentences in natural language to get by in a complex society. We need learning not only for erudition, but also because it improves our ability to generate effective behavior.
• The rational-agent approach has two advantages over the other approaches. First, it is more general than the “laws of thought” approach because correct inference is just one of several possible mechanisms for achieving rationality. Second, it is more amenable to scientific development than approaches based on human behavior or human thought.
• Philosophy
• Can formal rules be used to draw valid conclusions?
• How does the mind arise from a physical brain?
• Where does knowledge come from?
• How does knowledge lead to action?
• Aristotle (384–322 B.C.) was the first to formulate a precise set of laws governing the rational part of the mind. He developed an informal system of syllogisms for proper reasoning, which in principle allowed one to generate conclusions mechanically, given initial premises.
• Thomas Hobbes (1588–1679) proposed that reasoning was like numerical computation, that “we add and subtract in our silent thoughts.” The automation of computation itself was already well under way.
• The first known calculating machine was constructed around 1623 by the German scientist Wilhelm Schickard (1592–1635), although the Pascaline, built in 1642 by Blaise Pascal (1623–1662), is more famous. Pascal wrote that “the arithmetical machine produces effects which appear nearer to thought than all the actions of animals.”
• It’s one thing to say that the mind operates, at least in part, according to logical rules, and to build physical systems that emulate some of those rules; it’s another to say that the mind itself is such a physical system.
• Descartes was a strong advocate of the power of reasoning in understanding the world, a philosophy now called rationalism. But Descartes was also a proponent of dualism. He held that there is a part of the human mind (or soul or spirit) that is outside of nature, exempt from physical laws. Animals, on the other hand, did not possess this dual quality; they could be treated as machines.
• An alternative to dualism is materialism, which holds that the brain’s operation according to the laws of physics constitutes the mind.
• The empiricism movement, starting with Francis Bacon’s (1561–1626) Novum Organum, is characterized by a dictum of John Locke (1632–1704): “Nothing is in the understanding, which was not first in the senses.”
• David Hume’s (1711–1776) A Treatise of Human Nature (Hume, 1739) proposed what is now known as the principle of induction: that general rules are acquired by exposure to repeated associations between their elements.
• The famous Vienna Circle, led by Rudolf Carnap (1891–1970), developed the doctrine of logical positivism. This doctrine holds that all knowledge can be characterized by logical theories connected, ultimately, to observation sentences that correspond to sensory inputs; thus logical positivism combines rationalism and empiricism.
• The confirmation theory of Carnap and Carl Hempel (1905–1997) attempted to analyze the acquisition of knowledge from experience. Carnap’s book The Logical Structure of the World (1928) defined an explicit computational procedure for extracting knowledge from elementary experiences. It was probably the first theory of mind as a computational process.
• The final element in the philosophical picture of the mind is the connection between knowledge and action. This question is vital to AI because intelligence requires action as well as reasoning. Moreover, only by understanding how actions are justified can we understand how to build an agent whose actions are justifiable (or rational).
• Aristotle argued (in De Motu Animalium) that actions are justified by a logical connection between goals and knowledge of the action’s outcome.
• Mathematics
• What are the formal rules to draw valid conclusions?
• What can be computed?
• How do we reason with uncertain information?
• The first nontrivial algorithm is thought to be Euclid’s algorithm for computing greatest common divisors. The word algorithm (and the idea of studying them) comes from al-Khowarazmi, a Persian mathematician of the 9th century, whose writings also introduced Arabic numerals and algebra to Europe.
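• Since the text singles out Euclid’s algorithm as the first nontrivial algorithm, here is its standard formulation for reference (a textbook version, not something specific to these notes):
```python
# Euclid's algorithm for the greatest common divisor:
# repeatedly replace (a, b) with (b, a mod b) until the remainder is zero.

def gcd(a: int, b: int) -> int:
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(48, 36))   # 12
print(gcd(17, 5))    # 1 (17 and 5 are coprime)
```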
• In 1931, Gödel showed that limits on deduction do exist. His incompleteness theorem showed that in any formal theory as strong as Peano arithmetic (the elementary theory of natural numbers), there are true statements that are undecidable in the sense that they have no proof within the theory.
• This fundamental result can also be interpreted as showing that some functions on the integers cannot be represented by an algorithm—that is, they cannot be computed. This motivated Alan Turing (1912–1954) to try to characterize exactly which functions are computable—capable of being computed.
• Although decidability and computability are important to an understanding of computation, the notion of tractability has had an even greater impact. Roughly speaking, a problem is called intractable if the time required to solve instances of the problem grows exponentially with the size of the instances.
• The theory of NP-completeness, pioneered by Steven Cook (1971) and Richard Karp (1972), provides a method for recognizing such intractable problems. Cook and Karp showed the existence of large classes of canonical combinatorial search and reasoning problems that are NP-complete.
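• To make “time grows exponentially with the size of the instances” concrete, the sketch below brute-forces subset sum, a classic NP-complete problem, by enumerating all 2^n subsets; the instance is made up for illustration.
```python
# Brute-force subset sum: does any subset of `nums` add up to `target`?
# Enumerating all 2**n subsets makes the running time grow exponentially
# with n, which is the kind of scaling the text calls intractable.
from itertools import combinations

def subset_sum_bruteforce(nums, target):
    for r in range(len(nums) + 1):
        for combo in combinations(nums, r):
            if sum(combo) == target:
                return True
    return False

nums = [3, 34, 4, 12, 5, 2]                 # a tiny, made-up instance
print(subset_sum_bruteforce(nums, 9))       # True  (4 + 5)
print(subset_sum_bruteforce(nums, 30))      # False
```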
• Besides logic and computation, the third great contribution of mathematics to AI is the theory of probability. The Italian Gerolamo Cardano (1501–1576) first framed the idea of probability, describing it in terms of the possible outcomes of gambling events.
• Economics
• How should we make decisions so as to maximize payoff?
• How should we do this when others may not go along?
• How should we do this when the payoff may be far in the future?
• Princeton was home to another influential figure in AI, John McCarthy. After receiving his PhD there in 1951 and working for two years as an instructor, McCarthy moved to Stanford and then to Dartmouth College, which was to become the official birthplace of the field.
• Newell and Simon’s early success was followed up with the General Problem Solver, or GPS. Unlike Logic Theorist, this program was designed from the start to imitate human problem-solving protocols. Within the limited class of puzzles it could handle, it turned out that the order in which the program considered subgoals and possible actions was similar to that in which humans approached the same problems. Thus, GPS was probably the first program to embody the “thinking humanly” approach.
• The success of GPS and subsequent programs as models of cognition led Newell and Simon (1976) to formulate the famous physical symbol system hypothesis, which states that “a physical symbol system has the necessary and sufficient means for general intelligent action.” What they meant is that any system (human or machine) exhibiting intelligence must operate by manipulating data structures composed of symbols.
• At IBM, Nathaniel Rochester and his colleagues produced some of the first AI programs. Herbert Gelernter (1959) constructed the Geometry Theorem Prover, which was able to prove theorems that many students of mathematics would find quite tricky. Starting in 1952, Arthur Samuel wrote a series of programs for checkers (draughts) that eventually learned to play at a strong amateur level.
• John McCarthy moved from Dartmouth to MIT and there made three crucial contributions in one historic year: 1958. In MIT AI Lab Memo No. 1, McCarthy defined the high-level language Lisp, which was to become the dominant AI programming language for the next 30 years.
• Also in 1958, McCarthy published a paper entitled Programs with Common Sense, in which he described the Advice Taker, a hypothetical program that can be seen as the first complete AI system. Like the Logic Theorist and Geometry Theorem Prover, McCarthy’s program was designed to use knowledge to search for solutions to problems.
• 1958 also marked the year that Marvin Minsky moved to MIT. His initial collaboration with McCarthy did not last, however. McCarthy stressed representation and reasoning in formal logic, whereas Minsky was more interested in getting programs to work and eventually developed an anti-logic outlook.
• In 1963, McCarthy started the AI lab at Stanford. His plan to use logic to build the ultimate Advice Taker was advanced by J. A. Robinson’s discovery in 1965 of the resolution method (a complete theorem-proving algorithm for first-order logic). Work at Stanford emphasized general-purpose methods for logical reasoning. Applications of logic included Cordell Green’s question-answering and planning systems (Green, 1969b) and the Shakey robotics project at the Stanford Research Institute.
• Minsky supervised a series of students who chose limited problems that appeared to require intelligence to solve. These limited domains became known as microworlds.
• James Slagle’s SAINT program (1963) was able to solve closed-form calculus integration problems typical of first-year college courses. Tom Evans’s ANALOGY program (1968) solved geometric analogy problems that appear in IQ tests. Daniel Bobrow’s STUDENT program (1967) solved algebra story problems, such as the following:
• If the number of customers Tom gets is twice the square of 20 percent of the number of advertisements he runs, and the number of advertisements he runs is 45, what is the number of customers Tom gets?
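• For reference, the arithmetic in that story problem works out as follows (a quick check of the numbers, not STUDENT’s actual parsing):
```python
# Checking the STUDENT story problem by hand:
# customers = 2 * (20% of advertisements)**2, with 45 advertisements.
advertisements = 45
customers = 2 * (0.20 * advertisements) ** 2
print(customers)   # 162.0  (20% of 45 is 9; 9 squared is 81; twice 81 is 162)
```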
• The most famous microworld was the blocks world, which consists of a set of solid blocks placed on a tabletop (or more often, a simulation of a tabletop). A typical task in this world is to rearrange the blocks in a certain way, using a robot hand that can pick up one block at a time.
• The blocks world was home to the vision project of David Huffman (1971), the vision and constraint-propagation work of David Waltz (1975), the learning theory of Patrick Winston (1970), the natural-language-understanding program of Terry Winograd (1972), and the planner of Scott Fahlman (1974).
• The work of Winograd and Cowan (1963) showed how a large number of elements could collectively represent an individual concept, with a corresponding increase in robustness and parallelism.
• Hebb’s learning methods were enhanced by Bernie Widrow (Widrow and Hoff, 1960; Widrow, 1962), who called his networks adalines, and by Frank Rosenblatt (1962) with his perceptrons. The perceptron convergence theorem (Block et al., 1962) says that the learning algorithm can adjust the connection strengths of a perceptron to match any input data, provided such a match exists.
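• The perceptron learning rule mentioned above fits in a few lines; the sketch below trains a single perceptron on the AND function. The data set, learning rate, and number of passes are illustrative choices, and convergence is guaranteed here only because AND is linearly separable, as the convergence theorem requires.
```python
# Minimal perceptron learning sketch: adjust connection strengths (weights)
# whenever the prediction disagrees with the target. The AND data set,
# learning rate, and epoch count are illustrative choices.

def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# Logical AND is linearly separable, so the perceptron convergence theorem applies.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias, lr = [0.0, 0.0], 0.0, 0.1
for _ in range(20):                               # a few passes over the data
    for x, target in data:
        error = target - predict(weights, bias, x)
        weights = [w + lr * error * xi for w, xi in zip(weights, x)]
        bias += lr * error

print([predict(weights, bias, x) for x, _ in data])   # [0, 0, 0, 1]
```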
A dose of reality (1966–1973)
• A typical story occurred in early machine translation efforts, which were generously funded by the U.S. National Research Council in an attempt to speed up the translation of Russian scientific papers in the wake of the Sputnik launch in 1957. It was thought initially that simple syntactic transformations based on the grammars of Russian and English, and word replacement from an electronic dictionary, would suffice to preserve the exact meanings of sentences. The fact is that accurate translation requires background knowledge in order to resolve ambiguity and establish the content of the sentence. The famous retranslation of “the spirit is willing but the flesh is weak” as “the vodka is good but the meat is rotten” illustrates the difficulties encountered.
• In 1966, a report by an advisory committee found that “there has been no machine translation of general scientific text, and none is in immediate prospect.” All U.S. government funding for academic translation projects was canceled. Today, machine translation is an imperfect but widely used tool for technical, commercial, government, and Internet documents.
• The illusion of unlimited computational power was not confined to problem-solving programs. Early experiments in machine evolution (now called genetic algorithms) (Friedberg, 1958; Friedberg et al., 1959) were based on the undoubtedly correct belief that by making an appropriate series of small mutations to a machine-code program, one can generate a program with good performance for any particular task. The idea, then, was to try random mutations with a selection process to preserve mutations that seemed useful. Despite thousands of hours of CPU time, almost no progress was demonstrated. Modern genetic algorithms use better representations and have shown more success.
• The new back-propagation learning algorithms for multilayer networks that were to cause an enormous resurgence in neural-net research in the late 1980s were actually discovered first in 1969 (Bryson and Ho, 1969).
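• The idea just described, random mutations plus a selection process, is easy to state in modern terms; below is a minimal genetic-algorithm-style sketch that evolves a bit string toward a target by mutation and selection. The target, population size, and mutation rate are arbitrary choices for the illustration, not anything from the 1958 experiments.
```python
# Minimal mutation-plus-selection sketch (a toy genetic algorithm).
# Fitness = number of bits matching a target string; the target, population
# size, and mutation rate below are arbitrary illustrative choices.
import random

random.seed(0)
TARGET = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]

def fitness(candidate):
    return sum(c == t for c, t in zip(candidate, TARGET))

def mutate(candidate, rate=0.1):
    return [1 - bit if random.random() < rate else bit for bit in candidate]

# Start from a random population; keep the fittest individuals each generation
# and refill the population with mutated copies of the survivors.
population = [[random.randint(0, 1) for _ in TARGET] for _ in range(20)]
for generation in range(100):
    population.sort(key=fitness, reverse=True)
    if fitness(population[0]) == len(TARGET):
        break
    survivors = population[:5]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(15)]

best = max(population, key=fitness)
print(generation, fitness(best), best)   # typically reaches a perfect match quickly
```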
Knowledge-based systems: The key to power? (1969–1979)
• The picture of problem solving that had arisen during the first decade of AI research was of a general-purpose search mechanism trying to string together elementary reasoning steps to find complete solutions. Such approaches have been called weak methods because, although general, they do not scale up to large or difficult problem instances.
• The alternative to weak methods is to use more powerful, domain-specific knowledge that allows larger reasoning steps. The DENDRAL program (Buchanan et al., 1969) was an early example of this approach. It was developed at Stanford, where Ed Feigenbaum (a former student of Herbert Simon), Bruce Buchanan (a philosopher turned computer scientist), and Joshua Lederberg (a Nobel laureate geneticist) teamed up to solve the problem of inferring molecular structure from the information provided by a mass spectrometer.
• Feigenbaum and others at Stanford began the Heuristic Programming Project (HPP) to investigate the extent to which the new methodology of expert systems could be applied to other areas of human expertise.
• The next major effort was in the area of medical diagnosis. Feigenbaum, Buchanan, and Dr. Edward Shortliffe developed MYCIN to diagnose blood infections. With about 450 rules, MYCIN was able to perform as well as some experts, and considerably better than junior doctors. It also contained two major differences from DENDRAL. First, unlike the DENDRAL rules, no general theoretical model existed from which the MYCIN rules could be deduced; they had to be acquired from extensive interviewing of experts, who in turn acquired them from textbooks, other experts, and direct experience of cases. Second, the rules had to reflect the uncertainty associated with medical knowledge. MYCIN incorporated a calculus of uncertainty called certainty factors, which seemed (at the time) to fit well with how doctors assessed the impact of evidence on the diagnosis.
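• MYCIN’s actual rule base is not reproduced here, but the flavor of certainty factors can be shown with a small sketch: each rule contributes a certainty factor to its conclusion, and positive factors for the same conclusion are combined so that additional evidence increases confidence without exceeding 1. The rules and numbers below are invented; the combination formula shown is the standard one for two positive certainty factors.
```python
# Illustrative certainty-factor sketch (rules and numbers are invented).
# Two positive certainty factors supporting the same conclusion combine as
# cf_total = cf1 + cf2 * (1 - cf1), which never exceeds 1.0.

def combine(cf1, cf2):
    return cf1 + cf2 * (1 - cf1)

# Hypothetical rules: (required findings, conclusion, certainty factor of the rule)
rules = [
    ({"gram_negative", "rod_shaped"}, "e_coli", 0.7),
    ({"grows_in_blood_culture"},      "e_coli", 0.4),
]

observed = {"gram_negative", "rod_shaped", "grows_in_blood_culture"}

belief = {}
for findings, conclusion, cf in rules:
    if findings <= observed:                       # all findings required by the rule hold
        belief[conclusion] = combine(belief.get(conclusion, 0.0), cf)

print(round(belief["e_coli"], 2))   # 0.82, i.e. 0.7 + 0.4 * (1 - 0.7)
```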
AI becomes an industry (1980–present)
• The first successful commercial expert system, R1, began operation at the Digital Equipment Corporation (McDermott, 1982). The program helped configure orders for new computer systems; by 1986, it was saving the company an estimated $40 million a year. By 1988, DEC’s AI group had 40 expert systems deployed, with more on the way. DuPont had 100 in use and 500 in development, saving an estimated $10 million a year. Nearly every major U.S. corporation had its own AI group and was either using or investigating expert systems.
• In 1981, the Japanese announced the “Fifth Generation” project, a 10-year plan to build intelligent computers running Prolog. In response, the United States formed the Microelectronics and Computer Technology Corporation (MCC) as a research consortium designed to assure national competitiveness. In both cases, AI was part of a broad effort, including chip design and human-interface research. In Britain, the Alvey report reinstated the funding that was cut by the Lighthill report. In all three countries, however, the projects never met their ambitious goals.
• Overall, the AI industry boomed from a few million dollars in 1980 to billions of dollars in 1988, including hundreds of companies building expert systems, vision systems, robots, and software and hardware specialized for these purposes. Soon after that came a period called the “AI Winter,” in which many companies fell by the wayside as they failed to deliver on extravagant promises.
The return of neural networks (1986–present)
• One influential paper in this line was Yarowsky’s (1995) work on word-sense disambiguation: given the use of the word “plant” in a sentence, does that refer to flora or factory? Previous approaches to the problem had relied on human-labeled examples combined with machine learning algorithms. Yarowsky showed that the task can be done, with accuracy above 96%, with no labeled examples at all. Instead, given a very large corpus of unannotated text and just the dictionary definitions of the two senses—“works, industrial plant” and “flora, plant life”—one can label examples in the corpus, and from there bootstrap to learn new patterns that help label new examples.
• Banko and Brill (2001) show that techniques like this perform even better as the amount of available text goes from a million words to a billion, and that the increase in performance from using more data exceeds any difference in algorithm choice.
• Hays and Efros (2007) discuss the problem of filling in holes in a photograph. Suppose you use Photoshop to mask out an ex-friend from a group photo, but now you need to fill in the masked area with something that matches the background. Hays and Efros defined an algorithm that searches through a collection of photos to find something that will match. They found the performance of their algorithm was poor when they used a collection of only ten thousand photos, but crossed a threshold into excellent performance when they grew the collection to two million photos.
STATE OF THE ART: What can AI do today?