AI Course File HIMAKIRAN
UNIT-I:
Introduction to artificial intelligence: Introduction, history, intelligent systems,
foundations of AI, Applications, tic-tac-toe game playing, development of AI languages,
current trends in AI
UNIT-II:
Problem solving: state-space search and control strategies : Introduction, general problem
solving, characteristics of problem, exhaustive searches, heuristic search techniques,
iterative-deepening a*, constraint satisfaction
Problem reduction and game playing: Introduction, problem reduction, game playing,
alpha-beta pruning, two-player perfect information games
UNIT-III:
Logic concepts: Introduction, propositional calculus, propositional logic, natural deduction
system, axiomatic system, semantic tableau system in propositional logic, resolution
refutation in propositional logic, predicate logic
UNIT-IV:
Knowledge representation: Introduction, approaches to knowledge representation,
knowledge representation using semantic network, extended semantic networks for KR,
knowledge representation using frames advanced knowledge representation techniques:
Introduction, conceptual dependency theory, script structure, cyc theory, case grammars,
semantic web
UNIT-V:
Expert system and applications: Introduction, phases in building expert systems, expert
systems versus traditional systems, rule-based expert systems, blackboard systems, truth
maintenance systems, applications of expert systems, list of shells and tools
UNIT-VI:
Uncertainty measure: probability theory: Introduction, probability theory, Bayesian belief
networks, certainty factor theory, Dempster-Shafer theory
Fuzzy sets and fuzzy logic: Introduction, fuzzy sets, fuzzy set operations, types of
membership functions, multi-valued logic, fuzzy logic, linguistic variables and hedges, fuzzy
propositions, inference rules for fuzzy propositions, fuzzy systems.
TEXT BOOKS:
2. Artificial Intelligence: A Modern Approach, 2nd ed., Stuart Russell, Peter Norvig, PEA
3. Artificial Intelligence - Elaine Rich, Kevin Knight, Shivashankar B. Nair, 3rd ed., TMH
REFERENCE BOOKS:
1. Artificial Intelligence: Structures and Strategies for Complex Problem Solving - George
F. Luger, 5th ed., PEA
Course Objectives
COE-2 : Understanding of the blind and heuristic search and Game Playing algorithms
Course Outcomes
CO1: Identify Methods in AI that may be suited to solving a given problem and Game Playing
CO2: Analyze the basic issues of different types of knowledge representation techniques to
build intelligent system
Introduction:
Artificial Intelligence is concerned with the design of intelligence in an artificial device. The term
was coined by John McCarthy in 1956.
Intelligence is the ability to acquire, understand and apply the knowledge to achieve goals in the
world.
AI is the study of the mental faculties through the use of computational models
AI is the study of intellectual/mental processes as computational processes.
AI program will demonstrate a high level of intelligence to a degree that equals or exceeds the
intelligence required of a human in performing some task.
AI is unique, sharing borders with Mathematics, Computer Science, Philosophy, Psychology and Neuroscience.
History of AI:
Philosophy
e.g., foundational issues (can a machine think?), issues of knowledge and belief, mutual
knowledge
Psychology and Cognitive Science
e.g., problem solving skills
Neuro-Science
e.g., brain architecture
Computer Science And Engineering
e.g., complexity theory, algorithms, logic and inference, programming languages, and system
building.
Mathematics and Physics
e.g., statistical modeling, continuous mathematics,
Statistical Physics, and Complex Systems.
1) Game Playing
Deep Blue Chess program beat world champion Garry Kasparov.
2) Speech Recognition
PEGASUS spoken language interface to American Airlines' EAASY SABRE reservation
system, which allows users to obtain flight information and make reservations over the
telephone. The 1990s saw significant advances in speech recognition, so that limited
systems are now successful.
3) Computer Vision
Face recognition programs in use by banks, government, etc. The ALVINN system from
CMU autonomously drove a van from Washington, D.C. to San Diego (all but 52 of 2,849
miles), averaging 63 mph day and night, and in all weather conditions. Handwriting
recognition, electronics and manufacturing inspection, photo interpretation, baggage
inspection, reverse engineering to automatically construct a 3D geometric model.
4) Expert Systems
Application-specific systems that rely on obtaining the knowledge of human experts in an
area and programming that knowledge into a system.
a. Diagnostic Systems : MYCIN system for diagnosing bacterial infections of the blood
and suggesting treatments. Intellipath pathology diagnosis system (AMA approved).
Pathfinder medical diagnosis system, which suggests tests and makes diagnoses.
Whirlpool customer assistance center.
b. System Configuration
DEC's XCON system for custom hardware configuration. Radiotherapy treatment
planning.
d. Classification Systems
Put information into one of a fixed set of categories using several sources of
information. E.g., financial decision making systems. NASA developed a system for
classifying very faint areas in astronomical images into either stars or galaxies with
very high accuracy by learning from human experts' classifications.
9) Machine Learning
Application of AI:
AI algorithms have attracted the close attention of researchers and have also been applied
successfully to solve problems in engineering. Nevertheless, for large and complex problems, AI
algorithms consume considerable computation time due to the stochastic nature of their search
approaches.
2) Engineering: checking designs, offering suggestions to create new products, expert systems for
all engineering problems
3) Manufacturing: assembly, inspection and maintenance
5) Education: in teaching
6) Fraud detection
7) Object identification
8) Information retrieval
Building AI Systems:
1) Perception
Intelligent biological systems are physically embodied in the world and experience the world
through their sensors (senses).
For an autonomous vehicle, input might be images from a camera and range information from a
rangefinder.
For a medical diagnosis system, perception is the set of symptoms and test results that have been
obtained and input to the system manually.
2) Reasoning
Inference, decision-making, classification from what is sensed and what the internal "model" is of
the world. Might be a neural network, logical deduction system, Hidden Markov Model
induction, heuristic searching a problem space, Bayes Network inference, genetic algorithms, etc.
Includes areas of knowledge representation, problem solving, decision theory, planning, game
theory, machine learning, uncertainty reasoning, etc.
3) Action
Biological systems interact within their environment by actuation, speech, etc. All behavior is
centered around actions in the world. Examples include controlling the steering of a Mars rover
or autonomous vehicle, or suggesting tests and making diagnoses for a medical diagnosis system.
Includes areas of robot actuation, natural language generation, and speech synthesis.
a) "The exciting new effort to make computers think . . . machines with minds, in the full and
literal sense" (Haugeland, 1985)
"[The automation of] activities that we associate with human thinking, activities such
as decision-making, problem solving, learning ..." (Bellman, 1978)
b) "The study of mental faculties through the use of computational models" (Charniak
and McDermott, 1985)
"The study of the computations that make it possible to perceive, reason, and act"
(Winston, 1992)
c) "The art of creating machines that perform functions that require intelligence when
performed by people" (Kurzweil, 1990)
"The study of how to make computers do things at which, at the moment, people
are better" (Rich and Knight, 1991)
d) "A field of study that seeks to explain and emulate intelligent behavior in terms of
computational processes" (Schalkoff, 1990)
"The branch of computer science that is concerned with the automation of
intelligent behavior" (Luger and Stubblefield, 1993)
Definitions (a) and (b) are concerned with thought processes and reasoning, whereas (c)
and (d) address behavior.
Definitions (a) and (c) measure success in terms of human performance, while (b) and (d)
measure success against an ideal concept of intelligence called rationality.
Intelligent Systems:
In order to design intelligent systems, it is important to categorize them into four categories (Luger and
Stubblefield, 1993; Russell and Norvig, 2003).
Scientific Goal: To determine which ideas about knowledge representation, learning, rule systems search,
and so on, explain various sorts of real intelligence.
Engineering Goal: To solve real world problems using AI techniques such as Knowledge representation,
learning, rule systems, search, and so on.
Traditionally, computer scientists and engineers have been more interested in the engineering
goal, while psychologists, philosophers and cognitive scientists have been more interested in the scientific
goal.
Cognitive Science: Think Human-Like
a. Requires a model for human cognition. Precise enough models allow simulation by
computers.
b. Focus is not just on behavior and I/O, but on the underlying reasoning process.
c. Goal is not just to produce human-like behavior but to produce a sequence of steps of the
reasoning process, similar to the steps followed by a human in solving the same task.
Laws of Thought: Think Rationally
a. The study of mental faculties through the use of computational models; that is, the study of the
computations that make it possible to perceive, reason, and act.
b. Focus is on inference mechanisms that are provably correct and guarantee an optimal solution.
c. Goal is to formalize the reasoning process as a system of logical rules and procedures of
inference.
Turing Test: Act Human-Like
a. The art of creating machines that perform functions requiring intelligence when performed by
people; that is, the study of how to make computers do things which, at the moment, people do
better.
b. Focus is on action, and not on intelligent behavior centered around the representation of the world.
In the Turing Test, a human interrogator converses with a person and a machine:
o The interrogator can communicate with the other 2 by teletype (so that the
machine cannot imitate the appearance or voice of the person)
o The interrogator tries to determine which is the person and which is the machine.
o The machine tries to fool the interrogator to believe that it is the human, and the
person also tries to convince the interrogator that it is the human.
o If the machine succeeds in fooling the interrogator, then conclude that the
machine is intelligent.
Rational Agent: Act Rationally
a. Tries to explain and emulate intelligent behavior in terms of computational processes; that is, it
is concerned with the automation of intelligence.
b. Focus is on systems that act sufficiently well, if not optimally, in all situations.
Strong AI makes the bold claim that computers can be made to think on a level (at least) equal to
humans.
Weak AI simply states that some "thinking-like" features can be added to computers to make them more
useful tools... and this has already started to happen (witness expert systems, drive-by-wire cars and
speech recognition software).
AI Problems:
Common-Place Tasks:
1. Recognizing people, objects.
2. Communicating (through natural language).
3. Navigating around obstacles on the streets.
These tasks are performed matter-of-factly and routinely by people and some other animals.
Expert tasks:
1. Medical diagnosis.
2. Mathematical problem solving
3. Playing games like chess
These tasks cannot be done by all people, and can only be performed by skilled specialists.
Clearly tasks of the first type are easy for humans to perform, and almost all are able to master
them. The second range of tasks requires skill development and/or intelligence and only some specialists
can perform them well. However, when we look at what computer systems have been able to achieve to
date, we see that their achievements include performing sophisticated tasks like medical diagnosis,
performing symbolic integration, proving theorems and playing chess.
1.4.4: Lecture 4: Strategies of Solving Tic-Tac-Toe Game Playing
Tic-Tac-Toe Game Playing:
Tic-Tac-Toe is a simple and yet an interesting board game. Researchers have used various approaches to
study the Tic-Tac-Toe game. For example, Fok and Ong and Grim et al. have used artificial neural
network based strategies to play it. Citrenbaum and Yakowitz discuss games like Go-Moku, Hex and
Bridg-It which share some similarities with Tic-Tac-Toe.
Fig 1.
The board used to play the Tic-Tac-Toe game consists of 9 cells laid out in the form of a 3x3 matrix (Fig.
1). The game is played by 2 players and either of them can start. Each of the two players is assigned a
unique symbol (generally O and X). Each player alternately gets a turn to make a move. Making a move is
compulsory and cannot be deferred. In each move a player places the symbol assigned to him/her in a
hitherto blank cell.
Let a track be defined as any row, column or diagonal on the board. Since the board is a square
matrix with 9 cells, all rows, columns and diagonals have exactly 3 cells. It can be easily observed that
there are 3 rows, 3 columns and 2 diagonals, and hence a total of 8 tracks on the board (Fig. 1). The goal
of the game is to fill all the three cells of any track on the board with the symbol assigned to one before
the opponent does the same with the symbol assigned to him/her. At any point of the game, if
there exists a track whose all three cells have been marked by the same symbol, then the player to
whom that symbol have been assigned wins and the game terminates. If there exist no track whose
cells have been marked by the same symbol when there is no more blank cell on the board then the game
is drawn.
Let the priority of a cell be defined as the number of tracks passing through it. The priorities of the nine
cells on the board according to this definition are tabulated in Table 1. Alternatively, let the
priority of a track be defined as the sum of the priorities of its three cells. The priorities of the eight tracks
on the board according to this definition are tabulated in Table 2. The prioritization of the cells and the
tracks lays the foundation of the heuristics to be used in this study. These heuristics are somewhat similar
to those proposed by Rich and Knight.
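The cell and track priorities described above can be computed mechanically. The short sketch below (our own illustration, not part of the original text) enumerates the 8 tracks and counts how many pass through each cell, with cells numbered 1-9 row by row:

```python
# Compute the priority of each Tic-Tac-Toe cell: the number of
# tracks (rows, columns, diagonals) passing through it.
# Cells are numbered 1..9, left to right, top to bottom.

TRACKS = [
    (1, 2, 3), (4, 5, 6), (7, 8, 9),   # 3 rows
    (1, 4, 7), (2, 5, 8), (3, 6, 9),   # 3 columns
    (1, 5, 9), (3, 5, 7),              # 2 diagonals
]

def cell_priority(cell):
    """Number of tracks passing through a cell."""
    return sum(cell in t for t in TRACKS)

def track_priority(track):
    """Sum of the priorities of a track's three cells."""
    return sum(cell_priority(c) for c in track)

priorities = {c: cell_priority(c) for c in range(1, 10)}
print(priorities)  # centre (5) lies on 4 tracks, corners on 3, edges on 2
```

Running it confirms the tabulated values: the centre cell lies on 4 tracks, the corner cells on 3, and the edge cells on 2.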
Strategy 1:
Algorithm:
2. Use the computed number as an index into Move-Table and access the vector stored there.
Procedure:
1) Elements of vector:
0: Empty
1: X
2: O
b) Element = A vector which describes the most suitable move from the
Comments:
3. Difficult to extend
Data Structure:
2: Empty
3: X
5: O
1,2,3, etc
Function Library:
1. Make2:
a) Returns a location on the game board.
IF (board[5] = 2) // the centre cell is blank
RETURN 5;
ELSE
RETURN any blank cell that is not at the board’s corner;
// (cells: 2, 4, 6, 8)
c) can_win(P) :
P has filled already at least two cells on a straight line (horizontal, vertical, or diagonal)
d) cannot_win(P) = NOT(can_win(P))
2. Posswin(P):
Returns 0 if player P cannot win on the next move; otherwise returns the number of the square that constitutes a winning move for P.
IF (cannot_win(P))
RETURN 0;
ELSE
RETURN the blank square of the line that P can complete;
Algorithm:
1. Turn = 1: (X moves)
Go(1) (upper left cell).
2. Turn = 2: (O moves)
IF (board[5] is blank)
Go(5)
ELSE
Go(1)
3. Turn = 3: (X moves)
IF (board[9] is blank)
Go(9)
ELSE
Go(3).
4. Turn = 4: (O moves)
IF (Posswin(X) ≠ 0)
Go(Posswin(X)) // block X's win
ELSE Go(Make2)
5. Turn = 5: (X moves)
IF (Posswin(X) ≠ 0)
Go(Posswin(X))
//Win for X.
ELSE IF (Posswin(O) ≠ 0)
Go(Posswin(O)) // block O's win
ELSE IF (board[7] is blank)
Go(7)
ELSE Go(3).
Comments:
1. Not efficient in time, as it has to check several conditions before making each move.
3. Hard to generalize.
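The 2/3/5 board encoding listed under Data Structure above allows Posswin to be implemented with a simple product test: a track whose cell values multiply to 18 (= 3 × 3 × 2) holds two X's and a blank, and one whose values multiply to 50 (= 5 × 5 × 2) holds two O's and a blank. The following is a hedged sketch of that idea, assuming cells are numbered 1-9:

```python
# Sketch of Posswin(P) using the 2/3/5 board encoding from the text:
# board[i] is 2 (empty), 3 (X) or 5 (O) for cells 1..9.
# If the product of a line's cells is 18 (3*3*2), X can win on that line;
# if it is 50 (5*5*2), O can win on that line.

LINES = [
    (1, 2, 3), (4, 5, 6), (7, 8, 9),   # rows
    (1, 4, 7), (2, 5, 8), (3, 6, 9),   # columns
    (1, 5, 9), (3, 5, 7),              # diagonals
]

def posswin(board, player):
    """Return the cell where `player` ('X' or 'O') can complete a
    line on the next move, or 0 if no such cell exists."""
    target = 18 if player == 'X' else 50
    for line in LINES:
        product = board[line[0]] * board[line[1]] * board[line[2]]
        if product == target:
            for cell in line:          # return the blank cell of that line
                if board[cell] == 2:
                    return cell
    return 0

# Board indexed 1..9 (index 0 unused):  X X _ / _ O _ / _ _ O
board = [0,
         3, 3, 2,
         2, 5, 2,
         2, 2, 5]
print(posswin(board, 'X'))  # 3: X completes the top row at cell 3
```

The product test works because 2, 3 and 5 are distinct primes, so each line's product uniquely identifies its contents.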
AI programming involves working in domains for which the problem is often poorly understood -
hence the design-code cycle is often very tightly coupled. This gives rise to a need for flexible
environments and languages capable of supporting rapid changes in knowledge representation
schemes and alterations to the inference processes at work.
Programming languages in Artificial Intelligence (AI) are the major tool for exploring and
building computer programs that can be used to simulate intelligent processes such as learning,
reasoning and understanding symbolic information in context.
In the engineering context the problem is often very well understood but the
solution is not.
In the 'scientific' approach, a major issue is the representation of the problem
itself. Often, one of the main results of such work is to clarify the precise nature
of the problem.
In AI, the automation or programming of all aspects of human cognition is considered from its
foundations in cognitive science through approaches to symbolic and sub-symbolic AI, natural
language processing, computer vision, and evolutionary or adaptive systems.
From the requirements of symbolic computation and AI programming, two new basic
programming paradigms emerged as alternatives to the imperative style: the functional and the
logical programming style. Both are based on mathematical formalisms, namely recursive
function theory and formal logic. The first practical and still most widely used AI programming
language is the functional language
Lisp developed by John McCarthy in the late 1950s. Lisp is based on mathematical function
theory and the lambda abstraction. A number of important and influential AI applications have
been written in Lisp so we will describe this programming language in some detail in this article.
During the early 1970s, a new programming paradigm appeared, namely logic programming on
the basis of predicate calculus. The first and still most important logic programming language is
Prolog, developed by Alain Colmerauer, Robert Kowalski and Philippe Roussel. Problems in
Prolog are stated as facts, axioms and logical rules for deducing new facts. Prolog is
mathematically founded on predicate calculus and the theoretical results obtained in the area of
automatic theorem proving in the late 1960s.
Current trends in AI are largely towards the development of technologies that have their origin in, and
analogy with, biological or behavioral phenomena of humans or animals, such as evolutionary
computation.
Evolutionary Computing is the collective name for a range of problem-solving techniques based on
principles of biological evolution, such as natural selection and genetic inheritance. These techniques are
successfully applied to numerous problems from different domains, including optimization, automatic
programming, signal processing, bioinformatics, social systems, and so on.
Evolutionary Algorithm:
Repeat
Until Done
GENETIC ALGORITHM:
Genetic Algorithms are search algorithms based on natural selection and natural genetics. They combine
survival of the fittest among structures with a structured yet randomized information exchange to form a
search algorithm.
The Genetic Algorithm was developed by John Holland and his co-workers at the University of Michigan
in the early 1960s.
Genetic algorithms have been shown, theoretically and empirically, to provide robust search in complex spaces.
Their validity in function optimization and control applications is well established.
Genetic Algorithms (GA) provide a general approach for searching for global minima or maxima within a
bounded, quantized search space.
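As a concrete (toy) illustration of selection, crossover and mutation, the sketch below evolves bit-strings towards the all-ones string ("OneMax"). Every design choice here (tournament selection, single-point crossover, the parameter values) is our own illustrative assumption, not a prescription:

```python
import random

# Toy genetic algorithm: evolve bit-strings towards all ones ("OneMax"),
# illustrating selection, crossover and mutation.

random.seed(0)
LENGTH, POP, GENERATIONS, MUTATION = 20, 30, 60, 0.02

def fitness(ind):
    return sum(ind)                    # number of 1-bits

def select(pop):
    """Tournament selection: the fitter of two random individuals."""
    a, b = random.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):
    cut = random.randrange(1, LENGTH)  # single-point crossover
    return p1[:cut] + p2[cut:]

def mutate(ind):
    return [bit ^ 1 if random.random() < MUTATION else bit for bit in ind]

population = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP)]
for _ in range(GENERATIONS):
    population = [mutate(crossover(select(population), select(population)))
                  for _ in range(POP)]

best = max(population, key=fitness)
print(fitness(best))  # typically close to LENGTH (20)
```

Selection pressure pushes fit structures to reproduce, crossover exchanges information between them, and mutation keeps the population exploring.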
Swarm Intelligence:
A swarm is a large number of homogeneous, simple agents interacting locally among themselves and their
environment, with no central control, allowing interesting global behavior to emerge.
Swarm Intelligence (SI) can therefore be defined as a relatively new branch of Artificial Intelligence that
is used to model the collective behaviour of social swarms in nature, such as ant colonies, honey bees, and
bird flocks.
Swarm Intelligence (SI) Models:
Swarm intelligence models are referred to as computational models inspired by natural swarm systems.
To date, several swarm intelligence models based on different natural swarm systems have been proposed
in the literature, and successfully applied in many real-life applications.
Examples of swarm intelligence models are: Ant Colony Optimization , Particle Swarm Optimization
Artificial Bee Colony , Bacterial Foraging , Cat Swarm Optimization , Artificial Immune System , and
Glowworm Swarm Optimization .
The history of neural networks begins in the early 1940s, and thus nearly simultaneously with the history of programmable electronic computers.
An Artificial Neural Network (ANN) is an information processing paradigm that is inspired by the way
biological nervous systems, such as the brain, process information. The key element of this paradigm is
the novel structure of the information processing system. It is composed of a large number of highly
interconnected processing elements (neurons) working in unison to solve specific problems.
ANNs have been applied to an increasing number of real-world problems of considerable complexity.
Their most important advantage is in solving problems that are too complex for conventional technologies
-- problems that do not have an algorithmic solution or for which an algorithmic solution is too complex
to be found.
1. Adaptive learning: An ability to learn how to do tasks based on the data given for training or
initial experience.
2. Self-Organisation: An ANN can create its own organisation or representation of the information
it receives during learning time.
3. Real Time Operation: ANN computations may be carried out in parallel, and special hardware
devices are being designed and manufactured which take advantage of this capability.
- Pattern Classification
- Clustering/Categorization
- Function approximation
- Prediction/Forecasting
- Optimization
- Content-addressable Memory
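As a minimal taste of the "interconnected processing elements" idea behind these applications, the toy below trains a single artificial neuron (a perceptron) on the logical AND function. It is an illustrative sketch, not drawn from the text:

```python
# A single artificial neuron: a weighted sum of inputs passed through a
# step activation, trained with the perceptron rule on logical AND.

DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

def predict(weights, bias, x):
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

weights, bias, lr = [0.0, 0.0], 0.0, 0.1
for _ in range(20):                     # a few passes over the data
    for x, target in DATA:
        error = target - predict(weights, bias, x)
        # perceptron learning rule: nudge weights towards the target
        weights = [w + lr * error * xi for w, xi in zip(weights, x)]
        bias += lr * error

print([predict(weights, bias, x) for x, _ in DATA])  # [0, 0, 0, 1]
```

AND is linearly separable, so the perceptron convergence theorem guarantees this training loop settles on correct weights after finitely many updates.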
Web Links:
4. http://www.tutorialspoint.com/artificial_intelligence/artificial_intelligence_overview.htm
5. http://www.hs-weingarten.de/~ertel/aibook/aibook-ertel-slides.pdf
6. http://epub.uni-regensburg.de/13629/1/ubr06078_ocr.pdf
7. http://iiscs.wssu.edu/drupal/node/3661
8. http://www.myreaders.info/html/artificial_intelligence.html
9. http://nptel.ac.in/courses/106105077/
10. https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-034-artificial-
intelligence-fall-2010/lecture-videos/
11. http://videolectures.net/Top/Computer_Science/Artificial_Intelligence/
Review Questions
3. Define an agent.
4. What capabilities should the computer possess to pass the Turing Test?
1. Explain the Tic-Tac-Toe game problem, assuming one player is X and the other
one can be either a human or a computer, on a 3×3 grid.
2. Explain in detail the applications of Artificial Intelligence.
3. Discuss categorization of intelligent systems.
4. What is an AI technique? Explain briefly.
5. Write about some of the cross domains of Artificial intelligence.
6. Briefly explain the history of Artificial Intelligence.
7. Write short notes on the following
i) AI languages
ii) Intelligent system
iii) Sub-areas of AI
iv) History of AI
8. “AI is interdisciplinary in nature and its foundations are in various fields.” Justify
the statement with valid reasons.
Assignment questions:
v) AI languages
vi) Intelligent system
vii) Sub-areas of AI
2. Explain the Tic-Tac-Toe game problem, assuming one player is X and the other one can
be either a human or a computer, on a 3×3 grid.
Unit – II - Problem Solving and Game Playing
2.2. Unit Outcomes:
CO-1: Converting real world problems into AI search problems using the appropriate search algorithm
CO-2: Given a search problem, analyze and formalize the problem
CO-3: Implement backtracking search with conflict in CSP
CO-4: Design good evaluation functions and strategies for game playing.
2.3. Lecture Plan
Lecture no. Topic Methodology Quick reference
7 Introduction to Problem Solving, general problem solving chalk-board T1 : 2.1
8 General problem solving, Water-jug problem chalk-board T1 : 2.2
9 Rules to solve the Water-jug problem chalk-board T1 : 2.2
10 Missionaries and Cannibals Problem chalk-board T1 : 2.2
11 8-puzzle Problem, Control strategies chalk-board T1 : 2.2-2.3
12 Forward chaining and Backward chaining chalk-board T1 : 2.3
13 Forward chaining and Backward chaining in Expert systems chalk-board T1 : 2.3
14 Exhaustive searches - BFS, DFS chalk-board T1 : 2.4
15 Exhaustive searches - Iterative deepening DFS chalk-board T1 : 2.4
16 Uniform Cost Search chalk-board T1 : 2.5
17 Heuristic searches - Greedy BFS chalk-board T1 : 2.5
18 Heuristic searches - A* chalk-board T1 : 2.5
19 A* Algorithm applications chalk-board T1 : 2.5
20 A* Algorithm properties chalk-board T1 : 2.5
21 IDA* Algorithm chalk-board T1 : 2.5
22 Hill Climbing search chalk-board T1 : 2.5
23 Constraint satisfaction problems - Cryptarithmetic Puzzle chalk-board T1 : 2.6
24 Cryptarithmetic Puzzle explanation chalk-board T1 : 2.6
25 Game playing: Minimax algorithm chalk-board T1 : 3.1-3.4
26 Alpha-beta pruning algorithm chalk-board T1 : 3.5
27 Explanation of the Alpha-beta pruning algorithm chalk-board T1 : 3.5
T1: Artificial Intelligence- Saroj Kaushik, CENGAGE Learning
Which search algorithm one should use will generally depend on the problem domain.
There are four important factors to consider:
1. Completeness – Is the algorithm guaranteed to find a solution if at least one solution exists?
2. Optimality – Is the solution found guaranteed to be the best (or lowest-cost) solution if there
exists more than one solution?
3. Time Complexity – The upper bound on the time required to find a solution, as a function of
the complexity of the problem.
4. Space Complexity – The upper bound on the storage space (memory) required at any point
during the search, as a function of the complexity of the problem.
Newell and Simon defined each problem as a space. At one end of the space is the starting point;
at the other is the goal. The problem-solving procedure itself is conceived as a set of
operations to cross that space, getting from the starting point to the goal state one step at a time.
In the General Problem Solver, the program tests various actions (which Newell and Simon called
operators) to see which will take it closer to the goal state. An operator is any activity that
changes the state of the system. The General Problem Solver always chooses the operation that
appears to bring it closer to its goal.
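The operator-testing loop described above can be caricatured in a few lines. This is only an illustrative sketch of the "choose the operator that appears to bring you closer" idea; the function names and the toy problem are our own, not Newell and Simon's program:

```python
# Illustrative sketch of the General Problem Solver idea: repeatedly
# apply the operator whose result is closest to the goal, as judged
# by a distance (difference) measure.

def general_problem_solver(start, goal, operators, distance):
    """operators: functions state -> new state (or None if not applicable);
    distance: heuristic estimate of how far a state is from the goal."""
    state = start
    while state != goal:
        candidates = [op(state) for op in operators]
        candidates = [s for s in candidates if s is not None]
        if not candidates:
            return None  # stuck: no applicable operator
        best = min(candidates, key=lambda s: distance(s, goal))
        if distance(best, goal) >= distance(state, goal):
            return None  # no operator brings us closer
        state = best
    return state

# Toy example: reach the number 10 from 0 using the operators +3 and +1.
ops = [lambda s: s + 3 if s + 3 <= 10 else None,
       lambda s: s + 1 if s + 1 <= 10 else None]
print(general_problem_solver(0, 10, ops, lambda s, g: abs(g - s)))  # 10
```

Note that this greedy strategy can get stuck at local optima; the actual GPS used means-ends analysis and subgoaling to recover, which this sketch omits.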
A Water Jug Problem: You are given two jugs, a 4-gallon one and a 3-gallon one,
a pump which has unlimited water which you can use to fill the jug, and the ground
on which water may be poured. Neither jug has any measuring markings on it. How
can you get exactly 2 gallons of water in the 4-gallon jug?
Operators - we must define a set of operators that will take us from one state to another:
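One way to see the problem mechanically is a breadth-first search over states (x, y), where x is the content of the 4-gallon jug and y that of the 3-gallon jug. The sketch below is our own illustration and does not reproduce the text's numbered rule table:

```python
from collections import deque

# Breadth-first search for the water jug problem: a state is (x, y)
# with x gallons in the 4-gallon jug and y in the 3-gallon jug.

def successors(x, y):
    return {
        (4, y), (x, 3),            # fill either jug from the pump
        (0, y), (x, 0),            # empty either jug onto the ground
        # pour the 3-gallon jug into the 4-gallon jug
        (min(x + y, 4), y - (min(x + y, 4) - x)),
        # pour the 4-gallon jug into the 3-gallon jug
        (x - (min(x + y, 3) - y), min(x + y, 3)),
    }

def solve(start=(0, 0), goal_x=2):
    frontier = deque([(start, [start])])
    visited = {start}
    while frontier:
        (x, y), path = frontier.popleft()
        if x == goal_x:
            return path            # shortest sequence of states
        for s in successors(x, y):
            if s not in visited:
                visited.add(s)
                frontier.append((s, path + [s]))
    return None

print(solve())  # a shortest solution: 6 pourings, ending with 2 gallons in the 4-gallon jug
```

Because BFS expands states in order of depth, the first path found uses the fewest operations (six, for this problem).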
Second Solution:
2.4.3. Control strategies
Control Strategies means how to decide which rule to apply next during the process of searching
for a solution to a problem.
Let us discuss these strategies using water jug problem. These may be applied to any search
problem.
Generate all the offspring of the root by applying each of the applicable rules to the initial state.
Now for each leaf node, generate all its successors by applying all the rules that are appropriate
8 Puzzle Problem.
The 8 puzzle consists of eight numbered, movable tiles set in a 3x3 frame. One cell of the frame
is always empty thus making it possible to move an adjacent numbered tile into the empty cell.
Such a puzzle is illustrated in following diagram.
The program is to change the initial configuration into the goal configuration. A solution to the
problem is an appropriate sequence of moves, such as "move tile 5 to the right, move tile 7 to
the left, move tile 6 down, etc."
Solution:
To solve a problem using a production system, we must specify the global database, the rules, and
the control strategy. For the 8-puzzle problem these three components are the problem states, the
moves, and the goal. In this problem each tile configuration is a state. The set of all configurations
is the space of problem states, or the problem space; there are only 362,880 (= 9!) different
configurations of the 8 tiles and the blank space. Once the problem states have been conceptually
identified, we must construct a computer representation, or description, of them. This description
is then used as the database of a production system. For the 8-puzzle, a straightforward
description is a 3×3 array (matrix) of numbers. The initial global database is this description of
the initial problem state. Virtually any kind of data structure can be used to describe states.
A move transforms one problem state into another state. The 8-puzzle is conveniently interpreted as
having the following four moves: move the empty space (blank) to the left, move the blank up, move
the blank to the right, and move the blank down. These moves are modeled by production rules that
operate on the state descriptions in the appropriate manner.
The rules each have preconditions that must be satisfied by a state description in order for them to
be applicable to that state description. Thus the precondition for the rule associated with “move
blank up” is derived from the requirement that the blank space must not already be in the top row.
The problem goal condition forms the basis for the termination condition of the production system.
The control strategy repeatedly applies rules to state descriptions until a description of a goal state is
produced. It also keeps track of the rules that have been applied so that it can compose them into a
sequence representing the problem solution. A solution to the 8-puzzle problem is given in the
following figure.
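The four blank-moving production rules and their preconditions can be sketched as a successor function over the 3×3 array description (an illustrative sketch, not the exact production system of the text):

```python
# Successor generation for the 8-puzzle: the four production rules
# move the blank (0) up, down, left or right, each guarded by a
# precondition on the blank's position in the 3x3 grid.

MOVES = {'up': (-1, 0), 'down': (1, 0), 'left': (0, -1), 'right': (0, 1)}

def successors(state):
    """state: tuple of 9 numbers in row-major order, 0 = blank.
    Returns a list of (move_name, new_state) pairs."""
    i = state.index(0)
    row, col = divmod(i, 3)
    result = []
    for name, (dr, dc) in MOVES.items():
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:          # precondition: stay on the board
            j = 3 * r + c
            new = list(state)
            new[i], new[j] = new[j], new[i]    # slide the tile into the blank
            result.append((name, tuple(new)))
    return result

start = (1, 2, 3,
         4, 0, 5,
         6, 7, 8)
for move, s in successors(start):
    print(move, s)
```

The precondition check is exactly the "blank must not already be in the top row" style of condition described above: a blank in the centre has four applicable rules, a corner blank only two.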
Example: Depth-First Search traversal and Breadth-First Search traversal
Search is the systematic examination of states to find path from the start/root state to the goal state.
Many traditional search algorithms are used in AI applications. For complex problems, the
traditional algorithms are unable to find the solution within some practical time and space limits.
Consequently, many special techniques are developed; using heuristic functions. The algorithms
that use heuristic functions are called heuristic algorithms. Heuristic algorithms are not really
intelligent; they appear to be intelligent because they achieve better performance.
Heuristic algorithms are more efficient because they take advantage of feedback from the data to
direct the search path.
Uninformed search
Also called blind, exhaustive or brute-force search, uses no information about the problem to guide
the search and therefore may not be very efficient.
Informed Search:
Also called heuristic or intelligent search; uses information about the problem to guide the search,
usually by estimating the distance to a goal state, and is therefore more efficient, but such a search
may not always be possible.
Uninformed Search Methods:
Breadth-First Search (BFS):
• Algorithm:
1. Create a variable called NODE-LIST and set it to initial state
2. Until a goal state is found or NODE-LIST is empty do
a. Remove the first element from NODE-LIST and call it E. If NODE-LIST was
empty, quit
b. For each way that each rule can match the state described in E do:
i. Apply the rule to generate a new state
ii. If the new state is a goal state, quit and return this state
iii. Otherwise, add the new state to the end of NODE-LIST
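The NODE-LIST procedure above is an ordinary FIFO-queue breadth-first search. A minimal sketch over an explicit graph (the toy graph below mirrors the shape of the worked illustration, but is our own):

```python
from collections import deque

# Breadth-first search: NODE-LIST is a FIFO queue, so new states are
# added at the BACK (step b.iii of the algorithm above).

def bfs(graph, start, goal):
    node_list = deque([[start]])       # queue of paths
    visited = {start}
    while node_list:                   # step 2
        path = node_list.popleft()     # step a: remove the first element
        node = path[-1]
        if node == goal:
            return path
        for succ in graph.get(node, []):   # step b: apply each rule
            if succ not in visited:
                visited.add(succ)
                node_list.append(path + [succ])  # add at the END
    return None

graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['D', 'G'], 'D': ['C', 'F']}
print(bfs(graph, 'A', 'G'))  # ['A', 'C', 'G']
```

Because new states join the back of the queue, shallower states are always expanded first, which makes BFS complete and optimal for unit step costs. (Unlike the bare algorithm above, this sketch also keeps a visited set so states are not re-queued.)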
BFS illustrated:
Step 1: Initially fringe contains only one node corresponding to the source state A.
Figure 1
FRINGE: A
Step 2: A is removed from fringe. The node is expanded, and its children B and C are generated. They
are placed at the back of fringe.
Figure 2
FRINGE: B C
Step 3: Node B is removed from fringe and is expanded. Its children D, E are generated and put at the
back of fringe.
Figure 3
FRINGE: C D E
Step 4: Node C is removed from fringe and is expanded. Its children D and G are added to the back of
fringe.
Figure 4
FRINGE: D E D G
Step 5: Node D is removed from fringe. Its children C and F are generated and added to the back of
fringe.
Figure 5
FRINGE: E D G C F
Step 6: Node E is removed from fringe. It has no children.
Figure 6
FRINGE: D G C F
Figure 7
FRINGE: G C F B F
Step 8: G is selected for expansion. It is found to be a goal node. So the algorithm returns the path A C
G by following the parent pointers of the node corresponding to G. The algorithm terminates.
Depth First Search (DFS):
• Algorithm:
1. Create a variable called NODE-LIST and set it to the initial state
2. Until a goal state is found or NODE-LIST is empty do
a. Remove the first element from NODE-LIST and call it E. If NODE-LIST was
empty, quit
b. For each way that each rule can match the state described in E do:
i. Apply the rule to generate a new state
ii. If the new state is a goal state, quit and return this state
iii. Otherwise, add the new state in front of NODE-LIST
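Here NODE-LIST behaves as a stack: new states go to the front. A minimal Python sketch, with a depth bound and a cycle check along the current path (as the note below suggests); the example graph is the same hypothetical reconstruction used for BFS:

```python
def dfs(start, successors, is_goal, limit=50):
    """Depth-first search: new states are placed at the FRONT of NODE-LIST."""
    fringe = [[start]]                 # stack of paths
    while fringe:
        path = fringe.pop()            # take the path at the front of NODE-LIST
        state = path[-1]
        if is_goal(state):
            return path
        if len(path) > limit:          # depth bound guards against infinite paths
            continue
        # push children in reverse so the left-most child is expanded first
        for nxt in reversed(successors(state)):
            if nxt not in path:        # cycle check along the current path
                fringe.append(path + [nxt])
    return None

graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['D', 'G'],
         'D': ['C', 'F'], 'E': [], 'F': [], 'G': []}
```

On this graph DFS dives left first and reaches G along `['A', 'B', 'D', 'C', 'G']`, consistent with the fringe contents in the walkthrough below.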
DFS illustrated:
Step 1: Initially fringe contains only one node corresponding to the source state A.
Figure 1
FRINGE: A
Step 2: A is removed from fringe. A is expanded and its children B and C are put in front of fringe.
Figure 2
FRINGE: B C
Step 3: Node B is removed from fringe, and its children D and E are pushed in front of fringe.
Figure 3
FRINGE: D E C
Step 4: Node D is removed from fringe. C and F are pushed in front of fringe.
Figure 4
FRINGE: C F E C
Step 5: Node C is removed from fringe. Its child G is pushed in front of fringe.
Figure 5
FRINGE: G F E C
Step 6: Node G is selected for expansion. It is found to be a goal node, so the algorithm returns the
path to G by following the parent pointers, and terminates.
Figure 6
FRINGE: G F E C
Note that the time taken by the algorithm is related to the maximum depth of the search tree. If the
search tree has infinite depth, the algorithm may not terminate. This can happen if the search space
is infinite. It can also happen if the search space contains cycles. The latter case can be handled by
checking for cycles in the algorithm. Thus Depth First Search is not complete.
2.4.5. Lecture 11: Exhaustive searches- Iterative Deepening DFS
Description:
It is a search strategy resulting when you combine BFS and DFS, thus combining the
advantages of each strategy, taking the completeness and optimality of BFS and the modest
memory requirements of DFS.
IDS works by looking for the best search depth d: it starts with depth limit 0 and performs a
depth-limited DFS; if the search fails, it increases the depth limit by 1 and tries again with depth 1,
and so on – first d = 0, then 1, then 2 and so on – until a depth d is reached where a goal is
found.
Algorithm:
procedure IDDFS(root)
for depth from 0 to ∞
found ← DLS(root, depth)
if found ≠ null
return found
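The pseudocode above can be written out as a short Python sketch, with DLS as a recursive depth-limited search; the graph representation is the same hypothetical one used in the earlier examples:

```python
def dls(node, successors, is_goal, depth):
    """Depth-limited search used by each IDDFS iteration."""
    if is_goal(node):
        return [node]
    if depth == 0:
        return None                      # cut off at the current depth limit
    for child in successors(node):
        found = dls(child, successors, is_goal, depth - 1)
        if found is not None:
            return [node] + found
    return None

def iddfs(root, successors, is_goal, max_depth=50):
    """Iterative deepening: repeat DLS with depth = 0, 1, 2, ..."""
    for depth in range(max_depth + 1):
        found = dls(root, successors, is_goal, depth)
        if found is not None:
            return found
    return None

graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['D', 'G'],
         'D': ['C', 'F'], 'E': [], 'F': [], 'G': []}
```

Like BFS, this finds the shallowest goal: `iddfs('A', ...)` returns `['A', 'C', 'G']` at depth limit 2.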
Performance Measure:
o Completeness: IDS, like BFS, is complete when the branching factor b is finite.
o Optimality: IDS, like BFS, is optimal when all steps have the same cost.
Time Complexity:
o One may find it wasteful to generate nodes multiple times, but it is actually not that
costly compared to BFS, because most of the generated nodes are always in the deepest
level reached. Consider a binary tree searched with depth limit 4: the nodes generated in
the last level = 2^4 = 16, while the nodes generated in all levels before the last =
2^0 + 2^1 + 2^2 + 2^3 = 15.
o Imagine this scenario: we are performing IDS and the depth limit reached depth d.
Recalling the way IDS expands nodes, nodes at depth d are generated once, nodes at
depth d-1 are generated 2 times, nodes at depth d-2 are generated 3 times, and so on,
until depth 1, which is generated d times. We can view the total number of generated
nodes in the worst case as:
N(IDS) = (d)b + (d – 1)b^2 + (d – 2)b^3 + …. + (2)b^(d-1) + (1)b^d = O(b^d)
o If this search were done with BFS, the total number of generated nodes in the
worst case would be:
N(BFS) = b + b^2 + b^3 + b^4 + …. + b^d + (b^(d+1) – b) = O(b^(d+1))
o If we consider realistic numbers, say b = 10 and d = 5, then the numbers of
generated nodes in BFS and IDS are:
N(IDS) = 50 + 400 + 3000 + 20000 + 100000 = 123450
N(BFS) = 10 + 100 + 1000 + 10000 + 100000 + 999990 = 1111100
BFS generates about 9 times as many nodes as IDS.
Space Complexity:
o IDS, like DFS, stores only the nodes on the current path (and their siblings), so its space
complexity is O(bd), where b is the branching factor and d is the depth of the shallowest goal.
Weblinks:
i. https://www.youtube.com/watch?v=7QcoJjSVT38
ii. https://mhesham.wordpress.com/tag/iterative-deepening-depth-first-search
Conclusion:
We can conclude that IDS is a hybrid search strategy between BFS and DFS inheriting their
advantages.
IDS is faster than BFS and DFS.
It is said that “IDS is the preferred uninformed search method when there is a large search
space and the depth of the solution is not known”.
2.4.6 Heuristic Searches, Hill Climbing:
A heuristic technique helps in solving problems, even though there is no guarantee that it will never
lead in the wrong direction. There are heuristics of general applicability as well as domain-specific
heuristics. The strategies are general-purpose heuristics; in order to use them in a specific domain
they are coupled with some domain-specific heuristics. There are two major ways in which domain-
specific heuristic information can be incorporated into a rule-based search procedure.
A heuristic function is a function that maps a problem state description to a measure of desirability,
usually represented as a number. The value of a heuristic function at a given node in the
search process gives a good estimate of whether that node is on the desired path to a solution.
We will assume we are trying to maximize a function. That is, we are trying to find a point in the
search space that is better than all the others. And by "better" we mean that the evaluation is higher.
We might also say that the solution is of better quality than all the others.
You should note that this algorithm does not maintain a search tree. It only returns a final solution.
Also, if two neighbors have the same evaluation and they are both the best quality, then the
algorithm will choose between them at random.
The main problem with hill climbing (which is also sometimes called gradient descent) is that we
are not guaranteed to find the best solution. In fact, we are not offered any guarantees about the
solution. It could be abysmally bad.
You can see that we may eventually reach a state that has no better neighbours even though there
are better solutions elsewhere in the search space. The problem we have just described is called a
local maximum.
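The behaviour described above can be sketched as steepest-ascent hill climbing in Python. The objective and neighbour functions below are illustrative assumptions (a one-dimensional integer landscape), not from the text:

```python
import random

def hill_climb(start, neighbours, value):
    """Steepest-ascent hill climbing: move to the best neighbour until none is better."""
    current = start
    while True:
        candidates = neighbours(current)
        if not candidates:
            return current
        best_val = max(value(n) for n in candidates)
        if best_val <= value(current):
            return current             # local maximum: no neighbour is better
        best = [n for n in candidates if value(n) == best_val]
        current = random.choice(best)  # ties in quality are broken at random

# Hypothetical landscape: maximize -(x - 3)^2, whose single peak is at x = 3.
peak = hill_climb(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2)
```

Note that no search tree is maintained: only the current state is kept, which is exactly why the algorithm can get stuck on a local maximum of a less well-behaved landscape.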
The best first search allows us to switch between paths, thus gaining the benefit of both
approaches. At each step the most promising node is chosen. If one of the nodes chosen
generates nodes that are less promising, it is possible to choose another at the same level, and
in effect the search changes from depth to breadth. If on analysis these are no better, the
previously unexpanded node and branch are not forgotten, and the search method reverts to them.
OPEN is a priority queue of nodes that have been evaluated by the heuristic function but which have
not yet been expanded into successors. The most promising nodes are at the front.
CLOSED contains nodes that have already been generated; these nodes must be stored because a
graph is being used in preference to a tree.
Algorithm:
1. Start with OPEN containing just the initial state.
2. Until a goal is found or there are no nodes left on OPEN do:
a. Pick the best node on OPEN.
b. Generate its successors.
c. For each successor do:
• If it has not been generated before, evaluate it, add it to OPEN and record its
parent
• If it has been generated before, change the parent if this new path is better and
in that case update the cost of getting to any successor nodes.
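A compact Python sketch of greedy best-first search, with OPEN as a heap-based priority queue ordered by the heuristic and CLOSED as a set; the graph and heuristic values below are hypothetical, chosen so that the search prefers C over B:

```python
import heapq

def best_first(start, successors, is_goal, h):
    """Greedy best-first search: OPEN is a priority queue ordered by h(n)."""
    open_list = [(h(start), start, [start])]       # (h, state, path)
    closed = set()
    while open_list:
        _, state, path = heapq.heappop(open_list)  # most promising node first
        if is_goal(state):
            return path
        if state in closed:
            continue                               # already expanded
        closed.add(state)
        for nxt in successors(state):
            if nxt not in closed:
                heapq.heappush(open_list, (h(nxt), nxt, path + [nxt]))
    return None

# Hypothetical graph and heuristic: h falls steadily along the A - C - G route.
graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['D', 'G'],
         'D': ['C', 'F'], 'E': [], 'F': [], 'G': []}
h = {'A': 4, 'B': 3, 'C': 1, 'D': 2, 'E': 5, 'F': 5, 'G': 0}
```

Because only h is used for ordering, the search is fast but not guaranteed optimal; A*, described next, fixes this by also counting the cost already incurred.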
The A* search algorithm (pronounced "Ay-star") is a tree search algorithm that finds a path from a
given initial node to a given goal node (or one passing a given goal test). It employs a "heuristic
estimate" which ranks each node by an estimate of the best route that goes through that node. It
visits the nodes in order of this heuristic estimate.
Similar to greedy best-first search, but more accurate because A* takes into account the nodes that
have already been traversed. A* orders nodes by f(n) = g(n) + h(n), where
g is a measure of the distance/cost to go from the initial node to the current node, and
h is an estimate of the distance/cost from the current node to a goal.
Thus f is an estimate of the cost of a path from the initial node to a solution through the current node.
Algorithm:
1. Place the start node s on OPEN, with g(s) = 0 and f(s) = h(s).
2. If OPEN is empty, stop and return failure.
3. Remove from OPEN the node n with the smallest f value. If n is a goal node, exit with the
path traced back to s.
4. Save n in CLOSED.
5. For each successor m of n:
a) If m ∉ [OPEN ∪ CLOSED], set g(m) = g(n) + c(n, m), compute f(m), and
insert m in OPEN.
b) If m ∈ [OPEN ∪ CLOSED],
set g(m) = min { g(m), g(n) + c(n, m) }; if g(m) decreased and m was in CLOSED,
move m to OPEN.
6. Go to step 2.
Description:
A* begins at a selected node. Applied to this node is the "cost" of entering this node (usually
zero for the initial node). A* then estimates the distance to the goal node from the current
node. This estimate and the cost added together are the heuristic which is assigned to the
path leading to this node. The node is then added to a priority queue, often called "open".
The algorithm then removes the next node from the priority queue (because of the way a
priority queue works, the node removed will have the lowest heuristic). If the queue is
empty, there is no path from the initial node to the goal node and the algorithm stops. If the
node is the goal node, A* constructs and outputs the successful path and stops.
If the node is not the goal node, new nodes are created for all admissible adjoining nodes;
the exact way of doing this depends on the problem at hand. For each successive node, A*
calculates the "cost" of entering the node and saves it with the node. This cost is calculated
from the cumulative sum of costs stored with its ancestors, plus the cost of the operation
which reached this new node.
The algorithm also maintains a 'closed' list of nodes whose adjoining nodes have been
checked. If a newly generated node is already in this list with an equal or lower cost, no
further processing is done on that node or with the path associated with it. If a node in the
closed list matches the new one, but has been stored with a higher cost, it is removed from
the closed list, and processing continues on the new node.
Next, an estimate of the new node's distance to the goal is added to the cost to form the
heuristic for that node. This is then added to the 'open' priority queue, unless an identical
node is found there.
Once the above three steps have been repeated for each new adjoining node, the original
node taken from the priority queue is added to the 'closed' list. The next node is then popped
from the priority queue and the process is repeated
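The description above can be sketched in Python with the 'open' priority queue ordered by f = g + h and a best-known-g table playing the role of the cost bookkeeping; the weighted graph and heuristic below are hypothetical:

```python
import heapq

def a_star(start, successors, is_goal, h):
    """A*: expand nodes in order of f(n) = g(n) + h(n)."""
    open_list = [(h(start), 0, start, [start])]    # (f, g, state, path)
    best_g = {start: 0}                            # cheapest known cost to each state
    while open_list:
        f, g, state, path = heapq.heappop(open_list)
        if is_goal(state):
            return path, g                         # construct and output the path
        for nxt, cost in successors(state):
            g2 = g + cost                          # cumulative cost from the start
            if g2 < best_g.get(nxt, float('inf')):
                best_g[nxt] = g2                   # better path found: update g
                heapq.heappush(open_list, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None

# Hypothetical weighted graph: the cheap route A-B-C-D (cost 3) beats A-C-D (cost 5).
graph = {'A': [('B', 1), ('C', 4)], 'B': [('C', 1), ('D', 5)],
         'C': [('D', 1)], 'D': []}
h = {'A': 3, 'B': 2, 'C': 1, 'D': 0}               # admissible: never overestimates
```

With an admissible h, the first goal popped from the queue is optimal, which is the admissibility property discussed below.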
The heuristic costs from each city to Bucharest:
A* search properties:
The algorithm A* is admissible. This means that, provided a solution exists,
the first solution found by A* is an optimal solution. A* is admissible under
the following conditions:
• every node in the state-space graph has a finite number of successors, and every arc cost
is greater than some positive amount ε;
• the heuristic never overestimates: h(n) ≤ h*(n) for every node n, where h* is the true cost
of the cheapest path from n to a goal.
A* is also complete.
IDA* is complete and optimal. Space usage is linear in the depth of the solution. Each iteration is a
depth-first search, and thus it does not require a priority queue.
Iterative deepening A* (IDA*) eliminates the memory constraints of A* search algorithm
without sacrificing solution optimality.
Each iteration of the algorithm is a depth-first search that keeps track of the cost, f(n) = g(n)
+ h(n), of each node generated.
As soon as a node is generated whose cost exceeds a threshold for that iteration, its path is
cut off, and the search backtracks before continuing.
The cost threshold is initialized to the heuristic estimate of the initial state, and in each
successive iteration is increased to the total cost of the lowest-cost node that was pruned
during the previous iteration.
The algorithm terminates when a goal state is reached whose total cost does not exceed the
current threshold.
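A Python sketch of the threshold mechanism just described: each iteration is a depth-first search cut off when f(n) = g(n) + h(n) exceeds the threshold, and the next threshold is the lowest f pruned. The graph and heuristic are the same hypothetical example used for A*:

```python
def ida_star(start, successors, is_goal, h):
    """IDA*: depth-first contour search; the f-threshold grows each iteration."""
    def search(path, g, threshold):
        node = path[-1]
        f = g + h(node)
        if f > threshold:
            return f                    # cut off: report the f that exceeded
        if is_goal(node):
            return path                 # goal within the current threshold
        minimum = float('inf')          # lowest f pruned in this subtree
        for nxt, cost in successors(node):
            if nxt not in path:         # avoid cycles along the current path
                result = search(path + [nxt], g + cost, threshold)
                if isinstance(result, list):
                    return result
                minimum = min(minimum, result)
        return minimum

    threshold = h(start)                # initialized to h(initial state)
    while True:
        result = search([start], 0, threshold)
        if isinstance(result, list):
            return result
        if result == float('inf'):
            return None                 # no node was pruned: search space exhausted
        threshold = result              # next threshold = lowest pruned f

graph = {'A': [('B', 1), ('C', 4)], 'B': [('C', 1), ('D', 5)],
         'C': [('D', 1)], 'D': []}
h = {'A': 3, 'B': 2, 'C': 1, 'D': 0}
```

Only the current path is stored, which is where the linear space usage comes from.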
Sometimes a problem is not embedded in a long set of action sequences but requires picking the
best option from available choices. A good general-purpose problem solving technique is to list the
constraints of a situation (either negative constraints, like limitations, or positive elements that you
want in the final solution). Then pick the choice that satisfies most of the constraints.
Formally speaking, a constraint satisfaction problem (or CSP) is defined by a set of variables,
X1, X2, …, Xn, and a set of constraints, C1, C2, …, Cm. Each variable Xi has a nonempty domain
Di of possible values. Each constraint Ci involves some subset of the variables and specifies the
allowable combinations of values for that subset. A state of the problem is defined by an assignment
of values to some or all of the variables, {Xi = vi, Xj = vj, …}. An assignment that does not violate
any constraints is called a consistent or legal assignment. A complete assignment is one in which
every variable is mentioned, and a solution to a CSP is a complete assignment that satisfies all the
constraints. Some CSPs also require a solution that maximizes an objective function.
1. Initial state: the empty assignment {}, in which all variables are unassigned.
2. Successor function: a value can be assigned to any unassigned variable, provided that it does
not conflict with previously assigned variables.
Examples:
We are given the task of coloring each region red, green, or blue in such a way that neighboring
regions must not have the same color.
To formulate this as a CSP, we define the variables to be the regions: WA, NT, Q, NSW, V, SA, and
T. The domain of each variable is the set {red, green, blue}. The constraints require neighboring
regions to have distinct colors: for example, the allowable combinations for WA and NT are the
pairs {(red,green),(red,blue),(green,red),(green,blue),(blue,red),(blue,green)}. (The constraint can
also be represented as the inequality WA ≠ NT). There are many possible solutions, such as {WA
= red, NT = green, Q = red, NSW = green, V = red, SA = blue, T = red}. Map of Australia
showing each of its states and territories
Constraint Graph: A CSP is usually represented as an undirected graph, called constraint
graph where the nodes are the variables and the edges are the binary constraints.
> Initial state : the empty assignment {},in which all variables are unassigned.
> Successor function: a value can be assigned to any unassigned variable, provided that
it does not conflict with previously assigned variables.
> Goal test: the current assignment is complete.
> Path cost: a constant cost (e.g., 1) for every step.
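The formulation above maps directly onto a simple backtracking search. A sketch in Python, using the Australia map-colouring CSP from the text (the adjacency list encodes which regions border each other):

```python
def backtrack(assignment, variables, domains, neighbours):
    """Depth-first backtracking search for a CSP with inequality constraints."""
    if len(assignment) == len(variables):
        return assignment                       # complete, consistent assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        # consistent iff no already-assigned neighbour has the same colour
        if all(assignment.get(n) != value for n in neighbours[var]):
            assignment[var] = value
            result = backtrack(assignment, variables, domains, neighbours)
            if result is not None:
                return result
            del assignment[var]                 # undo and try the next value
    return None

# The Australia map-colouring CSP: WA ≠ NT, WA ≠ SA, NT ≠ SA, NT ≠ Q, etc.
variables = ['WA', 'NT', 'SA', 'Q', 'NSW', 'V', 'T']
domains = {v: ['red', 'green', 'blue'] for v in variables}
neighbours = {'WA': ['NT', 'SA'], 'NT': ['WA', 'SA', 'Q'],
              'SA': ['WA', 'NT', 'Q', 'NSW', 'V'], 'Q': ['NT', 'SA', 'NSW'],
              'NSW': ['Q', 'SA', 'V'], 'V': ['SA', 'NSW'], 'T': []}
solution = backtrack({}, variables, domains, neighbours)
```

Each recursive call is one application of the successor function: assign a value to an unassigned variable only if it conflicts with no previously assigned variable; the goal test is that the assignment is complete.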
MiniMax Algorithm:
1. Generate the whole game tree.
2. Apply the utility function to leaf nodes to get their values.
3. Use the utility of nodes at level n to derive the utility of nodes at level n-1.
4. Continue backing up values towards the root (one layer at a time).
5. Eventually the backed up values reach the top of the tree, at which point Max chooses the
move that yields the highest value. This is called the minimax decision because it maximises
the utility for Max on the assumption that Min will play perfectly to minimise it.
Example:
Properties of minimax:
• Pruning: eliminating a branch of the search tree from consideration without exhaustive
examination of each node
• a-b Pruning: the basic idea is to prune portions of the search tree that cannot improve the
utility value of the max or min node, by just considering the values of nodes seen so far.
• Alpha-beta pruning is used on top of minimax search to detect paths that do not need to be
explored. The intuition is:
• The MAX player is always trying to maximize the score. Call this a.
• The MIN player is always trying to minimize the score. Call this b .
• Alpha cutoff: Given a Max node n, cutoff the search below n (i.e., don't generate or
examine any more of n's children) if alpha(n) >= beta(n)
(alpha increases and passes beta from below)
• Beta cutoff.: Given a Min node n, cutoff the search below n (i.e., don't generate or examine
any more of n's children) if beta(n) <= alpha(n)
(beta decreases and passes alpha from above)
• Carry alpha and beta values down during search Pruning occurs whenever alpha >= beta
Algorithm:
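A Python sketch of minimax with the alpha-beta cutoffs described above (pruning whenever alpha >= beta); the tree representation is an assumption for illustration:

```python
def alphabeta(node, alpha, beta, maximizing, utility, children):
    """Minimax with alpha-beta cutoffs: prune whenever alpha >= beta."""
    kids = children(node)
    if not kids:
        return utility(node)           # leaf: apply the utility function
    if maximizing:
        value = float('-inf')
        for k in kids:
            value = max(value, alphabeta(k, alpha, beta, False, utility, children))
            alpha = max(alpha, value)  # carry alpha down during search
            if alpha >= beta:
                break                  # beta cutoff: Min will never allow this line
        return value
    else:
        value = float('inf')
        for k in kids:
            value = min(value, alphabeta(k, alpha, beta, True, utility, children))
            beta = min(beta, value)    # carry beta down during search
            if alpha >= beta:
                break                  # alpha cutoff
        return value

# Same hypothetical tree as in the minimax sketch: the result is identical,
# but once b1 = 2 is seen at Min node b (beta = 2 <= alpha = 3), b2 is pruned.
tree = {'root': ['a', 'b'], 'a': ['a1', 'a2'], 'b': ['b1', 'b2']}
vals = {'a1': 3, 'a2': 5, 'b1': 2, 'b2': 9}
```

This mirrors the walkthrough below: the decision is unchanged, but not all nodes of the tree are visited.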
Example:
1) Setup phase: Assign to each left-most (or right-most) internal node of the tree,
variables: alpha = -infinity, beta = +infinity
2) Look at first computed final configuration value. It’s a 3. Parent is a min node, so
set the beta (min) value to 3.
3) Look at next value, 5. Since parent is a min node, we want the minimum of
3 and 5 which is 3. Parent min node is done – fill alpha (max) value of its parent max
node. Always set alpha for max nodes and beta for min nodes. Copy the state of the max parent node
into the second unevaluated min child.
4) Look at next value, 2. Since parent node is min with b=+inf, 2 is smaller, change b.
5) Now, the min parent node has a max value of 3 and min value of 2. The value of the
2nd child does not matter. If it is >2, 2 will be selected for min node. If it is <2, it will be
selected for min node, but since it is <3 it will not get selected for the parent max node. Thus,
we prune the right subtree of the min node. Propagate max value up the tree.
6) Max node is now done and we can set the beta value of its parent and propagate node
state to sibling subtree’s left-most path.
7) The next node is 10. 10 is not smaller than 3, so state of parent does not change. We still have to look
at the 2nd child since alpha is still –inf.
8) The next node is 4. Smallest value goes to the parent min node. Min subtree is done, so the
parent max node gets the alpha (max) value from the child. Note that if the max node had a
2nd subtree, we can prune it since a>b.
9) Continue propagating value up the tree, modifying the corresponding alpha/beta values. Also
propagate the state of root node down the left-most path of the right subtree.
10) Next value is a 2. We set the beta (min) value of the min parent to 2. Since no other
children exist, we propagate the value up the tree.
11) We have a value for the 3rd level max node; now we can modify the beta (min) value of the min
parent to 2. Now we have a situation where a>b, and thus the value of the rightmost subtree of the min
node does not matter, so we prune the whole subtree.
12) Finally, no more nodes remain, we propagate values up the tree. The root has a value of 3 that
comes from the left-most child. Thus, the player should choose the left-most child’s move in order to
maximize his/her winnings. As you can see, the result is the same as with the mini-max example, but
we did not visit all nodes of the tree.
Essay Questions:
1. How to define a problem as state space search? Discuss it with the help of an
example
2. Discuss A* algorithm in detail?
3. Solve the water-jug problem by writing the production rules
4. Search in game playing programs always proceeds forward from current state to goal
state. Why? Explain
5. Explain the problem characteristics.
6. Solve the following crypt arithmetic puzzle. Write constraint equations and find one
solution using DFS by showing the steps involved in finding the solution
7. Write the A* algorithm
MCQ :
Problem Solving
Answer: b
Explanation:
Because greedy best-first search is used, it will quickly lead to the solution of the problem.
Game Theory :
Since intelligence often seems to involve some kind of reasoning, it becomes clear that logic, the
science of reasoning, may play an important role in AI. Within the symbolic approach to AI, one may
have different views as to the exact role of logic in this enterprise.
The most influential figure in logical AI is John McCarthy. McCarthy was one of the founders of
AI, and consistently advocated a research methodology that uses logical techniques to formalize the
reasoning problems that AI needs to solve.
An argument, in the sense understood by a logician (as opposed to the sense which simply means a
“disagreement”: recall the Monty Python sketch) is an attempt to establish a conclusion.
Given an argument, we want to know if the argument is valid. Does the conclusion follow logically
from the premises? Is the conclusion a logical consequence of the premises? To provide a clear
analysis of the logical concepts of “validity” and “logical consequence”, and to provide methods for
classifying arguments as valid or invalid, is the central task of logic.
An argument is VALID if it is impossible for the premises to be true, with the conclusion
false.
Types of logic
1. Propositional logic
2. First-order logic
3. Temporal logic
4. Map-coloring logic
3.4.2. Propositional logic
An atomic sentence consists of a single propositional symbol, representing a Proposition that can be
true or false.
A literal is a propositional symbol or its negation. Complex sentences are constructed from simpler
sentences using logical connectives:
Semantics:
The meaning of a propositional symbol (e.g., “It rains outside”) is True (T) or False (F), depending
on whether the symbol is satisfied in the world under a given interpretation.
Example interpretation I’: Light in the room is on -> False, It rains outside -> False
Translation:
Assume the following sentences:
Denote:
• r = We will go swimming
Some composite sentences may always (under any interpretation) evaluate to a single truth value:
DeMorgan’s Laws
¬ ( P ∧ Q ) ⇔ (¬ P ∨ ¬ Q )
¬ ( P ∨ Q ) ⇔ (¬ P ∧ ¬ Q )
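Since each propositional symbol takes only two truth values, these equivalences can be verified exhaustively over all four interpretations of P and Q. A small Python check:

```python
from itertools import product

def de_morgan_holds():
    """Check both De Morgan laws under every interpretation of P and Q."""
    for P, Q in product([True, False], repeat=2):
        if (not (P and Q)) != ((not P) or (not Q)):
            return False               # first law fails under this interpretation
        if (not (P or Q)) != ((not P) and (not Q)):
            return False               # second law fails under this interpretation
    return True
```

Because the laws hold under every interpretation, each biconditional is valid (a tautology) in the sense defined below.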
A model (in logic): An interpretation is a model for a set of sentences if it assigns true to each
sentence in the set.
A sentence is satisfiable if it has a model – there is at least one interpretation under which the
sentence evaluates to True.
A sentence is valid if it is True in all interpretations – i.e., if its negation is not satisfiable (leads to a
contradiction).
Example :
Inference rules for logic:
Logical Equivalences:
Resolution
• A powerful inference rule that yields a sound and complete inference algorithm when coupled with a
complete search algorithm:
Resolution Algorithm
• It’s easy to show that the resolution algorithm is sound, but is it also complete?
• Resolution closure: the set of all clauses derived from the applications of resolution. Is it finite?
• Ground resolution theorem: if a set of clauses is unsatisfiable, then their closure contains an empty clause.
Here we will build a hierarchy of axiomatic systems for the propositional logic, by gradually
adding axioms for the logical connectives.
If we assume that the only logical connectives are ¬ and → and all others are definable in terms
of them, then the axiomatic system comprises the following axioms and rules:
A formula A which can be derived by using the axioms and applying successively Modus Ponens is
said to be derivable in H, or a theorem of H, which we will denote by ├H A.
Using H one can derive logical consequences, too. In order to derive "If A1 … An then B" we add the
premises A1 … An to the set of axioms and try to derive B. If we succeed, we say that B is derivable
in H from the assumptions A1 … An, denoted A1 … An ├H B.
One can check that all axioms of H are tautologies and therefore, since the rule Modus Ponens is
valid, H can only derive tautologies when using these axioms as premises.
By the same argument, H can only derive valid logical consequences. Therefore, the axiomatic
system H is sound, or correct. In fact, it can be proved that H can derive all valid logical
consequences, and in particular, all tautologies, i.e., it is complete. Thus, H captures precisely the
notion of propositional logical consequence.
The method of semantic tableaux is an efficient decision procedure for satisfiability (and by duality
validity) in propositional logic.
The principle behind semantic tableaux is very simple: search for a model (satisfying interpretation)
by decomposing the formula into sets of atoms and negations of atoms. It is easy to check if there is
an interpretation for each set: a set of atoms and negations of atoms is satisfiable iff the set does not
contain an atom p and its negation ¬p. The formula is satisfiable iff one of these sets is satisfiable.
In the method of semantic tableaux, sets of formulas label nodes of a tree, where each path in the
tree represents the formulas that must be satisfied in one possible interpretation.
Semantic Tableaux Rules:
Rule 1: A tableau for a formula (α Λ β) is constructed by adding both α and β to the same path
(branch). This can be represented as follows:
α Λ β
α
β
Rule 2: A tableau for a formula ~(α Λ β) is constructed by adding two alternative paths, one
containing ~α and the other containing ~β.
Rule 3: A tableau for a formula (α V β) is constructed by adding two new paths, one containing α
and the other containing β.
Rule 4: A tableau for a formula ~(α V β) is constructed by adding both ~α and ~β to the same
path. This can be expressed as follows:
~(α V β)
~α
~β
Rule 5: A tableau for ~~α is constructed by adding α to the same path:
~~α
α
Propositional Resolution:
Resolution rule:
αvβ
¬β v γ
αvγ
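This rule can be sketched on clauses represented as sets of literals; the string convention below, with '~' marking negation, is an illustration choice, not from the text:

```python
def resolve(clause1, clause2, literal):
    """Resolve two clauses on a literal: (α ∨ β) and (¬β ∨ γ) yield (α ∨ γ)."""
    neg = literal[1:] if literal.startswith('~') else '~' + literal
    if literal in clause1 and neg in clause2:
        # drop the complementary pair and union the remaining literals
        return (clause1 - {literal}) | (clause2 - {neg})
    return None                        # the clauses do not clash on this literal

# (a ∨ b) resolved with (~b ∨ c) on b gives the resolvent (a ∨ c):
resolvent = resolve(frozenset({'a', 'b'}), frozenset({'~b', 'c'}), 'b')
```

Resolving a unit clause {p} with {~p} yields the empty clause, which is how a refutation (below) signals a contradiction.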
Resolution refutation:
Procedure :
Example 1.:
The above binary tree, showing resolution and resulting in the empty clause, is called a refutation
tree.
Example 2:
o If we can’t apply any more, then the conclusion cannot be proved from the axioms.
The type of predicate calculus that we have been referring to is also called first-order
predicate logic (FOPL).
A first-order logic is one in which the quantifiers ∀ and ∃ can be applied to objects or terms,
but not to predicates or functions.
So we can define the syntax of FOPL as follows. First, we define a term:
A constant is a term.
A variable is a term.
f(x1, x2, x3, . . . , xn) is a term if x1, x2, x3, . . . , xn are all terms.
Anything that does not meet the above description cannot be a term.
For example, the following is not a term: ∀x P(x). This kind of construction we call a
sentence or a well-formed formula (wff), which is defined as follows.
In these definitions, P is a predicate, x1, x2, x3, . . . , xn are terms, and A,B are wff ’s. The
following are the acceptable forms for wff ’s:
P(x1, x2, x3, . . . , xn)
Soundness
We have seen that a logical system such as propositional logic consists of a syntax, a
semantics, and a set of rules of deduction.
A logical system also has a set of fundamental truths, which are known as axioms.
The axioms are the basic rules that are known to be true and from which all other theorems
within the system can be proved.
A theorem of a logical system is a statement that can be proved by applying the rules of
deduction to the axioms in the system.
It can be proved by induction that both propositional logic and FOPL are sound.
Completeness
A logical system is complete if every valid statement can be proved by applying the rules of
deduction to the axioms. Both propositional logic and FOPL are complete in this sense.
Decidability
We can prove that propositional logic is decidable by using the fact that it is complete.
FOPL, on the other hand, is not decidable. This is due to the fact that it is not possible to
develop an algorithm that will determine whether an arbitrary wff in FOPL is logically
valid.
Monotonicity
A logical system is described as being monotonic if a valid proof in the system cannot be
made invalid by adding additional premises or assumptions.
In other words, even adding contradictory assumptions does not stop us from making the
proof in a monotonic system.
In fact, it turns out that adding contradictory assumptions allows us to prove anything,
including invalid conclusions. This makes sense if we recall the line in the truth table for
→, which shows that false → true. By adding a contradictory assumption, we make our
assumptions false and can thus prove any conclusion.
A proof must always be built from a fixed set of inference rules. The propositional
logic inference rules:
1. Modus Ponens:
A => B, A
---------
B
2. And-Elimination:
A1 and A2 and ... An
--------------------
Ai
3. And-Introduction:
A1, A2, ..., An
--------------------
A1 and A2 and ... An
4. Or-Introduction:
Ai
------------------
A1 or A2 or ... An
5. Double-Negation Elimination:
NOT(NOT(A))
-----------
A
6. Unit Resolution:
A or B, NOT(B)
---------------
A
7. Resolution:
A or B, NOT(B) or C
-------------------
A or C
These rules are still valid for FOL.
We need some way to deal with variables and the quantifiers "EXISTS" and "FORALL".
To describe inference rules involving variables and quantifiers, we need the notion of "substitution".
The notation SUBST(theta, alpha) will denote the result of applying the substitution theta to the
sentence alpha. Intuitively, SUBST({x/g}, alpha) is alpha with every appearance of x replaced by g.
Notice that this is a syntactic operation on sentences. Any operation involved in a proof has to be
syntactic: it has to just manipulate and transform sentences.
Many inference rules tell how to get rid of (eliminate) or introduce a connective or quantifier
into a formula.
The "universal elimination" rule lets us use a universally quantified sentence to reach a specific
conclusion, i.e. to obtain a concrete fact.
FORALL x alpha
---------------------
SUBST({x/g}, alpha)
Here g may be any term, including any term that is used elsewhere in the knowledge base.
The "existential elimination" rule lets us convert an existentially quantified sentence into a form
without the quantifier.
EXISTS x alpha
---------------------
SUBST({x/k}, alpha)
Here k must be a new constant symbol that does not appear anywhere else in the database. It is
serving as a new name for something we know must exist, but whose name we do not yet know.
The "existential introduction" rule lets us use a specific fact to obtain an existentially quantified
sentence:
alpha
------------------------------
EXISTS y SUBST({g/y}, alpha)
This rule can be viewed as "leaving out detail": it omits the name of the actual entity that satisfies
the sentence alpha, and just says that some such entity exists.
Skolemization:
The "existential elimination" rule is also called the Skolemization rule, after a mathematician named
Thoralf Skolem.
EXISTS x alpha
--------------------
subst({x/k}, alpha) where k is a FRESH constant
Saying that k is a "fresh" constant means that k is not used anywhere else in the knowledge base.
This means that k can be the name of a new entity that is not named and therefore not referred to
anywhere else in the knowledge base.
Example 1
∃x ∀y Loves(x, y)
The value of x must be the same for any y. Replace x by a fresh constant, say, S.
∀y Loves(S, y)
Example 2
∀y ∃x Loves(x, y)
Here the value of x may depend on y, so x is replaced by a Skolem function of y:
∀y Loves(Lover(y), y)
Horn clauses
Generalized modus ponens requires sentences to be in a standard form, called Horn clauses after the
mathematician Alfred Horn:
q1 ∧ q2 ∧ … ∧ qn ⇒ r
where each qi and r is an atomic sentence and all variables are universally quantified.
Procedure:
1. Eliminate implications
2. Move ¬ inwards
3. Standardize bound variables apart
4. Move quantifiers out
5. Skolemize existential variables
6. Eliminate universal quantifiers
7. Distribute ∨ over ∧
8. Flatten conjunctions and disjunctions
9. Eliminate conjunctions
There is a standard algorithm that given two sentences, finds their unique most general unifying
substitution.
Notice that the substitutions always involve a variable and a ground term.
The variable x cannot have two values at the same time, so this last example fails.
This fails because a variable may never occur in the term it is being unified with.
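The standard algorithm, including the occurs check just mentioned, can be sketched as follows; the convention of writing variables as strings starting with '?' and compound terms as tuples is an assumption for illustration:

```python
def unify(x, y, theta=None):
    """Return the most general unifier of x and y, or False on failure."""
    if theta is None:
        theta = {}
    if theta is False or x == y:
        return theta
    if isinstance(x, str) and x.startswith('?'):
        return unify_var(x, y, theta)
    if isinstance(y, str) and y.startswith('?'):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):       # unify argument lists element by element
            theta = unify(xi, yi, theta)
            if theta is False:
                return False
        return theta
    return False                       # different constants or arities: fail

def occurs(var, term, theta):
    """Occurs check: a variable may not unify with a term containing it."""
    if var == term:
        return True
    if isinstance(term, str) and term.startswith('?') and term in theta:
        return occurs(var, theta[term], theta)
    if isinstance(term, tuple):
        return any(occurs(var, t, theta) for t in term)
    return False

def unify_var(var, term, theta):
    if var in theta:
        return unify(theta[var], term, theta)  # the variable already has a value
    if occurs(var, term, theta):
        return False                   # e.g. ?x with f(?x) fails
    theta = dict(theta)
    theta[var] = term
    return theta
```

For example, `unify(('Loves', '?x', 'John'), ('Loves', 'Mary', '?y'))` yields the substitution {?x/Mary, ?y/John}, while `unify('?x', ('f', '?x'))` fails on the occurs check.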
Resolution by Refutation:
A proof by contradiction is also called a refutation. The resolution rule with factoring is a
refutation-complete inference system: if a set of clauses is unsatisfiable, then False (the empty
clause) is provable from it by resolution.
The law says that it is a crime for an American to sell weapons to hostile nations. The country
Nono, an enemy of America, has some missiles, and all of its missiles were sold to it by Colonel
West, who is American.
Soln :
Missile(x) ⇒ Weapon(x)
Enemy(x, America) ⇒ Hostile(x)
A resolution-based inference system has just one rule to apply to build a proof.
However, at any step there may be several possible ways to apply the resolution rule.
Several resolution strategies have been developed in order to reduce the search space.
Weblinks :
1. http://homepage.cs.uiowa.edu/~tinelli/classes/145/Fall05/notes/9-
inference.pdf
2. http://www.sdsc.edu/~tbailey/teaching/cse151/lectures/chap09a.html
3. https://people.cs.pitt.edu/~milos/courses/cs2710/lectures/Class9.pdf
4. artint.info/html/ArtInt_102.html
Essay questions :
MCQ :
Predicate Logic
1. There exist only two types of quantifiers, Universal Quantification and Existential Quantification.
a) True
b) False
View Answer
Answer: a
Explanation: None.
2. Translate the following statement into FOL.
“For every a, if a is a philosopher, then a is a scholar”
a) ∀ a philosopher(a) → scholar(a)
b) ∃ a philosopher(a) → scholar(a)
c) All of the mentioned
d) None of the mentioned
View Answer
Answer: a
Explanation: None.
3. A _________ is used to demonstrate, on a purely syntactic basis, that one formula is a logical consequence
of another formula.
a) Deductive Systems
b) Inductive Systems
c) Reasoning with Knowledge Based Systems
d) Search Based Systems
View Answer
Answer: a
Explanation: Refer the definition of Deductive based systems.
4. The statement comprising the limitations of FOL is/are
a) Expressiveness
b) Formalizing Natural Languages
c) Many-sorted Logic
d) All of the mentioned
View Answer
Answer: d
Explanation:
Expressiveness: The Löwenheim–Skolem theorem shows that if a first-order theory has any infinite model,
then it has infinite models of every cardinality. In particular, no first-order theory with an infinite model can
be categorical. Thus there is no first-order theory whose only model has the set of natural numbers as its
domain, or whose only model has the set of real numbers as its domain. Many extensions of first-order logic,
including infinitary logics and higher-order logics, are more expressive in the sense that they do permit
categorical axiomatizations of the natural numbers or real numbers. This expressiveness comes at a meta-
logical cost, however: by Lindström’s theorem, the compactness theorem and the downward Löwenheim–
Skolem theorem cannot hold in any logic stronger than first-order.
Formalizing Natural Languages : First-order logic is able to formalize many simple quantifier constructions
in natural language, such as “every person who lives in Perth lives in Australia”. But there are many more
complicated features of natural language that cannot be expressed in (single-sorted) first-order logic.
Many-sorted Logic: Ordinary first-order interpretations have a single domain of discourse over which all
quantifiers range.
Many-sorted first-order logic allows variables to have different sorts, which have different domains.
5. A common convention is:
• Negation (¬) is evaluated first
• Conjunction (∧) and disjunction (∨) are evaluated next
• Quantifiers are evaluated next
• Implication (⇒) is evaluated last.
a) True
b) False
View Answer
Answer: a
Explanation: None.
6. A Term is either an individual constant (a 0-ary function), or a variable, or an n-ary function applied to n
terms: f(t1, t2, ..., tn).
a) True
b) False
View Answer
Answer: a
Explanation: Definition of term in FOL.
7. First Order Logic is also known as ___________
a) First Order Predicate Calculus
b) Quantification Theory
c) Lower Order Calculus
d) All of the mentioned
View Answer
Answer: d
Explanation: None.
8. The adjective “first-order” distinguishes first-order logic from ___________ in which there are predicates
having predicates or functions as arguments, or in which one or both of predicate quantifiers or function
quantifiers are permitted.
a) Representational Verification
b) Representational Adequacy
c) Higher Order Logic
d) Inferential Efficiency
View Answer
Answer: c
Explanation: None.
Propositional Logic
CO-2: Understand logic, and the relationship of logic to formal knowledge representation and reasoning
CO-3: Understand the underlying ideas of Semantic Web and its layered architecture
Lecture Plan:
Lecture no. | Topic | Methodology | Quick reference
40 | Introduction, Approaches to Knowledge Representation | chalk-board | T1:7.1-7.2
41 | Knowledge representation using semantic network | chalk-board | T1:7.3-7.4
42 | Knowledge representation using semantic network | chalk-board | T1:7.3-7.4
43 | Extended semantic networks for KR | chalk-board | T1:7.5
44 | Knowledge representation using frames | chalk-board | T1:15.1-15.3
45 | Knowledge representation using frames | chalk-board | T1:15.4-15.5
46 | Introduction to Advanced KR techniques | chalk-board | T1:15.4
47 | Conceptual dependency theory | chalk-board | T1:15.5
48 | Script structure, Cyc theory | chalk-board | T1:15.6
49 | Case grammars, semantic web | chalk-board | T1:15.6
T1: Artificial Intelligence- Saroj Kaushik, CENGAGE Learning
Knowledge and Representation are two distinct entities. They play central but
distinguishable roles in intelligent systems.
2. Rules: Production rules, sometimes called IF-THEN rules, are the most popular form of KR. Production
rules are simple but powerful forms of KR. They provide the flexibility of
combining declarative and procedural representation for using them in a unified form.
Examples of production rules :
1. IF condition THEN action
2. IF premise THEN conclusion
3. IF proposition p1 and proposition p2 are true THEN proposition p3 is true
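Rules of this IF-THEN form can be executed by a simple forward-chaining loop that fires any rule whose premises are all established facts. The sketch below is our own minimal illustration (the rule and fact names are invented):

```python
# Each rule is (list of premise facts, conclusion fact) -- IF premises THEN conclusion.
rules = [
    (["has_fever", "has_rash"], "suspect_measles"),
    (["suspect_measles"], "recommend_doctor"),
]

def forward_chain(facts, rules):
    """Repeatedly fire IF-THEN rules until no new fact is produced."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if all(p in facts for p in premises) and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

derived = forward_chain(["has_fever", "has_rash"], rules)
print(sorted(derived))   # the two rules chain: suspect_measles, then recommend_doctor
```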
3. Frames
4. Semantic Net
Knowledge representation using semantic network
Lexical part :
nodes – denoting objects
links – denoting relations between objects
labels – denoting particular objects and relations
Structural part
the links and nodes form directed graphs
the labels are placed on the links and nodes
Semantic part
meanings are associated with the link and node labels
(the details will depend on the application domain)
Procedural part
constructors allow creation of new links and nodes
destructors allow the deletion of links and nodes
writers allow the creation and alteration of labels
readers can extract answers to questions
a. They allow us to structure the knowledge to reflect the structure of that part of the
world which is being represented.
b. There are very powerful representational possibilities as a result of “is a” and “is a
part of” inheritance hierarchies.
c. They can accommodate a hierarchy of default values (for example, we can assume
the height of an adult male to be 178 cm, but if we know he is a basketball player we
should take it to be 195 cm).
d. They can be used to represent events and natural language sentences.
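The four parts described above can be sketched as a tiny directed graph of labeled edges; inheritance is just a search along "is a" links. The node and relation names below are our own illustration, not from the text:

```python
# A semantic net as labeled directed edges (tail node, relation, head node).
edges = {
    ("Clyde", "instance_of", "Elephant"),
    ("Elephant", "is_a", "Mammal"),
    ("Mammal", "is_a", "Animal"),
    ("Elephant", "has_part", "Trunk"),
}

def inherits(node, target):
    """Inheritance search: follow is_a / instance_of links upward."""
    if node == target:
        return True
    parents = [h for (t, rel, h) in edges
               if t == node and rel in ("is_a", "instance_of")]
    return any(inherits(p, target) for p in parents)

# Clyde -> Elephant -> Mammal -> Animal, so Clyde is an Animal:
print(inherits("Clyde", "Animal"))   # True
```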
Knowledge representation using frames
The idea of semantic networks started out as a natural way to represent labeled
connections between entities. But, as the representations are expected to support
increasingly large ranges of problem solving tasks, the representation schemes
necessarily become increasingly complex. In particular, it becomes necessary to
assign more structure to nodes, as well as to links.
It is natural to use database ideas to keep track of everything, and the nodes and their
relations begin to look more like frames.
A frame consists of a selection of slots which can be filled by values, or procedures
for calculating values, or pointers to other frames.
Frames and semantic nets: Frames can be viewed as a structural representation of semantic nets.
Examples: below are four frames for the entities "Mammal", "Elephant", "Clyde", and "Nellie"
(The symbol * means that the value of the feature is typical for the entity, represented by the frame.)
Mammal
subclass: Animal
warm_blooded: yes
Elephant
subclass: Mammal
* color: grey
* size: large
Clyde
instance: Elephant
color: pink
owner: Fred
Nellie
instance: Elephant
size: small
Owner
instance: Slot
single_valued: no
range: Person
The attribute value Fred (and even "large", "grey", etc) could be represented as a frame, e.g.:
Fred
instance: Person
occupation: Elephant-breeder
Frames have greater representational power than semantic nets
Necessary attributes
Typical attributes ("*" used to indicate attributes that are only true of a typical member of
the class, and not necessarily every member).
Type constraints and default values of slots, overriding values.
Slots and procedures: a slot may have a procedure to compute the value of the slot if needed
e.g. object area, given the size
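The Mammal/Elephant/Clyde/Nellie frames above can be sketched as dictionaries in which slot lookup falls back to the parent frame, so the starred default values are inherited and can be overridden locally (this encoding is our own sketch, not from the text):

```python
# Frames as dicts; 'instance'/'subclass' slots point to the parent frame.
frames = {
    "Mammal":   {"subclass": "Animal", "warm_blooded": "yes"},
    "Elephant": {"subclass": "Mammal", "color": "grey", "size": "large"},
    "Clyde":    {"instance": "Elephant", "color": "pink", "owner": "Fred"},
    "Nellie":   {"instance": "Elephant", "size": "small"},
}

def get_slot(frame, slot):
    """Look up a slot, inheriting from parent frames when it is absent."""
    while frame in frames:
        data = frames[frame]
        if slot in data:
            return data[slot]
        frame = data.get("instance") or data.get("subclass")
    return None

print(get_slot("Clyde", "color"))         # pink: local value overrides the default
print(get_slot("Nellie", "color"))        # grey: inherited default from Elephant
print(get_slot("Clyde", "warm_blooded"))  # yes:  inherited from Mammal
```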
Primitive conceptualizations:
Conceptual dependency theory of four primitive conceptualizations
1. Actions ( ACT: actions)
2. objects (PP : picture producers)
3. modifiers of actions ( AA : action aiders)
4. modifiers of objects ( PA: picture aiders)
Conceptual Roles
Conceptualization: The basic unit of the conceptual level of understanding.
Actor : The performer of an ACT.
ACT : An action done to an object.
Object : A thing that is acted upon.
Recipient : The receiver of an object as the result of an ACT.
Direction : The location that an ACT is directed toward.
State : The state that an object is in.
Advantages of CD:
Using these primitives involves fewer inference rules.
Many inference rules are already represented in CD structure.
The holes in the initial structure help to focus on the points still to be established.
Disadvantages of CD:
Knowledge must be decomposed into fairly low level primitives.
Impossible or difficult to find correct set of primitives.
A lot of inference may still be required.
Representations can be complex even for relatively simple actions.
Scripts:
Scripts, introduced by Schank and Abelson, build on Minsky’s idea of frames: a script is a frame-based
structure that describes stereotyped sequences of events in a particular context.
The slots in such frames will contain several different kinds of information, some of which may be
rather complex. Typically they will contain:
1. Information about how to use the frame.
2. A specification for the language/notation used in the frame.
3. Details about the ‘props’ and ‘roles’ that may be encountered.
4. Instructions about what one can expect to happen next, or what one should do next.
5. Indications about what to do if our expectations are not confirmed.
6. Any other information/instructions that might be appropriate.
In formulating scripts it is sensible to build them within a framework of agents manipulating props
using a well defined set of primitive acts, such as:
MOVE Movement of a body part by its owner (e.g. kick)
PROPEL Application of physical force to an object (e.g. push)
GRASP Grasping of an object by an agent (e.g. clutch)
INGEST Ingestion of an object by an animal (e.g. eat)
ATTEND Focussing of a sensor towards a stimulus (e.g. listen)
MTRANS Transfer of mental information (e.g. tell)
SPEAK Production of sounds (e.g. say)
PTRANS Transfer of the physical location of an object (e.g. go)
ATRANS Transfer of an abstract relationship (e.g. give)
MBUILD Building of new information out of old (e.g. decide)
Components of a Script
Looking at some typical scripts we can identify six important components:
Entry conditions: Conditions that must be satisfied before the events described in the script can
occur.
Roles : Agents involved in the events described in the script (which may be explicitly or implicitly
declared).
Props : Objects involved in the events described in the script (which may be explicitly or implicitly
declared).
Scenes: All the actual sequences of events that are represented in the script.
Track : The specific route through the possible sequences of events that arises when the
Script is processed.
Results: Conditions that will be true after the events described in the script have occurred.
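These six components can be collected into a concrete data structure. The restaurant script below is a commonly used illustration; the specific entries are our own sketch, not taken from the text:

```python
# A stereotyped "restaurant" script with the six components named above.
restaurant_script = {
    "entry_conditions": ["customer is hungry", "customer has money"],
    "roles": ["customer", "waiter", "cook", "cashier"],
    "props": ["table", "menu", "food", "bill", "money"],
    "scenes": {
        "entering": ["PTRANS customer into restaurant", "MOVE to table"],
        "ordering": ["ATTEND eyes to menu", "MTRANS order to waiter"],
        "eating":   ["ATRANS food to customer", "INGEST food"],
        "exiting":  ["ATRANS money to cashier", "PTRANS customer out"],
    },
    "track": ["entering", "ordering", "eating", "exiting"],
    "results": ["customer is not hungry", "customer has less money"],
}

# Expanding the track yields the default expected sequence of primitive acts:
default_events = [e for scene in restaurant_script["track"]
                  for e in restaurant_script["scenes"][scene]]
print(default_events)
```

Expectations not confirmed at run time (e.g. the waiter never brings food) would trigger the "what to do if expectations fail" information discussed above.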
Cyc Project:
AI comprises three tasks:
1. Develop a language (actually logic) for expressing knowledge. Since we would like to allow
many different programs to use this knowledge, this "representation language" needs a
declarative semantics.
2. Develop a set of procedures for manipulating (i.e., using) knowledge. Some of these will of
necessity be heuristic, some will be strategic or meta-level, some will be aimed at truth
maintenance and default reasoning, and some will be inductive rather than deductive, and so
on.
3. Build the knowledge base(s). For example, encode knowledge in the language developed in
(1) above, so that one (person or machine) can apply to it the reasoning mechanisms of (2).
Semantic Web:
The Semantic Web is the extension of the World Wide Web that enables people to share content
beyond the boundaries of applications and websites. It has been described in rather different ways:
as a utopic vision, as a web of data, or merely as a natural paradigm shift in our daily use of the
Web.
Essay Questions :
1. Every living thing needs oxygen to live. Every human is a living thing. John is human. Prove that
John needs oxygen to live. (Answer: John is a living thing, so John needs oxygen to live.)
MCQ :
Semantic Nets:
1. Which among the following constitutes the representation of knowledge in different forms?
a) Relational method where each fact is set out systematically in columns
b) Inheritable knowledge where relational knowledge is made up of objects
c) Inferential knowledge
d) All of the mentioned
View Answer
Answer: d
Explanation: None.
2. A Semantic Network is
a) A way of representing knowledge
b) Data Structure
c) Data Type
d) None of the mentioned
View Answer
Answer: a
Explanation: None.
3. The graph used to represent a semantic network is
a) Undirected graph
b) Directed graph
c) Directed Acyclic graph (DAG)
d) Directed complete graph
View Answer
Answer: b
Explanation: Semantic Network is a directed graph consisting of vertices, which represent concepts and
edges, which represent semantic relations between the concepts.
4. Following are the Semantic Relations used in Semantic Networks.
a) Meronymy
b) Holonymy
c) Hyponymy
d) All of the mentioned
View Answer
Answer: d
Explanation: None.
5. Meronymy relation means,
a) A is part of B
b) B has A as a part of itself
c) A is a kind of B
d) A is superordinate of B
View Answer
Answer: a
Explanation: A meronym denotes a constituent part of, or a member of something. That is,
“X” is a meronym of “Y” if Xs are parts of Y(s), or
“X” is a meronym of “Y” if Xs are members of Y(s).
6. Hypernymy relation means,
a) A is part of B
b) B has A as a part of itself
c) A is a kind of B
d) A is superordinate of B
View Answer
Answer: d
Explanation: In linguistics, a hyponym is a word or phrase whose semantic field is included within that of
another word, its hypernym (sometimes spelled hyperonym outside of the natural language processing
community). In simpler terms, a hyponym shares a type-of relationship with its hypernym.
7. Holonymy relation means,
a) A is part of B
b) B has A as a part of itself
c) A is a kind of B
d) A is superordinate of B
View Answer
Answer: b
Explanation: Holonymy (in Greek holon = whole and onoma = name) is a semantic relation. Holonymy
defines the relationship between a term denoting the whole and a term denoting a part of, or a member of, the
whole. That is,
‘X’ is a holonym of ‘Y’ if Ys are parts of Xs, or
‘X’ is a holonym of ‘Y’ if Ys are members of Xs.
8. The basic inference mechanism in semantic network is to follow the links between the nodes.
a) True
b) False
View Answer
Answer: a
Explanation: None.
9. There exist two ways to infer using semantic networks.
1) Intersection Search
2) Inheritance Search
a) True
b) False
View Answer
Answer: a
Explanation: None.
Frames :
1. A Frame is
a) A way of representing knowledge
b) Data Structure
c) Data Type
d) None of the mentioned
View Answer
Answer: a
Explanation: None.
2. Frames in artificial intelligence is derived from semantic nets.
a) True
b) False
View Answer
Answer: a
Explanation: A frame is an artificial intelligence data structure used to divide knowledge into substructures
by representing “stereotyped situations.”.
3. Which of the following elements constitute the frame structure?
a) Facts or Data
b) Procedures and default values
c) Frame names
d) Frame reference in hierarchy
View Answer
Answer: a
Explanation: None.
4. Like semantic networks, frames can be queried using spreading activation.
a) True
b) False
View Answer
Answer: a
Explanation: None.
5. Hyponymy relation means,
a) A is part of B
b) B has A as a part of itself
c) A is subordinate of B
d) A is superordinate of B
View Answer
Answer: c
Explanation: In linguistics, a hyponym is a word or phrase whose semantic field is included within that of
another word, its hypernym (sometimes spelled hyperonym outside of the natural language processing
community). In simpler terms, a hyponym shares a type-of relationship with its hypernym.
6. The basic inference mechanism in semantic network in which knowledge is represented as Frames is to
follow the links between the nodes.
a) True
b) False
View Answer
Answer: a
Explanation: None.
7. There exist two ways to infer using semantic networks in which knowledge is represented as Frames.
1) Intersection Search
2) Inheritance Search
a) True
b) False
View Answer
Answer: a
Explanation: None.
Unit V - Expert system and Applications
Unit Outcomes:
CO-1: Use an expert system shell to develop a meaningful expert system application;
CO-2: Develop a simple expert system in a specialized expert system language;
CO-3: Be aware of current market trends in expert system languages and packages.
Expert Systems (ES) are computer programs that try to replicate knowledge and skills of human
experts in some area, and then solve problems in this area (the way human experts would).
ES take their roots in Cognitive Science — the study of human mind using combination of
AI and psychology.
ES were the first successful applications of AI to real-world problems, solving problems in
medicine, chemistry, finance and even in space (Space Shuttle, robots on other planets).
In business, ES have allowed many companies to save millions of dollars.
KNOWLEDGE–BASED SYSTEMS:
DENDRAL (Feigenbaum et al, 1969) was a program that used rules to infer molecular structure
from spectral information. The challenge was that the number of possible molecules was so large,
that it was impossible to check all of them using simple rules (weak method).
The researchers consulted experts in chemistry and added several more specific rules to their
program. The number of combinations the program had to test was reduced dramatically.
DENDRAL demonstrated the importance of domain-specific knowledge.
KNOWLEDGE ENGINEERING:
The process of designing an ES is called knowledge engineering. It consists of three stages:
1. Knowledge acquisition: the process of obtaining the knowledge from experts (by
interviewing and/or observing human experts, reading specific books, etc).
2. Knowledge representation: selecting the most appropriate structures to represent the
knowledge (lists, sets, scripts, decision trees, object–attribute–value triplets, etc).
3. Knowledge validation: testing that the knowledge of the ES is correct and complete.
There are different interdependent and overlapping phases in building an expert system as follows:
● Identification Phase:
− Knowledge engineer finds out important features of the problem with the help of
domain expert (human).
− He tries to determine the type and scope of the problem, the kind of resources
required, goal and objective of the ES.
● Conceptualization Phase:
− In this phase, knowledge engineer and domain expert decide the concepts, relations
and control mechanism needed to describe a problem solving.
● Formalization Phase:
− It involves expressing the key concepts and relations in some framework supported
by ES building tools.
− Formalized knowledge consists of data structures, inference rules, control strategies
and languages for implementation.
● Implementation Phase:
− During this phase, formalized knowledge is converted to working computer program
initially called prototype of the whole system.
● Testing Phase:
− It involves evaluating the performance and utility of the prototype system and revising it if
need be. The domain expert evaluates the prototype system and his feedback helps the
knowledge engineer to revise it.
Expert System Architecture:
(Diagram: the user communicates through the user interface; the inference engine (inference & control)
draws on a static database and a dynamic database (working memory); an explanation module and
special interfaces complete the system.)
1. Knowledge Base:
The KB consists of knowledge about the problem domain in the form of static and dynamic databases.
Static knowledge consists of
− rules and facts which are compiled as a part of the system and do not change during
execution of the system.
Dynamic knowledge consists of facts related to a particular consultation of the system.
− At the beginning of the consultation, the dynamic knowledge base often called
working memory is empty.
− As a consultation progresses, dynamic knowledge base grows and is used along with
static knowledge in decision making.
Working memory is deleted at the end of consultation of the system.
2. Inference Engine
It consists of inference mechanism and control strategy.
Inference means search through knowledge base and derive new knowledge.
It involves formal reasoning involving matching and unification, similar to that performed
by a human expert to solve problems in a specific area of knowledge.
Inference operates by using the modus ponens rule.
Control strategy determines the order in which rules are applied.
There are mainly two types of control mechanism viz., forward chaining and backward chaining
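The two control mechanisms differ in direction: forward chaining is data-driven (from known facts toward conclusions, via modus ponens), while backward chaining is goal-driven. The sketch below illustrates backward chaining with invented rule and fact names (it is our own illustration, not the text's system):

```python
# Each rule: (premises, conclusion). Facts are the initial working memory.
rules = [
    (["symptom_fever", "symptom_cough"], "diagnosis_flu"),
    (["diagnosis_flu"], "advise_rest"),
]
facts = {"symptom_fever", "symptom_cough"}

def backward_chain(goal):
    """Goal-driven control: a goal holds if it is a known fact, or if some
    rule concludes it and all of that rule's premises hold recursively."""
    if goal in facts:
        return True
    return any(all(backward_chain(p) for p in premises)
               for premises, conclusion in rules if conclusion == goal)

# To establish the goal, the engine works backward to the symptoms:
print(backward_chain("advise_rest"))   # True
```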
3. Knowledge Acquisition:
● Knowledge acquisition module allows system to acquire knowledge about the problem
domain.
● Sources of Knowledge for ES
− text books, reports, case studies,
− empirical data and
− domain expert experience.
● Updating of knowledge can be done using the knowledge acquisition module of the system:
− insertion,
− deletion and
− updating of existing knowledge
4. Case History:
● Case History stores the file created by inference engine using the dynamic database created
at the time of consultation.
● Useful for learning module to enrich its knowledge base.
● Different cases with solutions are stored in Case Base system.
● These cases are used for solving problem using Case Base Reasoning (CBR).
5. Explanation module:
● Most expert systems have explanation facilities that allow the user to ask the system why it
asked some question, and how it reached a conclusion.
● It contains 'How' and 'Why' modules attached to it.
− The sub-module ‘How’ tells the user about the process through which the system has
reached a particular solution.
− The ‘Why’ sub-module tells the user why that particular solution was offered.
● It explains to the user the reasoning behind any particular problem solution.
● Questions are answered by referring to the system goals, the rules being used, and any
existing problem data.
6. User Interfaces:
Allows user to communicate with system in interactive mode and helps system to create
working knowledge for the problem to be solved.
7. Special interfaces:
● It may be used for specialized activities such as handling uncertainty in knowledge.
● This is a major area of expert systems research that involves methods for reasoning with
uncertain data and uncertain knowledge.
● Knowledge is generally incomplete and uncertain.
● To deal with uncertain knowledge, a rule may have associated with it a confidence factor or
a weight.
The set of methods for using uncertain knowledge in combination with uncertain data in the
reasoning process is called reasoning with uncertainty
Advantages:
Increased productivity (find solutions much faster than humans)
Essay questions:
1. What is Inference Engine? Describe Backward and forward chaining mechanism used by an
inference engine?
2. How is an expert system different from a traditional program?
3. Explain the phases in building expert system
4. Briefly explain the architecture of expert systems.
5. Explain the Applications of the Expert systems.
6. Explain the Issues in black board systems for problem solving
Short answer questions:
1. What are Expert Systems?
2. Briefly explain the knowledge acquisition process.
3. List the characteristic features of an expert system.
4. Mention some of the key applications of ES.
5. What is learning? What are its types?
6. Define generalization.
7. Define Inductive Bias.
8. What is Explanation Based Learning? How is it useful?
MCQ :
4. The “Turing Machine” showed that you could use a/an _____ system to program any algorithmic task.
a) binary
b) electro-chemical
c) recursive
d) semantic
View Answer
Answer: a
Explanation: None.
5. MCC is investigating the improvement of the relationship between people and computers through a
technology called:
a) computer-aided design
b) human factors
c) parallel processing
d) all of the mentioned
View Answer
Answer: b
Explanation: None.
6. The first widely-used commercial form of Artificial Intelligence (Al) is being used in many popular
products like microwave ovens, automobiles and plug in circuit boards for desktop PCs. It allows machines
to handle vague information with a deftness that mimics human intuition. What is the name of this Artificial
Intelligence?
a) Boolean logic
b) Human logic
c) Fuzzy logic
d) Functional logic
View Answer
Answer: c
Explanation: None.
7. In his landmark book Cybernetics, Norbert Wiener suggested a way of modeling scientific phenomena
using not energy, but:
a) mathematics
b) intelligence
c) information
d) history
View Answer
Answer: c
Explanation: None.
8. Input segments of AI programming contain(s)
a) sound
b) smell
c) touch
d) None of the mentioned
View Answer
Answer: d
Explanation: None.
9. The applications in the Strategic Computing Program include:
a) battle management
b) autonomous systems
c) pilot’s associate
d) all of the mentioned
View Answer
Answer: d
Unit – VI - Uncertainty measure, Fuzzy sets and fuzzy logic
Fuzzy sets and fuzzy logic: Introduction, fuzzy sets, fuzzy set operations,
types of membership functions, multi valued logic, fuzzy logic, linguistic
variables and hedges, fuzzy propositions, inference rules for fuzzy
propositions, fuzzy systems.
Unit Objectives:
After reading this Unit, you should be able to understand:
Unit Outcomes:
CO-1 : Build a Bayesian Belief Network for a real world applications
CO-2 : Distinguish between the crisp set and fuzzy set
CO-3 : Define fuzzy sets using linguistic words and represent these sets by membership
functions.
CO-4 : Familiar with fuzzy relations and the properties of these relations
Lecture Plan:
Lecture no. | Topic | Methodology | Quick reference
58 | Introduction, probability theory | chalk-board | T1:7.1-7.2
59 | Probability theory continued | chalk-board | T1:7.3-7.4
60 | Bayesian belief networks | chalk-board | T1:7.3-7.4
61 | Bayesian belief networks | chalk-board | T1:7.5
62 | Certainty factor theory | chalk-board | T1:15.1
63 | Certainty factor theory | chalk-board | T1:15.2
64 | Dempster-Shafer theory | chalk-board | T1:15.3
65 | Introduction to fuzzy sets | chalk-board | T1:15.4
66 | Fuzzy set operations | chalk-board | T1:15.5
67 | Types of membership functions | chalk-board | T1:15.4
68 | Multi-valued logic | chalk-board | T1:15.5
69 | Fuzzy logic | chalk-board | T1:15.6
70 | Fuzzy propositions | chalk-board | T1:15.6
71 | Inference rules for fuzzy propositions | chalk-board | T1:15.6
T1: Artificial Intelligence- Saroj Kaushik, CENGAGE Learning
Figure 10.7: The simulated annealing search algorithm, a version of stochastic hill climbing where
some downhill moves are allowed. Downhill moves are accepted readily early in the annealing
schedule and then less often as time goes on. The schedule input determines the value of T as a
function of time.
Probability theory is also useful to engineers building systems that have to operate intelligently in an
uncertain world.
Conditional Probabilities:
Conditional probabilities are key for reasoning because they formalize the process of accumulating
evidence and updating probabilities based on new evidence.
If P(A|B) = 1, this is equivalent to the sentence in Propositional Logic B => A. Similarly, if P(A|B)
=0.9, then this is like saying B => A with 90% certainty.
Given several measurements and other "evidence", E1, ..., Ek, we will formulate queries as P(Q |
E1, E2, ..., Ek) meaning "what is the degree of belief that Q is true given that we know E1, ..., Ek
and nothing else."
1. Rewriting the definition of conditional probability, we get the Product Rule: P(A,B) = P(A|
B)P(B)
2. Chain Rule: P(A,B,C,D) = P(A|B,C,D)P(B|C,D)P(C|D)P(D), which generalizes the product
rule for a joint probability of an arbitrary number of variables. Note that ordering the
variables results in a different expression, but all have the same resulting value.
3. Conditionalized version of the Chain Rule: P(A,B|C) = P(A|B,C)P(B|C)
4. Bayes's Rule: P(A|B) = (P(A)P(B|A))/P(B), which can be written as follows to more clearly
emphasize the "updating" aspect of the rule: P(A|B) = P(A) * [P(B|A)/P(B)] Note: The terms
P(A) and P(B) are called the prior (or marginal) probabilities. The term P(A|B) is called the
posterior probability because it is derived from or depends on the value of B.
5. Conditionalized version of Bayes's Rule: P(A|B,C) = P(B|A,C)P(A|C)/P(B|C)
6. Conditioning (aka Addition) Rule: P(A) = Sum{P(A|B=b)P(B=b)} where the sum is over all
possible values b in the sample space of B.
7. P(~B|A) = 1 - P(B|A)
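These rules can be checked numerically. In the sketch below the disease/test numbers are invented for illustration; the conditioning rule supplies the evidence term P(B), and Bayes's rule then gives the posterior as prior times the update factor:

```python
# Invented example: P(Disease)=0.01, P(Pos|Disease)=0.9, P(Pos|~Disease)=0.05.
p_d = 0.01
p_pos_given_d = 0.9
p_pos_given_not_d = 0.05

# Conditioning (addition) rule: P(Pos) = sum over both values of Disease.
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)

# Bayes's rule in "updating" form: posterior = prior * [likelihood / evidence].
p_d_given_pos = p_d * (p_pos_given_d / p_pos)

print(round(p_pos, 4), round(p_d_given_pos, 4))
```

Even a highly accurate test leaves the posterior far below the likelihood here, because the prior P(Disease) is so small: this is exactly the "updating" reading of Bayes's rule above.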
Bayesian Networks, also known as Bayes Nets, Belief Nets, Causal Nets, and Probability
Nets, are a space-efficient data structure for encoding all of the information in the full joint
probability distribution for the set of random variables defining a domain. That is, from the
Bayesian Net one can compute any value in the full joint probability distribution of the set of
random variables.
Represents all of the direct causal relationships between variables
Intuitively, to construct a Bayesian net for a given set of variables, draw arcs from cause
variables to immediate effects.
Space efficient because it exploits the fact that in many real-world problem domains the
dependencies between variables are generally local, so there are a lot of conditionally
independent variables
Captures both qualitative and quantitative relationships between variables
Can be used to reason
Forward (top-down) from causes to effects -- predictive reasoning (aka causal reasoning)
Backward (bottom-up) from effects to causes -- diagnostic reasoning
Formally, a Bayesian Net is a directed, acyclic graph (DAG), where there is a node for each
random variable, and a directed arc from A to B whenever A is a direct causal influence on
B. Thus the arcs represent direct causal relationships and the nodes represent states of
affairs. The occurrence of A provides support for B, and vice versa. The backward influence
is call "diagnostic" or "evidential" support for A due to the occurrence of B.
Each node A in a net is conditionally independent of any subset of nodes that are not
descendants of A given the parents of A.
Intuitively, "to construct a Bayesian Net for a given set of variables, we draw arcs from cause
variables to immediate effects. In almost all cases, doing so results in a Bayesian network [whose
conditional independence implications are accurate]." (Heckerman, 1996)
1. Identify a set of random variables that describe the given problem domain
2. Choose an ordering for them: X1, ..., Xn
3. for i=1 to n do
4. Add a new node for Xi to the net
5. Set Parents(Xi) to be the minimal set of already added nodes such that we have
conditional independence of Xi and all other members of {X1, ..., Xi-1} given
Parents(Xi)
6. Add a directed arc from each node in Parents(Xi) to Xi
7. If Xi has at least one parent, then define a conditional probability table at Xi: P(Xi=x |
possible assignments to Parents(Xi)). Otherwise, define a prior probability at Xi: P(Xi)
There is not, in general, a unique Bayesian Net for a given set of random variables. But all
represent the same information in that from any net constructed every entry in the joint
probability distribution can be computed.
The "best" net is constructed if in Step 2 the variables are topologically sorted first. That is,
each variable comes before all of its children. So, the first nodes should be the roots, then the
nodes they directly influence, and so on.
The algorithm will not construct a net that is illegal in the sense of violating the rules of
probability.
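The defining property of the net, that any entry of the full joint distribution equals the product over all nodes of each node's probability given its parents, can be sketched on a minimal two-node net (the Rain/WetGrass variables and their numbers are invented for illustration):

```python
# Tiny invented net: Rain -> WetGrass.
p_rain = {True: 0.2, False: 0.8}            # prior at the root node
p_wet_given_rain = {True: 0.9, False: 0.1}  # CPT: P(WetGrass=True | Rain)

def joint(rain, wet):
    """P(Rain=rain, WetGrass=wet) = P(Rain) * P(WetGrass | Rain)."""
    p_wet_true = p_wet_given_rain[rain]
    return p_rain[rain] * (p_wet_true if wet else 1 - p_wet_true)

# All four entries of the full joint distribution are recoverable from the
# net's two small tables, and they sum to 1:
total = sum(joint(r, w) for r in (True, False) for w in (True, False))
print(joint(True, True), total)
```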
Certainty Factor Theory :
A certainty factor (CF) is a numerical value that expresses a degree of subjective belief that a
particular item is true. The item may be a fact or a rule.
The MYCIN developers realized that a Bayesian approach was intractable, as too much data
and/or suppositions/estimates are required.
In addition, medical diagnosis systems based on Bayesian methods were not accepted because
the systems did not provide simple explanations of how they reached their conclusions.
Certainty Factors are similar to conditional probabilities, but somewhat different.
We can associate CFs with facts: – E.g., padre(John, Mary) with CF .90
We can also associate CFs with rules: – (if (sneezes X) then (has_cold X) ) with CF 0.7
– where the CF measures our belief in the conclusion given the premise is observed.
1. MB(H, E) – Measure of Belief: value between 0 and 1 representing the degree to which
belief in the hypothesis H is supported by observing evidence E.
2. MD(H, E) – Measure of Disbelief: value between 0 and 1 representing the degree to which
disbelief in the hypothesis H is supported by observing
evidence E.
CF is calculated in terms of the difference between MB and MD:
CF(H, E) = MB(H, E) − MD(H, E)
Example: let P(H) = 0.6 and P(H|E) = 0.21/0.54 ≈ 0.389. The evidence lowers belief in H, so
• MB(H, E) = 0
• MD(H, E) = (0.6 − 0.389)/0.6 = 0.3519
and therefore CF(H, E) = 0 − 0.3519 = −0.3519.
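The MB/MD definitions translate directly into code. The sketch below uses the standard MYCIN-style formulas (stated here as an assumption, since the text does not spell them out) and reproduces the numbers of the worked example above:

```python
def mb(p_h, p_h_given_e):
    """Measure of belief: how much the evidence E raises belief in H."""
    if p_h_given_e <= p_h:
        return 0.0
    return (p_h_given_e - p_h) / (1 - p_h)

def md(p_h, p_h_given_e):
    """Measure of disbelief: how much the evidence E lowers belief in H."""
    if p_h_given_e >= p_h:
        return 0.0
    return (p_h - p_h_given_e) / p_h

def cf(p_h, p_h_given_e):
    """Certainty factor: the difference between belief and disbelief."""
    return mb(p_h, p_h_given_e) - md(p_h, p_h_given_e)

p_h, p_h_given_e = 0.6, 0.21 / 0.54   # evidence lowers belief in H
print(round(md(p_h, p_h_given_e), 4))  # 0.3519
print(round(cf(p_h, p_h_given_e), 4))  # -0.3519
```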
“The mass m(A) of a given member of the power set, A, expresses the proportion of
all relevant and available evidence that supports the claim that the actual state
belongs to A but to no particular subset of A.” (wikipedia)
“The value of m(A) pertains only to the set A and makes no additional claims about
any subsets of A, each of which has, by definition, its own mass.”
4 people (B, J, S and K) are locked in a room when the lights go out.
When the lights come on, K is dead, stabbed with a knife.
Not suicide (stabbed in the back)
No-one entered the room.
Assume only one killer.
Θ = { B, J, S}
P(Θ) = (Ø, {B}, {J}, {S}, {B,J}, {B,S}, {J,S}, {B,J,S} )
Detectives, after reviewing the crime-scene, assign mass probabilities to various elements of
the power set:
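The mass table itself is not reproduced here, so the values below are invented for illustration. Given any mass assignment over the power set of Θ, the belief Bel(A) sums the mass committed to subsets of A, and the plausibility Pl(A) sums the mass of every set consistent with A:

```python
# Frame of discernment Θ = {B, J, S}; masses over subsets (invented values, sum to 1).
masses = {
    frozenset({"B"}): 0.1,
    frozenset({"J"}): 0.2,
    frozenset({"S"}): 0.1,
    frozenset({"B", "J"}): 0.1,
    frozenset({"J", "S"}): 0.3,
    frozenset({"B", "J", "S"}): 0.2,   # mass left on Θ itself = ignorance
}

def belief(a):
    """Bel(A): total mass committed to subsets of A."""
    return sum(m for s, m in masses.items() if s <= a)

def plausibility(a):
    """Pl(A): total mass on sets that intersect A (not committed against A)."""
    return sum(m for s, m in masses.items() if s & a)

j = frozenset({"J"})
print(belief(j), plausibility(j))   # the interval [Bel, Pl] brackets P(J did it)
```

The gap between Bel and Pl is what distinguishes Dempster-Shafer from a single point probability: it represents evidence that has not yet been committed for or against J.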
Invoking his “principle of incompatibility”, Zadeh states: "As the complexity of a system
increases, our ability to make precise and yet significant statements about its behaviour
diminishes until a threshold is reached beyond which precision and significance (or
relevance) become almost mutually exclusive characteristics.”
Fuzzy sets are functions that map each member in a set to a real number in [0, 1] to indicate
the degree of membership of that member.
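This definition translates directly into code: a fuzzy set is just its membership function, and the standard fuzzy set operations are pointwise min, max and complement. The 'tall' membership function below is an invented example:

```python
def tall(height_cm):
    """Invented membership function for 'tall': a linear ramp from 160 to 190 cm."""
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30

# Standard fuzzy set operations, applied pointwise to membership degrees:
def f_not(mu):        return lambda x: 1 - mu(x)          # complement
def f_and(mu1, mu2):  return lambda x: min(mu1(x), mu2(x))  # intersection
def f_or(mu1, mu2):   return lambda x: max(mu1(x), mu2(x))  # union

short = f_not(tall)
print(tall(175), short(175))   # 0.5 0.5: partial membership in both sets
```

Unlike a crisp set, a 175 cm person belongs to 'tall' and to its complement simultaneously, each to degree 0.5, which is exactly the graded membership the definition above describes.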
Essay Questions:
MCQ:
2. a) Consider a game tree given in the fig. in which static scores are shown along leaf nodes from the
first player’s (MAX) point of view.
i) What move should the first player choose? 1M
ii) What nodes would need to be examined using the α-β pruning algorithm, assuming that nodes are
examined in left to right order? 4M
3. a) Determine whether the following formula is consistent or inconsistent using tableau method
( A B ) ( C B ) (C B ) 5M
IV-B. Tech MID – I EXAMINATION SET-2
CSE 1&2 Subject: Artificial Intelligence
Name of the faculty: Mr.Ch.Viswanatha Sarma Date of Exam: 19.12.2013
2. Find Optimal path and optimal cost using A* algorithm to the following example 5M
A (B C ), B A C 5 M ( or)
KB : P R R Q Q T P S T S