AI Notes_Module4
MODULE 4
Game Playing
Charles Babbage, the nineteenth-century computer architect, thought about programming
his Analytical Engine to play chess and later about building a machine to play tic-tac-toe.
There are two reasons that games appeared to be a good domain.
1. They provide a structured task in which it is very easy to measure success or
failure.
2. They are easily solvable by straightforward search from the starting state to
a winning position.
The first is true for all games, but the second is true only for the simplest games.
For example, consider chess.
The average branching factor is around 35, and in an average game each player might
make 50 moves.
So in order to examine the complete game tree, we would have to examine 35^100 positions.
Thus it is clear that a simple search cannot select even its first move during
the lifetime of its opponent.
It is clear that to improve the effectiveness of a search-based problem-solving
program, two things can be done:
1. Improve the generate procedure so that only good moves are generated.
2. Improve the test procedure so that the best move will be recognized and explored first.
If we use a legal-move generator, then the test procedure will have to look at each
move it produces; because the test procedure must look at so many possibilities, it must be fast.
Instead of the legal-move generator, we can use a plausible-move generator, which
generates only a small number of promising moves.
As the number of legal moves available increases, it becomes increasingly
important to apply heuristics to select only those moves that seem most promising.
The performance of the overall system can be improved by adding heuristic knowledge
into both the generator and the tester.
In game playing, a goal state is one in which we win. But for a game like chess,
searching all the way to a goal state is not possible, even with a good plausible-move
generator: the depth of the resulting tree (or graph) and its branching factor are too great.
It is possible to search the tree only ten or twenty moves deep; then, in order to choose the
best move, the resulting board positions must be compared to discover which is most
advantageous.
This is done using a static evaluation function, which uses whatever information it has to
evaluate individual board positions by estimating how likely they are to lead eventually
to a win.
Its function is similar to that of the heuristic function h' in the A* algorithm: in the
absence of complete information, choose the most promising position.
Now we can apply the static evaluation function to those positions and simply choose
the best one.
After doing so, we can back that value up to the starting position to represent
our evaluation of it.
Here we assume that the static evaluation function returns larger values to indicate
good situations for us.
So our goal is to maximize the value of the static evaluation function of the next
board position.
The opponent's goal is to minimize the value of the static evaluation function.
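As a concrete illustration, here is a minimal static evaluation function in Python based on material count, a standard chess heuristic (the board representation and piece values below are assumptions made for this sketch, not part of the notes):

# A minimal static evaluation sketch based on material count.
# Assumption: the board is a dict mapping squares to piece codes such as
# "wP" (white pawn) or "bQ" (black queen); larger scores are better for White.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}

def static_eval(board):
    score = 0
    for piece in board.values():
        colour, kind = piece[0], piece[1]
        value = PIECE_VALUES[kind]
        score += value if colour == "w" else -value
    return score

# Example: White has an extra knight.
print(static_eval({"e1": "wK", "e8": "bK", "g1": "wN"}))   # 3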
This alternation of maximizing and minimizing at alternate plies, as evaluations are
pushed back up, corresponds to the opposing strategies of the two players and is
called MINIMAX.
It is a recursive procedure that depends on two auxiliary procedures:
MOVEGEN(Position, Player): the plausible-move generator, which returns a
list of nodes representing the moves that can be made by Player in Position.
STATIC(Position, Player): the static evaluation function, which returns a
number representing the goodness of Position from the standpoint of Player.
As with any recursive program, we need to decide when the recursion should stop.
There are a variety of factors that may influence the decision:
Has one side won?
How many plies have we already explored? Or how much time is left?
How stable is the configuration?
We use DEEP-ENOUGH, which is assumed to evaluate all of these factors and to
return TRUE if the search should be stopped at the current level and FALSE
otherwise.
It takes two parameters, position and depth. In the simplest case it will ignore its
position parameter and simply return TRUE if its depth parameter exceeds a constant cutoff value.
One problem that arises in defining MINIMAX as a recursive procedure is that it needs
to return not one but two results:
The backed-up value of the path it chooses.
The path itself. We return the entire path even though probably only the first
element, representing the best move from the current position, is actually
needed.
We assume that MINIMAX returns a structure containing both results and that we have two
functions, VALUE and PATH, that extract the separate components.
Initially, MINIMAX takes three parameters: a board position, the current depth of the search,
and the player to move. So the initial call is
MINIMAX(current, 0, player-one) if player-one is to move, or
MINIMAX(current, 0, player-two) if player-two is to move.
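A minimal Python sketch of the procedure described above, written over a toy game tree so that it can actually run. The helper names movegen, static, opponent and deep_enough mirror MOVEGEN, STATIC and DEEP-ENOUGH from the notes; the tree and its leaf values are invented for the example:

# Toy game tree: leaves carry static values from player-one's standpoint.
TREE = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
LEAF_VALUES = {"D": 3, "E": 5, "F": 2, "G": 9}

def movegen(position, player):        # plausible-move generator
    return TREE.get(position, [])

def static(position, player):         # static evaluation from player's standpoint
    value = LEAF_VALUES.get(position, 0)
    return value if player == "player-one" else -value

def opponent(player):
    return "player-two" if player == "player-one" else "player-one"

def deep_enough(position, depth):     # stop at a constant depth or at a leaf
    return depth >= 2 or position in LEAF_VALUES

# MINIMAX in negamax form: evaluations are negated each time control passes
# between levels, so one piece of code serves both players.
def minimax(position, depth, player):
    if deep_enough(position, depth):
        return static(position, player), []       # (VALUE, PATH)
    successors = movegen(position, player)
    if not successors:                            # no moves: treat as a leaf
        return static(position, player), []
    best_value, best_path = None, []
    for succ in successors:
        value, path = minimax(succ, depth + 1, opponent(player))
        value = -value                            # negate across levels
        if best_value is None or value > best_value:
            best_value, best_path = value, [succ] + path
    return best_value, best_path

print(minimax("A", 0, "player-one"))   # (3, ['B', 'D'])

The first element of the returned path is the move actually played.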
Alpha-beta pruning requires the maintenance of two threshold values: one representing a
lower bound on the value that a maximizing node may ultimately be assigned (we call this alpha),
and another representing an upper bound on the value that a minimizing node may be
assigned (this we call beta).
Each level must receive both values: one to use itself and one to pass down to the next
level to use.
The MINIMAX procedure as it stands does not need to treat maximizing and minimizing
levels differently, since it simply negates the evaluations each time it changes levels.
Instead of referring to alpha and beta, MINIMAX-A-B uses two values, USE-THRESH and
PASS-THRESH.
USE-THRESH is used to compute cutoffs; PASS-THRESH is passed to the next level as its
USE-THRESH.
USE-THRESH must also be passed to the next level, but it will be passed as PASS-THRESH so that
it can be passed to the third level down as USE-THRESH again, and so forth.
Just as values had to be negated each time they were passed across levels, so must the thresholds.
Still, there is no difference between the code required at maximizing levels and
that required at minimizing levels.
PASS-THRESH should always be the maximum of the value it inherits from above and
the best move found at its level.
If PASS-THRESH is updated, the new value should be propagated both down to lower levels
and back up to higher ones, so that it always reflects the best move found anywhere in
the tree.
The procedure MINIMAX-A-B requires five arguments: position, depth, player, use-thresh, and
pass-thresh. The initial call is
MINIMAX-A-B(current, 0, player-one, maximum value STATIC can compute, minimum
value STATIC can compute).
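Extending the MINIMAX sketch above with the two thresholds gives something like the following (reusing the toy game helpers defined there; note how the thresholds are negated and swapped as they pass between levels):

# MINIMAX-A-B sketch: use_thresh is used to test for cutoffs, while
# pass_thresh records the best value found so far and is handed down
# (negated) as the next level's use_thresh.
def minimax_a_b(position, depth, player, use_thresh, pass_thresh):
    if deep_enough(position, depth):
        return static(position, player), []
    successors = movegen(position, player)
    if not successors:
        return static(position, player), []
    best_path = []
    for succ in successors:
        value, path = minimax_a_b(succ, depth + 1, opponent(player),
                                  -pass_thresh, -use_thresh)
        value = -value
        if value > pass_thresh:           # a better move at this level
            pass_thresh = value
            best_path = [succ] + path
        if pass_thresh >= use_thresh:     # cutoff: this line is already refuted
            break
    return pass_thresh, best_path

# Initial thresholds: maximum and minimum values STATIC can compute (here 9 and -9).
print(minimax_a_b("A", 0, "player-one", 9, -9))   # (3, ['B', 'D'])

On the toy tree this returns the same answer as plain MINIMAX, but node G is never examined: it is cut off.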
Iterative Deepening Search (IDS) or Iterative Deepening Depth-First Search (IDDFS)
There are two common ways to traverse a graph: BFS and DFS. For a tree (or graph)
of great height and width, both BFS and DFS are not very efficient, for the following reasons.
1. DFS first traverses nodes going through one adjacent of the root, then the next adjacent. The
problem with this approach is that if there is a node close to the root, but not in the first few
subtrees explored by DFS, then DFS reaches that node very late. Also, DFS may not find the
shortest path to a node (in terms of number of edges).
2. BFS goes level by level, but requires more space. The space required by DFS is O(d),
where d is the depth of the tree, but the space required by BFS is O(n), where n is the number
of nodes in the tree. (Why? Note that the last level of the tree can have around n/2 nodes and
the second-to-last level n/4 nodes, and in BFS we need to hold each entire level in the queue.)
IDDFS combines depth-first search's space efficiency with breadth-first search's fast search (for
nodes closer to the root).
How does IDDFS work?
IDDFS calls DFS for different depth limits, starting from an initial value. In every call, DFS is
restricted from going beyond the given depth. So basically we do DFS in a BFS fashion.
Algorithm:
The recursive step of the depth-limited search DLS(src, target, limit) is:

for each adjacent i of src
    if DLS(i, target, limit - 1)
        return true
return false
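A complete, runnable version of the same idea in Python (the adjacency-list representation of the graph is an assumption made for this sketch):

# Iterative deepening DFS. graph is an adjacency list: {node: [neighbours]}.
def dls(graph, src, target, limit):
    # Depth-limited search: is target reachable from src within `limit` edges?
    if src == target:
        return True
    if limit <= 0:
        return False
    for neighbour in graph.get(src, []):
        if dls(graph, neighbour, target, limit - 1):
            return True
    return False

def iddfs(graph, src, target, max_depth):
    # Call DLS with increasing depth limits: DFS done in a BFS fashion.
    for limit in range(max_depth + 1):
        if dls(graph, src, target, limit):
            return True
    return False

# Example: a small tree rooted at "A".
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
print(iddfs(tree, "A", "F", max_depth=3))   # True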
An important thing to note is that we visit top-level nodes multiple times. The last (or max-depth)
level is visited once, the second-to-last level twice, and so on. This may seem expensive, but it
turns out not to be very costly, since in a tree most of the nodes are in the bottom level. So it does
not matter much if the upper levels are visited multiple times.
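This can be checked with a little arithmetic: in a tree with branching factor b and depth d, a node at depth i is generated (d - i + 1) times across all the iterations, so the total work is only a small constant factor more than a single full-depth DFS. For example:

# Revisit overhead of IDDFS for branching factor b = 10, depth d = 5.
b, d = 10, 5
plain_dfs = sum(b**i for i in range(d + 1))                 # 111111 nodes
iddfs_all = sum((d - i + 1) * b**i for i in range(d + 1))   # 123456 generations
print(iddfs_all / plain_dfs)                                # ~1.11: about 11% extra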
Natural Language Processing
Introduction to Natural Language Processing
Language is meant for communicating with the world. By studying language, we can come
to understand more about the world.
If we can succeed at building a computational model of language, we will have a
powerful tool for communicating with the world.
We also look at how we can exploit knowledge about the world, in combination with
linguistic facts, to build computational natural language systems.
The Natural Language Processing (NLP) problem can be divided into two tasks:
1. Processing written text, using lexical, syntactic and semantic knowledge of the
language as well as the required real-world information.
2. Processing spoken language, using all the information needed above plus additional
knowledge about phonology as well as enough added information to handle the further
ambiguities that arise in speech.
Steps in Natural Language Processing
Morphological Analysis
Individual words are analyzed into their components, and non-word tokens such as
punctuation are separated from the words.
Syntactic Analysis
Linear sequences of words are transformed into structures that show how the words relate
to each other.
Some word sequences may be rejected if they violate the language's rules for how
words may combine.
Semantic Analysis
The structures created by the syntactic analyzer are assigned meanings.
A mapping is made between the syntactic structures and objects in the task domain.
Structures for which no such mapping is possible may be rejected.
Discourse Integration
The meaning of an individual sentence may depend on the sentences that precede it and
may influence the meanings of the sentences that follow it.
Pragmatic Analysis
The structure representing what was said is reinterpreted to determine what
was actually meant.
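These steps can be pictured as a pipeline in which each stage consumes the previous stage's output. A schematic Python sketch with trivial stand-in stages (every function body here is a placeholder; the real components are discussed in the sections that follow):

# Schematic five-stage pipeline; each stage below is a trivial stand-in.
def morphological(sentence):   return sentence.rstrip(".").split()
def syntactic(tokens):         return ("S", tokens)              # placeholder parse
def semantic(tree):            return {"parse": tree}            # placeholder meaning
def discourse(meaning, ctx):   return {**meaning, "context": ctx}
def pragmatic(meaning):        return meaning                    # interpret intent

def understand(sentence, context):
    return pragmatic(discourse(semantic(syntactic(morphological(sentence))), context))

print(understand("Bill printed the file.", {"user": "USER068"}))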
Summary
The results of each of the main processes combine to form a natural language system.
All of the processes are important in a complete natural language understanding system.
Not all programs are written with exactly these components; sometimes two or more
of them are collapsed.
Doing that usually results in a system that is easier to build for restricted subsets
of English, but one that is harder to extend to wider coverage.
Morphological Analysis
Consider the sentence "I want to print Bill's .init file." Morphological analysis must:
Pull apart the word "Bill's" into the proper noun "Bill" and the possessive suffix "'s".
Recognize the sequence ".init" as a file extension that is functioning as an adjective in
the sentence.
This process will usually also assign syntactic categories to all the words in the sentence.
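A toy Python illustration of these two steps (the regular expressions are assumptions chosen only for this one example, not a general morphological analyzer):

import re

# Toy morphological analysis for the example sentence.
def morph_tokens(sentence):
    tokens = []
    for word in sentence.rstrip(".").split():
        m = re.fullmatch(r"(\w+)('s)", word)     # "Bill's" -> "Bill" + "'s"
        if m:
            tokens.extend([m.group(1), m.group(2)])
        elif re.fullmatch(r"\.\w+", word):       # ".init" is a file extension
            tokens.append((word, "FILE-EXTENSION"))
        else:
            tokens.append(word)
    return tokens

print(morph_tokens("I want to print Bill's .init file."))
# ['I', 'want', 'to', 'print', 'Bill', "'s", ('.init', 'FILE-EXTENSION'), 'file']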
Syntactic Analysis
A syntactic analysis must exploit the results of the morphological analysis to build
a structural description of the sentence.
The goal of this process, called parsing, is to convert the flat list of words that forms
the sentence into a structure that defines the units represented by that flat list.
The important thing here is that a flat sentence has been converted into a hierarchical
structure, and that the structure corresponds to meaning units when a semantic
analysis is performed.
Reference markers (sets of entities) are shown in parentheses in the parse tree.
Each one corresponds to some entity that has been mentioned in the sentence.
These reference markers are useful later, since they provide a place in which
to accumulate information about the entities as we get it.
Semantic Analysis
The semantic analysis must do two important things:
1. It must map individual words into appropriate objects in the knowledge base or
database.
2. It must create the correct structures to correspond to the way the meanings of
the individual words combine with each other.
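A minimal sketch of both tasks for the sentence "Bill printed the file": map each word to a knowledge-base object, then combine the pieces into an event structure (all knowledge-base identifiers here are invented placeholders):

# Toy semantic analysis: a lexicon of word-to-KB mappings plus a rule
# that combines agent, verb and object into an event frame.
LEXICON = {
    "Bill":    ("person", "BILL-CANDIDATES"),  # resolved to a user in discourse
    "printed": ("event",  "PRINT-EVENT"),
    "file":    ("class",  "FILE"),
}

def semantic_analysis(agent, verb, obj):
    return {
        "event":  LEXICON[verb][1],
        "agent":  LEXICON[agent][1],
        "object": LEXICON[obj][1],
        "tense":  "past",                      # from the -ed suffix on the verb
    }

print(semantic_analysis("Bill", "printed", "file"))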
Discourse Integration
Specifically, we do not know to whom the pronoun "I" or the proper noun "Bill" refers.
To pin down these references requires an appeal to a model of the current discourse
context, from which we can learn that the current user is USER068 and that the only
person named "Bill" about whom we could be talking is USER073.
Once the correct referent for Bill is known, we can also determine exactly which
file is being referred to.
Pragmatic Analysis
The final step toward effective understanding is to decide what to do as a result.
One possible thing to do is to record what was said as a fact and be done with it.
For some sentences, whose intended effect is clearly declarative, that is
precisely the correct thing to do.
But for other sentences, including this one, the intended effect is different.
We can discover this intended effect by applying a set of rules that characterize
cooperative dialogues.
The final step in pragmatic processing is to translate from the knowledge-based
representation to a command to be executed by the system.
Syntactic Processing
Syntactic Processing is the step in which a flat input sentence is converted into a
hierarchical structure that corresponds to the units of meaning in the sentence.
This process is called parsing.
It plays an important role in natural language understanding systems for two reasons:
1. Semantic processing must operate on sentence constituents. If there is no syntactic
parsing step, then the semantics system must decide on its own constituents. If
parsing is done, on the other hand, it constrains the number of constituents that
semantics can consider.
2. Syntactic parsing is computationally less expensive than is semantic
processing. Thus it can play a significant role in reducing overall system
complexity.
Although it is often possible to extract the meaning of a sentence without using
grammatical facts, it is not always possible to do so.
Almost all the systems that are actually used have two main components:
1. A declarative representation, called a grammar, of the syntactic facts about
the language.
2. A procedure, called a parser, that compares the grammar against input sentences to
produce parsed structures.
Grammars and Parsers
The most common way to represent grammars is as a set of production rules.
The first rule (S → NP VP) can be read as "A sentence is composed of a noun phrase followed by
a verb phrase"; the vertical bar means OR; ε represents the empty string.
Symbols that are further expanded by rules are called non-terminal symbols.
Symbols that correspond directly to strings that must be found in an input sentence
are called terminal symbols.
Grammar formalisms such as this one underlie many linguistic theories, which in turn
provide the basis for many natural language understanding systems.
Pure context-free grammars are not effective for describing natural languages, so NLP
systems have less in common with computer-language processing systems such
as compilers than one might expect.
The parsing process takes the rules of the grammar and compares them against the
input sentence.
The simplest structure to build is a parse tree, which simply records the rules and
how they are matched.
Every node of the parse tree corresponds either to an input word or to a non-terminal in
our grammar.
Each level in the parse tree corresponds to the application of one grammar rule.
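To make this concrete, here is a tiny recursive-descent parser in Python for an assumed toy fragment of such a grammar (S → NP VP, NP → PN | "the" N, VP → V NP); the rules and lexicon are illustrative, not the full grammar of the figure:

# Tiny recursive-descent parser for a toy grammar:
#   S -> NP VP      NP -> PN | "the" N      VP -> V NP
LEXICON = {"Bill": "PN", "printed": "V", "file": "N"}

def parse_np(tokens, i):
    if i < len(tokens) and LEXICON.get(tokens[i]) == "PN":
        return ("NP", ("PN", tokens[i])), i + 1
    if i + 1 < len(tokens) and tokens[i] == "the" and LEXICON.get(tokens[i + 1]) == "N":
        return ("NP", "the", ("N", tokens[i + 1])), i + 2
    return None, i

def parse_vp(tokens, i):
    if i < len(tokens) and LEXICON.get(tokens[i]) == "V":
        np, j = parse_np(tokens, i + 1)
        if np:
            return ("VP", ("V", tokens[i]), np), j
    return None, i

def parse_s(tokens):
    np, i = parse_np(tokens, 0)
    if np:
        vp, j = parse_vp(tokens, i)
        if vp and j == len(tokens):
            return ("S", np, vp)
    return None           # the sentence violates the grammar's rules

print(parse_s(["Bill", "printed", "the", "file"]))
# ('S', ('NP', ('PN', 'Bill')), ('VP', ('V', 'printed'), ('NP', 'the', ('N', 'file'))))

Note how every node of the resulting nested tuple corresponds either to an input word or to a non-terminal, exactly as described above.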
Example for Syntactic Processing – Augmented Transition
Network
Example: a parse tree for the sentence "Bill printed the file".
Statistical Natural Language Processing
Formerly, many language-processing tasks typically involved the direct hand coding of rules,
which is not in general robust to natural-language variation. The machine-learning
paradigm calls instead for using statistical inference to automatically learn such rules through the
analysis of large corpora of typical real-world examples (a corpus (plural, "corpora") is a set of
documents, possibly with human or computer annotations).
Many different classes of machine learning algorithms have been applied to natural-language
processing tasks. These algorithms take as input a large set of "features" that are generated from
the input data. Some of the earliest-used algorithms, such as decision trees, produced systems of
hard if-then rules similar to the systems of hand-written rules that were then common.
Increasingly, however, research has focused on statistical models, which make
soft, probabilistic decisions based on attaching real-valued weights to each input feature. Such
models have the advantage that they can express the relative certainty of many different possible
answers rather than only one, producing more reliable results when such a model is included as a
component of a larger system.
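As a minimal illustration of such a soft, weighted decision, here is a toy Python classifier that attaches a real-valued weight to each input feature and outputs a probability rather than a hard yes/no (the features and weights are invented for the example):

import math

# Toy statistical decision: real-valued weights per feature, soft output.
WEIGHTS = {"contains_please": 1.2, "ends_with_question_mark": 2.0, "bias": -1.5}

def p_is_request(features):
    # Probability that an utterance is a request (a logistic model).
    score = WEIGHTS["bias"]
    for name, value in features.items():
        score += WEIGHTS.get(name, 0.0) * value
    return 1.0 / (1.0 + math.exp(-score))    # sigmoid maps score to (0, 1)

print(p_is_request({"contains_please": 1, "ends_with_question_mark": 1}))  # ~0.85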
Systems based on machine-learning algorithms have many advantages over hand-produced rules:
1. The learning procedures used during machine learning automatically focus on the most
common cases, whereas when writing rules by hand it is often not at all obvious where
the effort should be directed.
2. Automatic learning procedures can make use of statistical inference algorithms to
produce models that are robust to unfamiliar input (e.g. containing words or structures
that have not been seen before) and to erroneous input (e.g. with misspelled words or
words accidentally omitted). Generally, handling such input gracefully with hand-written
rules (or, more generally, creating systems of hand-written rules that make soft
decisions) is extremely difficult, error-prone and time-consuming.
3. Systems based on automatically learning the rules can be made more accurate simply by
supplying more input data. However, systems based on hand-written rules can only be
made more accurate by increasing the complexity of the rules, which is a much more
difficult task. In particular, there is a limit to the complexity of systems based on
hand-crafted rules, beyond which the systems become more and more unmanageable.
However, creating more data to input to machine-learning systems simply requires a
corresponding increase in the number of man-hours worked, generally without
significant increases in the complexity of the annotation process.