Algorithms List in AI
2 INTELLIGENT AGENTS
Figure 2.3 The TABLE-DRIVEN-AGENT program is invoked for each new percept and returns an action each time. It retains the complete percept sequence in memory.
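As a concrete illustration, a minimal Python sketch of the same idea is shown below; it is not the book's code, and the dictionary-based table and tuple-encoded percepts are assumptions made for the example.

def make_table_driven_agent(table):
    # Returns an agent function that appends each percept to the stored
    # sequence and looks the whole sequence up in the table.
    percepts = []
    def agent(percept):
        percepts.append(percept)
        return table.get(tuple(percepts))   # None if the sequence is not in the table
    return agent

# Hypothetical table for a two-cell vacuum world, indexed by percept sequences
table = {(('A', 'Dirty'),): 'Suck', (('A', 'Clean'),): 'Right'}
agent = make_table_driven_agent(table)
print(agent(('A', 'Dirty')))   # -> 'Suck'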
Figure 2.4 The agent program for a simple reflex agent in the two-state vacuum environment. This
program implements the agent function tabulated in Figure ??.
Figure 2.6 A simple reflex agent. It acts according to a rule whose condition matches the current
state, as defined by the percept.
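For the two-state vacuum world of Figure 2.4, the condition-action rules collapse to a few lines. The following Python sketch is an illustrative stand-in, not the book's program; the (location, status) percept encoding is an assumption.

def reflex_vacuum_agent(percept):
    # Simple reflex agent: the choice depends only on the current percept.
    location, status = percept
    if status == 'Dirty':
        return 'Suck'
    elif location == 'A':
        return 'Right'
    else:
        return 'Left'

print(reflex_vacuum_agent(('A', 'Dirty')))   # -> 'Suck'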
Figure 2.8 A model-based reflex agent. It keeps track of the current state of the world, using an
internal model. It then chooses an action in the same way as the reflex agent.
3 SOLVING PROBLEMS BY SEARCHING
Figure 3.1 A simple problem-solving agent. It first formulates a goal and a problem, searches for a
sequence of actions that would solve the problem, and then executes the actions one at a time. When
this is complete, it formulates another goal and starts over.
Figure 3.7 An informal description of the general tree-search and graph-search algorithms. The parts of GRAPH-SEARCH marked in bold italic are the additions needed to handle repeated states.
Figure 3.13 Uniform-cost search on a graph. The algorithm is identical to the general graph search
algorithm in Figure ??, except for the use of a priority queue and the addition of an extra check in case
a shorter path to a frontier state is discovered. The data structure for frontier needs to support efficient
membership testing, so it should combine the capabilities of a priority queue and a hash table.
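A possible Python sketch of uniform-cost search follows; it is illustrative rather than the figure's exact pseudocode. The successors(state) callback, which yields (action, next_state, step_cost) triples, is an assumption of this sketch. A dictionary of best known costs plays the role of the hash table, and stale priority-queue entries are skipped lazily rather than being reprioritized in place.

import heapq
from itertools import count

def uniform_cost_search(start, goal_test, successors):
    tie = count()                                  # tie-breaker so the heap never compares states
    frontier = [(0, next(tie), start, [])]         # priority queue ordered by path cost
    best_cost = {start: 0}                         # hash table for membership and cost lookup
    explored = set()
    while frontier:
        cost, _, state, path = heapq.heappop(frontier)
        if state in explored:
            continue                               # stale entry: a cheaper path was already expanded
        if goal_test(state):
            return path, cost
        explored.add(state)
        for action, nxt, step in successors(state):
            new_cost = cost + step
            if nxt not in best_cost or new_cost < best_cost[nxt]:
                best_cost[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, next(tie), nxt, path + [action]))
    return None, float('inf')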
Figure 3.17 The iterative deepening search algorithm, which repeatedly applies depth-limited search
with increasing limits. It terminates when a solution is found or if the depth-limited search returns
failure, meaning that no solution exists.
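One way to realize this in Python is sketched below (an illustration, not the figure's code); goal_test, successors, and the max_depth cap that replaces the unbounded loop are assumptions of the sketch. depth_limited_search distinguishes 'cutoff' (the limit was hit) from None (no solution below the limit).

def depth_limited_search(state, goal_test, successors, limit, path=()):
    # Returns a list of actions, the string 'cutoff', or None for failure.
    if goal_test(state):
        return list(path)
    if limit == 0:
        return 'cutoff'
    cutoff_occurred = False
    for action, nxt in successors(state):
        result = depth_limited_search(nxt, goal_test, successors, limit - 1, path + (action,))
        if result == 'cutoff':
            cutoff_occurred = True
        elif result is not None:
            return result
    return 'cutoff' if cutoff_occurred else None

def iterative_deepening_search(start, goal_test, successors, max_depth=50):
    for depth in range(max_depth + 1):
        result = depth_limited_search(start, goal_test, successors, depth)
        if result != 'cutoff':
            return result      # a solution, or None meaning no solution exists
    return None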
function RBFS(problem, node, f_limit) returns a solution, or failure and a new f-cost limit
  if problem.GOAL-TEST(node.STATE) then return SOLUTION(node)
  successors ← [ ]
  for each action in problem.ACTIONS(node.STATE) do
    add CHILD-NODE(problem, node, action) into successors
  if successors is empty then return failure, ∞
  for each s in successors do /* update f with value from previous search, if any */
    s.f ← max(s.g + s.h, node.f)
  loop do
    best ← the lowest f-value node in successors
    if best.f > f_limit then return failure, best.f
    alternative ← the second-lowest f-value among successors
    result, best.f ← RBFS(problem, best, min(f_limit, alternative))
    if result ≠ failure then return result
Figure 4.2 The hill-climbing search algorithm, which is the most basic local search technique. At
each step the current node is replaced by the best neighbor; in this version, that means the neighbor
with the highest VALUE, but if a heuristic cost estimate h is used, we would find the neighbor with the
lowest h.
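The loop is short enough to state directly; here is a hedged Python sketch (not the figure's pseudocode), in which neighbors(state) returning a list of successor states and value(state) returning the objective to maximize are assumptions.

def hill_climbing(initial, neighbors, value):
    # Steepest-ascent hill climbing: stop at the first local maximum.
    current = initial
    while True:
        candidates = neighbors(current)
        if not candidates:
            return current
        best = max(candidates, key=value)
        if value(best) <= value(current):
            return current
        current = best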
Figure 4.5 The simulated annealing algorithm, a version of stochastic hill climbing where some
downhill moves are allowed. Downhill moves are accepted readily early in the annealing schedule and
then less often as time goes on. The schedule input determines the value of the temperature T as a
function of time.
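A Python sketch of the same loop is given below as an illustration; neighbors, value, and the geometric cooling schedule are assumptions of the sketch, not part of the figure.

import math, random

def simulated_annealing(initial, neighbors, value, schedule):
    # schedule(t) returns the temperature at time t; a non-positive value stops the search.
    current, t = initial, 0
    while True:
        T = schedule(t)
        if T <= 0:
            return current
        nxt = random.choice(neighbors(current))
        delta = value(nxt) - value(current)
        # Accept improvements always; accept downhill moves with probability e^(delta/T).
        if delta > 0 or random.random() < math.exp(delta / T):
            current = nxt
        t += 1

# A common (assumed) schedule: geometric cooling with a hard cutoff
schedule = lambda t: 0.95 ** t if t < 1000 else 0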
function GENETIC-ALGORITHM(population, FITNESS-FN) returns an individual
  inputs: population, a set of individuals
          FITNESS-FN, a function that measures the fitness of an individual
  repeat
    new_population ← empty set
    for i = 1 to SIZE(population) do
      x ← RANDOM-SELECTION(population, FITNESS-FN)
      y ← RANDOM-SELECTION(population, FITNESS-FN)
      child ← REPRODUCE(x, y)
      if (small random probability) then child ← MUTATE(child)
      add child to new_population
    population ← new_population
  until some individual is fit enough, or enough time has elapsed
  return the best individual in population, according to FITNESS-FN
Figure 4.8 A genetic algorithm. The algorithm is the same as the one diagrammed in Figure ??, with one variation: in this more popular version, each mating of two parents produces only one offspring, not two.
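A compact Python rendering of this one-offspring variant might look as follows; it is a sketch, and fitness_fn, reproduce, mutate, and the stopping parameters are all caller-supplied assumptions (fitness values are assumed non-negative, with at least one positive).

import random

def genetic_algorithm(population, fitness_fn, reproduce, mutate,
                      mutation_rate=0.1, generations=1000, good_enough=None):
    # population: list of individuals; reproduce(x, y) returns a single child.
    for _ in range(generations):
        weights = [fitness_fn(ind) for ind in population]
        new_population = []
        for _ in range(len(population)):
            # fitness-proportional random selection of two parents
            x, y = random.choices(population, weights=weights, k=2)
            child = reproduce(x, y)
            if random.random() < mutation_rate:
                child = mutate(child)
            new_population.append(child)
        population = new_population
        best = max(population, key=fitness_fn)
        if good_enough is not None and fitness_fn(best) >= good_enough:
            break
    return max(population, key=fitness_fn)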
Figure 4.11 An algorithm for searching AND–OR graphs generated by nondeterministic environments. It returns a conditional plan that reaches a goal state in all circumstances. (The notation [x | l] refers to the list formed by adding object x to the front of list l.)
Figure 4.21 An online search agent that uses depth-first exploration. The agent is applicable only in
state spaces in which every action can be “undone” by some other action.
Figure 4.24 LRTA*-AGENT selects an action according to the values of neighboring states, which
are updated as the agent moves about the state space.
5 ADVERSARIAL SEARCH
Figure 5.3 An algorithm for calculating minimax decisions. It returns the action corresponding to the best possible move, that is, the move that leads to the outcome with the best utility, under the assumption that the opponent plays to minimize utility. The functions MAX-VALUE and MIN-VALUE go through the whole game tree, all the way to the leaves, to determine the backed-up value of a state. The notation argmax_{a ∈ S} f(a) computes the element a of set S that has the maximum value of f(a).
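The following Python sketch mirrors that structure; it is an illustration, and the game object with actions, result, terminal_test, and utility methods is an assumed interface rather than anything defined in the figure.

def minimax_decision(state, game):
    def max_value(s):
        if game.terminal_test(s):
            return game.utility(s)
        return max(min_value(game.result(s, a)) for a in game.actions(s))

    def min_value(s):
        if game.terminal_test(s):
            return game.utility(s)
        return min(max_value(game.result(s, a)) for a in game.actions(s))

    # Choose the action whose resulting state has the best backed-up value for MAX.
    return max(game.actions(state), key=lambda a: min_value(game.result(state, a)))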
Figure 5.7 The alpha–beta search algorithm. Notice that these routines are the same as the MINIMAX functions in Figure ??, except for the two lines in each of MIN-VALUE and MAX-VALUE that maintain α and β (and the bookkeeping to pass these parameters along).
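A Python sketch with the two pruning tests made explicit is shown below; again the game interface is an assumption, and for simplicity each root action is evaluated with fresh bounds, which forgoes some pruning between root siblings.

def alphabeta_search(state, game):
    def max_value(s, alpha, beta):
        if game.terminal_test(s):
            return game.utility(s)
        v = float('-inf')
        for a in game.actions(s):
            v = max(v, min_value(game.result(s, a), alpha, beta))
            if v >= beta:
                return v             # prune: MIN already has a better option elsewhere
            alpha = max(alpha, v)
        return v

    def min_value(s, alpha, beta):
        if game.terminal_test(s):
            return game.utility(s)
        v = float('inf')
        for a in game.actions(s):
            v = min(v, max_value(game.result(s, a), alpha, beta))
            if v <= alpha:
                return v             # prune: MAX already has a better option elsewhere
            beta = min(beta, v)
        return v

    return max(game.actions(state),
               key=lambda a: min_value(game.result(state, a), float('-inf'), float('inf')))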
6 CONSTRAINT SATISFACTION PROBLEMS
function AC-3(csp) returns false if an inconsistency is found and true otherwise
  inputs: csp, a binary CSP with components (X, D, C)
  local variables: queue, a queue of arcs, initially all the arcs in csp
Figure 6.3 The arc-consistency algorithm AC-3. After applying AC-3, either every arc is arc-consistent, or some variable has an empty domain, indicating that the CSP cannot be solved. The name "AC-3" was used by the algorithm's inventor (?) because it's the third version developed in the paper.
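One way to write AC-3 in Python is sketched below; the representation (domains as mutable sets, a neighbors map, and a constraint(X, x, Y, y) compatibility test) is an assumption of the sketch, not dictated by the figure.

from collections import deque

def revise(domains, X, Y, constraint):
    # Remove values of X that have no supporting value in Y's domain.
    revised = False
    for x in list(domains[X]):
        if not any(constraint(X, x, Y, y) for y in domains[Y]):
            domains[X].remove(x)
            revised = True
    return revised

def ac3(variables, domains, neighbors, constraint):
    # Returns False if some domain is emptied (inconsistency), True otherwise.
    queue = deque((X, Y) for X in variables for Y in neighbors[X])
    while queue:
        X, Y = queue.popleft()
        if revise(domains, X, Y, constraint):
            if not domains[X]:
                return False
            for Z in neighbors[X]:
                if Z != Y:
                    queue.append((Z, X))
    return True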
Figure 6.5 A simple backtracking algorithm for constraint satisfaction problems. The algorithm is modeled on the recursive depth-first search of Chapter ??. By varying the functions SELECT-UNASSIGNED-VARIABLE and ORDER-DOMAIN-VALUES, we can implement the general-purpose heuristics discussed in the text. The function INFERENCE can optionally be used to impose arc-, path-, or k-consistency, as desired. If a value choice leads to failure (noticed either by INFERENCE or by BACKTRACK), then value assignments (including those made by INFERENCE) are removed from the current assignment and a new value is tried.
Figure 6.8 The MIN-CONFLICTS algorithm for solving CSPs by local search. The initial state may be chosen randomly or by a greedy assignment process that chooses a minimal-conflict value for each variable in turn. The CONFLICTS function counts the number of constraints violated by a particular value, given the rest of the current assignment.
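A Python sketch of min-conflicts follows; the conflicts(var, value, assignment) counter, the purely random initial assignment (the figure also allows a greedy one), and the step budget are assumptions.

import random

def min_conflicts(variables, domains, conflicts, max_steps=10000):
    current = {v: random.choice(list(domains[v])) for v in variables}
    for _ in range(max_steps):
        conflicted = [v for v in variables if conflicts(v, current[v], current) > 0]
        if not conflicted:
            return current                      # a complete, conflict-free assignment
        var = random.choice(conflicted)
        # move to the value that violates the fewest constraints, ties broken arbitrarily
        current[var] = min(domains[var], key=lambda val: conflicts(var, val, current))
    return None                                 # failure within the step budget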
function TREE-CSP-SOLVER(csp) returns a solution, or failure
  inputs: csp, a CSP with components X, D, C
  n ← number of variables in X
  assignment ← an empty assignment
  root ← any variable in X
  X ← TOPOLOGICALSORT(X, root)
  for j = n down to 2 do
    MAKE-ARC-CONSISTENT(PARENT(Xj), Xj)
    if it cannot be made consistent then return failure
  for i = 1 to n do
    assignment[Xi] ← any consistent value from Di
    if there is no consistent value then return failure
  return assignment
Figure 6.11 The TREE-CSP-SOLVER algorithm for solving tree-structured CSPs. If the CSP has a solution, we will find it in linear time; if not, we will detect a contradiction.
7 LOGICAL AGENTS
Figure 7.1 A generic knowledge-based agent. Given a percept, the agent adds the percept to its
knowledge base, asks the knowledge base for the best action, and tells the knowledge base that it has in
fact taken that action.
Figure 7.8 A truth-table enumeration algorithm for deciding propositional entailment. (TT stands for truth table.) PL-TRUE? returns true if a sentence holds within a model. The variable model represents a partial model, an assignment to some of the symbols. The keyword "and" is used here as a logical operation on its two arguments, returning true or false.
Figure 7.9 A simple resolution algorithm for propositional logic. The function PL-RESOLVE returns the set of all possible clauses obtained by resolving its two inputs.
Figure 7.12 The forward-chaining algorithm for propositional logic. The agenda keeps track of
symbols known to be true but not yet “processed.” The count table keeps track of how many premises
of each implication are as yet unknown. Whenever a new symbol p from the agenda is processed, the
count is reduced by one for each implication in whose premise p appears (easily identified in constant
time with appropriate indexing). If a count reaches zero, all the premises of the implication are known,
so its conclusion can be added to the agenda. Finally, we need to keep track of which symbols have
been processed; a symbol that is already in the set of inferred symbols need not be added to the agenda
again. This avoids redundant work and prevents loops caused by implications such as P ⇒ Q and
Q ⇒ P.
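The bookkeeping described above fits in a few lines of Python; the sketch below assumes a knowledge base given as (premises, conclusion) pairs with facts encoded as empty premise sets, a representation chosen for the example rather than taken from the figure.

def pl_fc_entails(clauses, q):
    # clauses: list of (premises, conclusion); returns True if the KB entails symbol q.
    count = {i: len(premises) for i, (premises, _) in enumerate(clauses)}
    inferred = set()
    agenda = [c for premises, c in clauses if not premises]        # the known facts
    while agenda:
        p = agenda.pop()
        if p == q:
            return True
        if p not in inferred:
            inferred.add(p)
            for i, (premises, conclusion) in enumerate(clauses):
                if p in premises:
                    count[i] -= 1                                  # one fewer unknown premise
                    if count[i] == 0:
                        agenda.append(conclusion)
    return False

# Example: facts L, M with L ∧ M ⇒ P and P ⇒ Q entail Q
kb = [(set(), 'L'), (set(), 'M'), ({'L', 'M'}, 'P'), ({'P'}, 'Q')]
print(pl_fc_entails(kb, 'Q'))   # -> True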
Figure 7.14 The DPLL algorithm for checking satisfiability of a sentence in propositional logic. The ideas behind FIND-PURE-SYMBOL and FIND-UNIT-CLAUSE are described in the text; each returns a symbol (or null) and the truth value to assign to that symbol. Like TT-ENTAILS?, DPLL operates over partial models.
Figure 7.15 The WALKSAT algorithm for checking satisfiability by randomly flipping the values of variables. Many versions of the algorithm exist.
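One of the many possible versions, written as a Python sketch: clauses are lists of (symbol, polarity) literals, and the flip probability p and flip budget are parameters of the sketch, not values fixed by the figure.

import random

def walksat(clauses, p=0.5, max_flips=10000):
    symbols = {s for clause in clauses for s, _ in clause}
    model = {s: random.choice([True, False]) for s in symbols}

    def satisfied(clause):
        return any(model[s] == positive for s, positive in clause)

    for _ in range(max_flips):
        unsatisfied = [c for c in clauses if not satisfied(c)]
        if not unsatisfied:
            return model                          # a satisfying assignment
        clause = random.choice(unsatisfied)
        if random.random() < p:
            s = random.choice(clause)[0]          # random walk: flip any symbol in the clause
        else:
            def satisfied_count_after_flip(sym):
                model[sym] = not model[sym]
                n = sum(satisfied(c) for c in clauses)
                model[sym] = not model[sym]
                return n
            s = max((sym for sym, _ in clause), key=satisfied_count_after_flip)
        model[s] = not model[s]
    return None                                   # no model found within the flip budget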
Figure 7.17 A hybrid agent program for the wumpus world. It uses a propositional knowledge base
to infer the state of the world, and a combination of problem-solving search and domain-specific code
to decide what actions to take.
function SATPLAN(init, transition, goal, Tmax) returns solution or failure
  inputs: init, transition, goal, constitute a description of the problem
          Tmax, an upper limit for plan length
  for t = 0 to Tmax do
    cnf ← TRANSLATE-TO-SAT(init, transition, goal, t)
    model ← SAT-SOLVER(cnf)
    if model is not null then
      return EXTRACT-SOLUTION(model)
  return failure
Figure 7.19 The SATPLAN algorithm. The planning problem is translated into a CNF sentence in which the goal is asserted to hold at a fixed time step t and axioms are included for each time step up to t. If the satisfiability algorithm finds a model, then a plan is extracted by looking at those proposition symbols that refer to actions and are assigned true in the model. If no model exists, then the process is repeated with the goal moved one step later.
8 FIRST-ORDER LOGIC
9 INFERENCE IN FIRST-ORDER LOGIC
Figure 9.1 The unification algorithm. The algorithm works by comparing the structures of the inputs, element by element. The substitution θ that is the argument to UNIFY is built up along the way and is used to make sure that later comparisons are consistent with bindings that were established earlier. In a compound expression such as F(A, B), the OP field picks out the function symbol F and the ARGS field picks out the argument list (A, B).
Figure 9.8 Pseudocode representing the result of compiling the Append predicate. The function NEW-VARIABLE returns a new variable, distinct from all other variables used so far. The procedure CALL(continuation) continues execution with the specified continuation.
10 CLASSICAL PLANNING
Figure 10.3 A planning problem in the blocks world: building a three-block tower. One solution is
the sequence [MoveToTable(C, A), Move(B, Table, C), Move(A, Table, B)].
Init(Have(Cake))
Goal(Have(Cake) ∧ Eaten(Cake))
Action(Eat(Cake)
  PRECOND: Have(Cake)
  EFFECT: ¬Have(Cake) ∧ Eaten(Cake))
Action(Bake(Cake)
  PRECOND: ¬Have(Cake)
  EFFECT: Have(Cake))
Figure 10.7 The “have cake and eat cake too” problem.
Figure 10.9 The GRAPHPLAN algorithm. GRAPHPLAN calls EXPAND-GRAPH to add a level until either a solution is found by EXTRACT-SOLUTION, or no solution is possible.
11 PLANNING AND ACTING IN THE REAL WORLD
Figure 11.1 A job-shop scheduling problem for assembling two cars, with resource constraints. The
notation A ≺ B means that action A must precede action B.
Figure 11.4 Definitions of possible refinements for two high-level actions: going to San Francisco airport and navigating in the vacuum world. In the latter case, note the recursive nature of the refinements and the use of preconditions.
Figure 11.5 A breadth-first implementation of hierarchical forward planning search. The initial plan supplied to the algorithm is [Act]. The REFINEMENTS function returns a set of action sequences, one for each refinement of the HLA whose preconditions are satisfied by the specified state, outcome.
Figure 11.8 A hierarchical planning algorithm that uses angelic semantics to identify and commit to high-level plans that work while avoiding high-level plans that don't. The predicate MAKING-PROGRESS checks to make sure that we aren't stuck in an infinite regression of refinements. At top level, call ANGELIC-SEARCH with [Act] as the initialPlan.
Actors(A, B)
Init(At(A, LeftBaseline) ∧ At(B, RightNet) ∧
     Approaching(Ball, RightBaseline) ∧ Partner(A, B) ∧ Partner(B, A))
Goal(Returned(Ball) ∧ (At(a, RightNet) ∨ At(a, LeftNet)))
Action(Hit(actor, Ball),
  PRECOND: Approaching(Ball, loc) ∧ At(actor, loc)
  EFFECT: Returned(Ball))
Action(Go(actor, to),
  PRECOND: At(actor, loc) ∧ to ≠ loc,
  EFFECT: At(actor, to) ∧ ¬At(actor, loc))
Figure 11.10 The doubles tennis problem. Two actors A and B are playing together and can be in one of four locations: LeftBaseline, RightBaseline, LeftNet, and RightNet. The ball can be returned only if a player is in the right place. Note that each action must include the actor as an argument.
12 KNOWLEDGE REPRESENTATION
13 QUANTIFYING UNCERTAINTY
14 PROBABILISTIC REASONING
Figure 14.9 The enumeration algorithm for answering queries on Bayesian networks.
function ELIMINATION-ASK(X, e, bn) returns a distribution over X
  factors ← [ ]
  for each var in ORDER(bn.VARS) do
    factors ← [MAKE-FACTOR(var, e) | factors]
    if var is a hidden variable then factors ← SUM-OUT(var, factors)
  return NORMALIZE(POINTWISE-PRODUCT(factors))
Figure 14.10 The variable elimination algorithm for inference in Bayesian networks.
function PRIOR-SAMPLE(bn) returns an event sampled from the prior specified by bn
  inputs: bn, a Bayesian network specifying joint distribution P(X1, . . . , Xn)
Figure 14.12 A sampling algorithm that generates events from a Bayesian network. Each variable is sampled according to the conditional distribution given the values already sampled for the variable's parents.
for j = 1 to N do
  x ← PRIOR-SAMPLE(bn)
  if x is consistent with e then
    N[x] ← N[x] + 1 where x is the value of X in x
return NORMALIZE(N)
Figure 14.13 The rejection-sampling algorithm for answering queries given evidence in a Bayesian network.
for j = 1 to N do
  x, w ← WEIGHTED-SAMPLE(bn, e)
  W[x] ← W[x] + w where x is the value of X in x
return NORMALIZE(W)
Figure 14.15 The Gibbs sampling algorithm for approximate inference in Bayesian networks; this version cycles through the variables, but choosing variables at random also works.
15 PROBABILISTIC REASONING OVER TIME
fv[0] ← prior
b ← a representation of the backward message, initially all 1s
for i = 1 to t do
  fv[i] ← FORWARD(fv[i − 1], ev[i])
for i = t downto 1 do
  sv[i] ← NORMALIZE(fv[i] × b)
  b ← BACKWARD(b, ev[i])
return sv
Figure 15.4 The forward–backward algorithm for smoothing: computing posterior probabilities of a sequence of states given a sequence of observations. The FORWARD and BACKWARD operators are defined by Equations (??) and (??), respectively.
Figure 15.6 An algorithm for smoothing with a fixed time lag of d steps, implemented as an online algorithm that outputs the new smoothed estimate given the observation for a new time step. Notice that the final output NORMALIZE(f × B1) is just αf × b, by Equation (??).
function PARTICLE-FILTERING(e, N, dbn) returns a set of samples for the next time step
  inputs: e, the new incoming evidence
          N, the number of samples to be maintained
          dbn, a DBN with prior P(X0), transition model P(X1 | X0), sensor model P(E1 | X1)
  persistent: S, a vector of samples of size N, initially generated from P(X0)
  local variables: W, a vector of weights of size N
  for i = 1 to N do
    S[i] ← sample from P(X1 | X0 = S[i])    /* step 1 */
    W[i] ← P(e | X1 = S[i])                 /* step 2 */
  S ← WEIGHTED-SAMPLE-WITH-REPLACEMENT(N, S, W)    /* step 3 */
  return S
Figure 15.17 The particle filtering algorithm implemented as a recursive update operation with state (the set of samples). Each of the sampling operations involves sampling the relevant slice variables in topological order, much as in PRIOR-SAMPLE. The WEIGHTED-SAMPLE-WITH-REPLACEMENT operation can be implemented to run in O(N) expected time. The step numbers refer to the description in the text.
16 MAKING SIMPLE DECISIONS
Figure 16.9 Design of a simple information-gathering agent. The agent works by repeatedly selecting the observation with the highest information value, until the cost of the next observation is greater than its expected benefit.
17 MAKING COMPLEX DECISIONS
repeat
  U ← U′; δ ← 0
  for each state s in S do
    U′[s] ← R(s) + γ max_{a ∈ A(s)} Σ_{s′} P(s′ | s, a) U[s′]
    if |U′[s] − U[s]| > δ then δ ← |U′[s] − U[s]|
until δ < ε(1 − γ)/γ
return U
Figure 17.4 The value iteration algorithm for calculating utilities of states. The termination condition is from Equation (??).
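The update and the termination test translate directly into Python; in the sketch below, P(s, a) returning a list of (probability, next_state) pairs and the assumption that every state has at least one action and that 0 < gamma < 1 are modeling choices of the example, not of the figure.

def value_iteration(states, actions, P, R, gamma=0.9, epsilon=1e-3):
    U = {s: 0.0 for s in states}
    while True:
        U_new, delta = {}, 0.0
        for s in states:
            best = max(sum(p * U[s2] for p, s2 in P(s, a)) for a in actions(s))
            U_new[s] = R(s) + gamma * best
            delta = max(delta, abs(U_new[s] - U[s]))
        U = U_new
        if delta < epsilon * (1 - gamma) / gamma:   # termination bound from the text
            return U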
repeat
  U ← POLICY-EVALUATION(π, U, mdp)
  unchanged? ← true
  for each state s in S do
    if max_{a ∈ A(s)} Σ_{s′} P(s′ | s, a) U[s′] > Σ_{s′} P(s′ | s, π[s]) U[s′] then do
      π[s] ← argmax_{a ∈ A(s)} Σ_{s′} P(s′ | s, a) U[s′]
      unchanged? ← false
until unchanged?
return π
Figure 17.7 The policy iteration algorithm for calculating an optimal policy.
Figure 17.9 A high-level sketch of the value iteration algorithm for POMDPs. The REMOVE-DOMINATED-PLANS step and MAX-DIFFERENCE test are typically implemented as linear programs.
18 LEARNING FROM EXAMPLES
Figure 18.4 The decision-tree learning algorithm. The function IMPORTANCE is described in Section ??. The function PLURALITY-VALUE selects the most common output value among a set of examples, breaking ties randomly.
local variables: errT, an array, indexed by size, storing training-set error rates
                 errV, an array, indexed by size, storing validation-set error rates
for size = 1 to ∞ do
  errT[size], errV[size] ← CROSS-VALIDATION(Learner, size, k, examples)
  if errT has converged then do
    best_size ← the value of size with minimum errV[size]
    return Learner(best_size, examples)
Figure 18.7 An algorithm to select the model that has the lowest error rate on validation data by building models of increasing complexity, and choosing the one with the best empirical error rate on validation data. Here errT means error rate on the training data, and errV means error rate on the validation data. Learner(size, examples) returns a hypothesis whose complexity is set by the parameter size, and which is trained on the examples. PARTITION(examples, fold, k) splits examples into two subsets: a validation set of size N/k and a training set with all the other examples. The split is different for each value of fold.
for each weight wi,j in network do
  wi,j ← a small random number
repeat
  for each example (x, y) in examples do
    /* Propagate the inputs forward to compute the outputs */
    for each node i in the input layer do
      ai ← xi
    for ℓ = 2 to L do
      for each node j in layer ℓ do
        inj ← Σi wi,j ai
        aj ← g(inj)
    /* Propagate deltas backward from output layer to input layer */
    for each node j in the output layer do
      Δ[j] ← g′(inj) × (yj − aj)
    for ℓ = L − 1 to 1 do
      for each node i in layer ℓ do
        Δ[i] ← g′(ini) Σj wi,j Δ[j]
    /* Update every weight in network using deltas */
    for each weight wi,j in network do
      wi,j ← wi,j + α × ai × Δ[j]
until some stopping criterion is satisfied
return network
function ADABOOST(examples, L, K) returns a weighted-majority hypothesis
  w ← a vector of N example weights, initially 1/N
  for k = 1 to K do
    h[k] ← L(examples, w)
    error ← 0
    for j = 1 to N do
      if h[k](xj) ≠ yj then error ← error + w[j]
    for j = 1 to N do
      if h[k](xj) = yj then w[j] ← w[j] · error/(1 − error)
    w ← NORMALIZE(w)
    z[k] ← log((1 − error)/error)
  return WEIGHTED-MAJORITY(h, z)
Figure 18.33 The ADABOOST variant of the boosting method for ensemble learning. The algorithm generates hypotheses by successively reweighting the training examples. The function WEIGHTED-MAJORITY generates a hypothesis that returns the output value with the highest vote from the hypotheses in h, with votes weighted by z.
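A Python sketch of this reweighting loop is given below; it assumes labels in {+1, -1} and a weak learner L(examples, w) that returns a callable hypothesis, and it stops early on a perfect or useless hypothesis, which is a small departure from the pseudocode.

import math

def adaboost(examples, L, K):
    N = len(examples)
    w = [1.0 / N] * N
    h, z = [], []
    for _ in range(K):
        hk = L(examples, w)
        error = sum(wj for wj, (x, y) in zip(w, examples) if hk(x) != y)
        if error == 0 or error >= 0.5:
            break                                    # nothing to reweight, or no better than chance
        for j, (x, y) in enumerate(examples):
            if hk(x) == y:
                w[j] *= error / (1 - error)          # shrink weights of correctly classified examples
        total = sum(w)
        w = [wj / total for wj in w]
        h.append(hk)
        z.append(math.log((1 - error) / error))

    def weighted_majority(x):
        vote = sum(zk * hk(x) for hk, zk in zip(h, z))
        return 1 if vote >= 0 else -1
    return weighted_majority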
19 KNOWLEDGE IN LEARNING
Figure 19.2 The current-best-hypothesis learning algorithm. It searches for a consistent hypothesis that fits all the examples and backtracks when no consistent specialization/generalization can be found. To start the algorithm, any hypothesis can be passed in; it will be specialized or generalized as needed.
Figure 19.3 The version space learning algorithm. It finds a subset of V that is consistent with all
the examples.
for i = 0 to n do
  for each subset Ai of A of size i do
    if CONSISTENT-DET?(Ai, E) then return Ai
Figure 19.12 Sketch of the FOIL algorithm for learning sets of first-order Horn clauses from examples. NEW-LITERALS and CHOOSE-LITERAL are explained in the text.
20 LEARNING PROBABILISTIC MODELS
21 REINFORCEMENT LEARNING
Figure 21.2 A passive reinforcement learning agent based on adaptive dynamic programming. The POLICY-EVALUATION function solves the fixed-policy Bellman equations, as described on page ??.
if s′ is new then U[s′] ← r′
if s is not null then
  increment Ns[s]
  U[s] ← U[s] + α(Ns[s])(r + γ U[s′] − U[s])
if s′.TERMINAL? then s, a, r ← null else s, a, r ← s′, π[s′], r′
return a
Figure 21.4 A passive reinforcement learning agent that learns utility estimates using temporal differences. The step-size function α(n) is chosen to ensure convergence, as described in the text.
Figure 21.8 An exploratory Q-learning agent. It is an active learner that learns the value Q(s, a) of
each action in each situation. It uses the same exploration function f as the exploratory ADP agent,
but avoids having to learn the transition model because the Q-value of a state can be related directly to
those of its neighbors.
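The core of such an agent is a temporal-difference update of a Q table plus an exploration policy. The Python sketch below uses epsilon-greedy exploration as a stand-in for the exploration function f; the actions(state) callback and the fixed learning rate are assumptions of the sketch.

import random
from collections import defaultdict

Q = defaultdict(float)      # Q[(state, action)], defaults to 0

def q_update(Q, s, a, r, s2, actions, alpha=0.1, gamma=0.9):
    # One Q-learning backup after observing transition (s, a) -> reward r, next state s2.
    best_next = max((Q[(s2, a2)] for a2 in actions(s2)), default=0.0)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def choose_action(Q, s, actions, epsilon=0.1):
    # Epsilon-greedy exploration instead of the book's exploration function f.
    acts = list(actions(s))
    if random.random() < epsilon:
        return random.choice(acts)
    return max(acts, key=lambda a: Q[(s, a)])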
22 NATURAL LANGUAGE PROCESSING
Figure 22.1 The HITS algorithm for computing hubs and authorities with respect to a query. RELEVANT-PAGES fetches the pages that match the query, and EXPAND-PAGES adds in every page that links to or is linked from one of the relevant pages. NORMALIZE divides each page's score by the sum of the squares of all pages' scores (separately for the authority and hub scores).
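The fixed point can be computed by simple power iteration; the Python sketch below operates on an already-expanded page set, so RELEVANT-PAGES and EXPAND-PAGES are outside its scope, and links[p] (the set of pages p links to) and the iteration count are assumptions.

def hits(pages, links, iterations=50):
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    inlinks = {p: [q for q in pages if p in links.get(q, ())] for p in pages}
    for _ in range(iterations):
        auth = {p: sum(hub[q] for q in inlinks[p]) for p in pages}
        hub = {p: sum(auth[q] for q in links.get(p, ())) for p in pages}
        # normalize each score vector by the square root of the sum of squares
        for scores in (auth, hub):
            norm = sum(v * v for v in scores.values()) ** 0.5 or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth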
23 NATURAL LANGUAGE FOR COMMUNICATION
Figure 23.4 The CYK algorithm for parsing. Given a sequence of words, it finds the most probable derivation for the whole sequence and for each subsequence. It returns the whole table, P, in which an entry P[X, start, len] is the probability of the most probable X of length len starting at position start. If there is no X of that size at that location, the probability is 0.
Figure 23.5 Annotated tree for the sentence “Her eyes were glazed as if she didn’t hear or even
see him.” from the Penn Treebank. Note that in this grammar there is a distinction between an object
noun phrase (NP) and a subject noun phrase (NP-SBJ). Note also a grammatical phenomenon we have
not covered yet: the movement of a phrase from one part of the tree to another. This tree analyzes
the phrase “hear or even see him” as consisting of two constituent VPs, [VP hear [NP *-1]] and [VP
[ADVP even] see [NP *-1]], both of which have a missing object, denoted *-1, which refers to the NP
labeled elsewhere in the tree as [NP-1 him].
24 PERCEPTION
25 ROBOTICS
Figure 25.9 A Monte Carlo localization algorithm using a range-scan sensor model with independent noise.
26 PHILOSOPHICAL FOUNDATIONS
27 AI: THE PRESENT AND FUTURE
28 MATHEMATICAL BACKGROUND
29 NOTES ON LANGUAGES AND ALGORITHMS
Figure 29.1 Example of a generator function and its invocation within a loop.