
UNIVERSITY OF MAURITIUS
FACULTY OF ENGINEERING

ARTIFICIAL INTELLIGENCE IN GAMES

BEng (Hons) MECHATRONICS

DATE :

ABSTRACT

Artificial Intelligence is today’s hottest topic. More researchers than ever work on AI in some form, and more non-researchers than ever are interested in the field. Games are a popular application area for AI research. While board games have been central to AI research since the inception of the field, video games have during the last decade increasingly become the domain of choice for testing and showcasing new algorithms. This paper illustrates some of the AI methods used to play games and how they may be applied.

Team members

THIBAUT RIFAI (1312767)
LALLBAHADOOR GOVIND (1413959)
Contents

1. Introduction

2. AI Methods for playing games

   A. AI method: Ad-Hoc Behavior Authoring
      1. Finite State Machines
      2. Behavior Trees
      3. Utility-Based AI
      A Short Note on Ad-Hoc Behavior Authoring

   B. AI method: Tree Search
      1. Uninformed Search
      2. Best-First Search
      3. Monte Carlo Tree Search

   C. AI methods: Supervised Learning
      1. ANNs
      2. Decision Tree Learning

3. Conclusion

References

1. Introduction
Artificial Intelligence (AI) has seen immense progress in recent years. AI advances have enabled
better understanding of images and speech, emotion detection, self-driving cars, web searching,
AI-assisted creative design, and game-playing, among many other tasks; for some of these tasks
machines have reached human-level performance or beyond. There is, however, a difference between
what machines can do well and what humans are good at. In the early days of AI, these problems
were presented to the machines as a set of formal mathematical notions within rather narrow and
controlled spaces, which could be solved by some form of symbol manipulation or search in
symbolic space. Over the years the focus of much AI research has shifted to tasks that are relatively
simple for humans to do but are hard for us to describe how to do, such as remembering a face or
recognizing our friend’s voice over the phone [STUDY PANEL (2016), Yannakakis. G. N (2018)].

Game AI—in particular video game or computer game AI—has seen major advancements in the
(roughly) fifteen years of its existence as a separate research field. We can nowadays use AI to
play many games better than any human, we can design AI bots that are more believable and
human-like than human players, we can collaborate with AI to design better and unconventional
(aspects of) games, we can better understand players and play by modeling the overall game
experience, we can better understand game design by modeling it as an algorithm, and we can
improve game design and tune our monetization strategy by analyzing massive amounts of player
data. Any AI-informed decisions about the future of a game’s design or development are based on
evidence rather than intuition, which showcases the potential of AI—via game analytics and game
data mining—for better design, development and quality assurance procedures [Togelius. J
(2011)].

2. AI Methods for playing games
This chapter presents a number of basic AI methods that are commonly used in games.

Throughout the chapter the game of Ms Pac-Man (Namco, 1982) is used as a running example to show how these methods can be applied in a game. In Artificial Intelligence and Games, Ms Pac-Man was picked for its popularity and its game design simplicity, as well as for its high complexity when it comes to actually playing the game. It is important to remember that Ms Pac-Man is a non-deterministic variant of its ancestor Pac-Man (Namco, 1980), which implies that the movements of the ghosts involve a degree of randomness. For the sake of consistency, all the methods covered are employed to control Ms Pac-Man’s behavior, even though they could find a multitude of other uses in this game [Yannakakis. G. N (2018)].

All of these methods rest on two core principles: the first is the algorithm’s representation and the second is its utility. On the one hand, any AI algorithm somehow stores and maintains knowledge obtained about the particular task at hand. On the other hand, most AI algorithms seek to find better representations of that knowledge, and this seeking process is driven by a utility function of some form. We should note that utility is of limited use in methods that employ static knowledge representations, such as finite state machines or behavior trees.

1. Representation

Appropriately representing knowledge is a key challenge for artificial intelligence and it is


motivated by the capacity of the human brain to store and retrieve obtained knowledge about the
world. As a response to the open general questions regarding knowledge and its representation, AI
has identified numerous and very specific ways to store and retrieve information which is authored,
obtained, or learned. The representation of knowledge about a task or a problem can be viewed as
the computational mapping of the task under investigation. On that basis, the representation needs
to store knowledge about the task in a format that a machine is able to process, such as a data
structure. [Woodbury. R.F (1991)]

2. Utility

Utility in game theory is a measure of rational choice when playing a game. In general, it can be
viewed as a function that is able to assist a search algorithm to decide which path to take. For that purpose, the utility function samples aspects of the search space and gathers information about the
“goodness” of areas in the space. In a sense, a utility function is an approximation of the solution
we try to find. In other words, it is a measure of goodness of the existing representation we search
through. [Woodbury. R.F (1991)]

A. AI method: Ad-Hoc Behavior Authoring


1. Finite State Machines
2. Behavior Trees
3. Utility-Based AI

1. Finite State Machines


FSMs belong to the expert-knowledge systems area and are represented as graphs. An FSM graph
is an abstract representation of an interconnected set of objects, symbols, events, actions or
properties of the phenomenon that needs to be represented. In particular, the graph contains nodes
(states) which embed some mathematical abstraction and edges (transitions) which represent a
conditional relationship between the nodes. The FSM can only be in one state at a time; the current
state can change to another if the condition in the corresponding transition is fulfilled [Gill. A. (1962)].

An FSM for Ms Pac-Man:

In this section we showcase FSMs as employed to control the Ms Pac-Man agent. A hypothetical
and simplified FSM controller for Ms Pac-Man is illustrated in Fig. 2.1. In this example our FSM
has three states (seek pellets, chase ghosts and evade ghosts) and four transitions (ghosts flashing,
no visible ghost, ghost in sight, and power pill eaten). While in the seek pellets state, Ms Pac-Man moves randomly until it detects a pellet, and then follows a pathfinding algorithm to eat as many pellets as possible, as quickly as possible. If a power pill is eaten, Ms Pac-Man moves to the chase ghosts state, in which it can use any tree-search algorithm to chase the blue ghosts. When the ghosts start flashing, Ms Pac-Man moves to the evade ghosts state, in which it uses tree search to evade ghosts until none is visible within a distance; when that happens, Ms Pac-Man moves back to the seek pellets state.

Figure 2-1 A high-level and simplified FSM example for controlling Ms Pac-Man.
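
To make the transition logic above concrete, here is a minimal Python sketch of this FSM. The state names, the observation fields (power_pill_eaten, ghosts_flashing, ghost_in_sight) and the seek pellets to evade ghosts transition are illustrative assumptions rather than part of any real Ms Pac-Man framework:

```python
from dataclasses import dataclass

@dataclass
class Observations:
    """Hypothetical per-tick observations extracted from the game."""
    power_pill_eaten: bool = False
    ghosts_flashing: bool = False
    ghost_in_sight: bool = False

def next_state(state: str, obs: Observations) -> str:
    """Return the FSM state for the next tick given the current observations."""
    if state == "seek_pellets" and obs.power_pill_eaten:
        return "chase_ghosts"   # power pill eaten: hunt the blue ghosts
    if state == "seek_pellets" and obs.ghost_in_sight:
        return "evade_ghosts"   # ghost in sight: run away (assumed transition)
    if state == "chase_ghosts" and obs.ghosts_flashing:
        return "evade_ghosts"   # ghosts about to recover: evade
    if state == "evade_ghosts" and not obs.ghost_in_sight:
        return "seek_pellets"   # no visible ghost: back to eating pellets
    return state                # no transition condition fulfilled: stay put
```

Each game tick the controller would call next_state and then execute the action policy attached to the resulting state (random movement, pathfinding, or tree search, as described above).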

2. Behavior Trees
A Behavior Tree (BT) [Champandard, 2007] is an expert-knowledge system which, similarly to
an FSM, models transitions between a finite set of tasks (or behaviors). The strength of BTs
compared to FSMs is their modularity: if designed well, they can yield complex behaviors
composed of simple tasks. The main difference between BTs and FSMs is that they are composed of behaviors rather than states. As with finite state machines, BTs are easy to design, test and debug, which has made them dominant in the game development scene.

A BT employs a tree structure with a root node and a number of parent and corresponding child nodes representing behaviors—see Fig. 2.2 for an example. We traverse a BT starting from the root, then activate the execution of parent-child pairs as denoted in the tree. A child may return the following values to the parent at predetermined time steps (ticks): run if the behavior is still active, success if the behavior has completed, and failure if the behavior failed. BTs are composed of three node types: the sequence, the selector, and the decorator, the basic functionality of which is described below:

• Sequence (see blue rectangle in Fig. 2.2): if the child behavior succeeds, the sequence continues
and eventually the parent node succeeds if all child behaviors succeed; otherwise the sequence
fails.

• Selector (see red rounded rectangle in Fig. 2.2): there are two main types of selector nodes: the probability and the priority selectors. When a probability selector is used, child behaviors are selected based on parent-child probabilities set by the BT designer. On the other hand, if priority selectors are used, child behaviors are ordered in a list and tried one after the other. Regardless of the selector type used, if the child behavior succeeds the selector succeeds. If the child behavior fails, the next child in the order is selected (in priority selectors) or the selector fails (in probability selectors).

• Decorator (see purple hexagon in Fig. 2.2): the decorator node adds complexity to and enhances the capacity of a single child behavior. Decorator examples include the number of times a child behavior runs or the time given to a child behavior to complete the task.

A BT for Ms Pac-Man

In Fig. 2.3 we illustrate a simple BT for the seek pellets behavior of Ms Pac-Man. While in the seek pellets sequence behavior, Ms Pac-Man will first move (selector), then find a pellet, and finally keep eating pellets until a ghost comes into sight (decorator). While in the move behavior—which is a priority selector—Ms Pac-Man will prioritize ghost-free corridors over corridors with pellets and over corridors without pellets.
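
To show how the three node types compose, the following Python sketch implements minimal sequence, priority selector, and decorator nodes. The tick interface and the run/success/failure statuses follow the description above; the ghost_in_sight observation and the leaf behaviors named in the final comment are hypothetical:

```python
from enum import Enum

class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3   # the "run" return value: behavior still active

class Sequence:
    """Succeeds only if every child succeeds, in order; fails otherwise."""
    def __init__(self, children):
        self.children = children
    def tick(self, obs):
        for child in self.children:
            status = child.tick(obs)
            if status != Status.SUCCESS:
                return status          # propagate FAILURE or RUNNING
        return Status.SUCCESS

class PrioritySelector:
    """Tries children in priority order; succeeds on the first success."""
    def __init__(self, children):
        self.children = children
    def tick(self, obs):
        for child in self.children:
            status = child.tick(obs)
            if status != Status.FAILURE:
                return status          # propagate SUCCESS or RUNNING
        return Status.FAILURE          # every child failed

class UntilGhostInSight:
    """Decorator: keeps running its child until a ghost becomes visible."""
    def __init__(self, child):
        self.child = child
    def tick(self, obs):
        if obs.ghost_in_sight:
            return Status.SUCCESS
        self.child.tick(obs)
        return Status.RUNNING

# Assembling the seek-pellets subtree of Fig. 2.3 (the leaf behaviors are
# hypothetical objects implementing tick):
# seek_pellets = Sequence([move, find_pellet, UntilGhostInSight(eat_pellets)])
```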

3. Utility-Based AI
An increasingly popular method for ad-hoc behavior authoring that eliminates the modularity
limitations of FSMs and BTs is the utility-based AI approach which can be used for the design of
control and decision making systems in games [Mark. D (2009)]. Following this approach, instances in the game are assigned a particular utility function that gives a value for the importance of the particular instance [Alexander. B (2002)]: for instance, the importance of an enemy being present at a particular distance, or the importance of an agent’s health being low in this particular context. Given the set of all utilities available to an agent and all the options it has, utility-based AI decides which is the most important option to consider at this moment [Mark. D and Dill. K. (2010)].
The approach is similar to the design of membership functions in a fuzzy set. A utility can measure
anything from observable objective data (e.g., enemy health) to subjective notions such as
emotions, mood and threat. The various utilities about possible actions or decisions can be
aggregated into linear or non-linear formulas and guide the agent to take decisions based on the
aggregated utility.

So while FSMs and BTs examine one decision at a time, utility-based AI architectures examine all available options, assign a utility to each, and select the option that is most appropriate (i.e., the one with the highest utility).

Utility-Based AI for Ms Pac-Man

Figure 2.4 illustrates an example of three simple utility functions that could be considered by Ms
Pac-Man during play. Each function corresponds to a different behavior that is dependent on the
current threat level of the game; threat is, in turn, a function of current ghost positions. At any
point in the game Ms Pac-Man selects the behavior with the highest utility value.

Figure 2-4 A utility-based approach for controlling Ms Pac-Man behavior. The threat level (x-axis) is a function that lies between 0 and 1 and is based on the current position of the ghosts. Ms Pac-Man considers the current level of threat, assigns utility values (through the three different curves) and decides to follow the behavior with the highest utility value. In this example the utility of ghost evading rises exponentially with the level of threat. The utility of ghost hunting decreases linearly with respect to threat up to a point where it stabilizes; it then decreases linearly again as the threat level increases above a threshold value. Finally, the utility of pellet seeking increases linearly up to a considerable threat level, from which point it decreases exponentially.
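
As a concrete sketch of this selection mechanism, the Python snippet below assigns a utility to each of the three behaviors as a function of a threat level in [0, 1] and picks the maximum. The curve shapes loosely follow Fig. 2-4, but the exact coefficients are illustrative assumptions, not values from the literature:

```python
import math

def utility_evade(threat: float) -> float:
    # rises exponentially with the threat level (0 at threat=0, 1 at threat=1)
    return (math.exp(3.0 * threat) - 1.0) / (math.exp(3.0) - 1.0)

def utility_hunt(threat: float) -> float:
    # decreases linearly, stabilizes, then decreases linearly again
    if threat < 0.3:
        return 1.0 - threat
    if threat < 0.6:
        return 0.7
    return max(0.0, 0.7 - (threat - 0.6))

def utility_seek(threat: float) -> float:
    # increases linearly up to a considerable threat, then decays exponentially
    if threat < 0.5:
        return 0.4 + 0.6 * threat
    return 0.7 * math.exp(-5.0 * (threat - 0.5))

def select_behavior(threat: float) -> str:
    utilities = {
        "evade_ghosts": utility_evade(threat),
        "chase_ghosts": utility_hunt(threat),
        "seek_pellets": utility_seek(threat),
    }
    return max(utilities, key=utilities.get)   # highest utility wins

print(select_behavior(0.9))   # -> "evade_ghosts" under these example curves
```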

A Short Note on Ad-Hoc Behavior Authoring


It is important to remember that all three methods covered in this section (and, in general, the
methods covered in this chapter) represent the very basic variants of the algorithms. As a result,
the algorithms we covered appear as static representations of states, behaviors or utility functions.
It is possible, however, to create dynamic variants of those by adding non-deterministic or fuzzy
elements; for instance, one may employ fuzzy transitions in an FSM or evolve behaviors in a BT.

B. AI method: Tree Search


Almost every AI problem can be cast as a search problem, which can be solved by finding the
best (according to some measure) plan, path, model, function, etc. Search algorithms are therefore
often seen as being at the core of AI, to the point that many textbooks (such as Russell and Norvig’s famous textbook [Russell. S and Norvig. P (1995)]) start with a treatment of search algorithms. The
algorithms presented below can all be characterized as tree search algorithms as they can be seen
as building a search tree where the root is the node representing the state where the search starts.
Edges in this tree represent actions the agent takes to get from one state to another, and nodes
represent states. There are typically several different actions that can be taken in a given state and
these actions are known as the tree branches. Tree search algorithms mainly differ in which
branches are explored and in what order.

The three main tree search methods are:

1. Uninformed Search
2. Best-First Search
3. Monte Carlo Tree Search

1. Uninformed Search
Uninformed search algorithms are algorithms which search a state space without any further
information about the goal. The basic uninformed search algorithms are commonly seen as
fundamental computer science algorithms, and are sometimes not even seen as AI. Depth-first
search is a search algorithm which explores each branch as far as possible before backtracking and
trying another branch. At every iteration of its main loop, depth-first search selects a branch and
then moves on to explore the resulting node in the next iteration. When a terminal node is reached,
depth-first search advances up the list of visited nodes until it finds one which has unexplored
actions [Russell. S and Norvig. P (1995)].

Uninformed Search for Ms Pac-Man

A depth-first approach in Ms Pac-Man would normally consider the branches of the game tree
until Ms Pac-Man either completes the level or loses. The outcome of this search for each possible
action would determine which action to take at a given moment. Breadth-first instead would first
explore all possible actions of Ms Pac-Man at the current state of the game (e.g., going left, up,
down or right) and would then explore all their resulting nodes (children) and so on. The game tree
of either method is too big and complex to visualize within a Ms Pac-Man example.
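
Still, the mechanics are easy to sketch. The Python snippet below runs breadth-first search over an abstract game tree; successors(state), which yields (action, next_state) pairs, and is_terminal(state) are hypothetical stand-ins for a forward model of the game, and states are assumed to be hashable:

```python
from collections import deque

def breadth_first_search(root, successors, is_terminal):
    """Return the first root-level action of a path that reaches a terminal state."""
    frontier = deque([(root, None)])   # (state, action taken at the root)
    visited = {root}
    while frontier:
        state, first_action = frontier.popleft()
        if is_terminal(state):
            return first_action
        for action, child in successors(state):
            if child not in visited:
                visited.add(child)
                # remember which root-level move led down this branch
                root_move = first_action if first_action is not None else action
                frontier.append((child, root_move))
    return None   # the (finite) tree held no terminal state
```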

2. Best-First Search
In best-first search, the expansion of nodes in the search tree is informed by some knowledge
about the goal state. In general, the node that is closest to the goal state by some criterion is
expanded first. The most well-known best-first search algorithm is A* (pronounced A star). The
A* algorithm keeps a list of “open” nodes, which are next to an explored node but which have not
themselves been explored. For each open node, an estimate of its distance from the goal is made.
New nodes are chosen to explore based on a lowest cost basis, where the cost is the distance from
the origin node plus the estimate of the distance to the goal. A* can easily be understood as navigation in two- or three-dimensional space. Variants of this algorithm are therefore commonly
used for path finding in games [Sturtevant. N (2008)].

Best-First Search for Ms Pac-Man

Best-first search can be applied to Ms Pac-Man in the form of A*. Following the paradigm of the
2009 Mario AI competition champion, Ms Pac-Man can be controlled by an A* algorithm that
searches through possible game states within a short time frame and takes a decision on where to
move next (up, down, left or right). The game state can be represented in various ways: from a
very direct, yet costly, representation that takes ghost and pellet coordinates into account to an
indirect representation that considers the distance to the closest ghost or pellet. Regardless of the
representation chosen, A* requires the design of a cost function that will drive the search. Relevant cost functions for Ms Pac-Man would normally reward moves to areas containing pellets and penalize moves to areas containing ghosts.
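
The loop itself is compact, as the Python sketch below shows. Here successors(state) yields (action, next_state, step_cost) triples, heuristic(state) estimates the remaining cost, and is_goal(state) tests for the target; all three are hypothetical stand-ins for a forward model plus a cost design like the one just described:

```python
import heapq
from itertools import count

def a_star(start, successors, heuristic, is_goal):
    """Return the root-level action starting the lowest-cost path found."""
    tie = count()   # tie-breaker so heapq never compares states directly
    open_list = [(heuristic(start), next(tie), 0.0, start, None)]
    best_g = {start: 0.0}   # cheapest known cost from the origin to each state
    while open_list:
        _, _, g, state, first_action = heapq.heappop(open_list)
        if is_goal(state):
            return first_action
        for action, child, step_cost in successors(state):
            g_child = g + step_cost
            if g_child < best_g.get(child, float("inf")):
                best_g[child] = g_child
                root_move = first_action if first_action is not None else action
                # priority = cost so far + estimated distance to the goal
                heapq.heappush(open_list,
                               (g_child + heuristic(child), next(tie),
                                g_child, child, root_move))
    return None
```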

3. Monte Carlo Tree Search


Compared to the other tree search methods, Monte Carlo Tree Search (MCTS) does not search all branches of the search tree to an even depth; instead it concentrates on the more promising branches. This makes it possible to search certain branches to a considerable depth even though the branching factor is high. Further, to get around the lack of good evaluation functions, determinism, and perfect information, the standard formulation of MCTS uses rollouts to estimate the quality of a game state, playing randomly from that state until the end of the game to observe the expected win (or loss) outcome. The utility values obtained via these random simulations may be used efficiently to adjust the policy towards a best-first strategy (a minimax tree approximation). The core loop of the MCTS algorithm can be divided into four steps: Selection, Expansion (the first two steps are also known as the tree policy), Simulation and Back-propagation [Chaslot. G. M. J. B (2008)].

Figure 2-5 The four basic steps of MCTS exemplified through one iteration of the algorithm. The figure is a recreation of the corresponding MCTS outline figure by Chaslot et al. [Chaslot. G. M. J. B (2008)].

MCTS for Ms Pac-Man

MCTS is applicable to the real-time control of the Ms Pac-Man agent. We will outline the approach followed by Pepels et al. [Pepels. T (2014)], given its success in obtaining high scores for Ms Pac-Man. When MCTS is used for real-time decision making, a number of challenges become critical. First, the algorithm has a limited rollout computational budget, which increases the importance of heuristic knowledge. Second, the action space can be particularly fine-grained, which suggests that macro-actions are a more powerful way to model the game tree; otherwise the agent’s planning will be very short-term. Third, there might be no terminal node in sight, which calls for good heuristics and possibly restricting the simulation depth. The MCTS agent of Pepels et al. [Pepels. T (2014)] managed to cope with all of the above challenges by using a restricted game tree and a junction-based game state representation (see Fig. 2-6).
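
For concreteness, the Python sketch below implements the four steps in a standard UCT formulation. The forward-model interface (legal_actions, apply, rollout_value) is a hypothetical assumption, and the macro-actions, restricted tree depth, and heuristic rollouts that Pepels et al. add on top are omitted:

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}              # action -> child Node
        self.visits, self.value = 0, 0.0

def uct_search(root_state, legal_actions, apply, rollout_value,
               iterations=1000, c=1.41):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes via UCB1
        while node.children and len(node.children) == len(legal_actions(node.state)):
            node = max(node.children.values(),
                       key=lambda ch: ch.value / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. Expansion: add one untried action as a new leaf
        untried = [a for a in legal_actions(node.state) if a not in node.children]
        if untried:
            action = random.choice(untried)
            node.children[action] = Node(apply(node.state, action), parent=node)
            node = node.children[action]
        # 3. Simulation: random rollout from the leaf to the end of the game
        reward = rollout_value(node.state)
        # 4. Back-propagation: update statistics along the path to the root
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # act with the most visited root action
    return max(root.children, key=lambda a: root.children[a].visits)
```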

Figure 2-6 The junction-based representation of a game state for the Maastricht MCTS controller [Pepels. T (2014)]. All letter nodes refer to game tree nodes (decisions) for Ms Pac-Man. Image adapted from [Pepels. T (2014)] with permission from the authors.

C. AI methods: Supervised Learning
Supervised learning requires a set of labeled training examples; hence supervised. More
specifically, the training signal comes as a set of supervised labels on the data (e.g., this is an apple
whereas that one is a pear) which acts upon a set of characterizations of these labels (e.g., this
apple has red color and medium size). The ultimate goal of supervised learning is not to merely
learn from the input-output pairs but to derive a function that approximates (better, imitates) their
relationship. The derived function should be able to map well to new and unseen instances of input
and output pairs [Bishop. C. M. (2006)].

1. ANNs
ANNs learn by, first, evaluating a cost function that measures the quality of any given set of weights and, second, using a search strategy within the space of possible solutions (i.e., the weight space). These two aspects are outlined in the following subsections.

Figure 2-7 An MLP example with three inputs, one hidden layer containing four hidden neurons

and two outputs. The ANN has labeled and ordered neurons and example connection weight labels.

Bias weights bj are not illustrated in this example but are connected to each neuron j of the ANN.

Cost (Error) Function

Before we attempt to adjust the weights to approximate f, we need some measure of MLP performance. The most common performance measure for training ANNs in a supervised manner is the squared Euclidean distance (error) between the vector of actual outputs of the ANN (a) and the vector of desired labeled outputs (y):

$$E = \sum_{j} (a_j - y_j)^2$$

where the sum is taken over all the output neurons j (the neurons in the final layer). Note that the $y_j$ labels are constant values and, more importantly, that E is a function of all the weights of the ANN, since the actual outputs depend on them. As we will see below, ANN training algorithms build strongly upon this relationship between error and weights.

Back-propagation

The back-propagation [Rumelhart. D. E. (1986)] algorithm is based on gradient descent optimization and is arguably the most common algorithm for training ANNs. Back-propagation stands for backward propagation of errors, as it calculates the weight updates that minimize the error function we defined earlier, working from the output layer back to the input layer.

ANNs for Ms Pac-Man

One straightforward way to use ANNs in Ms Pac-Man is to attempt to imitate expert players of
the game. Thus, one can ask experts to play the game and record their play-throughs, through
which a number of features can be extracted and used as the input of the ANN. The resolution of
the ANN input may vary from simple statistics of the game—such as the average distance between
ghosts and Ms Pac-Man—to detailed pixel-to-pixel RGB values of the game level image. The
output data, on the other hand, may contain the actions selected by Ms Pac-Man in each frame of
the game. Given the input and desired output pairs, the ANN is trained via back-propagation to
predict the action performed by expert players (ANN output) given the current game state (ANN
input). The size (width and depth) of the ANN depends on both the amount of data available from
the expert Ms Pac-Man players and the size of the input vector considered.
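
As a minimal sketch of this setup, the numpy snippet below trains a one-hidden-layer MLP with back-propagation, i.e., gradient descent on the squared error E defined earlier. The four input features, the four-way action output and the randomly generated “expert” data are illustrative assumptions standing in for real play-through recordings:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 8, 4        # e.g., 4 game features -> 4 moves

# Hypothetical expert data: rows of game-state features, one-hot expert moves.
X = rng.random((100, n_in))
Y = np.eye(n_out)[rng.integers(0, n_out, 100)]

W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
b2 = np.zeros(n_out)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

eta = 0.5                               # learning rate
for epoch in range(1000):
    # forward pass: input -> hidden -> output activations
    h = sigmoid(X @ W1 + b1)
    a = sigmoid(h @ W2 + b2)
    # backward pass: gradient of the squared error
    # (the constant factor 2 is absorbed into the learning rate)
    delta_out = (a - Y) * a * (1.0 - a)
    delta_hid = (delta_out @ W2.T) * h * (1.0 - h)
    # gradient-descent weight updates, averaged over the batch
    W2 -= eta * (h.T @ delta_out) / len(X)
    b2 -= eta * delta_out.mean(axis=0)
    W1 -= eta * (X.T @ delta_hid) / len(X)
    b1 -= eta * delta_hid.mean(axis=0)

# predicted move for a game state: index of the largest output neuron
print(np.argmax(sigmoid(sigmoid(X[:1] @ W1 + b1) @ W2 + b2)))
```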

2. Decision Tree Learning


In decision tree learning [Breiman. L (1984)], the function f we attempt to derive uses a decision
tree representation which maps attributes of data observations to their target values. The former
(inputs) are represented as the nodes and the latter (outputs) are represented as the leaves of the
tree. The possible values of each node (input) are represented by the various branches of that node.
As with the other supervised learning algorithms, decision trees can be classified depending on the output data type they attempt to learn. The goal of decision tree learning is to construct a mapping
(a tree model) that predicts the value of target outputs based on a number of input attributes.

Decision Trees for Ms Pac-Man

As with ANNs, decision tree learning requires data to be trained on. Presuming that data from
expert Ms Pac-Man players would be of good quality and quantity, decision trees can be
constructed to predict the strategy of Ms Pac-Man based on a number of ad-hoc designed attributes
of the game state. Figure 2.16 illustrates a simplified hypothetical decision tree for controlling Ms Pac-Man. According to that example, if a ghost is nearby then Ms Pac-Man checks whether power pills are available within a close distance and aims for those; otherwise it takes actions to evade the ghost. If, alternatively, no ghosts are visible, Ms Pac-Man checks for pellets. If those are nearby or at a fair distance then it aims for them; otherwise it aims for the fruit, if one is available on the level. It is important to note that the leaves of the tree in our example represent control strategies (macro-actions) rather than actual actions (up, down, left, and right) for Ms Pac-Man.
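
For illustration, the short scikit-learn sketch below fits a decision tree of this kind. The three game-state attributes, the tiny hand-made dataset and the macro-action labels are hypothetical; real expert data would be far larger and richer:

```python
from sklearn.tree import DecisionTreeClassifier

# Each row: [ghost_distance, power_pill_distance, pellet_distance]
X = [
    [2, 1, 5],   # ghost nearby, power pill close -> go for the power pill
    [2, 9, 5],   # ghost nearby, no power pill    -> evade the ghost
    [9, 9, 1],   # no ghost, pellets nearby       -> seek pellets
    [9, 9, 8],   # no ghost, pellets far          -> go for the fruit
]
y = ["go_to_power_pill", "evade_ghost", "seek_pellets", "go_to_fruit"]

# A shallow tree keeps the learned strategy readable, as in Fig. 2.16
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(tree.predict([[3, 2, 6]]))   # -> likely "go_to_power_pill"
```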

3. Conclusion
AI can either play to win or play in order to create a particular experience for a human player or
observer [Yannakakis. G. N (2018)]. The former goal involves maximizing a utility that maps to
the game performance whereas the latter goal involves objectives beyond merely winning such as
engagement, believability, balance and interestingness. AI as an actor can take the role of either a
player character or a non-player character that exists in a game. The characteristics of games an
AI method needs to consider when playing include the number of players, the level of stochasticity
of the game, the amount of observability available, the action space and the branching factor, and
the time granularity. Further, when we design an algorithm to play a game we also need to consider
algorithmic aspects such as the state representation, the existence of a forward model, the training
time available, and the number of games AI can play. The above roles and characteristics were
detailed in the first part of the chapter as they are important and relevant regardless of the AI
method applied. When it comes to the methods covered in this chapter, we focused on ad-hoc behavior authoring, tree search, and supervised learning approaches for playing games. In a sense, we tailored these general AI methods to the task of game playing.

References
Alexander. B. (2002). The beauty of response curves. AI Game Programming Wisdom, page 78.

Bishop. C. M. (2006). Pattern Recognition and Machine Learning. Springer.

Breiman. L, Friedman. J, Stone. C. J and Olshen. R. A. (1984). Classification and Regression Trees. CRC Press.

Champandard. A. J. (2007). Understanding Behavior Trees. AiGameDev.com.

Chaslot. G. M. J. B, Winands. M. H. M, van den Herik. H. J, Uiterwijk. J. W. H. M, and Bouzy. B. (2008). Progressive strategies for Monte-Carlo tree search. New Mathematics and Natural Computation, 4(03):343–357.

Gill. A. (1962). Introduction to the Theory of Finite-State Machines.

Mark. D and Dill. K. (2010). Improving AI decision modeling through utility theory. In Game Developers Conference.

Mark. D. (2009). Behavioral Mathematics for Game AI. Charles River Media.

Pepels. T, Winands. M. H. M, and Lanctot. M. (2014). Real-time Monte Carlo tree search in Ms Pac-Man. IEEE Transactions on Computational Intelligence and AI in Games, 6(3):245–257.

Rumelhart. D. E, Hinton. G. E, and Williams. R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088):533–536.

Russell. S and Norvig. P. (1995). Artificial Intelligence: A Modern Approach. Prentice-Hall, Englewood Cliffs.

STUDY PANEL (2016) by Stanford University. Artificial Intelligence and Life in 2030. Made available under a Creative Commons Attribution-NoDerivatives 4.0 International License: https://creativecommons.org/licenses/by-nd/4.0/

Sturtevant. N (2008). Memory-Efficient Pathfinding Abstractions. In AI Game Programming Wisdom 4. Charles River Media.

Togelius. J, Yannakakis. G. N, Stanley. K. O, and Browne. C. (2011). Search-based procedural content generation: A taxonomy and survey. IEEE Transactions on Computational Intelligence and AI in Games, 3(3):172–186.

Woodbury. R. F (1991). Searching for designs: Paradigm and practice. Building and Environment, 26(1):61–73.

Yannakakis. G. N and Togelius. J. (2018). Artificial Intelligence and Games. New York, NY, USA: Springer. Available at: http://gameaibook.org/book.pdf
