Lab Manual
LABORATORY WORKBOOK
Name ____________________
Roll No ___________________
Date ______________________
Marks Obtained ____________
Signature___________________
AI Lab Topics
S.No.  Object of Experiments  Remarks  Date  Signature
1 Introduction to Python
4 Programming agents
5 Uninformed search
6 Informed Search
9 Adversarial Search
12 Reinforcement Learning
13 Bayesian Networks
14 Markov Chain
LAB # 01
INTRODUCTION
The purpose of this lab is to get you familiar with Python and its IDE
Name
Date
Registration No
Department
Quiz
Assignment
___________________
Lab Instructor Signature
Experiment
INTRODUCTION
01
OBJECTIVE
Python Installation
To get started working with Python 3, you’ll need to have access to the Python interpreter. There
are several common ways to accomplish this:
• Python can be obtained from the Python Software Foundation website at python.org.
Typically, that involves downloading the appropriate installer for your operating system
and running it on your machine.
• Some operating systems, notably Linux, provide a package manager that can be run to
install Python.
• On macOS, the best way to install Python 3 involves installing a package manager called
Homebrew. You’ll see how to do this in the relevant section in the tutorial.
• On mobile operating systems like Android and iOS, you can install apps that provide a
Python programming environment. This can be a great way to practice your coding skills
on the go.
Alternatively, there are several websites that allow you to access a Python interpreter online
without installing anything on your computer at all.
It is highly unlikely that your Windows system shipped with Python already installed. Windows
systems typically do not. Fortunately, installing does not involve much more than downloading
the Python installer from the python.org website and running it. Let’s take a look at how to install
Python 3 on Windows:
Step 1: Download the Python 3 Installer
1. Open a browser window and navigate to the Download page for Windows at python.org.
2. Underneath the heading at the top that says Python Releases for Windows, click on the
link for the Latest Python 3 Release - Python 3.x.x. (As of this writing, the latest is
Python 3.6.5.)
3. Scroll to the bottom and select either Windows x86-64 executable installer for 64-bit or
Windows x86 executable installer for 32-bit. (See below.)
• If your system has a 32-bit processor, then you should choose the 32-bit installer.
• On a 64-bit system, either installer will actually work for most purposes. The 32-bit
version will generally use less memory, but the 64-bit version performs better for
applications with intensive computation.
• If you’re unsure which version to pick, go with the 64-bit version.
Note: Remember that if you get this choice “wrong” and would like to switch to another version
of Python, you can just uninstall Python and then re-install it by downloading another installer
from python.org.
To install PyCharm, download the installer from the JetBrains website and run it.
THEORY
Operators
The Python interpreter can be used to evaluate expressions, for example simple arithmetic
expressions. If you enter such expressions at the prompt ( >>>) they will be evaluated and the
result will be returned on the next line.
>>> 1 + 1
2
>>> 2 * 3
6
Boolean operators also exist in Python to manipulate the primitive True and False values.
>>> 1==0
False
>>> not (1==0)
True
>>> (2==2) and (2==3)
False
>>> (2==2) or (2==3)
True
Strings
Like Java, Python has a built in string type. The + operator is overloaded to do string
concatenation on string values.
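For example:
>>> 'artificial' + ' ' + 'intelligence'
'artificial intelligence'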
There are many built-in methods which allow you to manipulate strings.
>>> 'artificial'.upper()
'ARTIFICIAL'
>>> 'HELP'.lower()
'help'
>>> len('Help')
4
Notice that we can use either single quotes ' ' or double quotes " " to surround a string. This
allows for easy nesting of strings, for example:
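>>> "Isn't this a nice string?"
"Isn't this a nice string?"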
In Python, you do not have to declare variables before you assign to them.
Python comes equipped with some useful built-in data structures, broadly similar to Java's
collections package.
Lists
Python also allows negative-indexing from the back of the list. For instance, fruits[-1] will
access the last element 'banana':
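>>> fruits = ['apple', 'orange', 'pear', 'banana']   # (list creation assumed; contents inferred from the examples below)
>>> fruits[-1]
'banana'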
>>> fruits[-2]
'pear'
>>> fruits.pop()
'banana'
>>> fruits
['apple', 'orange', 'pear']
>>> fruits.append('grapefruit')
>>> fruits
['apple', 'orange', 'pear', 'grapefruit']
>>> fruits[-1] = 'pineapple'
>>> fruits
['apple', 'orange', 'pear', 'pineapple']
We can also index multiple adjacent elements using the slice operator. For instance,
fruits[1:3], returns a list containing the elements at position 1 and 2. In general
fruits[start:stop] will get the elements in start, start+1, ..., stop-1. We can also do
fruits[start:] which returns all elements starting from the start index. Also fruits[:end]
will return all elements before the element at position end:
>>> fruits[0:2]
['apple', 'orange']
>>> fruits[:3]
['apple', 'orange', 'pear']
>>> fruits[2:]
['pear', 'pineapple']
>>> len(fruits)
4
The items stored in lists can be any Python data type. So for instance we can have lists of lists:
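>>> lst = [['a', 'b'], [1, 2], 'hi', 8.3]   # (an illustrative list)
>>> lst[0]
['a', 'b']
>>> lst[0][1]
'b'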
Tuples
A data structure similar to the list is the tuple, which is like a list except that it is immutable once
it is created (i.e. you cannot change its content once created). Note that tuples are surrounded
with parentheses while lists have square brackets.
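For example (an illustrative pair):
>>> pair = (3, 5)
>>> pair[0]
3
>>> x, y = pair
>>> x
3
>>> pair[1] = 6
TypeError: 'tuple' object does not support item assignment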
The attempt to modify an immutable structure raised an exception. Exceptions indicate errors:
index out of bounds errors, type errors, and so on will all report exceptions in this way.
Sets
A set is another data structure that serves as an unordered list with no duplicate items. Below, we
show how to create a set:
>>> shapes = ['circle', 'square', 'triangle', 'circle']
>>> setOfShapes = set(shapes)
Next, we show how to add things to the set, test if an item is in the set, and perform common set
operations (difference, intersection, union):
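>>> setOfShapes.add('polygon')            # add an item
>>> setOfShapes
{'circle', 'square', 'triangle', 'polygon'}
>>> 'circle' in setOfShapes               # membership test
True
>>> 'rhombus' in setOfShapes
False
>>> favoriteShapes = {'circle', 'triangle', 'hexagon'}
>>> setOfShapes - favoriteShapes          # difference
{'square', 'polygon'}
>>> setOfShapes & favoriteShapes          # intersection
{'circle', 'triangle'}
>>> setOfShapes | favoriteShapes          # union
{'circle', 'square', 'triangle', 'polygon', 'hexagon'}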
Note that the objects in the set are unordered; you cannot assume that their traversal or
print order will be the same across machines!
Dictionaries
The last built-in data structure is the dictionary which stores a map from one type of object (the
key) to another (the value). The key must be an immutable type (string, number, or tuple). The
value can be any Python data type.
Note: In the example below, the printed order of the keys returned by Python could be different
than shown below. The reason is that unlike lists which have a fixed ordering, a dictionary is
simply a hash table for which there is no fixed ordering of the keys (like HashMaps in Java). The
order of the keys depends on how exactly the hashing algorithm maps keys to buckets, and will
usually seem arbitrary. Your code should not rely on key ordering, and you should not be
surprised if even a small modification to how your code uses a dictionary results in a new key
ordering.
>>> studentIds = {'knuth': 42.0, 'turing': 56.0, 'nash': 92.0}
>>> studentIds['turing']
56.0
>>> studentIds['nash'] = 'ninety-two'
>>> studentIds
{'knuth': 42.0, 'turing': 56.0, 'nash': 'ninety-two'}
>>> del studentIds['knuth']
>>> studentIds
{'turing': 56.0, 'nash': 'ninety-two'}
>>> studentIds['knuth'] = [42.0,'forty-two']
>>> studentIds
{'knuth': [42.0, 'forty-two'], 'turing': 56.0, 'nash': 'ninety-two'}
>>> studentIds.keys()
dict_keys(['knuth', 'turing', 'nash'])
>>> studentIds.values()
dict_values([[42.0, 'forty-two'], 56.0, 'ninety-two'])
>>> studentIds.items()
dict_items([('knuth', [42.0, 'forty-two']), ('turing', 56.0), ('nash', 'ninety-two')])
>>> len(studentIds)
3
Writing Scripts
Now that you've got a handle on using Python interactively, let's write a simple Python script that
demonstrates Python's for loop. Open the file called foreach.py, which should contain the
following code:
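# (reconstruction of foreach.py; the fruit list and the prices are
#  inferred from the output shown below)
fruits = ['apples', 'oranges', 'pears', 'bananas']
for fruit in fruits:
    print(fruit + ' for sale')

fruitPrices = {'apples': 2.00, 'oranges': 1.50, 'pears': 1.75}
for fruit, price in fruitPrices.items():
    if price < 2.00:
        print('%s cost %f a pound' % (fruit, price))
    else:
        print(fruit + ' are too expensive!')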
At the command line, use the following command in the directory containing foreach.py:
[cs188-ta@nova ~/tutorial]$ python foreach.py
apples for sale
oranges for sale
pears for sale
bananas for sale
apples are too expensive!
oranges cost 1.500000 a pound
pears cost 1.750000 a pound
Remember that the print statements listing the costs may be in a different order on your screen
than in this tutorial; that's due to the fact that we're looping over dictionary keys, which are
unordered. To learn more about control structures (e.g., if and else) in Python, check out the
official Python tutorial section on this topic.
If you like functional programming you might also like map and filter:
>>> list(map(lambda x: x * x, [1,2,3]))
[1, 4, 9]
>>> list(filter(lambda x: x > 3, [1,2,3,4,5,4,3,2,1]))
[4, 5, 4]
Learn about the methods Python provides for strings. To see what methods Python provides for a
datatype, use the dir and help commands:
>>> s = 'abc'
>>> dir(s)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__',
'__ge__', '__getattribute__', '__getitem__', '__getnewargs__',
'__getslice__', '__gt__', '__hash__', '__init__','__le__', '__len__',
'__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__','__repr__', '__rmod__', '__rmul__', '__setattr__', '__str__',
'capitalize', 'center', 'count', 'decode', 'encode', 'endswith',
'expandtabs', 'find', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower',
'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip',
'replace', 'rfind','rindex', 'rjust', 'rsplit', 'rstrip', 'split',
'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper',
'zfill']
>>> help(s.find)
Help on built-in function find:

find(...) method of builtins.str instance
    S.find(sub[, start[, end]]) -> int

    Return the lowest index in S where substring sub is found,
    such that sub is contained within S[start:end].  Optional
    arguments start and end are interpreted as in slice notation.

    Return -1 on failure.

>>> s.find('b')
1
Try out some of the string functions listed in dir (ignore those with underscores '_' around the
method name).
Exercise: Lists
Play with some of the list functions. You can find the methods you can call on an object via the
dir and get information about them via the help command:
>>> dir(list)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__',
'__delslice__', '__doc__', '__eq__', '__ge__', '__getattribute__',
'__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__',
'__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__',
'__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__',
'__rmul__', '__setattr__', '__setitem__', '__setslice__', '__str__', 'append',
'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
>>> help(list.reverse)
Help on built-in function reverse:
reverse(...)
L.reverse() -- reverse *IN PLACE*
Note: Ignore functions with underscores "_" around the names; these are private helper methods.
Press 'q' to back out of a help screen.
Exercise: Dictionaries
Use dir and help to learn about the functions you can call on dictionaries.
Write a list comprehension which, from a list, generates a lowercased version of each
string that has length greater than five.
Exercise: Functions
Name
Date
Registration No
Department
Total Marks
Marks Obtained
Remarks
___________________
Lab Instructor Signature
Experiment
INTRODUCTION
02
OBJECTIVE
You will learn syntax and semantics of advanced concepts of Python, and get introduced to structured
data objects.
THEORY
Beware of Indentation!
Unlike many other languages, Python uses the indentation in the source code for interpretation.
So for instance, for the following script:
if 0 == 1:
    print('We are in a world of arithmetic pain')
print('Thank you for playing')

will output

Thank you for playing

but if both print statements were indented under the if, there would be no output. The moral of the story: be careful how you indent! It's best to use four
spaces for indentation -- that's what the course code uses.
Tabs vs Spaces
Because Python uses indentation for code evaluation, it needs to keep track of the level of
indentation across code blocks. This means that if your Python file switches from using tabs as
indentation to spaces as indentation, the Python interpreter will not be able to resolve the
ambiguity of the indentation level and throw an exception. Even though the code can be lined up
visually in your text editor, Python "sees" a change in indentation and most likely will throw an
exception (or rarely, produce unexpected behavior).
This most commonly happens when opening up a Python file that uses an indentation scheme
opposite to what your text editor uses (e.g., your text editor uses spaces while the file uses
tabs). When you write new lines in a code block, there will be a mix of tabs and spaces, even
though the whitespace looks aligned. For a longer discussion on tabs vs spaces, see this discussion
on StackOverflow.
Writing Functions
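The buyFruit function called in the main block below is defined earlier in the same script; a
sketch consistent with its use (the prices dictionary mirrors the one in foreach.py) is:

fruitPrices = {'apples': 2.00, 'oranges': 1.50, 'pears': 1.75}

def buyFruit(fruit, numPounds):
    if fruit not in fruitPrices:
        print("Sorry we don't have %s" % fruit)
    else:
        cost = fruitPrices[fruit] * numPounds
        print("That'll be %f please" % cost)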
# Main function
if __name__ == '__main__':
    buyFruit('apples', 2.4)
    buyFruit('coconuts', 2)
Rather than having a main function as in Java, the __name__ == '__main__' check is used to
delimit expressions which are executed when the file is called as a script from the command line.
The code after the main check is thus the same sort of code you would put in a main function in
Java.
Object Basics
Although this isn't a class in object-oriented programming, you'll have to use some objects in the
programming projects, and so it's worth covering the basics of objects in Python. An object
encapsulates data and provides functions for interacting with that data.
Defining Classes
Here is an example of defining a class, FruitShop:
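The listing below is a sketch reconstructed from the discussion that follows; the method names
getCostPerPound and getPriceOfOrder are assumptions in the style of the CS188 tutorial this lab
follows:

class FruitShop:
    def __init__(self, name, fruitPrices):
        '''
        name: name of the fruit shop
        fruitPrices: dictionary mapping fruit strings to prices per pound
        '''
        self.fruitPrices = fruitPrices
        self.name = name
        print('Welcome to %s fruit shop' % name)

    def getCostPerPound(self, fruit):
        # return None for fruit we do not stock
        if fruit not in self.fruitPrices:
            return None
        return self.fruitPrices[fruit]

    def getPriceOfOrder(self, orderList):
        # orderList: list of (fruit, numPounds) tuples
        totalCost = 0.0
        for fruit, numPounds in orderList:
            costPerPound = self.getCostPerPound(fruit)
            if costPerPound is not None:
                totalCost += numPounds * costPerPound
        return totalCost

    def getName(self):
        return self.name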
The FruitShop class has some data, the name of the shop and the prices per pound of some fruit,
and it provides functions, or methods, on this data. What advantage is there to wrapping this data
in a class?
So how do we make an object and use it? Make sure you have the FruitShop implementation in
shop.py. We then import the code from this file (making it accessible to other scripts) using
import shop, since shop.py is the name of the file. Then, we can create FruitShop objects as
follows:
import shop
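# (usage sketch; the shop name and prices are illustrative)
shopName = 'the Berkeley Bowl'
fruitPrices = {'apples': 1.00, 'oranges': 1.50, 'pears': 1.75}
berkeleyShop = shop.FruitShop(shopName, fruitPrices)
applePrice = berkeleyShop.getCostPerPound('apples')
print('Apples cost %.2f at %s.' % (applePrice, shopName))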
So what just happened? The import shop statement told Python to load all of the functions and
classes in shop.py. The line berkeleyShop = shop.FruitShop(shopName, fruitPrices)
constructs an instance of the FruitShop class defined in shop.py, by calling the __init__
function in that class. Note that we only passed two arguments in, while __init__ seems to take
three arguments: (self, name, fruitPrices). The reason for this is that all methods in a class
have self as the first argument. The self variable's value is automatically set to the object itself;
when calling a method, you only supply the remaining arguments. The self variable contains all
the data (name and fruitPrices) for the current specific instance (similar to this in Java). The
print statements use the substitution operator (described in the Python docs if you're curious).
The following example illustrates how to use static and instance variables in Python.
Create the person_class.py containing the following code:
class Person:
    population = 0

    def __init__(self, myAge):
        self.age = myAge
        Person.population += 1

    def get_population(self):
        return Person.population

    def get_age(self):
        return self.age
In the code above, age is an instance variable and population is a static variable. population is
shared by all instances of the Person class whereas each instance has its own age variable.
This tutorial has briefly touched on some major aspects of Python that will be relevant to the
course. Here are some more useful tidbits:
>>> from importlib import reload  # in Python 3, reload lives in importlib
>>> reload(shop)
Python Object Inheritance
Inheritance is the process by which one class takes on the attributes and methods of another.
Newly formed classes are called child classes, and the classes that child classes are derived from
are called parent classes.
It’s important to note that child classes override or extend the functionality (e.g., attributes and
behaviors) of parent classes. In other words, child classes inherit all of the parent’s attributes and
behaviors but can also specify different behavior to follow. The most basic type of class is an
object, which generally all other classes inherit as their parent.
When you define a new class, Python 3 implicitly uses object as the parent class, so the
following two definitions are equivalent:

class Dog:
    pass

class Dog(object):
    pass
Note: In Python 2.x there's a distinction between new-style and old-style classes. I won't go into
detail here, but you'll generally want to specify object as the parent class to ensure you're
defining a new-style class if you're writing Python 2 OOP code.
Each breed of dog has slightly different behaviors. To take these into account, let’s create
separate classes for each breed. These are child classes of the parent Dog class.
A sketch of the parent class and the breed subclasses, consistent with the surviving fragment
and with the output shown below (the RussellTerrier breed name is an assumption):
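class Dog:
    # Class attribute
    species = 'mammal'

    # Initializer / instance attributes
    def __init__(self, name, age):
        self.name = name
        self.age = age

# Child class (inherits from Dog)
class RussellTerrier(Dog):
    def run(self, speed):
        return "%s runs %s" % (self.name, speed)

# Child class (inherits from Dog)
class Bulldog(Dog):
    def run(self, speed):
        return "%s runs %s" % (self.name, speed)

# Child classes inherit attributes and behaviors from the parent class
jim = Bulldog("Jim", 12)
print("%s is %s years old" % (jim.name, jim.age))

# Child classes can also define their own behavior
print(jim.run("slowly"))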
Read the comments aloud as you work through this program to help you understand what’s
happening, then before you run the program, see if you can predict the expected output.
You should see:
Jim is 12 years old
Jim runs slowly
We can check which class an object belongs to with the built-in isinstance function:
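A sketch of those checks (the names and ages are illustrative):

>>> jim = Dog('Jim', 12)
>>> julie = Dog('Julie', 100)
>>> johnnywalker = RussellTerrier('Johnny Walker', 4)
>>> isinstance(julie, Dog)
True
>>> isinstance(johnnywalker, Bulldog)
False
>>> isinstance(julie, jim)
TypeError: isinstance() arg 2 must be a type or tuple of types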
Make sense? Both jim and julie are instances of the Dog() class, while johnnywalker is not an
instance of the Bulldog() class. Then as a sanity check, we tested if julie is an instance of jim,
which is impossible since jim is an instance of a class rather than a class itself—hence the
reason for the TypeError.
The SomeBreed() class inherits the species from the parent class, while the SomeOtherBreed()
class overrides the species, setting it to reptile.
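A sketch of what that looks like (the breed names mirror the sentence above):

class SomeBreed(Dog):
    pass  # inherits species = 'mammal' from Dog

class SomeOtherBreed(Dog):
    species = 'reptile'  # overrides the class attribute

>>> frank = SomeBreed('Frank', 3)
>>> frank.species
'mammal'
>>> beans = SomeOtherBreed('Beans', 2)
>>> beans.species
'reptile'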
Exercises
3. Write a quickSort function in Python using list comprehensions. Use the first element as
the pivot.
class Dog:
    # Class attribute
    species = 'mammal'

    # Initializer / instance attributes
    def __init__(self, name, age):
        self.name = name
        self.age = age
Using the same Dog class, instantiate three new dogs, each with a different age. Then
write a function called get_biggest_number() that takes any number of ages (*args)
and returns the oldest one. Then output the age of the oldest dog like so:
Create a Pets class that holds instances of dogs; this class is completely separate from the Dog
class. In other words, the Dog class does not inherit from the Pets class. Then assign three dog
instances to an instance of the Pets class. Start with the following code below. Save the file as
pets_class.py. Your output should look like this:
Name
Date
Registration No
Department
Total Marks
Marks Obtained
Remarks
___________________
Lab Instructor Signature
Experiment 3
INTRODUCTION
OBJECTIVE
THEORY
Introduction
class Car(object):
    num_wheels = 4

    def __init__(self, color):
        self.wheels = Car.num_wheels
        self.color = color

    def drive(self):
        # a car that has lost wheels cannot drive
        if self.wheels < Car.num_wheels:
            return self.color + ' car cannot drive!'
        return self.color + ' car goes vroom!'

    def pop_tire(self):
        if self.wheels > 0:
            self.wheels -= 1
● class: a blueprint for how to build a certain type of object. The Car class
(shown above) describes the behavior and data that all Car objects have.
● instance: a particular occurrence of a class. In Python, we create
instances of a class like this:
>>> my_car = Car('red')
my_car is an instance of the Car class.
● attribute or field: a variable that belongs to the class. Think of an attribute as a quality
of the object: cars have wheels and color, so we have given our Car class self.wheels
and self.color attributes. We can access attributes using dot notation:
>>> my_car.color
'red'
>>> my_car.wheels
4
● method: Methods are just like normal functions, except that they are tied to an
instance or a class. Think of a method as a "verb" of the class: cars can drive and also
pop their tires, so we have given our Car class the methods drive and pop_tire. We
call methods using dot notation:
>>> my_car = Car('red')
>>> my_car.drive()
'red car goes vroom!'
● constructor: the method that creates a new instance; in Python it is named __init__.
Our constructor takes in one argument, color, and creates the self.wheels and
self.color attributes:
def __init__(self, color):
    self.wheels = Car.num_wheels
    self.color = color
● self: in Python, self is the first parameter for many methods (in this class, we will
only use methods whose first parameter is self). When a method is called, self is
bound to an instance of the class. For example:
>>> my_car = Car('red')
>>> my_car.drive()
Notice that the drive method takes in self as an argument, but it looks like we
didn't pass one in! This is because the dot notation implicitly passes in my_car as
self for us.
Types of variables
When dealing with OOP, there are three types of variables you should be aware of:
● local variable: These are just like the variables you see in normal
functions — once the function or method is done being called, this variable
is no longer able to be accessed. For example, the color variable in the
__init__ method is a local variable (not the self.color variable).
● instance attribute: Unlike local variables, instance attributes will still be accessible
after method calls have finished. Each instance of a class keeps its own version of the
instance attribute — for example, we might have two Car objects, where one's
self.color is red, and the other's self.color is blue.
>>> car1 = Car('red')
>>> car2 = Car('blue')
>>> car1.color
'red'
>>> car2.color
'blue'
>>> car1.color = 'yellow'
>>> car1.color
'yellow'
>>> car2.color
'blue'
● class attribute: As with instance attributes, class attributes also persist across method
calls. However, unlike instance attributes, all instances of a class will share the same
class attributes. For example, num_wheels is a class attribute of the Car class.
>>> car1 = Car('red')
>>> car2 = Car('blue')
>>> car1.num_wheels
4
>>> car2.num_wheels
4
>>> Car.num_wheels = 2
>>> car1.num_wheels
2
>>> car2.num_wheels
2
Notice that we can access class attributes by saying <class
name>.<attribute>, such as Car.num_wheels, or by saying
<instance>.<attribute>, such as car1.num_wheels.
Question 1
Predict the result of evaluating the following calls in the interpreter. Then try them out
yourself!
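The Account class itself is not listed here; a sketch consistent with the calls below (the
interest value and the method bodies are assumptions) is:

class Account(object):
    interest = 0.02  # a class attribute (value assumed)

    def __init__(self, account_holder):
        self.balance = 0
        self.holder = account_holder

    def deposit(self, amount):
        self.balance = self.balance + amount
        return self.balance

    def withdraw(self, amount):
        if amount > self.balance:
            return 'Insufficient funds'
        self.balance = self.balance - amount
        return self.balance

Assume the instance below was created with a = Account('Jim') (the holder name is illustrative).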
>>> a.holder
______
>>> Account.holder
______
>>> Account.interest
______
>>> a.interest
______
>>> Account.interest = 0.03
>>> a.interest
______
>>> a.deposit(1000)
______
>>> a.balance
______
>>> a.interest = 9001
>>> Account.interest
______
Question 2
Modify the following Person class to add a repeat method, which repeats the last
thing said. See the doctests for an example of its use.
Hint: you will have to modify other methods as well, not just the repeat
method.
class Person(object):
"""Person class.
Inheritance
Question 3
Predict the result of evaluating the following calls in the interpreter. Then try them out
yourself!
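As in Question 1, the CheckingAccount class is not reproduced here; a sketch consistent with
the calls below (the fee and interest values are assumptions) is:

class CheckingAccount(Account):
    interest = 0.01
    withdraw_fee = 1

    def withdraw(self, amount):
        # charge the fee on every withdrawal
        return Account.withdraw(self, amount + self.withdraw_fee)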
>>> c = CheckingAccount("Eric")
>>> c.balance
______
>>> a.deposit(30)
______
>>> c.deposit(30)
______
>>> c.interest
______
Question 4
Suppose now that we wanted to define a class called DoubleTalker to represent people
who always say things twice:
Consider the following three definitions for DoubleTalker that inherit from the
Person class:
Question 5
Modify the Account class from lecture so that it has a new attribute, transactions,
that is a list keeping track of any transactions performed. See the doctest for an example.
class Account(object):
"""A bank account that allows deposits and withdrawals.
Question 6
We'd like to be able to cash checks, so let's add a deposit_check method to our
CheckingAccount class. It will take a Check object as an argument, and check to see
if the payable_to attribute matches the CheckingAccount's holder. If so, it marks
the Check as deposited, and adds the amount specified to the CheckingAccount's
total.
Write an appropriate Check class, and add the deposit_check method to the
CheckingAccount class. Make sure not to copy and paste code! Use inheritance
whenever possible.
See the doctests for examples of how this code should work.
class CheckingAccount(Account):
"""A bank account that charges for withdrawals.
Question 7
We'd like to create a Keyboard class that takes in an arbitrary number of Buttons and
stores these Buttons in a dictionary. The keys in the dictionary will be ints that
represent the position on the Keyboard, and the values will be the respective Buttons.
Fill out the methods in the Keyboard class according to each description, using the
doctests as a reference for the behavior of a Keyboard.
class Keyboard:
    """A Keyboard takes in an arbitrary number of Buttons and
    stores them in a dictionary whose keys are positions and
    whose values are Buttons."""

    def press(self, info):
        """Takes in a position of the button pressed, and
        returns that button's output."""

    def typing(self, typing_input):
        """Takes in a list of positions of buttons pressed, and
        returns the total output."""
Name
Date
Registration No
Department
Total Marks
Marks Obtained
Remarks
___________________
Lab Instructor Signature
Experiment
INTRODUCTION
04
OBJECTIVE
The objective of this lab is to show the loop of interaction between the agent and the
environment
THEORY
'''
@author: dr.aarij
'''
from abc import abstractmethod

class Environment(object):
    '''
    classdocs
    '''
    @abstractmethod
    def __init__(self, n):
        self.n = n

    def executeStep(self, n=1):
        raise NotImplementedError('action must be defined!')

    def executeAll(self):
        raise NotImplementedError('action must be defined!')

    def delay(self, n=100):
        self._delay = n  # stored privately so the method itself is not shadowed
'''
Room class

@author: dr.aarij
'''
class Room:
    def __init__(self, location, status="dirty"):
        self.location = location
        self.status = status
'''
Abstract agent

@author: dr.aarij
'''
from abc import abstractmethod

class Agent(object):
    '''
    classdocs
    '''
    @abstractmethod
    def __init__(self):
        pass

    @abstractmethod
    def sense(self, environment):
        pass

    @abstractmethod
    def act(self):
        pass
The vacuum cleaner agent implements this interface:

class VaccumAgent(Agent):  # (class header reconstructed; the test program instantiates VaccumAgent)
    def __init__(self):
        '''
        Constructor
        '''
        pass

    def sense(self, env):
        self.environment = env

    def act(self):
        if self.environment.currentRoom.status == 'dirty':
            return 'clean'
        if self.environment.currentRoom.location == 'A':
            return 'right'
        return 'left'
Test program

if __name__ == '__main__':
    vcagent = VaccumAgent.VaccumAgent()
    env = TwoRoomVaccumCleanerEnvironment(vcagent)
    env.executeStep(50)
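The TwoRoomVaccumCleanerEnvironment class is not reproduced in this manual; a minimal sketch
consistent with the agent above (the room names 'A' and 'B' and the trace format are
assumptions) is:

import random

class TwoRoomVaccumCleanerEnvironment(Environment):
    def __init__(self, agent):
        self.r1 = Room('A', random.choice(['dirty', 'clean']))
        self.r2 = Room('B', random.choice(['dirty', 'clean']))
        self.agent = agent
        self.currentRoom = self.r1

    def executeStep(self, n=1):
        for _ in range(n):
            self.agent.sense(self)
            action = self.agent.act()
            print('Room %s is %s, agent does: %s' %
                  (self.currentRoom.location, self.currentRoom.status, action))
            if action == 'clean':
                self.currentRoom.status = 'clean'
            elif action == 'right':
                self.currentRoom = self.r2
            else:
                self.currentRoom = self.r1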
Exercises
1. Run the two-room vacuum cleaner agent program and understand it. Convert the program
to a three-room environment.
2. Convert the environment to an 'n'-room environment, where n >= 2.
3. Does the agent ever stop? If not, can you make it stop? Is your program rational?
4. Score your agent: -1 point for moving between rooms, +25 points for cleaning a room that is
dirty, and -10 points for every room that is dirty. The scoring takes place after every 1 second.
5. Convert the agent to a reflex agent with a model. Afterwards, take the sensors away
from the agent, i.e., the agent can no longer perceive anything. Does your agent still work?
If so, why?
EXPERIMENT # 05
INTRODUCTION
The purpose of this lab is to show you how to solve problems using uninformed search
Name
Date
Registration No
Department
Total Marks
Marks Obtained
Remarks
___________________
Lab Instructor Signature
Experiment
INTRODUCTION
05
OBJECTIVE
The goal of this tutorial is to understand how to (i) model problems to solve and (ii) implement
the search algorithms as described in the book.
THEORY
Often we encounter problems that do not have a trivial solution, i.e., more than one step is required
to solve the problem. We can solve these kinds of problems by searching for solutions: we start with
the given state of the problem and then apply the actions that are allowed in that
particular state. We repeat this process until either we reach the solution or we have no more
actions to apply, in which case we declare that no solution exists.
In this lab we will look at some reusable code to model the search problem. In addition, we will
also look at some algorithms to solve these problems.
The SearchProblem class:

from abc import abstractmethod

class SearchProblem(object):
    '''
    classdocs
    '''
    @abstractmethod
    def __init__(self, params): pass

    @abstractmethod
    def initialState(self): pass

    @abstractmethod
    def succesorFunction(self, currentState): pass

    @abstractmethod
    def isGoal(self, currentState): pass

    @abstractmethod
    def __str__(self): pass
The SearchState class:
The SearchState class is an abstraction of what every state should have. Every state should have a
representation of itself, the action that got the search to this particular state, the cost of getting to
this state and the string representation of the state. This string representation comes handy when
maintaining a duplicate state set.
from abc import abstractmethod

class SearchState(object):
    '''
    classdocs
    '''
    @abstractmethod
    def __init__(self, params): pass

    @abstractmethod
    def getCurrentState(self): pass

    @abstractmethod
    def getAction(self): pass

    @abstractmethod
    def getCost(self): pass

    @abstractmethod
    def stringRep(self): pass
Now we will show how the SearchProblem abstract class can be used to model any search
problem. As an example we will model the 8-puzzle problem; in fact, the way we have
modelled it, any valid N-puzzle problem can be solved.
def succesorFunction(self, currentState):
    # (method header reconstructed; the state is unwrapped the same way
    #  as in isGoal below)
    currentState = currentState.getCurrentState()
    nextMoves = []
    emptyRow, emptyColumn = 0, 0
    emptyFound = False
    for i in range(len(currentState)):
        for j in range(len(currentState[i])):
            if currentState[i][j] == 0:
                emptyRow, emptyColumn = i, j
                emptyFound = True
                break
        if emptyFound:
            break

    # check up move
    if emptyRow != 0:
        newState = copy.deepcopy(currentState)
        tempS = newState[emptyRow-1][emptyColumn]
        newState[emptyRow-1][emptyColumn] = 0
        newState[emptyRow][emptyColumn] = tempS
        ep = EightPuzzleState(newState, 'Move Up', 1.0)
        nextMoves.append(ep)

    # (the down, left and right moves are analogous)
    return nextMoves
def isGoal(self, currentState):
    cs = currentState.getCurrentState()
    for i in range(len(cs)):
        for j in range(len(cs[i])):
            if cs[i][j] != self._goalState[i][j]:
                return False
    return True
We also have a Node class that represents a node of the search tree. The class is as follows:

class Node(object):
    '''
    classdocs
    '''
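    # (the body did not survive extraction; the fields are inferred from the
    #  Search class below, which calls Node(state, parentNode, depth, cost, action))
    def __init__(self, state, parentNode, depth, cost, action):
        self.state = state
        self.parentNode = parentNode
        self.depth = depth
        self.cost = cost
        self.action = action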
Search Strategy
As we have discussed in class, the different search strategies like BFS, DFS and UCS differ only
in the way they maintain the fringe list. Given below is the generic search strategy class. Every
search strategy requires only three operations: 1) check whether the fringe list is empty or not,
2) add a node to the list, and 3) remove a node from the list.
class SearchStrategy(object):  # (class header reconstructed; subclasses below extend SearchStrategy)
    @abstractmethod
    def __init__(self, params): pass

    @abstractmethod
    def isEmpty(self): pass

    @abstractmethod
    def addNode(self, node): pass

    @abstractmethod
    def removeNode(self): pass
For this experiment, I will show you how to implement the breadth-first search strategy.

from com.search.strategy.searchStrategy import SearchStrategy
from queue import Queue

class BreadthFirstSearchStrategy(SearchStrategy):
    '''
    classdocs
    '''
    def __init__(self):
        self.queue = Queue()

    def isEmpty(self):
        return self.queue.empty()

    def addNode(self, node):
        return self.queue.put(node)

    def removeNode(self):
        return self.queue.get()
Given these classes we can generically define the search class as follows:

from com.search.node import Node
from com.search.eightPuzzleProblem import EightPuzzleProblem
from com.search.strategy.breadthFirstSearchStrategy import BreadthFirstSearchStrategy

class Search(object):
    '''
    classdocs
    '''
    def __init__(self, searchProblem, searchStrategy):
        # (constructor reconstructed: store the problem and strategy and
        #  seed the fringe with the initial node)
        self.searchProblem = searchProblem
        self.searchStrategy = searchStrategy
        self.searchStrategy.addNode(Node(searchProblem.initialState(), None, 0, 0, None))

    def solve(self):
        # (method header and duplicateMap initialization reconstructed)
        duplicateMap = {}
        result = None
        while not self.searchStrategy.isEmpty():
            currentNode = self.searchStrategy.removeNode()
            if self.searchProblem.isGoal(currentNode.state):
                result = currentNode
                break
            nextMoves = self.searchProblem.succesorFunction(currentNode.state)
            for nextState in nextMoves:
                if nextState.stringRep() not in duplicateMap:
                    newNode = Node(nextState, currentNode, currentNode.depth + 1,
                                   currentNode.cost + nextState.cost, nextState.action)
                    self.searchStrategy.addNode(newNode)
                    duplicateMap[newNode.state.stringRep()] = newNode.state.stringRep()
        return result

    def printResult(self, result):
        if result.parentNode is None:
            print("Game Starts")
            print("Initial State : %s" % result.state.getCurrentState())
            return
        self.printResult(result.parentNode)
        print("Perform the following action %s, New State is %s, cost is %d"
              % (result.action, result.state.getCurrentState(), result.cost))
To initialize the search, we need to pass it the search problem and a searching strategy.
The search process is as follows:
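A minimal driver sketch (the Search constructor signature is an assumption; the handout does
not show it):

if __name__ == '__main__':
    goal = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    searchProblem = EightPuzzleProblem([[1, 0, 2], [3, 4, 5], [6, 7, 8]], goal)
    searchStrategy = BreadthFirstSearchStrategy()
    search = Search(searchProblem, searchStrategy)
    result = search.solve()
    if result is not None:
        search.printResult(result)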
ASSIGNMENT
1. Modify the code in the search class to count the number of nodes expanded during the search.
2. Model the Sudoku solving problem.
3. Understand the code and implement the following search strategies, using the problem-solving
agent steps:
   I. Breadth-first search
   II. Depth-first search
   III. Uniform-cost search
EXPERIMENT # 06
INTRODUCTION
The purpose of this lab is to show the working of an informed search agent
Name
Date
Registration No
Department
Total Marks
Marks Obtained
Remarks
___________________
Lab Instructor Signature
Experiment
INTRODUCTION
06
OBJECTIVE
The goal of this tutorial is to understand how to (i) model problems to solve and (ii) implement
the informed search algorithms as described in the book.
THEORY
Much of the theory for informed search remains the same as for uninformed search. As
discussed in class, the main difference between informed search and uninformed search is the use
of heuristics.
A heuristic is a function that estimates the distance between the current state and the goal.
A heuristic needs to be admissible, i.e., the value it gives should never exceed the actual cost.
Here we will use the same classes as we did in the previous lab.
I will show the construction for Greedy Search. Remember, greedy search organizes the fringe
list in increasing order of the nodes' heuristic values, i.e., the node with the least heuristic value
has the highest priority.
from queue import PriorityQueue

class GreedySearch(SearchStrategy):
    '''
    classdocs
    '''
    def __init__(self, heuristic):
        # (constructor header reconstructed)
        self.heuristic = heuristic
        self.queue = PriorityQueue()

    def isEmpty(self):
        # a PriorityQueue is never equal to [], so test emptiness directly
        return self.queue.empty()

    def addNode(self, node):
        self.queue.put((self.heuristic.evaluateNode(node.state.currentState), node))

    def removeNode(self):
        return self.queue.get()[1]
Every heuristic implements the following abstract class:

class Heuristic(object):
    @abstractclassmethod
    def __init__(self): pass

    @abstractclassmethod
    def evaluateNode(self, state, goal): pass
class MisPlacedTilesEightPuzzleHeuristic(Heuristic):
'''
classdocs
'''
def __init__(self,goal):
self.goal = goal
def evaluateNode(self,state):
totalCost = 0
for i in range(len(state)):
for j in range(len(state[i])):
if state[i][j] != self.goal[i][j]:
totalCost += 1
return totalCost
if __name__ == "__main__":
# 0,8,7,6,5,4,3,2,1
goal = [[0,1,2],[3,4,5],[6,7,8]]
heuristic = MisPlacedTilesEightPuzzleHeuristic(goal)
searchProblem = EightPuzzleProblem([[0,8,7],[6,5,4],[3,2,1]], goal)
searchStrategy = GreedySearch(heuristic)
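    # (the listing breaks off here; a plausible completion, reusing the
    #  Search class from the previous lab and assuming its constructor
    #  takes a problem and a strategy)
    search = Search(searchProblem, searchStrategy)
    result = search.solve()
    if result is not None:
        search.printResult(result)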
Name
Date
Registration No
Total Marks
___________________
Lab Instructor Signature
Experiment
INTRODUCTION
07
OBJECTIVE
Thus far, we have seen how to search for solutions in a state space. In this lab, we will slightly
modify the search problem: we will search for values that satisfy different constraints. A
solution is found once we have satisfied all the constraints and assigned values to all the
variables.
THEORY
Previously we have seen how to model search problems and how to solve them using
informed and uninformed search.
The formulation we used to model our search problem made the goal test, the successor function
and the structure of the state a black box, i.e., we did not know what composed a state, how
successors were generated, or when a goal was found.
In this lab, we will see a more specialized form of search formulation. Using this formulation,
our problem becomes assigning values to variables while making sure that no constraint,
defined on the variables, is violated. The formulation requires us to model our problem in terms
of variables, the variables' domains and the constraints defined on those variables. More
formally, this kind of problem solving technique is called a constraint satisfaction problem
(CSP). A CSP is formally defined by:
Variables (Xi)
Domains (Di)
Constraints
The solution of a CSP is found when each variable (Xi) has been assigned a value from its
domain (Di) while all the constraints involving Xi are satisfied.
The objective of this lab is to demonstrate to you how to solve a CSP using different strategies.
The formulation used in our code has been inspired by the AIMA book:
A CSP is a collection of variables, their domains and the constraints between the different
variables:
class CSP(object):
'''
classdocs
'''
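    # (the constructor is not shown in the handout; a minimal version
    #  consistent with the methods below and with the call
    #  CSP(variables, domains, constraints) in the map coloring example)
    def __init__(self, variables, domain, constraints):
        self._variables = variables
        self._domain = domain
        self._constraints = constraints
        self._domainOfVariable = {}
        self._contraintsOfVariable = {}
        self.setUpVariableDomains()
        self.setUpConstraints()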
def setUpVariableDomains(self):
for var in self._variables:
self.addVariableDomain(var, self._domain)
def setUpConstraints(self):
for constraint in self._constraints:
self.addConstraint(constraint)
def addVariableDomain(self,var,domain):
self._domainOfVariable[var] = copy.deepcopy(domain)
def addConstraint(self,constraint):
for var in constraint.getScope():
if var not in self._contraintsOfVariable:
self._contraintsOfVariable[var] = []
self._contraintsOfVariable[var].append(constraint)
def addSingleConstraint(self,constraint):
self._constraints.append(constraint)
for var in constraint.getScope():
if var not in self._contraintsOfVariable:
self._contraintsOfVariable[var] = []
self._contraintsOfVariable[var].append(constraint)
def addVariable(self,variable):
self._variables.append(variable)
self.addVariableDomain(variable,self._domain)
def getVariables(self):
return self._variables
def getDomainValues(self,var):
return self._domainOfVariable[var]
def getConstraints(self,var):
if var not in self._contraintsOfVariable:
return []
return self._contraintsOfVariable[var]
def getVariableDomains(self):
return self._domainOfVariable
def setVariableDomains(self,domainOfVariable):
self._domainOfVariable = domainOfVariable
def copy(self):
    # deep-copy the variables, domains and constraints
    variables = copy.deepcopy(self._variables)
    domains = copy.deepcopy(self._domain)
    constraints = copy.deepcopy(self._constraints)
    csp = CSP(variables, domains, constraints)
    return csp
def getNeighbour(self,variable,constraint):
neigh = []
for va in constraint.getScope():
if va != variable and (va not in neigh):
neigh.append(va)
return neigh
def removeValueFromDomain(self,variable,value):
values = []
for val in self.getDomainValues(variable):
if val != value:
values.append(val)
self._domainOfVariable[variable] = values
The different functions present in this class are necessary when solving a CSP. Their
role will become more evident as we present the other classes.
As mentioned earlier, a CSP has three basic elements: variables, domains and constraints. The
classes for these elements are given below.
class Constraint(object):  # (class header reconstructed; NotEqualConstraint below extends it)
    @abstractmethod
    def isConsistentWith(self, assignment): pass

    @abstractmethod
    def getScope(self): pass
import copy

class Domain(object):
    '''
    classdocs
    '''
    def __init__(self, values):  # (constructor header reconstructed)
        self._values = copy.deepcopy(values)

    def getValues(self):
        return self._values
class Variable(object):
    '''
    classdocs
    '''
    def __init__(self, name):  # (constructor header reconstructed)
        self._name = name

    def getName(self):
        return self._name

    def __hash__(self):
        return hash(self._name)

    def __str__(self):
        return self._name
While solving a CSP using different strategies, we need to keep track of which variables have
been assigned values at any point. This assignment is stored in the following class:

class Assignment(object):
    '''
    classdocs
    '''
def __init__(self):
self._variables = []
self._valueOfVariable = {}
def addVariableToAssignment(self,var,value):
if var not in self._valueOfVariable:
self._variables.append(var)
self._valueOfVariable[var] = value
def removeVariableFromAssignment(self,var):
if var in self._valueOfVariable:
self._variables.remove(var)
del self._valueOfVariable[var]
def getAssignmentOfVariable(self,var):
if var not in self._valueOfVariable:
return None
return self._valueOfVariable[var]
def isConsistent(self,constraints):
for con in constraints:
if not con.isConsistentWith(self):
return False
return True
def hasAssignmentFor(self,var):
return var in self._valueOfVariable
def isComplete(self,variables):
for var in variables:
if not self.hasAssignmentFor(var):
return False
return True
def isSolution(self, csp):
    return (self.isComplete(csp.getVariables()) and
            self.isConsistent(csp.getConstraints()))

def __str__(self):
    # (body reconstructed: list each variable with its assigned value)
    result = []
    for var in self._variables:
        result.append("%s = %s" % (str(var), str(self._valueOfVariable[var])))
    return str(result)
Now that we have defined the basic data structures of a CSP, let us discuss how we can solve
one. A CSP can be solved using a strategy called backtracking search.
class BactrackingSearch(SearchStrategy):
'''
classdocs
'''
def __init__(self, inferenceProcedure, listeners=[],
             variableOrdering=False, valueOrdering=False):
    '''
    Constructor
    '''
    SearchStrategy.__init__(self, listeners)
    self._inferenceProcedure = inferenceProcedure
    self._variableOrdering = variableOrdering
    self._valueOrdering = valueOrdering
def solve(self,csp):
return self.recursiveBacktrackingSearch(csp, Assignment())
def recursiveBacktrackingSearch(self,csp,assignment):
if assignment.isComplete(csp.getVariables()):
return assignment
var = self.selectUnAssignedVariable(csp, assignment)
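        # (the rest of the recursion is elided in the handout; a typical
        #  completion, using the Assignment methods defined above, is:)
        for value in self.orderDomainValues(csp, var):
            assignment.addVariableToAssignment(var, value)
            if assignment.isConsistent(csp.getConstraints(var)):
                self.fireListeners(csp, assignment)
                result = self.recursiveBacktrackingSearch(csp, assignment)
                if result is not None:
                    return result
            assignment.removeVariableFromAssignment(var)
        return None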
    def selectUnAssignedVariable(self, csp, assignment):
        # (body reconstructed: return the first variable without a value)
        for var in csp.getVariables():
            if not assignment.hasAssignmentFor(var):
                return var
def orderDomainValues(self,csp,var):
return csp.getDomainValues(var)
def fireListeners(self,csp,assignment):
for listener in self._listeners:
listener.fireChange(csp,assignment)
The map coloring problem requires a NotEqual constraint. The implementation of this constraint
is given below:
class NotEqualConstraint(Constraint):
    def __init__(self, var1, var2):
        # (constructor reconstructed from the usage NotEqualConstraint(wa, sa) below)
        self._scope = [var1, var2]

    def getScope(self):
        return self._scope

    def isConsistentWith(self, assignment):
        val1 = assignment.getAssignmentOfVariable(self._scope[0])
        val2 = assignment.getAssignmentOfVariable(self._scope[1])
        return val1 is None or val2 is None or val1 != val2
This is how we can formulate the map coloring problem using the classes given above
wa = Variable("WA")
sa = Variable("SA")
nt = Variable("NT")
q = Variable("Q")
nsw = Variable("NSW")
v = Variable("V")
t = Variable("T")
variables = [wa,sa,nt,q,nsw,v,t]
domains = ["RED","GREEN","BLUE"]
constraints = [NotEqualConstraint(wa,sa),
NotEqualConstraint(wa,nt),
NotEqualConstraint(nt,sa),
NotEqualConstraint(q,nt),
NotEqualConstraint(sa,q),
NotEqualConstraint(sa,nsw),
NotEqualConstraint(q,nsw),
NotEqualConstraint(sa,v),
NotEqualConstraint(nsw,v)]
csp = CSP(variables, domains, constraints)
inPro = SimpleInference()
bts = BactrackingSearch(inPro, [ConsoleListener()], variableOrdering=True)
start = time.time()
result = bts.solve(csp)
end = time.time()
print(end - start)
Here SimpleInference is the vanilla inference class. It can be overridden for more advanced
methods such as forward checking and arc consistency.
ASSIGNMENT
Name
Date
Registration No
Total Marks
___________________
Lab Instructor Signature
Experiment
INTRODUCTION
08
OBJECTIVE
In this experiment we will learn how to solve CSPs using local search methods
THEORY
In the previous experiment, we learned about the basics of CSP and how to solve them using
searching algorithms.
As we have seen in class, we can improve the performance of backtracking if we can detect
failure early. One way of detecting failure early is to propagate the impact of assigning a value to
a variable to that variable's neighbors. Whenever we assign a value to a variable, we impose
the effect of this assignment on the neighboring variables. For example, if we consider the
problem of map coloring, then whenever we assign a color to a variable we subtract that
color from the domains of the neighboring variables, and if by doing so the domain of any variable
becomes empty, then we backtrack. In this way we can detect a failure a bit earlier, resulting in
an improvement to the backtracking algorithm. This is called Forward Checking.
'''
@author: dr.aarij
'''
from com.ai.csp.assignment.assignment import Assignment
class ForwardCheckingInference(object):
'''
classdocs
'''
def __init__(self):pass
def doInference(self,inferenceInfo,csp,variable,value):
assignment = Assignment()
assignment.addVariableToAssignment(variable, value)
for con in csp.getConstraints(variable):
otherVariables = csp.getNeighbour(variable,con)
for ov in otherVariables:
someValues = []
changed = False
domVals = inferenceInfo.getDomainsOfAffectedVariables(ov)
if domVals is None:
domVals = csp.getDomainValues(ov)
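                # (the filtering step is elided in the handout; a typical
                #  completion keeps only the values consistent with con:)
                for dv in domVals:
                    assignment.addVariableToAssignment(ov, dv)
                    if con.isConsistentWith(assignment):
                        someValues.append(dv)
                    else:
                        changed = True
                    assignment.removeVariableFromAssignment(ov)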
if changed:
inferenceInfo.addToAffectedVariables(ov,someValues)
assignment.removeVariableFromAssignment(ov)
return []
If, in doing so, we find that the domain of any variable has been emptied, then we can claim that
a failure has occurred and we have to backtrack.
Forward checking still runs inside backtracking search, and no changes are required in the
backtracking algorithm itself. To make backtracking search use forward checking, the following
changes can be made in the main csp.py file:
if __name__ == "__main__":
csp = createMapColoringCSP()
inPro = ForwardCheckingInference()
bts = BactrackingSearch(inPro,[ConsoleListener()],variableOrdering =
True)
start = time.time()
result = bts.solve(csp)
end = time.time()
print(end ‐ start)
Arc Consistency
Forward checking is quite useful in detecting failure early; however, it cannot detect all kinds of
failures. For example, consider the scenario of the map coloring problem discussed in class,
where the variables NT and SA are neighbors and both have the same single value remaining.
This situation must end in failure: if we assign NT the color blue then the domain of SA becomes
empty, and vice versa. Forward checking cannot detect this failure because it only detects
situations where the domain of a variable has already been emptied.
In order to detect this kind of failure we need a stronger form of filtering called arc consistency.
We try to enforce that the values at every arc (constraint) are consistent. An arc X -> Y is
consistent iff for every x in the tail there is some y in the head that could be assigned without
violating a constraint. A simple form of propagation makes sure all arcs are consistent.
Local Searches
Up till now we were finding a solution by incrementally building it up, i.e., at every iteration we
select a variable and assign it a value. We repeat this process until we have given a value to
every variable and all the constraints are satisfied.
Another way of solving a CSP is to use a complete-state formulation, i.e., start by assigning every
variable some random value. Then at every iteration we select a variable that is violating some
constraint and assign it the value that is least constraining on the other variables. We repeat
this process until all the constraints are satisfied.
This process is called local search, because we keep only the information about the current
state and a way of finding a neighboring state. In general, local searches work with a complete-state
formulation: they compare the current state with a neighboring state, and if the
neighboring state is better than the current state, then the neighboring state becomes the current
state. The process is repeated until all the constraints are satisfied.
In the AIMA book, we saw a local search algorithm to solve CSPs called min-conflicts search.
The pseudo code of the algorithm is given in the book; the implementation is given below:
    def initializeRandomly(self, csp):
        self._assignment = Assignment()
        domainLength = len(csp.getListOfDomains())
        for va in csp.getVariables():
            self._assignment.addVariableToAssignment(va,
                csp.getListOfDomains()[random.randint(0, domainLength - 1)])

    def solve(self, csp):
        self.initializeRandomly(csp)
        for _ in range(self._maxSteps):
            if self._assignment.isSolution(csp):
                return self._assignment
            cands = self.getConflictedVariable(csp)
            var = cands[random.randint(0, len(cands) - 1)]
            val = self.getMinConflictValueFor(var, csp)
            # print(str(var) + "_" + str(val) + "__" + str(len(cands)))
            self.fireListeners(csp, self._assignment)
            self._assignment.addVariableToAssignment(var, val)
        return False
def getConflictedVariable(self,csp):
resultVariables = []
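        # (elided in the handout: collect every variable whose current
        #  assignment violates at least one of its constraints)
        for var in csp.getVariables():
            if not self._assignment.isConsistent(csp.getConstraints(var)):
                resultVariables.append(var)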
return resultVariables
def getMinConflictValueFor(self,var,csp):
constraints = csp.getConstraints(var)
assignment = self._assignment.returnCopy()
minConflict = 100000000000
candidates = []
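        # (elided in the handout: keep the values with the fewest violated
        #  constraints)
        for value in csp.getDomainValues(var):
            assignment.addVariableToAssignment(var, value)
            conflicts = 0
            for con in constraints:
                if not con.isConsistentWith(assignment):
                    conflicts += 1
            if conflicts < minConflict:
                minConflict = conflicts
                candidates = [value]
            elif conflicts == minConflict:
                candidates.append(value)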
        return candidates[random.randint(0, len(candidates) - 1)]
def fireListeners(self,csp,assignment):
for listener in self._listeners:
listener.fireChange(csp,assignment)
ASSIGNMENT
Question 2) Compare the performance of forward checking and forward checking with
arc consistency.
Question 3) Use local search to solve the n-queens problem. Compare its performance with
arc consistency on the 1000-queens problem.
Name
Date
Registration No
Total Marks
___________________
Lab Instructor Signature
OBJECTIVE
THEORY
So far, we have seen single-agent environments. However, there are environments with more
than one agent. These are called multi-agent environments. Here we are focusing on
environments that are adversarial, i.e., where your agent is up against an agent that is trying to
minimize your utility: the more the adversarial agent minimizes your agent's utility, the more it
maximizes its own.
Board games like chess, checkers, ludo, tic-tac-toe, etc. are examples of this kind of adversarial
environment. These games are turn-taking games. Unlike in search, we cannot simply plan our
way to victory, because after our agent has made a move, the next move belongs to the adversary.
In these environments, we must devise a strategy that makes an optimal decision given a state.
We can model these games as a search problem as follows:
We can have an abstract representation of the above formulation as follows:
@abstractmethod
def __init__(self, params): pass
@abstractmethod
def getInitialState(self):pass
@abstractmethod
def getPlayer(self,state):pass
@abstractmethod
def getActions(self,state): pass
@abstractmethod
def getResult(self, state, action): pass
@abstractmethod
def terminalTest(self,state): pass
@abstractmethod
def utility(self,state,player): pass
@abstractmethod
def getAgentCount(self): pass
def getAction(self):pass
For now, we will consider that the environment is deterministic, i.e., we know the outcome of an
action taken by any agent in any particular state. To make an optimal decision our agent has to
simulate the actions taken by the adversarial agent if our agent chooses any particular action. In
order to do so, we need to simulate what the adversarial agent will be thinking about
our agent, and what our agent will be thinking about the adversarial agent, and so on. This is
graphically depicted in the following diagram.
So for all the possible actions, available to our agent, at any given state, our agent will take the
action that returns the maximum value. Whereas, the adversarial agent will choose an action that
returns the minimum value, as the lower the value the better it is for our adversarial agent.
One algorithm that calculates these values is the minimax algorithm. Its pseudo code is
given below:
The value of an action is recursively calculated at the terminal nodes, as shown in the figure
below:
class AdversarialNode(State):
'''
classdocs
'''
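    # (constructor not shown; fields inferred from the usage
    #  AdversarialNode(utility, name, isMax, children) below)
    def __init__(self, utility, name, isMax, children, action=None):
        self._utility = utility
        self._name = name
        self._isMax = isMax
        self._children = children
        self._action = action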
def isLeaf(self):
return len(self._children) == 0
def isMax(self):
return self._isMax
def isNextAgentMax(self):
return not self._isMax
def addChild(self,child):
self._children.append(child)
def getAction(self):
return self._action
def __str__(self):
s = "Name is %s, value is %d" %(self._name,self._utility)
s += "\n children are "
for ch in range(len(self._children)):
s+= str(self._children[ch])
return s
To model a tree such as the one shown in the figure above, we can define a game for it:

import sys

class MinimaxTreeGame(Game):
    '''
    classdocs
    '''
    def __init__(self):
        '''
        Constructor
                   A
            B      C      D
           E F G  H I J  K L M
           3 12 8 2 4 6  14 5 2
        '''
bottom1 = [AdversarialNode(3,"E",True,[]),
AdversarialNode(12,"F",True,[]),
AdversarialNode(8,"G",True,[])]
bottom2 = [AdversarialNode(2,"H",True,[]),
AdversarialNode(4,"I",True,[]),
AdversarialNode(6,"J",True,[])]
bottom3 = [AdversarialNode(14,"K",True,[]),
AdversarialNode(5,"L",True,[]),
AdversarialNode(2,"M",True,[])]
        b = AdversarialNode(-sys.maxsize - 1, "B", False, bottom1)
        c = AdversarialNode(-sys.maxsize - 1, "C", False, bottom2)
        d = AdversarialNode(-sys.maxsize - 1, "D", False, bottom3)
        a = AdversarialNode(-sys.maxsize - 1, "A", True, [b, c, d])
self._root = a
def getInitialState(self):
return self._root
def getPlayer(self,state):
return state.isMax()
def getActions(self,state):
return [x for x in range(len(state._children))]
def terminalTest(self,state):
return state.isLeaf()
    def utility(self, state, player):
        return state._utility  # the node stores its value in _utility
def getAgentCount(self):
return 2
def printState(self,state):
toPrintNodes = []
toPrintNodes.append(state)
while len(toPrintNodes) > 0:
node = toPrintNodes[0]
del toPrintNodes[0]
print("Name = %s, value = %d"%(node._name,node._utility))
toPrintNodes += node._children
An implementation of vanilla minimax is shown below. The code is a bit different from the
pseudo code shown in the AIMA book; however, it is more generic and can be used to simulate
environments where there is more than one adversarial agent.
class SimpleMinimax(object):
'''
classdocs
'''
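    # (constructor not shown; fields inferred from the methods below and
    #  from the driver call SimpleMinimax(game))
    def __init__(self, game):
        self._game = game
        self._duplicateStates = {}
        self._expandedNodes = 0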
def minimax_decision(self,state):
self._duplicateStates[str(state)] = state
if self._game.terminalTest(state):
return state._utility
if state.isMax():
return self.maxvalue(state)
else:
return self.minvalue(state)
    def minvalue(self, state):
        ss = str(state)
        if ss in self._duplicateStates and self._duplicateStates[ss]._utility > state._utility:
            return state._utility
        else:
            self._duplicateStates[str(state)] = state
        self._expandedNodes += 1
        retValue = 1000000000000
        # player = self._game.getPlayer(state)
        actions = self._game.getActions(state)
        # (the minimization loop is a reconstruction; getResult is assumed to
        #  return the successor state for an action, as in the Game class)
        for action in actions:
            retValue = min(retValue, self.minimax_decision(self._game.getResult(state, action)))
        state._utility = retValue
        return retValue
    def maxvalue(self, state):
        ss = str(state)
        if ss in self._duplicateStates and self._duplicateStates[ss]._utility > state._utility:
            return state._utility
        else:
            self._duplicateStates[str(state)] = state
        self._expandedNodes += 1
        retValue = -1000000000000
        # player = self._game.getPlayer(state)
        actions = self._game.getActions(state)
        # (the maximization loop is a reconstruction, mirroring minvalue above)
        for action in actions:
            retValue = max(retValue, self.minimax_decision(self._game.getResult(state, action)))
        state._utility = retValue
        return retValue
The driver program for the simple minimax with the game tree discussed above is as follows:
if __name__ == "__main__":
game = MinimaxTreeGame()
minimax = SimpleMinimax(game)
initialState = game.getInitialState()
minimax.minimax_decision(initialState)
game.printState(initialState)
The tic-tac-toe example uses a small player class (sketched here; the fields follow their use in
the game code below):

class TictactoePlayer(object):
    def __init__(self, name, symbol):
        self._name = name
        self._symbol = symbol

    def __str__(self):
        return str(self._name) + "_" + str(self._symbol)
@author: dr.aarij
'''
from copy import deepcopy
class TictactoeState(object):
'''
classdocs
'''
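    # (constructor not shown; fields inferred from copy() and the game code)
    def __init__(self, board, move, utility=0, action=None):
        self._board = board
        self._move = move
        self._utility = utility
        self._action = action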
def copy(self):
return TictactoeState(deepcopy(self._board),self._move,self._utility)
def isMax(self):
return self._move == 0
def isNextAgentMax(self):
return (self._move + 1) % 2 == 0
def getAction(self):
return self._action
def __str__(self):
return str(self._board)+"_"+str(self._move)
'''
@author: dr.aarij
'''
from com.ai.adversarial.sample.tictactoe.tictactoeState import TictactoeState
from com.ai.adversarial.sample.tictactoe.tictactoePlayer import TictactoePlayer
from com.ai.adversarial.elements.game import Game
from com.ai.adversarial.search.minimax import Minimax
class TicTacToeGame(Game):
    '''
    classdocs
    '''
    def __init__(self):
        # (constructor header and the board/move/agent fields are
        #  reconstructed from their use below; the agent names are assumptions)
        self._board = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
        self._move = 0
        self._agents = [TictactoePlayer("max", "X"), TictactoePlayer("min", "O")]
        self._winningPositions = [[(0,0),(0,1),(0,2)],
                                  [(1,0),(1,1),(1,2)],
                                  [(2,0),(2,1),(2,2)],
                                  [(0,0),(1,0),(2,0)],
                                  [(0,1),(1,1),(2,1)],
                                  [(0,2),(1,2),(2,2)],
                                  [(0,0),(1,1),(2,2)],
                                  [(2,0),(1,1),(0,2)]]
def getInitialState(self):
return TictactoeState(self._board,self._move)
def getPlayer(self,state):
return self._agents[state._move]
def getActions(self,state):
actions = []
for i in range(3):
for j in range(3):
if state._board[i][j] == 0:
actions.append((i,j))
return actions
    def getResult(self, state, action):
        # (method header and the first two lines are reconstructed: get the
        #  player to move and copy the state before applying the action)
        player = self.getPlayer(state)
        newState = state.copy()
        newState._board[action[0]][action[1]] = player._symbol
        newState._move = (newState._move + 1) % 2
        winposfound = False
        for pos in self._winningPositions:
            winposfound = True
            for indpos in pos:
                if newState._board[indpos[0]][indpos[1]] != player._symbol:
                    winposfound = False
                    break
            if winposfound:
                break
        if winposfound:
            newState._move = -1
            if player._symbol == "X":
                newState._utility = 1
            else:
                newState._utility = -1
        else:
            zeroFound = False
            for i in range(3):
                for j in range(3):
                    if newState._board[i][j] == 0:
                        zeroFound = True
                        break
                if zeroFound:
                    break
            if not zeroFound:
                newState._move = -1
                newState._utility = 0
        return newState
    def terminalTest(self, state):
        return state._move == -1
def utility(self,state,player):
return state._utility
def getAgentCount(self):
return 2
ASSIGNMENT
1) Using the formulation of the adversarial tree, model the following tree and print the values of
all the nodes:
2) Extend the simple minimax program to implement the alpha-beta pruning algorithm.
3) Using the formulation of the tic-tac-toe game, write a program that lets you play a game against it.
EXPERIMENT # 10
INTRODUCTION
In this lab we will learn how to solve a classic AI problem:
Name
Date
Registration No
Total Marks
___________________
Lab Instructor Signature
OBJECTIVE
THEORY
Question: In this problem, three missionaries and three cannibals must cross a river
using a boat which can carry at most two people, under the constraint that, on both
banks, the missionaries present on the bank cannot be outnumbered by cannibals.
The boat cannot cross the river by itself with no people on board.
Solution:
First let us consider that both the missionaries (M) and the cannibals (C) are on the same
side of the river.

                                    Left           Right
Initially the positions are:        0M, 0C         3M, 3C (B)
Send 2 cannibals to the left bank:  0M, 2C (B)     3M, 1C
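A compact breadth-first search over the state space makes the remaining moves mechanical. The
sketch below is mine, not part of the manual: the state encoding (missionaries, cannibals and
boat on the starting bank) and the helper names are illustrative.

from collections import deque

def missionaries_and_cannibals():
    # state: (m, c, b) = missionaries, cannibals and boat on the starting bank
    start, goal = (3, 3, 1), (0, 0, 0)

    def safe(m, c):
        # counts must stay in range, and missionaries may not be outnumbered
        # on either bank (unless none are present there)
        if not (0 <= m <= 3 and 0 <= c <= 3):
            return False
        if m > 0 and c > m:
            return False
        if 3 - m > 0 and 3 - c > 3 - m:
            return False
        return True

    carries = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]  # the boat carries 1 or 2 people
    frontier = deque([(start, [start])])
    seen = {start}
    while frontier:
        (m, c, b), path = frontier.popleft()
        if (m, c, b) == goal:
            return path
        for dm, dc in carries:
            step = -1 if b == 1 else 1  # boat leaves or returns to the start bank
            ns = (m + step * dm, c + step * dc, 1 - b)
            if safe(ns[0], ns[1]) and ns not in seen:
                seen.add(ns)
                frontier.append((ns, path + [ns]))

print(missionaries_and_cannibals())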
Name
Date
Registration No
Total Marks
___________________
Lab Instructor Signature
OBJECTIVE
THEORY
Markov Decision Processes (MDPs)
1. MDP formalism
2. Value Iteration
3. Policy Iteration
Reinforcement Learning:
Reinforcement learning is a type of machine learning. It allows machines and software agents
to automatically determine the ideal behavior within a specific context, in order to maximize
their performance. Simple reward feedback is required for the agent to learn its behavior; this is
known as the reinforcement signal.
There are many different algorithms that tackle this issue. As a matter of fact, Reinforcement
Learning is defined by a specific type of problem, and all its solutions are classed as
Reinforcement Learning algorithms. In the problem, an agent is supposed to decide the best
action to select based on his current state. When this step is repeated, the problem is known as
a Markov Decision Process.
A Markov Decision Process (MDP) model contains: a set of possible world states S, a set of possible actions A, a transition model T(s, a, s') giving the probability that action a taken in state s leads to state s', and a real-valued reward function R(s, a, s').
The following sections explain the key terms of reinforcement learning, namely: agent, environment, state, action, and reward.
An agent lives in the grid. The example here is a 3x4 grid. The grid has a START state (grid no 1,1). The purpose of the agent is to wander around the grid and finally reach the Blue Diamond (grid no 4,3). Under all circumstances, the agent should avoid the Fire grid (orange color, grid no 4,2). Grid no 2,2 is a blocked grid: it acts like a wall, hence the agent cannot enter it.
The agent can take any one of these actions: UP, DOWN, LEFT, RIGHT.
Walls block the agent's path, i.e., if there is a wall in the direction the agent would have moved, the agent stays in the same place. So, for example, if the agent says LEFT in the START grid, it will stay put in the START grid.
First Aim: To find the shortest sequence getting from START to the Diamond. Two such
sequences can be found:
RIGHT RIGHT UP UP RIGHT
UP UP RIGHT RIGHT RIGHT
Let us take the second one (UP UP RIGHT RIGHT RIGHT) for the subsequent discussion.
The moves are now noisy. 80% of the time the intended action works correctly. 20% of the time the action causes the agent to move at right angles. For example, if the agent says UP, the probability of going UP is 0.8, whereas the probability of going LEFT is 0.1 and the probability of going RIGHT is 0.1 (since LEFT and RIGHT are at right angles to UP).
The agent receives a reward at each time step:
A small reward each step (this can be negative, in which case it can also be termed a punishment; in the above example, entering the Fire grid gives a reward of -1).
A big reward comes at the end (good or bad).
The goal is to maximize the sum of rewards.
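For reference (this equation is not reproduced in the excerpt), value iteration, which the first demo method below implements, repeatedly applies the Bellman backup to every state, with T, R and the discount γ as in the MDP definition above:
V_{k+1}(s) = max over a of Σ_{s'} T(s, a, s') · [ R(s, a, s') + γ · V_k(s') ]
Policy iteration instead alternates policy evaluation (computing V for a fixed policy) with policy improvement (making the policy greedy with respect to V) until the policy stops changing.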
ALGORITHM AND DEMO CODE 1: POLICY ITERATION (ELEMENTS)
import sys
import random
class MDP(object):
    def __init__(self, states, actions, transition, reward, discount=0.5):
        self._states = states
        self._actions = actions
        self._transition = transition
        self._reward = reward
        self._discount = discount
        self._initial_v = [0 for _ in states]
        self._initial_q = [[0 for _ in actions] for _ in states]
    def valueIteration(self, iterations=1000, threshold=1e-6):
        # The method header and the two initialisations below are missing
        # from the excerpt and are reconstructed here.
        previousMatrix = list(self._initial_v)
        returnQMatrix = [[0 for _ in self._actions] for _ in self._states]
        for _ in range(iterations):
            delta = 0.0   # largest value change in this sweep
            returnMatrix = [0 for _ in self._states]
            for s in range(len(self._states)):
                maxValue = -sys.maxsize - 1
                for a in range(len(self._actions)):
                    actionValue = 0
                    possibleOutcomes = self._transition(s, a)
                    if len(possibleOutcomes) == 0:
                        # Terminal states: the reward is the value
                        maxValue = self._reward(s, a, None)
                        continue
                    for sp, prob in possibleOutcomes:
                        actionValue += prob * (self._reward(s, a, sp) +
                                               self._discount * previousMatrix[sp])
                    returnQMatrix[s][a] = actionValue
                    maxValue = max(maxValue, actionValue)
                returnMatrix[s] = maxValue
                delta = max(delta, abs(previousMatrix[s] - returnMatrix[s]))
            previousMatrix = returnMatrix
            if delta < threshold:
                break
        return previousMatrix, returnQMatrix
    def policyIteration(self, threshold=1e-6):
        # Header and the initial policy/value vectors are reconstructed;
        # the excerpt begins at the while loop.
        start_policy = [0 for _ in self._states]
        start_v = list(self._initial_v)
        while True:
            policy_stable = True
            # Policy evaluation: update start_v in place for the current policy
            self.policyEvaluation(start_policy, start_v, threshold)
            # Policy improvement: make the policy greedy w.r.t. the new values
            for s in range(len(self._states)):
                old_action = start_policy[s]
                maxValue = -sys.maxsize - 1
                for a in range(len(self._actions)):
                    actionValue = 0
                    possibleOutcomes = self._transition(s, a)
                    for sp, prob in possibleOutcomes:
                        actionValue += prob * (self._reward(s, a, sp) +
                                               self._discount * start_v[sp])
                    if maxValue < actionValue:
                        maxValue = actionValue
                        start_policy[s] = a
                if old_action != start_policy[s]:
                    policy_stable = False
            if policy_stable:
                break
        return start_v, start_policy
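The policyEvaluation method that policyIteration calls is not shown in this excerpt. A minimal sketch, consistent with how it is called (it updates the value vector in place for the fixed policy), might be:
    def policyEvaluation(self, policy, v, threshold):
        # Repeatedly apply the Bellman expectation backup for the fixed
        # policy until no value changes by more than threshold
        # (sketch; not taken from the original source).
        while True:
            delta = 0.0
            for s in range(len(self._states)):
                a = policy[s]
                possibleOutcomes = self._transition(s, a)
                if len(possibleOutcomes) == 0:
                    newValue = self._reward(s, a, None)
                else:
                    newValue = 0.0
                    for sp, prob in possibleOutcomes:
                        newValue += prob * (self._reward(s, a, sp) +
                                            self._discount * v[sp])
                delta = max(delta, abs(v[s] - newValue))
                v[s] = newValue
            if delta < threshold:
                break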
class GridMDP(object):
    '''
    An MDP over a rectangular grid world read from a text file.
    '''
    def readFile(self, file):
        f = open(file, "r")   # the original "+r" is not a valid mode string
        lines = f.readlines()
        self._rows = int(lines[0])
        self._columns = int(lines[1])
        # The rest of the file (not shown in this excerpt) populates self._states
        # print(self._states)
    def transition(self, state, action):
        returnStates = []
        # Walls (2) and terminal cells (3 and 4) have no outgoing transitions
        if self._states[state] == 2:
            return returnStates
        if self._states[state] == 3 or self._states[state] == 4:
            return returnStates
        # Row/column of the current cell (reconstructed; missing from the excerpt)
        stateRow = state // self._columns
        stateColumn = state % self._columns
        # The intended move with probability 1 - noise, plus the two
        # perpendicular moves with probability noise/2 each
        possibleActions = [(self._actions[action][0], self._actions[action][1],
                            1 - self._noise),
                           ((self._actions[action][0]**2 + 1) % 2,
                            (self._actions[action][1]**2 + 1) % 2,
                            self._noise / 2.0),
                           (((self._actions[action][0]**2 + 1) % 2) * -1,
                            ((self._actions[action][1]**2 + 1) % 2) * -1,
                            self._noise / 2.0)]
        for pa in possibleActions:
            if stateRow + pa[0] >= 0 and \
               stateRow + pa[0] < self._rows and \
               stateColumn + pa[1] >= 0 and \
               stateColumn + pa[1] < self._columns and \
               self._states[int((stateRow + pa[0]) * self._columns +
                                (stateColumn + pa[1]))] != 2:
                # Loop body reconstructed: collect (next state, probability) pairs
                returnStates.append((int((stateRow + pa[0]) * self._columns +
                                         (stateColumn + pa[1])), pa[2]))
        return returnStates
if __name__ == "__main__":
grid = GridMDP("grid.txt",livingReward=-2.0)
mdp = MDP(grid._states, grid._actions, grid.transition, grid.reward, .9)
v = mdp.policyIteration()
print(v[0])
print(v[1])
Assignment
Understand the above source code and apply policy iteration (MDP) on the given grid.
LAB # 12
Reinforcement Learning
The purpose of this lab is to understand problem solving using Reinforcement Learning
Name
Date
Registration No
Department
Quiz
Assignment
___________________
Lab Instructor Signature
Experiment
OBJECTIVE
12
To solve the Taxi Parking Problem using Reinforcement Learning (Q Learning)
The Q-learning algorithm is a simple reinforcement learning algorithm. It uses rewards from the environment to learn, over time, the best action to take in a given state. In this implementation, the agent learns from the reward table "P". Using the reward table it checks whether the next action is beneficial or not, and then updates a new value called the Q-value. The newly generated table, called the Q-table, maps (state, action) combinations to Q-values; a higher Q-value means the action has yielded better rewards in that state.
Imagine we teach a taxi how to move people to four different locations in a car park
(R, G, Y, B).
Prerequisite: We use OpenAI's Gym, one of the most commonly used libraries for solving reinforcement learning problems, to set up the taxi problem environment.
To install the library, use the Python package installer (pip): pip install gym
Now let's see how Gym renders our environment. All models and interfaces are already built into Gym and can be loaded by name: Taxi-v2
The only car in this parking lot is the taxi. We will split the car park into a 5x5 grid, which gives us 25 possible taxi locations. These 25 locations are one part of our state space. Note that the current position of our taxi is the coordinate (3, 1).
There are four potential locations where the passengers can be picked up and dropped off: R, G, Y, B, or [(0,0), (0,4), (4,0), (4,3)] in (row, col) coordinates, if you view the rendered area as a coordinate axis.
If we also account for one (1) additional passenger state, being inside the taxi, we can take all combinations of passenger locations and destination locations to arrive at the total number of states for our taxi environment: there are four (4) destinations and five (4 + 1) passenger locations. Our taxi environment therefore has 5 x 5 x 5 x 4 = 500 possible total states.
In other words, we have six possible actions: pickup, dropoff, north, east, south, west (these four directions are the moves by which the taxi is moved).
This is the action space: the set of all the actions that our agent can take in a given state.
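As a quick check of these numbers (a sketch, not from the original text, assuming the classic Gym API and the Taxi-v2 environment used in this lab), the environment can report its own state and action space sizes:
import gym

env = gym.make("Taxi-v2").env
env.reset()
env.render()                  # draws the 5x5 car park in ASCII
print(env.action_space)       # Discrete(6)  - the six actions listed above
print(env.observation_space)  # Discrete(500) - the 500 states computed above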
To solve the problem without any reinforcement learning, we can set the goal state and let the agent sample actions at random; if it reaches the goal state within some number of iterations, it collects the maximum reward. The reward increases as the agent gets close to the goal state, while a bad move, such as an illegal pickup or drop-off, receives a reward of -10, which is low.
Now let's code this problem without thinking about reinforcement learning. Since in each state we have our default reward table P, we can try to make our taxi navigate using just that. We're going to create an infinite loop that runs until one passenger reaches a destination (one episode), or in other words, until the reward received is 20. The env.action_space.sample() method automatically selects one random action from the set of all possible actions.
import gym
env = gym.make("Taxi-v2").env   # environment setup reconstructed; not shown in the excerpt
env.reset()
epochs, reward, done = 0, 0, False
while not done:   # one episode: run until the passenger is delivered (reward == 20)
    state, reward, done, info = env.step(env.action_space.sample())
    epochs += 1
print("Timesteps taken:", epochs)
Solve Using Q-Learning
Learning Algorithm
When the taxi faces a state that involves a passenger at its current location, the Q-value for pickup is highly likely to be higher relative to other actions, such as dropoff or north. Q-values are initialized to an arbitrary value, and as the agent exposes itself to the environment and receives various rewards for performing different actions, the Q-values are updated using the equation:
Q(state, action) = (1 - α) · Q(state, action) + α · (reward + γ · max Q(next state, all actions))
Here α is the learning rate and γ is the discount factor (the hyperparameters alpha and gamma in the code below).
import gym
import numpy as np
import random
from IPython.display import clear_output

# Environment and Q-table setup (missing from the excerpt; reconstructed)
env = gym.make("Taxi-v2").env
q_table = np.zeros([env.observation_space.n, env.action_space.n])

# Hyperparameters
alpha = 0.1    # learning rate
gamma = 0.6    # discount factor
epsilon = 0.1  # exploration rate

all_epochs = []
all_penalties = []

for i in range(1, 100001):
    state = env.reset()
    # Init Vars
    epochs, penalties, reward = 0, 0, 0
    done = False
    while not done:
        if random.uniform(0, 1) < epsilon:
            # Check the action space (explore)
            action = env.action_space.sample()
        else:
            # Check the learned values (exploit)
            action = np.argmax(q_table[state])
        next_state, reward, done, info = env.step(action)
        # Q-learning update (the equation given above)
        old_value = q_table[state, action]
        next_max = np.max(q_table[next_state])
        q_table[state, action] = (1 - alpha) * old_value + alpha * (reward + gamma * next_max)
        if reward == -10:
            penalties += 1
        state = next_state
        epochs += 1
    if i % 100 == 0:
        clear_output(wait=True)
        print(f"Episode: {i}")

print("Training finished.")
Reference : Vihar Kurama, Reinforcement learning with python: a guide to designing and solving problems
https://builtin.com/data-science/reinforcement-learning-python
Lab Assignment:
There are a few obstacles present between the locations (represented with smooth lines in the figure). L6 is the highest-priority location for the preparation of guitar bodies, including the polished wood. Now the task is to enable the robots to find the shortest route on their own from any given location to another location.
Name
Date
Registration No
Department
Quiz
Assignment
___________________
Lab Instructor Signature
Experiment
OBJECTIVE
13
To Solve Monty Hall Problem Using Bayesian Networks In Python
Bayesian Networks bring structure to complex problems in which knowledge and resources are limited. They are applied in some of the era's most innovative technologies, such as Artificial Intelligence and Machine Learning.
A Bayesian Network comes under the umbrella of Probabilistic Graphical Modeling (PGM), a methodology used to quantify uncertainty through the use of probability concepts. Bayesian Networks, popularly known as Belief Networks, use Directed Acyclic Graphs (DAGs) to model uncertainty. Bayesian models are built on basic probability principles. Joint probability is a statistical measure of two or more events occurring simultaneously, e.g., P(A, B, C), the likelihood of events A, B and C occurring together; it can be defined as the probability of the intersection of two or more events. The conditional probability of an event X is the probability of the event occurring given that an event Y has already occurred.
If X and Y are dependent events, then the expression for the conditional probability is given by:
P(X | Y) = P(X and Y) / P(Y)
If X and Y are independent events, then the expression for the conditional probability is given by:
P(X | Y) = P(X)
In a Bayesian Network, the joint probability factorizes over the graph as P(X1, ..., Xn) = P(X1 | Parents(X1)) · ... · P(Xn | Parents(Xn)), where each Xi denotes a random variable whose probability depends on the probability of its parent nodes, Parents(Xi).
Monty Hall Problem
The Monty Hall problem, named after the host of the TV series 'Let's Make A Deal', is a paradoxical probability puzzle that has been confusing people for decades.
The game involves three doors; there is a car behind one of them, and the other two have goats behind them. You begin by picking a random door, say # 2.
The host, who knows where the car is hidden, then opens another door, say # 1 (with a goat behind it). Here's the catch: the host asks you if you want to choose door # 3 instead of your first choice, # 2.
Is it better if you switch your choice or should you stick to your first choice? That is just what we
will be modeling for. We will create a Bayesian Network to understand the probability of
winning if the participant chooses to switch his option.
The graph has three nodes, each representing a door: the door chosen by the guest, the door hiding the prize, and the door opened by Monty.
In the code below, 'A', 'B' and 'C' reflect the door picked by the guest, the prize door and the door picked by Monty. The conditional probabilities for each of the nodes are written out here. There is not much to say about the guest door and the prize door, since both are chosen at random. However, the door picked by Monty depends on the other two doors, so the conditional probabilities below are worked out by taking all possible scenarios into consideration.
# The door Monty picks, depends on the choice of the guest and the prize door
monty =ConditionalProbabilityTable(
[[ 'A', 'A', 'A', 0.0 ],
[ 'A', 'A', 'B', 0.5 ],
[ 'A', 'A', 'C', 0.5 ],
[ 'A', 'B', 'A', 0.0 ],
[ 'A', 'B', 'B', 0.0 ],
[ 'A', 'B', 'C', 1.0 ],
[ 'A', 'C', 'A', 0.0 ],
[ 'A', 'C', 'B', 1.0 ],
[ 'A', 'C', 'C', 0.0 ],
[ 'B', 'A', 'A', 0.0 ],
[ 'B', 'A', 'B', 0.0 ],
[ 'B', 'A', 'C', 1.0 ],
[ 'B', 'B', 'A', 0.5 ],
[ 'B', 'B', 'B', 0.0 ],
[ 'B', 'B', 'C', 0.5 ],
[ 'B', 'C', 'A', 1.0 ],
[ 'B', 'C', 'B', 0.0 ],
[ 'B', 'C', 'C', 0.0 ],
[ 'C', 'A', 'A', 0.0 ],
[ 'C', 'A', 'B', 1.0 ],
[ 'C', 'A', 'C', 0.0 ],
[ 'C', 'B', 'A', 1.0 ],
[ 'C', 'B', 'B', 0.0 ],
[ 'C', 'B', 'C', 0.0 ],
[ 'C', 'C', 'A', 0.5 ],
[ 'C', 'C', 'B', 0.5 ],
[ 'C', 'C', 'C', 0.0 ]], [guest, prize] )
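The table above references guest and prize distributions and a network object that the excerpt does not show. A minimal sketch of the missing setup, assuming the pomegranate library that this example appears to be built on:
from pomegranate import DiscreteDistribution, ConditionalProbabilityTable, State, BayesianNetwork

# The guest's initial choice and the prize door are independent and uniform
guest = DiscreteDistribution({'A': 1./3, 'B': 1./3, 'C': 1./3})
prize = DiscreteDistribution({'A': 1./3, 'B': 1./3, 'C': 1./3})

# monty = ConditionalProbabilityTable([...], [guest, prize]) as defined above

s1 = State(guest, name="guest")
s2 = State(prize, name="prize")
s3 = State(monty, name="monty")

network = BayesianNetwork("Monty Hall Problem")
network.add_states(s1, s2, s3)
network.add_edge(s1, s3)   # Monty's choice depends on the guest's door...
network.add_edge(s2, s3)   # ...and on the prize door
network.bake()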
Now that we’ve built the model, it’s time to make predictions.
beliefs = network.predict_proba({'guest': 'A'})
beliefs = map(str, beliefs)
print("\n".join("{}\t{}".format(state.name, belief) for state, belief in
                zip(network.states, beliefs)))
guest A
prize {
"class" :"Distribution",
"dtype" :"str",
"name" :"DiscreteDistribution",
"parameters" :[
{
"A" :0.3333333333333333,
"B" :0.3333333333333333,
"C" :0.3333333333333333
}
],
}
monty {
"class" :"Distribution",
"dtype" :"str",
"name" :"DiscreteDistribution",
"parameters" :[
{
"C" :0.49999999999999983,
"A" :0.0,
"B" :0.49999999999999983
}
],
}
In the above code snippet, we’ve assumed that the guest picks door ‘A’. Given this information,
the probability of the prize door being ‘A’, ‘B’, ‘C’ is equal (1/3) since it is a random process.
However, the probability of Monty picking ‘A’ is obviously zero since the guest picked door
‘A’.
And the other two doors have a 50 percent chance of being picked by Monty, because we don't yet know which door hides the prize.
beliefs = network.predict_proba({'guest': 'A', 'monty': 'B'})
print("\n".join("{}\t{}".format(state.name, str(belief)) for state, belief
                in zip(network.states, beliefs)))
guest A
prize {
"class" :"Distribution",
"dtype" :"str",
"name" :"DiscreteDistribution",
"parameters" :[
{
"A" :0.3333333333333334,
"B" :0.0,
"C" :0.6666666666666664
}
],
}
monty B
In the above code snippet, we’ve provided two inputs to our Bayesian Network, and this is where things get interesting. We’ve mentioned the following: the guest chose door ‘A’, and Monty opened door ‘B’.
Notice the output: the probability of the car being behind door ‘C’ is approx. 66%. This proves that if the guest switches his choice, he has a higher probability of winning. Though this might seem confusing to some of you, it’s a known fact that:
Guests who decided to switch doors won about 2/3 of the time
Guests who refused to switch won about 1/3 of the time
Bayesian Networks are used for instances like these, which involve predicting uncertain outcomes. In the assignment below, you'll apply a Bayesian Network to a similar problem.
Consider the following Directed Acyclic Graph with probability distribution table.
Name
Date
Registration No
Department
Quiz
Assignment
___________________
Lab Instructor Signature
Experiment
OBJECTIVE
14
To Understand Markov Chain by Implementation in Python
A Markov chain is a mathematical system usually defined as a set of random variables that transition from one state to another according to certain probabilistic rules. These transitions satisfy the Markov Property, which states that the probability of transitioning to any given state depends solely on the current state and the time elapsed, and not on the sequence of states that preceded it. This property renders Markov processes memoryless.
A random process with the Markov property is called a Markov chain. A random, or stochastic, process is a mathematical object defined as a collection of random variables. A Markov chain has either a discrete state space (the set of possible values of the random variables) or a discrete index set (often representing time), which gives rise to several variants of Markov chains. The term "Markov chain" is typically reserved for a process with a discrete set of times, that is, a Discrete-Time Markov Chain (DTMC).
A Markov chain is a sequence of random variables X1, X2, X3, ... with the Markov property, such that the probability of moving to the next state depends only on the present state and not on the previous states. Putting this in a mathematical probabilistic formula:
Pr(X_{n+1} = x | X_1 = x_1, X_2 = x_2, ..., X_n = x_n) = Pr(X_{n+1} = x | X_n = x_n)
State Diagram
The Markov chain shown in the state diagram has three possible states: Sleep, Run, Icecream, so the transition matrix will be 3 x 3. Remember that the arrows leaving a state always sum up to exactly one; likewise, the entries in each row of the transition matrix add up to exactly one, each row representing a probability distribution. The cells do the same job in the transition matrix as the arrows do in the state diagram.
Transition Matrix
With the example you saw, you can now answer questions like: "Beginning from the state Sleep, what is the probability that Cj will be running (state: Run) at the end of a sad two-day duration?"
Let's figure this one out: to go from state Sleep to state Run, Cj will either remain in state Sleep on the first move (or day) and then move to state Run on the next (second) move (0.2 x 0.6); or move to state Run on the first day and then remain there on the second day (0.6 x 0.6); or move to state Icecream on the first move and then to state Run on the second move (0.2 x 0.7). Thus the probability is: ((0.2 x 0.6) + (0.6 x 0.6) + (0.2 x 0.7)) = 0.62. So we can now conclude that there is a 62 percent chance that Cj will be in state Run after two days of being sad, if she began in state Sleep.
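This two-step calculation is just the (Sleep, Run) entry of the squared transition matrix. As a quick sanity check (not in the original text), with rows and columns ordered Sleep, Run, Icecream and the entries taken from the worked example above and the branch probabilities in the code below:
import numpy as np

# Transition matrix from the example: rows/columns are Sleep, Run, Icecream
T = np.array([[0.2, 0.6, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.7, 0.1]])

# (T^2)[0][1] = P(Run after two steps | start in Sleep)
print(np.linalg.matrix_power(T, 2)[0, 1])   # ~0.62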
Let's try coding the above example in Python. And while you'd probably be using a library in
real life that encodes Markov Chains in a really effective way, the code will help you get started.
import numpy as np
import random as rm
Let's now define the states and their probabilities: the transition matrix. Remember, the matrix is going to be a 3 X 3 matrix since you have three states. You will also have to define the transition paths; you can do this using matrices as well.
# The statespace
states = ["Sleep", "Icecream", "Run"]

# Possible sequences of events and their probabilities (these two
# definitions are missing from the excerpt; they are reconstructed to match
# the branches of activity_forecast below, with each row summing to 1)
transitionName = [["SS", "SR", "SI"], ["RS", "RR", "RI"], ["IS", "IR", "II"]]
transitionMatrix = [[0.2, 0.6, 0.2], [0.1, 0.6, 0.3], [0.2, 0.7, 0.1]]
Oh, always make sure the probabilities sum up to 1. And it doesn't hurt to leave error messages,
at least when coding
if sum(transitionMatrix[0]) + sum(transitionMatrix[1]) + sum(transitionMatrix[2]) != 3:
    print("Somewhere, something went wrong. Transition matrix, perhaps?")
else:
    print("All is gonna be okay, you should move on!! ;)")
Now let's code the real thing. You will use the numpy.random.choice to generate a random
sample from the set of transitions possible. While most of its arguments are self-explanatory, the
p might not be. It is an optional argument that lets you enter the probability distribution for the
sampling set, which is the transition matrix in this case.
# A function that implements the Markov model to forecast the state/mood.
def activity_forecast(days):
# Choose the starting state
activityToday = "Sleep"
print("Start state: " + activityToday)
    # Shall store the sequence of states taken. So, this only has the
    # starting state for now.
    activityList = [activityToday]
i = 0
# To calculate the probability of the activityList
prob = 1
while i != days:
if activityToday == "Sleep":
            change = np.random.choice(transitionName[0], replace=True, p=transitionMatrix[0])
if change == "SS":
prob = prob * 0.2
activityList.append("Sleep")
pass
elif change == "SR":
prob = prob * 0.6
activityToday = "Run"
activityList.append("Run")
else:
prob = prob * 0.2
activityToday = "Icecream"
activityList.append("Icecream")
elif activityToday == "Run":
            change = np.random.choice(transitionName[1], replace=True, p=transitionMatrix[1])
if change == "RR":
                prob = prob * 0.6
activityList.append("Run")
pass
elif change == "RS":
                prob = prob * 0.1
activityToday = "Sleep"
activityList.append("Sleep")
else:
prob = prob * 0.3
activityToday = "Icecream"
activityList.append("Icecream")
elif activityToday == "Icecream":
            change = np.random.choice(transitionName[2], replace=True, p=transitionMatrix[2])
if change == "II":
prob = prob * 0.1
activityList.append("Icecream")
pass
elif change == "IS":
prob = prob * 0.2
activityToday = "Sleep"
activityList.append("Sleep")
else:
prob = prob * 0.7
activityToday = "Run"
activityList.append("Run")
i += 1
print("Possible states: " + str(activityList))
print("End state after "+ str(days) + " days: " + activityToday)
print("Probability of the possible sequence of states: " + str(prob))
# Function that forecasts the possible state for the next 2 days
activity_forecast(2)
You get a random set of possible transitions, along with the probability of that sequence occurring, starting from state Sleep. Extend the program further to iterate it a few thousand times with the same starting state; you can then estimate the probability of ending at any particular state. Let's rewrite the function activity_forecast and add a fresh set of loops to do this...
def activity_forecast(days):
# Choose the starting state
activityToday = "Sleep"
activityList = [activityToday]
i = 0
prob = 1
while i != days:
if activityToday == "Sleep":
            change = np.random.choice(transitionName[0], replace=True, p=transitionMatrix[0])
if change == "SS":
prob = prob * 0.2
activityList.append("Sleep")
pass
elif change == "SR":
prob = prob * 0.6
activityToday = "Run"
activityList.append("Run")
else:
prob = prob * 0.2
activityToday = "Icecream"
activityList.append("Icecream")
elif activityToday == "Run":
            change = np.random.choice(transitionName[1], replace=True, p=transitionMatrix[1])
if change == "RR":
                prob = prob * 0.6
activityList.append("Run")
pass
elif change == "RS":
                prob = prob * 0.1
activityToday = "Sleep"
activityList.append("Sleep")
else:
prob = prob * 0.3
activityToday = "Icecream"
activityList.append("Icecream")
elif activityToday == "Icecream":
            change = np.random.choice(transitionName[2], replace=True, p=transitionMatrix[2])
if change == "II":
prob = prob * 0.1
activityList.append("Icecream")
pass
elif change == "IS":
prob = prob * 0.2
activityToday = "Sleep"
activityList.append("Sleep")
else:
prob = prob * 0.7
activityToday = "Run"
activityList.append("Run")
i += 1
return activityList
# `range` starts from the first count up until but excluding the last count
list_activity = []
for iterations in range(1, 10000):
    list_activity.append(activity_forecast(2))

# Count how often the chain ends in state "Run" after 2 days
count = 0
for smaller_list in list_activity:
    if smaller_list[2] == "Run":
        count += 1

# Estimated probability of starting at 'Sleep' and ending at 'Run'
percentage = (count / len(list_activity)) * 100
print("The probability of starting at state 'Sleep' and ending at state 'Run' = " + str(percentage) + "%")
Reference: Markov Chain Monte Carlo, Population Health Method, Columbia University Mailman School of
Public Health https://www.mailman.columbia.edu/research/population-health-methods/markov-chain-monte-carlo