Term Project Report CIS 667
Submitted by:
Khushboo Gupta
1. Introduction
1.1 Purpose
The purpose of the project is to create a Red-Green-Blank Puzzle Solver using state
space search algorithms and to compare the performance of each algorithm based on the
results generated for a variety of inputs.
1.4 Challenges
While developing the solution to the above problem, the following factors can be
challenging and need to be dealt with -
● There might not be a solution, so there must be a limit on the running time of the solver.
● Comparison factors need to be clearly incorporated into the solver.
● Multiple problem sets need to be designed to check the performance of each
algorithm, taking care of the cases where the problem grid cannot reach the
goal.
● As the puzzle and goal grids are supplied by the user, the cases where the user
input is wrong or invalid need proper handling.
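The input-handling concern in the last bullet can be illustrated with a minimal sketch. It assumes a text format of one grid row per line with the letters R, G and B, and exactly one Blank per grid; the function names `parse_grid` and `same_cells` are hypothetical and not taken from the project code.

```python
from collections import Counter

def parse_grid(text):
    """Parse a square grid (one row per line) into a flat, row-major list of cells."""
    rows = [line.strip() for line in text.splitlines() if line.strip()]
    side = len(rows)
    if side == 0 or any(len(row) != side for row in rows):
        raise ValueError("grid must be square (n rows of n cells)")
    cells = [cell for row in rows for cell in row]
    if set(cells) - set("RGB"):
        raise ValueError("cells must be R, G or B")
    if cells.count("B") != 1:          # assumption: exactly one Blank per grid
        raise ValueError("grid must contain exactly one Blank (B)")
    return cells

def same_cells(start, goal):
    """Necessary (not sufficient) condition for the goal to be reachable."""
    return Counter(start) == Counter(goal)

start = parse_grid("RB\nRG")
goal = parse_grid("RR\nBG")
print(start, same_cells(start, goal))
```

Comparing cell counts only rules out goals that can never be reached because the colours do not match; it does not prove that a solution exists, so a running-time limit is still needed.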
Khushboo Gupta CIS -667 Term Project Report
2. Implementation
Loc.add(row,col)
Break
If (i+1)%boardlength == 0:
Increment row by 1
return loc
Function validmoves(board, operator) returns True if the operator is a valid move for
the Blank and False otherwise.
def validmoves(board, operator):
    loc = lookBlank(board)
    If operator == Up
        return True if loc[0] is not 0, else False
    If operator == Down
        return True if loc[0] is not boardlength-1, else False
    If operator == Left
        return True if loc[1] is not 0, else False
    If operator == Right
        return True if loc[1] is not boardlength-1, else False
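A runnable version of this check might look like the sketch below. It assumes the board is a flat, row-major list of single-character cells with "B" for the Blank, and it derives the side length from the board instead of using a global `boardlength`; the snake_case names are illustrative only.

```python
import math

def look_blank(board):
    """Return (row, col) of the Blank cell "B" in a flat, row-major board."""
    side = math.isqrt(len(board))
    index = board.index("B")          # raises ValueError if there is no Blank
    return divmod(index, side)

def valid_move(board, operator):
    """True if the Blank can move in the given direction without leaving the grid."""
    side = math.isqrt(len(board))
    row, col = look_blank(board)
    if operator == "Up":
        return row != 0
    if operator == "Down":
        return row != side - 1
    if operator == "Left":
        return col != 0
    if operator == "Right":
        return col != side - 1
    return False                      # unknown operator

board = list("RBRG")                  # 2*2 grid: RB / RG, Blank at (0, 1)
print([op for op in ("Up", "Down", "Left", "Right") if valid_move(board, op)])
# ['Down', 'Left']
```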
Function updateboard(board, operator) updates the board to the next configuration if
the operation is valid. The function checks whether the operator is Up, Down, Left or
Right and, based on the operator, uses a temporary variable as an intermediate to
exchange the Blank with the neighbouring cell on the board. Then the updated board is returned.
def updateboard(board, oper):
    childboard = list(board)  # copy, so that the parent's board is not modified
    loc = lookBlank(board)
    l = boardlength
    if oper == "Up":
        t = childboard[int((loc[0]*l)+loc[1])]
        childboard[int((loc[0]*l)+loc[1])] = board[int((loc[0]*l)+(loc[1]-l))]
        childboard[int((loc[0]*l)+(loc[1]-l))] = t
        childboard[int((loc[0]*l)+(loc[1]-1))] = t
    else:
        t = childboard[int((loc[0]*l)+loc[1])]
        childboard[int((loc[0]*l)+loc[1])] = board[int((loc[0]*l)+(loc[1]+1))]
        childboard[int((loc[0]*l)+(loc[1]+1))] = t
    return childboard
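The four branches of updateboard differ only in which neighbour is swapped with the Blank, so the same behaviour can be sketched with a table of offsets. This is an illustrative rewrite under the assumption of a flat, row-major board with "B" as the Blank; `MOVES` and `update_board` are hypothetical names.

```python
import math

# Offsets of the target cell relative to the Blank, as (row, col) deltas.
MOVES = {"Up": (-1, 0), "Down": (1, 0), "Left": (0, -1), "Right": (0, 1)}

def update_board(board, operator):
    """Return a new board with the Blank moved one cell in `operator` direction."""
    side = math.isqrt(len(board))
    row, col = divmod(board.index("B"), side)
    d_row, d_col = MOVES[operator]
    new_row, new_col = row + d_row, col + d_col
    if not (0 <= new_row < side and 0 <= new_col < side):
        raise ValueError(f"{operator} moves the Blank off the grid")
    child = list(board)               # copy, so the parent board is not modified
    src, dst = row * side + col, new_row * side + new_col
    child[src], child[dst] = child[dst], child[src]
    return child

parent = list("RBRG")                 # RB / RG
child = update_board(parent, "Left")  # BR / RG
print(child, parent)
```

Returning a copy matters because the same parent board is reused for every operator during node expansion.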
3. Simulation - This section acts as a front end that runs the chosen algorithm based
on the user inputs: the algorithm number (1 to 5), the input problem file and the input goal
file. 1 corresponds to Breadth First Search, 2 corresponds to Depth First Search, 3
corresponds to Iterative Deepening Search, 4 corresponds to Greedy Best First Search
and 5 corresponds to A* Search.
1. Uninformed State Search Algorithms - The uninformed search algorithms don't use
any domain-specific knowledge when generating the tree for searching. In this
case, these approaches use the goal grid only to test whether a node is the goal,
not to decide which node to expand next.
- Breadth First Search - Breadth First Search (BFS) searches the problem space
breadth-wise, from left to right. It uses a FIFO (First In First Out)
queue to hold the nodes waiting to be explored while looking for the
goal. The nodes are generated starting from the root of the search
tree, and one level of the tree is explored at a time until the goal state is
found.
This algorithm generates a node for every valid move of the Blank cell,
makes the current node the parent of the newly generated node,
and tests whether the new node is the goal state. The pseudocode for
this approach is as below -
bfs_solver(root, goal, filename):
    nodesexplored, nodesgenerated = 0
    Operators = setOperators()
    current is root
    Add root to the Queue
    If current is goal:
        Write solution to filename
        Return True
    Else while True:
        If Queue is empty:
            Return False
        Current = pop first node from Queue
        Add current to explored nodes
        Increment nodesexplored by 1
        For n in operators:
            if operator n is valid == True:
                board = updateboard(current.getboard(),operator[n])
                Child = Node(board)
                Child.parent = current
                Child.operator = operator[n]
                Increment nodesgenerated by 1
                If child is not in explored:
                    If child is goal:
                        Write solution to filename
                        Print solution board
                        Print nodesexplored, nodesgenerated
                        Return True
                    Add Child to the Queue
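The pseudocode above can be condensed into a short runnable sketch. This is not the project's implementation: it assumes boards are flat, row-major sequences with "B" as the Blank, stores boards as tuples so they can be dictionary keys, and marks a board as seen when it is generated instead of when it is explored. The names `successors` and `bfs` are illustrative.

```python
import math
from collections import deque

MOVES = {"Up": (-1, 0), "Down": (1, 0), "Left": (0, -1), "Right": (0, 1)}

def successors(board):
    """Yield (operator, child_board) for every valid move of the Blank."""
    side = math.isqrt(len(board))
    row, col = divmod(board.index("B"), side)
    for op, (d_row, d_col) in MOVES.items():
        r, c = row + d_row, col + d_col
        if 0 <= r < side and 0 <= c < side:
            child = list(board)
            # Move the neighbouring cell into the Blank's place and vice versa.
            child[row * side + col], child[r * side + c] = child[r * side + c], "B"
            yield op, tuple(child)

def bfs(start, goal):
    """Return a shortest list of operators from start to goal, or None."""
    start, goal = tuple(start), tuple(goal)
    if start == goal:
        return []
    parents = {start: None}           # board -> (parent board, operator)
    queue = deque([start])            # FIFO frontier
    while queue:
        current = queue.popleft()
        for op, child in successors(current):
            if child in parents:      # already generated earlier
                continue
            parents[child] = (current, op)
            if child == goal:
                path = []
                while parents[child] is not None:
                    child, op = parents[child]
                    path.append(op)
                return path[::-1]
            queue.append(child)
    return None                       # frontier exhausted: goal unreachable

print(bfs("RBRG", "RRBG"))            # ['Left', 'Down']
```

Because the state space of a finite grid is finite, this version returns None when the goal cannot be reached, although on larger grids that can take far longer than a practical time limit.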
            Child = Node(board)
            Child.parent = current
            Child.operator = operator[n]
            Increment nodesgenerated by 1
            result = R_depthlsearch(child, depth-1,goal)
            If result is goal:
                return result
    return node
2. Informed State Search Algorithms - The informed state search algorithms use an
evaluation function, which estimates the cost to the goal, and choose the node to expand
based on this evaluation function. In this domain, two algorithms, Greedy Best First
Search and A* Search, are used.
- Greedy Best First Search - Greedy Best First is a special case of the Best
First Search algorithm which uses a heuristic function to estimate which node is
closest to the goal node.
In this instance, the heuristic function is based on how similar the
cells on the grid of a node are to those of the goal node. The lower the heuristic value,
the higher the priority of the node for expansion. The pseudocode is given
below -
def greedyH(board, goal):
    Hval = 0
    For i in boardsize:
        If board[i] is not "B":
            If board[i] is not goal[i]:
                Increment Hval by 1
    return Hval
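In Python, the same count can be written as a single expression. The sketch assumes both grids are flat sequences of equal length with "B" as the Blank; `greedy_h` is an illustrative name.

```python
def greedy_h(board, goal):
    """Count non-Blank cells whose colour differs from the goal at that position."""
    return sum(1 for cell, target in zip(board, goal)
               if cell != "B" and cell != target)

# RB / RG against the goal RR / BG: only the R in the bottom-left cell is out of place.
print(greedy_h(list("RBRG"), list("RRBG")))   # 1
```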
- A* Search - The A* algorithm combines best first search with the
path cost of each node to compute optimal solutions efficiently. The cost associated
with a node is f(n) = g(n) + h(n), where g(n) is the cost of the path from the
initial state to node n and h(n) is the heuristic estimate of the cost from n to a goal.
Hence, f(n) estimates the lowest total cost of any solution path going through node n. At
each point, a node with the lowest f value is chosen for expansion. The A* algorithm
finds an optimal path to a goal if the heuristic function h(n) is admissible,
meaning it never overestimates the actual cost.
In this case, the heuristic function is similar to the one
used in Greedy Best First Search, so the solution might not be optimal,
as the heuristic might overestimate the cost. The pseudocode is as follows -
F = current.Hval + current.cost
Increment nodesgenerated by 1
If child is not in explored:
    Add child to explored
Else if child is in explored:
    Check if child.cost > Cost:
        Update the child.cost with Cost
        Remove the old child from PriorityQueue
        Add the updated child to the PriorityQueue
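For comparison, a compact A* can be sketched with Python's `heapq` module. This is an assumption-laden illustration, not the project's code: each move costs 1, the heuristic counts misplaced non-Blank cells, and instead of removing an old entry from the priority queue it leaves the stale entry in place and skips it when popped (lazy deletion). The names `successors`, `misplaced` and `a_star` are hypothetical.

```python
import heapq
import math

MOVES = {"Up": (-1, 0), "Down": (1, 0), "Left": (0, -1), "Right": (0, 1)}

def successors(board):
    """Yield (operator, child_board) for every valid move of the Blank."""
    side = math.isqrt(len(board))
    row, col = divmod(board.index("B"), side)
    for op, (d_row, d_col) in MOVES.items():
        r, c = row + d_row, col + d_col
        if 0 <= r < side and 0 <= c < side:
            child = list(board)
            child[row * side + col], child[r * side + c] = child[r * side + c], "B"
            yield op, tuple(child)

def misplaced(board, goal):
    """Heuristic h(n): non-Blank cells that differ from the goal."""
    return sum(1 for cell, target in zip(board, goal)
               if cell != "B" and cell != target)

def a_star(start, goal, h=misplaced):
    """Return a list of operators found by A* with f = g + h, or None."""
    start, goal = tuple(start), tuple(goal)
    best_cost = {start: 0}                        # cheapest known g per board
    frontier = [(h(start, goal), 0, start, [])]   # entries are (f, g, board, path)
    while frontier:
        f, g, board, path = heapq.heappop(frontier)
        if board == goal:
            return path
        if g > best_cost[board]:
            continue                              # stale entry, a cheaper one exists
        for op, child in successors(board):
            child_g = g + 1                       # cost of the child, not the parent
            if child_g < best_cost.get(child, math.inf):
                best_cost[child] = child_g
                child_f = child_g + h(child, goal)
                heapq.heappush(frontier, (child_f, child_g, child, path + [op]))
    return None

print(a_star("RBRG", "RRBG"))                     # ['Left', 'Down']
```

Note that f is computed from the child's own cost and heuristic value, and that the goal test happens when a node is popped, not when it is generated; both details affect whether the returned path is the cheapest one.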
For the experiment, I made 5 different problem sets. The 5 problem sets
contain initial RBG grids ranging from 2*2 to 6*6, together with their respective goal files.
I ran each problem set on all 5 of the algorithm solvers to collect the performance data.
That gives a total of 25 performance datasets, one for each problem set with each of the 5
algorithms. If an algorithm crossed the 60-second mark, I terminated its
execution, as the other algorithms in the same problem set found the solution
much faster, and this was enough to judge the performance of the slower algorithm in
comparison to the others in the same problem set.
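The report does not say how the 60-second cutoff was enforced. One possible way, sketched here under the assumption that a solver can be driven one expansion at a time, is to compare a monotonic clock against a deadline inside the search loop. `run_with_limit` and `fake_step` are hypothetical names.

```python
import time

def run_with_limit(step, seconds=60.0):
    """Call step() repeatedly until it returns a result or the time limit passes.

    `step` is a hypothetical callable that performs one node expansion and
    returns None while the search is still in progress.
    """
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        result = step()
        if result is not None:
            return result
    return None                       # timed out

# Usage: a stand-in "search" that finishes on its third step.
calls = []

def fake_step():
    calls.append(1)
    return "solved" if len(calls) == 3 else None

print(run_with_limit(fake_step, seconds=1.0))   # solved
```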
The results and observations for the experiments are mentioned in the next section.
Below is the set of 5 problems I used to compare the 5 algorithms.
Before actually running them, it was necessary to check whether the goal is reachable at all.
I have summarized the results these problems generate.
            Problem   Goal
2*2 Grid    RB        RR
            RG        BG
RGGRR RGBGR
RGGRG RGGGR
RRRBR RRRRR
Breadth First Search - According to the results, BFS finds the optimal
solution in all the cases where one exists. However, it seems a bit slow in the cases where
the problem is comparatively small and less complicated. It is safe to say that BFS can be
used to solve this puzzle when the expectation is to find the solution with the fewest steps
within reasonable time. However, it generated and explored more nodes than the
informed state search algorithms in most of the cases.
Depth First Search - I used 60 seconds as the cutoff running time for all the algorithms. DFS
performed worst on every measure. On the problems with a higher number of cells, where the
board configuration differed from the goal in more than 3-4 cells (the 5*5 and 6*6 problems),
it didn't find any solution within 60 seconds, while the other 4 algorithms finished within a
second. Also, except for the 2*2 problem, it didn't find the optimal solution. So DFS is the
poorest choice out of the 5 algorithms in this case.
Iterative Deepening Search - IDS is also one of the best choices for finding the optimal
solution to this problem within reasonable time. In all the problem sets, it always found the
optimal solution. Only on the 4*4 grid did it take 20-25 times the running time of
BFS, DFS and GBFS. That may be because the configuration of the 4*4 grid was very
different from the goal, both in the positions of the cells and in the distance of the Blank
cell from its position in the goal grid.
I also counted the nodes explored and generated across all iterations until the
solution was found, so the counts reflect the fact that the deeper the tree is explored, the
more often the shallower nodes are explored again. For example, when the solution is found at
depth d, the root is explored d times, the children of the root are explored d-1 times and so on.
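The repeated exploration can be made visible with a small iterative deepening sketch that records how often nodes at each depth are visited. It is an illustration under assumptions, not the project's code: boards are flat tuples with "B" as the Blank, there is no cycle checking, and the depth limit starts at 0, so the root is visited once more than it would be if the limit started at 1. All names are hypothetical.

```python
import math
from collections import Counter

MOVES = {"Up": (-1, 0), "Down": (1, 0), "Left": (0, -1), "Right": (0, 1)}

def successors(board):
    """Yield (operator, child_board) for every valid move of the Blank."""
    side = math.isqrt(len(board))
    row, col = divmod(board.index("B"), side)
    for op, (d_row, d_col) in MOVES.items():
        r, c = row + d_row, col + d_col
        if 0 <= r < side and 0 <= c < side:
            child = list(board)
            child[row * side + col], child[r * side + c] = child[r * side + c], "B"
            yield op, tuple(child)

def depth_limited(board, goal, limit, visits, depth=0):
    """Recursive depth-limited search; records one visit per node at its depth."""
    visits[depth] += 1
    if board == goal:
        return []
    if limit == 0:
        return None
    for op, child in successors(board):
        result = depth_limited(child, goal, limit - 1, visits, depth + 1)
        if result is not None:
            return [op] + result
    return None

def ids(start, goal, max_depth=20):
    """Run depth-limited search with limits 0, 1, 2, ... up to max_depth."""
    start, goal = tuple(start), tuple(goal)
    visits = Counter()                # depth -> visits summed over all iterations
    for limit in range(max_depth + 1):
        path = depth_limited(start, goal, limit, visits)
        if path is not None:
            return path, visits
    return None, visits

path, visits = ids("RBRG", "RRBG")
print(path, dict(visits))
```

For the 2*2 problem the solution lies at depth 2, so the root is visited in each of the three iterations (limits 0, 1 and 2), which is why the explored count grows faster than the number of distinct boards.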
Greedy Best First Search - GBFS also performs reasonably well with the heuristic
based on the position of cells with respect to the goal grid. The heuristic value of a node is
increased by 1 for every cell of the node that differs from the corresponding cell in the goal
grid. Out of the 5 problems, GBFS finds the optimal solution for 4 of them. On
average, it has the lowest running time of the 5 algorithms. Compared to the other
4 algorithms, it also has the lowest number of nodes generated and nodes explored.
A* Search - With the current heuristic function, A* performs the second worst on the
current problem set. This suggests that the heuristic function used is not admissible for
every node or state. In all the cases where it found a solution, except the 2*2 grid, the
solution was not optimal. However, in terms of running time, it comes close to BFS. Also, in
all cases, it generated and explored more nodes than GBFS but fewer than the other 3 algorithms.
The respective performance of all 5 algorithms can be determined from the graphs below.
The 4*4 grid problem is the most complicated among the whole problem set.
4. Conclusion -
This RBG puzzle solver implements 5 algorithms, and 2 of them, Breadth First
Search and Iterative Deepening Search, always find optimal answers. Depth First Search
performs the worst, both in running time and on bigger problems whose grid
configuration differs greatly from the goal. Greedy Best First Search also performs
quite well, with the lowest running time and nodes explored on average and optimal
solutions found for 4 out of 5 problems. A* Search performs the second worst, as it finds
non-optimal solutions for 4 out of 5 problems, with unnecessarily long sequences of steps.
In the case of Iterative Deepening Search, the nodes explored were more than the nodes
generated, while it is the opposite for the other algorithms. This is most probably because,
with each iteration in which the search goes deeper, the nodes are explored again, and
my implementation reflected that fact. The most difficult part of implementing this project
was figuring out the priority queue implementation and a proper heuristic function for A*. I
use the same heuristic function in both Greedy Best First Search and A*, with far better
results in the case of Greedy Best First Search. One theory I have about this oddity concerns
the cost associated with each node. The cost is incremented by 1 each time a child is
generated by a valid operator (move). As the priority queue (frontier) for A* orders nodes by
the sum of the cost and the heuristic value, the cost might have affected the priority of
nodes, and hence the wrong or less optimal nodes might get explored. Also, as there is no way
to know in advance a threshold for the combined cost and heuristic value, nothing was
implemented to stop exploring the non-optimal nodes on the way to the optimal solution.
For future work, I would like to find a proper heuristic function for the A* algorithm
implementation for this problem. It might be a better direction to explore how Manhattan
and Euclidean distances fit into A*’s heuristic function. However, the problem grid doesn’t
only differ from the goal grid in the location of the “Blank” cell; the configuration
and position of the rest of the cells can also be very different from the goal grid. The best
option may be to define the heuristic function as a combination of the Manhattan distance of
the “Blank” cell from the goal’s “Blank” and the position of the rest of the cells with
respect to the goal’s grid. Also, more problem sets can be generated to get a more concrete
idea of the performance of all 5 algorithms experimentally.
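A minimal sketch of the proposed combination follows, assuming flat row-major boards with "B" as the Blank; the function names are hypothetical. It shows two ways of combining the parts: adding them, and taking the larger of the two.

```python
import math

def blank_manhattan(board, goal):
    """Manhattan distance between the Blank's position in board and in goal."""
    side = math.isqrt(len(board))
    row, col = divmod(board.index("B"), side)
    goal_row, goal_col = divmod(goal.index("B"), side)
    return abs(row - goal_row) + abs(col - goal_col)

def misplaced(board, goal):
    """Non-Blank cells that differ from the goal at the same position."""
    return sum(1 for cell, target in zip(board, goal)
               if cell != "B" and cell != target)

def combined_sum(board, goal):
    """Sum of both parts; can exceed the true number of moves."""
    return blank_manhattan(board, goal) + misplaced(board, goal)

def combined_max(board, goal):
    """Larger of the two parts; a more cautious combination."""
    return max(blank_manhattan(board, goal), misplaced(board, goal))

# BR / RG is one move (Down) away from the goal RR / BG.
board, goal = list("BRRG"), list("RRBG")
print(combined_sum(board, goal), combined_max(board, goal))   # 2 1
```

The example state is one move from the goal, yet the sum evaluates to 2, so the summed form overestimates there. If optimality of A* matters, taking the maximum of the two parts may be the safer starting point, since a single move shifts the Blank by one cell and changes the misplaced count by at most one.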