Assignment # 06: Graph Search Techniques
Graph
A graph is a representation of a set of objects where some pairs of objects are connected by links. The interconnected objects are represented by mathematical abstractions called vertices, and the links that connect some pairs of vertices are called edges. Typically, a graph is depicted in diagrammatic form as a set of dots for the vertices, joined by lines or curves for the edges. Graphs are one of the objects of study in discrete mathematics.
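As a minimal sketch of the definition above, a graph can be stored as an adjacency list that maps each vertex to the set of vertices it shares an edge with. The function name `build_graph` and the sample edges are illustrative assumptions, not part of the assignment.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected adjacency-list graph from (u, v) vertex pairs."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)  # edge u - v
        graph[v].add(u)  # undirected, so store the reverse direction too
    return graph

# Hypothetical example data: four vertices joined by three edges.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
graph = build_graph(edges)
```

Here `graph["b"]` holds the two vertices adjacent to `b`. A directed graph would simply omit the reverse insertion.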
Graph Search
One of the most fundamental tasks on graphs is searching a graph by starting at some vertex, or set of vertices, and visiting new vertices by crossing (out) edges until there is nothing left to search. In such a search we need to be systematic to make sure that we visit all vertices that we can reach and that we do not visit vertices multiple times. This requires recording which vertices we have already visited so we don't visit them again. Graph searching can be used to determine various properties of graphs, such as whether the graph is connected or whether it is bipartite, as well as various properties relating vertices, such as whether a vertex u is reachable from v, or finding the shortest path between u and v.
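The bookkeeping described above can be sketched as follows. This is one possible illustration, assuming the graph is a dictionary in which every vertex appears as a key mapped to its neighbors; the names `reachable` and `is_connected` are hypothetical.

```python
from collections import deque

def reachable(graph, start):
    """Return the set of vertices reachable from start."""
    visited = {start}          # records vertices already seen
    frontier = deque([start])  # vertices whose edges are still to be crossed
    while frontier:
        u = frontier.popleft()
        for v in graph.get(u, ()):
            if v not in visited:   # never visit a vertex twice
                visited.add(v)
                frontier.append(v)
    return visited

def is_connected(graph):
    """An undirected graph is connected if one search reaches every vertex."""
    if not graph:
        return True
    start = next(iter(graph))
    return reachable(graph, start) == set(graph)
```

Whether u is reachable from v is then the check `u in reachable(graph, v)`.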
graph is shallow (the longest shortest path from the source to any vertex is reasonably small). In fact, the depth of the graph will show up in the bounds for span. Fortunately many real-world graphs are shallow, but if we are concerned with worst-case behavior over any graph, then BFS is also sequential.
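The depth referred to above (the longest shortest path from the source to any vertex) can be measured with a level-by-level breadth-first search. The sketch below assumes a dictionary-of-lists graph; `bfs_levels` is an illustrative name.

```python
def bfs_levels(graph, source):
    """Return the BFS levels: levels[k] lists the vertices at distance k."""
    visited = {source}
    frontier = [source]
    levels = []
    while frontier:
        levels.append(frontier)
        next_frontier = []
        # All vertices of one level could in principle be processed together;
        # the number of levels is what limits the available parallelism.
        for u in frontier:
            for v in graph.get(u, ()):
                if v not in visited:
                    visited.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return levels
```

The depth from the source is `len(bfs_levels(graph, source)) - 1`.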
Uninformed Search
An uninformed (blind) search algorithm generates the search tree without using any domain-specific knowledge. If a state is not a goal, we cannot tell how close to the goal it might be. Hence, all we can do is move systematically between states until we stumble on a goal. In contrast, informed (heuristic) search uses an estimate of how close to the goal a state might be.
Depth-First Search(DFS):
- Enqueue nodes on the nodes list in LIFO (last-in, first-out) order. That is, the nodes list is used as a stack to order nodes.
- May not terminate without a depth bound, i.e., cutting off search below a fixed depth D (depth-limited search).
- Not complete (with or without cycle detection, and with or without a cutoff depth).
- Exponential time, O(b^d), but only linear space, O(bd), where b is the branching factor and d is the maximum depth.
- Can find long solutions quickly if lucky (and short solutions slowly if unlucky!).
- When search hits a dead end, it can only back up one level at a time, even if the problem occurs because of a bad operator choice near the top of the tree. Hence, it only does chronological backtracking.
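A depth-limited version of the search described above might look like the following sketch. It assumes a dictionary-of-lists graph and stores whole paths on the stack for simplicity; the function name and the `limit` parameter are illustrative choices.

```python
def depth_limited_dfs(graph, start, goal, limit):
    """Return a path from start to goal using at most `limit` edges, or None."""
    stack = [[start]]  # LIFO stack of partial paths
    while stack:
        path = stack.pop()           # most recently added path first
        node = path[-1]
        if node == goal:
            return path
        if len(path) - 1 >= limit:   # cutoff: do not search below the limit
            continue
        # Push in reverse so the first listed neighbor is expanded first.
        for nxt in reversed(graph.get(node, [])):
            if nxt not in path:      # avoid cycles along the current path
                stack.append(path + [nxt])
    return None
```

With a limit that is too small the search returns `None` even though a solution exists, which is the incompleteness noted in the list above.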
Uniform-Cost Search(UCS):
- Enqueue nodes by path cost. That is, let g(n) = cost of the path from the start node to the current node n. Sort nodes by increasing value of g.
- Called Dijkstra's Algorithm in the algorithms literature and similar to the Branch and Bound Algorithm in the operations research literature.
- Complete (*).
- Optimal/Admissible (*).
- (*) Admissibility depends on the goal test being applied when a node is removed from the nodes list, not when its parent node is expanded and the node is first generated.
- Exponential time and space complexity, O(b^d).
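The ordering by g(n) can be sketched with a priority queue. This illustration assumes non-negative step costs, a graph given as a dictionary mapping each node to a list of (neighbor, cost) pairs, and node labels that can be compared with each other (needed only for tie-breaking in the heap). The function name is hypothetical.

```python
import heapq

def uniform_cost_search(graph, start, goal):
    """Return (cost, path) for a cheapest path from start to goal, or None."""
    frontier = [(0, start, [start])]  # entries are (g, node, path), lowest g first
    expanded = {}                     # node -> g value at which it was expanded
    while frontier:
        g, node, path = heapq.heappop(frontier)
        if node == goal:              # goal test on removal, not on generation
            return g, path
        if node in expanded and expanded[node] <= g:
            continue                  # already expanded via a path at least as cheap
        expanded[node] = g
        for nxt, cost in graph.get(node, ()):
            heapq.heappush(frontier, (g + cost, nxt, path + [nxt]))
    return None
```

In the test graph used to check this sketch, the direct edge to the goal costs 10 while a two-step route costs 3. Testing for the goal when the node is generated would return the expensive direct edge, which is why the test is applied on removal.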
Informed Search
A problem determines the graph and the goal but not which path to select from the frontier. This is the job of a search strategy. A search strategy specifies which paths are selected from the frontier. Different strategies are obtained by modifying how the selection of paths in the frontier is implemented.
It is not difficult to see that uninformed search will pursue options that lead away from the goal as easily as it pursues options that lead towards the goal. For any but the smallest problems this leads to searches that take unacceptable amounts of time and/or space. Informed search tries to reduce the amount of search that must be done by making intelligent choices for the nodes that are selected for expansion. This implies the existence of some way of evaluating the likelihood that a given node is on the solution path. In general this is done using a heuristic function.
Best-first search:
Idea: use an evaluation function for each node, giving an estimate of its "desirability". Expand the most desirable unexpanded node.
Implementation: the fringe is a queue sorted in decreasing order of desirability.
Special cases: greedy search and A* search.
Greedy Search:
Evaluation function h(n) (heuristic) = estimate of the cost from n to the closest goal. E.g., h_SLD(n) = straight-line distance from n to Bucharest. Greedy search expands the node that appears to be closest to the goal.
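A minimal sketch of greedy search, in which the fringe is ordered by h(n) alone. The graph and heuristic values used to check it are made-up toy data (not the Bucharest map), and the function name is an assumption.

```python
import heapq

def greedy_best_first(graph, h, start, goal):
    """Expand the node with the smallest heuristic value h[n] first."""
    fringe = [(h[start], start, [start])]  # entries are (h, node, path)
    visited = set()
    while fringe:
        _, node, path = heapq.heappop(fringe)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt in graph.get(node, ()):
            if nxt not in visited:
                heapq.heappush(fringe, (h[nxt], nxt, path + [nxt]))
    return None
```

Because path cost is ignored entirely, the path returned is not guaranteed to be the cheapest one.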
Informed search strategies, also known as heuristic search, use information about the domain to (try to) (usually) head in the general direction of the goal node(s). Informed search methods include hill climbing, best-first, greedy search, beam search, A, and A*.
A* Algorithm
The A* algorithm combines features of uniform-cost search and pure heuristic search to efficiently compute optimal solutions. A* is a best-first search algorithm in which the cost associated with a node is f(n) = g(n) + h(n), where g(n) is the cost of the path from the initial state to node n and h(n) is the heuristic estimate of the cost of a path from node n to a goal. Thus, f(n) estimates the lowest total cost of any solution path going through node n. At each point a node with the lowest f value is chosen for expansion. Ties among nodes of equal f value should be broken in favor of nodes with lower h values. The algorithm terminates when a goal is chosen for expansion.

A* finds an optimal path to a goal if the heuristic function h(n) is admissible, meaning it never overestimates the actual cost. For example, since airline distance never overestimates actual highway distance, and Manhattan distance never overestimates the actual number of moves in the sliding-tile puzzle, A* using these evaluation functions can find optimal solutions to these problems. In addition, A* makes the most efficient use of the given heuristic function in the following sense: among all shortest-path algorithms using the given heuristic function h(n), A* expands the fewest nodes.

The defining characteristics of the A* algorithm are the building of a "closed list" to record areas already evaluated, a "fringe list" to record areas adjacent to those already evaluated, and the calculation of distances traveled from the "start point" with estimated distances to the "goal point". The fringe list, often called the "open list", is a list of all locations immediately adjacent to areas that have already been explored and evaluated (the closed list). The closed list is a record of all locations which have been explored and evaluated by the algorithm.
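The Manhattan-distance heuristic mentioned above for sliding-tile puzzles can be sketched as follows. The state encoding (a flat tuple read row by row, with 0 for the blank) and the function name are assumptions made for this illustration.

```python
def manhattan_distance(state, goal, width=3):
    """Sum over tiles of |row difference| + |column difference|.

    The blank (0) is ignored. Each move shifts one tile by one cell, so the
    sum never exceeds the number of moves still required.
    """
    goal_pos = {tile: divmod(i, width) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        row, col = divmod(i, width)
        goal_row, goal_col = goal_pos[tile]
        total += abs(row - goal_row) + abs(col - goal_col)
    return total
```

A state that equals the goal has distance 0, and a state one move away has distance 1.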
Figure 1. The current location is the yellow square; it is now part of the closed list. The orange squares surrounding the yellow square are the fringe: the possible options that the algorithm can explore next.
Figure 2. As the path progresses, the closed and fringe lists grow. Note that this path cuts corners. If the gray area represents an obstacle, like a wall, this path might be invalid since it passes unhindered through the wall.
Figure 3. When cornering rules are imposed, the path will be better suited to avoiding obstacles.
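The source does not spell out the cornering rule, so the sketch below assumes one common choice: a diagonal step is allowed only when both orthogonally adjacent cells are open. The grid encoding ('#' for a wall) and the function name `neighbors` are likewise assumptions.

```python
def neighbors(grid, cell, allow_corner_cutting=False):
    """Return the open cells adjacent to cell on an 8-connected grid.

    grid is a list of equal-length strings, '#' marks a wall;
    cell is a (row, col) pair.
    """
    rows, cols = len(grid), len(grid[0])
    r, c = cell

    def is_open(rr, cc):
        return 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc] != "#"

    result = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            nr, nc = r + dr, c + dc
            if not is_open(nr, nc):
                continue
            diagonal = dr != 0 and dc != 0
            if diagonal and not allow_corner_cutting:
                # Assumed rule: both cells flanking the diagonal must be open.
                if not (is_open(r + dr, c) and is_open(r, c + dc)):
                    continue
            result.append((nr, nc))
    return result
```

With `allow_corner_cutting=True` the function produces the behavior of Figure 2, where a path may slip diagonally past the corner of a wall.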
The evaluation function used in A* is f(n) = g(n) + h(n), where g(n) represents the cost (distance) of the path from the starting point to any vertex n, and h(n) represents the estimated cost from vertex n to the goal. Euclidean distance (straight-line distance) is a common choice for h(n).
x2 = x coordinate of the goal location
x1 = x coordinate of the current location
y2 = y coordinate of the goal location
y1 = y coordinate of the current location
dx = | x2 - x1 |
dy = | y2 - y1 |
Distance = sqrt(dx^2 + dy^2)

The A* algorithm is fairly simple. There are two sets, FRINGE and CLOSED. The FRINGE set contains those nodes that are candidates for examining. Initially, the FRINGE set contains just one element: the starting position. The CLOSED set contains those nodes that have already been examined. Initially, the CLOSED set is empty. Graphically, the FRINGE set is the "frontier" and the CLOSED set is the "interior" of the visited areas. Each node also keeps a pointer to its parent node so that we can determine how it was found.

There is a main loop that repeatedly pulls out the best node n in FRINGE (the node with the lowest f value) and examines it. If n is the goal, then we're done. Otherwise, node n is removed from FRINGE and added to CLOSED. Then, its neighbors n' are examined. A neighbor that is in CLOSED has already been seen, so we don't need to look at it. A neighbor that is in FRINGE will be examined if its f value becomes the lowest in FRINGE. Otherwise, we add it to FRINGE, with its parent set to n. The path cost to n', g(n'), will be set to g(n) + movementcost(n, n').
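The main loop described above can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: nodes are (x, y) points, the movement cost between neighbors is their Euclidean distance, h(n) is the Euclidean distance to the goal, and the helper `grid_neighbors` (a 4-connected grid with walls) is hypothetical.

```python
import heapq
import math

def euclidean(a, b):
    """Straight-line distance sqrt(dx^2 + dy^2) between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def a_star(neighbors, start, goal):
    """Return the path from start to goal as a list of points, or None."""
    g = {start: 0.0}          # best known path cost to each node
    parent = {start: None}    # parent pointers for path reconstruction
    h0 = euclidean(start, goal)
    fringe = [(h0, h0, start)]  # entries are (f, h, node); lower h breaks ties
    closed = set()
    while fringe:
        _, _, n = heapq.heappop(fringe)  # node with the lowest f value
        if n == goal:
            path = []
            while n is not None:         # follow parent pointers to the start
                path.append(n)
                n = parent[n]
            return path[::-1]
        if n in closed:
            continue                     # stale entry for an examined node
        closed.add(n)
        for m in neighbors(n):
            if m in closed:
                continue
            new_g = g[n] + euclidean(n, m)   # g(n) + movementcost(n, n')
            if new_g < g.get(m, math.inf):
                g[m] = new_g
                parent[m] = n
                h = euclidean(m, goal)
                heapq.heappush(fringe, (new_g + h, h, m))
    return None

def grid_neighbors(width, height, walls):
    """Hypothetical helper: neighbor function for a 4-connected grid."""
    def neighbors(p):
        x, y = p
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            q = (x + dx, y + dy)
            if 0 <= q[0] < width and 0 <= q[1] < height and q not in walls:
                yield q
    return neighbors
```

Rather than updating a neighbor already in FRINGE in place, this sketch pushes a new entry and skips stale ones when they are popped, which is a common way to work with a binary heap.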
Pseudo code:
Inputs
Internal Data
fringe - a list of map locations to be evaluated, in ascending order of estimated distance
closedList - a list of map locations that have been fully evaluated
RouteNode, contains
    a map location
    a pointer to this node's parent node
    d, the actual distance traveled to reach this node
    dPlusL2, which is d + linear distance to goal
Search()
findRoute()
if fringe is empty
    // No route exists between start and goal.
    return 0
else
    node = remove first fringe node (it will have the lowest dPlusL2, i.e. the shortest estimated total distance)
    if node's location is the goal
        return RouteNode data for current location
    else
        if node's location is not on the closedList
            add node to closedList
            addChildrenToFringe(node)
        return findRoute()
addChildrenToFringe(parentNode)
for all children of parentNode
    if child's location is not on closedList
        childNode = new RouteNode()
        childNode.parent = parentNode
        childNode.d = parentNode.d + linearDistance(parentNode, child)
        L2 = linearDistance(childNode, goal)
        childNode.dPlusL2 = childNode.d + L2
        Add childNode to fringe, maintaining ascending dPlusL2 order
Drawback of A* algorithm:
The main drawback of the A* algorithm, and indeed of any best-first search, is its memory requirement. Since at least the entire open list must be saved, A* is severely space-limited in practice, and is no more practical than the best-first search algorithm on current machines. For example, while it can be run successfully on the eight puzzle, it exhausts available memory in a matter of minutes on the fifteen puzzle.
References:
http://en.wikipedia.org/wiki/Graph_(mathematics)
http://mnemstudio.org/path-finding-a-star.htm
http://en.wikipedia.org/wiki/Bidirectional_search
Lecture: Solving Problems by Searching, by Marco Chiarandini, University of Southern Denmark