
UNIT II

SOLVING PROBLEMS BY SEARCHING


Problem - Solving Agents - Example Problems - Searching for Solutions - Uninformed
Search Strategies - Informed (Heuristic) Search Strategies – Heuristic Functions. BEYOND
CLASSICAL SEARCH: Local Search Algorithms and Optimization Problems - Local
Searching Continuous Spaces - Searching with Nondeterministic Actions - Searching with
Partial Observations. ADVERSARIAL SEARCH: Stochastic Games.

The simplest agents were the reflex agents, which base their actions on a direct
mapping from states to actions. Such agents cannot operate well in environments for
which this mapping would be too large to store and would take too long to learn.
Goal-based agents, on the other hand, consider future actions and the desirability
of their outcomes. One kind of goal-based agent is called a problem-solving agent. Problem-solving agents use atomic representations: states of the world are considered as wholes, with no internal structure visible to the problem-solving algorithms. Goal-based agents that use more advanced factored or structured representations are usually called planning agents.
Our discussion of problem solving begins with precise definitions of problems and their solutions. We then examine several general-purpose search algorithms that can be used to solve these problems, starting with uninformed search algorithms—algorithms that are given no information about the problem other than its definition. Although some of these algorithms can solve any solvable problem, none of them can do so efficiently. Informed search algorithms, on the other hand, can do quite well given some guidance on where to look for solutions.

PROBLEM-SOLVING AGENTS
In Artificial Intelligence, search techniques are universal problem-solving methods. Rational agents or problem-solving agents in AI mostly use these search strategies or algorithms to solve a specific problem and provide the best result. Problem-solving agents are goal-based agents that use an atomic representation. The problem-solving agent performs precisely by defining problems and their several solutions.
Problem solving refers to reaching a definite goal from a present state or condition; it is the part of artificial intelligence that encompasses a number of techniques, such as algorithms and heuristics, to solve a problem.
Therefore, a problem-solving agent is a goal-driven agent focused on satisfying the goal.
Steps performed by Problem-solving agent
• Goal formulation: It is the first and simplest step in problem-solving. It organizes the steps/sequence required to formulate one goal out of multiple goals, as well as the actions needed to achieve that goal. Goal formulation is based on the current situation and the agent's performance measure.
• Problem formulation: It is one of the core steps of problem-solving, which decides what actions should be taken to achieve the formulated goal. In AI this core part depends on the software agent, which uses the following components to formulate the associated problem:
• Initial State: It is the starting state or initial step of the agent towards its goal.
• Actions: It is the description of the possible actions available to the agent.
• Transition Model: It describes what each action does.
• Goal Test: It determines if the given state is a goal state.
• Path cost: It assigns a numeric cost to each path towards the goal. The problem-solving agent selects a cost function, which reflects its performance measure. Remember, an optimal solution has the lowest path cost among all the solutions.

The process of looking for a sequence of actions that reaches the goal is called
search. A search algorithm takes a problem as input and returns a solution in the form
of an action sequence. Once a solution is found, the actions it recommends can be carried
out. This is called the execution phase. Thus, we have a simple “formulate, search,
execute” design for the agent.

Search Algorithm Terminologies:


• Search: Searching is a step-by-step procedure to solve a search problem in a given search space. A search problem can have three main factors:
• Search Space: Search space represents the set of possible solutions which a system may have.
• Start State: It is the state from where the agent begins the search.
• Goal test: It is a function which observes the current state and returns whether the goal state is achieved or not.
• Search tree: A tree representation of a search problem is called a search tree. The root of the search tree is the root node, which corresponds to the initial state.

Properties of Search Algorithms:


Following are the four essential properties of search algorithms, used to compare their efficiency:
• Completeness: A search algorithm is said to be complete if it guarantees to return a solution whenever at least one solution exists for the given input.
• Optimality: If the solution found by an algorithm is guaranteed to be the best solution (lowest path cost) among all solutions, then it is said to be an optimal solution.
• Time Complexity: Time complexity is a measure of the time an algorithm takes to complete its task.
• Space Complexity: It is the maximum storage space required at any point during the search, as a function of the complexity of the problem.

Types of search algorithms


Based on the search problem, we can classify search algorithms into uninformed (blind) search and informed (heuristic) search algorithms.
Uninformed search algorithms are also called blind search algorithms. The search algorithm produces the search tree without using any domain knowledge, so it is brute-force in nature; it has no background information on how to approach the goal.
Informed search algorithms have information on the goal state which helps in more efficient searching. This information is obtained by a function that estimates how close a state is to the goal state. The plans to reach the goal state from the start state differ only by the order and length of actions.
Uninformed Search Algorithms
Uninformed search is a class of general-purpose search algorithms which
operates in brute force-way. Uninformed search algorithms do not have additional
information about state or search space other than how to traverse the tree, so it is also
called blind search.

Following are the various types of uninformed search algorithms:


• Breadth-first Search
• Depth-first Search
• Depth-limited Search
• Iterative deepening depth-first search
• Uniform cost search
• Bidirectional Search

1. Breadth-first Search:
Breadth-first search is the most common search strategy for traversing a tree or
graph. This algorithm searches breadthwise in a tree or graph, so it is called breadth-first
search.
The BFS algorithm starts searching from the root node of the tree and expands all successor nodes at the current level before moving to nodes of the next level.
The breadth-first search algorithm is an example of a general-graph search algorithm. Breadth-first search is implemented using a FIFO queue data structure.
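As an illustration (not part of the original notes), a minimal Python sketch of BFS is given below. The graph is supplied through a hypothetical successors(state) function, and states are assumed to be hashable:

```python
from collections import deque

def bfs(start, goal, successors):
    """Breadth-first search; successors(state) yields neighbouring states."""
    frontier = deque([(start, [start])])   # FIFO queue of (state, path) pairs
    explored = {start}
    while frontier:
        state, path = frontier.popleft()   # expand the shallowest node first
        if state == goal:
            return path                    # found at the minimal depth
        for nxt in successors(state):
            if nxt not in explored:
                explored.add(nxt)
                frontier.append((nxt, path + [nxt]))
    return None                            # no solution exists
```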

Advantages:
• BFS will provide a solution if any solution exists.
• If there is more than one solution for a given problem, then BFS will provide the minimal solution, i.e. the one requiring the least number of steps.

Disadvantages:
• It requires lots of memory since each level of the tree must be saved into memory
to expand the next level.
• BFS needs lots of time if the solution is far away from the root node.

Example:
In the below tree structure, we have shown the traversal of the tree using the BFS algorithm from the root node S to the goal node K. The BFS algorithm traverses in layers, so it will follow the path shown by the dotted arrow, and the traversed path will be:
S---> A--->B---->C--->D---->G--->H--->E---->F---->I---->K

Time Complexity:
The time complexity of the BFS algorithm can be obtained by counting the number of nodes traversed in BFS until the shallowest goal node, where d = depth of the shallowest solution and b = branching factor (the number of successors at every state):
T(b) = 1 + b + b^2 + b^3 + ... + b^d = O(b^d)
Space Complexity:
The space complexity of the BFS algorithm is given by the memory size of the frontier, which is O(b^d).
Completeness:
BFS is complete, which means if the shallowest goal node is at some finite depth,
then BFS will find a solution.
Optimality:
BFS is optimal if path cost is a non-decreasing function of the depth of the node.

2. Depth-first Search
Depth-first search is a recursive algorithm for traversing a tree or graph data structure. It is called depth-first search because it starts from the root node and follows each path to its greatest depth node before moving to the next path.
DFS uses a stack data structure for its implementation. The process of the DFS algorithm is otherwise similar to the BFS algorithm.
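A matching sketch (again illustrative, with the same hypothetical successors(state) interface) shows how replacing the FIFO queue with a LIFO stack changes the expansion order:

```python
def dfs(start, goal, successors):
    """Depth-first search using an explicit LIFO stack."""
    frontier = [(start, [start])]          # stack of (state, path) pairs
    explored = set()
    while frontier:
        state, path = frontier.pop()       # expand the deepest node first
        if state == goal:
            return path
        if state in explored:
            continue
        explored.add(state)
        for nxt in successors(state):
            if nxt not in explored:
                frontier.append((nxt, path + [nxt]))
    return None
```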

Advantage:
• DFS requires much less memory, as it only needs to store a stack of the nodes on the path from the root node to the current node.
• It takes less time to reach the goal node than the BFS algorithm (if it traverses the right path).

Disadvantage:
• There is the possibility that many states keep re-occurring, and there is no
guarantee of finding the solution.
• The DFS algorithm goes deep down a single path, and sometimes it may enter an infinite loop.

Example:
In the below search tree, we have shown the flow of depth-first search, and it will
follow the order as:
Root node--->Left node ----> right node.

It will start searching from root node S and traverse A, then B, then D and E; after traversing E, it will backtrack the tree, as E has no other successors and the goal node has still not been found. After backtracking it will traverse node C and then G, where it will terminate, as it has found the goal node.
Completeness:
The DFS algorithm is complete within a finite state space, as it will expand every node within a limited search tree.

Time Complexity:
The time complexity of DFS is equivalent to the number of nodes traversed by the algorithm. It is given by:
T(b) = 1 + b + b^2 + ... + b^m = O(b^m)
where b = branching factor and m = maximum depth of any node, which can be much larger than d (the shallowest solution depth).

Space Complexity:
The DFS algorithm needs to store only a single path from the root node, hence the space complexity of DFS is equivalent to the size of the fringe set, which is O(b × m).

Optimal:
The DFS algorithm is non-optimal, as it may take a large number of steps or incur a high cost to reach the goal node.

3. Depth-Limited Search Algorithm:


A depth-limited search algorithm is similar to depth-first search with a predetermined depth limit. Depth-limited search can solve the drawback of infinite paths in depth-first search. In this algorithm, a node at the depth limit is treated as if it has no further successor nodes.
Depth-limited search can be terminated with two conditions of failure (both distinguished in the sketch below):
• Standard failure value: It indicates that the problem does not have any solution.
• Cutoff failure value: It indicates that there is no solution for the problem within the given depth limit.
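An illustrative recursive sketch (tree-search version, no cycle check; the successors(state) interface is assumed as before), returning None for standard failure and the string "cutoff" for cutoff failure:

```python
def dls(state, goal, successors, limit):
    """Depth-limited search: path on success, 'cutoff' or None on failure."""
    if state == goal:
        return [state]
    if limit == 0:
        return "cutoff"                    # depth limit reached on this path
    cutoff_occurred = False
    for nxt in successors(state):
        result = dls(nxt, goal, successors, limit - 1)
        if result == "cutoff":
            cutoff_occurred = True
        elif result is not None:
            return [state] + result        # prepend current state to the path
    return "cutoff" if cutoff_occurred else None   # None = standard failure
```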

Advantages:
Depth-limited search is memory efficient.

Disadvantages:
Depth-limited search has the disadvantage of incompleteness.
It may not be optimal if the problem has more than one solution.

Completeness:
The DLS algorithm is complete if the solution is within the depth limit.

Time Complexity:
The time complexity of the DLS algorithm is O(b^ℓ), where ℓ is the depth limit.

Space Complexity:
The space complexity of the DLS algorithm is O(b × ℓ).

Optimal:
Depth-limited search can be viewed as a special case of DFS, and it is also not optimal, even if ℓ > d.

4. Uniform-cost Search Algorithm:


Uniform-cost search is a searching algorithm used for traversing a weighted tree or
graph. This algorithm comes into play when a different cost is available for each edge. The
primary goal of the uniform-cost search is to find a path to the goal node which has the
lowest cumulative cost.
Uniform-cost search expands nodes according to their path costs from the root node. It can be used to solve any graph/tree where an optimal cost is in demand. The uniform-cost search algorithm is implemented using a priority queue, which gives maximum priority to the lowest cumulative cost. Uniform-cost search is equivalent to the BFS algorithm when the path cost of all edges is the same.
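A sketch of UCS (illustrative only), assuming a hypothetical successors(state) function that yields (next_state, step_cost) pairs and states that are comparable (e.g. strings) so ties in the priority queue can be broken:

```python
import heapq

def uniform_cost_search(start, goal, successors):
    """Expand the cheapest frontier node first; returns (path, cost)."""
    frontier = [(0, start, [start])]       # priority queue ordered by path cost g
    best_g = {start: 0}                    # cheapest known cost per state
    while frontier:
        g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path, g                 # first goal pop is the optimal path
        for nxt, cost in successors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g, nxt, path + [nxt]))
    return None, float("inf")
```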

Advantages:
Uniform cost search is optimal because at every state the path with the least cost is
chosen.

Disadvantages:
It does not care about the number of steps involved in searching and is only concerned with path cost, due to which this algorithm may get stuck in an infinite loop (for example, when zero-cost actions exist).

Completeness:
Uniform-cost search is complete: if there is a solution, UCS will find it (provided every step cost exceeds some small positive constant ε).

Time Complexity:
Let C* be the cost of the optimal solution, and let ε be the minimum cost of a single step. Then the number of steps is at most C*/ε + 1; we add 1 because we start from state 0 and end at C*/ε.
Hence, the worst-case time complexity of uniform-cost search is O(b^(1 + ⌊C*/ε⌋)).
Space Complexity:
By the same logic, the worst-case space complexity of uniform-cost search is O(b^(1 + ⌊C*/ε⌋)).

Optimal:
Uniform-cost search is always optimal as it only selects a path with the lowest path
cost.

5. Iterative Deepening Depth-first Search:


The iterative deepening algorithm is a combination of DFS and BFS algorithms. This
search algorithm finds out the best depth limit and does it by gradually increasing the
limit until a goal is found.
This algorithm performs depth-first search up to a certain "depth limit", and it keeps
increasing the depth limit after each iteration until the goal node is found.
This Search algorithm combines the benefits of Breadth-first search's fast search and
depth-first search's memory efficiency.
The iterative deepening search algorithm is a useful uninformed search strategy when the search space is large and the depth of the goal node is unknown.
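Reusing the dls sketch from the depth-limited section (again illustrative, with a hypothetical max_depth safety bound), iterative deepening can be written as:

```python
def iddfs(start, goal, successors, max_depth=50):
    """Repeat depth-limited search with a gradually increasing limit."""
    for limit in range(max_depth + 1):
        result = dls(start, goal, successors, limit)   # dls sketched earlier
        if result is None:
            return None                    # whole tree exhausted: no solution
        if result != "cutoff":
            return result                  # found at the shallowest depth
    return None
```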

Advantages:
It combines the benefits of BFS and DFS search algorithm in terms of fast search and
memory efficiency.

Disadvantages:
The main drawback of IDDFS is that it repeats all the work of the previous phase.

Example:
The following tree structure shows the iterative deepening depth-first search. The IDDFS algorithm performs several iterations until it finds the goal node. The iterations performed by the algorithm are given as:
1'st Iteration-----> A
2'nd Iteration----> A, B, C
3'rd Iteration------>A, B, D, E, C, F, G
4'th Iteration------>A, B, D, H, I, E, C, F, K, G
In the fourth iteration, the algorithm will find the goal node.

Completeness:
This algorithm is complete if the branching factor is finite.

Time Complexity:
Suppose b is the branching factor and d is the depth of the shallowest goal; then the worst-case time complexity is O(b^d).

Space Complexity:
The space complexity of IDDFS is O(b × d).

Optimal:
The IDDFS algorithm is optimal if the path cost is a non-decreasing function of the depth of the node.

6. Bidirectional Search Algorithm:


The bidirectional search algorithm runs two simultaneous searches, one from the initial state, called forward search, and the other from the goal node, called backward search, to find the goal node.
Bidirectional search replaces one single search graph with two small subgraphs, in which one starts the search from the initial vertex and the other starts from the goal vertex. The search stops when these two graphs intersect each other.
Bidirectional search can use search techniques such as BFS, DFS, DLS, etc.
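A rough bidirectional-BFS sketch (illustrative; it assumes an undirected graph supplied through a hypothetical neighbours(state) function, so the backward search can follow edges in reverse):

```python
from collections import deque

def bidirectional_search(start, goal, neighbours):
    """Two BFS frontiers, one from each end; stop where they meet."""
    if start == goal:
        return [start]
    front = {start: [start]}               # state -> path start..state
    back = {goal: [goal]}                  # state -> path goal..state
    qf, qb = deque([start]), deque([goal])
    while qf and qb:
        state = qf.popleft()               # one step of the forward search
        for nxt in neighbours(state):
            if nxt in back:                # frontiers intersect: stitch paths
                return front[state] + back[nxt][::-1]
            if nxt not in front:
                front[nxt] = front[state] + [nxt]
                qf.append(nxt)
        state = qb.popleft()               # one step of the backward search
        for nxt in neighbours(state):
            if nxt in front:
                return front[nxt] + back[state][::-1]
            if nxt not in back:
                back[nxt] = back[state] + [nxt]
                qb.append(nxt)
    return None
```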

Advantages:
Bidirectional search is fast.
Bidirectional search requires less memory

Disadvantages:
Implementation of the bidirectional search tree is difficult.
In bidirectional search, one should know the goal state in advance.

Example:
In the below search tree, the bidirectional search algorithm is applied. This algorithm divides one graph/tree into two sub-graphs.

It starts traversing from node 1 in the forward direction and from goal node 16 in the backward direction.
The algorithm terminates at node 9, where the two searches meet.

Completeness:
Bidirectional search is complete if we use BFS in both searches.
Time Complexity:
The time complexity of bidirectional search using BFS is O(b^(d/2)), since each search only needs to reach half the solution depth.

Space Complexity:
The space complexity of bidirectional search is O(b^(d/2)).

Optimal:
Bidirectional search is optimal.

Informed Search Algorithms


So far, we have talked about uninformed search algorithms, which looked through the search space for all possible solutions to the problem without having any additional knowledge about the search space. An informed search algorithm, in contrast, uses additional knowledge such as how far we are from the goal, the path cost, how to reach the goal node, etc. This knowledge helps agents explore less of the search space and find the goal node more efficiently.
The informed search algorithm is more useful for large search spaces. An informed search algorithm uses the idea of a heuristic, so it is also called heuristic search.
Heuristic function: A heuristic is a function used in informed search that finds the most promising path. It takes the current state of the agent as its input and produces an estimate of how close the agent is to the goal. The heuristic method might not always give the best solution, but it is guaranteed to find a good solution in a reasonable time. The heuristic function estimates how close a state is to the goal. It is represented by h(n), and it estimates the cost of an optimal path between the pair of states. The value of the heuristic function is always non-negative.

The admissibility condition on the heuristic function is given as:

h(n) <= h*(n)

Here h(n) is the heuristic (estimated) cost, and h*(n) is the actual cost of an optimal path from n to the goal. Hence, the heuristic cost should be less than or equal to the actual cost.
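As a concrete illustration (not from the notes): on a 4-connected grid with unit step costs, the Manhattan distance never overestimates the true cost, so it is admissible:

```python
def manhattan(a, b):
    """Admissible heuristic for 4-connected grid movement with unit costs."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# h(n) <= h*(n): with no obstacles the true cost from (0, 0) to (3, 2) is
# exactly 5, and obstacles can only raise the true cost h*, never lower it.
assert manhattan((0, 0), (3, 2)) == 5
```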

Pure Heuristic Search:


Pure heuristic search is the simplest form of heuristic search algorithm. It expands nodes based on their heuristic value h(n). It maintains two lists, an OPEN and a CLOSED list. In the CLOSED list, it places those nodes which have already been expanded, and in the OPEN list, it places nodes which have not yet been expanded.
On each iteration, the node n with the lowest heuristic value is expanded; all its successors are generated, and n is placed in the CLOSED list. The algorithm continues until a goal state is found.

In the informed search we will discuss two main algorithms which are given below:
• Best First Search Algorithm (Greedy search)
• A* Search Algorithm

1.) Best-first Search Algorithm (Greedy Search):


The greedy best-first search algorithm always selects the path which appears best at that moment. It is a combination of depth-first search and breadth-first search algorithms. It uses the heuristic function to guide the search.
Best-first search allows us to take the advantages of both algorithms. With the help of best-first search, at each step, we can choose the most promising node. In the greedy best-first search algorithm, we expand the node which is closest to the goal node, where the closeness is estimated by the heuristic function, i.e.
f(n) = h(n)
where h(n) = estimated cost from node n to the goal. Greedy best-first search is implemented using a priority queue.

Best first search algorithm:


Step 1: Place the starting node into the OPEN list.
Step 2: If the OPEN list is empty, stop and return failure.
Step 3: Remove the node n from the OPEN list which has the lowest value of h(n), and place it in the CLOSED list.
Step 4: Expand the node n, and generate the successors of node n.
Step 5: Check each successor of node n, and find whether any node is a goal node or not. If any successor node is the goal node, then return success and terminate the search; else proceed to Step 6.
Step 6: For each successor node, the algorithm checks its evaluation function f(n), and then checks whether the node is already in the OPEN or CLOSED list. If the node is in neither list, then add it to the OPEN list.
Step 7: Return to Step 2.
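A minimal sketch of these steps (illustrative; h and successors(state) are hypothetical, and for brevity the goal test is applied when a node is expanded rather than when its successors are generated):

```python
import heapq

def greedy_best_first(start, goal, successors, h):
    """Always expand the open node with the smallest heuristic value h(n)."""
    open_list = [(h(start), start, [start])]   # priority queue ordered by h only
    closed = set()
    while open_list:
        _, state, path = heapq.heappop(open_list)
        if state == goal:
            return path
        if state in closed:
            continue
        closed.add(state)
        for nxt in successors(state):
            if nxt not in closed:
                heapq.heappush(open_list, (h(nxt), nxt, path + [nxt]))
    return None
```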
Advantages:
• Best-first search can switch between BFS and DFS, gaining the advantages of both algorithms.
• This algorithm is more efficient than the BFS and DFS algorithms.

Disadvantages:
• It can behave as an unguided depth-first search in the worst-case scenario.
• It can get stuck in a loop as DFS.
• This algorithm is not optimal.

Example:
Consider the below search problem, which we will traverse using greedy best-first search. At each iteration, each node is expanded using the evaluation function f(n) = h(n), which is given in the below table.

In this search example, we are using two lists which are OPEN and CLOSED Lists.
Following are the iteration for traversing the above example.
Expand the nodes of S and put them in the CLOSED list
Initialization: Open [A, B], Closed [S]

Iteration 1: Open [A], Closed [S, B]

Iteration 2: Open [E, F, A], Closed [S, B]


Open [E, A], Closed [S, B, F]

Iteration 3: Open [I, G, E, A], Closed [S, B, F]


Open [I, E, A], Closed [S, B, F, G]

Hence the final solution path will be: S----> B----->F----> G

Time Complexity:
The worst-case time complexity of greedy best-first search is O(b^m).

Space Complexity:
The worst-case space complexity of greedy best-first search is O(b^m), where m is the maximum depth of the search space.

Complete:
Greedy best-first search is incomplete, even if the given state space is finite.
Optimal:
Greedy best first search algorithm is not optimal.

A* Search Algorithm:
A* search is the most commonly known form of best-first search. It uses the heuristic function h(n) and the cost to reach the node n from the start state, g(n). It combines the features of UCS and greedy best-first search, by which it solves the problem efficiently. The A* search algorithm finds the shortest path through the search space using the heuristic function.
This search algorithm expands fewer nodes of the search tree and provides optimal results faster. The A* algorithm is similar to UCS except that it uses g(n) + h(n) instead of g(n). In the A* search algorithm, we use the search heuristic as well as the cost to reach the node. Hence, we can combine both costs as follows, and this sum is called the fitness number:
f(n) = g(n) + h(n)

Algorithm of A* search:
Step 1: Place the starting node in the OPEN list.
Step 2: Check if the OPEN list is empty or not; if the list is empty, then return failure and stop.
Step 3: Select the node from the OPEN list which has the smallest value of the evaluation function (g + h); if node n is the goal node, then return success and stop, otherwise:
Step 4: Expand node n and generate all of its successors, and put n into the CLOSED list. For each successor n', check whether n' is already in the OPEN or CLOSED list; if not, then compute the evaluation function for n' and place it into the OPEN list.
Step 5: Else, if node n' is already in OPEN or CLOSED, then re-attach it to the back pointer which reflects the lowest g(n') value.
Step 6: Return to Step 2.
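A compact A* sketch under the same assumed interfaces, where successors(state) yields (next_state, step_cost) pairs; instead of explicit back pointers it simply keeps the best known g(n') per state:

```python
import heapq

def a_star(start, goal, successors, h):
    """Expand the open node with the smallest f = g + h; returns (path, cost)."""
    open_list = [(h(start), 0, start, [start])]    # ordered by f, then g
    best_g = {start: 0}
    while open_list:
        f, g, state, path = heapq.heappop(open_list)
        if state == goal:
            return path, g
        for nxt, cost in successors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):  # keep the lowest g(n')
                best_g[nxt] = new_g
                heapq.heappush(open_list,
                               (new_g + h(nxt), new_g, nxt, path + [nxt]))
    return None, float("inf")
```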

Advantages:
• The A* search algorithm performs better than the other search algorithms discussed here.
• The A* search algorithm is optimal and complete.
• This algorithm can solve very complex problems.

Disadvantages:
• It does not always produce the shortest path as it is mostly based on heuristics
and approximation.
• A* search algorithm has some complexity issues.
• The main drawback of A* is memory requirement as it keeps all generated nodes
in the memory, so it is not practical for various large-scale problems.

Example:
In this example, we will traverse the given graph using the A* algorithm. The heuristic value of all states is given in the below table, so we will calculate the f(n) of each state using the formula f(n) = g(n) + h(n), where g(n) is the cost to reach the node from the start state. Here we will use the OPEN and CLOSED lists.

Solution:

Initialization: {(S, 5)}

Iteration 1: {(S--> A, 4), (S-->G, 10)}


Iteration 2: {(S--> A-->C, 4), (S--> A-->B, 7), (S-->G, 10)}

Iteration 3: {(S--> A-->C--->G, 6), (S--> A-->C--->D, 11), (S--> A-->B, 7), (S-->G, 10)}

Iteration 4: Will give the final result, as S--->A--->C--->G it provides the optimal path
with cost 6.

Points to remember:
• The A* algorithm returns the path which occurred first, and it does not search for all remaining paths.
• The efficiency of the A* algorithm depends on the quality of the heuristic.
• The A* algorithm expands all nodes which satisfy the condition f(n) < C*, where C* is the cost of the optimal solution.

Complete:
• The A* algorithm is complete as long as:
• The branching factor is finite.
• Every action has a fixed, positive cost.

Optimal:
The A* search algorithm is optimal if it satisfies the below two conditions:

Admissible:
The first condition required for optimality is that h(n) should be an admissible heuristic for A* tree search. An admissible heuristic is optimistic in nature: it never overestimates the true cost.

Consistency:
The second condition, required only for A* graph search, is consistency.
If the heuristic function is admissible, then A* tree search will always find the least-cost path.

Time Complexity:
The time complexity of the A* search algorithm depends on the heuristic function; the number of nodes expanded is exponential in the depth of the solution d. So the time complexity is O(b^d), where b is the branching factor.
Space Complexity:
The space complexity of the A* search algorithm is O(b^d).

Local Search Algorithms and Optimization Problem


The informed and uninformed searches expand the nodes systematically in two ways:
• keeping different paths in memory, and
• selecting the best suitable path,
which leads to a solution state required to reach the goal node. But beyond these "classical search algorithms," we have some "local search algorithms," where the path cost does not matter; the focus is only on the solution state needed to reach the goal node.
A local search algorithm completes its task by traversing a single current node rather than multiple paths, generally following the neighbours of that node.
Although local search algorithms are not systematic, they still have the following two advantages:
• Local search algorithms use very little or a constant amount of memory, as they operate only on a single path.
• Most often, they find a reasonable solution in large or infinite state spaces where classical or systematic algorithms do not work.
Does the local search algorithm work for a pure optimization problem?
Yes, the local search algorithm works for pure optimization problems. A pure optimization problem is one where all the nodes can give a solution, but the target is to find the best state of all according to the objective function; there is no goal test or path cost as such, and the quality of a state is judged only by the objective function.
Note: An objective function is a function whose value is either minimized or maximized in different contexts of optimization problems. In the case of search algorithms, an objective function can be the path cost for reaching the goal node, etc.

Working on a Local search algorithm


Let's understand the working of a local search algorithm with the help of an example. Consider the below state-space landscape having both:
• Location: It is defined by the state.
• Elevation: It is defined by the value of the objective function or heuristic cost function.
The local search algorithm explores the above landscape by finding the following two points:
• Global Minimum: If the elevation corresponds to a cost, then the task is to find the lowest valley, which is known as the global minimum.
• Global Maximum: If the elevation corresponds to an objective function, then the task is to find the highest peak, which is called the global maximum. It is the highest point in the landscape.
We will understand the working of these points better in the hill-climbing search.
Below are some different types of local searches:
• Hill-climbing search
• Simulated annealing
• Local beam search
We will discuss the above searches in the next section.
Note: Local search algorithms do not bother to remember all the nodes in memory; they operate on a complete-state formulation.

Hill Climbing Algorithm in AI


Hill Climbing Algorithm: Hill climbing search is a local search technique. The purpose of the hill climbing search is to climb a hill and reach the topmost peak/point of that hill. It is based on the heuristic search technique, where the person climbing up the hill estimates the direction which will lead him to the highest peak.

State-space Landscape of Hill climbing algorithm
To understand the concept of the hill-climbing algorithm, consider the below landscape representing the goal state/peak and the current state of the climber. The topographical regions shown in the figure can be defined as:
• Global Maximum: It is the highest point on the hill, which is the goal state.
• Local Maximum: It is a peak that is higher than each of its neighbouring states but lower than the global maximum.
• Flat local maximum: It is a flat area of the hill where all neighbouring states have the same value, with no uphill or downhill. It is a saturated point of the hill.
• Shoulder: It is also a flat area, but one from which the summit is still reachable.
• Current state: It is the current position of the person.

Applications of Hill Climbing Technique


Hill Climbing technique can be used to solve many problems, where the current
state allows for an accurate evaluation function, such as
• Network-Flow
• Travelling Salesman problem
• 8-Queens problem
• Integrated Circuit design

Types of Hills climbing search algorithm


There are the following types of hill-climbing search:
• Simple hill climbing
• Steepest-ascent hill climbing
• Stochastic hill climbing
• Random-restart hill climbing

Simple hill-climbing search:


Simple hill climbing is the simplest technique to climb a hill. The task is to reach the highest peak of the mountain. Here, the movement of the climber depends on his moves/steps. If he finds his next step better than the previous one, he continues to move; otherwise he remains in the same state. This search focuses only on his previous and next step.

Simple hill-climbing Algorithm


Create a CURRENT node, a NEIGHBOUR node, and a GOAL node.
If the CURRENT node = GOAL node, return GOAL and terminate the search.
Else, if the NEIGHBOUR node is better than the CURRENT node, move to the NEIGHBOUR node.
Loop until the goal is reached or no better point can be found (see the sketch below).
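An illustrative sketch of simple hill climbing, assuming hypothetical neighbours(state) and value(state) functions and finite neighbour lists; the climb takes the first improving step it finds and stops when none exists:

```python
def simple_hill_climbing(current, neighbours, value):
    """Move to the first neighbour that improves the objective value."""
    while True:
        improved = False
        for nxt in neighbours(current):
            if value(nxt) > value(current):   # first better step found
                current = nxt
                improved = True
                break
        if not improved:                      # no uphill move: local maximum
            return current
```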

Steepest-ascent hill climbing


Steepest-ascent hill climbing is different from simple hill climbing. Unlike simple hill climbing, it considers all the successor nodes, compares them, and chooses the node which is closest to the solution. Steepest-ascent hill climbing is similar to best-first search in that it evaluates every successor node rather than just one.

Note: Both simple and steepest-ascent hill-climbing searches fail when there is no closer node.

Steepest-ascent hill-climbing algorithm


Create a CURRENT node and a GOAL node.
If the CURRENT node = GOAL node, return GOAL and terminate the search.
Loop while a better successor node can be found to reach the solution.
If there is any better successor node present, move to it.
When the GOAL is attained, return GOAL and terminate.
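The steepest-ascent variant of the earlier sketch differs only in taking the best of all neighbours rather than the first improvement (same hypothetical neighbours and value functions):

```python
def steepest_ascent_hill_climbing(current, neighbours, value):
    """Move to the best of all neighbours, not just the first improvement."""
    while True:
        best = max(neighbours(current), key=value, default=None)
        if best is None or value(best) <= value(current):
            return current                    # no neighbour is better: stop
        current = best
```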

Stochastic hill climbing


Stochastic hill climbing does not examine all the neighbouring nodes. It selects one node at random and decides whether to move to it or to examine another.

Random-restart hill climbing


The random-restart algorithm is based on a try-and-try strategy. It conducts a series of hill-climbing searches from randomly chosen starting points, keeping the best result, until the goal is found. The success depends most commonly on the shape of the hill: if there are few plateaus, local maxima, and ridges, it becomes easy to reach the destination.

Limitations of Hill climbing algorithm


The hill climbing algorithm is a fast and furious approach. It finds the solution state rapidly because it is quite easy to improve a bad state. But this search has the following limitations:

Local Maxima:
It is a peak of the mountain that is higher than all its neighbouring states but lower than the global maximum. It is not the goal peak because there is another peak higher than it.

Plateau:
It is a flat surface area where no uphill exists. It becomes difficult for the climber to
decide in which direction he should move to reach the goal point. Sometimes, the person
gets lost in the flat area.
Ridges:
It is a challenging situation in which the person commonly finds two or more local maxima of the same height. It becomes difficult for the person to navigate to the right point, and he may get stuck on the ridge itself.


Simulated Annealing
Simulated annealing is similar to the hill-climbing algorithm. It works on the current state. It picks a random move instead of picking the best move. If the move leads to an improvement of the current situation, it is always accepted as a step towards the solution state; otherwise the move is accepted with a probability less than 1. This search technique was first used in 1980 to solve VLSI layout problems. It is also applied to factory scheduling and other large optimization tasks.
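An illustrative sketch with an assumed geometric cooling schedule (the constants are made up, not prescribed by the notes); a worsening move of size delta is accepted with probability exp(delta / T), which shrinks as the temperature T cools:

```python
import math
import random

def simulated_annealing(current, neighbours, value,
                        t0=1.0, cooling=0.995, t_min=1e-3):
    """Hill climbing that sometimes accepts worse moves, cooling over time."""
    t = t0
    while t > t_min:
        nxt = random.choice(neighbours(current))   # pick a random move
        delta = value(nxt) - value(current)
        if delta > 0 or random.random() < math.exp(delta / t):
            current = nxt                          # always accept improvements
        t *= cooling                               # lower the temperature
    return current
```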
Local Beam Search
Local beam search is quite different from random-restart search. It keeps track of k states instead of just one. It begins with k randomly generated states and expands them all at each step. If any state is a goal state, the search stops with success; otherwise it selects the best k successors from the complete list and repeats the same process. In random-restart search each search process runs independently, but in local beam search, the necessary information is shared between the parallel search processes.
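A sketch of this process (illustrative interfaces as before); pooling all successors before picking the best k is exactly where information is shared between the parallel searches:

```python
import heapq

def local_beam_search(initial_states, neighbours, value, k, steps=100):
    """Keep the k best states among the pooled successors of the current beam."""
    beam = list(initial_states)
    for _ in range(steps):
        pool = [s for state in beam for s in neighbours(state)]
        if not pool:
            break
        beam = heapq.nlargest(k, pool, key=value)  # best k across all searches
    return max(beam, key=value)
```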

Disadvantages of Local Beam search


• This search can suffer from a lack of diversity among the k states.
• It is an expensive version of a hill-climbing search.

Note: A variant of Local Beam Search is Stochastic Beam Search which selects k
successors at random rather than choosing the best k successors.

Genetic Algorithm:
Genetic Algorithm (GA) is a search-based optimization technique based on the
principles of Genetics and Natural Selection. It is frequently used to find optimal or near-
optimal solutions to difficult problems which otherwise would take a lifetime to solve. It is
frequently used to solve optimization problems, in research, and in machine learning.

Introduction to Optimization
Optimization is the process of making something better. In any process, we have a
set of inputs and a set of outputs as shown in the following figure.

Optimization refers to finding the values of inputs in such a way that we get the
“best” output values. The definition of “best” varies from problem to problem, but in
mathematical terms, it refers to maximizing or minimizing one or more objective functions,
by varying the input parameters.
The set of all possible solutions or values which the inputs can take makes up the search space. In this search space lies a point or a set of points which gives the optimal solution. The aim of optimization is to find that point or set of points in the search space.

What are Genetic Algorithms?


Nature has always been a great source of inspiration to all mankind. Genetic
Algorithms (GAs) are search based algorithms based on the concepts of natural selection
and genetics. GAs are a subset of a much larger branch of computation known
as Evolutionary Computation.
GAs were developed by John Holland and his students and colleagues at the University of Michigan, most notably David E. Goldberg, and have since been tried on various optimization problems with a high degree of success.
In GAs, we have a pool or a population of possible solutions to the given problem.
These solutions then undergo recombination and mutation (like in natural genetics),
producing new children, and the process is repeated over various generations. Each
individual (or candidate solution) is assigned a fitness value (based on its objective
function value) and the fitter individuals are given a higher chance to mate and yield more
“fitter” individuals. This is in line with the Darwinian Theory of “Survival of the Fittest”.
In this way we keep “evolving” better individuals or solutions over generations, till
we reach a stopping criterion.
Genetic Algorithms are sufficiently randomized in nature, but they perform much
better than random local search (in which we just try various random solutions, keeping
track of the best so far), as they exploit historical information as well.

Advantages of GAs
GAs have various advantages which have made them immensely popular. These
include −
• Does not require any derivative information (which may not be available for
many real-world problems).
• Is faster and more efficient as compared to the traditional methods.
• Has very good parallel capabilities.
• Optimizes both continuous and discrete functions and also multi-objective
problems.
• Provides a list of “good” solutions and not just a single solution.
• Always gets an answer to the problem, which gets better over time.
• Useful when the search space is very large and there are a large number of
parameters involved.

Limitations of GAs
Like any technique, GAs also suffer from a few limitations. These include −
• GAs are not suited for all problems, especially problems which are simple and
for which derivative information is available.
• Fitness value is calculated repeatedly which might be computationally expensive
for some problems.
• Being stochastic, there are no guarantees on the optimality or the quality of the
solution.
• If not implemented properly, the GA may not converge to the optimal solution.
GA – Motivation
Genetic Algorithms have the ability to deliver a “good-enough” solution “fast-
enough”. This makes genetic algorithms attractive for use in solving optimization
problems. The reasons why GAs are needed are as follows −

Solving Difficult Problems


In computer science, there is a large set of problems, which are NP-Hard. What this
essentially means is that, even the most powerful computing systems take a very long time
(even years!) to solve that problem. In such a scenario, GAs prove to be an efficient tool to
provide usable near-optimal solutions in a short amount of time.

Failure of Gradient Based Methods


Traditional calculus-based methods work by starting at a random point and moving in the direction of the gradient, till we reach the top of the hill. This technique is efficient and works very well for single-peaked objective functions like the cost function in linear regression. But in most real-world situations we have very complex search landscapes, made of many peaks and many valleys, which cause such methods to fail, as they suffer from an inherent tendency of getting stuck at local optima, as shown in the following figure.

Getting a Good Solution Fast


Some difficult problems like the Travelling Salesperson Problem (TSP), have real-
world applications like path finding and VLSI Design. Now imagine that you are using your
GPS Navigation system, and it takes a few minutes (or even a few hours) to compute the
“optimal” path from the source to destination. Delay in such real world applications is not
acceptable and therefore a “good-enough” solution, which is delivered “fast” is what is
required.
This section introduces the basic terminology required to understand GAs. Also, a
generic structure of GAs is presented in both pseudo-code and graphical forms. The reader
is advised to properly understand all the concepts introduced in this section and keep
them in mind when reading other sections of this tutorial as well.

Basic Structure
The basic structure of a GA is as follows −
We start with an initial population (which may be generated at random or seeded by other heuristics) and select parents from this population for mating. We apply crossover and mutation operators on the parents to generate new offspring. Finally, these offspring replace the existing individuals in the population and the process repeats. In this way genetic algorithms actually try to mimic natural evolution to some extent (see the sketch below).
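The following sketch (illustrative only; it assumes binary chromosomes, an even population size, and a fitness function that is non-negative and positive for at least some individuals) strings the phases together as a generational GA with roulette-wheel selection, single-point crossover, and bit-flip mutation:

```python
import random

def genetic_algorithm(pop_size, chrom_len, fitness, generations=200,
                      crossover_rate=0.9, mutation_rate=0.01):
    """Generational GA over binary chromosomes (lists of 0/1 genes)."""
    population = [[random.randint(0, 1) for _ in range(chrom_len)]
                  for _ in range(pop_size)]        # random initial population
    for _ in range(generations):
        # fitness-proportionate ("roulette wheel") parent selection
        weights = [fitness(ind) for ind in population]
        parents = random.choices(population, weights=weights, k=pop_size)
        next_gen = []
        for a, b in zip(parents[::2], parents[1::2]):
            if random.random() < crossover_rate:   # single-point crossover
                point = random.randint(1, chrom_len - 1)
                a, b = a[:point] + b[point:], b[:point] + a[point:]
            # bit-flip mutation with a low per-gene probability
            next_gen += [[g ^ 1 if random.random() < mutation_rate else g
                          for g in ind] for ind in (a, b)]
        population = next_gen
    return max(population, key=fitness)
```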

Each of the following steps are covered as a separate chapter later in this tutorial.

Notion of Natural Selection


The process of natural selection starts with the selection of fittest individuals from a
population. They produce offspring which inherit the characteristics of the parents and
will be added to the next generation. If parents have better fitness, their offspring will be
better than parents and have a better chance at surviving. This process keeps on iterating
and at the end, a generation with the fittest individuals will be found.
This notion can be applied for a search problem. We consider a set of solutions for a
problem and select the set of best ones out of them.
Five phases are considered in a genetic algorithm.
• Initial population
• Fitness function
• Selection
• Crossover
• Mutation

Initial Population
The process begins with a set of individuals which is called a Population. Each
individual is a solution to the problem you want to solve.
An individual is characterized by a set of parameters (variables) known as Genes.
Genes are joined into a string to form a Chromosome (solution).
In a genetic algorithm, the set of genes of an individual is represented using a string,
in terms of an alphabet. Usually, binary values are used (string of 1s and 0s). We say that
we encode the genes in a chromosome.
Population is a subset of solutions in the current generation. It can also be defined as a set of chromosomes. There are several things to be kept in mind when dealing with a GA population −
The diversity of the population should be maintained, otherwise it might lead to premature convergence.
The population size should not be kept very large as it can cause the GA to slow down, while a smaller population might not be enough for a good mating pool. Therefore, an optimal population size needs to be decided by trial and error.
The population is usually defined as a two-dimensional array of size population_size × chromosome_size.

Population Initialization
There are two primary methods to initialize a population in a GA. They are −
Random Initialization − Populate the initial population with completely random
solutions.
Heuristic initialization − Populate the initial population using a known heuristic
for the problem.
It has been observed that the entire population should not be initialized using a heuristic, as it can result in the population having similar solutions and very little diversity. It has been experimentally observed that random solutions are the ones that drive the population to optimality. Therefore, with heuristic initialization, we just seed the population with a couple of good solutions, filling up the rest with random solutions rather than filling the entire population with heuristic-based solutions.
It has also been observed that heuristic initialization, in some cases, only affects the initial fitness of the population; in the end, it is the diversity of the solutions which leads to optimality.

Population Models
There are two population models widely in use −
Steady State
In steady state GA, we generate one or two off-springs in each iteration and they
replace one or two individuals from the population. A steady state GA is also known as
Incremental GA.

Generational
In a generational model, we generate ‘n’ off-springs, where n is the population size,
and the entire population is replaced by the new one at the end of the iteration.

Selection
The idea of selection phase is to select the fittest individuals and let them pass their
genes to the next generation.
Two pairs of individuals (parents) are selected based on their fitness scores.
Individuals with high fitness have more chance to be selected for reproduction.

Crossover
Crossover is the most significant phase in a genetic algorithm. For each pair of
parents to be mated, a crossover point is chosen at random from within the genes.
For example, consider the crossover point to be 3 as shown below.
Offspring:
Offspring are created by exchanging the genes of parents among themselves until the
crossover point is reached.

Exchanging genes among parents


The new offspring are added to the population.

Mutation
In certain new offspring formed, some of their genes can be subjected to
a mutation with a low random probability. This implies that some of the bits in the bit
string can be flipped.

Mutation: Before and After


Mutation occurs to maintain diversity within the population and prevent premature
convergence.
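A worked illustration of these two operators in isolation (the parent values and the 5% mutation probability are made up; the crossover point of 3 matches the example above):

```python
import random

parent1 = [1, 0, 1, 1, 0, 1]
parent2 = [0, 1, 0, 0, 1, 1]
point = 3                                   # crossover point from the example

# single-point crossover: swap the tails after the crossover point
child1 = parent1[:point] + parent2[point:]  # [1, 0, 1, 0, 1, 1]
child2 = parent2[:point] + parent1[point:]  # [0, 1, 0, 1, 0, 1]

# bit-flip mutation: flip each gene independently with a small probability
mutated = [g ^ 1 if random.random() < 0.05 else g for g in child1]
```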

Termination
The algorithm terminates if the population has converged (does not produce offspring
which are significantly different from the previous generation). Then it is said that the
genetic algorithm has provided a set of solutions to our problem.
