AIML - Unit 1 Notes
Unit 1
Introduction to Artificial Intelligence
Definition: Artificial Intelligence is the study of how to make computers do things which, at the moment, people do better.
According to the father of Artificial Intelligence, John McCarthy, it is “The science and engineering of
making intelligent machines, especially intelligent computer programs”.
Problems of AI: Intelligence does not imply perfect understanding; every intelligent being has limited
perception, memory and computation. Many points on the spectrum of intelligence versus cost are viable,
from insects to humans. AI seeks to understand the computations required for intelligent behaviour and
to produce computer systems that exhibit intelligence. Aspects of intelligence studied by AI include
perception, communication using human languages, reasoning, planning, learning and memory.
Applications of AI
AI has applications in all fields of human study, such as finance and economics, environmental
engineering, chemistry, computer science, and so on. Some of the applications of AI are listed below:
Perception
■ Machine vision
■ Speech understanding
■ Touch ( tactile or haptic) sensation
Robotics
Natural Language Processing
■ Natural Language Understanding
■ Speech Understanding
■ Language Generation
■ Machine Translation
Planning
Expert Systems
Machine Learning
Theorem Proving
Symbolic Mathematics
Game Playing
AI Technique:
Artificial Intelligence research during the last three decades has concluded that intelligence requires knowledge. Knowledge, however, possesses some less desirable properties:
A. It is voluminous.
B. It is difficult to characterize accurately.
C. It is constantly changing.
D. It differs from data by being organized in a way that corresponds to the ways it will be used.
E. It is complicated.
An AI technique is a method that exploits knowledge that is represented so that:
1. The knowledge captures generalizations: situations that share important properties are grouped together, rather than being given separate representations.
2. It can be understood by the people who must provide it, even though for many programs the bulk of the data comes automatically from instrument readings.
3. It can easily be modified to correct errors and to reflect changes in real conditions.
4. It can be used in a great many situations even if it is incomplete or inaccurate.
5. It can be used to help overcome its own sheer bulk by helping to narrow the range of possibilities that must usually be considered.
Tic-Tac-Toe
This game is played by two players: one is a human and the other is a computer. The objective is to write a computer program in such a way that the computer wins most of the time.
Assume:
Player 1 – X
Player 2 – O
A player who gets 3 consecutive marks first wins the game. Let us discuss how the board's data structure looks and how the Tic-Tac-Toe algorithm works.
Board Data Structure
Each square is numbered from 1 to 9, as in the following image.
Data Structures: Board and Move Table
Consider a Board as a nine-element vector. Each element contains:
0 for a blank square
1 indicating an X move
2 indicating an O move
A Move Table is a vector of 3^9 = 19,683 elements, each of which is a nine-element vector representing a new board position, as shown below:
When the computer receives the current board position, it moves to the new board position given by the corresponding move-table entry.
ADVANTAGE
The program is very efficient in time.
DISADVANTAGES
It takes a lot of space to store the move table.
A lot of work is required to specify all the entries in the move table correctly.
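The move-table idea above can be sketched in a few lines. This is a minimal illustration, not the full 19,683-entry table: the board is encoded as a base-3 number, which indexes the table; the single sample entry is an assumption for demonstration.

```python
def board_to_index(board):
    """Treat the nine-element board vector (0=blank, 1=X, 2=O) as a base-3 number."""
    index = 0
    for cell in board:
        index = index * 3 + cell
    return index

# Hypothetical move table: maps a board index to the new board position
# the computer should move to. A real table would need 3**9 = 19683 entries;
# only one illustrative entry is shown here.
MOVE_TABLE = {
    board_to_index([0] * 9): [0, 0, 0, 0, 1, 0, 0, 0, 0],  # reply to empty board: X in the centre
}

def computer_move(board):
    """Look up the current position and return the new board position."""
    return MOVE_TABLE[board_to_index(board)]
```

The table lookup is constant-time, which is exactly why the approach is fast but memory-hungry.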
ANOTHER APPROACH FOR Tic-Tac-Toe
Data Structures:
Board position is a structure containing,
PROBLEMS, PROBLEM SPACES AND SEARCH
To solve a problem, you should take the following steps:
1. Define the problem precisely, including detailed specifications of the initial situation and of what constitutes a suitable solution.
2. Analyse the problem carefully, for some features may have a central effect on the choice of solution method.
3. Isolate and represent the background knowledge needed to solve the problem.
4. Choose the best problem-solving technique and apply it to the problem.
Problem definitions
A problem is defined by its ‘elements’ and their ‘relations’. To provide a formal description of a problem, we need to do the following:
a. Define a state space that contains all the possible configurations of the relevant objects (including, perhaps, some impossible ones).
b. Specify one or more states within that space that describe possible situations from which the problem-solving process may start. These states are called initial states.
c. Specify one or more states that would be acceptable as solutions to the problem. These states are called goal states.
To solve the problem of playing a game, we require the rules of the game and the targets for winning, as well as a way of representing positions in the game. The opening position can be defined as the initial state and a winning position as a goal state. Moves lead legally from the initial state to other states and eventually to a goal state. However, in most games the number of such position-to-position rules is far too large to enumerate; in chess it would exceed the number of particles in the universe. Thus the rules cannot be supplied in that form, and computer programs cannot handle them easily. Storage presents another problem, though searching can be aided by hashing.
The number of rules used must therefore be minimized, which can be done by expressing each rule in as general a form as possible. The representation of games leads to a state space representation, which is common for well-organized games with some structure. This representation allows the formal definition of a problem as the movement from a set of initial positions to one of a set of target positions. The solution then involves known techniques and a systematic search. This is quite a common method in Artificial Intelligence.
Water Jug Problem
In this problem we use two jugs, a four-litre jug and a three-litre jug; the first holds a maximum of four litres of water and the second a maximum of three litres. How can we get exactly two litres of water into the four-litre jug?
The state space is the set of ordered pairs (x, y), where x is the amount of water in the four-litre jug and y the amount in the three-litre jug; that is, x = 0, 1, 2, 3, or 4 and y = 0, 1, 2, or 3.
We need to start from the initial state (0, 0) and end up in a goal state.
Production Rules for Water Jug Problem in Artificial Intelligence
1. (x, y) if x < 4 -> (4, y) Fill the 4-liter jug.
2. (x, y) if y < 3 -> (x, 3) Fill the 3-liter jug.
3. (x, y) if x > 0 -> (x-d, y) Pour some water out of the 4-liter jug.
4. (x, y) if y > 0 -> (x, y-d) Pour some water out of the 3-liter jug.
5. (x, y) if x > 0 -> (0, y) Empty the 4-liter jug on the ground.
6. (x, y) if y > 0 -> (x, 0) Empty the 3-liter jug on the ground.
7. (x, y) if x+y >= 4 and y > 0 -> (4, y-(4-x)) Pour water from the 3-liter jug into the 4-liter jug until the 4-liter jug is full.
8. (x, y) if x+y >= 3 and x > 0 -> (x-(3-y), 3) Pour water from the 4-liter jug into the 3-liter jug until the 3-liter jug is full.
9. (x, y) if x+y <= 4 and y > 0 -> (x+y, 0) Pour all the water from the 3-liter jug into the 4-liter jug.
10. (x, y) if x+y <= 3 and x > 0 -> (0, x+y) Pour all the water from the 4-liter jug into the 3-liter jug.
11. (0, 2) -> (2, 0) Pour the 2 liters of water from the 3-liter jug into the 4-liter jug.
12. (2, y) -> (0, y) Empty the 2 liters in the 4-liter jug on the ground.
Another solution to Water Jug Problem in Artificial Intelligence
(0, 0) – Start State
(4, 0) – Rule 1, Fill the 4-liter jug
(1, 3) – Rule 8, Pour water from the 4-liter jug into the 3-liter jug until the 3-liter jug is full.
(1, 0) – Rule 6, Empty the 3-liter jug on the ground
(0, 1) – Rule 10, Pour all the water from the 4-liter jug into the 3-liter jug.
(4, 1) – Rule 1, Fill the 4-liter jug
(2, 3) – Rule 8, Pour water from the 4-liter jug into the 3-liter jug until the 3-liter jug is full.
Goal State reached
The problem is solved by using the production rules in combination with an appropriate control strategy, moving through the problem space until a path from an initial state to a goal state is found. In this problem-solving process, search is the fundamental concept. For simple problems it is easy to achieve this by hand, but there are cases where this is far too difficult.
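The state-space search described above can be sketched with a breadth-first search over (x, y) states. This is a minimal illustration: only the "fill", "empty", and "pour until full/empty" rules are implemented, since the "pour some water out" rules (3 and 4) are never needed on a shortest path.

```python
from collections import deque

def successors(state):
    """Apply the water-jug production rules to (x, y): x in the 4-litre jug, y in the 3-litre jug."""
    x, y = state
    states = set()
    states.add((4, y))                    # Rule 1: fill the 4-litre jug
    states.add((x, 3))                    # Rule 2: fill the 3-litre jug
    states.add((0, y))                    # Rule 5: empty the 4-litre jug
    states.add((x, 0))                    # Rule 6: empty the 3-litre jug
    pour = min(y, 4 - x)
    states.add((x + pour, y - pour))      # Rules 7/9: pour 3-litre into 4-litre
    pour = min(x, 3 - y)
    states.add((x - pour, y + pour))      # Rules 8/10: pour 4-litre into 3-litre
    states.discard(state)                 # drop no-op moves
    return states

def solve(start=(0, 0), goal_x=2):
    """Breadth-first search until the 4-litre jug holds goal_x litres; returns the path."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1][0] == goal_x:
            return path
        for nxt in successors(path[-1]):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None
```

Running `solve()` yields a rule sequence equivalent to one of the solutions listed above.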
PRODUCTION SYSTEMS
Production systems provide appropriate structures for performing and describing search processes. A production system has four basic components, as enumerated below:
• A set of rules, each consisting of a left side that determines the applicability of the rule and a right side that describes the operation to be performed if the rule is applied.
• A database of current facts established during the process of inference.
• A control strategy that specifies the order in which the rules will be compared with the facts in the database, and how to resolve conflicts when several rules (or several facts) are selected at once.
• A rule-firing module.
The production rules operate on the knowledge database. Each rule has a precondition—that is,
either satisfied or not by the knowledge database. If the precondition is satisfied, the rule can be
applied. Application of the rule changes the knowledge database. The control system chooses
which applicable rule should be applied and ceases computation when a termination condition on
the knowledge database is satisfied.
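The four components can be sketched as a toy production system. The rules and facts below (a dough-and-bread example) are purely illustrative assumptions; the control strategy here is simply "fire the first applicable rule".

```python
rules = [
    # (name, precondition on the database, action producing a new database)
    ("mix",  lambda db: {"flour", "water"} <= db and "dough" not in db,
             lambda db: db | {"dough"}),
    ("bake", lambda db: "dough" in db and "bread" not in db,
             lambda db: db | {"bread"}),
]

def run(database, goal):
    """Control strategy: repeatedly fire the first applicable rule until the
    termination condition (goal fact present) holds on the database."""
    while goal not in database:
        for name, condition, action in rules:
            if condition(database):
                database = action(database)   # rule firing changes the database
                break
        else:
            return None                        # no rule applies: halt without success
    return database
```

Real production systems differ mainly in the sophistication of the conflict-resolution step, which here is just rule order.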
Control Strategies
The word ‘search’ refers to the search for a solution in a problem space.
• Search proceeds with different types of ‘search control strategies’.
• A strategy is defined by picking the order in which the nodes expand.
The search strategies are evaluated along the following dimensions: completeness, time complexity, space complexity, and optimality (the search-related terms are first explained, and then the search algorithms and control strategies are illustrated next).
The control system checks the applicability of a rule. It decides which rule should be applied and terminates the process when the system gives the correct output. It also resolves conflicts when multiple rules are applicable at the same time. The strategy of the control system specifies the sequence in which rules are compared against the global database to reach the correct result.
Search-related terms
• Algorithm’s performance and complexity
Ideally we want a common measure so that we can compare approaches in order to select
the most appropriate algorithm for a given situation.
Performance of an algorithm depends on internal and external factors.
Is there any relationship between classes of production systems and classes of problems?
For any solvable problem, there exists an infinite number of production systems that describe how to find a solution. Any problem that can be solved by any production system can be solved by a commutative one, but the commutative one is often practically useless: it may use individual states to represent entire sequences of applications of rules of a simpler, non-commutative system. In the formal sense, then, there is no relationship between kinds of problems and kinds of production systems, since all problems can be solved by all kinds of systems. In the practical sense, however, there is definitely a relationship between the kinds of problems and the kinds of systems that lend themselves to describing those problems.
Partially commutative, monotonic production systems are useful for solving ignorable problems. This matters for implementation: such systems can be implemented without the ability to backtrack to previous states when it is discovered that an incorrect path has been followed. Both types of partially commutative production systems are also significant from an implementation point of view because they tend to lead to many duplications of individual states during the search process. Production systems that are not partially commutative are useful for problems in which permanent (irreversible) changes occur.
Search Algorithms
Many traditional search algorithms are used in AI applications. For complex problems, the
traditional algorithms are unable to find the solutions within some practical time and space
limits. Consequently, many special techniques are developed, using heuristic functions.
The algorithms that use heuristic functions are called heuristic algorithms.
• Heuristic algorithms are not really intelligent; they appear to be intelligent because they
achieve better performance.
• Heuristic algorithms are more efficient because they take advantage of feedback from the data
to direct the search path.
• Informed search algorithms use heuristic functions that are specific to the problem, applying them to guide the search through the search space and try to reduce the time spent searching. Also called heuristic or intelligent search, this uses information about the problem to guide the search (usually a guess of the distance to a goal state) and is therefore efficient, but such a search may not always be possible.
A good heuristic will make an informed search dramatically outperform any uninformed search: for example, the Traveling Salesman Problem (TSP), where the goal is to find a good solution rather than the best solution.
Uninformed search algorithms
1. Visiting a node: Just like the name suggests, visiting a node means to visit or select a node.
2. Exploring a node: Exploring the adjacent nodes (child nodes) of a selected node.
1. Breadth-First Search (BFS)
Take a look at the graph below; we will use the Breadth-First Search algorithm to traverse it.
The following are the steps that explain the end-to-end process of Breadth-First Search:
1. Assign ‘a’ as the root node and insert it into the Queue.
2. Extract node ‘a’ from the queue and insert the child nodes of ‘a’, i.e., ‘b’ and ‘c’.
3. Print node ‘a’.
4. The queue is not empty and has node ‘b’ and ‘c’. Since ‘b’ is the first node in the queue, let’s extract it and
insert the child nodes of ‘b’, i.e., node ‘d’ and ‘e’.
5. Repeat these steps until the queue gets empty. Note that the nodes that are already visited should not be
added to the queue again.
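The steps above can be sketched with a queue-based implementation. The adjacency list mirrors the example's node names a–g, which are illustrative.

```python
from collections import deque

# Illustrative adjacency-list graph: 'a' is the root, with children 'b' and 'c', etc.
graph = {
    'a': ['b', 'c'],
    'b': ['d', 'e'],
    'c': ['f', 'g'],
    'd': [], 'e': [], 'f': [], 'g': [],
}

def bfs(start):
    order = []                        # nodes in the order they are visited
    queue = deque([start])            # step 1: insert the root into the queue
    visited = {start}
    while queue:
        node = queue.popleft()        # extract the front node
        order.append(node)            # "print" the node
        for child in graph[node]:
            if child not in visited:  # already-visited nodes are never re-queued
                visited.add(child)
                queue.append(child)
    return order
```

Because nodes enter the queue level by level, the visit order is exactly the level-order traversal a, b, c, d, e, f, g.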
Example: the 8-puzzle problem
Breadth-first search uses a queue to hold all expanded nodes.
Advantages
Disadvantages
BFS cannot be used effectively unless the search space is quite small.
2. Depth-First Search (DFS)
The DFS algorithm is a recursive algorithm that uses the idea of backtracking. It involves exhaustive
searches of all the nodes by going ahead, if possible, else by backtracking.
Here, the word backtrack means that when you are moving forward and there are no more nodes along the current path, you move backwards along the same path to find nodes yet to be traversed. All the nodes on the current path are visited until all the unvisited nodes have been traversed, after which the next path is selected.
This recursive nature of DFS can be implemented using stacks. The basic idea is as follows:
Pick a starting node and push all its adjacent nodes into a stack.
Pop a node from stack to select the next node to visit and push all its adjacent nodes into a stack.
Repeat this process until the stack is empty. However, ensure that the nodes that are visited are marked.
This will prevent you from visiting the same node more than once. If you do not mark the nodes that are
visited and you visit the same node more than once, you may end up in an infinite loop.
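The stack-based idea above can be sketched as follows; the adjacency list is an illustrative graph, not one from the text.

```python
# Illustrative adjacency-list graph.
graph = {
    'a': ['b', 'c'],
    'b': ['d'],
    'c': ['e'],
    'd': [], 'e': [],
}

def dfs(start):
    order = []
    stack = [start]
    visited = set()
    while stack:
        node = stack.pop()                    # pop the next node to visit
        if node in visited:
            continue                          # marking prevents infinite loops
        visited.add(node)
        order.append(node)
        # push neighbours in reverse so they are visited in listed order
        for child in reversed(graph[node]):
            if child not in visited:
                stack.append(child)
    return order
```

The explicit stack mirrors the call stack of the recursive formulation; popping always follows the deepest unfinished path first.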
Some Applications of DFS include: Topological sorting, Finding connected components, Finding
articulation points (cut vertices) of the graph, Solving puzzles such as maze and Finding strongly
connected components.
Advantages - DFS
Disadvantages - DFS
May find a sub-optimal solution (one that is deeper or more costly than the best solution).
Incomplete: without a depth bound, it may fail to find a solution even if one exists.
DFS versus BFS
Vertex handling: In DFS, visited vertices are pushed onto the stack and popped off later when there is no vertex further to visit. In BFS, visited vertices are removed from the queue and then displayed at once.
Search: DFS searches with the help of a stack, i.e. a LIFO implementation. BFS searches with the help of a queue, i.e. a FIFO implementation.
Usefulness: DFS is not useful for finding the shortest path; BFS is useful for finding the shortest path.
Speed: DFS is comparatively faster than BFS.
Applications: DFS is used for topological sorting, finding connected components, finding articulation points (cut vertices) of a graph, solving puzzles such as mazes, and finding strongly connected components. BFS is used for finding connected components in a graph, finding all nodes within one connected component, and finding the shortest path between two nodes.
3. Iterative Deepening Depth-First Search (IDDFS)
The tree is divided into levels, and during each iteration only levels up to a particular depth are considered. The search is performed level-wise.
Iteration 0: A
Iteration 1: A --> B --> C
Iteration 2: A --> B --> D --> E --> C --> F --> G
Example: The figure below shows how iterative deepening works on a simple tree. Note from the figure that the number of nodes expanded by iterative deepening search is not much more than would be expanded by breadth-first search.
Advantages
If a solution exists in the tree, IDDFS will find it.
When solutions are found at shallow depths, say n, the approach proves to be efficient in time.
IDDFS is of significant benefit in game tree searching, where the search operation refines the depth limit in order to increase the search algorithm's efficiency.
Although some work is repeated at every iteration, IDDFS outperforms single BFS and DFS operations.
Disadvantages
The time it takes to reach the goal node is exponential.
The main issue with IDDFS is the time and calculations repeated at each depth; however, when the branching factor is large this repetition is not as costly as one might imagine, since most of the work lies at the deepest level.
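The iteration-by-iteration behaviour above can be sketched as a depth-limited DFS called with increasing limits. The tree matches the A–G example shown earlier.

```python
# Illustrative tree: the A..G example used in the iterations above.
tree = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F', 'G'],
    'D': [], 'E': [], 'F': [], 'G': [],
}

def depth_limited(node, goal, limit):
    """DFS that refuses to descend below the given depth limit."""
    if node == goal:
        return True
    if limit == 0:
        return False
    return any(depth_limited(child, goal, limit - 1) for child in tree[node])

def iddfs(root, goal, max_depth=10):
    """Try depth limits 0, 1, 2, ... until the goal is found."""
    for limit in range(max_depth + 1):
        if depth_limited(root, goal, limit):
            return limit        # the depth at which the goal was first found
    return None
```

Because the limit grows one level at a time, IDDFS finds the goal at its shallowest depth, just as BFS would, while only using DFS-sized memory.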
4. Bidirectional Search Algorithm
Bidirectional search runs two simultaneous searches to find the goal node: one from the initial state, called forward search, and the other from the goal node, called backward search. It replaces one single search graph with two small subgraphs, one starting the search from the initial vertex and the other from the goal vertex. The search stops when the two graphs intersect.
Bidirectional search can use search techniques such as BFS, DFS, DLS, etc.
Advantages:
Disadvantages:
Example:
In the search tree below, the bidirectional search algorithm is applied. The algorithm divides the graph/tree into two sub-graphs. It starts traversing from node 1 in the forward direction and from goal node 16 in the backward direction. The algorithm terminates at node 9, where the two searches meet.
Informed Search (Heuristics)
A heuristic is a method that improves the efficiency of the search process. Heuristics are like tour guides: they are good to the extent that they point in generally interesting directions, and bad to the extent that they may miss points of interest to particular individuals. Some heuristics help the search process without sacrificing any claims to completeness that the process might previously have had. Others may occasionally cause an excellent path to be overlooked; by sacrificing completeness, they increase efficiency. Heuristics may not find the best solution every time, but they guarantee finding a good solution in a reasonable time. They are particularly useful for solving tough and complex problems whose exact solutions would require practically infinite time, i.e. far longer than a lifetime, and which cannot be solved in any other way.
Heuristic search
To find a solution in reasonable time, rather than a complete solution in unlimited time, we use heuristics. ‘A heuristic function is a function that maps from problem state descriptions to measures of desirability, usually represented as numbers.’ Heuristic search methods use knowledge about the problem domain and choose promising operators first. These methods use heuristic functions to evaluate how close the next state is to the goal state. To find a solution using a heuristic technique, one should carry out the following steps:
1. Add domain-specific information to select the best path to continue searching along.
2. Define a heuristic function h(n) that estimates the ‘goodness’ of a node n.
Specifically, h(n) = estimated cost (or distance) of the minimal cost path from n to a goal state.
3. The term heuristic means ‘serving to aid discovery’; the estimate, based on domain-specific information computable from the current state description, indicates how close we are to a goal.
Finding a route from one city to another city is an example of a search problem in which
different search orders and the use of heuristic knowledge are easily understood.
1. State: The current city in which the traveller is located.
2. Operators: Roads linking the current city to other cities.
3. Cost Metric: The cost of taking a given road between cities.
4. Heuristic information: The search could be guided by the direction of the goal city from the
current city, or we could use airline distance as an estimate of the distance to the goal.
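The airline-distance heuristic in item 4 can be sketched as below. The city names and coordinates are hypothetical; what matters is that the straight-line distance never overestimates the true road distance, which makes the heuristic admissible.

```python
import math

# Hypothetical city coordinates (x, y), e.g. in kilometres on a flat map.
coords = {
    'A': (0, 0),
    'B': (3, 4),
    'Goal': (6, 8),
}

def h(city):
    """Estimated distance from city to the goal: the airline (straight-line)
    distance, which is a lower bound on any road route."""
    (x1, y1), (x2, y2) = coords[city], coords['Goal']
    return math.hypot(x2 - x1, y2 - y1)
```

A search guided by h will prefer expanding B (5 units from the goal) over A (10 units), exactly the "head toward the goal city" guidance described above.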
Heuristic search techniques
For complex problems, the traditional algorithms presented above are unable to find a solution within practical time and space limits. Consequently, many special techniques have been developed using heuristic functions.
• Blind search is not always possible, because it requires too much time or space (memory).
• A heuristic function estimates how promising a given state is with respect to the goal.
• Heuristics that underestimate the true cost are desirable and are called admissible.
• A heuristic evaluation function estimates the likelihood of a given state leading to the goal state.
• A heuristic search function estimates the cost from the current state to the goal, presuming the function is efficient.
1. Hill Climbing
Hill climbing is a heuristic search used for mathematical optimization problems in the field of Artificial Intelligence.
Given a large set of inputs and a good heuristic function, it tries to find a sufficiently good solution to the problem. This solution may not be the global optimum.
In the above definition, ‘mathematical optimization problems’ means that hill climbing solves problems where we need to maximize or minimize a given real function by choosing values from the given inputs. Example: the travelling salesman problem, where we need to minimize the distance travelled by the salesman.
‘Heuristic search’ means that this search algorithm may not find the optimal solution to
the problem. However, it will give a good solution in reasonable time.
A heuristic function is a function that will rank all the possible alternatives at any
branching step in search algorithm based on the available information. It helps the
algorithm to select the best route out of possible routes.
Hence hill climbing is a variant of the generate-and-test algorithm, as it takes feedback from the test procedure. This feedback is then used by the generator to decide the next move in the search space.
Features of hill climbing:
1. Variant of generate and test: the test procedure's feedback guides the generator.
2. Uses the greedy approach: at any point in the state space, the search moves only in the direction that optimizes the cost function, with the hope of finding the optimal solution at the end.
Comparison of Algorithms
Uninformed search: no knowledge of how far a node is from the goal state.
Informed search: guides the search process toward the goal; prefers states (nodes) that lead close to and not away from the goal state.
Suppose there are N cities; then a solution would require examining N! possible orderings to find the shortest route. This is not efficient: with N = 10 there are already 3,628,800 possible routes. This is an example of combinatorial explosion. There are better methods for solving such problems using heuristics.
In hill climbing the basic idea is to always head towards a state which is better than the current one.
So, if you are in town A and you can get to town B and town C (and your target is town D), then you should make a move IF town B or C appears nearer to town D than town A does.
Search Algorithm: Simple Hill Climbing
1. Evaluate the initial state. If it is also a goal state, then return it and quit; otherwise continue with the initial state as the current state.
2. Loop until a solution is found or until there are no new operators left to be applied in the current state:
a) Select an operator that has not yet been applied to the current state and apply it to produce a new state.
b) Evaluate the new state:
i) If it is a goal state, return it and quit.
ii) If it is not a goal state but is better than the current state, then make it the current state.
iii) If it is not better than the current state, then continue in the loop.
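The simple hill-climbing loop can be sketched as follows: the first successor that improves on the current state is taken. The objective function and successor generator are illustrative assumptions (maximizing f(x) = -(x-5)^2 over the integers).

```python
def f(x):
    """Illustrative objective to maximize; its single peak is at x = 5."""
    return -(x - 5) ** 2

def successors(x):
    """Illustrative operators: step one unit left or right."""
    return [x - 1, x + 1]

def hill_climb(start):
    current = start
    while True:
        for nxt in successors(current):
            if f(nxt) > f(current):   # simple variant: take the FIRST improving move
                current = nxt
                break
        else:
            return current            # no successor is better: stop (possibly a local maximum)
```

On this single-peaked function the loop always reaches 5; on a function with several peaks it would stop at whichever local maximum is nearest the start.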
Disadvantages
The key point while solving any hill-climbing problem is to choose an appropriate heuristic function. Let us define such a function h: h(x) = +1 for each block in the support structure that is correctly positioned, and -1 for each block in the support structure that is incorrectly positioned.
Since h(2) = -3, which is a step toward the goal state and better than the current state, this next state is taken as the current state and the search proceeds further. Other possible neighbouring states are not considered.
Basic hill climbing applies one operator and gets a new state; if that state is better, it becomes the current state. Steepest-ascent hill climbing instead tests all possible successors and chooses the best.
Algorithm: Steepest-Ascent Hill Climbing
1. Evaluate the initial state. If it is also a goal state, then return it and quit; otherwise continue with the initial state as the current state.
2. Loop until a solution is found or until a complete iteration produces no change to the current state:
a) Let SUCC be a state such that any possible successor of the current state will be better than SUCC.
b) For each operator that applies to the current state:
i) Apply the operator and generate a new state.
ii) Evaluate the new state. If it is a goal state, return it and quit. If not, compare it to SUCC; if it is better, set SUCC to this new state.
c) If SUCC is better than the current state, then set the current state to SUCC.
Hill climbing search algorithm: drawbacks
A local maximum is a state that is better than all its neighbours but not better than some other states farther away. At a local maximum, all moves appear to make things worse. Local maxima often occur within sight of a solution and are then called foothills.
Different regions in the State Space Diagram
1. Local maximum: a state which is better than its neighbouring states; however, there exists a state which is better than it (the global maximum). It is better than its neighbours because the value of the objective function is higher there.
2. Global maximum: the best possible state in the state space diagram; the objective function has its highest value at this state.
3. Plateau / flat local maximum: a flat region of the state space where neighbouring states have the same value.
4. Ridge: a region which is higher than its neighbours but itself has a slope. It is a special kind of local maximum.
5. Current state: the region of the state space diagram where we are currently present during the search.
6. Shoulder: a plateau that has an uphill edge.
Problems in different regions in hill climbing
Hill climbing cannot reach the optimal/best state (global maximum) if it enters any of the following regions:
1. Local maximum: at a local maximum, all neighbouring states have values worse than the current state. Since hill climbing uses a greedy approach, it will not move to a worse state, and it terminates. The process ends even though a better solution may exist.
To overcome the local maximum problem: use a backtracking technique. Maintain a list of visited states; if the search reaches an undesirable state, it can backtrack to a previous configuration and explore a new path.
2. Plateau: on a plateau, all neighbours have the same value, so it is not possible to select the best direction.
To overcome plateaus: make a big jump. Randomly select a state far away from the current state; chances are we will land in a non-plateau region.
3. Ridge: any point on a ridge can look like a peak because movement in all possible directions is downward. Hence the algorithm stops when it reaches such a state.
To overcome ridges: apply two or more rules before testing, i.e. move in several directions at once.
Hill climbing using local information
3. Stochastic Hill Climbing: Simulated Annealing
Simulated Annealing (SA) is an effective and general form of optimization. It is useful in finding global optima in the presence of large numbers of local optima. “Annealing” refers to an analogy with thermodynamics, specifically the way that metals cool and anneal. Simulated annealing uses the objective function of an optimization problem in place of the energy of a material. A move that worsens the objective by ∆E is accepted with probability
p = e^(-∆E/T)
where T is the current temperature; as T falls, large downhill moves become increasingly unlikely.
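The acceptance rule can be sketched as follows (the standard form of the probability is p = exp(-∆E/T)). The objective function, step size, and cooling schedule below are illustrative assumptions.

```python
import math
import random

def accept(delta_e, temperature):
    """Accept improving moves always; worse moves with probability e^(-dE/T)."""
    if delta_e <= 0:                    # new state is no worse
        return True
    return random.random() < math.exp(-delta_e / temperature)

def anneal(f, start, t0=10.0, cooling=0.95, steps=1000):
    """Minimize f over integers by random +/-1 moves under an annealing schedule."""
    current, t = start, t0
    for _ in range(steps):
        candidate = current + random.choice((-1, 1))
        if accept(f(candidate) - f(current), t):
            current = candidate
        t *= cooling                    # geometric cooling schedule
    return current
```

Early on (high T) the walk escapes local minima by accepting uphill moves; late on (low T) it behaves like pure hill descent.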
A Star (A*) Search Algorithm
A* is a searching algorithm used to find the shortest path between an initial and a final point. It is used for minimizing the total estimated solution cost, and it is the most widely known form of best-first search.
It is a handy algorithm often used for map traversal to find the shortest path. A* was initially designed as a graph traversal algorithm, to help build a robot that could find its own course. It remains a widely popular algorithm for graph traversal.
It searches shorter paths first, making it an optimal and complete algorithm: an optimal algorithm finds the least-cost outcome for a problem, while a complete algorithm finds a solution whenever one exists.
Heuristics are basically educated guesses. It is crucial to understand that we do not know the distance to the finish point until we find the route, since there are so many things that might get in the way.
Another aspect that makes A* so powerful is the use of weighted graphs in its implementation. A weighted graph
uses numbers to represent the cost of taking each path or course of action. This means that the algorithms can take
the path with the least cost, and find the best route in terms of distance and time.
A* requires a heuristic function to evaluate the cost of the path that passes through a particular state. The algorithm is complete if the branching factor is finite and every action has a fixed cost. Its evaluation function is defined by the following formula:
f(n) = g(n) + h(n)
Where
g(n): the actual cost of the path from the start state to the current state n.
h(n): the estimated cost of the path from the current state n to the goal state.
f(n): the estimated total cost of the cheapest path from the start state to the goal state that passes through n.
For the implementation of the A* algorithm, we use two lists, named OPEN and CLOSE:
OPEN: contains the nodes that have been generated but not yet examined.
CLOSE: contains the nodes that have already been examined.
Algorithm
Step 1: Place the starting node into OPEN and find its f(n) value.
Step 2: Remove from OPEN the node having the smallest f(n) value. If it is a goal node, then stop and return success.
Step 3: Otherwise, find all the successors of the removed node.
Step 4: Find the f(n) value of all the successors, place them into OPEN, and place the removed node into CLOSE.
Step 5: Go to Step 2.
Step 6: Exit.
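The OPEN/CLOSE steps above can be sketched using a priority queue for OPEN. The weighted graph and the heuristic values h are illustrative assumptions (h is consistent here, so the first expansion of the goal is optimal).

```python
import heapq

# Illustrative weighted graph and a consistent heuristic h (estimated cost to 'G').
graph = {
    'S': {'A': 1, 'B': 4},
    'A': {'B': 2, 'G': 12},
    'B': {'G': 5},
    'G': {},
}
h = {'S': 5, 'A': 4, 'B': 2, 'G': 0}

def a_star(start, goal):
    # OPEN: priority queue ordered by f(n) = g(n) + h(n)
    open_list = [(h[start], 0, start, [start])]      # (f, g, node, path)
    closed = set()                                   # CLOSE: examined nodes
    while open_list:
        f, g, node, path = heapq.heappop(open_list)  # Step 2: smallest f(n) first
        if node == goal:
            return path, g                           # goal reached: return path and cost
        if node in closed:
            continue
        closed.add(node)                             # Step 4: move node into CLOSE
        for succ, cost in graph[node].items():       # Step 3: generate successors
            if succ not in closed:
                g2 = g + cost
                heapq.heappush(open_list, (g2 + h[succ], g2, succ, path + [succ]))
    return None
```

Here the least-cost route S -> A -> B -> G (cost 8) is found even though the direct edge A -> G looks tempting, because f(n) accounts for the remaining estimated cost.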
Advantages of A*
It is complete if the branching factor is finite and every action has a fixed cost.
It is optimal when the heuristic is admissible.
Disadvantages of A*
The execution speed of A* search is highly dependent on the accuracy of the heuristic used to compute h(n).
It has complexity problems.
A major drawback of the algorithm is its space and time complexity: it takes a large amount of space to store all possible paths and a lot of time to find them.
Genetic Algorithms
Genetic algorithms use an iterative process to arrive at the best solution, finding the best out of multiple candidate solutions. As in natural selection, the fittest individuals survive in comparison with the others.
Genetic algorithms simulate the process of natural selection: species that can adapt to changes in their environment survive, reproduce, and pass on to the next generation. In simple words, they simulate ‘survival of the fittest’ among individuals of consecutive generations to solve a problem. Each generation consists of a population of individuals, and each individual represents a point in the search space and a possible solution. Each individual is represented as a string of characters/integers/floats/bits. This string is analogous to a chromosome.
Fitness Score
A fitness score is given to each individual; it shows the ability of that individual to "compete".
Individuals having an optimal (or near-optimal) fitness score are sought.
A GA maintains a population of n individuals (chromosomes/solutions) along with their fitness scores.
Individuals with better fitness scores are given more chances to reproduce than others: they are selected to
mate and produce better offspring by combining the chromosomes of the parents. Since the population size is
static, room has to be created for the new arrivals, so some individuals die and are replaced, eventually
creating a new generation once all the mating opportunities of the old population are exhausted. The hope is
that over successive generations better solutions will arrive while the least fit die out.
Each new generation has, on average, more "good genes" than the individuals of the previous
generations; thus each new generation has better "partial solutions" than the previous ones. Once the
offspring produced are not significantly different from the offspring produced by previous populations, the
population has converged, and the algorithm is said to have converged to a set of solutions for the problem.
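The generational loop described above can be sketched end to end on a toy problem. This is an illustrative assumption, not a prescribed implementation: the problem (maximize the number of 1-bits in a 20-bit chromosome, often called "OneMax"), the population size, the tournament selection scheme and the mutation rate are all made-up choices.

```python
import random

random.seed(0)                            # deterministic for illustration
LENGTH, POP_SIZE, GENERATIONS = 20, 30, 60

def fitness(chrom):                       # fitness score = count of 1-bits
    return sum(chrom)

# initial population: random bit-string chromosomes
population = [[random.randint(0, 1) for _ in range(LENGTH)]
              for _ in range(POP_SIZE)]

for gen in range(GENERATIONS):
    def select():                         # fitter individuals get more
        return max(random.sample(population, 3), key=fitness)  # chances
    next_gen = []
    while len(next_gen) < POP_SIZE:       # population size stays static
        p1, p2 = select(), select()
        point = random.randrange(1, LENGTH)           # single-point crossover
        child = p1[:point] + p2[point:]
        child = [g ^ (random.random() < 0.01) for g in child]  # mutation
        next_gen.append(child)
    population = next_gen                 # old generation dies off

best = max(population, key=fitness)       # near-optimal after convergence
```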
There are several phases of a genetic algorithm. They include:
Initial population
This is where the whole process begins: an initial set of individuals forms the population. Each individual is
a candidate solution to the problem, characterized by its traits. Each individual has genes that, when joined
into a string, form a chromosome. In the algorithm, the strings are typically represented using binary values
(0s and 1s).
Fitness function
This is the function used to determine how fit an individual is to compete with the other individuals. Each
individual gets a fitness score, and selection for reproduction relies on this score. In some problems the
representation can become invalid; for example, in a knapsack problem, the total size of the selected objects
may exceed the knapsack's capacity. In cases where the fitness function is hard to define directly, we can
find the score by other means, such as simulating the phenotype (e.g., with computational fluid dynamics) or
using interactive genetic algorithms, in which humans assign the score.
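A fitness function for the knapsack case mentioned above might look like the sketch below. The item values, weights and capacity are made-up illustrative numbers; penalizing an over-capacity chromosome with a score of 0 is one simple way (among several, such as repair operators) to handle invalid representations.

```python
# Made-up 0/1 knapsack instance: chromosome bit i means "take item i".
values = [60, 100, 120, 30]
weights = [10, 20, 30, 15]
CAPACITY = 50

def knapsack_fitness(chrom):
    """Fitness score: total value of selected items, or 0 if the
    selection is invalid (total weight exceeds the capacity)."""
    weight = sum(w for w, bit in zip(weights, chrom) if bit)
    value = sum(v for v, bit in zip(values, chrom) if bit)
    if weight > CAPACITY:
        return 0          # penalize the invalid representation
    return value

# knapsack_fitness([0, 1, 1, 0]) -> 220 (weight 50, within capacity)
# knapsack_fitness([1, 1, 1, 0]) -> 0   (weight 60 exceeds capacity)
```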
Selection
We have to select the best individuals, who will create the next generation. Pairs of individuals get
selected depending on their fitness scores; having a higher fitness score improves the chances of
getting selected.
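One common fitness-proportionate scheme is roulette-wheel selection, sketched below: each individual occupies a slice of a "wheel" proportional to its fitness score, so fitter individuals are more likely to be picked. The bit-string population in the usage line is a made-up example.

```python
import random

def roulette_select(population, fitness):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitness(ind) for ind in population)
    pick = random.uniform(0, total)       # spin the wheel
    running = 0.0
    for ind in population:
        running += fitness(ind)
        if running >= pick:               # landed on this slice
            return ind
    return population[-1]                 # guard against rounding

# Usage: [1, 1, 1] (fitness 3) is three times as likely as [1, 0, 0].
parent = roulette_select([[1, 1, 1], [1, 0, 0], [0, 0, 0]], sum)
```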
Crossover
This is a critical phase of a genetic algorithm. When performing mating, a crossover point is chosen at
random from within the genes. Offspring are created by exchanging the genes of the parents up to the
crossover point; the new offspring are then added to the population.
Mutation
It is when we flip some of the strings. We select the genes at this stage by considering those with low
probability. It mainly avoids premature convergence and maintains the existing diversity within the
population.
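For a binary chromosome, mutation is a per-gene bit flip, as sketched below. The 0.05 rate is an illustrative assumption; in practice the rate is usually kept small so that mutation perturbs rather than destroys good solutions.

```python
import random

def mutate(chromosome, rate=0.05):
    """Flip each bit independently with a small probability `rate`."""
    return [1 - gene if random.random() < rate else gene
            for gene in chromosome]

# Usage: on a long all-zero chromosome, roughly rate * length bits flip.
mutated = mutate([0] * 1000, rate=0.05)
```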
Heuristics
Different heuristic approaches can be used to guide the search. These techniques may be practical, but there
is no guarantee that they are perfect; their purpose is to make the calculation process faster and more
robust. For example, some approaches penalize crossover between candidate solutions that are too similar, in
order to preserve diversity.
Termination
It is a repetitive process that occurs until we have the termination symptoms or conditions showing up.
This includes:
3) Mutation Operator: The key idea is to insert random genes into the offspring to maintain diversity in the
population and avoid premature convergence.
Advantages
They are robust.
They provide optimization over large search spaces.
Unlike traditional AI systems, they do not break on slight changes in the input or in the presence of noise.
A typical application of this technology is determining the best route for delivering a product
from point A to point B when the conditions may change periodically. These conditions could relate
to weather, road conditions, traffic flow, rush hour, etc. The best route is often the fastest, the
shortest, the most scenic, the most cost-effective, or a combination thereof.