AML Unit 4
Genetics is derived from the Greek word "genesis", which means "to grow". Genetics determines the
hereditary factors, resemblances, and differences between offspring in the process of evolution.
Genetic Algorithms are likewise derived from the process of natural evolution.
Some Terminologies In A Biological Chromosome
By simulating the process of natural selection, reproduction and mutation, the genetic algorithms
can produce high-quality solutions for various problems including search and optimization.
GAs perform a randomized search to solve optimization problems, but they use historical
information from previous generations to direct the search towards better-performing regions of
the new search space.
Following are some of the basic terminologies that can help us to understand genetic algorithms:
1. Chromosome/Individual
A chromosome is one of the solutions in the population. Genes are joined into a string to form
a Chromosome (solution).
2. Gene:
This is a single element (position) in a chromosome, typically represented by a bit (0 or 1); the
genes together determine the solution to the problem. An individual is characterized by a set of
parameters (variables) known as genes.
In a genetic algorithm, the set of genes of an individual is represented as a string over some
alphabet. Usually, binary values are used (a string of 1s and 0s). We say that we encode the genes
in a chromosome.
3. Population:
This is a subset of all the probable solutions that can solve the given problem; in other words, it is
a set of chromosomes. The process begins with a set of individuals called a Population. Each
individual is a solution to the problem we want to solve.
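As a minimal sketch (not part of the original notes), a binary-encoded individual and an initial population might look as follows in Python; the chromosome length and population size are arbitrary choices for illustration:

```python
import random

CHROMOSOME_LENGTH = 8   # number of genes per individual (illustrative assumption)
POPULATION_SIZE = 6     # number of individuals (illustrative assumption)

def random_chromosome(length=CHROMOSOME_LENGTH):
    """One individual: a chromosome encoded as a string of 0/1 genes."""
    return [random.randint(0, 1) for _ in range(length)]

def initial_population(size=POPULATION_SIZE):
    """The population: a set of candidate solutions (chromosomes)."""
    return [random_chromosome() for _ in range(size)]

population = initial_population()
print(population)   # e.g. [[1, 0, 1, 1, 0, 0, 1, 0], [0, 1, ...], ...]
```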
4. Fitness Function:
The fitness function determines how fit an individual is, that is, its ability to compete with other
individuals. It assigns a fitness score to every individual in the population, and this score
determines the probability of that individual being chosen for reproduction: the higher the fitness
score, the higher the chances of being selected.
The fitness function takes a candidate solution as input and produces a measure of that solution's
suitability as output. It tells how close the solution is to the optimal solution. The fitness function
can be defined in many ways, for example as the sum of all parameters related to the problem, a
Euclidean distance, and so on; there is no fixed rule for evaluating the fitness function.
In every iteration, the individuals are evaluated based on their fitness scores which are computed
by the fitness function.
Individuals with better fitness scores represent better solutions and are more likely to be chosen
for crossover and to pass their genes on to the next generation.
For example, if a genetic algorithm is used for feature selection in a classification problem, the
accuracy of the model trained on the selected features can serve as the fitness function.
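Continuing the bit-string encoding, a hedged example of a fitness function is the classic OneMax toy score (count the 1-genes); this particular function is not prescribed by the notes, and in a real task it would be replaced by a problem-specific measure such as classification accuracy with the selected features:

```python
def fitness(chromosome):
    """Toy OneMax fitness: simply count the 1-genes (higher is fitter).
    In a real problem this would be a task-specific score, e.g. the accuracy
    of a classifier trained on the features selected by this chromosome."""
    return sum(chromosome)

individual = [1, 0, 1, 1, 0, 1, 1, 0]   # example chromosome (assumed encoding)
print(fitness(individual))               # 5
```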
GENETIC OPERATORS
6. Selection
After calculating the fitness of every individual in the population, a selection process is used to
determine which of the individuals in the population will get to reproduce and create the
offspring that will form the next generation.
The idea of the selection phase is to select the fittest individuals and let them pass their genes to the
next generation.
Two pairs of individuals (parents) are selected based on their fitness scores. Individuals with high
fitness have a greater chance of being selected for reproduction.
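A small illustrative sketch of fitness-proportionate (roulette-wheel) selection, one common way to implement this phase; the notes do not commit to a specific selection scheme, so treat the details as assumptions:

```python
import random

def roulette_wheel_selection(population, fitnesses, n_parents=2):
    """Fitness-proportionate selection: fitter individuals are more likely to be chosen."""
    return random.choices(population, weights=fitnesses, k=n_parents)

population = [[1, 0, 1, 1], [0, 0, 0, 1], [1, 1, 1, 0], [0, 1, 0, 0]]
fitnesses  = [3, 1, 3, 1]                         # e.g. OneMax scores
parent1, parent2 = roulette_wheel_selection(population, fitnesses)
```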
7. Crossover
This operator swaps the genetic information of two parents to produce offspring. It is performed
on parent pairs selected from the population and generates a child population of the same size as
the parent population.
For each pair of parents to be mated, one or more crossover points (cross-sites) are chosen at
random along the length of the string, and the genes at these sites are exchanged between the two
parents, creating completely new individuals (offspring). This process is also called mating.
The reproduction step that follows selection only makes clones of the good strings; the crossover
operator is then applied to these strings to produce better offspring. In its simplest form, crossover
proceeds as follows:
1. Two individuals are selected randomly from the population to produce offspring.
2. A cross-site is selected at random along the length of the string.
3. The values at the site are swapped.
The crossover performed can be a single-point crossover, a two-point crossover, a multi-point
crossover, etc. A single-point crossover has one cross-site, while a two-point crossover has two
sites where the values are swapped.
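A minimal sketch of single-point crossover on two bit-string parents; as described above, the cross-site is chosen at random along the string and the tails are exchanged:

```python
import random

def single_point_crossover(parent1, parent2):
    """Exchange the tails of two parents at a randomly chosen cross-site."""
    point = random.randint(1, len(parent1) - 1)   # keep both segments non-empty
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

p1 = [1, 1, 1, 1, 1, 1]
p2 = [0, 0, 0, 0, 0, 0]
c1, c2 = single_point_crossover(p1, p2)
# e.g. point = 2  ->  c1 = [1, 1, 0, 0, 0, 0],  c2 = [0, 0, 1, 1, 1, 1]
```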
8. Mutation:
This operator adds new genetic information to the child population. This is achieved by flipping
some bits in the chromosome. Mutation helps the search escape local optima and enhances
diversification.
Mutation is a random change in a chromosome that introduces new patterns, for example flipping
a bit in a binary string. The key idea is to insert random genes into the offspring to maintain
diversity in the population.
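A small sketch of bit-flip mutation; the mutation rate of 0.05 is an arbitrary assumption for illustration:

```python
import random

def mutate(chromosome, mutation_rate=0.05):
    """Flip each bit with a small probability to keep diversity in the population."""
    return [1 - gene if random.random() < mutation_rate else gene
            for gene in chromosome]

offspring = [1, 1, 0, 0, 1, 0, 1, 1]
print(mutate(offspring))   # occasionally one or more bits are flipped
```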
Termination
The algorithm terminates if the population has converged (it does not produce offspring that are
significantly different from the previous generation). It is then said that the genetic algorithm has
provided a set of solutions to our problem.
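Putting the operators together, here is a hedged end-to-end sketch of a genetic algorithm on the toy OneMax problem; all parameters are illustrative assumptions, and the run terminates either when the optimal string is found or after a fixed number of generations:

```python
import random

def fitness(c):                                   # toy OneMax fitness: count the 1s
    return sum(c)

def select(pop, k=2):                             # fitness-proportionate selection
    return random.choices(pop, weights=[fitness(c) + 1 for c in pop], k=k)  # +1 keeps weights positive

def crossover(p1, p2):                            # single-point crossover
    pt = random.randint(1, len(p1) - 1)
    return p1[:pt] + p2[pt:], p2[:pt] + p1[pt:]

def mutate(c, rate=0.05):                         # bit-flip mutation
    return [1 - g if random.random() < rate else g for g in c]

def genetic_algorithm(pop_size=20, length=12, max_generations=200):
    population = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for generation in range(max_generations):
        best = max(population, key=fitness)
        if fitness(best) == length:               # optimum reached -> terminate
            return best, generation
        next_generation = []
        while len(next_generation) < pop_size:
            p1, p2 = select(population)
            next_generation.extend(mutate(child) for child in crossover(p1, p2))
        population = next_generation[:pop_size]
    return max(population, key=fitness), max_generations

print(genetic_algorithm())
```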
Genetic programming
Genetic programming is a subset of genetic algorithms, the main difference between them
being the representation of the chromosome. Genetic algorithms deal with optimization
problems where the phenotype is a point or vector, while in genetic programming the
phenotype is a tree. In addition, genetic-programming individuals can grow or shrink their
genotype by adding or removing terminals and instructions.
Since genetic programming is based on trees, it is widely applied to evolve decision trees
and behaviour trees for game playing.
To create a tree, function elements are randomly selected from a grammar and then branch
out into randomly chosen terminals. As in a genetic algorithm, individuals whose values fall
below the allowed minimum or above the allowed maximum are discarded, and sampling
continues until an individual within the allowed min and max range is obtained.
The fitness function for genetic programming can determine fitness values in different ways,
for example by creating test cases and evaluating how well an individual performs on them;
it can also use another AI system for the evaluation. However, we need to reject overly
complex models, since keeping model complexity down helps the model generalize to new
inputs. To avoid overly complex models we can assess the bloat of a tree: bloat refers to the
addition of more terminal nodes and depth to a tree while the fitness value improves only
slightly.
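As a hedged illustration of these ideas, the sketch below represents a GP individual as a small expression tree (nested tuples, an assumed representation), evaluates it on test cases, and adds a small per-node penalty to discourage bloat; none of these details are prescribed by the notes:

```python
import operator

# Assumed tree encoding: ('op', left, right) for function nodes, 'x' or a number for terminals.
OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def evaluate(tree, x):
    """Recursively evaluate an expression tree (the GP individual) at input x."""
    if tree == 'x':
        return x
    if isinstance(tree, (int, float)):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def size(tree):
    """Number of nodes, used to penalize bloated (overly large) trees."""
    if tree == 'x' or isinstance(tree, (int, float)):
        return 1
    return 1 + size(tree[1]) + size(tree[2])

def gp_fitness(tree, test_cases, bloat_penalty=0.01):
    """Total error on the test cases plus a small per-node penalty (lower is better)."""
    error = sum(abs(evaluate(tree, x) - y) for x, y in test_cases)
    return error + bloat_penalty * size(tree)

# Target behaviour expressed as test cases: y = x*x + 1
cases = [(x, x * x + 1) for x in range(-3, 4)]
candidate = ('+', ('*', 'x', 'x'), 1)             # one candidate individual
print(gp_fitness(candidate, cases))               # zero error plus a small size penalty
```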
Types of GP include tree-based GP (the classic form), linear GP, stack-based GP, and Cartesian GP.
Genetic programming is iterative, and at each new stage of the algorithm it chooses only the
fittest of the "offspring", as measured by a fitness function, to cross and reproduce in the next
generation. Just like in biological evolution, evolutionary algorithms can sometimes produce
randomly mutated offspring, but since only the offspring with the highest fitness measure are
reproduced, the fitness will almost always improve over generations.
Genetic programming will generally terminate once it reaches a predefined fitness measure.
Additionally, architecture-altering operations can be introduced to an already running program
in order to allow for new sources of information to be analyzed for a given fitness function.
Although the underlying idea was originally proposed by Alan Turing in 1950, it wasn't until the
1980s that successful genetic algorithms were first implemented. The first patent for an algorithm
using genetic operations was filed in 1988 by John Koza, who remains a leader in the field.
Genetic programming systems utilize a type of machine learning technique that can include
automatic programming without the need for manual interaction. This means that genetic
algorithms can utilize automatic program induction to run as new information is ingested, so
that the programs can be optimized automatically. Genetic or evolutionary algorithms have a
variety of uses, particularly around domains where an exact solution is not known in advance,
or when finding an approximate solution is deemed appropriate. Genetic programming is often
used in conjunction with other forms of machine learning, as it is useful for performing
symbolic regression and feature classification.
Benefits and applications of genetic programming include:
• Saving time: Genetic algorithms are able to process large amounts of data much more
quickly than humans can. Additionally, these algorithms run free of human biases and are
thereby able to come up with ideas that might otherwise not have been considered.
• Data and text classification: Genetic programming can quickly identify and classify
various forms of data without the need for human oversight. Genetic programming can
use data tree construction in order to optimize these classifications, especially when
dealing with big data.
• Ensuring network security: Rule evolution approaches have been successfully applied
to identify new attacks on networks. By quickly identifying intrusions, businesses and
organizations can ensure that they can respond to such attacks before they are able to
access confidential information.
Reinforcement Learning
It is a type of machine learning technique where a computer agent learns to perform a task
through repeated trial and error interactions with a dynamic environment. This learning approach
enables the agent to make a series of decisions that maximize a reward metric for the task
without human intervention and without being explicitly programmed to achieve the task –
Mathworks
Environment – the physical world in which the agent learns and decides which actions to perform.
Reward – for each action selected by the agent, the environment gives a reward. It is usually a
scalar value.
Value Function – the value of a state represents the total reward expected to be achieved starting
from that state until the end of the episode.
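As a minimal sketch of the value-function idea (not from the notes), the value of each state along a trajectory can be computed as the discounted sum of the rewards collected from that state until the end of the episode; the discount factor gamma = 0.9 is an assumed choice:

```python
def state_values(rewards, gamma=0.9):
    """Value of each visited state: the discounted sum of the rewards collected
    from that state until the end of the episode (V[t] = r[t] + gamma * V[t+1])."""
    values = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        values[t] = running
    return values

episode_rewards = [0, 0, 1, 0, 10]    # reward given by the environment at each step
print(state_values(episode_rewards))  # earlier states inherit (discounted) future reward
```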
One of the best examples is AlphaGo: DeepMind, a subsidiary of Alphabet, developed this
computer program, which went on to beat the best human player in the world at the board game
Go in 2016. This made the world sit up and recognize RL's significance, as it was practically
impossible to hand-code a game as complex as Go; similarly, for large and complex tasks,
explicit programming becomes unworkable. Beyond self-driving cars, which use RL to drive with
safety and precision, the technology can also be used for robots (without manual programming)
and to figure out the configuration required for the apparatus in a data center.
Other players in RL are Mobileye, OpenAI, Google, and Uber. Google and DeepMind also
worked together to make Google's data centers energy efficient. This was made possible through
an RL algorithm that can learn from assembled data, experiment through simulation, and finally
suggest when and how the cooling systems must be operated.
Steps of ’cause and effect’ for an RL Agent
• The artificial agent observes the input state (RL first identifies and formulates the
problem).
• The next action is determined by the agent's strategy (policy).
• The action is then performed, and the environment provides a reward or punishment as
reinforcement.
• The resulting state is recorded.
• Finally, the choice of the best action can be further adjusted to enhance results; a minimal
sketch of this loop follows the list.
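A minimal, hedged sketch of this perceive-act-reinforce loop using tabular Q-learning on a toy corridor environment; the environment, learning rate, discount factor, and exploration rate are all assumptions made for illustration:

```python
import random

# Toy corridor environment: states 0..4, reward 1.0 only for reaching state 4 (the goal).
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                       # move left / move right

def step(state, action):
    """Environment dynamics: return the next state and the reward for the move."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.9, 0.2    # learning rate, discount, exploration (assumed)

for episode in range(200):
    state = 0
    while state != GOAL:
        # 1-2. observe the current state and choose an action (explore or exploit)
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        # 3. perform the action; the environment returns a reward (the reinforcement)
        next_state, reward = step(state, action)
        # 4-5. record the new state and adjust the action-value estimate accordingly
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

print(max(ACTIONS, key=lambda a: Q[(0, a)]))   # best first move learned from state 0 (typically +1)
```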