AML Unit 4

The document provides an overview of Genetic Algorithms (GAs), which are search-based algorithms inspired by natural selection and genetics, used for solving optimization problems in various applications. It explains key terminologies, the foundational concepts of GAs, and their operations such as selection, crossover, and mutation, as well as the relationship between GAs and Genetic Programming (GP). Additionally, it introduces Reinforcement Learning (RL) as a machine learning approach that enables agents to learn optimal behaviors through trial and error in an environment.


Unit - 4 : Introduction to Evolutionary Learning

A genetic algorithm is a search-based algorithm used for solving optimization problems in machine learning. This algorithm is important because it can solve difficult problems that would otherwise take a long time to solve. It has been used in various real-life applications such as data centers, electronic circuit design, code-breaking, image processing, and artificial creativity.
Genetic Algorithms are based on the evolutionary ideas of natural selection and genetics. GAs are adaptive heuristic search algorithms, i.e. they follow an iterative pattern that changes over time. Like reinforcement learning, they rely on feedback without being told the correct path to follow; the feedback can be either positive or negative.

Biological Background Of Genetic Algorithms

Genetics derives from the Greek word "genesis", meaning to grow. Genetics determines the heredity factors, resemblances, and differences between offspring in the process of evolution. Genetic Algorithms are likewise derived from natural evolution.
Some Terminologies In A Biological Chromosome

• Chromosomes: All genetic information of a species is stored in chromosomes.
• Genes: Chromosomes are divided into several parts called genes.
• Alleles: Genes determine the characteristics of an individual. Each possible value a gene can take in forming a trait is called an allele; a gene can have several alleles.
• Gene Pool: The set of all gene combinations, i.e. all alleles present in a population, is called the gene pool.
• Genome: The set of genes of a species is called the genome.
• Locus: The position of a gene within the genome is called its locus.
• Genotype: The full combination of genes in an individual is called the genotype.
• Phenotype: The genotype in decoded form is called the phenotype.

Correlation Of A Chromosome With GA


The human body has chromosomes that are made of genes, and the set of all genes of a species is called the genome. In living beings the genome is stored across several chromosomes, while in a GA all genes are stored in a single chromosome.

Foundation of Genetic Algorithms


Genetic algorithms are based on an analogy with the genetic structure and behaviour of chromosomes in a population. The foundation of GAs, based on this analogy, is as follows:
1. Individuals in a population compete for resources and mates.
2. The individuals that are successful (fittest) mate and create more offspring than others.
3. Genes from the "fittest" parents propagate through the generations; sometimes parents create offspring that are better than either parent.
4. Thus each successive generation becomes better suited to its environment.
Comparison between Natural Evolution and Genetic Algorithm Terminology
• Chromosome → String
• Gene → Feature
• Allele → Feature Value
• Genotype → Coded String
• Phenotype → Decoded Structure

What Are Genetic Algorithms


Genetic Algorithms simulate the process of evolution as it occurs in natural systems. Charles Darwin's theory of evolution states that in natural evolution, biological beings evolve according to the principle of "survival of the fittest". The GA search is designed to mimic this principle.

By simulating the processes of natural selection, reproduction, and mutation, genetic algorithms can produce high-quality solutions for various problems, including search and optimization.

GAs perform a randomized search to solve optimization problems, using historical information from previous generations to direct the search toward promising regions of the search space.

Following are some of the basic terminologies that can help us to understand genetic algorithms:

1. Chromosome/Individual

A chromosome is a collection of genes and represents one of the solutions in the population. For example, a chromosome can be represented as a binary string in which each bit is a gene: the genes are joined into a string to form the chromosome (solution).
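As a minimal sketch (assuming a simple fixed-length binary encoding; the names are illustrative), a chromosome can be modeled as a list of bits and a population as a list of such chromosomes:

```python
import random

def random_chromosome(n_genes):
    # Each gene is one bit; the whole string is one candidate solution.
    return [random.randint(0, 1) for _ in range(n_genes)]

def random_population(pop_size, n_genes):
    # A population is simply a collection of chromosomes.
    return [random_chromosome(n_genes) for _ in range(pop_size)]

pop = random_population(pop_size=6, n_genes=8)
```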
2. Gene:

This is an element of a chromosome; together the genes determine the solution to the problem. An individual is characterized by a set of parameters (variables) known as genes. In a genetic algorithm, the set of genes of an individual is represented as a string over some alphabet; usually binary values are used (a string of 1s and 0s). We say that we encode the genes in a chromosome.

3. Population:

Since an individual is represented as a chromosome, a population is a collection of such chromosomes. It is a subset of all the possible solutions to the given problem. The process begins with a set of individuals called the population; each individual is a candidate solution to the problem you want to solve.


4. Allele: This is the value given to a gene in a specific chromosome.
5. Fitness function

The fitness function determines how fit each individual in the population is, i.e. its ability to compete with other individuals. It assigns a fitness score to every individual, and that score determines the probability of being chosen for reproduction: the higher the fitness score, the higher the chances of being selected.

In effect, the fitness function takes a candidate solution as input and returns a measure of that solution's suitability as output. It tells how close the solution is to the optimal solution. Fitness can be defined in many ways, such as a sum of parameters related to the problem or a Euclidean distance; there is no single rule for defining a fitness function.

In every iteration, the individuals are evaluated by their fitness scores as computed by the fitness function. Individuals with better fitness scores represent better solutions and are more likely to be chosen for crossover and passed on to the next generation.

For example, if a genetic algorithm is used for feature selection in a classification problem, the accuracy of the model trained on the selected features can serve as the fitness function.
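In the same spirit but with a simpler toy objective (a hypothetical "OneMax" problem, not from the original text, where fitness is just the number of 1-bits), a fitness function can be as small as:

```python
def fitness(chromosome):
    # Toy "OneMax" fitness: the more 1-bits, the fitter the individual.
    return sum(chromosome)

scores = [fitness(c) for c in ([1, 1, 0, 1], [0, 0, 0, 1])]
```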

GENETIC OPERATORS

6. Selection

After calculating the fitness of every individual in the population, a selection process determines which individuals get to reproduce and create the offspring that will form the next generation.

The idea of the selection phase is to select the fittest individuals and let them pass their genes on to the next generation. Pairs of individuals (parents) are selected based on their fitness scores; individuals with high fitness have a greater chance of being selected for reproduction.
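One common way to implement fitness-proportionate selection is the roulette wheel, sketched below (the function and variable names are illustrative assumptions, not from the original text):

```python
import random

def roulette_select(population, fitnesses):
    # Fitness-proportionate (roulette-wheel) selection: an individual's
    # chance of being picked equals its share of the total fitness.
    total = sum(fitnesses)
    pick = random.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return individual
    return population[-1]  # guard against floating-point round-off

pop = [[1, 1, 1], [1, 0, 0], [0, 0, 0]]
parent = roulette_select(pop, fitnesses=[3.0, 1.0, 0.1])
```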

7. Crossover

This operator swaps genetic information between two parents to produce offspring. It is performed on randomly selected parent pairs to generate a child population of the same size as the parent population.

Generally, two individuals are chosen from the current generation and their genes are interchanged to create a new individual representing the offspring; this process is also called mating or crossover.

For each pair of parents to be mated, a crossover point is chosen at random from within the genes. For example, if the crossover point is 3, each offspring receives the first three genes from one parent and the remaining genes from the other. Offspring are created by exchanging the parents' genes up to the crossover point, and the new offspring are added to the population.

In other words, crossover represents mating between individuals: two individuals are picked using the selection operator, crossover sites are chosen randomly, and the genes at these sites are exchanged, creating a completely new individual (offspring). The reproduction step after selection makes clones of the good strings; the crossover operator is then applied to those strings to produce better offspring.

The implementation of the crossover operator is as follows:

1. Two individuals are selected from the population to produce offspring.
2. A cross-site is selected at random along the length of the string.
3. The values beyond the site are swapped.

Crossover can be single-point, two-point, multipoint, etc. A single-point crossover has one crossover site, while a two-point crossover has two sites between which the values are swapped.
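The three-step procedure above can be sketched as a single-point crossover on binary strings (a hedged illustration with hypothetical names):

```python
import random

def single_point_crossover(parent1, parent2):
    # Step 2: choose a cross-site at random along the string...
    point = random.randint(1, len(parent1) - 1)
    # Step 3: ...and swap the values beyond that site.
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

c1, c2 = single_point_crossover([1, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0])
```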

8. Mutation:

This operator adds new genetic information to the child population. It is achieved by making a random change to a chromosome, typically flipping some bits, for example flipping a bit in a binary string. Mutation helps the search escape local optima and enhances diversification. The key idea is to insert random genes into offspring in order to maintain diversity in the population and avoid premature convergence.
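A minimal bit-flip mutation sketch (the mutation rate of 0.1 is an illustrative choice, not from the text):

```python
import random

def mutate(chromosome, rate=0.1):
    # Flip each bit independently with a small probability, injecting
    # new genetic information to keep the population diverse.
    return [1 - gene if random.random() < rate else gene for gene in chromosome]

child = mutate([0, 1, 0, 1, 1, 0], rate=0.1)
```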

Termination

The algorithm terminates when the population has converged, i.e. it no longer produces offspring significantly different from the previous generation. At that point the genetic algorithm is said to have provided a set of solutions to our problem.

Simple Genetic Algorithm


In genetic algorithms, the best individuals mate to reproduce an offspring that is better than the
parents. Genetic operators are used for changing the genetic composition of this next generation.

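Putting selection, crossover, mutation, and termination together, a complete (if toy) GA loop might look like the following, here maximizing the number of 1-bits; every parameter value and name is an illustrative assumption:

```python
import random

def run_ga(n_genes=10, pop_size=20, generations=50,
           crossover_rate=0.9, mutation_rate=0.05):
    def fitness(c):
        return sum(c)  # toy objective: maximize the number of 1-bits

    # Initial population of random binary chromosomes.
    pop = [[random.randint(0, 1) for _ in range(n_genes)]
           for _ in range(pop_size)]

    for _ in range(generations):
        def select():
            # Tournament selection: the fitter of two random individuals wins.
            a, b = random.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b

        next_pop = []
        while len(next_pop) < pop_size:
            p1, p2 = select(), select()
            if random.random() < crossover_rate:
                # Single-point crossover.
                point = random.randint(1, n_genes - 1)
                p1, p2 = p1[:point] + p2[point:], p2[:point] + p1[point:]
            # Bit-flip mutation on both children.
            next_pop += [[1 - g if random.random() < mutation_rate else g
                          for g in child] for child in (p1, p2)]
        pop = next_pop[:pop_size]

    return max(pop, key=fitness)

best = run_ga()
```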

Genetic programming

Genetic programming is a subset of genetic algorithms; the main difference between them is the representation of the chromosome. Genetic algorithms deal with optimization problems where the representation is a point or vector, while in genetic programming the representation is a tree. In addition, GP individuals can increase or decrease their genotype by adding or removing terminals and instructions.
Because genetic programming is based on trees, it is widely applied to evolve decision and behavior trees for game playing.

Santa Fe Ant Trail


The Santa Fe ant trail is one of the most widely used examples of genetic programming. A trail of food is placed around a grid map, and the goal is to maximize the amount of food collected within a limited number of steps. Genetic programming is used to evolve a set of instructions for the ant to follow in order to solve this problem.

Setting up the problem.


Irrespective of the algorithm, two preliminaries need to be taken care of in evolutionary computation:
1. Creation of the initial population
2. The fitness function
From prior knowledge, we know that genetic algorithms can create a uniform random population from the domain, but this is different for genetic programming. Genetic programming must follow a problem-dependent grammar structure. This is done by first defining the BNF grammar for the problem. After that, the depth of the tree is set; the depth corresponds to the number of layers in the tree. We define a minimum and maximum depth, which constrains the initial population: initial individuals are generated with depths between the minimum and maximum, while later individuals can shrink and grow.

Function elements from the grammar are randomly selected to create the tree, which then branches out by randomly chosen terminals. Similar to a genetic algorithm, we discard individuals that fall outside the minimum-maximum depth range and keep sampling until we get an individual within it.
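This "grow" style of tree creation can be sketched for a tiny arithmetic grammar (the function and terminal sets below are made-up illustrations, not the Santa Fe ant grammar):

```python
import random

FUNCTIONS = ['+', '*']        # internal nodes, each takes two children
TERMINALS = ['x', '1', '2']   # leaves that close off a branch

def grow_tree(depth, max_depth):
    # Below max_depth we may pick a function or (sometimes) a terminal;
    # once max_depth is reached the branch must end in a terminal.
    if depth >= max_depth or (depth > 0 and random.random() < 0.3):
        return random.choice(TERMINALS)
    op = random.choice(FUNCTIONS)
    return (op, grow_tree(depth + 1, max_depth), grow_tree(depth + 1, max_depth))

def tree_depth(node):
    # A terminal has depth 0; a function node adds one level.
    if isinstance(node, str):
        return 0
    return 1 + max(tree_depth(node[1]), tree_depth(node[2]))

tree = grow_tree(0, max_depth=4)
```

Individuals deeper than the maximum would be discarded and re-sampled, as described above.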

The fitness function for genetic programming can determine fitness values in different ways: for example, it can create test cases and evaluate how well an agent did on them, or it can use another AI for evaluation. We also need to reject overly complex models, since the goal is to keep model complexity low so that the program generalizes to new inputs. To avoid complex models we can assess the bloating of the model: bloat is the addition of more terminal nodes and depth to a tree while the fitness value changes only slightly.

Inspired by biological evolution and its fundamental mechanisms, GP software systems


implement an algorithm that uses random mutation, crossover, a fitness function, and multiple
generations of evolution to resolve a user-defined task. GP can be used to discover a functional
relationship between features in data (symbolic regression), to group data into categories
(classification), and to assist in the design of electrical circuits, antennae, and quantum
algorithms. GP is applied to software engineering through code synthesis, genetic improvement,
automatic bug-fixing, developing game-playing strategies, and more.

Types of GP include:

• Tree-based Genetic Programming


• Stack-based Genetic Programming
• Linear Genetic Programming (LGP)
• Grammatical Evolution
• Extended Compact Genetic Programming (ECGP)

Genetic programming is iterative: at each new stage of the algorithm, it chooses only the fittest of the "offspring", as measured by a fitness function, to cross and reproduce in the next generation. Just as in biological evolution, evolutionary algorithms can sometimes produce randomly mutating offspring, but since only the offspring with the highest fitness measure reproduce, fitness almost always improves over generations. Genetic programming generally terminates once it reaches a predefined fitness measure. Additionally, architecture-altering operations can be introduced to an already running program in order to allow new sources of information to be analyzed for a given fitness function.

Although the idea was proposed as early as 1950 by Alan Turing, successful genetic algorithms were not implemented until the 1980s. The first patented algorithm for genetic operations was filed in 1988 by John Koza, who remains a leader in the field.

Genetic programming systems use a machine learning technique that supports automatic programming without the need for manual interaction: genetic algorithms can apply automatic program induction as new information is ingested, so that programs are optimized automatically. Genetic or evolutionary algorithms have a variety of uses, particularly in domains where an exact solution is not known in advance, or where finding an approximate solution is deemed appropriate. Genetic programming is often used in conjunction with other forms of machine learning, as it is useful for performing symbolic regression and feature classification.

Genetic programming can help organizations and businesses by:

• Saving time: Genetic algorithms are able to process large amounts of data much more
quickly than humans can. Additionally, these algorithms run free of human biases, and
are thereby able to come up with ideas that might otherwise not have been considered.
• Data and text classification: Genetic programming can quickly identify and classify
various forms of data without the need for human oversight. Genetic programming can
use data tree construction in order to optimize these classifications, especially when
dealing with big data.

• Ensuring network security: Rule evolution approaches have been successfully applied
to identify new attacks on networks. By quickly identifying intrusions, businesses and
organizations can ensure that they can respond to such attacks before they are able to
access confidential information.

• Supporting other machine learning methods: Genetic programming can be included


in larger systems of machine learning, such as with neural networks. By having genetic
programming focus on only specific subsets of data, organizations can ensure that this
data is quickly processed for ingestion into larger or different learning methods. This
allows organizations to gain as much useful and actionable information as possible.

Introduction to Reinforcement Learning: reinforcement is the process of encouraging or establishing a belief or pattern of behaviour; the action or process of reinforcing or strengthening.

"Reinforcement Learning" is a field within artificial intelligence and machine learning. Inspired by behaviorist psychology, it enables software agents and machines to determine behavior, take actions accordingly, and ultimately maximize their performance. Put simply, computers can learn by themselves, experimenting and using responses from the environment to work out how things must be done, adapting and getting better each time. For example, computers have been trained to play games, schedule jobs such as elevator scheduling, and control limbs.
Reinforcement Learning (RL)
RL was documented more than 100 years ago by psychologist Edward Thorndike. Rather than having a programmer tell it what to do, this technology lets the computer/software agent perform tasks on its own, slowly figuring out the best way. The interaction lies between two elements: the environment and the learning agent. Along the way, the agent is rewarded by the environment via what is known as the reinforcement signal. On the basis of the reward, the agent uses its accumulated knowledge to choose the next action. In essence, computers learn like people, without the need for explicit training. Punishments also happen along the way for the artificial agent, but through constant trial and error the agent learns and arrives at the best method (based on raw inputs).
In reinforcement learning there is no answer key; the reinforcement agent decides what to do to perform the given task. In the absence of a training dataset, it is bound to learn from its own experience.
Reinforcement Learning (RL) is the science of decision making: learning the optimal behavior in an environment so as to obtain maximum reward. In RL, the data is accumulated by the learning system itself through trial and error; data is not part of the input as it would be in supervised or unsupervised machine learning.
Reinforcement learning uses algorithms that learn from outcomes and decide which action to take next. After each action, the algorithm receives feedback that helps it determine whether the choice it made was correct, neutral, or incorrect. It is a good technique for automated systems that have to make many small decisions without human guidance.
Reinforcement learning is thus an autonomous, self-teaching system that essentially learns by trial and error. It performs actions with the aim of maximizing rewards; in other words, it learns by doing in order to achieve the best outcomes.

What is Reinforcement Learning?


Reinforcement Learning is a part of machine learning in which agents are self-trained through reward and punishment mechanisms. It is about taking the best possible action or path to gain maximum reward and minimum punishment through observations in a specific situation; the reward acts as a signal for positive and negative behaviors. Essentially, an agent (or several) is built that can perceive and interpret the environment in which it is placed, take actions, and interact with that environment.
Reinforcement learning is a type of machine learning in which agents take actions in an environment aimed at maximizing their cumulative rewards – NVIDIA

Reinforcement learning (RL) is based on rewarding desired behaviors or punishing undesired


ones. Instead of one input producing one output, the algorithm produces a variety of outputs and
is trained to select the right one based on certain variables – Gartner

It is a type of machine learning technique where a computer agent learns to perform a task
through repeated trial and error interactions with a dynamic environment. This learning approach
enables the agent to make a series of decisions that maximize a reward metric for the task
without human intervention and without being explicitly programmed to achieve the task –
Mathworks

Simplified Definition of Reinforcement Learning


Through a series of trial-and-error interactions, an agent keeps learning continuously in an interactive environment from its own actions and experiences. Its only goal is to find a suitable action model that increases the agent's total cumulative reward. It learns via interaction and feedback.

Explanation to Reinforcement Learning

Imagine a dog and its master, and suppose you are training the dog to fetch a stick. Each time the dog fetches the stick successfully, you offer it a treat (say, a bone). Eventually the dog understands the pattern: whenever the master throws a stick, it should fetch it as quickly as possible to earn the reward (a bone) from the master in less time.

Terminologies used in Reinforcement Learning

Agent – the sole decision-maker and learner

Environment – the physical world in which the agent learns and decides which actions to perform

Action – the set of actions the agent can perform

State – the current situation of the agent in the environment

Reward – for each action selected by the agent, the environment gives a reward; it is usually a scalar value and is nothing but feedback from the environment

Policy – the strategy (decision-making) the agent prepares to map situations to actions

Value Function – the value of a state is the total reward expected starting from that state while executing the policy

Model – not every RL agent uses a model of its environment; a model maps state-action pairs to probability distributions over states

Reinforcement Learning Workflow

– Create the environment

– Define the reward

– Create the agent

– Train and validate the agent

– Deploy the policy
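The workflow above can be sketched end to end with tabular Q-learning on a hypothetical toy "corridor" environment; every name and parameter below is an illustrative assumption, not part of the original text:

```python
import random

# Create the environment: a 1-D corridor, states 0..4, reward at the right end.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, 1]  # move left or move right

def step(state, action):
    # Define the reward: move, clamp to the corridor, reward 1 at the goal.
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

# Create the agent: a table of Q-values, one per (state, action) pair.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.2

# Train the agent: epsilon-greedy action selection plus the Q-learning update.
for _ in range(500):            # episodes
    s = 0
    for _ in range(20):         # steps per episode
        if random.random() < epsilon:
            a = random.choice(ACTIONS)  # explore
        else:                           # exploit (random tie-break)
            a = max(ACTIONS, key=lambda x: (Q[(s, x)], random.random()))
        s2, r = step(s, a)
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# Deploy the policy: act greedily with respect to the learned Q-values.
policy = {s: max(ACTIONS, key=lambda x: Q[(s, x)]) for s in range(N_STATES)}
```

After training, the greedy policy should move right from every non-goal state, since that is the shortest path to the reward.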

One of the best examples is AlphaGo, the program developed by DeepMind (an Alphabet subsidiary), which went on to beat the best human player in the world at the board game Go in 2016. This made the world sit up and recognize RL's significance, as it was practically impossible to hand-code a player for a game as complex as Go; for such large and complex tasks, explicit computation becomes unworkable. From self-driving cars, which perform RL with safety and precision, the technology also extends to robots (without manual programming) and to figuring out the configuration required for the apparatus in a data center.
Other players in RL are Mobileye, OpenAI, Google, and Uber. Google and DeepMind also worked together to make Google's data centers more energy efficient. This was made possible through an RL algorithm that can learn from assembled data, experiment through simulation, and finally suggest when and how the cooling systems should be operated.
Steps of ’cause and effect’ for an RL Agent
• The artificial agent detects the input state (RL first identifies and formulates the problem).
• The next step is determined by the chosen strategy (policy).
• The action is then performed, and a reward or punishment, i.e. reinforcement, is provided accordingly.
• The resulting state is recorded.
• Finally, the best action can be further adjusted to enhance results.

How is reinforcement learning different from supervised learning?


In supervised learning, the model is trained with a training dataset that has a correct answer key. The decision is made on the initial input, since the dataset contains all the data required to train the machine. The decisions are independent of each other, so each decision is represented by a label. Example: object recognition.

Unsupervised, Exploitation and Exploration of RL Systems


RL resembles unsupervised learning in that no labeled answers are provided: the agent is left to learn in the environment it is given and improves by gradually adjusting. Beyond this, the RL agent tries to learn through the interplay of exploitation and exploration. Exploitation means that once the agent has achieved a satisfactory, rewarded result, it can exploit the same technique again to achieve that result. Exploration means the agent may try different strategies that could result in better rewards and recognition, hence exploring new situations. The two strategies must work together.
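The exploration-exploitation trade-off is often implemented with an epsilon-greedy rule; a minimal sketch follows (the value estimates are made up for illustration):

```python
import random

def epsilon_greedy(values, epsilon=0.1):
    # Explore with probability epsilon (pick a random action); otherwise
    # exploit the action with the highest estimated value so far.
    if random.random() < epsilon:
        return random.randrange(len(values))
    return max(range(len(values)), key=lambda i: values[i])

estimates = [0.2, 0.8, 0.5]  # hypothetical value estimates for three actions
choice = epsilon_greedy(estimates, epsilon=0.1)
```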
Limitations
There are limitations to RL too. The memory needed to store values can be expensive, because the problems themselves are complex. Moreover, similar behaviors tend to occur too often, so modularity has to be introduced to prevent repetition. There is also the limiting factor of perception (perceptual aliasing), which ultimately affects the functioning of the algorithm.
Types of Reinforcement Learning
