
Unit 5

Genetic Algorithms: An illustrative example, Hypothesis space search, Genetic Programming, Models
of Evolution and Learning; Learning first order rules- sequential covering algorithms-General to
specific beam search-FOIL; REINFORCEMENT LEARNING - The Learning Task, Q learning.

Genetic Algorithms
A genetic algorithm is an adaptive heuristic search algorithm inspired by "Darwin's theory of
evolution in Nature." It is used to solve optimization problems in machine learning. It is one of the
important algorithms as it helps solve complex problems that would take a long time to solve. Genetic
Algorithms are being widely used in different real-world applications, for example, Designing
electronic circuits, code-breaking, image processing, and artificial creativity.

Before looking at the genetic algorithm itself, let's first define the basic terminology:
o Population: The population is the subset of all possible or probable solutions that can solve
the given problem.
o Chromosome: A chromosome is one of the solutions in the population for the given problem,
and a collection of genes forms a chromosome.
o Gene: A gene is an element of a chromosome; a chromosome is divided into genes.
o Allele: An allele is the value assigned to a gene within a particular chromosome.
o Fitness Function: The fitness function is used to determine an individual's fitness level in the
population, i.e., the ability of an individual to compete with other individuals. In every
iteration, individuals are evaluated based on their fitness function.
o Genetic Operators: In a genetic algorithm, the best individuals mate to produce offspring
better than their parents. Genetic operators change the genetic composition of the next
generation.
o Selection: After calculating the fitness of every individual in the population, a selection
process determines which individuals will reproduce and produce the offspring that form the
next generation.

Types of selection methods available:
o Roulette wheel selection
o Tournament selection
o Rank-based selection
So, a genetic algorithm can now be defined as a heuristic search algorithm for solving optimization
problems. It is a subset of evolutionary algorithms used in computing, and it applies the concepts
of genetics and natural selection to optimization problems.
How a Genetic Algorithm Works
The genetic algorithm works on the evolutionary generational cycle to generate high-quality solutions.
These algorithms use different operations that either enhance or replace the population to give an
improved fit solution.
It basically involves five phases to solve complex optimization problems, which are given below:
o Initialization
o Fitness Assignment
o Selection
o Reproduction
o Termination
1. Initialization
The process of a genetic algorithm starts by generating a set of individuals called the population.
Each individual is a solution to the given problem, characterized by a set of parameters called
genes. Genes are joined into a string to form a chromosome, which represents the solution. One of
the most popular initialization techniques is the use of random binary strings.
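To make this concrete, here is a minimal Python sketch of random binary-string initialization (the function name init_population and the sizes used are illustrative choices, not fixed by the algorithm):

```python
import random

def init_population(pop_size, chromosome_length):
    """Generate an initial population of random binary strings.

    Each individual (chromosome) is a list of 0/1 genes.
    """
    return [[random.randint(0, 1) for _ in range(chromosome_length)]
            for _ in range(pop_size)]

# Example: 6 individuals, each a chromosome of 8 genes
population = init_population(6, 8)
```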

2. Fitness Assignment
The fitness function is used to determine how fit an individual is, i.e., its ability to compete
with other individuals. In every iteration, individuals are evaluated based on their fitness
function. The fitness function assigns a fitness score to each individual, and this score
determines the probability of being selected for reproduction: the higher the fitness score, the
greater the chance of being selected.
3. Selection
The selection phase involves selecting individuals for the reproduction of offspring. The selected
individuals are arranged in pairs of two for reproduction, and these individuals transfer their
genes to the next generation.
There are three types of Selection methods available, which are:
o Roulette wheel selection
o Tournament selection
o Rank-based selection
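As an illustration of the first method, here is a hedged Python sketch of roulette wheel selection (the helper name roulette_wheel_select is our own; it assumes non-negative fitness values):

```python
import random

def roulette_wheel_select(population, fitnesses):
    """Pick one parent with probability proportional to its fitness.

    Assumes all fitness values are non-negative and not all zero.
    """
    pick = random.uniform(0, sum(fitnesses))
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if running >= pick:
            return individual
    return population[-1]  # guard against floating-point rounding
```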
4. Reproduction
After the selection process, offspring are created in the reproduction step. In this step, the
genetic algorithm applies two variation operators to the parent population. The two operators
involved in the reproduction phase are given below:
o Crossover: Crossover plays the most significant role in the reproduction phase of the genetic
algorithm. In this process, a crossover point is selected at random within the genes. The
crossover operator then swaps the genetic information of two parents from the current
generation to produce a new individual representing the offspring. The genes of the parents
are exchanged up to the crossover point, and the newly generated offspring are added to the
population. This process is also called recombination.
o Mutation: Mutation inserts random variation into the offspring to maintain diversity in the
population, typically by flipping some genes of the chromosome with a small probability.
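Both operators can be sketched in a few lines of Python (single-point crossover and bit-flip mutation are one common choice among several variants; the function names and the 1% mutation rate are illustrative):

```python
import random

def single_point_crossover(parent1, parent2):
    """Swap the gene segments of two parents after a random crossover point."""
    point = random.randint(1, len(parent1) - 1)
    return parent1[:point] + parent2[point:], parent2[:point] + parent1[point:]

def mutate(chromosome, rate=0.01):
    """Flip each binary gene independently with a small probability."""
    return [1 - gene if random.random() < rate else gene for gene in chromosome]
```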
5. Termination
After the reproduction phase, a stopping criterion is applied to decide termination. The algorithm
terminates once a threshold fitness level is reached, and the best solution in the population is
returned as the final solution.
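Putting the five phases together, here is a minimal end-to-end sketch for the classic "max ones" toy problem (evolving a bit string of all 1s); every name and parameter in it is an illustrative assumption:

```python
import random

def run_ga(pop_size=20, length=16, generations=100, target_fitness=16):
    """Minimal GA for the 'max ones' problem: fitness = number of 1-genes."""
    population = [[random.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]                      # 1. initialization
    for _ in range(generations):
        fitnesses = [sum(ind) for ind in population]             # 2. fitness assignment
        if max(fitnesses) >= target_fitness:                     # 5. termination check
            break
        next_gen = []
        while len(next_gen) < pop_size:
            p1, p2 = random.choices(population, weights=fitnesses, k=2)  # 3. selection
            point = random.randint(1, length - 1)                # 4. crossover
            child = p1[:point] + p2[point:]
            child = [1 - g if random.random() < 0.01 else g for g in child]  # mutation
            next_gen.append(child)
        population = next_gen
    return max(population, key=sum)

print(run_ga())  # typically prints a chromosome of (mostly) 1s
```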

Hypothesis space search


In hypothesis space search, the gradient descent search used in backpropagation moves smoothly from
one hypothesis to another. The genetic algorithm search, on the other hand, can move much more
abruptly: it replaces parent hypotheses with offspring that can be very different from the parents.
For this reason, genetic algorithm search is less likely to fall into the kind of local minima that
plague gradient descent methods.
One practical difficulty often encountered in genetic algorithms is crowding. Crowding is the
phenomenon in which individuals that are fitter than others reproduce quickly, so that copies of
these individuals take over a large fraction of the population. Most of the strategies used in
genetic algorithms are inspired by biological evolution. One such strategy is fitness sharing, in
which the measured fitness of an individual is decreased by the presence of other individuals of a
similar kind.
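A hedged sketch of fitness sharing for binary chromosomes follows (the linear sharing function and the Hamming-distance niche radius are one common formulation, not the only one):

```python
def shared_fitness(index, population, raw_fitnesses, radius=3.0):
    """Derate an individual's fitness by the number of similar neighbours.

    Similarity is Hamming distance between binary chromosomes; the sharing
    function sh(d) falls linearly from 1 at d = 0 to 0 at d = radius.
    """
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    niche_count = 0.0
    for other in population:
        d = hamming(population[index], other)
        if d < radius:
            niche_count += 1.0 - d / radius
    # niche_count >= 1 because the individual is counted against itself
    return raw_fitnesses[index] / niche_count
```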

Genetic Programming:
Genetic programming is a form of artificial intelligence that mimics natural selection in order to
find an optimal result. Genetic programming is iterative: at each stage of the algorithm, it
chooses only the fittest of the "offspring", as judged by a fitness function, to cross and
reproduce in the next generation. Just as in biological evolution, evolutionary algorithms can
produce randomly mutated offspring, but since only the offspring with the highest fitness measure
are reproduced, fitness will almost always improve over generations. Genetic programming generally
terminates once it reaches a predefined fitness measure. Additionally, architecture-altering
operations can be introduced into an already running program to allow new sources of information
to be analyzed for a given fitness function.
Genetic programming systems use a machine learning technique that can include automatic
programming without the need for manual interaction. This means that genetic algorithms can use
automatic program induction to adapt as new information is ingested, so that programs are
optimized automatically. Genetic or evolutionary algorithms have a variety of uses, particularly
in domains where an exact solution is not known in advance, or where finding an approximate
solution is deemed sufficient. Genetic programming is often used in conjunction with other forms
of machine learning, as it is useful for performing symbolic regression and feature classification.
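For symbolic regression in particular, candidate programs are usually represented as expression trees. Here is a minimal sketch of that representation (the tuple encoding and the function names are illustrative assumptions):

```python
import operator

# A candidate program is a nested tuple (op, left, right), the input 'x',
# or a numeric constant.
OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def evaluate(tree, x):
    """Recursively evaluate an expression tree at input value x."""
    if tree == 'x':
        return x
    if isinstance(tree, (int, float)):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def error(tree, samples):
    """Lower is better: sum of squared errors against the target data."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in samples)

# Target function y = x*x + 1; the candidate below matches it exactly.
samples = [(x, x * x + 1) for x in range(-3, 4)]
print(error(('+', ('*', 'x', 'x'), 1), samples))  # prints 0
```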
Genetic programming can help organizations and businesses by:
• Saving time: Genetic algorithms are able to process large amounts of data much more quickly
than humans can. Additionally, these algorithms run free of human biases, and are thereby able
to come up with ideas that might otherwise not have been considered.

• Data and text classification: Genetic programming can quickly identify and classify various
forms of data without the need for human oversight. Genetic programming can use data tree
construction in order to optimize these classifications, especially when dealing with big data.

• Ensuring network security: Rule evolution approaches have been successfully applied to
identify new attacks on networks. By quickly identifying intrusions, businesses and
organizations can ensure that they can respond to such attacks before they are able to access
confidential information.

• Supporting other machine learning methods: Genetic programming can be included in larger
machine learning systems, such as those built around neural networks. By having genetic
programming focus on specific subsets of data, organizations can ensure that this data is
quickly processed for ingestion into larger or different learning methods. This allows
organizations to gain as much useful and actionable information as possible.

For a better understanding, here is a summary of the main differences between Machine Learning
and Genetic Programming:

Definition
• Machine Learning: A subfield of computer science that involves developing algorithms and models
that enable computers to learn from data and make predictions or decisions without being
explicitly programmed.
• Genetic Programming: A type of evolutionary computation that involves the use of computer
programs to solve complex problems.

Learning Method
• Machine Learning: Learning from data through optimization of a specific objective function.
• Genetic Programming: Evolutionary optimization through genetic operators like selection,
crossover, and mutation.

Problem Representation
• Machine Learning: Data-driven, where input features are used to train a model to make
predictions.
• Genetic Programming: Programmatic, where a population of computer programs is evolved over time
to find a solution.

Search Space
• Machine Learning: Limited to the range of input features that are provided.
• Genetic Programming: Can search a much larger space of potential solutions through programmatic
representation.

Application
• Machine Learning: Widely used in image recognition, speech recognition, natural language
processing, recommendation systems, and autonomous vehicles.
• Genetic Programming: Used in optimization, control, prediction, and classification tasks where
traditional machine learning approaches may be difficult or infeasible.

Computational Requirements
• Machine Learning: Typically requires a large amount of labelled data and significant
computational resources for training models.
• Genetic Programming: Can be computationally expensive, as it requires the evaluation of many
candidate solutions over many generations.

Evaluation
• Machine Learning: Evaluated using metrics like accuracy, precision, and recall.
• Genetic Programming: Evaluated based on the fitness of each program, i.e., its ability to solve
the problem at hand.

Interpretability
• Machine Learning: Some models may be difficult to interpret, making it challenging to understand
how decisions are being made.
• Genetic Programming: Programmatic representation makes it easier to understand how solutions
evolved.

Limitations
• Machine Learning: Can be prone to overfitting, bias, and ethical concerns.
• Genetic Programming: Limited by the expressiveness of the programmatic representation and the
complexity of the problem at hand.

GENETIC ALGORITHM: MODELS OF EVOLUTION


A genetic algorithm is a search technique inspired by Charles Darwin's theory of natural
evolution: it models natural selection, in which the fittest individuals are selected to
reproduce and create the offspring of the next generation.
The genetic algorithm follows the Darwinian model of evolution, in which variation arises through
recombination and mutation, and natural selection acts on that variation.
However, there are other models of evolution and lifetime adaptation, namely:
1. the Lamarckian model, and
2. the Baldwinian model.

LAMARCKIAN MODEL:
The Lamarckian model proposes that traits acquired by an individual during its lifetime can be
passed on to its offspring. It is named after the French biologist Jean-Baptiste Lamarck. Although
modern biology has disregarded Lamarckism, since we know that only the information in the genotype
is transmitted, from a computational point of view the Lamarckian model has been shown to give
better results for some kinds of problems.
In the Lamarckian model, a local search operator examines the neighbourhood of an individual,
acquiring new traits, and if a better chromosome is found, it itself becomes the offspring.

BALDWINIAN MODEL:
According to the Baldwinian model, a chromosome can encode a predisposition to learn beneficial
behaviours. Unlike the Lamarckian model, the traits acquired during an individual's lifetime are
not transmitted directly to the next generation; but unlike the pure Darwinian model, the acquired
traits are not completely ignored either.
The Baldwinian model sits between these two extremes: what is encoded is an individual's tendency
to acquire certain traits, rather than the traits themselves.
In the Baldwinian model, a local search operator examines the neighbourhood of an individual,
acquiring new traits; if a better chromosome is found at this step, it assigns only the improved
fitness to the chromosome and does not modify the chromosome itself. The ability to acquire the
trait is thus rewarded, even though the trait itself is not passed directly to future generations.
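The difference between the two models comes down to what the local search writes back. A hedged Python sketch (all names are illustrative; hill-climbing is one simple choice of local search operator):

```python
import random

def local_search(chromosome, fitness_fn, tries=10):
    """Hill-climb: flip random genes, keeping the best neighbour found."""
    best = list(chromosome)
    for _ in range(tries):
        neighbour = list(best)
        i = random.randrange(len(neighbour))
        neighbour[i] = 1 - neighbour[i]
        if fitness_fn(neighbour) > fitness_fn(best):
            best = neighbour
    return best

def evaluate_lamarckian(chromosome, fitness_fn):
    """Lamarckian: the improved chromosome itself replaces the original."""
    improved = local_search(chromosome, fitness_fn)
    return improved, fitness_fn(improved)

def evaluate_baldwinian(chromosome, fitness_fn):
    """Baldwinian: keep the original genes, credit only the improved fitness."""
    improved = local_search(chromosome, fitness_fn)
    return chromosome, fitness_fn(improved)
```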

Q-Learning
Q-learning is a model-free reinforcement learning algorithm that helps an agent learn the optimal
action-selection policy by iteratively updating Q-values, which represent the expected rewards of
actions in specific states.

Reinforcement learning is a learning paradigm in which a learning agent learns, over time, to
behave optimally in a certain environment by interacting with it continuously. During the course
of learning, the agent experiences various situations in the environment; these are called states.
In a given state, the agent may choose from a set of allowable actions, which may fetch different
rewards (or penalties). Over time, the learning agent learns to maximize these rewards so as to
behave optimally in whatever state it finds itself. Q-learning is a basic form of reinforcement
learning that uses Q-values (also called action values) to iteratively improve the behavior of the
learning agent.
Q-learning in Reinforcement Learning
Q-learning is a popular model-free reinforcement learning algorithm used in machine learning and
artificial intelligence applications. It falls under the category of temporal difference learning
techniques, in which an agent picks up new information by observing results, interacting with the
environment, and getting feedback in the form of rewards.

Key Components of Q-learning


1. Q-Values or Action-Values
Q-values are defined for states and actions. Q(S, A) is an estimate of how good it is to take
action A in state S. This estimate of Q(S, A) is computed iteratively using the TD-Update rule
described below.
2. Rewards and Episodes
Over its lifetime, an agent starts from a start state and makes a number of transitions from its
current state to a next state, based on its choice of action and on the environment it is
interacting with. At every transition, the agent takes an action, observes a reward from the
environment, and then moves to another state. If at any point in time the agent ends up in a
terminating state, no further transitions are possible; this is said to be the completion of an
episode.
3. Temporal Difference or TD-Update
The Temporal Difference or TD-Update rule can be represented as follows:

Q(S, A) ← Q(S, A) + α (R + γ Q(S′, A′) − Q(S, A))

This update rule to estimate the value of Q is applied at every time step of the agent's
interaction with the environment. The terms used are explained below:
• S: Current state of the agent.
• A: Current action, picked according to some policy.
• S′: Next state where the agent ends up.
• A′: Next best action, picked using the current Q-value estimate, i.e. the action with the
maximum Q-value in the next state.
• R: Current reward observed from the environment in response to the current action.
• γ (0 < γ ≤ 1): Discount factor for future rewards. Future rewards are less valuable than
current rewards, so they must be discounted. Since a Q-value is an estimate of expected rewards
from a state, the discounting rule applies here as well.
• α: Step size (learning rate) used to update the estimate of Q(S, A).
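In code, one TD update is a single line of arithmetic. A minimal sketch, assuming the Q-table is a dict of dicts mapping states to per-action values (the function name td_update and the default hyperparameters are our own choices):

```python
def td_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one Q-learning TD update:
    Q[s][a] <- Q[s][a] + alpha * (r + gamma * max_a' Q[s_next][a'] - Q[s][a])
    """
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
```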

4. Selecting the Course of Action with ϵ-greedy policy


A simple method for selecting an action based on the current Q-value estimates is the ε-greedy
policy. This is how it operates:
Superior Q-Value Action (Exploitation):
• With a probability of 1 − ε, representing the majority of cases, select the action with the
highest current Q-value.
• In this case of exploitation, the agent chooses the course of action that, given its current
knowledge, it believes is optimal.
Random Action (Exploration):
• With a probability of ε, select an action at random, regardless of its Q-value.
• This exploration lets the agent discover actions whose value it would otherwise never learn.
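A compact sketch of the whole ε-greedy rule, exploration included (names are illustrative; Q is the same dict-of-dicts table as in the TD-update sketch above):

```python
import random

def epsilon_greedy(Q, state, actions, epsilon=0.1):
    """With probability epsilon take a random action (exploration);
    otherwise take the action with the highest Q-value (exploitation)."""
    if random.random() < epsilon:
        return random.choice(actions)                    # explore
    return max(actions, key=lambda a: Q[state][a])       # exploit
```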

Q-Learning Working Mechanism:


Q-learning models engage in an iterative process where various components collaborate to train the
model. This iterative procedure encompasses the agent exploring the environment and continuously
updating the model based on this exploration.
The key components of Q-learning include:
1. Agents: Entities that operate within an environment, making decisions and taking actions.
2. States: Variables that identify an agent’s current position in the environment.
3. Actions: Operations undertaken by the agent in specific states.
4. Rewards: Positive or negative responses provided to the agent based on its actions.
5. Episodes: Instances where an agent concludes its actions, marking the end of an episode.
6. Q-values: Metrics used to evaluate actions at specific states.
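Tying these components together, here is a hedged end-to-end sketch on a toy "corridor" environment (states 0 to 4, reward 1 at the rightmost, terminal state; the environment and every parameter are illustrative assumptions, not a standard benchmark):

```python
import random

def train_q_learning(n_states=5, episodes=200, alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular Q-learning on a corridor: actions -1 (left) and +1 (right)."""
    actions = [-1, +1]
    Q = {s: {a: 0.0 for a in actions} for s in range(n_states)}  # Q-values
    for _ in range(episodes):                                    # episodes
        s = 0                                                    # start state
        while s != n_states - 1:                                 # until terminal
            if random.random() < eps:                            # explore
                a = random.choice(actions)
            else:                                                # exploit, random tie-break
                best = max(Q[s].values())
                a = random.choice([x for x in actions if Q[s][x] == best])
            s_next = min(max(s + a, 0), n_states - 1)            # move, clipped to corridor
            r = 1.0 if s_next == n_states - 1 else 0.0           # reward only at the goal
            Q[s][a] += alpha * (r + gamma * max(Q[s_next].values()) - Q[s][a])
            s = s_next
    return Q

Q = train_q_learning()
print({s: max(Q[s], key=Q[s].get) for s in range(4)})  # learned policy: all +1
```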
