
MACHINE LEARNING

UNIT-IV

Mr. Kasi Bandla


Asst. Professor
Department of ECM
SNIST
E-mail: [email protected]

1
CONTENTS

 Dimensionality Reduction
 Linear Discriminant Analysis
 Principal Component Analysis
 Factor Analysis
 Independent Component Analysis
 Locally Linear Embedding
 Isomap
 Least Squares Optimization
 Evolutionary Learning
 Genetic algorithms
 Genetic Offspring
 Genetic Operators
 Using Genetic Algorithms
 Reinforcement Learning
 Overview
 Getting Lost Example
2
DIMENSIONALITY REDUCTION
 Dimensionality reduction refers to techniques for reducing the number of
input variables in training data.

 Dimensionality reduction is a statistical technique for reducing the number of
random variables in a problem by obtaining a set of principal variables. The
process is divided into two components: feature selection and feature extraction.

 It reduces the time and storage space required, and it becomes easier to
visualize the data when it is reduced to very low dimensions such as 2D or 3D.
3
COMMON DIMENSIONALITY REDUCTION TECHNIQUES

 Missing Value Ratio
 Low Variance Filter
 High Correlation Filter
 Random Forest
 Backward Feature Elimination
 Forward Feature Selection
 Factor Analysis
 Principal Component Analysis (PCA)
 Linear Discriminant Analysis (LDA)
4
ADVANTAGES OF DIMENSIONALITY REDUCTION

 It helps in data compression, and hence reduces storage space.
 It reduces computation time.
 It also helps remove redundant features.

Disadvantages
 Non-linear data is first mapped and transformed onto a higher-dimensional
space and then PCA is used to reduce the dimensions. One downside of this
approach is that it is computationally very expensive.

5
LINEAR DISCRIMINANT ANALYSIS (LDA)

 Linear discriminant analysis is a supervised classification method that is used
to create machine learning models. These models, based on dimensionality
reduction, are used in a variety of applications.

 LDA is primarily used here to reduce the number of features to a more
manageable number before classification.

 Discriminant analysis is a statistical method that is used by researchers to
help them understand the relationship between a "dependent variable" and
"independent variables".
6
LDA PROCEDURE
1. Compute the d-dimensional mean vectors for the classes.
2. Compute the scatter (covariance) matrices.
3. Compute the eigenvectors and corresponding eigenvalues of the scatter
matrices.
4. Sort the eigenvalues and choose the eigenvectors with the largest eigenvalues
to form a d×k projection matrix.
5. Transform the samples onto the new subspace.
Note: A scatter matrix is a pair-wise scatter plot of several variables presented
in a matrix format. It can be used to determine whether the variables are
correlated and whether the correlation is positive or negative.
7
 The principal insight of LDA is that the covariance matrix can tell us about
the scatter within a dataset, which is the amount of spread that there is within
the data.
 The way to find this scatter is to multiply the covariance of each class by
p_c, the probability of the class (that is, the number of data points in that class
divided by the total number). Adding these values for all of the classes gives
us a measure of the within-class scatter of the dataset:

8
9
10
11
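The scatter-matrix formulas and the projection step appeared as figures on the
preceding slides. As a rough illustration only, the sketch below (with made-up
data and labels) computes the class-probability-weighted within-class and
between-class scatter and projects onto the leading eigenvectors of S_W^{-1} S_B:

```python
import numpy as np

def lda_fit_transform(X, y, k):
    """Minimal LDA sketch: X is (N, d) data, y class labels, k output dims."""
    N, d = X.shape
    mean_all = X.mean(axis=0)
    S_W = np.zeros((d, d))   # within-class scatter
    S_B = np.zeros((d, d))   # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        p_c = len(Xc) / N                      # class probability
        S_W += p_c * np.cov(Xc, rowvar=False)  # weighted class covariance
        diff = (Xc.mean(axis=0) - mean_all).reshape(-1, 1)
        S_B += p_c * diff @ diff.T

    # Eigenvectors of S_W^{-1} S_B, sorted by decreasing eigenvalue
    evals, evecs = np.linalg.eig(np.linalg.pinv(S_W) @ S_B)
    order = np.argsort(evals.real)[::-1]
    W = evecs[:, order[:k]].real               # d x k projection matrix
    return X @ W                               # samples in the new subspace

# Hypothetical example: project 4-D data with 3 classes down to 2 dimensions
X = np.random.randn(90, 4)
y = np.repeat([0, 1, 2], 30)
Z = lda_fit_transform(X, y, 2)
```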
PRINCIPAL COMPONENTS ANALYSIS (PCA)

 Principal component analysis (PCA) is a standard tool in modern data
analysis, used in diverse fields from neuroscience to computer graphics. It is a
very useful method for extracting relevant information from confusing data sets.

 PCA is a statistical procedure that uses an orthogonal transformation to
convert a set of observations of possibly correlated variables into a set of
values of linearly uncorrelated variables called principal components. The
number of principal components is less than or equal to the number of
original variables.
12
THE PRINCIPAL COMPONENTS ANALYSIS ALGORITHM

13
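The algorithm itself is given as a figure in the original slides. A minimal NumPy
sketch of the usual covariance-and-eigendecomposition formulation (an assumed
form, not necessarily the exact figure) is:

```python
import numpy as np

def pca(X, k):
    """Project (N, d) data X onto its top-k principal components."""
    X_centered = X - X.mean(axis=0)            # subtract the mean
    cov = np.cov(X_centered, rowvar=False)     # d x d covariance matrix
    evals, evecs = np.linalg.eigh(cov)         # eigh: covariance is symmetric
    order = np.argsort(evals)[::-1]            # sort by decreasing eigenvalue
    W = evecs[:, order[:k]]                    # d x k projection matrix
    return X_centered @ W, evals[order]

# Toy usage on random 5-D data
Z, variances = pca(np.random.randn(200, 5), k=2)
```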
PCA APPROACH

 Standardize the data.
 Perform Singular Value Decomposition to get the eigenvectors and
eigenvalues.
 Sort the eigenvalues in descending order and choose the k eigenvectors with
the largest eigenvalues.
 Construct the projection matrix from the selected k eigenvectors.
 Transform the original dataset via the projection matrix to obtain a
k-dimensional feature subspace.
14
15
16
Goals
 The main goal of PCA is to identify patterns in data.
 PCA aims to detect the correlation between variables.
 It attempts to reduce the dimensionality.

Transformation
This transformation is defined in such a way that the first principal component
has the largest possible variance and each succeeding component in turn has
the next highest possible variance.
17
LIMITATIONS OF PCA

 The results of PCA depend on the scaling of the variables.
 A scale-invariant form of PCA has been developed.

Applications
 Spike-triggered covariance analysis in neuroscience.
 Quantitative finance.
 Image compression.
 Facial recognition.
 Other applications such as finding correlations in medical data.
18
PCA RELATION WITH THE MULTI-LAYER PERCEPTRON

 The auto-associative MLP actually computes something very similar to the
principal components of the data in its hidden nodes, and this is one of the
ways that we can understand what the network is doing.

 Computing the principal components with a neural network isn't necessarily
a good idea.

 PCA is linear (it just rotates and translates the axes). It should therefore be
clear that the hidden nodes that are computing PCA are effectively a bit like a
Perceptron: they can only perform linear tasks.
19
 The predictor variables are multicollinear in nature, which is overcome by
using Principal Component Analysis (PCA); this results in a new set of
independent variables that are then used to predict the results with a
Multi-Layer Perceptron (MLP) model.

 To evaluate the prediction ability of the model, we compare the performance
of models using a common error measure. The empirical results reveal that
the proposed approach is a promising alternative in many fields.

20
KERNEL PCA
 PCA is a linear method. Kernel PCA uses a kernel function to project the
dataset into a higher-dimensional feature space, where it is linearly separable.
This is similar to the idea used in Support Vector Machines. There are various
kernels, such as linear, polynomial, sigmoid and Gaussian.
 In the field of multivariate statistics, kernel principal component analysis
(KPCA) is an extension of principal component analysis (PCA) using
techniques of kernel methods. Using a kernel, the originally linear operations
of PCA are performed in a reproducing kernel Hilbert space.
 Comparing the projections of the points onto the leading eigenvector (the new
coordinate), the KPCA component can follow a curve such as a circle while the
PCA component is a straight line, so KPCA captures more of the variance than
PCA.

21
22
THE KERNEL PCA ALGORITHM

23
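The kernel PCA algorithm on this slide is a figure; one convenient way to
experiment with the idea in practice is scikit-learn's KernelPCA. The dataset
and the gamma value below are placeholders chosen purely for illustration:

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

# Concentric circles: not linearly separable in the original 2-D space
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

linear = PCA(n_components=2).fit_transform(X)

# RBF (Gaussian) kernel PCA works in the implicit higher-dimensional space
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10.0)
nonlinear = kpca.fit_transform(X)
```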
FACTOR ANALYSIS
 Factor analysis is a technique that is used to reduce a large number of
variables into a smaller number of factors. This technique extracts the maximum
common variance from all variables and puts it into a common score.
 Factor analysis is a statistical data reduction and analysis technique that
strives to explain correlations among multiple outcomes as the result of one or
more underlying explanations, or factors. The technique involves data
reduction, as it attempts to represent a set of variables by a smaller number.
 The difference between factor analysis and principal component analysis:
factor analysis explicitly assumes the existence of latent factors underlying the
observed data, whereas PCA instead seeks to identify variables that are
composites of the observed variables.
24
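As a hedged illustration of factor analysis in practice (the data, number of
factors and noise level below are all made up), scikit-learn's FactorAnalysis
estimates a loading matrix and factor scores for each sample:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Toy data: 6 observed variables driven by 2 latent factors plus noise
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 2))
loading = rng.normal(size=(2, 6))
X = factors @ loading + 0.3 * rng.normal(size=(500, 6))

fa = FactorAnalysis(n_components=2)
scores = fa.fit_transform(X)      # estimated factor scores for each sample
print(fa.components_.shape)       # (2, 6) estimated loading matrix
```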
INDEPENDENT COMPONENTS ANALYSIS (ICA)
 Independent Component Analysis (ICA) is a machine learning technique for
separating independent sources from a mixed signal or dataset. Unlike principal
component analysis, which focuses on maximizing the variance of the data
points, independent component analysis focuses on independence.

 Independent component analysis (ICA) is a statistical and computational
technique for revealing hidden factors that underlie sets of random variables,
measurements, or signals. ICA defines a generative model for the observed
multivariate data, which is typically given as a large database of samples.

 Both are statistical transformations.
 PCA uses information from second-order statistics only.
 ICA uses information that goes up to higher-order statistics.
25
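A small sketch of the difference in practice: two hand-made source signals are
mixed together, FastICA recovers statistically independent components, while
PCA only produces uncorrelated ones. The signals and mixing matrix here are
invented for illustration:

```python
import numpy as np
from sklearn.decomposition import FastICA, PCA

# Two independent sources mixed together (a toy 'cocktail party' problem)
t = np.linspace(0, 8, 2000)
sources = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]   # sine + square wave
mixing = np.array([[1.0, 0.5], [0.5, 2.0]])
X = sources @ mixing.T                                    # observed mixtures

ica = FastICA(n_components=2, random_state=0)
recovered = ica.fit_transform(X)            # independent signal estimates
pcs = PCA(n_components=2).fit_transform(X)  # merely uncorrelated components
```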
LOCALLY LINEAR EMBEDDING
 Locally linear embedding (LLE) seeks a lower-dimensional projection of the
data which preserves distances within local neighborhoods. It can be thought
of as a series of local Principal Component Analyses which are globally
compared to find the best non-linear embedding.

 The Locally Linear Embedding algorithm is a typical manifold learning
algorithm. The main idea of LLE is to solve globally nonlinear problems using
locally linear fitting, based on the assumption that data lying on a nonlinear
manifold can be viewed as linear in local areas.

26
THE LOCALLY LINEAR EMBEDDING ALGORITHM

27
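The LLE algorithm on this slide is a figure; a hedged usage sketch with
scikit-learn's LocallyLinearEmbedding on the swissroll dataset discussed below
(the neighbourhood size of 12 is an arbitrary choice):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, color = make_swiss_roll(n_samples=1500, random_state=0)

# Each point is reconstructed from its 12 nearest neighbours, and a 2-D
# embedding that preserves those local reconstruction weights is found.
lle = LocallyLinearEmbedding(n_neighbors=12, n_components=2)
X_2d = lle.fit_transform(X)
```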
The LLE algorithm produces a very interesting result on the iris dataset: it
separates the three groups into three points (Figure 6.12). This shows that the
algorithm works very well on this type of data, but doesn't give us any hints
as to what else it can do.

28
Figure 6.13 shows a common demonstration dataset for these algorithms.
Known as the swissroll for obvious reasons, it is tricky to find a 2D
representation of the 3D data because it is rolled up. The right of Figure 6.13
shows that LLE can successfully unroll it.

29
MULTI-DIMENSIONAL SCALING (MDS)

 Like PCA, MDS tries to find a linear approximation to the full data space
that embeds the data into a lower dimensionality.
 In the case of MDS the embedding tries to preserve the distances between all
pairs of points. It turns out that if the space is Euclidean, then the two methods
are identical.
 We use the same notational setup as previously, starting with data points
x_1, x_2, ..., x_N ∈ R^M. We choose a new dimensionality L < M and compute
the embedding so that the data points are z_1, z_2, ..., z_N ∈ R^L. As usual,
we need a cost function to minimize. There are lots of choices for MDS cost
functions, but the more common ones are:

30
31
32
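The cost functions themselves appear as figures in the original slides. As an
illustration, the standard classical-MDS recipe (double-centre the squared
distance matrix and keep the top eigenvectors) can be sketched as follows; the
data are random and purely illustrative:

```python
import numpy as np

def classical_mds(D, L=2):
    """Embed points given an (N, N) matrix of pairwise Euclidean distances D."""
    N = D.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centred squared distances
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1][:L]        # keep the L largest eigenvalues
    return evecs[:, order] * np.sqrt(np.maximum(evals[order], 0))

# Toy usage: distances between random 5-D points, embedded in 2-D
X = np.random.randn(100, 5)
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
Z = classical_mds(D, L=2)
```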
 This classical MDS algorithm works fine on flat manifolds (flat data spaces).

 However, we are interested in manifolds that are not flat, and this is handled
by Isomap. This algorithm has to construct the distance matrix for all pairs of
data points on the manifold, and so the distances can't be computed exactly.

 Isomap approximates them by assuming that the distances between pairs of
points that are close together are good. It builds up the distances between
points that are far away by finding paths that run through points that are close
together, i.e., that are neighbours, and then uses normal MDS on this distance
matrix:

33
34
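The Isomap algorithm on this slide is a figure; a hedged usage sketch with
scikit-learn's Isomap (the neighbourhood size and dataset are arbitrary choices):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

X, _ = make_swiss_roll(n_samples=1500, random_state=0)

# Geodesic distances are approximated by shortest paths through the
# neighbourhood graph, then classical MDS is applied to that distance matrix.
iso = Isomap(n_neighbors=10, n_components=2)
X_2d = iso.fit_transform(X)
```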
EVOLUTIONARY LEARNING

 Evolutionary learning uses mechanisms inspired by biological evolution, such
as reproduction, mutation, recombination, and selection. Evolutionary
algorithms often perform well at approximating solutions to all types of
problems because they ideally do not make any assumption about the
underlying fitness landscape.
 The genetic algorithm models the genetic process that gives rise to evolution.
In particular, it models sexual reproduction, where both parents give some
genetic information to their offspring.
 The genetic algorithm shows many of the things that are best and worst about
machine learning: it is often, but not always, very effective; it has an array of
parameters that are crucial, but hard to set; and it is impossible to guarantee
that it will find a result that is any good at all.

35
Each adult in the mating pair passes one of their two chromosomes to their
offspring.

36
THE GENETIC ALGORITHM (GA)

 The Genetic Algorithm is a computational approximation to how evolution
performs search, which is by producing modifications of the parent genomes in
their offspring and thus producing new individuals with different fitness. The
computational procedure needs:

 a method for representing problems as chromosomes
 a way to calculate the fitness of a solution
 a selection method to choose parents
 a way to generate offspring by breeding the parents

37
STRING REPRESENTATION

 The first thing that we need is some way to represent the individual solutions,
in analogy to the chromosome.
 GAs use a string, with each element of the string (equivalent to the gene)
being chosen from some alphabet. The different values in the alphabet, which
is often just binary, are analogous to the alleles.
 For the problem we are trying to solve, we work out a way of encoding the
description of a solution as a string. We then create a set of random strings to
be our initial population.
38
EVALUATING FITNESS
 The fitness function can be seen as an oracle that takes a string as an
argument and returns a value for that string. Together with the string encoding,
the fitness function forms the problem-specific part of the GA.

 For the knapsack problem, we decided that we wanted to make the bag as full
as possible. So we would need to know the volume of each item that we want
to put into the knapsack, and then for a given string that says which things
should be taken, and which should not, we can compute the total volume. This
is then a possible fitness function, as in the sketch below.

39
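A minimal sketch of such a fitness function for the volume-only knapsack
described above; the item volumes and the capacity are made up for illustration:

```python
import numpy as np

volumes = np.array([2, 3, 4, 5, 7, 9])   # hypothetical item volumes
limit = 15                               # hypothetical knapsack capacity

def fitness(string):
    """string is a binary array: 1 = take the item, 0 = leave it."""
    total = np.sum(volumes * string)
    # An overfull bag is useless, so give it zero fitness
    return total if total <= limit else 0

print(fitness(np.array([1, 0, 1, 0, 1, 0])))   # volume 2 + 4 + 7 = 13
```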
POPULATION

 We can now measure the fitness of any string. The GA works on a population
of strings, with the first generation usually being created randomly. The fitness
of each string is then evaluated, and that first generation is bred together to
make a second generation, which is then used to generate a third, and so on.

 After the initial population is chosen randomly, the algorithm evolves to
produce each successive generation, with the hope being that there will be
progressively fitter individuals in the populations as the number of generations
increases.
40
GENERATING OFFSPRING: PARENT SELECTION

 For the current generation we need to select those strings that will be used to
generate new offspring. The idea here is that average fitness will improve if we
select strings that are already relatively fit compared to the other members of
the population (following natural selection), which is exploitation of our
current population.

 However, it is also good to allow some exploration, which means that we
have to allow some possibility of weak strings being considered.

 If strings are chosen proportionally to their fitness, so that fitter strings are
more likely to be chosen to enter the 'mating pool', then this allows for both
options.
41
42
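The selection scheme on the accompanying slide is a figure; a hedged sketch of
fitness-proportional ('roulette wheel') selection, which implements the idea just
described, might look like this:

```python
import numpy as np

def select_parents(population, fitnesses, n_pairs, rng=np.random.default_rng()):
    """Pick pairs of parents with probability proportional to their fitness."""
    probs = np.asarray(fitnesses, dtype=float)
    probs = probs / probs.sum()                       # normalise to probabilities
    idx = rng.choice(len(population), size=(n_pairs, 2), p=probs)
    return [(population[i], population[j]) for i, j in idx]
```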
GENERATING OFFSPRING: GENETIC OPERATORS

 Crossover
 Crossover is the operator that performs global exploration, since the strings
that are produced are radically different to both parents in at least some places.
The hope is that sometimes we will take good parts of both solutions and put
them together to make an even better solution. The different forms of the
crossover operator are:

 (a) Single point crossover. A position in the string is chosen at random, and
the offspring is made up of the first part of parent 1 and the second part of
parent 2.
 (b) Multi-point crossover. Multiple points are chosen, with the offspring
being made in the same way.
 (c) Uniform crossover. Random numbers are used to select which parent each
element of the string is taken from.
43
 Mutation
 The other genetic operator is mutation, which effectively performs local
random search. The value of any element of the string can be changed,
governed by some (usually low) probability p. For our binary alphabet in the
knapsack problem, mutation causes a bit-flip, as is shown in the Figure. A
sketch of both operators follows.
44
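A compact sketch of the crossover and mutation operators just described, for
binary strings (the mutation probability below is an arbitrary example value):

```python
import numpy as np

rng = np.random.default_rng(0)

def single_point_crossover(p1, p2):
    """Offspring = first part of parent 1 + second part of parent 2."""
    point = rng.integers(1, len(p1))
    return np.concatenate([p1[:point], p2[point:]])

def uniform_crossover(p1, p2):
    """Each position is taken from a randomly chosen parent."""
    mask = rng.random(len(p1)) < 0.5
    return np.where(mask, p1, p2)

def mutate(string, p=0.01):
    """Flip each bit of a binary string with (low) probability p."""
    flips = rng.random(len(string)) < p
    return np.where(flips, 1 - string, string)
```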
 Elitism, Tournaments, and Niching
 Elitism takes some number of the fittest strings from one generation and puts
them directly into the next population, replacing strings that are already there
either at random, or by choosing the least fit to replace.
 While elitism and tournaments both ensure that good solutions aren't lost,
they both have the problem that they can encourage premature convergence,
where the algorithm settles down to a constant population that never changes
even though it hasn't found an optimum.
45
The randomness in the GA is a very large part of why it works, and schemes to
reduce that randomness often harm the overall results.
 One way to solve the problem of premature convergence is through niching
(also known as using island populations), where the population is separated
into several subpopulations, which all evolve independently for some period of
time, so that they are likely to have converged to different local maxima, and a
few members of one subpopulation are occasionally injected as 'immigrants'
into another subpopulation.

 Another approach is known as fitness sharing, where the fitness of a particular
string is averaged across the number of times that string appears in the
population. This biases the fitness function towards uncommon strings, but it
can also mean that good strings are penalised simply because they are common
in the population.
46
47
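Putting the pieces together, a minimal and purely illustrative GA loop for
binary strings (not a definitive implementation) using fitness-proportional
selection, single-point crossover, bit-flip mutation and simple elitism:

```python
import numpy as np

def genetic_algorithm(fitness, string_len, pop_size=50, generations=100,
                      p_mut=0.01, rng=np.random.default_rng(0)):
    pop = rng.integers(0, 2, size=(pop_size, string_len))   # random first generation
    for _ in range(generations):
        fit = np.array([fitness(s) for s in pop], dtype=float)
        probs = fit / fit.sum() if fit.sum() > 0 else np.full(pop_size, 1 / pop_size)
        new_pop = [pop[np.argmax(fit)]]                      # elitism: keep the best
        while len(new_pop) < pop_size:
            i, j = rng.choice(pop_size, size=2, p=probs)     # fitness-proportional parents
            point = rng.integers(1, string_len)              # single-point crossover
            child = np.concatenate([pop[i][:point], pop[j][point:]])
            flips = rng.random(string_len) < p_mut           # bit-flip mutation
            child = np.where(flips, 1 - child, child)
            new_pop.append(child)
        pop = np.array(new_pop)
    fit = np.array([fitness(s) for s in pop])
    return pop[np.argmax(fit)]                               # best string found
```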
USING GENETIC ALGORITHMS

Map Colouring
Graph colouring is a typical discrete optimisation problem. We want to colour
a graph using only k colours, and choose them in such a way that adjacent
regions have different colours. It has been mathematically proven that any
two-dimensional planar graph can be coloured with four colours; this was the
first ever proof that used a computer program to check the cases.
 Encode possible solutions as strings. For this problem, we'll choose our
alphabet to consist of the three possible shades (black (b), dark (d), and
light (l)).
 Choose a suitable fitness function. The thing that we want to minimise (a
cost function) is the number of times that two adjacent regions have the same
colour, as in the sketch below.
 Choose suitable genetic operators. We'll use the standard genetic operators
for this, since this example makes the operations of crossover and mutation
clear.
48
49
50
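A hedged sketch of the map-colouring cost function described above; the
adjacency list of regions is invented, and each gene is one character from the
alphabet {b, d, l}:

```python
# Hypothetical adjacency list: pairs of regions that share a border
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]

def cost(colouring):
    """colouring is a string over {'b', 'd', 'l'}, one gene per region."""
    return sum(colouring[i] == colouring[j] for i, j in edges)

print(cost("bdlbd"))   # number of clashing borders (0 is a perfect colouring)
```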
PUNCTUATED EQUILIBRIUM
The argument runs that if humans evolved from apes, then there should be
some evidence of a whole set of intermediary species that existed during the
transition phase, and there aren’t. Interestingly, GAs demonstrate one of the
explanations why this is not correct, which is that the way that evolution
actually seems to work is known as punctuated equilibrium.

51
EXAMPLES
 The Knapsack Problem
The knapsack problem states that: given a set of items, each with a mass and a
value, determine the number of each item to include in a collection so that the
total weight is less than or equal to a given limit and the total value is as large
as possible.
The Genetic Algorithm provides a way to solve the knapsack problem in linear
time complexity. The attribute reduction technique which incorporates Rough
Set Theory finds the important genes, hence reducing the search space and
ensuring that the effective information will not be lost.

Genetic Algorithms prove to be a very effective approach for obtaining
solutions to problems traditionally thought of as computationally infeasible,
such as the Knapsack problem.

52
53
EXAMPLE 2: THE FOUR PEAKS PROBLEM
The four peaks problem is a toy problem that is quite often used to test out
GAs and various developments of them. It is an invented fitness function that
rewards strings with lots of consecutive 0s at the start of the string, and lots of
consecutive 1s at the end. The fitness consists of counting the number of 0s at
the start and the number of 1s at the end, and returning the maximum of them
as the fitness, as sketched below.

54
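A sketch of the counting part of the four peaks fitness just described (the full
benchmark usually also adds a bonus when both counts exceed a threshold,
which is omitted here):

```python
def four_peaks(string):
    """string is a list/array of 0s and 1s."""
    leading_zeros = 0
    for bit in string:                 # count consecutive 0s at the start
        if bit != 0:
            break
        leading_zeros += 1
    trailing_ones = 0
    for bit in reversed(string):       # count consecutive 1s at the end
        if bit != 1:
            break
        trailing_ones += 1
    return max(leading_zeros, trailing_ones)

print(four_peaks([0, 0, 0, 1, 0, 1, 1]))   # 3 leading 0s, 2 trailing 1s -> 3
```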
LIMITATIONS OF GA

 A significant one is that they can be very slow.
 The main problem is that once a local maximum has been reached, it can
often be a long time before a string is produced that escapes from the local
maximum and finds another, higher, maximum.
 A more basic criticism of genetic algorithms is that it is very hard (read:
basically impossible) to analyse the behaviour of the GA.
 We cannot guarantee that the algorithm will converge at all, and certainly not
to the optimal solution.

55
TRAINING NEURAL NETWORKS WITH GENETIC ALGORITHMS
 We trained our neural networks, most notably the MLP, using gradient
descent. We could instead encode the problem of finding the correct weights as
a set of strings, with the fitness function measuring the sum-of-squares error.
This has been done, and with good reported results. However, there are some
problems with this approach.

Problems:
 The first is that we turn all the local information from the targets about the
error at each output node of the network into just one number, the fitness,
which is throwing away useful information; the second is that we are ignoring
the gradient information, which is also throwing away useful information.

56
SOLUTION

 A better way to combine GAs with neural networks is to use the GA to
choose the topology of the network. Previously, we chose the structure in a
completely ad hoc way by trying out different structures and choosing the one
that worked best.

 We can use a GA for this problem, although the crossover operator doesn't
make a great deal of sense, so we just consider mutation. However, we allow
for four different types of mutation: delete a neuron, delete a weight
connection, add a neuron, add a connection.

57
REINFORCEMENT LEARNING
 Reinforcement learning fills the gap between supervised
learning, where the algorithm is trained on the correct answers
given in the target data, and unsupervised learning, where the
algorithm can only exploit similarities in the data to cluster it.
 Reinforcement learning is usually described in terms of the
interaction between some agent and its environment. The agent is
the thing that is learning, and the environment is where it is
learning, and what it is learning about. The environment has
another task, which is to provide information about how good a
strategy is, through some reward function.
 The importance of reinforcement learning for psychological
learning theory comes from the concept of trial-and-error learning,
which has been around for a long time, and is also known as the
Law of Effect.
58
A robot perceives the current state of its environment through its sensors, and
performs actions by moving its motors. The reinforcement learner (agent)
within the robot tries to predict the next state and reward.

59
 Reinforcement learning maps states or situations to actions in order
to maximise some numerical reward. That is, the algorithm knows
about the current input (the state), and the possible things it can do
(the actions), and its aim is to maximise the reward. There is a clear
distinction drawn between the agent that is doing the learning and
the environment, which is where the agent acts, and which produces
the state and the rewards.

 The most common way to think about reinforcement learning is on a


robot. The current sensor readings of the robot, or processed versions
of them, could define the state. They are a representation of the
environment around the robot in some way. Note that the state
doesn’t necessarily tell us everything that it would be useful to know,
and there can be noise and inaccuracies in the state data.

 The possible ways that the robot can drive its motors are the actions, which
move the robot in the environment, and the reward could be how well it does
its task without crashing into things.
60
The reinforcement learning cycle: the learning agent performs action a_t in
state s_t and receives reward r_{t+1} from the environment, ending up in state
s_{t+1}.

61
EXAMPLE: GETTING LOST
You arrive in a foreign city exhausted after many hours of flying, catch the
train into town and stagger into a backpacker's hostel without noticing much of
your surroundings. When you wake up it is dark and you are starving. You
remember that you only walked through the old part of the city, so you don't
need to worry about any street that takes you out of the old part. So at the next
bus stop you come to, you have a proper look at the map, and note down the
map of the old town squares, which turns out to look like the Figure.

62
You decide that the backpacker's is almost definitely in the square labelled F
on the map, because its name seems vaguely familiar. You decide to work out
a reward structure so that you can follow a reinforcement learning algorithm to
get to the backpacker's. The first thing you work out is that staying still means
that you are sleeping on your feet, which is bad, so you assign a reward of −5
for that (while negative reinforcement can be viewed as punishment, it doesn't
necessarily correspond clearly, but you might want to imagine it as pinching
yourself so that you stay awake).

63
The state diagram if you are correct and the backpacker's is in square (state) F.
The connections from each state back into itself (meaning that you don't move)
are not shown, to avoid the figure getting too complicated. They are each
worth −5 (except for staying in state F, which means that you are in the
backpacker's).

64
THE FOLLOWING THINGS ARE DISCUSSED IN REINFORCEMENT LEARNING

 State and Action Spaces
Our reinforcement learner is basically a search algorithm, and obviously the
larger the number of states that the algorithm has to search through, the longer
it will take to find a good solution. The set of all states that are possible for the
learner to experience is known as the state space. There is a corresponding
action space that contains all of the possible actions.
 Carrots and Sticks: The Reward Function
The basic idea of the learner is that it will choose the action that gets the
maximum expected reward. The reward function takes the current state and the
chosen action and produces a numerical reward based on them.
 Discounting
The solution to this problem is known as discounting, and means that we take
into account how certain we can be about things that happen in the future:
there is lots of uncertainty in the learning anyway, so we should discount our
predictions of rewards in the future according to how much chance there is
that they are wrong.
65
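Discounting is usually written with a discount factor γ between 0 and 1, so that
(assuming the usual notation, with r_{t+1} the reward after time step t) the
discounted future reward is:

```latex
R_t = r_{t+1} + \gamma\, r_{t+2} + \gamma^{2} r_{t+3} + \dots
    = \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1}, \qquad 0 \le \gamma < 1
```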
Action Selection
At each stage of the reinforcement learning process, the algorithm looks
at the actions that can be performed in the current state and computes the
value of each action. Based on the current average reward predictions,
there are three methods of choosing action a that are worth thinking about
for reinforcement learning. We’ve seen the first and third of them before:

66
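The three methods themselves are listed in the accompanying figure; they are
presumably the usual greedy, ε-greedy and soft-max choices. A hedged sketch of
the latter two, acting on a row of current value estimates Q_row:

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(Q_row, epsilon=0.1):
    """Mostly exploit the best-valued action, explore with probability epsilon."""
    if rng.random() < epsilon:
        return int(rng.integers(len(Q_row)))   # random (exploratory) action
    return int(np.argmax(Q_row))               # greedy action

def soft_max(Q_row, tau=1.0):
    """Choose actions with probability proportional to exp(Q / tau)."""
    prefs = np.exp((Q_row - np.max(Q_row)) / tau)   # subtract max for stability
    return int(rng.choice(len(Q_row), p=prefs / prefs.sum()))
```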
POLICY

 We have just considered different action selection methods, such as ε-greedy
and soft-max. The aim of the action selection is to trade off exploration and
exploitation in such a way as to maximize the expected reward into the future.

 Instead, we can make an explicit decision that we are always going to take the
optimal choice at each stage, and not do exploration any more. This choice of
which action to take in each state in order to get optimal results is known as
the policy, π.

67
MARKOV DECISION PROCESSES
The Markov Property
A simple example of a Markov decision process is to decide on the state of
your mind tomorrow given your state of mind today.

A reinforcement learning problem that follows the Markov property is known
as a Markov Decision Process (MDP). It means that we can compute the likely
next reward, and what the next state will be, from only the current state and
action, based on previous experience.
68
PROBABILITIES IN MARKOV DECISION PROCESSES

 We have now reduced our reinforcement learning problem to


learning about Markov Decision Processes.
 We will only talk about the case where the number of possible
states and actions is finite, because reasoning about the infinite
case makes your head hurt. There is a very simple example of an
MDP, showing predictions for your state-of-mind while preparing
for an exam, together with the (transition probabilities) for moving
between each pair of states shown. This is known as a Markov
chain.

69
PROBABILITIES IN MARKOV DECISION PROCESSES

 There are three actions that can be taken in state E (shown by the
black circles), with associated probabilities and expected rewards.
Learning and using this transition diagram can be seen as the aim
of any reinforcement learner.

 The Markov Decision Process formalism is a powerful one that can deal with
additional uncertainties. For example, it can be extended to deal with the case
where the true states are not known, only an observation of the state can be
made, which is probabilistically related to the state, and possibly the action.
These are known as partially observable Markov Decision Processes
(POMDPs), and they are related to Hidden Markov Models.

70
VALUES
 The reinforcement learner is trying to decide on what action to take
in order to maximize the expected reward into the future. This
expected reward is known as the value. There are two ways that we
can compute a value.
 We can consider the current state, and average across all of the
actions that can be taken, leaving the policy to sort this out for itself
(the state-value function, V (s)), or we can consider the current
state and each possible action that can be taken separately, the
action-value function, Q(s, a). In either case we are thinking about
what the expected reward would be if we started in state s (where
E(·) is the statistical expectation):

71
72
73
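The value-function formulas on the preceding slides are figures; the standard
definitions they correspond to (assuming policy π and discount factor γ, with
E denoting the expectation) are:

```latex
V^{\pi}(s) = \mathbb{E}_{\pi}\!\left[ \sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \,\middle|\, s_t = s \right],
\qquad
Q^{\pi}(s, a) = \mathbb{E}_{\pi}\!\left[ \sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \,\middle|\, s_t = s,\ a_t = a \right]
```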
BACK ON HOLIDAY: USING REINFORCEMENT LEARNING

The transition structure of the map is shown in the Figure, and can be written
out as a matrix (where 1 means that there is a link, and 0 means that there is
not):

74
THE DIFFERENCE BETWEEN SARSA AND Q-LEARNING

The most important difference between the two is how Q is updated after each
action. SARSA uses the Q value of the next action A', which is drawn from the
ε-greedy policy it is actually following. In contrast, Q-learning uses the
maximum Q value over all possible actions for the next step.
 Both algorithms will start out with no information about the environment, and
will therefore explore randomly, using the ε-greedy policy. However, over
time, the strategies that the two algorithms produce are quite different.
 The main reason for the difference is that Q-learning always attempts to
follow the optimal path, which is the shortest one. This takes it close to the
cliff, and the ε-greedy part means that inevitably it will sometimes fall over.
By way of contrast, the SARSA algorithm will converge to a much safer route
that keeps it well away from the cliff, even though it takes longer.
75
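A hedged sketch of the two update rules for a tabular Q (a NumPy array indexed
by state and action); only the bootstrapping line differs, exactly as described
above. The learning rate and discount values are placeholders:

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy: bootstrap from the best action in the next state."""
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy: bootstrap from the action actually chosen (e.g. epsilon-greedily)."""
    target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])
```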
USES OF REINFORCEMENT LEARNING

 Reinforcement learning has been used successfully for many problems, and the
results of computer modeling of reinforcement learning have been of great
interest to psychologists, as well as computer scientists, because of the close
links to biological learning.
 Reinforcement learning has been used in other robotic applications, including
robots learning to follow each other, travel towards bright lights, and even
navigate.
 In general, reinforcement learning is fairly slow, because it has to build up all of
the information through exploration and exploitation in order to find the better
solutions.
 It is also very dependent upon a carefully chosen reward function: get that
wrong and the algorithm will do something completely unexpected.
 A famous example of reinforcement learning was TD-Gammon, which was
produced by Gerald Tesauro. His idea was that reinforcement learning should be
very good at learning to play games, because games were clearly episodic—you
played until somebody won—and there was a clear reward structure, with a
positive reward for winning.
76
ASSIGNMENT QUESTIONS

1. What is Reinforcement Learning?


2. Explain the Q function and Q Learning Algorithm.
3. Describe the K-nearest Neighbor learning algorithm for a
continuous-valued target function.
4. Discuss the major drawbacks of the K-nearest Neighbor learning
algorithm and how they can be corrected.
5. Define the following terms with respect to K - Nearest
Neighbor Learning:
i) Regression ii) Residual iii) Kernel Function.
6. Explain Q learning algorithm assuming deterministic
rewards and actions?
8. Explain Locally Weighted Linear Regression.
9. Explain High Dimensional Spaces in machine learning.

77
10. What is The Curse of Dimensionality?
11. How is dimensionality reduction performed with latent variables?
12. Write algorithm for Principal Component Analysis.
13. Explain about Probabilistic PCA.
14. Differentiate between Probabilistic PCA and Independent Components
Analysis.
15. What is Factor analysis?
16. (i)Describe in detail about Linear Discriminants. (ii)Discuss:
Generalizing the Linear Model and Geometry of the Linear
Discriminant.
17. Point out why dimensionality reduction is useful?
18. Define Factor Analysis or latent variables.
19. Distinguish between within-class scatter and between-class scatter.
78
20. Define PCA.
21. Describe what Isomap is.
22. Discover Locally Linear Embedding algorithm with k=12.
23. Explain the three different ways to do dimensionality reduction.
24. Explain what Least Squares Optimization is.
25. Differentiate between the action space and the state space.
26. What is Punctuated Equilibrium?
27. How does a reinforcement learner map its experience to the corresponding action?
28. Express the basic tasks that need to be performed for GA.
29. Identify how reinforcement learning maps states to actions.
30. Examine Genetic Programming.
31. Differentiate Sarsa and Q-learning.
32. Explain Least Squares Optimization.
33. (i)Describe in detail about Generating Offspring Genetic Operators.
(ii)Discuss the Basic Genetic Algorithm.
79
