Deep Learning
Architectures:
First, we need to identify the actual problem in order to arrive at the right
solution, and we should check the feasibility of deep learning for that
problem (whether it is a good fit for deep learning or not). Second, we
need to identify the relevant data, which should correspond to the actual
problem and be prepared accordingly. Third, choose a deep learning
algorithm appropriately. Fourth, use that algorithm to train a model on
the dataset. Fifth, perform final testing on the dataset.
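As a rough illustration, the sketch below walks through steps two to five on a synthetic dataset, assuming PyTorch as the framework; the tiny network, the made-up data, and the hyperparameters are placeholders for illustration, not a prescribed setup.

```python
import torch
from torch import nn

# Step 2: identify and prepare data (synthetic binary-classification
# set used here as a stand-in for real problem data).
X = torch.randn(200, 4)
y = (X.sum(dim=1) > 0).float().unsqueeze(1)

# Step 3: choose an algorithm/architecture (a small feed-forward net).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.BCEWithLogitsLoss()
opt = torch.optim.SGD(model.parameters(), lr=0.1)

# Step 4: train the model on the dataset.
for epoch in range(100):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

# Step 5: final testing (here on held-out synthetic data).
X_test = torch.randn(50, 4)
y_test = (X_test.sum(dim=1) > 0).float().unsqueeze(1)
with torch.no_grad():
    acc = ((model(X_test) > 0).float() == y_test).float().mean()
print(f"test accuracy: {acc:.2f}")
```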
Limitations:
Advantages:
Disadvantages:
Applications:
2) Bayesian Learning:-
A learning technique that determines model parameters (such as the
network weights) by maximizing the posterior probability of the parameters
given the training data. The idea is that some parameter values are more
consistent with the observed data than others. By Bayes' rule, maximizing
the posterior probability amounts to maximizing the product of the prior
and the likelihood, the conditional probability of the training data given
the model parameters. This quantity can often be computed by a
closed formula or by an update rule. Bayesian techniques make it possible
to use all data for training instead of reserving patterns for cross-validation
of parameters. Bayesian learning methods are able to provide useful
practical solutions and forecasting features for solving complicated
problems.
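As one concrete instance of such a closed-form update rule, the sketch below estimates a coin's bias under a Beta prior, where the posterior follows by simply adding counts; the prior pseudo-counts and the observed flips are made up for illustration.

```python
# Beta-Bernoulli example: the posterior over a coin's bias has a
# closed-form update rule (conjugate prior), so no data needs to be
# held out to fit the parameter.
alpha, beta = 2.0, 2.0          # prior pseudo-counts (prior knowledge)
data = [1, 0, 1, 1, 1, 0, 1]    # observed flips (1 = heads)

for x in data:                  # conjugate update: just add counts
    alpha += x
    beta += 1 - x

map_estimate = (alpha - 1) / (alpha + beta - 2)   # posterior mode
print(f"MAP estimate of P(heads) = {map_estimate:.3f}")
```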
Features of Bayesian learning methods:
• Each observed training example can incrementally decrease or increase
the estimated probability that a hypothesis is correct.
– This provides a more flexible approach to learning than algorithms that
completely eliminate a hypothesis if it is found to be inconsistent with any
single example.
• Prior knowledge can be combined with observed data to determine the
final probability of a hypothesis. In Bayesian learning, prior knowledge is
provided by asserting
– a prior probability for each candidate hypothesis, and
– a probability distribution over observed data for each possible hypothesis.
• Bayesian methods can accommodate hypotheses that make probabilistic
predictions.
• New instances can be classified by combining the predictions of multiple
hypotheses, weighted by their probabilities (see the sketch after this list).
• Even in cases where Bayesian methods prove computationally
intractable, they can provide a standard of optimal decision making against
which other practical methods can be measured.
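The sketch below illustrates the incremental-update and weighted-prediction points above on a toy hypothesis space of three candidate coin biases; the hypotheses, uniform priors, and observations are invented for illustration.

```python
# Three candidate hypotheses about a coin's bias, each with a prior.
hypotheses = {0.2: 1/3, 0.5: 1/3, 0.8: 1/3}   # P(heads under h) -> P(h)

def update(posterior, x):
    """One observed flip incrementally raises or lowers each P(h)."""
    unnorm = {h: p * (h if x == 1 else 1 - h) for h, p in posterior.items()}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

posterior = dict(hypotheses)
for x in [1, 1, 0, 1]:          # each example shifts, never eliminates, P(h)
    posterior = update(posterior, x)

# Classify a new flip by combining every hypothesis's prediction,
# weighted by its posterior probability.
p_heads = sum(h * p for h, p in posterior.items())
print(posterior)
print(f"P(next flip is heads) = {p_heads:.3f}")
```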
Bayes Theorem
• In machine learning, we try to determine the best hypothesis from some
hypothesis space H, given the observed training data D.
• In Bayesian learning, the best hypothesis means the most probable
hypothesis, given the data D plus any initial knowledge about the prior
probabilities of the various hypotheses in H.
• Bayes theorem provides a way to calculate the probability of a
hypothesis based on its prior probability, the probabilities of observing
various data given the hypothesis, and the observed data itself.
Bayes theorem:
P(h|D) = P(D|h) P(h) / P(D)
P(h) is the prior probability of hypothesis h.
– P(h) denotes the initial probability that hypothesis h holds, before
observing the training data.
– P(h) may reflect any background knowledge we have about the chance
that h is correct. If we have no such prior knowledge, then each candidate
hypothesis might simply get the same prior probability.
P(D) is the prior probability that training data D will be observed.
P(D|h) is the probability of observing data D given that hypothesis h holds.
P(h|D) is the posterior probability: the probability that h holds given the
observed training data D.
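For a concrete feel for the formula, here is one worked application; the prior and the two likelihoods are made-up numbers, and the evidence P(D) is obtained by total probability over h and its negation.

```python
# One application of Bayes' theorem with illustrative numbers.
p_h = 0.3            # prior P(h)
p_d_given_h = 0.9    # likelihood P(D|h)
p_d_given_not_h = 0.2

# Evidence P(D) via the law of total probability.
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)

p_h_given_d = p_d_given_h * p_h / p_d   # posterior P(h|D)
print(f"P(h|D) = {p_h_given_d:.3f}")    # 0.27 / 0.41 ~ 0.659
```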
3) Decision Surfaces:-
Decision surface: a (hyper)surface in a multidimensional state
space that partitions the space into different regions. Data lying on one side
of a decision surface are defined as belonging to a different class from
those lying on the other. Decision surfaces may be created or modified as a
result of a learning process, and they are frequently used in machine
learning, pattern recognition, and classification systems.
Classification in machine learning means training a model on our data so
that it can assign labels to inputs.
Each input feature defines an axis of the feature space. The minimum
number of features required to form a plane is two, with dots representing
input coordinates in the feature space. If there were three input variables,
the feature space would be a three-dimensional volume.
The data points lying on one side of the decision surface belong to one
class label, and those lying on the other side of the surface belong to
another. Through the model learning process, we can create or modify
decision boundaries.
Although the word ‘surface’ suggests a 2-D feature space, we can still use
these methods for more than two features by creating a decision surface
for each pair of input features.
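As an illustration, the sketch below fits a linear model to two-feature data and shades the two regions its decision surface separates; it assumes scikit-learn and matplotlib are available, and the blob data is synthetic.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Two features -> the feature space is a plane; the learned decision
# surface is the line where the classifier's prediction changes.
X, y = make_blobs(n_samples=100, centers=2, n_features=2, random_state=0)
clf = LogisticRegression().fit(X, y)

# Evaluate the classifier on a grid covering the feature space.
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 200),
    np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 200))
zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Regions on either side of the surface get different class labels.
plt.contourf(xx, yy, zz, alpha=0.3)
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.xlabel("feature 1"); plt.ylabel("feature 2")
plt.show()
```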
4) Linear Classifiers:-
Linear classifiers classify data into labels based on a linear combination
of the input features. Therefore, these classifiers separate data using a line,
a plane, or a hyperplane (a plane in more than 2 dimensions). On their
own, they can only classify data that is linearly separable, but they can be
modified (for example, by transforming the features) to classify
non-linearly separable data.
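A minimal sketch of one classic linear classifier, the perceptron, assuming NumPy and a synthetic linearly separable dataset:

```python
import numpy as np

# Perceptron: learns a hyperplane w.x + b = 0 separating two classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)   # linearly separable labels

w, b = np.zeros(2), 0.0
for _ in range(20):                          # a few passes over the data
    for xi, yi in zip(X, y):
        if yi * (xi @ w + b) <= 0:           # misclassified point
            w += yi * xi                     # nudge the hyperplane
            b += yi

pred = np.sign(X @ w + b)
print("training accuracy:", (pred == y).mean())
```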
5)