Machine Learning Unit 5 Part 2
1. Bayesian Learning
Bayesian Learning for Machine Learning
Bayes’ Theorem
Bayes’ theorem describes how the conditional probability of an event
or a hypothesis can be computed using evidence and prior
knowledge.
For example, it is like concluding that our code has no bugs given the
evidence that it has passed all the test cases, combined with our prior
belief that we have rarely observed bugs in our code.
Bayes’ theorem is given by:
P(θ|X) = [P(X|θ) P(θ)] ÷ P(X)
Where,
P(θ) - Prior probability is the probability of the hypothesis θ being true before
observing the evidence. The prior represents the beliefs we have gained through
past experience, which may come from common sense or from the outcome of
Bayes’ theorem for some past observations.
P(X|θ) - Likelihood is the conditional probability of the evidence given a
hypothesis. The likelihood is determined mainly by our observations, the data we
have.
P(X) - Evidence denotes the probability of observing the evidence or data
regardless of the hypothesis. It can be expressed as a summation (or integral)
over all possible hypotheses of the likelihood weighted by the prior:
P(X) = Σi P(X|θi) P(θi).
P(θ|X) - Posterior probability denotes the conditional probability of the
hypothesis θ after observing the evidence X.
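To make these terms concrete, below is a minimal sketch that plugs numbers into the theorem for the code-bug example above; the 0.6 prior and the 0.5 chance that buggy code still passes every test are illustrative assumptions, not values from the slides.

```python
# Bayes' theorem for the code-bug example; all numbers are
# illustrative assumptions.
p_theta = 0.6          # P(θ): prior belief that the code is bug-free
p_x_given_theta = 1.0  # P(X|θ): bug-free code passes all test cases
p_x_given_not = 0.5    # P(X|¬θ): assumed chance buggy code still passes

# Evidence term: likelihoods of all hypotheses weighted by their priors.
p_x = p_x_given_theta * p_theta + p_x_given_not * (1 - p_theta)

# Posterior: P(θ|X) = P(X|θ) P(θ) ÷ P(X)
p_theta_given_x = p_x_given_theta * p_theta / p_x
print(f"P(θ|X) = {p_theta_given_x:.3f}")  # P(θ|X) = 0.750
```

Note how passing all the tests raises our belief in bug-free code from the prior of 0.6 to a posterior of 0.75.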
Maximum a Posteriori (MAP)
We can use MAP to determine the most probable hypothesis from a set of hypotheses.
According to MAP, the hypothesis with the maximum posterior probability is
considered the valid hypothesis. Therefore, we can express the hypothesis θMAP
concluded using MAP as follows:
θMAP = argmaxθ P(θ|X) = argmaxθ [P(X|θ) P(θ) ÷ P(X)]
Since P(X) does not depend on θ, this reduces to θMAP = argmaxθ P(X|θ) P(θ).
The argmax operator selects the hypothesis θi that maximizes the posterior
probability P(θi|X).
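The sketch below applies MAP to a toy two-hypothesis set; the hypotheses and their assumed priors and likelihoods are hypothetical, reusing the code-bug numbers from the earlier example.

```python
# MAP over a discrete set of hypotheses; numbers are illustrative.
priors = {"bug-free": 0.6, "buggy": 0.4}       # P(θi)
likelihoods = {"bug-free": 1.0, "buggy": 0.5}  # P(X|θi): all tests pass

# P(X) is the same for every hypothesis, so maximizing the
# unnormalized posterior P(X|θi) P(θi) is enough for MAP.
theta_map = max(priors, key=lambda h: likelihoods[h] * priors[h])
print(theta_map)  # bug-free
```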
Bayesian Learning
With Bayesian learning, we are dealing with random variables that have
probability distributions.
Consider the prior probability of not observing a bug in our code in the above
example. We defined θ as the event of not observing a bug, and the
probability of producing bug-free code, P(θ), was taken as p. However, the
event θ can actually take two values, true or false, corresponding to
not observing a bug or observing a bug respectively.
Therefore, observing a bug and not observing a bug are not two separate
events; they are the two possible outcomes of the same event θ.
Since all possible values of θ are the result of a random event, we can
consider θ as a random variable. Therefore, P(θ) is not a single
probability value; rather, it is a discrete probability distribution that can
be described using a probability mass function:
P(θ = true) = p, P(θ = false) = 1 − p
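As a small illustration, the sketch below represents θ as a discrete random variable with an assumed bug-free probability p = 0.6 (a hypothetical value, not from the slides).

```python
# θ as a discrete random variable described by a probability
# mass function; p = 0.6 is an illustrative assumption.
p = 0.6
pmf = {"bug-free": p, "buggy": 1 - p}  # P(θ = true), P(θ = false)
assert abs(sum(pmf.values()) - 1.0) < 1e-9  # a valid PMF sums to 1
print(pmf)
```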
• Incorporate prior beliefs and incrementally update the prior
probabilities whenever more evidence is available (see the sketch after this list).
• We can use concepts such as confidence intervals to measure the
confidence of the posterior probability.
• We can end the experiment when we have obtained results with
sufficient confidence for the task.
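Here is a minimal sketch of this incremental update loop, assuming a Beta prior over the bug-free probability p and treating each test run as Bernoulli evidence; the prior parameters and trial outcomes are hypothetical.

```python
# Incremental Bayesian updating with a Beta prior over
# p = P(code is bug-free); prior and trial data are hypothetical.
from scipy import stats

alpha, beta = 2.0, 2.0       # assumed Beta(2, 2) prior over p
trials = [1, 1, 0, 1, 1, 1]  # 1 = all tests passed, 0 = a test failed

for passed in trials:
    # Beta is conjugate to the Bernoulli likelihood, so each new
    # observation updates the posterior in closed form.
    alpha += passed
    beta += 1 - passed

posterior = stats.beta(alpha, beta)
lo, hi = posterior.interval(0.95)  # 95% interval for p
print(f"mean = {posterior.mean():.3f}, 95% interval = ({lo:.3f}, {hi:.3f})")
# End the experiment once the interval is narrow enough for the task.
```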
• Bayesian learning and the frequentist method can also be considered as
two ways of looking at the task of estimating the values of unknown
parameters given some observations caused by those parameters (the
sketch after this list contrasts the two).
• For certain tasks, either the concept of uncertainty is meaningless or
interpreting prior beliefs is too complex. In such cases, frequentist
methods are more convenient, and we do not require Bayesian learning
with all its extra effort.
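To make the contrast concrete, the sketch below compares the frequentist maximum-likelihood estimate with the Bayesian posterior mean for the same data; the observation counts and the Beta(2, 2) prior are assumptions for illustration.

```python
# Frequentist vs. Bayesian estimation of an unknown success
# probability p; data and prior are hypothetical.
successes, trials = 7, 10

# Frequentist: maximum-likelihood point estimate, no prior.
p_mle = successes / trials                 # 0.700

# Bayesian: closed-form posterior from an assumed Beta(2, 2) prior.
alpha, beta = 2 + successes, 2 + (trials - successes)
p_posterior_mean = alpha / (alpha + beta)  # 9 / 14 ≈ 0.643

print(f"MLE = {p_mle:.3f}, posterior mean = {p_posterior_mean:.3f}")
```

With few observations the prior pulls the Bayesian estimate toward 0.5; as the number of trials grows, the two estimates converge.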
Conclusion
1. Bayesian Learning
Thank you so much!
For more info, please contact us:
0755 - 3501700
[email protected]