Maximum Likelihood (ML) Hypothesis, h_ML
If we assume that every hypothesis in H is equally probable,
i.e. P(h_i) = P(h_j) for all h_i and h_j in H,
then we need only consider P(D|h) to find the most probable hypothesis.
P(D|h) is often called the likelihood of the data D given h.
Any hypothesis that maximizes P(D|h) is called a maximum likelihood (ML) hypothesis, h_ML.
$h_{ML} = \arg\max_{h \in H} P(D|h)$
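This reduction follows directly from Bayes' theorem; a short sketch, assuming the usual definition of the MAP hypothesis, $h_{MAP} = \arg\max_{h \in H} P(h|D)$, and an equal prior P(h) for every hypothesis in H:

```latex
\begin{aligned}
h_{MAP} &= \arg\max_{h \in H} P(h \mid D)
         = \arg\max_{h \in H} \frac{P(D \mid h)\,P(h)}{P(D)} \\
        &= \arg\max_{h \in H} P(D \mid h)\,P(h)
         && \text{($P(D)$ does not depend on $h$)} \\
        &= \arg\max_{h \in H} P(D \mid h) = h_{ML}
         && \text{(equal priors: $P(h)$ is the same for every $h$)}
\end{aligned}
```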
Maximum Likelihood and Least-Squared Error Hypotheses
Many learning approaches such as neural network learning, linear regression, and
polynomial curve fitting try to learn a continuous-valued target function.
Under certain assumptions, any learning algorithm that minimizes the squared error
between the output hypothesis predictions and the training data will output
a maximum likelihood hypothesis.
— Any hypothesis that maximizes P(D|h) is called a maximum likelihood (ML) hypothesis, h_ML
— $h_{ML} = \arg\max_{h \in H} P(D|h)$
The significance of this result is that it provides a Bayesian justification (under certain
assumptions) for many neural network and other curve fitting methods that attempt to
minimize the sum of squared errors over the training data.

Maximum Likelihood and Least-Squared Error Hypotheses — Deriving h_ML
In order to find the maximum likelihood hypothesis, we start with our earlier definition
but using lower case p to refer to the probability density function.
$h_{ML} = \arg\max_{h \in H} p(D|h)$
We assume a fixed set of training instances $(x_1, \ldots, x_m)$ and therefore consider the data
D to be the corresponding sequence of target values $D = (d_1, \ldots, d_m)$.
Here $d_i = f(x_i) + e_i$, where $e_i$ is a random noise term. Assuming the training examples are
mutually independent given h, we can write p(D|h) as the product of the various $p(d_i|h)$:
$h_{ML} = \arg\max_{h \in H} \prod_{i=1}^{m} p(d_i|h)$
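The next slide states the least-squared-error result without the intermediate steps; here is a sketch of the standard derivation, assuming each noise term $e_i$ is drawn independently from a Normal distribution with zero mean and a fixed (possibly unknown) variance $\sigma^2$, so that $p(d_i|h)$ is a Normal density centred at $h(x_i)$:

```latex
\begin{aligned}
h_{ML} &= \arg\max_{h \in H} \prod_{i=1}^{m}
          \frac{1}{\sqrt{2\pi\sigma^2}}
          \exp\!\left(-\frac{(d_i - h(x_i))^2}{2\sigma^2}\right) \\
       &= \arg\max_{h \in H} \sum_{i=1}^{m}
          -\frac{(d_i - h(x_i))^2}{2\sigma^2}
          && \text{(take logarithms and drop terms constant in $h$)} \\
       &= \arg\min_{h \in H} \sum_{i=1}^{m} \bigl(d_i - h(x_i)\bigr)^2
\end{aligned}
```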
Maximum Likelihood and Least-Squared Error Hypotheses
The maximum likelihood hypothesis h_ML is the one that minimizes the sum of the
squared errors between the observed training values $d_i$ and the hypothesis predictions $h(x_i)$.
This holds under the assumption that the observed training values $d_i$ are generated by
adding random noise to the true target value, where this random noise is drawn
independently for each example from a Normal distribution with zero mean.
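A small numerical illustration of this equivalence. This is only a sketch under the stated assumptions: the dataset, the grid of linear hypotheses, and the noise level $\sigma = 0.1$ are invented for the example, and scipy is used only to evaluate the Normal density.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Training data: d_i = f(x_i) + e_i with f(x) = 2x + 1 and Gaussian noise.
x = np.linspace(0.0, 1.0, 20)
d = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.shape)

# Hypothesis space: lines h(x) = w0 + w1*x over a coarse grid of (w0, w1).
w0, w1 = np.meshgrid(np.linspace(0.0, 2.0, 81), np.linspace(1.0, 3.0, 81))
pred = w0 + w1 * x[:, None, None]          # predictions, shape (20, 81, 81)

# Sum of squared errors of every hypothesis on the grid.
sse = ((d[:, None, None] - pred) ** 2).sum(axis=0)

# Gaussian log-likelihood of the data under every hypothesis (fixed sigma).
log_lik = norm.logpdf(d[:, None, None], loc=pred, scale=0.1).sum(axis=0)

# The hypothesis that minimizes the squared error is the one that
# maximizes the likelihood.
assert np.unravel_index(sse.argmin(), sse.shape) == \
       np.unravel_index(log_lik.argmax(), log_lik.shape)
```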
Similar derivations can be performed starting with other assumed noise distributions,
producing different results.
Why is it reasonable to choose the Normal distribution to characterize noise?
— One reason is that it allows for a mathematically straightforward analysis.
— A second reason is that the smooth, bell-shaped distribution is a good approximation to
many types of noise in physical systems.
Minimizing the sum of squared errors is a common approach in many neural network,
curve fitting, and other approaches to approximating real-valued functions.

Bayes Optimal Classifier
Normally we consider:
— What is the most probable hypothesis given the training data?
We can also consider:
— What is the most probable classification of the new instance given the training
data?
Consider a hypothesis space containing three hypotheses, h1, h2, and h3.
— Suppose that the posterior probabilities of these hypotheses given the training data are .4, .3, and .3 respectively.
— Thus, h1 is the MAP hypothesis.
— Suppose a new instance x is encountered, which is classified positive by h1, but negative by h2 and h3.
— Taking all hypotheses into account, the probability that x is positive is .4 (the probability associated with h1), and the probability that it is negative is .6.
— The most probable classification (negative) in this case is different from the classification generated by the MAP hypothesis.

Bayes Optimal Classifier
The most probable classification of the new instance is obtained by combining the
predictions of all hypotheses, weighted by their posterior probabilities.
If the possible classification of the new example can take on any value $v_j$ from some
set V, then the probability $P(v_j|D)$ that the correct classification for the new instance
is $v_j$ is:

$P(v_j|D) = \sum_{h_i \in H} P(v_j|h_i)\, P(h_i|D)$
Bayes optimal classification:
$\arg\max_{v_j \in V} \sum_{h_i \in H} P(v_j|h_i)\, P(h_i|D)$
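A minimal sketch of this computation in Python, using the three-hypothesis example from the previous slide. The posteriors (.4, .3, .3), the per-hypothesis prediction tables, and the function name bayes_optimal_classification are illustrative assumptions, not part of the source.

```python
def bayes_optimal_classification(posteriors, predictions, values):
    """Return argmax over v in values of sum_i P(v|h_i) * P(h_i|D).

    posteriors  -- list of posteriors P(h_i|D), one per hypothesis
    predictions -- list of dicts giving P(v|h_i) for each hypothesis
    values      -- the set V of possible classifications
    """
    def weighted_vote(v):
        return sum(p_h * p_v_given_h[v]
                   for p_h, p_v_given_h in zip(posteriors, predictions))
    return max(values, key=weighted_vote)


# Example from the slide: P(h1|D) = .4, P(h2|D) = .3, P(h3|D) = .3;
# h1 classifies x as positive, h2 and h3 classify it as negative.
posteriors = [0.4, 0.3, 0.3]
predictions = [
    {"+": 1.0, "-": 0.0},   # h1
    {"+": 0.0, "-": 1.0},   # h2
    {"+": 0.0, "-": 1.0},   # h3
]
print(bayes_optimal_classification(posteriors, predictions, ["+", "-"]))
# Prints "-": P(+|D) = .4 and P(-|D) = .6, so the optimal classification is negative.
```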
Bayes Optimal Classifier
Although the Bayes optimal classifier obtains the best performance that can be
achieved from the given training data, it can be quite costly to apply.
— The expense is due to the fact that it computes the posterior probability for every
hypothesis in H and then combines the predictions of each hypothesis to classify
each new instance.
An alternative, less optimal method is the Gibbs algorithm (a sketch follows the steps below):
1. Choose a hypothesis h from H at random, according to the posterior probability distribution over H.
2. Use h to predict the classification of the next instance x.
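A minimal sketch of the two steps in Python, reusing the three-hypothesis example above. The posterior list, the per-hypothesis prediction functions, and the name gibbs_classify are illustrative assumptions.

```python
import random

def gibbs_classify(posteriors, predictions, x, rng=random):
    """Gibbs algorithm sketch: sample a single hypothesis according to its
    posterior probability, then use only that hypothesis to classify x."""
    # Step 1: choose h from H at random, weighted by P(h|D).
    (chosen,) = rng.choices(range(len(posteriors)), weights=posteriors, k=1)
    # Step 2: use the chosen hypothesis to predict the classification of x.
    return predictions[chosen](x)


# Reusing the example: h1 predicts positive, h2 and h3 predict negative.
posteriors = [0.4, 0.3, 0.3]
predictions = [lambda x: "+", lambda x: "-", lambda x: "-"]
print(gibbs_classify(posteriors, predictions, x="some new instance"))
```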