IML Module 3
▶ Spam Classification
▶ Given an email, predict whether it is spam or not
▶ Medical Diagnosis
▶ Given a list of symptoms, predict whether a patient has disease X or not
▶ Weather
▶ Based on temperature, humidity, etc., predict whether it will rain tomorrow
Bayesian Classification
▶ Problem statement:
▶ Given features X1, X2, …, Xn
▶ Predict a label Y
Another Application
▶ Digit Recognition
▶ Given an image of a handwritten digit, the classifier predicts which digit it is (e.g., 5)
Bayes Theorem

$P(h \mid D) = \dfrac{P(D \mid h)\,P(h)}{P(D)}$

▶ P(h) = prior probability of hypothesis h
▶ P(D) = prior probability of the training data D
▶ P(h | D) = probability of h given D (posterior)
▶ P(D | h) = probability of D given h (likelihood)
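▶ A small worked example with illustrative numbers (not from the slides), in the spirit of the medical-diagnosis application above: suppose P(disease) = 0.008, P(+ | disease) = 0.98, and P(+ | no disease) = 0.03. Then

$P(\text{disease} \mid +) = \dfrac{P(+ \mid \text{disease})\,P(\text{disease})}{P(+)} = \dfrac{0.98 \times 0.008}{0.98 \times 0.008 + 0.03 \times 0.992} \approx \dfrac{0.0078}{0.0376} \approx 0.21$

so even after a positive test, the more probable hypothesis is still "no disease".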
Bayes Optimal Classifier
▶ The Bayes optimal classifier is a probabilistic model that makes the most probable prediction for
a new example, given the training dataset.
▶ This model is also referred to as the Bayes optimal learner, the Bayes classifier, Bayes optimal
decision boundary, or the Bayes optimal discriminant function.
▶ Bayes Classifier: Probabilistic model that makes the most probable prediction for new
examples.
What is the most probable classification of the new instance given the training data?
Choosing Hypotheses
$P(h \mid D) = \dfrac{P(D \mid h)\,P(h)}{P(D)}$

Generally we want the most probable hypothesis given the training data: the maximum a posteriori (MAP) hypothesis $h_{MAP}$:

$h_{MAP} = \arg\max_{h \in H} P(h \mid D) = \arg\max_{h \in H} \dfrac{P(D \mid h)\,P(h)}{P(D)} = \arg\max_{h \in H} P(D \mid h)\,P(h)$

If we assume $P(h_i) = P(h_j)$ for all $i, j$, we can simplify further and choose the maximum likelihood (ML) hypothesis:

$h_{ML} = \arg\max_{h_i \in H} P(D \mid h_i)$
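▶ A minimal Python sketch (illustrative, not from the slides) of choosing the MAP and ML hypotheses; likelihood and prior are assumed to be callables supplied by the caller:

# Pick the MAP hypothesis from the unnormalized scores P(D | h) * P(h);
# P(D) can be ignored because it is the same for every h.
def map_hypothesis(hypotheses, likelihood, prior):
    """hypotheses: iterable of h; likelihood(h) = P(D | h); prior(h) = P(h)."""
    return max(hypotheses, key=lambda h: likelihood(h) * prior(h))

# With equal priors P(h_i) = P(h_j), this reduces to the ML hypothesis:
def ml_hypothesis(hypotheses, likelihood):
    return max(hypotheses, key=likelihood)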
Bayes Optimal Classifier
Bayes optimal classification:

$\arg\max_{v_j \in V} \sum_{h_i \in H} P(v_j \mid h_i)\,P(h_i \mid D)$

Example:
$P(h_1 \mid D) = .4,\quad P(- \mid h_1) = 0,\quad P(+ \mid h_1) = 1$
$P(h_2 \mid D) = .3,\quad P(- \mid h_2) = 1,\quad P(+ \mid h_2) = 0$
$P(h_3 \mid D) = .3,\quad P(- \mid h_3) = 1,\quad P(+ \mid h_3) = 0$

therefore

$\sum_{h_i \in H} P(+ \mid h_i)\,P(h_i \mid D) = .4$
$\sum_{h_i \in H} P(- \mid h_i)\,P(h_i \mid D) = .6$

and

$\arg\max_{v_j \in V} \sum_{h_i \in H} P(v_j \mid h_i)\,P(h_i \mid D) = -$
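▶ The same computation as a small Python sketch, using the numbers from the example above:

# P(h | D) for each hypothesis, and P(v | h) for each label
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
pred = {"h1": {"+": 1, "-": 0},
        "h2": {"+": 0, "-": 1},
        "h3": {"+": 0, "-": 1}}

def bayes_optimal(labels=("+", "-")):
    # For each label v, sum P(v | h) * P(h | D) over all hypotheses, then take the argmax.
    score = {v: sum(pred[h][v] * posteriors[h] for h in posteriors) for v in labels}
    return max(score, key=score.get)

print(bayes_optimal())   # -> "-" (0.6 vs 0.4)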
The Bayes Classifier
▶ A good strategy is to predict the most probable class given the features: $\arg\max_{y} P(Y = y \mid X_1, \ldots, X_n)$
▶ So… how do we compute that?
The Bayes Classifier
▶ Use Bayes rule:
$P(Y \mid X_1, \ldots, X_n) = \dfrac{P(X_1, \ldots, X_n \mid Y)\,P(Y)}{P(X_1, \ldots, X_n)}$
(numerator: likelihood × prior; denominator: normalization constant)
▶ Why did this help? Because we can often specify how the features are "generated" by the class label.
ESTIMATING PROBABILITIES
The m-estimate of a probability is $\dfrac{n_c + m\,p}{n + m}$. Here, n and n_c are defined as before, p is our prior estimate of the probability we wish to determine, and m is a constant called the equivalent sample size, which determines how heavily to weight p relative to the observed data.
A typical method for choosing p is to assume uniform priors: if an attribute has k possible values, we set p = 1/k.
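▶ A minimal Python sketch of the m-estimate (the counts in the usage comment are made up for illustration, not from the slides):

def m_estimate(n_c, n, p, m):
    """m-estimate of a probability: (n_c + m*p) / (n + m).
    n_c: count of examples matching the condition
    n:   total number of examples
    p:   prior estimate (e.g. 1/k for an attribute with k values)
    m:   equivalent sample size (how heavily to weight the prior)
    """
    return (n_c + m * p) / (n + m)

# e.g. 3 of 14 examples of one class have a given attribute value, attribute has 3 values:
# m_estimate(3, 14, 1/3, m=3)   # = 4/17 ≈ 0.235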
New test data set
▶ Compute the likelihood of each class (e.g., class "no") for the new instance
▶ One common practice to handle numerical attribute values is to assume normal distributions for
numerical attributes.
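▶ A minimal Python sketch (illustrative, not from the slides) of a Gaussian class-conditional likelihood for one numeric attribute; the mean, standard deviation, and test value in the comment are made-up example numbers:

import math

def gaussian_likelihood(x, mean, std):
    """P(X = x | class), assuming the numeric attribute is normally distributed
    within the class, with mean and std estimated from the training data."""
    return (1.0 / (math.sqrt(2 * math.pi) * std)) * math.exp(-((x - mean) ** 2) / (2 * std ** 2))

# e.g. temperature = 66, class with estimated mean 73 and std 6.2:
# gaussian_likelihood(66, 73, 6.2)   # ≈ 0.034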
Naïve Bayes Assumption
▶ Assume the features are conditionally independent given the label: $P(X_1, \ldots, X_n \mid Y) = \prod_i P(X_i \mid Y)$
X1 X2 P(Y=0|X1,X2) P(Y=1|X1,X2)
0 0 1 0
0 1 0 1
1 0 0 1
1 1 1 0
▶ Actually, the Naïve Bayes assumption is almost never
true
▶ Still… Naïve Bayes often performs surprisingly well even when its assumptions do not
hold
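▶ A minimal Python sketch (illustrative, not from the slides) of a categorical Naïve Bayes classifier that combines the pieces above: class priors, per-attribute conditional probabilities smoothed with the m-estimate (uniform prior p = 1/k), and an argmax over classes:

from collections import Counter, defaultdict

class NaiveBayes:
    """Minimal categorical Naive Bayes sketch; not the slides' code."""

    def fit(self, X, y, m=1.0):
        self.m = m
        self.n = len(y)
        self.class_counts = Counter(y)                 # used for the priors P(Y=c)
        self.value_counts = defaultdict(Counter)       # (feature j, class c) -> value counts
        self.feature_values = defaultdict(set)         # feature j -> observed values
        for xi, yi in zip(X, y):
            for j, v in enumerate(xi):
                self.value_counts[(j, yi)][v] += 1
                self.feature_values[j].add(v)
        return self

    def predict(self, x):
        best_label, best_score = None, float("-inf")
        for c, nc in self.class_counts.items():
            score = nc / self.n                        # prior P(Y=c)
            for j, v in enumerate(x):
                k = len(self.feature_values[j])        # uniform prior p = 1/k over k values
                n_c = self.value_counts[(j, c)][v]     # count of value v within class c
                score *= (n_c + self.m * (1.0 / k)) / (nc + self.m)   # m-estimate of P(Xj=v | Y=c)
            if score > best_score:
                best_label, best_score = c, score
        return best_label

# e.g. nb = NaiveBayes().fit([["sunny", "hot"], ["rain", "cool"]], ["no", "yes"])
#      nb.predict(["sunny", "cool"])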
Bayesian Belief Networks
BBN – Conditional Independence
Inferences in Bayesian Belief Networks
Learning of Bayesian Networks
Learning Bayes Nets
Gradient Ascent for Bayes Nets
More on Learning Bayes Nets
Bayesian Belief Networks – Joint Probability
Bayesian Belief Networks – Marginal Probability (only a single condition given)
Bayesian Belief Networks – Inference for an uncertain event given evidence
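▶ A minimal Python sketch (not from the slides) of exact inference by enumeration on a tiny two-node network Rain → WetGrass; the CPT values are made up for illustration:

P_rain = {True: 0.2, False: 0.8}                          # P(Rain)
P_wet_given_rain = {True: 0.9, False: 0.1}                # P(WetGrass=True | Rain)

def p_rain_given_wet():
    # P(Rain=True | WetGrass=True) = P(Wet | Rain) P(Rain) / P(Wet)
    num = P_wet_given_rain[True] * P_rain[True]
    den = sum(P_wet_given_rain[r] * P_rain[r] for r in (True, False))
    return num / den                                      # = 0.18 / 0.26 ≈ 0.692

print(p_rain_given_wet())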
BBN – Advantages
• Intuitive, graphical, and efficient
• Accounts for sources of uncertainty
• Allows for information updating
• Models multiple interdependencies
• Models distributed & interacting systems
• Identifies critical components & cut sets
• Includes utility and decision nodes
BBN – Disadvantages
• Not ideally suited for computing small probabilities
• Practical limitations on the type of distributions and the form of statistical
dependence
• Computationally demanding for systems with a large number of random variables
• Exponential growth of computational effort with increased number of states
BBN – Applications
BBN – Summary
Expectation Maximization [EM] Algorithm
Expectation Maximization [EM] Algorithm – Finite Mixtures
Generating a Mixture of k Gaussians
EM for Estimating k Means
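▶ A minimal Python sketch (not from the slides) of EM for estimating the k = 2 means of a Gaussian mixture, assuming known equal variances and equal mixing weights:

import math, random

def em_two_means(xs, iters=50, sigma=1.0):
    """Estimate the two component means of a 1-D Gaussian mixture by EM."""
    mu = [min(xs), max(xs)]                           # crude initialization
    for _ in range(iters):
        # E-step: expected responsibilities E[z_ij] of each component for each point
        resp = []
        for x in xs:
            w = [math.exp(-(x - m) ** 2 / (2 * sigma ** 2)) for m in mu]
            s = sum(w)
            resp.append([wi / s for wi in w])
        # M-step: re-estimate each mean as a responsibility-weighted average
        for j in range(2):
            num = sum(r[j] * x for r, x in zip(resp, xs))
            den = sum(r[j] for r in resp)
            mu[j] = num / den
    return mu

# e.g. data drawn from two clusters around 0 and 5:
# em_two_means([random.gauss(0, 1) for _ in range(50)] + [random.gauss(5, 1) for _ in range(50)])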
General EM Problem
Extending the Mixture Model
General EM Method
Expectation Maximization [EM] Algorithm
Expectation Maximization [EM] Algorithm – Uses
Expectation Maximization [EM] Algorithm – Advantages
Expectation Maximization [EM] Algorithm – Disadvantages
Expectation Maximization [EM] Algorithm – Example
Ensemble Learning
Types of Ensemble Methods
1. Voting (Averaging)
2. Bootstrap aggregation (Bagging)
3. Random Forest
4. Boosting
5. Stacked Generalization (Blending)
Voting (Averaging)
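▶ A minimal sketch of hard (majority) voting using scikit-learn's VotingClassifier; the choice of base estimators and the iris data are illustrative assumptions:

from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
vote = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("nb", GaussianNB()),
                ("dt", DecisionTreeClassifier())],
    voting="hard")                       # majority vote over the three base models
vote.fit(X, y)
print(vote.predict(X[:5]))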
Bootstrap Aggregation (Bagging)
Random Forest
Boosting
Boosting
Stacked Generalization (Blending)
Stacked Generalization (Blending)
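▶ A minimal sketch of stacked generalization using scikit-learn's StackingClassifier; the base learners' predictions feed a logistic-regression blender (estimator choices and data are illustrative):

from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=50)),
                ("nb", GaussianNB())],
    final_estimator=LogisticRegression(max_iter=1000))   # blender trained on base predictions
stack.fit(X, y)
print(stack.predict(X[:5]))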
Note: Explore Pasting and Random Patches on your own (both are sampling variants of Bagging).