
Lecture 10 Naïve Bayes Classification

This document discusses Naive Bayes classification. It begins by providing examples of classification problems like spam detection, medical diagnosis, and weather prediction. It then introduces the Bayesian classification approach and derives the Naive Bayes classifier by making the naive assumption that features are conditionally independent given the class label. The document discusses training a Naive Bayes model, classifying new examples, and some of the limitations of the naive independence assumption. It concludes by noting that Naive Bayes is easy to implement and often effective in practice.



Lecture 10
Naïve Bayes Classification
Things We’d Like to Do

 Spam Classification
 Given an email, predict whether it is spam or not

 Medical Diagnosis
 Given a list of symptoms, predict whether a patient has cancer or not

 Weather
 Based on temperature, humidity, etc… predict if it will rain tomorrow
Bayesian Classification

 Problem statement:
 Given features X1,X2,…,Xn
 Predict a label Y
Another Application

 Digit Recognition

[Figure: a handwritten digit image fed into the classifier, which outputs "5"]

 X1,…,Xn ∈ {0,1} (Black vs. White pixels)

 Y ∈ {5,6} (predict whether a digit is a 5 or a 6)
The Bayes Classifier

 In class, we saw that a good strategy is to predict the most probable class given the observed features:

   argmax_y P(Y=y | X1,…,Xn)

 (for example: what is the probability that the image represents a 5 given its pixels?)

 So … how do we compute that?


The Bayes Classifier

 Use Bayes Rule!

   P(Y | X1,…,Xn) = P(X1,…,Xn | Y) · P(Y) / P(X1,…,Xn)

   where P(X1,…,Xn | Y) is the likelihood, P(Y) is the prior, and P(X1,…,Xn) is the normalization constant

 Why did this help? Well, we think that we might be able to specify how features are “generated” by the class label
The Bayes Classifier

 Let’s expand this for our digit recognition task:

   P(Y=5 | X1,…,Xn) = P(X1,…,Xn | Y=5) · P(Y=5) / P(X1,…,Xn)
   P(Y=6 | X1,…,Xn) = P(X1,…,Xn | Y=6) · P(Y=6) / P(X1,…,Xn)

 To classify, we’ll simply compute these two probabilities and predict based on which one is greater
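
To make the comparison concrete, here is a minimal Python sketch of that decision rule; the prior and likelihood numbers are invented for illustration, since the slides do not give any:

```python
# Minimal sketch of the Bayes classifier decision for the 5-vs-6 task.
# The prior and likelihood values below are made up for illustration only.

prior = {5: 0.5, 6: 0.5}             # P(Y=5), P(Y=6)
likelihood = {5: 1e-12, 6: 4e-12}    # P(X1,...,Xn | Y) for the observed pixels

# Unnormalized posteriors: P(X1,...,Xn | Y) * P(Y).
score = {y: likelihood[y] * prior[y] for y in (5, 6)}

# The normalization constant P(X1,...,Xn) is the same for both classes,
# so comparing the unnormalized scores gives the same prediction.
prediction = max(score, key=score.get)
posterior_5 = score[5] / (score[5] + score[6])

print(prediction)    # 6
print(posterior_5)   # 0.2
```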
Model Parameters

 For the Bayes classifier, we need to “learn” two functions, the likelihood and the prior

 How many parameters are required to specify the prior for our digit recognition example?
Model Parameters

 How many parameters are required to specify the likelihood?
 (Supposing that each image is 30x30 pixels)
Model Parameters

 The problem with explicitly modeling P(X1,…,Xn|Y) is that there are usually way too many parameters:
 We’ll run out of space
 We’ll run out of time
 And we’ll need tons of training data (which is usually not available)


The Naïve Bayes Model

 The Naïve Bayes Assumption: Assume that all features are independent given the class label Y
 Equationally speaking:

   P(X1,…,Xn | Y) = P(X1 | Y) × P(X2 | Y) × … × P(Xn | Y)

 (We will discuss the validity of this assumption later)


Why is this useful?

 # of parameters for modeling P(X1,…,Xn|Y):
 2(2^n − 1)   (a full joint over n binary features needs 2^n − 1 free parameters, for each of the 2 classes)

 # of parameters for modeling P(X1|Y),…,P(Xn|Y):
 2n   (one parameter per feature per class)
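
A quick sanity check of these counts for the 30x30 digit example (n = 900 binary pixels, two classes); this snippet is just arithmetic and is not part of the original slides:

```python
n = 30 * 30  # number of binary pixel features in a 30x30 image

# Full joint likelihood P(X1,...,Xn | Y): 2^n - 1 free parameters per class, 2 classes.
full_joint = 2 * (2**n - 1)

# Naive Bayes: one parameter P(Xi=1 | Y=y) per feature per class.
naive_bayes = 2 * n

print(len(str(full_joint)))  # 272 -- the full joint needs a 272-digit number of parameters
print(naive_bayes)           # 1800
```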
Naïve Bayes Training

 Now that we’ve decided to use a Naïve Bayes classifier, we need to train it with some data:

 [Figure: sample images from the MNIST training set]
Naïve Bayes Training

 Training in Naïve Bayes is easy:
 Estimate P(Y=v) as the fraction of records with Y=v
 Estimate P(Xi=u|Y=v) as the fraction of records with Y=v for which Xi=u

 (This corresponds to Maximum Likelihood estimation of model parameters)
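
A minimal training sketch in Python/NumPy, assuming the binary features sit in an array `X` of shape (num_records, n) and the labels in `y`; the function and variable names here are hypothetical, not from the slides:

```python
import numpy as np

def train_naive_bayes(X, y):
    """Maximum likelihood estimates for Naive Bayes with binary features.

    X: (num_records, n) array of 0/1 features
    y: (num_records,) array of class labels
    Returns P(Y=v) and, per class, the vector of P(Xi=1 | Y=v).
    """
    priors, feature_probs = {}, {}
    for v in np.unique(y):
        rows = X[y == v]
        priors[v] = len(rows) / len(X)        # fraction of records with Y=v
        feature_probs[v] = rows.mean(axis=0)  # fraction of those records with Xi=1
    return priors, feature_probs
```

For the binary digit task, `rows.mean(axis=0)` is exactly the "averaging of all the training fives (or sixes)" mentioned a couple of slides below.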
Naïve Bayes Training

 In practice, some of these counts can be zero
 Fix this by adding “virtual” counts:

 (This is like putting a prior on parameters and doing MAP estimation instead of MLE)
 This is called Smoothing
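
A hedged sketch of what "virtual counts" can look like for the binary-feature estimate above (standard add-k / Laplace smoothing; the constant k is a free choice, commonly k = 1, and nothing here is taken verbatim from the slides):

```python
import numpy as np

def feature_probs_smoothed(rows, k=1.0):
    """Estimate P(Xi=1 | Y=v) from the records with label v, adding k 'virtual'
    counts to each of the two possible feature values so no estimate is exactly zero."""
    count_ones = rows.sum(axis=0)              # records with Xi=1 among those with Y=v
    total = rows.shape[0]                      # all records with Y=v
    return (count_ones + k) / (total + 2 * k)  # +k for Xi=1 and +k for Xi=0
```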
Naïve Bayes Training

 For binary digits, training amounts to averaging all of the training fives together and all of the training sixes together.
Naïve Bayes Classification
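
This slide shows the classification step pictorially; as a rough sketch (reusing the hypothetical `priors` and `feature_probs` from the training snippet above), classifying a new binary image x means picking the class with the larger P(Y=v) · P(X1=x1|Y=v) · … · P(Xn=xn|Y=v):

```python
import numpy as np

def classify(x, priors, feature_probs):
    """Return the class v maximizing P(Y=v) * prod_i P(Xi=xi | Y=v)."""
    best_class, best_score = None, -1.0
    for v, prior in priors.items():
        p = feature_probs[v]                      # P(Xi=1 | Y=v) for each pixel i
        per_pixel = np.where(x == 1, p, 1.0 - p)  # P(Xi=xi | Y=v)
        score = prior * per_pixel.prod()          # can underflow for many pixels --
        if score > best_score:                    # see the Numerical Stability slides below
            best_class, best_score = v, score
    return best_class
```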
Outputting Probabilities

 What’s nice about Naïve Bayes (and generative models in general) is that it returns probabilities
 These probabilities can tell us how confident the algorithm is
 So… don’t throw away those probabilities!
Performance on a Test Set

 Naïve Bayes is often a good choice if you don’t have much training data!
Naïve Bayes Assumption

 Recall the Naïve Bayes assumption:

 that all features are independent given the class label Y

 Does this hold for the digit recognition problem?


Exclusive-OR Example
 For an example where conditional independence fails:
 Y = XOR(X1, X2)

 X1   X2   P(Y=0|X1,X2)   P(Y=1|X1,X2)
 0    0    1              0
 0    1    0              1
 1    0    0              1
 1    1    1              0
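
To see the failure concretely, here is a small check (mine, not from the slides): train Naive Bayes on the four equally likely XOR rows and every estimated probability comes out 0.5, so the classifier is no better than a coin flip even though Y is a deterministic function of (X1, X2).

```python
import numpy as np

# The four XOR rows, each equally likely: columns are X1, X2, and Y = XOR(X1, X2).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

for v in (0, 1):
    rows = X[y == v]
    print(v, len(rows) / len(X), rows.mean(axis=0))
# Prints 0.5 for every estimate, for both classes:
#   0 0.5 [0.5 0.5]
#   1 0.5 [0.5 0.5]
# With all P(Xi|Y) equal to 0.5, Naive Bayes outputs P(Y=0|X1,X2) = P(Y=1|X1,X2) = 0.5
# for every input, so it cannot represent XOR.
```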
 Actually, the Naïve Bayes assumption is almost never true

 Still… Naïve Bayes often performs surprisingly well even when its assumptions do not hold
Numerical Stability

 It is often the case that machine learning algorithms need to work with very small numbers
 Imagine computing the probability of 2000 independent coin flips

 MATLAB thinks that (.5)^2000 = 0
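
The same underflow is easy to reproduce in Python (a quick check, not tied to MATLAB):

```python
import math

p = 0.5 ** 2000
print(p)                       # 0.0 -- underflows in double precision

log_p = 2000 * math.log(0.5)
print(log_p)                   # about -1386.3 -- perfectly representable as a log
```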
Numerical Stability

 Instead of comparing P(Y=5|X1,…,Xn) with P(Y=6|X1,…,Xn),
 Compare their logarithms
Recovering the Probabilities

 Suppose that for some constant K, we have:
   log P(Y=5|X1,…,Xn) + K

 And
   log P(Y=6|X1,…,Xn) + K

 How would we recover the original probabilities?


Recovering the Probabilities

 Given: log P(Y=v|X1,…,Xn) + K for each class v
 Then for any constant C, exponentiating the shifted scores and normalizing gives back the same probabilities:

   P(Y=v|X1,…,Xn) = exp(log P(Y=v|X1,…,Xn) + K + C) / Σ_v′ exp(log P(Y=v′|X1,…,Xn) + K + C)

 One suggestion: set C such that the greatest of these log scores is shifted to zero:

   C = −max_v [ log P(Y=v|X1,…,Xn) + K ]
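
A sketch of that recovery in Python, under the reading above: shift every log score so the largest becomes zero, exponentiate, and renormalize (the function name and example numbers are mine):

```python
import math

def recover_probabilities(log_scores):
    """Turn log-posterior-plus-unknown-constant values into normalized probabilities.

    Subtracting max(scores) from every score leaves the ratios unchanged but keeps
    the largest exponent at exp(0) = 1, so nothing underflows.
    """
    shift = max(log_scores.values())
    exp_scores = {v: math.exp(s - shift) for v, s in log_scores.items()}
    total = sum(exp_scores.values())
    return {v: e / total for v, e in exp_scores.items()}

# Example: log P(Y=5|X) + K = -1005, log P(Y=6|X) + K = -1000
print(recover_probabilities({5: -1005.0, 6: -1000.0}))
# {5: 0.0066..., 6: 0.9933...}
```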
Recap

 We defined a Bayes classifier but saw that it’s intractable to compute P(X1,…,Xn|Y)
 We then used the Naïve Bayes assumption – that everything is independent given the class label Y

 A natural question: is there some happy compromise where we only assume that some features are conditionally independent?
 Stay Tuned…
Conclusions

 Naïve Bayes is:
 Really easy to implement and often works well
 Often a good first thing to try
 Commonly used as a “punching bag” for smarter algorithms
