Naïve Bayes Classification
Dr. S. Domnic
Things We’d Like to Do
• Spam Classification
– Given an email, predict whether it is spam or not
• Medical Diagnosis
– Given a list of symptoms, predict whether a patient has
disease X or not
• Weather
– Based on temperature, humidity, etc… predict if it will rain
tomorrow
Bayesian Classification
• Problem statement:
– Given features X1,X2,…,Xn
– Predict a label Y
Application
• Digit Recognition
[Figure: an image of a handwritten digit is fed to a classifier, which outputs “5”]
• X1,…,Xn ∈ {0,1} (black vs. white pixels)
• Y ∈ {5,6} (predict whether a digit is a 5 or a 6)
The Bayes Classifier
• A good strategy is to predict the most probable class:
  ŷ = argmax over y of P(Y = y | X1, …, Xn)
– (for example: what is the probability that the image
represents a 5 given its pixels?)
• So … How do we compute that?
The Bayes Classifier
• Use Bayes’ rule!
  P(y|X) = P(X|y) · P(y) / P(X)
  (likelihood × prior / normalization constant)
The Bayes Classifier
• P(y|X) is the posterior probability of the class (y, target)
given the predictor (X, attributes).
• P(y) is the prior probability of the class.
• P(X|y) is the likelihood: the probability of the
predictor given the class.
• P(X) is the prior probability of the predictor (the evidence).
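As a quick illustration, Bayes’ rule can be evaluated numerically. This is a minimal sketch with made-up values; the helper name `posterior` is hypothetical, not from the slides:

```python
# Numeric illustration of Bayes' rule with made-up values
# (hypothetical helper, not from the slides).
def posterior(likelihood, prior, evidence):
    """P(y|X) = P(X|y) * P(y) / P(X)."""
    return likelihood * prior / evidence

# Example: P(X|y)=0.6, P(y)=0.3, P(X)=0.5
print(posterior(0.6, 0.3, 0.5))   # ≈ 0.36
```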
The Bayes Classifier
The Naïve Bayes Model
• The Naïve Bayes Assumption: assume that all features are
independent given the class label Y
• By applying the chain rule and conditional independence:
  P(X1, …, Xn | Y) = P(X1|Y) · P(X2|Y) ⋯ P(Xn|Y)
The Bayes Classifier
By applying the chain rule and conditional independence
(independent features):
  P(y | x1, …, xn) ∝ P(y) · P(x1|y) · P(x2|y) ⋯ P(xn|y)
Therefore, we predict the class value (y) with the
maximum posterior probability:
  ŷ = argmax over y of P(y) · ∏ P(xi|y)
Naïve Bayesian Classifier
Algorithm: Naïve Bayesian Classification
CS 40003: Data Analytics
Naïve Bayesian Classifier
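The algorithm can be sketched in a few lines of Python. This is a minimal sketch under the slides’ setting (categorical features, relative-frequency estimates, no smoothing); the function names `train` and `predict` are my own, not from the slides:

```python
from collections import Counter, defaultdict

def train(instances, labels):
    """Learning phase: count classes and (feature, value, class) pairs."""
    prior = Counter(labels)                 # class counts for P(y)
    cond = defaultdict(Counter)             # counts for P(x_i | y)
    for x, y in zip(instances, labels):
        for i, v in enumerate(x):
            cond[(i, y)][v] += 1
    return prior, cond, len(labels)

def predict(x, prior, cond, n):
    """MAP rule: pick the class maximizing P(y) * prod_i P(x_i|y)."""
    best, best_p = None, -1.0
    for y, cy in prior.items():
        p = cy / n                          # prior P(y)
        for i, v in enumerate(x):
            p *= cond[(i, y)][v] / cy       # likelihood P(x_i|y)
        if p > best_p:
            best, best_p = y, p
    return best
```

With the Play-Tennis data from the next slides, `predict` on the test instance (Sunny, Cool, High, Strong) returns “No”, matching the worked example.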
Example
• Example: Play Tennis
In this dataset, there are four attributes
A = [ Outlook, Temperature, Humidity, Wind]
with 14 Instances.
The categories of classes are:
C= [Yes, No]
Example
• Learning Phase
Outlook    Play=Yes  Play=No      Temperature  Play=Yes  Play=No
Sunny        2/9       3/5        Hot            2/9       2/5
Overcast     4/9       0/5        Mild           4/9       2/5
Rain         3/9       2/5        Cool           3/9       1/5

Humidity   Play=Yes  Play=No      Wind         Play=Yes  Play=No
High         3/9       4/5        Strong         3/9       3/5
Normal       6/9       1/5        Weak           6/9       2/5
P(Play=Yes) = 9/14 P(Play=No) = 5/14
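The fractions in these tables come from simple counting. A short sketch for one of them, using the Outlook column of the standard 14-instance Play-Tennis data (an assumption, since the slide shows only the derived fractions):

```python
from collections import Counter

# Outlook column and class labels of the standard Play-Tennis data
# (assumed; the slide lists only the summary tables).
outlook = ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
           "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"]
play    = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
           "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

# Count (value, class) pairs, then form P(Outlook=Sunny | Play=Yes).
counts = Counter(zip(outlook, play))
n_yes = play.count("Yes")
print(counts[("Sunny", "Yes")], "/", n_yes)   # prints: 2 / 9
```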
Example
• Test Phase
– Given a new instance,
x’=(Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)
Find P(Yes|x’), P(No|x’)
– Look up tables
P(Outlook=Sunny|Play=Yes) = 2/9      P(Outlook=Sunny|Play=No) = 3/5
P(Temperature=Cool|Play=Yes) = 3/9   P(Temperature=Cool|Play=No) = 1/5
P(Humidity=High|Play=Yes) = 3/9      P(Humidity=High|Play=No) = 4/5
P(Wind=Strong|Play=Yes) = 3/9        P(Wind=Strong|Play=No) = 3/5
P(Play=Yes) = 9/14                   P(Play=No) = 5/14
– MAP rule
P(Yes|x’) ∝ [P(Sunny|Yes)P(Cool|Yes)P(High|Yes)P(Strong|Yes)]·P(Play=Yes) = 0.0053
P(No|x’) ∝ [P(Sunny|No)P(Cool|No)P(High|No)P(Strong|No)]·P(Play=No) = 0.0206
Since P(Yes|x’) < P(No|x’), we label x’ to be “No”.
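The two MAP scores above can be checked with a few lines of arithmetic (a sketch using the fractions looked up from the tables):

```python
# Reproducing the two MAP scores from the looked-up fractions.
p_yes = (2/9) * (3/9) * (3/9) * (3/9) * (9/14)   # P(x'|Yes) * P(Play=Yes)
p_no  = (3/5) * (1/5) * (4/5) * (3/5) * (5/14)   # P(x'|No)  * P(Play=No)
print(round(p_yes, 4), round(p_no, 4))           # prints: 0.0053 0.0206
```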
The Bayes Classifier
• Let’s expand this for the digit recognition task:
  P(Y=5 | X1, …, Xn) ∝ P(Y=5) · ∏ P(Xi | Y=5)
  P(Y=6 | X1, …, Xn) ∝ P(Y=6) · ∏ P(Xi | Y=6)
• To classify, we’ll simply compute these two probabilities and predict
based on which one is greater
Naïve Bayes Training
• Now that we’ve decided to use a Naïve Bayes classifier, we need to train it
with some data:
MNIST Training Data
Naïve Bayes Training
X1, X2 ∈ {≤25%, >25%–50%, >50%–75%, >75%} (black vs. white pixel fractions)
S.No  Black  White  Class
 1    25%    75%      5
 2    30%    70%      6
 3    20%    >75%     6
 4    35%    65%      5
 5    >75%   10%      5
 6    20%    80%      6
 7    55%    45%      5
 8    34%    66%      5
 9    25%    75%      6
10    >75%   20%      6
Example
• Learning Phase
Black      Digit=5  Digit=6      White      Digit=5  Digit=6
≤25%         1/5      3/5        ≤25%         1/5      1/5
>25%–50%     2/5      1/5        >25%–50%     1/5      0/5
>50%–75%     1/5      0/5        >50%–75%     3/5      2/5
>75%         1/5      1/5        >75%         0/5      2/5

P(Digit=5) = 5/10                P(Digit=6) = 5/10
Example
• Test Phase
– Given a new instance,
x’=(Black=30%, White=70%)
Find P(5|x’), P(6|x’)
– Look up tables
P(Black=25%–50%|Digit=5) = 2/5     P(Black=25%–50%|Digit=6) = 1/5
P(White=50%–75%|Digit=5) = 3/5     P(White=50%–75%|Digit=6) = 2/5
P(Digit=5) = 5/10                  P(Digit=6) = 5/10
– MAP rule
P(5|x’) ∝ P(Black=25%–50%|Digit=5)·P(White=50%–75%|Digit=5)·P(Digit=5) = 0.12
P(6|x’) ∝ P(Black=25%–50%|Digit=6)·P(White=50%–75%|Digit=6)·P(Digit=6) = 0.04
Since P(5|x’) > P(6|x’), we label x’ to be “5”.
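As with the Play-Tennis example, the two digit scores can be verified directly (a sketch using the fractions looked up from the learning-phase tables):

```python
# Reproducing the digit example's MAP scores from the looked-up fractions.
p5 = (2/5) * (3/5) * (5/10)   # P(Black bin|5) * P(White bin|5) * P(Digit=5)
p6 = (1/5) * (2/5) * (5/10)   # P(Black bin|6) * P(White bin|6) * P(Digit=6)
print(round(p5, 2), round(p6, 2))   # prints: 0.12 0.04
```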
Questions?