
Naive Bayes Classifier

1. Introduction
Naive Bayes is a probabilistic machine learning algorithm that can be
used in a wide variety of classification tasks. Typical applications include filtering spam, classifying documents, sentiment prediction, etc. It is based on the work of Rev. Thomas Bayes (1702-61), hence the name.

But why is it called ‘Naive’?

The name 'naive' is used because the model assumes that the features that go into it are independent of each other. That is, changing the value of one feature does not directly influence or change the value of any of the other features used in the algorithm.

Naive Bayes is a simple yet powerful algorithm. Because it is a probabilistic model, it can be coded up easily and its predictions made very quickly, quickly enough for real-time use. Because of this, it is easily scalable and is traditionally the algorithm of choice for real-world applications that are required to respond to users' requests instantaneously.

But before you go into Naive Bayes, you need to understand what 'Conditional Probability' is and what the 'Bayes Rule' is.

2. What is Conditional Probability?


Coin Toss and Fair Dice Example

When you flip a fair coin, there is an equal chance of getting either heads
or tails. So you can say the probability of getting heads is 50%.

Similarly, what would be the probability of getting a 1 when you roll a die with 6 faces? Assuming the die is fair, the probability is 1/6 = 0.166.

Playing Cards Example


If you pick a card from the deck, can you guess the probability of getting a
queen given the card is a spade?
Well, I have already set a condition that the card is a spade. So, the
denominator (eligible population) is 13 and not 52. And since there is only
one queen in spades, the probability that it is a queen given the card is a spade is 1/13 = 0.077.

This is a classic example of conditional probability. So, when you say the
conditional probability of A given B, it denotes the probability of A
occurring given that B has already occurred.

Mathematically, Conditional probability of A given B can be computed as:


P(A|B) = P(A AND B) / P(B)
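As a quick sanity check, here is a minimal Python sketch that builds a standard 52-card deck and computes this conditional probability by counting (the card encoding is just for illustration):

# Build a 52-card deck as (rank, suit) pairs and count to get P(Queen | Spade)
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['Spades', 'Hearts', 'Diamonds', 'Clubs']
deck = [(rank, suit) for rank in ranks for suit in suits]

spades = [card for card in deck if card[1] == 'Spades']        # eligible population: 13 cards
queen_and_spade = [card for card in spades if card[0] == 'Q']  # A AND B: 1 card

# P(Queen | Spade) = P(Queen AND Spade) / P(Spade) = (1/52) / (13/52) = 1/13
print(len(queen_and_spade) / len(spades))                      # 0.0769...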

School Example
Consider a school with a total population of 100 persons. These 100
persons can be seen either as ‘Students’ and ‘Teachers’ or as a population
of ‘Males’ and ‘Females’.
Given a tabulation of the 100 people by role and gender, what is the conditional probability that a certain member of the school is a 'Teacher' given that he is a 'Man'?

To calculate this, you may intuitively filter the sub-population of 60 males and focus on the 12 (male) teachers.
So the required conditional probability P(Teacher | Male) = 12 / 60 = 0.2.

This can be represented as the intersection of Teacher (A) and Male (B) divided by Male (B). Likewise, the conditional probability of B given A can be computed. The Bayes Rule that we use for Naive Bayes can be derived from these two notations.
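In symbols, the two notations are P(A|B) = P(A AND B) / P(B) and, likewise, P(B|A) = P(A AND B) / P(A). Rearranging the second gives P(A AND B) = P(B|A) * P(A), and substituting this into the first yields the Bayes Rule:

P(A|B) = P(B|A) * P(A) / P(B)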
3. The Bayes Rule
The Bayes Rule is a way of going from P(X|Y), known from the training
dataset, to find P(Y|X).
To do this, we replace A and B in the above formula, with the feature X and
response Y.
For observations in test or scoring data, the X would be known while Y is
unknown. And for each row of the test dataset, you want to compute the
probability of Y given the X has already happened.
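With X and Y substituted for A and B, the Bayes Rule derived above reads:

P(Y=k | X) = P(X | Y=k) * P(Y=k) / P(X)

where k is a particular class of Y.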
What happens if Y has more than 2 categories? We compute the probability of each class of Y and let the highest win.

4. The Naive Bayes


The Bayes Rule provides the formula for the probability of Y given X. But,
in real-world problems, you typically have multiple X variables.
When the features are independent, we can extend the Bayes Rule to what
is called Naive Bayes.
It is called ‘Naive’ because of the naive assumption that the X’s are
independent of each other. Regardless of its name, it’s a powerful
formula.

In technical jargon, the left-hand side (LHS) of the equation is understood as the posterior probability, or simply the posterior.
The RHS has 2 terms in the numerator.
The first term is called the 'Likelihood of Evidence'. It is nothing but the conditional probability of each X given that Y is of a particular class 'c'.
Since all the X's are assumed to be independent of each other, you can just multiply the 'likelihoods' of all the X's and call it the 'Probability of likelihood of evidence'. This is known from the training dataset by filtering records where Y=c.
The second term is called the prior, which is the overall probability of Y=c, where c is a class of Y. In simpler terms, Prior = count(Y=c) / n_Records.
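Putting these pieces together for features X1, X2, ..., Xn, the Naive Bayes formula described above can be written as:

P(Y=c | X1, ..., Xn) = [ P(X1|Y=c) * P(X2|Y=c) * ... * P(Xn|Y=c) ] * P(Y=c) / [ P(X1) * P(X2) * ... * P(Xn) ]

The denominator (the probability of evidence) is the same for every class c, so it can be dropped when you only need to know which class scores highest.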
An example is better than an hour of theory. So let’s see one.

5. Naive Bayes Example


Say you have 1000 fruits which could be either ‘banana’, ‘orange’ or ‘other’.
These are the 3 possible classes of the Y variable.
We have data for the following X variables, all of which are binary (1 or 0).
 Long
 Sweet
 Yellow
The first few rows of the training dataset look like this:
Fruit     Long (x1)   Sweet (x2)   Yellow (x3)
Orange        0            1            0
Banana        1            0            1
Banana        1            1            1
Other         1            1            0
...          ...          ...          ...
For the sake of computing the probabilities, let’s aggregate the training
data to form a counts table like this.
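To summarize the counts used in the calculations below: out of 1000 fruits there are 500 Bananas, 300 Oranges and 200 Other fruits; 500 of all the fruits are Long, 650 are Sweet and 800 are Yellow; and of the 500 Bananas, 400 are Long, 350 are Sweet and 450 are Yellow.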

So the objective of the classifier is to predict if a given fruit is a 'Banana' or 'Orange' or 'Other' when only the 3 features (long, sweet and yellow) are known.
Let's say you are given a fruit that is Long, Sweet and Yellow; can you predict what fruit it is?
This is the same as predicting the Y when only the X variables in the testing data are known. Let's solve it by hand using Naive Bayes.
The idea is to compute the 3 probabilities, that is the probability of the
fruit being a banana, orange or other. Whichever fruit type gets the highest
probability wins.
All the information to calculate these probabilities is present in the above
tabulation.
Step 1: Compute the ‘Prior’ probabilities for each of the class of fruits.
That is, the proportion of each fruit class out of all the fruits from the
population. You can provide the ‘Priors’ from prior information about the
population. Otherwise, it can be computed from the training data.
For this case, let’s compute from the training data. Out of 1000 records in
training data, you have 500 Bananas, 300 Oranges and 200 Others. So the
respective priors are 0.5, 0.3 and 0.2.
P(Y=Banana) = 500 / 1000 = 0.50
P(Y=Orange) = 300 / 1000 = 0.30
P(Y=Other) = 200 / 1000 = 0.20
Step 2: Compute the probability of evidence that goes in the
denominator.
This is nothing but the product of P(X) for each of the X variables. This is an optional step because the denominator is the same for all the classes and so does not affect which class gets the highest probability.
P(x1=Long) = 500 / 1000 = 0.50
P(x2=Sweet) = 650 / 1000 = 0.65
P(x3=Yellow) = 800 / 1000 = 0.80
Step 3: Compute the probability of likelihood of evidence that goes in the numerator.
It is the product of the conditional probabilities of the 3 features. If you refer back to the formula, it says P(X1|Y=k). Here X1 is 'Long' and k is 'Banana'. That means the probability the fruit is 'Long' given that it is a Banana. In the above counts, you have 500 Bananas; out of those, 400 are Long. So, P(Long | Banana) = 400/500 = 0.8.
Here, I have done it for Banana alone.
Probability of Likelihood for Banana
P(x1=Long | Y=Banana) = 400 / 500 = 0.80
P(x2=Sweet | Y=Banana) = 350 / 500 = 0.70
P(x3=Yellow | Y=Banana) = 450 / 500 = 0.90
So, the overall probability of Likelihood of evidence for Banana = 0.8 * 0.7
* 0.9 = 0.504
Step 4: Substitute all the 3 equations into the Naive Bayes formula, to get
the probability that it is a banana.
Similarly, you can compute the probabilities for ‘Orange’ and ‘Other fruit’.
The denominator is the same for all 3 cases, so it’s optional to compute.
Clearly, Banana gets the highest probability, so that will be our predicted
class.
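As a quick check, the same hand calculation for the Banana class can be written out in a few lines of Python (only the Banana counts are quoted above, so only that posterior is computed here):

# Hand calculation of P(Banana | Long, Sweet, Yellow) using the counts quoted above
prior_banana = 500 / 1000                            # P(Y=Banana) = 0.50
likelihood   = (400/500) * (350/500) * (450/500)     # P(Long|Banana) * P(Sweet|Banana) * P(Yellow|Banana) = 0.504
evidence     = (500/1000) * (650/1000) * (800/1000)  # P(Long) * P(Sweet) * P(Yellow) = 0.26

posterior_banana = likelihood * prior_banana / evidence
print(round(posterior_banana, 3))                    # ~0.969, the highest of the three classes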

6. Building Naive Bayes Classifier in Python


# Import packages
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()

# Import data
training = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/iris_train.csv')
test = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/iris_test.csv')

# Create the X, Y, Training and Test
xtrain = training.drop('Species', axis=1)
ytrain = training.loc[:, 'Species']
xtest = test.drop('Species', axis=1)
ytest = test.loc[:, 'Species']

# Init the Gaussian Classifier
model = GaussianNB()

# Train the model
model.fit(xtrain, ytrain)

# Predict Output
pred = model.predict(xtest)

# Plot Confusion Matrix
mat = confusion_matrix(pred, ytest)
names = np.unique(pred)
sns.heatmap(mat, square=True, annot=True, fmt='d', cbar=False,
            xticklabels=names, yticklabels=names)
plt.xlabel('Truth')
plt.ylabel('Predicted')
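If you also want a single summary number alongside the confusion matrix, one more line using scikit-learn's accuracy_score will do it:

# Optional: overall accuracy on the test set
from sklearn.metrics import accuracy_score
print("Accuracy:", accuracy_score(ytest, pred))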

Advantages
 It is easy and fast to predict the class of the test data set. It
also performs well in multi-class prediction.
 When the assumption of independence holds, a Naive Bayes classifier performs better compared to other models like logistic regression, and you need less training data.

 It performs well with categorical input variables compared to numerical variable(s). For numerical variables, a normal distribution is assumed (bell curve, which is a strong assumption).

Disadvantages
 If a categorical variable has a category in the test data set that was not observed in the training data set, the model will assign it a zero probability and will be unable to make a prediction. This is often known as the 'Zero Frequency' problem. To solve it, we can use a smoothing technique; one of the simplest is Laplace estimation (see the short sketch after this list).

 On the other hand, naive Bayes is also known to be a bad estimator, so its probability outputs are not to be taken too seriously.

 Another limitation of Naive Bayes is the assumption of independent predictors. In real life, it is almost impossible to get a set of predictors that are completely independent.
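To see how Laplace (add-one) smoothing removes the zero, here is a minimal sketch using weather counts of the same shape as the Play-Golf example later in this document ('Overcast' never occurs together with Play=No in training):

# Weather counts among the 5 training rows where Play = No
counts = {'Sunny': 3, 'Overcast': 0, 'Rainy': 2}
n_class = sum(counts.values())   # 5 rows with Play = No
k = len(counts)                  # 3 possible weather values

# Without smoothing, P(Overcast | No) = 0, which wipes out the whole product of likelihoods
unsmoothed = {v: c / n_class for v, c in counts.items()}

# Laplace (add-one) smoothing: add 1 to every count and k to the denominator
smoothed = {v: (c + 1) / (n_class + k) for v, c in counts.items()}

print(unsmoothed['Overcast'])    # 0.0
print(smoothed['Overcast'])      # 0.125 (= 1/8), no longer zero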

Applications
 Real time Prediction: Naive Bayes is an eager learning classifier and it is very fast. Thus, it can be used for making predictions in real time.
 Multi class Prediction: This algorithm is also well known for its multi-class prediction capability: we can predict the probability of each class of the target variable.

 Text classification / Spam Filtering / Sentiment Analysis: Naive Bayes classifiers are mostly used in text classification (due to good results in multi-class problems and the independence rule) and have a higher success rate compared to other algorithms. As a result, they are widely used in spam filtering (identifying spam e-mail) and sentiment analysis (in social media analysis, to identify positive and negative customer sentiment).

 Recommendation System: A Naive Bayes classifier and collaborative filtering together build a recommendation system that uses machine learning and data mining techniques to filter unseen information and predict whether a user would like a given resource or not.

When to use
 Text Classification

 When the dataset is huge

 When you have a small training set

Dataset

 Iris dataset

 Wine dataset

 Adult dataset
7. Practice Exercise: Predict Human Activity Recognition (HAR)
The objective of this practice exercise is to predict the current human activity based on physiological activity measurements across 53 different features in the HAR dataset. The training and test datasets are provided. Build a Naive Bayes model, predict on the test dataset and compute the confusion matrix.
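A minimal sketch of one possible solution, assuming the provided files have been saved locally as har_train.csv and har_test.csv and that the activity label sits in a column named 'activity' (both names are placeholders; adjust them to the actual dataset):

import pandas as pd
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix

# Placeholder file and column names -- substitute the actual HAR files and label column
train = pd.read_csv('har_train.csv')
test = pd.read_csv('har_test.csv')

xtrain, ytrain = train.drop('activity', axis=1), train['activity']
xtest, ytest = test.drop('activity', axis=1), test['activity']

model = GaussianNB().fit(xtrain, ytrain)
pred = model.predict(xtest)

print(confusion_matrix(ytest, pred))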

8. Another Example: Playing Golf
Now suppose you want to calculate the probability of playing golf when the weather is overcast and the temperature is mild.

P(Overcast|Yes)= 4/9=0.44

Frequency Table (Weather vs Play Golf)
              Yes   No
  Sunny        2     3
  Overcast     4     0
  Rainy        3     2

Likelihood Table (Weather vs Play Golf)
              Yes    No     P(X)
  Sunny       2/9    3/9    5/14
  Overcast    4/9    0/9    4/14    P(X=Overcast) = 4/14 = 0.2857
  Rainy       3/9    2/9    5/14
  P(Y)        9/14   5/14

P(Yes)=9/14=0.6428

P(Mild|Yes)= 4/9=0.44

Frequency Table (Temperature vs Play Golf)
              Yes   No
  Hot          2     2
  Mild         4     2
  Cool         3     1

Likelihood Table (Temperature vs Play Golf)
              Yes    No     P(X)
  Hot         2/9    2/9    4/14
  Mild        4/9    2/9    6/14    P(X=Mild) = 6/14 = 0.4285
  Cool        3/9    1/9    4/14
  P(Y)        9/14   5/14

P(Yes)=9/14=0.6428

Probability of playing:
P(Play= Yes | Weather=Overcast, Temp=Mild)

= P(Weather=Overcast, Temp=Mild | Play=Yes) P(Play=Yes) / (P(Overcast) * P(Mild)) ..........(1)

P(Weather=Overcast, Temp=Mild | Play= Yes)= P(Overcast |Yes) P(Mild |Yes) ………..(2)

1. Calculate Prior Probabilities: P(Yes)= 9/14 = 0.64

2. Calculate likelihood Probabilities: P(Overcast |Yes) = 4/9 = 0.44 P(Mild |Yes) = 4/9 = 0.44

3. Put likelihood probabilities in equation (2) P(Weather=Overcast, Temp=Mild | Play= Yes) = 0.44* 0.44=0.1936

4. Calculate evidence probabilities P(Overcast)=4/14=0.2857, P(Mild)=6/14=0.4285

5. P(Overcast)*P(Mild)= 0.2857*0.4285=0.122

6. P(Play=Yes | Weather=Overcast, Temp=Mild) = (0.1936*0.64)/0.122 ≈ 1.02, i.e. effectively 1 (the value exceeds 1 slightly only because the denominator P(Overcast)*P(Mild) is itself an approximation of the joint probability of the evidence)

Similarly, you can calculate the probability of not playing:

Probability of not playing:

P(Play=No | Weather=Overcast, Temp=Mild) = P(Weather=Overcast, Temp=Mild | Play=No) P(Play=No) / (P(Overcast) * P(Mild)) ..........(3)

P(Weather=Overcast, Temp=Mild | Play= No)= P(Weather=Overcast |Play=No) P(Temp=Mild | Play=No) ………..(4)

1. Calculate Prior Probabilities: P(No)= 5/14 = 0.36

2. Calculate likelihood Probabilities: P(Overcast |No) = 0/5 = 0 P(Mild |No) = 2/5 = 0.4

3. Put likelihood probabilities in equation (4): P(Weather=Overcast, Temp=Mild | Play=No) = 0 * 0.4 = 0

4. Calculate evidence probabilities P(Overcast)=4/14=0.2857, P(Mild)=6/14=0.4285

5. P(Overcast)*P(Mild)= 0.2857*0.4285=0.122

6. P(Play=No | Weather=Overcast, Temp=Mild) = (0 * 0.36)/0.122 = 0

The probability of the 'Yes' class is higher. So you can say that if the weather is overcast and the temperature is mild, the players will play the sport.

# Assigning features and label variables
wheather=['Sunny','Sunny','Overcast','Rainy','Rainy','Rainy','Overcast','Sunny','Sunny',
'Rainy','Sunny','Overcast','Overcast','Rainy']
temp=['Hot','Hot','Hot','Mild','Cool','Cool','Cool','Mild','Cool','Mild','Mild','Mild','Hot','Mild']

play=['No','No','Yes','Yes','Yes','No','Yes','No','Yes','Yes','Yes','Yes','Yes','No']
# Import LabelEncoder
from sklearn import preprocessing
#creating labelEncoder
le = preprocessing.LabelEncoder()
# Converting string labels into numbers.
wheather_encoded=le.fit_transform(wheather)
print("Wheather:",wheather_encoded)
# Converting string labels into numbers
temp_encoded=le.fit_transform(temp)
label=le.fit_transform(play)
print("Temp:",temp_encoded)
print("Play:",label)
# Combining weather and temp into a single list of tuples
features=list(zip(wheather_encoded,temp_encoded))
print(features)
#Import Gaussian Naive Bayes model
from sklearn.naive_bayes import GaussianNB

#Create a Gaussian Classifier
model = GaussianNB()

# Train the model using the training sets
model.fit(features,label)

#Predict Output
predicted= model.predict([[0,2]]) # 0:Overcast, 2:Mild
print("Predicted Value:", predicted)

Output
Wheather: [2 2 0 1 1 1 0 2 2 1 2 0 0 1]
Temp: [1 1 1 2 0 0 0 2 0 2 2 2 1 2]
Play: [0 0 1 1 1 0 1 0 1 1 1 1 1 0]
[(2, 1), (2, 1), (0, 1), (1, 2), (1, 0), (1, 0), (0, 0), (2, 2), (2, 0), (1, 2), (2, 2), (0, 2), (0,
1), (1, 2)]
Predicted Value: [1]
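As a side note, GaussianNB treats the label-encoded features above as continuous values. For purely categorical inputs like these, scikit-learn's CategoricalNB (available from version 0.22) is arguably the more natural choice; a minimal variant using the same features and label would be:

# Alternative: CategoricalNB works directly on label-encoded categorical features;
# alpha=1.0 applies Laplace smoothing to avoid zero probabilities.
from sklearn.naive_bayes import CategoricalNB

cat_model = CategoricalNB(alpha=1.0)
cat_model.fit(features, label)
print("Predicted Value:", cat_model.predict([[0, 2]]))  # 0: Overcast, 2: Mild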
To summarize the key points about the Naive Bayes classifier:
1. The Naive Bayes classifier assumes that the features are independent of each other.
2. A Naive Bayes classifier can be trained faster than other classification algorithms.
3. A Naive Bayes classifier model can predict faster than other classification algorithms.
4. A Naive Bayes classifier model can be updated with new training data without having to rebuild the model.
5. A Naive Bayes classifier model does not involve optimization of a cost function.
6. Naive Bayes classifier training does not involve epochs.
7. A Naive Bayes classifier model does not involve solving a matrix equation.
8. When the assumption of independence of features holds, a Naive Bayes classifier model performs better than other classifiers.
9. When the assumption of independence of features holds, a Naive Bayes classifier model needs less training data.
10. A Naive Bayes classifier model performs well with categorical input variables compared to numerical input variables.
