Naive Bayes

• Naive Bayes is a probabilistic supervised machine learning algorithm that can be used in a wide variety of classification tasks.
• Typical applications include spam filtering, document classification, and sentiment prediction. It is based on the work of Thomas Bayes (1702–61), hence the name.
Why is it called ‘Naive’?
• It is called naive because it assumes that the features that go into the model are independent of each other (given the class).
• That is, changing the value of one feature does not directly influence or change the value of any of the other features used in the algorithm.
Pros
• It is easy and fast to predict the class of a test data set.
• It also performs well in multi-class prediction.
• When the independence assumption holds, a Naive Bayes classifier can perform better than other models such as logistic regression, and it needs less training data.
• It performs well with categorical input variables compared to numerical variables. For numerical variables, a normal distribution (bell curve) is assumed, which is a strong assumption.
When to use Naive Bayes
• When the data set is labelled
• When the data set is large
• When the attributes are independent
Cons
• If a categorical variable has a category in the test data set that was not observed in the training data set, the model assigns it a probability of 0 (zero) and is unable to make a prediction. This is often known as the “zero frequency” problem. To solve it, we can use a smoothing technique. One of the simplest smoothing techniques is Laplace estimation.
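The zero-frequency problem and its fix can be sketched in a few lines. This is a minimal illustration of add-alpha (Laplace) smoothing, not a full classifier. The helper name `smoothed_likelihoods` and the feature values are hypothetical and chosen only for the example.

```python
from collections import Counter

def smoothed_likelihoods(values, categories, alpha=1.0):
    """Estimate P(value | class) with add-alpha (Laplace) smoothing.

    `values` are the observed feature values within one class and
    `categories` lists every value the feature can take.
    """
    counts = Counter(values)
    total = len(values) + alpha * len(categories)
    return {c: (counts[c] + alpha) / total for c in categories}

# Hypothetical training values of one feature within a single class;
# "Overcast" never appears in this class.
observed = ["Sunny", "Sunny", "Rainy", "Rainy", "Rainy"]
categories = ["Sunny", "Overcast", "Rainy"]

unsmoothed = smoothed_likelihoods(observed, categories, alpha=0.0)
smoothed = smoothed_likelihoods(observed, categories, alpha=1.0)

print(unsmoothed["Overcast"])  # 0.0, which would zero out the whole product
print(smoothed["Overcast"])    # (0 + 1) / (5 + 3) = 0.125
```

With alpha = 1 every category receives a small non-zero probability, and the smoothed estimates still sum to 1.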
• Naive Bayes is also known to be a poor probability estimator, so the probability outputs from predict_proba should not be taken too seriously.
• Another limitation of Naive Bayes is the assumption of independent predictors. In real life, it is almost impossible to get a set of predictors that are completely independent.
Applications of Naive Bayes Algorithms
• Real-time prediction: Naive Bayes is an eager learning classifier and it is fast, so it can be used for making predictions in real time.
• Multi-class prediction: The algorithm is also well known for multi-class prediction; it can predict the probability of multiple classes of the target variable.
• Text classification / spam filtering / sentiment analysis
• Recommendation systems
What is Conditional Probability?
• Coin Toss
When you flip a fair coin, there is an equal chance of getting either heads or tails, so the probability of getting heads is 50%.
• Fair Dice
Similarly, what is the probability of getting a 1 when you roll a die with 6 faces? Assuming the die is fair, the probability is 1/6 ≈ 0.167.
• Playing Cards
• If you pick a card from the deck, what is the probability of getting a queen given that the card is a spade?
• The condition that the card is a spade has already been set, so the denominator (eligible population) is 13 and not 52. Since there is only one queen among the spades, the probability that the card is a queen given that it is a spade is 1/13 ≈ 0.077.
• This is a classic example of conditional probability. The conditional probability of A given B denotes the probability of A occurring given that B has already occurred.
• Mathematically, the conditional probability of A given B can be computed as:
P(A|B) = P(A AND B) / P(B)
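The playing-card example can be checked by counting over an explicit deck. The sketch below is illustrative only; the helper names (`prob`, `is_queen`, `is_spade`) are hypothetical. It computes the conditional probability both by restricting the eligible population and by the formula P(A AND B) / P(B).

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Spades", "Hearts", "Diamonds", "Clubs"]
DECK = list(product(RANKS, SUITS))  # 52 equally likely (rank, suit) cards

def is_queen(card):
    return card[0] == "Q"

def is_spade(card):
    return card[1] == "Spades"

def prob(event, given=None):
    """P(event | given), computed by counting equally likely cards."""
    eligible = [c for c in DECK if given is None or given(c)]
    favourable = [c for c in eligible if event(c)]
    return Fraction(len(favourable), len(eligible))

# Direct route: restrict the population to spades, then count queens.
p_queen_given_spade = prob(is_queen, given=is_spade)

# Formula route: P(A AND B) / P(B).
p_joint = prob(lambda c: is_queen(c) and is_spade(c))
p_spade = prob(is_spade)

print(p_queen_given_spade)  # 1/13
print(p_joint / p_spade)    # 1/13
```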
• School Example
• Let’s look at a slightly more complicated example. Consider a school with a total population of 100 persons. These 100 persons can be seen either as ‘Students’ and ‘Teachers’ or as a population of ‘Males’ and ‘Females’.
• Given a tabulation of the 100 people by role and gender, what is the conditional probability that a certain member of the school is a ‘Teacher’ given that he is a ‘Man’?
• To calculate this, you may intuitively filter the sub-population of 60 males and focus on the 12 (male) teachers.
• So the required conditional probability is P(Teacher | Male) = 12 / 60 = 0.2.
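The same result follows from the conditional probability formula, using only the counts stated in the example (12 male teachers and 60 males out of 100 people):

```latex
P(\text{Teacher} \mid \text{Male})
  = \frac{P(\text{Teacher} \cap \text{Male})}{P(\text{Male})}
  = \frac{12/100}{60/100}
  = 0.2
```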
Bayes Rule
• The Bayes Rule that we use for Naive Bayes can be derived from the two conditional probabilities P(A|B) and P(B|A).
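The notations themselves are not reproduced in the text. Assuming they are the two conditional probability definitions, the derivation can be sketched as follows:

```latex
\begin{aligned}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)}, \qquad
P(B \mid A) = \frac{P(A \cap B)}{P(A)} \\
\Rightarrow\; P(A \cap B) &= P(A \mid B)\,P(B) = P(B \mid A)\,P(A) \\
\Rightarrow\; P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)}
\end{aligned}
```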
Bayes’ theorem
• Bayes’ theorem describes the probability of an event based on prior knowledge of conditions that might be related to the event.
• In P(A|B) = P(B|A) * P(A) / P(B), the denominator P(B) is known as the marginal probability.
Naive Bayes theorem
Example
• Let P(Fire) be how often there is a fire and P(Smoke) be how often we see smoke. Then:
• P(Fire|Smoke) is how often there is a fire when we see smoke.
• P(Smoke|Fire) is how often we see smoke when there is a fire.
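To make the relationship between these quantities concrete, here is a small sketch that applies Bayes' rule to the fire and smoke example. The text gives no numbers, so the three probabilities below are hypothetical values chosen only for illustration, and the function name `bayes_posterior` is likewise an assumption.

```python
def bayes_posterior(likelihood, prior, evidence):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if evidence <= 0:
        raise ValueError("P(B) must be positive")
    return likelihood * prior / evidence

# Hypothetical numbers, not taken from the source:
p_fire = 0.01              # P(Fire): fires are rare
p_smoke = 0.10             # P(Smoke): smoke is fairly common
p_smoke_given_fire = 0.90  # P(Smoke|Fire): most fires produce smoke

p_fire_given_smoke = bayes_posterior(p_smoke_given_fire, p_fire, p_smoke)
print(round(p_fire_given_smoke, 2))  # 0.09
```

Under these assumed values, seeing smoke raises the probability of fire from 1% to about 9%, which is still far below P(Smoke|Fire).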
Example
• today = (Sunny, Cool, High, True): Play or Not?
P(Play = yes | X(today)) = P(X(today) | Play = yes) * P(Play = yes) / P(X(today))
P(Play = no | X(today)) = P(X(today) | Play = no) * P(Play = no) / P(X(today))
Example
• today = (Sunny, Hot, Normal, False): Play or Not?
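The training table for the Play examples is not reproduced in the text. As a stand-in, the sketch below assumes the widely used 14-row "play tennis" toy data set with columns (outlook, temperature, humidity, windy, play); if the original table differs, the numbers will differ too. The function names are hypothetical. It applies the naive Bayes formula: multiply the class prior by the per-feature conditional probabilities, then normalise by the sum over classes, which plays the role of P(X(today)).

```python
from collections import Counter, defaultdict

# Assumption: classic play-tennis toy data, not taken from the source.
# Columns: outlook, temperature, humidity, windy, play
ROWS = [
    ("Sunny", "Hot", "High", False, "no"),
    ("Sunny", "Hot", "High", True, "no"),
    ("Overcast", "Hot", "High", False, "yes"),
    ("Rainy", "Mild", "High", False, "yes"),
    ("Rainy", "Cool", "Normal", False, "yes"),
    ("Rainy", "Cool", "Normal", True, "no"),
    ("Overcast", "Cool", "Normal", True, "yes"),
    ("Sunny", "Mild", "High", False, "no"),
    ("Sunny", "Cool", "Normal", False, "yes"),
    ("Rainy", "Mild", "Normal", False, "yes"),
    ("Sunny", "Mild", "Normal", True, "yes"),
    ("Overcast", "Mild", "High", True, "yes"),
    ("Overcast", "Hot", "Normal", False, "yes"),
    ("Rainy", "Mild", "High", True, "no"),
]

def train(rows):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(row[-1] for row in rows)
    feature_counts = defaultdict(Counter)  # (feature index, class) -> counts
    for *features, label in rows:
        for i, value in enumerate(features):
            feature_counts[(i, label)][value] += 1
    return class_counts, feature_counts

def posterior(x, class_counts, feature_counts):
    """Return P(class | x) for every class under the naive assumption."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for i, value in enumerate(x):
            score *= feature_counts[(i, label)][value] / n  # P(x_i | class)
        scores[label] = score
    evidence = sum(scores.values())  # stands in for P(x)
    return {label: s / evidence for label, s in scores.items()}

class_counts, feature_counts = train(ROWS)

today_1 = ("Sunny", "Cool", "High", True)
today_2 = ("Sunny", "Hot", "Normal", False)
post_1 = posterior(today_1, class_counts, feature_counts)
post_2 = posterior(today_2, class_counts, feature_counts)

print(max(post_1, key=post_1.get), post_1)  # "no" is more probable
print(max(post_2, key=post_2.get), post_2)  # "yes" is more probable
```

No smoothing is applied here, so a feature value never seen with a class would force that class's score to zero.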
Probability Basics
• A probability function P() returns the probability of an event.
• Probability functions for categorical features are referred to as probability mass functions.
• Probability functions for continuous features are known as probability density functions.
• A joint probability is the probability of an assignment of specific values to multiple different features, for example P(a=T, b=T, c=F).
• A conditional probability is the probability of one feature taking a specific value given that we already know the value of a different feature, written P(a|b).
• A probability distribution is a data structure that describes the probability of each possible value a feature can take; the probabilities sum to 1, for example P(H) + P(T) = 1 for a coin toss.
• A joint probability distribution is a probability distribution over more than one feature assignment.
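These definitions can be illustrated with a joint probability distribution stored as a plain dictionary. The distribution over two binary features a and b below is hypothetical, and the helper names are assumptions made for the sketch.

```python
from fractions import Fraction

# Hypothetical joint distribution P(a, b) over two binary features.
JOINT = {
    (True, True): Fraction(2, 10),
    (True, False): Fraction(3, 10),
    (False, True): Fraction(1, 10),
    (False, False): Fraction(4, 10),
}

def marginal_b(b):
    """P(b): sum the joint probabilities over every value of a."""
    return sum(p for (_, b_val), p in JOINT.items() if b_val == b)

def conditional_a_given_b(a, b):
    """P(a | b) = P(a, b) / P(b)."""
    return JOINT[(a, b)] / marginal_b(b)

print(sum(JOINT.values()))                # 1, a valid distribution
print(marginal_b(True))                   # 3/10
print(conditional_a_given_b(True, True))  # 2/3
```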
• Product Rule:
• Chain Rule:
• Bayesian Theorem: Dependent Features
• Naive Bayesian Theorem: Independent Features
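The formulas for these four rules are not reproduced in the text. In standard notation, and assuming a class variable c with features x_1, ..., x_n (symbols introduced here for illustration), they are commonly written as:

```latex
\begin{aligned}
\text{Product rule:}\quad & P(a, b) = P(a \mid b)\,P(b) \\
\text{Chain rule:}\quad & P(x_1, \dots, x_n)
   = P(x_1)\prod_{i=2}^{n} P(x_i \mid x_1, \dots, x_{i-1}) \\
\text{Bayes' theorem (dependent features):}\quad & P(c \mid x_1, \dots, x_n)
   = \frac{P(x_1, \dots, x_n \mid c)\,P(c)}{P(x_1, \dots, x_n)} \\
\text{Naive Bayes (independent features):}\quad & P(c \mid x_1, \dots, x_n)
   = \frac{P(c)\prod_{i=1}^{n} P(x_i \mid c)}{P(x_1, \dots, x_n)}
\end{aligned}
```

The only difference between the last two lines is that the naive version replaces the joint likelihood with a product of per-feature likelihoods.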
Bayesian Prediction: Example
• P(t) = 0.3333 < P(f) = 0.6667, so the prediction is FALSE.
