
MACHINE LEARNING SESSION

NAÏVE BAYES CLASSIFIER


Bayes Theorem

P(A|B) = P(B|A) P(A) / P(B)

• P(A|B) – Posterior probability, the probability of A given B
• P(A) – Prior probability of A
• P(B|A) – Conditional probability, the probability of B given A
• P(B) – Marginal probability of B
Bayes Theorem Example

• 5% of all mails received are spam; 2% of all mails contain the word "lottery"
• 20% of spam mails contain the word "lottery"
• Given that a mail contains the word "lottery", what is the probability that the mail is spam?
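
Applying Bayes' theorem with A = "the mail is spam" and B = "the mail contains the word lottery":

P(Spam | "lottery") = P("lottery" | Spam) × P(Spam) / P("lottery")
= (0.20 × 0.05) / 0.02
= 0.50

So, given that a mail contains the word "lottery", there is a 50% probability that it is spam.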
Naïve Bayes Classifier

• Uses Bayes' rule to find the probability that an observation belongs to a class Y, given the set of input features X1, …, Xn:

P(Y | X1, …, Xn) = P(X1, …, Xn | Y) P(Y) / P(X1, …, Xn)
Naïve Assumption in Naïve Bayes

• Computing the joint conditional probability P(X1, …, Xn | Y) directly is not feasible
• Assume the features are independent of each other, given the class label Y
• Under this assumption, the equation simplifies to:

P(Y | X1, …, Xn) ∝ P(Y) × P(X1 | Y) × P(X2 | Y) × … × P(Xn | Y)

The class Y with the highest value of this product is the predicted class (a minimal sketch follows below).
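
For illustration, here is a minimal Python sketch of this factorized scoring rule. The per-class priors and per-feature conditional probabilities are assumed to be given; the dictionaries priors and likelihoods below hold made-up example values (loosely based on the spam example above), not estimates from real data.

# Minimal sketch of the Naive Bayes scoring rule P(Y) * prod_i P(X_i | Y).
# priors[y] = P(Y = y); likelihoods[y][i][v] = P(X_i = v | Y = y).
# All numbers are illustrative placeholders, not estimates from real data.
priors = {"spam": 0.05, "ham": 0.95}
likelihoods = {
    "spam": [{"lottery": 0.20, "no_lottery": 0.80}],
    "ham":  [{"lottery": 0.01, "no_lottery": 0.99}],
}

def predict(x):
    """Return the class with the highest P(Y) * product of P(X_i | Y)."""
    scores = {}
    for y, prior in priors.items():
        score = prior
        for i, value in enumerate(x):
            score *= likelihoods[y][i][value]  # naive independence assumption
        scores[y] = score
    return max(scores, key=scores.get)

print(predict(["lottery"]))  # prints "spam" with these illustrative numbers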
Laplacian Smoothing

• If any of the probabilities P(Xi | Y) is zero, the overall predicted probability would be zero
• Use Laplace smoothing: add 1 to each count, so that

P(Xi = v | Y) = (count(Xi = v, Y) + 1) / (count(Y) + k), where k is the number of distinct values Xi can take
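
A small Python sketch of this smoothed estimate, assuming the categorical training data are available as (feature value, class label) pairs; the toy data and variable names below are illustrative only.

from collections import Counter

# Toy training data for one categorical feature (illustrative values only).
examples = [("lottery", "spam"), ("meeting", "ham"), ("meeting", "ham"), ("invoice", "ham")]

feature_values = {value for value, _ in examples}        # distinct values of the feature
class_counts = Counter(label for _, label in examples)   # count(Y = y)
pair_counts = Counter(examples)                          # count(Xi = v, Y = y)

def smoothed_likelihood(value, label):
    """Laplace-smoothed estimate of P(Xi = value | Y = label)."""
    k = len(feature_values)  # number of distinct values the feature can take
    return (pair_counts[(value, label)] + 1) / (class_counts[label] + k)

# Without smoothing, P("lottery" | "ham") would be 0 and wipe out the whole product;
# with smoothing it is small but non-zero.
print(smoothed_likelihood("lottery", "ham"))  # (0 + 1) / (3 + 3) ≈ 0.167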
Dealing with Numerical Predictors

• Either:
– Discretize the numerical predictor (however, this leads to loss of information; also, what should the cut-offs for discretization be?)
– Assume the feature is normally distributed and use the Gaussian probability density function to estimate P(Xi | Y)
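
A brief Python sketch of the second option: the numerical feature is assumed to be normally distributed within each class, and the Gaussian density replaces the categorical P(Xi | Y) in the product. The per-class means and standard deviations below are placeholder values.

import math

def gaussian_pdf(x, mean, std):
    """Gaussian probability density, used in place of a categorical P(Xi | Y)."""
    coeff = 1.0 / (std * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mean) ** 2) / (2.0 * std ** 2))

# Per-class mean and standard deviation of the numerical feature,
# normally estimated from the training data (placeholder values here).
class_params = {"spam": (12.0, 3.0), "ham": (5.0, 2.0)}

x = 10.0  # observed value of the numerical feature
for label, (mean, std) in class_params.items():
    print(label, gaussian_pdf(x, mean, std))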
