Module 3 NLP

The document provides an overview of Naïve Bayes classifiers, emphasizing their probabilistic nature and the assumption of feature independence. It explains the bag-of-words model used in text classification, detailing the process of training the classifier with labeled sentiment data and calculating prior probabilities and likelihoods with Laplace smoothing. The document outlines the steps involved in training the classifier, including vocabulary extraction and probability computation.


NATURAL LANGUAGE PROCESSING

Course Code: BAI601


Credits: 04
Ms. Y. Nikhila
Introduction to Naïve Bayes Classifiers
• Naïve Bayes is a probabilistic classifier based on Bayes' Theorem.
• It is called "naïve" because it assumes that features (words, in this case) are independent given the class.
• The Multinomial Naïve Bayes classifier is commonly used for text classification tasks such as spam filtering or sentiment analysis.
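A compact way to write this (the standard multinomial Naïve Bayes decision rule, added here as a sketch since the slide itself shows no formula): a document d with words w_1, ..., w_n is assigned the class ĉ that maximizes the posterior, and the "naïve" assumption factors the likelihood over the individual words.

  \hat{c} = \underset{c \in C}{\mathrm{argmax}}\ P(c \mid d) = \underset{c \in C}{\mathrm{argmax}}\ P(d \mid c)\, P(c)

  \hat{c}_{NB} = \underset{c \in C}{\mathrm{argmax}}\ P(c) \prod_{i=1}^{n} P(w_i \mid c)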
Bag-of-Words Model
• In text classification, we use the bag-of-words model:
  – Ignore the order of words.
  – Only count how many times each word appears in the document.
• Example:
  – Sentence: "I love this movie! It's amazing!"
  – Bag of words: {I: 1, love: 1, this: 1, movie: 1, it: 1, amazing: 1}
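To make this concrete, here is a minimal Python sketch of the idea. The lowercasing and the simple regex tokenizer are illustrative choices, not specified in the slides (this tokenizer keeps "it's" as one token, whereas the slide counts "it").

  from collections import Counter
  import re

  def bag_of_words(text):
      # Lowercase, keep only letters/apostrophes, and count each word (order is ignored)
      tokens = re.findall(r"[a-z']+", text.lower())
      return Counter(tokens)

  print(bag_of_words("I love this movie! It's amazing!"))
  # Counter({'i': 1, 'love': 1, 'this': 1, 'movie': 1, "it's": 1, 'amazing': 1})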
Training the Naïve Bayes Classifier
Step 1: Training Data
We have the following training documents, labeled with sentiment (the table below is reconstructed from the vocabulary and class priors used in the later steps, since the original slide table is not reproduced here):

  Class   Document
  −       just plain boring
  −       entirely predictable and lacks energy
  −       no surprises and very few laughs
  +       very powerful
  +       the most fun film of the summer
Extracting Vocabulary
• The vocabulary (set of unique words) from the training data is:
  V = {just, plain, boring, entirely, predictable, and, lacks, energy, no, surprises, very, few, laughs, powerful, the, most, fun, film, of, summer}
• There are 20 unique words (|V| = 20).
• The model will learn the probability of each word appearing in positive and negative reviews.
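A small Python sketch of this step, using the reconstructed training set from Step 1 (the whitespace tokenization is an assumption):

  # Reconstructed training set from Step 1: (class label, document text)
  train = [
      ("-", "just plain boring"),
      ("-", "entirely predictable and lacks energy"),
      ("-", "no surprises and very few laughs"),
      ("+", "very powerful"),
      ("+", "the most fun film of the summer"),
  ]

  # Vocabulary = set of all unique words across both classes
  vocabulary = {word for _, doc in train for word in doc.split()}
  print(len(vocabulary))  # 20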
Step 2: Compute Prior Probabilities P(c)
• The prior probability of each class is the fraction of training documents that carry that class label:

  P(c) = \frac{N_c}{N_{doc}}

  where N_c is the number of training documents of class c and N_doc is the total number of training documents. With 3 negative and 2 positive documents out of 5:

  P(-) = \frac{3}{5} = 0.6        P(+) = \frac{2}{5} = 0.4

The model therefore estimates that 60% of all reviews are negative and 40% are positive. These prior probabilities will be multiplied by the likelihood values when making predictions.
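In code, this step is a short count over the document labels (continuing the illustrative Python sketch; the labels below are the 3 negative and 2 positive documents from Step 1):

  from collections import Counter

  # Class labels of the five training documents from Step 1
  labels = ["-", "-", "-", "+", "+"]

  doc_counts = Counter(labels)
  priors = {c: n / len(labels) for c, n in doc_counts.items()}
  print(priors)  # {'-': 0.6, '+': 0.4}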
Step 3: Compute Likelihoods P(w|c) using Add-One Smoothing
• Word count in training data: for each word in our vocabulary, we count how often it appears in Negative (−) and Positive (+) reviews.
Applying Laplace Smoothing
• To avoid zero probabilities for vocabulary words that never occur with a class, add-one (Laplace) smoothing adds 1 to every count:

  P(w \mid c) = \frac{\mathrm{count}(w, c) + 1}{\sum_{w' \in V} \mathrm{count}(w', c) + |V|}

  where count(w, c) is the number of times w appears in documents of class c, the sum in the denominator is the total number of word tokens in class c, and |V| = 20 is the vocabulary size.
• With the Step 1 documents there are 14 word tokens in the negative class and 9 in the positive class, so the denominators are 14 + 20 = 34 and 9 + 20 = 29.
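Putting the three steps together, here is a minimal end-to-end sketch in Python, assuming the reconstructed training set from Step 1. The whitespace tokenizer and the test sentence at the end are illustrative assumptions, not taken from the slides; log probabilities are used to avoid underflow when multiplying many small values.

  import math
  from collections import Counter

  # Reconstructed training set from Step 1: (class label, document text)
  train = [
      ("-", "just plain boring"),
      ("-", "entirely predictable and lacks energy"),
      ("-", "no surprises and very few laughs"),
      ("+", "very powerful"),
      ("+", "the most fun film of the summer"),
  ]

  vocab = {w for _, doc in train for w in doc.split()}

  # Step 2: priors P(c) = N_c / N_doc (stored as logs)
  doc_counts = Counter(label for label, _ in train)
  log_prior = {c: math.log(n / len(train)) for c, n in doc_counts.items()}

  # Step 3: add-one smoothed likelihoods P(w|c) (stored as logs)
  word_counts = {c: Counter() for c in doc_counts}
  for c, doc in train:
      word_counts[c].update(doc.split())

  log_likelihood = {}
  for c in doc_counts:
      total = sum(word_counts[c].values())  # 14 tokens for "-", 9 for "+"
      for w in vocab:
          log_likelihood[(w, c)] = math.log((word_counts[c][w] + 1) / (total + len(vocab)))

  def predict(text):
      # Score each class with log P(c) + sum of log P(w|c); words outside V are ignored
      words = [w for w in text.split() if w in vocab]
      scores = {c: log_prior[c] + sum(log_likelihood[(w, c)] for w in words)
                for c in doc_counts}
      return max(scores, key=scores.get)

  # Hypothetical test sentence (not from the slides), just to show the prediction step
  print(predict("predictable with no fun"))  # "-" (negative)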