
NLP Mini Project

The document presents a system to analyze sentiment from Bengali book reviews. It discusses preprocessing Bengali reviews by removing unnecessary symbols and stopwords. The workflow involves converting text to vectors using n-grams and classifying reviews with a machine learning algorithm. Multinomial Naive Bayes achieved the highest accuracy of 89.62% when using bigrams on a dataset of 1444 reviews labeled as positive or negative. The system then makes predictions of sentiment on new reviews.

Uploaded by

Saikat Mondal

Opinion Mining from Bengali Book Reviews

Presented by: SAIKAT MONDAL (22/10/MI/003)
PROBLEM STATEMENT

To build a system that can understand the opinion (positive or negative) expressed in a Bengali book review.
Dataset:
Total Reviews: 1444

Total Positive Reviews: 972

Total Negative Reviews: 472
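
The class counts above show that the dataset is imbalanced. As a quick check (a sketch that is not part of the original slides, with illustrative variable names), the share of each class and the accuracy of a trivial "always positive" baseline can be computed from the stated counts:

```python
# Counts taken from the dataset slide.
total_reviews = 1444
positive_reviews = 972
negative_reviews = 472

# Sanity check: the two classes should add up to the total.
assert positive_reviews + negative_reviews == total_reviews

positive_share = positive_reviews / total_reviews
negative_share = negative_reviews / total_reviews

# A classifier that always answers "positive" would score the positive share.
majority_baseline = max(positive_share, negative_share)

print(f"positive: {positive_share:.2%}, negative: {negative_share:.2%}")
print(f"majority-class baseline accuracy: {majority_baseline:.2%}")
```

The baseline comes out at roughly 67%, which is a useful reference point when reading the reported classifier accuracies.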


Workflow:

Developing a machine-learning-based sentiment analyzer requires several stages: preprocessing, text-to-vector conversion, classification, and prediction.
Text data cannot be used directly as input to an ML algorithm.
Preprocessing:

Remove unnecessary symbols from each review, such as punctuation marks, numbers, and emoji, and drop stopwords:
STOPWORDS = ['এই', 'করে', 'জন্য', 'একটি', 'আমার', 'একটা', 'এর', 'যে', 'তার', 'এবং', 'থেকে',
'ও', 'এ', 'কি', 'কোন', 'আমি', 'আর']
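
The preprocessing step described above could look roughly like the following sketch. The helper name `preprocess`, the regular expression, and the choice to split on whitespace are assumptions made for illustration; the slides do not show the actual implementation. Only stopwords that are unambiguous on the slide are included.

```python
import re
from typing import List

# A subset of the stopwords from the slide.
STOPWORDS = {'এই', 'জন্য', 'একটি', 'আমার', 'একটা', 'এর', 'তার', 'এবং', 'ও', 'এ', 'আর'}

# Assumption: keep Bengali letters and vowel signs (U+0980 to U+09E5), the
# zero-width joiners, and whitespace. Everything else (punctuation, the danda,
# Latin and Bengali digits, emoji) is replaced by a space.
NON_BENGALI = re.compile(r'[^\u0980-\u09E5\u200c\u200d\s]')


def preprocess(review: str) -> List[str]:
    """Return the cleaned tokens of one review (hypothetical helper)."""
    cleaned = NON_BENGALI.sub(' ', review)
    return [token for token in cleaned.split() if token not in STOPWORDS]


print(preprocess('এই বইটি খুব ভালো! ১০/10 😀'))
```

Replacing removed characters with a space rather than deleting them keeps words apart when a review has punctuation without surrounding spaces.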
TEXT TO VECTOR

N-gram features: UNIGRAM, BIGRAM, TRIGRAM
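
To make the text-to-vector step concrete, here is a minimal sketch of n-gram extraction and a bag-of-n-grams count vector. The function names `ngrams` and `count_vector` are hypothetical, and raw counts are an assumption: the slides do not say whether counts or TF-IDF weights were used.

```python
from collections import Counter
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous runs of n tokens, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_vector(tokens, n, vocabulary):
    """Count how often each vocabulary n-gram occurs in one review."""
    counts = Counter(ngrams(tokens, n))
    return [counts[gram] for gram in vocabulary]


tokens = ['বইটি', 'খুব', 'ভালো', 'খুব', 'ভালো']
bigram_vocabulary = sorted(set(ngrams(tokens, 2)))

print(ngrams(tokens, 1))   # unigrams
print(ngrams(tokens, 2))   # bigrams
print(ngrams(tokens, 3))   # trigrams
print(count_vector(tokens, 2, bigram_vocabulary))
```

In practice the vocabulary would be built from the whole training set, so that every review maps to a vector of the same length.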


CLASSIFICATION

Accuracy achieved by Multinomial Naive Bayes (MNB):

UNIGRAM = 87.89%
BIGRAM = 89.62%
TRIGRAM = 88.58%
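
For readers unfamiliar with Multinomial Naive Bayes, the sketch below shows the idea in plain Python with add-one (Laplace) smoothing. It is not the implementation used for the reported results, which the slides do not show; the class name `TinyMultinomialNB` and the toy English training data are assumptions for illustration only.

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Multinomial Naive Bayes over feature lists, with add-one smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for features, label in zip(documents, labels):
            self.feature_counts[label].update(features)
        self.vocabulary = {f for counts in self.feature_counts.values() for f in counts}
        return self

    def log_score(self, features, label):
        # log P(label) + sum of log P(feature | label) over the review's features
        total = sum(self.feature_counts[label].values())
        score = math.log(self.class_counts[label] / sum(self.class_counts.values()))
        for feature in features:
            if feature in self.vocabulary:  # features never seen in training are ignored
                count = self.feature_counts[label][feature]
                score += math.log((count + 1) / (total + len(self.vocabulary)))
        return score

    def predict(self, features):
        return max(self.class_counts, key=lambda label: self.log_score(features, label))


# Toy English stand-ins for preprocessed reviews; not the Bengali dataset.
train_docs = [['great', 'story'], ['loved', 'it'], ['boring', 'plot'], ['bad', 'story']]
train_labels = ['positive', 'positive', 'negative', 'negative']
model = TinyMultinomialNB().fit(train_docs, train_labels)

print(model.predict(['great', 'plot', 'loved']))
print(model.predict(['boring', 'bad']))
```

The features here are single words; passing bigram or trigram tuples instead would give the corresponding n-gram variants of the classifier.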
Prediction

1(negative)
0(positive)
0(positive)
1(negative)
0(negative)
0(negative)
0(negative)
THANK YOU
