AAIML
SENTIMENT CLASSIFICATION
Team Members:
Chaitra (4mw21ai010)
Dhanya (4mw21ai014)
Dhanya Kini B (4mw21ai015)
Shreya S (4mw21ai052)
Vidisha Nelli (4mw21ai057)
Text Classification: Assigns predefined categories to text data based on its content.
● Examples: Classifying customer reviews as “positive” or “negative,” or
categorizing emails as “spam” or “not spam.”
Sentiment Analysis:
● Definition: Sentiment analysis determines the polarity (positive, negative, or
neutral) of the opinion expressed in text.
● Example: "The movie was fantastic!" (positive sentiment); "I wasted my time on
this movie" (negative sentiment).
● Importance: Companies use sentiment analysis to gauge public opinion on
products, events, and services.
BAG OF WORDS (BOW) MODEL
Definition: Represents a text by the count of each word it contains, treating every
word as an independent feature and ignoring word order.
How It Works:
● Example: For the sentence "The cat sat on the mat," the BoW model (after
lowercasing) counts each word independently: {the: 2, cat: 1, sat: 1, on: 1, mat: 1}.
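The counting step above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `bag_of_words` and the simple regex tokenizer are assumptions made for the example, not part of the original slides.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Return word counts for text, ignoring case, punctuation, and word order."""
    # Lowercase first so "The" and "the" are counted as the same word.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

counts = bag_of_words("The cat sat on the mat")
print(dict(counts))  # {'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1}
```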
Usage: Common in applications where the presence of specific words correlates
strongly with certain categories (e.g., spam keywords like “free,” “win”).
Limitations:
● No Context: Ignores the sequence of words, which can lead to errors. For
example, “not good” still contains the feature “good,” and the model cannot tell
that “not” reverses its meaning.
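To make the limitation concrete, the sketch below compares two hypothetical sentences with opposite sentiment. Because BoW discards word order, both produce exactly the same counts. The sentences and the helper `bag_of_words` are illustrative assumptions.

```python
from collections import Counter

def bag_of_words(text):
    """Count words after lowercasing and dropping commas; order is discarded."""
    return Counter(text.lower().replace(",", "").split())

a = bag_of_words("not bad, actually good")   # positive opinion
b = bag_of_words("not good, actually bad")   # negative opinion
print(a == b)  # True: the two representations are indistinguishable
```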
N-GRAM MODELS
Unigram, Bigram, and Trigram Models:
Unigram (1-word): Treats each word independently, like the BoW model.
Bigram (2-word): Captures pairs of words, adding some context (e.g., "not good").
Trigram (3-word): Captures sequences of three words, useful for short phrases (e.g., "the
quick brown").
Example:
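A minimal sketch of n-gram extraction, assuming whitespace tokenization and a hypothetical sentence chosen for illustration. The helper name `ngrams` is an assumption for this example.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the movie was not good".split()
unigrams = ngrams(tokens, 1)   # 5 single words
bigrams = ngrams(tokens, 2)    # includes ('not', 'good')
trigrams = ngrams(tokens, 3)   # includes ('was', 'not', 'good')
print(bigrams)
# [('the', 'movie'), ('movie', 'was'), ('was', 'not'), ('not', 'good')]
```

Unlike the unigram view, the bigram ('not', 'good') keeps the negation attached to the word it modifies.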
Applications: N-grams help in spam detection, sentiment analysis, and predictive text
suggestions, where word order matters.
SENTIMENT ANALYSIS IN ACTION
Steps in Sentiment Analysis:
Algorithms Used:
● Naive Bayes: Simple, probabilistic approach often used in spam detection.
● Support Vector Machines (SVM): Effective for high-dimensional data like
text.
● Decision Trees: Use rules to classify text, which can be useful in some
sentiment analysis contexts.
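As an illustration of the first algorithm, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written with the standard library only. The training sentences, labels, and function names are hypothetical and chosen for the example; a real system would use far more data and typically an established library.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set, for illustration only.
TRAIN = [
    ("the movie was fantastic", "positive"),
    ("a great and enjoyable film", "positive"),
    ("i wasted my time on this movie", "negative"),
    ("boring and terrible film", "negative"),
]

def train(examples):
    """Count words per class and documents per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab

def predict(text, model):
    """Return the class with the highest log posterior score."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Log prior: fraction of training documents with this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # skip words never seen in training
            # Add-one smoothing avoids zero probabilities for unseen pairs.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAIN)
print(predict("a fantastic film", model))        # positive
print(predict("terrible waste of time", model))  # negative
```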
CHALLENGES IN TEXT ANALYTICS
Ambiguity in Language:
Words can have multiple meanings based on context.
Example: “The bank” could mean a financial institution or the side of a river.
Feature Selection:
Choosing the right features (e.g., n-grams, emoticons) improves accuracy but is complex.
Data Sparsity:
Rare word combinations may not have enough examples in training data, reducing model
reliability.
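The sparsity effect can be seen even on a tiny corpus. The sketch below measures, for n = 1, 2, 3, what fraction of distinct n-grams occur only once. The three-sentence corpus and the helper names are hypothetical, chosen purely for illustration.

```python
from collections import Counter

# Hypothetical toy corpus, for illustration only.
corpus = ["the movie was good", "the movie was bad", "the film was good"]

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def singleton_rate(sentences, n):
    """Fraction of distinct n-grams that occur exactly once in the corpus."""
    counts = Counter()
    for sentence in sentences:
        counts.update(ngrams(sentence.split(), n))
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(counts)

for n in (1, 2, 3):
    print(n, round(singleton_rate(corpus, n), 2))
# 1 0.33
# 2 0.5
# 3 0.8
```

Longer n-grams carry more context, but a larger share of them are seen only once, so the model has less evidence for each feature.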
Handling Negations:
Example: In “I don’t love this product,” the sentiment is negative, even though the
word “love” typically indicates a positive sentiment.
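One common heuristic for this problem, not described in the slides and shown here only as a sketch, is negation marking: tokens that follow a negation word get a "NOT_" prefix, so that "love" and "NOT_love" become different features. The negation word list and the fixed scope of three tokens are assumptions made for this example.

```python
# Assumed, non-exhaustive list of negation words.
NEGATIONS = {"not", "no", "never", "don't", "didn't", "isn't", "wasn't"}

def mark_negation(text, scope=3):
    """Prefix up to `scope` tokens following a negation word with 'NOT_'."""
    marked = []
    remaining = 0
    # Normalize curly apostrophes so "don’t" matches the list above.
    for token in text.lower().replace("’", "'").split():
        if token in NEGATIONS:
            marked.append(token)
            remaining = scope
        elif remaining > 0:
            marked.append("NOT_" + token)
            remaining -= 1
        else:
            marked.append(token)
    return marked

print(mark_negation("I don’t love this product"))
# ['i', "don't", 'NOT_love', 'NOT_this', 'NOT_product']
```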
ADVANCES AND FUTURE OF TEXT ANALYTICS
Trends:
● Deep Learning: Advanced neural networks (e.g., CNN, RNN) that capture complex
language patterns.
● Contextual Embeddings: Models like BERT and GPT represent each word based on its
surrounding context, which improves accuracy.
Opportunities:
● Enhanced Sentiment Models: More nuanced understanding, like detecting sarcasm.
● Text Summarization: Automatically summarizing news or research papers.
● Question-Answering Systems: Supporting customer service through automated
responses.
Future Directions:
● Greater integration of NLP with conversational AI to improve dialogue systems.
● Advanced sentiment models that detect complex emotions (e.g., frustration, excitement).
THANK YOU!