AAIML

Uploaded by

acharyaramya412
TEXT ANALYTICS: OVERVIEW AND SENTIMENT CLASSIFICATION
Team Members:
Chaitra ( 4mw21ai010 )
Dhanya ( 4mw21ai014 )
Dhanya Kini B ( 4mw21ai015 )
Shreya S ( 4mw21ai052 )
Vidisha Nelli ( 4mw21ai057 )

Dept. of Artificial Intelligence and Machine Learning


TEXT ANALYTICS OVERVIEW
Definition: Text analytics is the process of transforming unstructured text into
structured data for analysis.
(Slide image caption: "When your data is unstructured but you want structured insights ASAP!")
Applications:
Sentiment Analysis: Used by businesses to understand customer
sentiment in reviews.
Spam Detection: Identifies spam emails by recognizing patterns
in words.
Topic Extraction: Clusters news articles into topics like politics,
sports, and technology.
Language Identification: Recognizes the language of a text (e.g.,
English, Spanish) for translation services.
Tools: Natural Language Processing (NLP), machine learning,
statistical analysis.
TEXT CLASSIFICATION AND SENTIMENT ANALYSIS

Text Classification: Assigns predefined categories to text data based on its content.
● Examples: Classifying customer reviews as “positive” or “negative,” or
categorizing emails as “spam” or “not spam.”
Sentiment Analysis:
● Definition: Sentiment analysis determines the polarity (positive, negative, or
neutral) of the opinion expressed in text.
● Example: "The movie was fantastic!" (positive sentiment); "I wasted my time on
this movie" (negative sentiment).
● Importance: Companies use sentiment analysis to gauge public opinion on
products, events, and services.
BAG OF WORDS (BoW) MODEL
Definition: Represents a text by how often each word occurs in it, treating each word
as an independent feature and ignoring word order.
How It Works:
● Example: For the sentence "The cat sat on the mat," the BoW model (after
lowercasing) counts each word independently: {the: 2, cat: 1, sat: 1, on: 1, mat: 1}.
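
The counting step above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name bag_of_words and the simple regular-expression tokenizer are assumptions made for the example, not part of the original slides.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Return a word -> count mapping, ignoring case, punctuation, and order."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

counts = bag_of_words("The cat sat on the mat.")
print(dict(counts))
# {'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1}

# Word order is lost: a reordered sentence yields the same bag.
print(bag_of_words("The mat sat on the cat.") == counts)  # True
```

The second print shows why BoW cannot distinguish sentences that use the same words in a different order.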
Usage: Common in applications where the presence of specific words correlates
strongly with certain categories (e.g., spam keywords like “free,” “win”).

Limitations:
● No Context: Ignores the sequence of words, which can lead to errors. For
example, “not good” and “good” both contribute the feature “good,” and the model
cannot tell that “not” reverses its meaning.
N-GRAM MODELS
Unigram, Bigram, and Trigram Models:

Unigram (1-word): Treats each word independently, like the BoW model.
Bigram (2-word): Captures pairs of words, adding some context (e.g., "not good").
Trigram (3-word): Captures sequences of three words, useful for short phrases (e.g., "the
quick brown").

Example:

Sentence: "The cat sat on the mat."


Unigrams: ["The", "cat", "sat", "on", "the", "mat"]
Bigrams: ["The cat", "cat sat", "sat on", "on the", "the mat"]
Trigrams: ["The cat sat", "cat sat on", "sat on the", "on the mat"]
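
A sliding window over the token list is one simple way to produce these lists. The sketch below assumes whitespace tokenization with the final period already removed; the helper name ngrams is chosen for illustration.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "The cat sat on the mat".split()
print(ngrams(tokens, 1))  # ['The', 'cat', 'sat', 'on', 'the', 'mat']
print(ngrams(tokens, 2))  # ['The cat', 'cat sat', 'sat on', 'on the', 'the mat']
print(ngrams(tokens, 3))  # ['The cat sat', 'cat sat on', 'sat on the', 'on the mat']
```

A sentence of k tokens yields k - n + 1 n-grams, so the lists get shorter as n grows.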

Applications: N-grams help in spam detection, sentiment analysis, and predictive text
suggestions, where word order matters.
SENTIMENT ANALYSIS IN ACTION
Steps in Sentiment Analysis:

1. Preprocessing: Clean data by removing punctuation,
lowercasing, removing stop words (e.g., "the", "and").
○ Example: “I love this product!” becomes “love
product.”
2. Feature Extraction: Identify relevant features, such as
words or phrases.
3. Model Training: Train a classifier on labeled data, like
reviews marked as positive or negative.
4. Classification: Use the trained model to predict
sentiment on new data.
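
Step 1 (preprocessing) can be sketched as follows. The stop-word set here is a tiny illustrative assumption; real stop-word lists are much longer and vary by library and task.

```python
import string

# Illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"i", "the", "and", "this", "a", "an", "is", "it", "to", "of"}

def preprocess(text):
    """Lowercase, strip punctuation, and drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [word for word in text.split() if word not in STOP_WORDS]

print(" ".join(preprocess("I love this product!")))  # love product
```

The returned token list is what step 2 (feature extraction) would then turn into counts or n-grams.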
Example Application:
● Input: “This restaurant has amazing food and great service!”
● Output: Positive sentiment (based on words like “amazing” and “great”).

Algorithms Used:
● Naive Bayes: Simple, probabilistic approach often used in spam detection.
● Support Vector Machines (SVM): Effective for high-dimensional data like
text.
● Decision Trees: Use rules to classify text, which can be useful in some
sentiment analysis contexts.
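
To make the training and classification steps concrete, here is a minimal multinomial Naive Bayes sketch in plain Python. The training sentences are invented toy data, the function names train and predict are assumptions, and add-one (Laplace) smoothing is one common choice rather than something the slides specify. In practice a library implementation would normally be used instead.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (tokens, label). Returns the counts the model needs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab

def predict(model, tokens):
    """Return the label with the highest log-probability for the tokens."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # skip words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Toy labeled data, invented for illustration.
training_data = [
    ("amazing food great service".split(), "positive"),
    ("fantastic movie loved it".split(), "positive"),
    ("terrible food slow service".split(), "negative"),
    ("wasted my time awful movie".split(), "negative"),
]
model = train(training_data)
print(predict(model, "amazing food and great service".split()))  # positive
print(predict(model, "terrible slow service".split()))           # negative
```

Smoothing also softens the data sparsity problem: a word that never appeared with a given label still receives a small nonzero probability.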
CHALLENGES IN TEXT ANALYTICS
Ambiguity in Language:
Words can have multiple meanings based on context.
Example: “The bank” could mean a financial institution or the side of a river.
Feature Selection:
Choosing the right features (e.g., n-grams, emoticons) improves accuracy but is complex.
Data Sparsity:
Rare word combinations may not have enough examples in training data, reducing model
reliability.
Handling Negations:
Example: In “I don’t love this product,” the sentiment is negative, though the word
“love” typically indicates a positive sentiment.
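
One common heuristic for negation, not described in the slides and shown here only as a sketch, is to prefix every word that follows a negation word with a marker until the next punctuation mark. A classifier then sees “NOT_love” as a different feature from “love.” The negation list and the NOT_ prefix below are illustrative assumptions.

```python
import re

# Small illustrative negation list (an assumption, not exhaustive).
NEGATIONS = {"not", "no", "never", "don't", "doesn't", "didn't",
             "isn't", "wasn't", "can't", "won't"}
PUNCTUATION = set(".,!?;:")

def mark_negation(text):
    """Prefix NOT_ to each word after a negation word, up to the next punctuation mark."""
    text = text.lower().replace("’", "'")  # normalize curly apostrophes
    tokens = re.findall(r"[a-z']+|[.,!?;:]", text)
    marked, negating = [], False
    for tok in tokens:
        if tok in PUNCTUATION:
            negating = False  # punctuation ends the negated span
        elif tok in NEGATIONS:
            negating = True
            marked.append(tok)
        elif negating:
            marked.append("NOT_" + tok)
        else:
            marked.append(tok)
    return marked

print(mark_negation("I don't love this product, but the box is nice."))
# ['i', "don't", 'NOT_love', 'NOT_this', 'NOT_product', 'but', 'the', 'box', 'is', 'nice']
```

This is a rough rule: it handles simple cases like the one above but not double negation or sarcasm.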
ADVANCES AND FUTURE OF TEXT ANALYTICS
Trends:
● Deep Learning: Neural networks (e.g., CNNs, RNNs) that capture complex
language patterns.
● Contextual Embeddings: Models like BERT and GPT represent each word according to
its surrounding context, which improves accuracy.
Opportunities:
● Enhanced Sentiment Models: More nuanced understanding, like detecting sarcasm.
● Text Summarization: Automatically summarizing news or research papers.
● Question-Answering Systems: Supporting customer service through automated
responses.
Future Directions:
● Greater integration of NLP with conversational AI to improve dialogue systems.
● Advanced sentiment models that detect complex emotions (e.g., frustration, excitement).
THANK YOU!
