SA Notes
SENTIMENT ANALYSIS
UNIT-I
Introduction to Sentiment Analysis: Sentiment Analysis Applications -
Sentiment Analysis Research - Sentiment Analysis as Mini NLP.
The Problem of Sentiment Analysis: Definition of Opinion - Definition of
Opinion Summary - Affect, Emotion, and Mood - Different Types of Opinions -
Author and Reader Standpoint. Document Sentiment Classification: Supervised
Sentiment Classification - Unsupervised Sentiment Classification - Sentiment
Rating Prediction - Cross-Domain Sentiment Classification - Cross-Language
Sentiment Classification - Emotion Classification of Documents.
Introduction to Sentiment Analysis
Sentiment Analysis Definition: Sentiment analysis, also called opinion mining,
studies people's opinions, sentiments, evaluations, attitudes, and emotions
towards entities like products, services, and topics.
Terminology: It includes various subfields like sentiment mining, subjectivity
analysis, affect analysis, and review mining, all under the broader term "sentiment
analysis."
Industry vs. Academia: The term "sentiment analysis" is more common in
industry, while both "sentiment analysis" and "opinion mining" are frequently
used in academia.
Historical Origins: The term "sentiment analysis" first appeared in 2003
(Nasukawa & Yi), and "opinion mining" also in 2003 (Dave et al.), though related
research began earlier.
Focus on Opinions: The book uses "opinion" to broadly represent sentiment,
evaluation, appraisal, attitude, and emotion but distinguishes them when
necessary.
Positive and Negative Sentiments: Sentiment analysis mainly focuses on opinions
that express positive or negative sentiments.
Early Research Gap: Despite NLP’s long history, little research was done on
opinions and sentiments before 2000.
Rapid Growth Since 2000: Sentiment analysis has become a major research area
due to its vast applications and increasing commercial interest.
Industry Impact: The rise of sentiment analysis is driven by its real-world
applications across multiple domains.
Challenging Research Problems: The field presents unique and complex
challenges that had not been explored before.
Role of Social Media: The explosion of opinionated data from social media has
fueled sentiment analysis research.
Interdisciplinary Influence: Sentiment analysis impacts NLP, management
sciences, political science, economics, and social sciences.
Earlier Related Work: Prior research focused on metaphor interpretation,
sentiment adjectives, subjectivity, viewpoints, and affects (Hatzivassiloglou &
McKeown, Wiebe, Hearst, etc.).
Purpose of the Book: The book provides an up-to-date, comprehensive
introduction and survey of sentiment analysis, covering key concepts and
techniques.
Before this book, important works such as Computing Attitude and Affect
in Text (2006) and a survey by Pang and Lee (2008) provided great
insights. However, sentiment analysis has grown significantly in the last
five years, leading to a better understanding of the problem, new models,
and refined methodologies.
Although sentiment analysis applies to various forms of text, product reviews are
commonly used as examples because they are highly focused and opinion-rich.
However, other opinion sources, such as news articles, tweets, blogs, and forum
discussions, pose different challenges. Tweets, despite being informal, are easier
to analyze due to their short length and direct expressions. Product reviews are
also relatively straightforward because they contain little irrelevant content.
Forum discussions, however, are more complex due to unstructured
conversations and user interactions.
Definition of Opinion
Sentiment analysis involves understanding and structuring opinions expressed in
text. A key aspect of this process is identifying the opinion target (what the
opinion is about) and the sentiment (positive, negative, or neutral). For example,
in a product review, opinions may be directed at the product as a whole or at
specific aspects like battery life or picture quality.
Opinions also have opinion holders (who expressed the opinion) and time stamps
(when the opinion was expressed). These elements are crucial for analyzing
sentiment trends over time and understanding different perspectives. A structured
opinion is represented as a quadruple (target, sentiment, holder, time), but in
practical applications, a more detailed quintuple model is used:
(entity, aspect, sentiment, holder, time).
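A structured opinion of this form can be sketched as a simple record; the camera-review field values below are invented for illustration.

```python
from collections import namedtuple

# The opinion quintuple defined above: (entity, aspect, sentiment, holder, time).
Opinion = namedtuple("Opinion", ["entity", "aspect", "sentiment", "holder", "time"])

# "The battery life of this camera is great." -- posted by user jmac on 2011-10-02
op = Opinion(entity="camera", aspect="battery_life",
             sentiment="positive", holder="jmac", time="2011-10-02")

print(op.aspect, op.sentiment)  # battery_life positive
```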
Entities can be anything from products and services to people and events. Each
entity has multiple aspects (attributes or components), making sentiment analysis
more detailed. For example, a camera can have aspects like picture quality,
battery life, and weight. Sentiments can be associated with either the entity as a
whole or specific aspects.
Since natural language is complex, some challenges arise. Sentiments may not
always be explicit, and aspects may not be clearly mentioned in the same
sentence. Sarcasm, context-based opinions, and relationships between entity parts
add further difficulties. For instance, “The ink of this printer is expensive”
expresses an opinion on ink price, not the printer as a whole.
Despite limitations, structuring opinions using the quintuple model allows for
transforming unstructured text into structured data. This enables powerful
quantitative, qualitative, and trend analyses using databases and analytical tools.
Comparative opinions, which express preferences between entities (e.g., "Coke
tastes better than Pepsi"), require a separate framework, discussed in later
sections.
The opinion quintuple (entity, aspect, sentiment, opinion holder, time) provides a
strong foundation for both qualitative and quantitative summaries. A widely used
approach is aspect-based opinion summarization, which organizes opinions by
entity aspects rather than generating traditional text summaries.
For example, a structured summary of reviews for a digital camera might show
that 105 people gave a positive opinion on the camera overall, while 12 people
gave a negative opinion. The picture quality aspect received 95 positive and 10
negative reviews, while the battery life aspect had 50 positive and 9 negative
reviews.
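A summary like this can be computed by counting sentiments per aspect over the extracted quintuples; the five toy opinions below (holder and time omitted) are invented for illustration.

```python
from collections import Counter

# Toy extracted opinions as (entity, aspect, sentiment); holder and time omitted.
opinions = [
    ("camera", "GENERAL", "positive"), ("camera", "GENERAL", "negative"),
    ("camera", "picture_quality", "positive"),
    ("camera", "picture_quality", "positive"),
    ("camera", "battery_life", "negative"),
]

def aspect_summary(opinions):
    """Count positive/negative opinions per aspect, as in an aspect-based summary."""
    return Counter((aspect, sentiment) for _, aspect, sentiment in opinions)

summary = aspect_summary(opinions)
print(summary[("picture_quality", "positive")])  # 2
print(summary[("GENERAL", "negative")])          # 1
```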
Comparative Opinions
Comparative opinions compare two or more entities based on shared
aspects, showing similarities, differences, or preferences. For example:
- "Coke tastes better than Pepsi." (Comparison between two entities)
- "Coke tastes the best." (Superlative comparison)
Comparative opinions are usually expressed using comparative or
superlative adjectives (better, best) or preference words (prefer). They are
discussed in more detail in later chapters.
Implicit Opinions
Implicit opinions are objective statements that imply a sentiment without
directly stating it. For example:
- "I bought the mattress a week ago, and a valley has formed." (Implies
negative sentiment)
- "The battery life of Nokia phones is longer than Samsung phones."
(Implies a positive sentiment about Nokia's battery life)
Implicit opinions are more difficult to analyze because they require deeper
understanding of context. Some studies have explored how syntactic
choices in sentences, such as news headlines, influence the perception of
sentiment.
Problem Definition
The goal is to analyze an opinion document d and determine the overall
sentiment s expressed about an entity. This is represented in the quintuple
model as:
(_, GENERAL, s, _, _)
where entity (e), opinion holder (h), and time (t) are either known or considered
irrelevant.
This assumption is valid for product and service reviews, where a user evaluates
a single item. However, it does not hold for forum posts and blog articles, where
the author might discuss multiple entities or compare them.
Classification Techniques
Most document-level sentiment classification techniques use supervised learning,
though some unsupervised methods exist. Sentiment regression is mainly done
using supervised learning.
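A minimal supervised document classifier in this spirit is multinomial Naive Bayes over bag-of-words counts; the four training documents and labels below are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (text, label) pairs. Returns log-priors and smoothed log-likelihoods."""
    label_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for text, label in docs:
        for w in text.lower().split():
            word_counts[label][w] += 1
            vocab.add(w)
    priors = {l: math.log(c / len(docs)) for l, c in label_counts.items()}
    likelihood = {l: {w: math.log((word_counts[l][w] + 1) /
                                  (sum(word_counts[l].values()) + len(vocab)))
                      for w in vocab}
                  for l in label_counts}  # add-one (Laplace) smoothing
    return priors, likelihood

def classify(text, priors, likelihood):
    """Pick the label with the highest posterior; unknown words are ignored."""
    scores = {l: priors[l] + sum(likelihood[l].get(w, 0.0)
                                 for w in text.lower().split())
              for l in priors}
    return max(scores, key=scores.get)

docs = [("great camera love it", "pos"), ("amazing picture quality", "pos"),
        ("terrible battery awful", "neg"), ("poor screen bad value", "neg")]
model = train_nb(docs)
print(classify("love the picture quality", *model))  # pos
```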
Recent Advancements
Recent research has introduced cross-domain sentiment classification (adapting
models to different domains) and cross-language sentiment classification
(analyzing sentiment in multiple languages). These topics have become essential
as sentiment analysis expands across different industries and languages.
Early Approaches
Pang and Lee (2005) experimented with SVM regression, SVM multiclass
classification (one-vs-all strategy), and a meta-learning method called metric
labeling. They found that one-vs-all classification performed poorly compared to
regression methods, as numerical ratings are not purely categorical.
Goldberg and Zhu (2006) improved rating prediction using a graph-based semi-
supervised learning approach, which used both labeled (with ratings) and
unlabeled (without ratings) reviews. In this approach, each review was
represented as a node in a graph, and links between nodes indicated review
similarity. The algorithm revised initial rating predictions by enforcing
smoothness across the graph, ensuring similar reviews had similar ratings.
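The smoothness idea can be sketched as iteratively blending a review's initial predicted rating with the average rating of its graph neighbors; the toy graph, ratings, and alpha weight below are illustrative, not Goldberg and Zhu's actual formulation.

```python
def smooth_ratings(ratings, neighbors, labeled, alpha=0.5, iters=50):
    """Blend each unlabeled review's initial predicted rating with the average
    rating of its graph neighbors, so similar reviews end up with similar ratings."""
    r = dict(ratings)
    for _ in range(iters):
        for node, nbrs in neighbors.items():
            if node in labeled or not nbrs:
                continue  # ratings of labeled reviews stay fixed
            avg = sum(r[n] for n in nbrs) / len(nbrs)
            r[node] = alpha * ratings[node] + (1 - alpha) * avg
    return r

# Review b's initial prediction (1.0) disagrees with its similar neighbors a and c.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
ratings = {"a": 4.0, "b": 1.0, "c": 4.0}
smoothed = smooth_ratings(ratings, neighbors, labeled={"a", "c"})
print(smoothed["b"])  # 2.5 -- pulled toward its neighbors
```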
Long, Zhang, and Zhu (2010) followed a similar approach but used a Bayesian
network classifier. To enhance accuracy, they focused on a selected subset of
reviews that comprehensively evaluated multiple aspects. These reviews
provided more reliable predictions, using Kolmogorov complexity to measure
information content.
Sentiment rating prediction extends beyond simple positive/negative
classification by estimating numerical ratings using machine learning, regression
models, and aspect-based approaches. It remains a challenging task due to
dependencies between aspects, negations, and variations in user opinion
expression.
Pan et al. (2010) developed Spectral Feature Alignment (SFA), which aligns
domain-specific words into unified clusters using domain-independent words as
a bridge. This method applies spectral clustering on a bipartite graph linking
domain-independent and domain-specific words.
Other researchers applied topic modeling (He et al., 2011; Gao & Li, 2011) to
identify shared opinion topics across domains, which were then used as additional
classification features. Bollegala et al. (2011) created a sentiment-sensitive
thesaurus to map words expressing similar sentiments across domains.
Brooke et al. (2009) also used machine translation (English to Spanish) and
applied either lexicon-based or machine learning approaches to classify sentiment
in Spanish documents.
Guo et al. (2010) developed a topic modeling method to group aspect expressions
from different languages into common aspect clusters. This approach allowed
businesses to compare sentiment across different countries.
Problem Definition
Given a sentence x, the goal is to classify whether it expresses a positive,
negative, or neutral (no opinion) sentiment. Unlike document-level classification,
the quintuple model (e, a, s, h, t) is not directly used here because sentence-level
classification is often an intermediate step in identifying opinion targets.
Classification Approaches
Sentence sentiment classification can be approached in two ways:
1. Three-class classification (positive, negative, neutral).
2. Two-step classification:
• Step 1: Determine whether a sentence expresses an opinion (subjectivity
classification).
• Step 2: If it does, classify the opinion as positive or negative.
Subjectivity Classification
The first step is to distinguish between opinionated (subjective) sentences and
non-opinionated (objective) sentences. This is known as subjectivity
classification (Hatzivassiloglou & Wiebe, 2000; Riloff & Wiebe, 2003).
Objective sentences are typically treated as neutral, but this can be misleading.
Advanced Techniques
Wilson et al. (2004) classified clauses within sentences as neutral, low,
medium, or highly subjective instead of treating entire sentences as subjective
or objective.
Benamara et al. (2011) proposed a four-class system:
• S (Subjective Evaluative): Expresses positive/negative sentiment.
• OO (Objective with Implied Opinion): Factual but implies sentiment.
• O (Purely Objective): No sentiment or implied opinion.
• SN (Subjective but Non-Evaluative): Expresses emotions or beliefs but
without clear sentiment.
Conclusion
Subjectivity classification is crucial for sentiment analysis, as filtering out
objective sentences improves sentiment classification accuracy. However, some
objective sentences still imply sentiment, requiring more advanced methods that
consider context and linguistic structure.
Early Approaches
• Yu & Hatzivassiloglou (2003) modified Turney’s (2002) PMI-based
approach by using a large set of seed adjectives and a log-likelihood ratio
to determine sentiment orientation.
• Hu & Liu (2004) proposed a lexicon-based approach, assigning +1 to
positive words and -1 to negative words while considering negation words
(not) and contrast words (but, however).
• Kim & Hovy (2004, 2007) applied multiplicative aggregation of sentiment
scores instead of summation.
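In the spirit of the Hu and Liu lexicon-based approach, a minimal scorer sums +1/-1 word scores and flips polarity after a negation word; the three-word lexicons below are toy placeholders, not a real sentiment lexicon.

```python
# Toy lexicons standing in for a real sentiment lexicon (illustrative only).
POSITIVE = {"great", "amazing", "good"}
NEGATIVE = {"awful", "poor", "bad"}
NEGATIONS = {"not", "never", "no"}

def lexicon_score(sentence):
    """Sum +1/-1 scores of sentiment words, flipping polarity after a negation word."""
    score, negate = 0, False
    for w in sentence.lower().split():
        if w in NEGATIONS:
            negate = True
            continue
        s = 1 if w in POSITIVE else (-1 if w in NEGATIVE else 0)
        if s:
            score += -s if negate else s
            negate = False
    return score

print(lexicon_score("the screen is great"))      # 1
print(lexicon_score("the battery is not good"))  # -1
```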
Conclusion
Sentence-level sentiment classification is crucial for fine-grained sentiment
analysis but is challenging due to mixed sentiments in compound sentences.
Lexicon-based and supervised learning approaches remain widely used, while
context-aware and partially supervised models show promising results.
Dealing with Conditional Sentences
Most research in sentence-level sentiment classification assumes a one-size-fits-
all approach, but Narayanan, Liu, and Choudhary (2009) argued that different
types of sentences require different treatments. They proposed a divide-and-
conquer approach, focusing on conditional sentences, which pose unique
challenges for sentiment analysis.
Conclusion
Handling conditional and question sentences requires customized approaches, as
traditional sentiment classification techniques fail to capture their contextual
meaning. More focused research is needed to improve accuracy in different
sentence types.
Dealing with Sarcastic Sentences
Sarcasm is a complex linguistic phenomenon where the intended meaning is the
opposite of the literal meaning. In sentiment analysis, sarcasm presents a major
challenge because a positive statement may actually imply negative sentiment,
and vice versa. While sarcasm is not very common in product reviews, it is
frequently found in political discussions and online commentaries.
Conclusion
Cross-language sentiment classification is highly dependent on translation
quality. While translated training data and lexicons have shown promising results,
manually created corpora still perform better. Future research should focus on
improving machine translation and domain adaptation techniques for better
sentiment consistency across languages.
Segment-Based Classification
Zirn et al. (2011) classified discourse segments, where each segment expresses
one sentiment. They used Markov Logic Networks (MLN) to integrate sentiment
lexicons and neighboring discourse context for better classification.
Conclusion
Discourse relations provide valuable insights for sentiment classification,
particularly in complex sentences and multi-sentence texts. Future research
should further integrate discourse-level features to enhance sentiment analysis
accuracy.
UNIT-III
Sentiment Lexicon generation and Summarization: Sentiment Lexicon
Generation: Dictionary-Based Approach - Corpus-Based Approach - Desirable
and Undesirable Facts. Analysis of Comparative Opinions: Problem Definition -
Identify Comparative Sentences - Identifying the Preferred Entity Set - Special
Types of Comparison - Entity and Aspect Extraction. Opinion Summarization and
Search: Aspect-Based Opinion Summarization - Enhancements to Aspect-Based
Summary - Contrastive View Summarization - Traditional Summarization -
Summarization of Comparative Opinions - Opinion Search - Existing Opinion
Retrieval Techniques. Mining Intentions: Problem of Intention Mining - Intention
Classification - Fine-Grained Mining of Intentions.
Sentiment Lexicon Generation
A sentiment lexicon is a collection of words and phrases that convey positive or
negative sentiments. These words, also known as opinion words, polar words, or
opinion-bearing words, help in sentiment analysis. Positive words describe
desired qualities (e.g., beautiful, amazing), while negative words describe
undesired qualities (e.g., awful, poor). In addition to individual words, sentiment
phrases and idioms, like cost an arm and a leg, are also part of the sentiment
lexicon.
Sentiment words are classified into two types: base type and comparative type.
Base-type sentiment words express direct opinions, while comparative-type
words (e.g., better, worse, best, worst) indicate relative comparisons rather than
absolute opinions. For example, "Pepsi tastes better than Coke" does not state
whether either drink is good or bad, only that Pepsi is preferable in comparison.
There are three main methods for compiling a sentiment lexicon: manual
approach, dictionary-based approach, and corpus-based approach. The manual
approach is accurate but time-consuming, so it is often combined with automated
methods. The dictionary-based approach starts with a small set of words and
expands it using synonyms and antonyms from dictionaries. The corpus-based
approach identifies sentiment words based on patterns and statistical relationships
in large text datasets.
While automated methods are efficient, they can make errors, so manual
verification is often necessary. Additionally, factual statements that imply
opinions pose challenges in sentiment analysis, as they are not always explicitly
positive or negative.
Dictionary-Based Approach
The dictionary-based approach is a widely used method for compiling sentiment
lexicons by leveraging dictionaries like WordNet to expand a set of seed words
based on their synonyms and antonyms. This approach starts with a manually
selected list of positive and negative sentiment words and iteratively grows the
list by finding related words in the dictionary. The process continues until no new
words are discovered. After completion, manual inspection is usually performed
to correct errors and refine the lexicon.
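The iterative expansion loop can be sketched as follows; the SYNONYMS and ANTONYMS tables are toy stand-ins for WordNet lookups.

```python
# Toy synonym/antonym tables standing in for WordNet lookups (illustrative only).
SYNONYMS = {"good": ["great"], "great": ["amazing"], "bad": ["awful"]}
ANTONYMS = {"good": ["bad"]}

def expand_lexicon(pos_seeds, neg_seeds):
    """Grow positive/negative word sets via synonyms and antonyms until no new
    words are discovered (the stopping condition described above)."""
    pos, neg = set(pos_seeds), set(neg_seeds)
    changed = True
    while changed:
        changed = False
        for w in list(pos):
            for s in SYNONYMS.get(w, []):   # synonyms keep the same polarity
                if s not in pos:
                    pos.add(s); changed = True
            for a in ANTONYMS.get(w, []):   # antonyms get the opposite polarity
                if a not in neg:
                    neg.add(a); changed = True
        for w in list(neg):
            for s in SYNONYMS.get(w, []):
                if s not in neg:
                    neg.add(s); changed = True
    return pos, neg

pos, neg = expand_lexicon({"good"}, set())
print(sorted(pos))  # ['amazing', 'good', 'great']
print(sorted(neg))  # ['awful', 'bad']
```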
A key improvement to this approach was proposed by Kamps et al. (2004), who
introduced a WordNet distance-based method. They calculated the sentiment
orientation of adjectives based on their relative distance from reference words
like good and bad. Similarly, Blair-Goldensohn et al. (2008) extended the
approach by incorporating a neutral seed set to prevent sentiment propagation
through neutral words. Their method used a directed semantic graph with
weighted edges to assign sentiment scores using a modified label propagation
algorithm.
Some methods also integrated fuzzy set theory to refine sentiment classification.
Andreevskaia and Bergler (2006) employed multiple bootstrapping runs to reduce
errors and normalized sentiment scores within a [0,1] range. Others, like Kaji and
Kitsuregawa (2006, 2007), used heuristics from web page structures (e.g., pros
and cons tables) to extract sentiment words from large corpora.
A significant limitation of the dictionary-based approach is that it primarily
provides general, domain-independent sentiment orientations. Words like quiet
can have different meanings depending on the domain (e.g., negative for a
speakerphone but positive for a car). To address this, corpus-based methods are
often used in combination with dictionary-based approaches to capture context-
dependent sentiment orientations.
Corpus-Based Approach
The corpus-based approach is used to expand a sentiment lexicon or adapt a
general-purpose sentiment lexicon to a specific domain. Unlike the dictionary-
based approach, which provides general sentiment orientations, corpus-based
methods help identify context-dependent sentiment by analyzing text data.
However, sentiment words can have different meanings within the same domain
based on context, making lexicon generation more complex.
One early method was introduced by Hatzivassiloglou and McKeown (1997), who
used linguistic rules to determine sentiment words from a corpus. Their sentiment
consistency rule states that adjectives connected by AND usually share the same
sentiment, while conjunctions like BUT indicate a sentiment shift. A graph was
built with words linked by sentiment similarity, and clustering was applied to
classify words as positive or negative.
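The sentiment consistency rule can be sketched as a propagation over conjunction pairs; the seed word and the pairs below are invented for illustration.

```python
def propagate(seed, pairs):
    """Assign word polarities (+1/-1) using the sentiment consistency rule:
    'X and Y' implies the same polarity, 'X but Y' implies opposite polarities."""
    polarity = dict(seed)
    changed = True
    while changed:
        changed = False
        for w1, conj, w2 in pairs:
            flip = -1 if conj == "but" else 1
            if w1 in polarity and w2 not in polarity:
                polarity[w2] = polarity[w1] * flip; changed = True
            elif w2 in polarity and w1 not in polarity:
                polarity[w1] = polarity[w2] * flip; changed = True
    return polarity

# From "simple and elegant" and "elegant but expensive", seeded with simple = +1:
pairs = [("simple", "and", "elegant"), ("elegant", "but", "expensive")]
result = propagate({"simple": 1}, pairs)
print(result)  # {'simple': 1, 'elegant': 1, 'expensive': -1}
```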
Additionally, Wiebe and Mihalcea (2006) explored subjectivity at the word sense
level, showing that word sense disambiguation could improve sentiment analysis.
Another unique approach by Brody and Diakopoulos (2011) analyzed word
lengthening in social media (e.g., slooooow), finding that longer words often
convey stronger emotions.
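The lengthening signal can be detected with a simple regular expression that collapses runs of three or more repeated letters; this is a sketch of the idea, not Brody and Diakopoulos's actual method.

```python
import re

def normalize_lengthening(word):
    """Collapse runs of three or more repeated letters and report whether the
    word was lengthened (a cue for stronger emotion in social media text)."""
    collapsed = re.sub(r"(.)\1{2,}", r"\1", word)
    return collapsed, collapsed != word

print(normalize_lengthening("slooooow"))  # ('slow', True)
print(normalize_lengthening("slow"))      # ('slow', False)
```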
For example, words like valley and mountain are neutral in general but may carry
negative sentiment in the mattress review domain (e.g., “A valley has formed in
the middle of the mattress”). Identifying such implied sentiments is challenging
but essential for accurate sentiment analysis.
Approach for Identifying Implied Sentiments
Zhang and Liu (2011b) proposed a method to detect nouns and noun phrases that
imply sentiment in a specific domain. The approach is based on the observation
that certain aspects inherently carry only one sentiment—either positive or
negative. For example, the phrase “A bad valley has formed” is uncommon
because valley already has a negative connotation in mattress reviews.
1. Candidate Identification
• The algorithm analyzes sentiment contexts around each noun aspect.
• If an aspect appears significantly more in either positive or negative
sentiment contexts, it is inferred to have that sentiment orientation.
This method was an initial attempt to solve the problem of implied sentiment
detection, but its accuracy remains low. Further research is needed to improve the
approach, especially for distinguishing context-dependent sentiments more
effectively.
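The context-counting heuristic above can be sketched as follows; the aspect contexts and the 0.8 threshold are invented for illustration.

```python
def infer_aspect_polarity(contexts, threshold=0.8):
    """contexts: aspect -> list of sentiment labels of sentences mentioning it.
    If an aspect occurs overwhelmingly in one polarity of context, infer that
    polarity as the aspect's implied sentiment."""
    inferred = {}
    for aspect, labels in contexts.items():
        pos_frac = labels.count("positive") / len(labels)
        neg_frac = labels.count("negative") / len(labels)
        if pos_frac >= threshold:
            inferred[aspect] = "positive"
        elif neg_frac >= threshold:
            inferred[aspect] = "negative"
    return inferred

contexts = {
    "valley": ["negative"] * 4 + ["positive"],  # e.g., mattress reviews
    "price":  ["positive", "negative", "positive", "negative"],  # mixed: no inference
}
inferred = infer_aspect_polarity(contexts)
print(inferred)  # {'valley': 'negative'}
```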
Analysis of Comparative Opinions
Comparative opinions express sentiment by comparing two or more entities
rather than directly stating a positive or negative opinion about one. These
opinions differ from regular opinions in both semantic meaning and syntactic
structure (Jindal and Liu, 2006a, 2006b).
For example, a regular opinion states an entity’s quality, such as "The voice
quality of this phone is amazing." In contrast, a comparative opinion compares
entities without explicitly stating good or bad quality, such as "The voice quality
of Nokia phones is better than that of iPhones."
Both types are often formed using comparative or superlative adjectives and
adverbs, though this is not always the case. Since their semantic meanings and
analysis techniques are similar, they are generally studied together under
comparative opinions.
Importance of Comparative Opinion Analysis
Comparative opinions play a crucial role in sentiment analysis as they help
understand consumer preferences and competitive positioning of products. Due
to their structural and contextual differences from regular opinions, they require
specialized analysis techniques to accurately extract and interpret sentiment.
Types of Comparisons
1. Gradable Comparison
Gradable comparisons rank entities based on shared aspects and have three
subtypes:
• Non-equal gradable comparison – Expresses a ranking relationship,
e.g., "Coke tastes better than Pepsi."
• Equative comparison – Indicates equality, e.g., "Coke and Pepsi taste
the same."
• Superlative comparison – Ranks one entity above all others, e.g.,
"Coke tastes the best among soft drinks."
2. Non-Gradable Comparison
Non-gradable comparisons describe relationships without ranking. These
include:
• Similarity or difference, e.g., "Coke tastes differently from Pepsi."
• Different aspects between entities, e.g., "Desktops use external
speakers, but laptops use internal speakers."
• Presence or absence of an aspect, e.g., "Nokia phones come with
earphones, but iPhones do not."
Comparative Keywords
Comparisons are often expressed using comparative and superlative words like
better, best, more, most, less, and least. These words can be regular (e.g., longer
→ longest) or irregular (e.g., better → best). Other words like prefer and superior
also indicate comparisons and are treated similarly.
Comparative keywords in non-equal gradable comparisons are further classified
into:
• Increasing comparative – Indicates a higher degree (e.g., more, longer).
• Decreasing comparative – Indicates a lower degree (e.g., less, fewer).
The goal of comparative opinion mining (Jindal and Liu, 2006b; Liu, 2010) is to
extract comparative opinion sextuples from a document:
(E1, E2, A, PE, h, t)
where E1 and E2 are the entity sets being compared, A is the set of shared
aspects, PE is the preferred entity set, h is the opinion holder, and t is the time
when the opinion was expressed.
For example, in "Canon's picture quality is better than those of LG and Sony,"
written by Jim on 9-25-2011, the extracted sextuple is:
({Canon}, {LG, Sony}, {picture_quality}, {Canon}, Jim, 9-25-2011).
Jindal and Liu (2006a) found that almost every comparative sentence includes a
keyword that indicates comparison. Using a set of such keywords, 98% recall was
achieved, though precision was only 32%.
Since keywords alone provide high recall, they can filter out non-comparative
sentences, leaving only those requiring further precision improvement.
Jindal and Liu (2006a) found that SVM classifiers using keywords and
keyphrases performed best for classification.
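The high-recall keyword filter can be sketched as follows; the keyword list here is a small illustrative subset, not Jindal and Liu's actual set.

```python
import re

# Small illustrative keyword subset; the actual list in the literature is much larger.
COMPARATIVE_KEYWORDS = {"better", "worse", "best", "worst", "more", "less",
                        "prefer", "superior", "inferior", "same", "than"}

def is_candidate_comparative(sentence):
    """High-recall filter: keep a sentence if it contains a comparative keyword
    or a regular comparative/superlative form (-er/-est). Precision is low by
    design; a classifier is applied to the surviving candidates afterwards."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return any(w in COMPARATIVE_KEYWORDS or w.endswith(("er", "est"))
               for w in words)

print(is_candidate_comparative("Coke tastes better than Pepsi"))  # True
print(is_candidate_comparative("The screen is bright"))           # False
```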
Ding, Liu, and Zhang (2009) and Ganapathibhotla and Liu (2008) extended the
lexicon-based approach used for regular sentiment classification to comparative
opinions. Comparative opinion words are categorized into two types:
• General-purpose comparative words, e.g., better and worse, whose sentiment
orientations are domain independent.
• Context-dependent comparative words, e.g., more and longer, whose
orientations depend on the aspect and domain (e.g., "longer battery life" is
positive, but "longer boot time" is negative).
Since Pros and Cons rarely use comparative words, these words are first
converted to their base forms using WordNet and English comparative formation
rules. The assumption is that if a base-form adjective or adverb is positive (or
negative), its comparative and superlative forms also carry the same sentiment
(e.g., good → better → best).
Once sentiment words and their orientations are identified, the preferred entity
set is determined using comparison structures:
• If the comparative is positive, the entity before "than" is preferred.
• If the comparative is negative, the entity after "than" is preferred.
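The preferred-entity rules above can be sketched with a naive split on "than"; the first-word entity heuristic is purely illustrative, as real systems extract the entities properly.

```python
def preferred_entity(sentence, orientation):
    """Apply the rule above: a positive comparative prefers the entity before
    'than'; a negative comparative prefers the entity after it. The first-word
    entity heuristic is purely illustrative."""
    before, _, after = sentence.partition(" than ")
    entity_before = before.split()[0]
    entity_after = after.split()[0].rstrip(".")
    return entity_before if orientation == "positive" else entity_after

print(preferred_entity("Coke tastes better than Pepsi.", "positive"))  # Coke
print(preferred_entity("Coke tastes worse than Pepsi.", "negative"))   # Pepsi
```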
With opinion quintuples, businesses can track sentiment trends over time and
apply data mining techniques for deeper insights, improving decision-making
based on customer preferences.
The sentence selection method produced varied language and details, while
sentence generation provided a better sentiment overview.
Ontology-Based Summarization
Tata and Di Eugenio (2010) structured song reviews using ontology trees. They:
• Selected focused representative sentences mentioning fewer aspects.
• Ordered sentences logically using domain ontologies to maintain
coherence.
Sentiment-Aware Summarization
Lerman, Blair-Goldensohn, and McDonald (2009) introduced three models for
summarizing product reviews:
1. Sentiment Match (SM) – Extracts sentences to match the average
sentiment score.
2. Sentiment Match + Aspect Coverage (SMAC) – Balances sentiment and
aspect coverage.
3. Sentiment-Aspect Match (SAM) – Ensures both aspect and sentiment
alignment.
User evaluations showed SAM performed best, though not significantly better
than the others.
Optimization-Based Summarization
Nishikawa et al. (2010b) proposed a text summary technique considering
informativeness and readability by:
• Maximizing aspect-sentiment frequency for informativeness.
• Ensuring natural sentence flow for readability.
• Using integer linear programming to optimize summary structure.
Traditional Summarization
Traditional opinion summarization focuses on generating short text summaries
without explicitly considering aspects, entities, or sentiments. These summaries
extract important sentences from reviews, conversations, or articles without
structuring them based on opinion targets.
Approaches to Traditional Summarization
• Beineke et al. (2003) proposed a supervised learning method to identify
and extract key sentences from reviews.
• Seki et al. (2006) introduced a paragraph-clustering algorithm to group and
extract important sentences.
• Wang and Liu (2011) applied extractive summarization in conversations,
considering not only sentence ranking but also topic relevance, sentiment,
and dialogue structure.
Conclusion
While traditional summarization methods help extract key information, they are
less effective for opinion analysis, as they do not structure content based on
entities, aspects, or sentiment distribution. Aspect-based summarization provides
a more informative and quantitative alternative for analyzing opinions.
As Web search has become an essential service, opinion search is also gaining
importance. Opinion search helps users find public sentiment on various topics,
products, or individuals.
Types of Opinion Search Queries
1. Finding Public Opinions – Users search for customer reviews or public
sentiment about a specific entity or its aspects. Example: "Find opinions
on the picture quality of a digital camera."
2. Finding Opinions of a Person or Organization – Users search for the
opinion of a particular opinion holder on an entity or topic. Example:
"Find Barack Obama's opinion on abortion."
Conclusion
Opinion search enables users to quickly access public sentiment and expert
opinions, making it a valuable tool for decision-making and analysis in various
fields.
Some advanced techniques combine topic and sentiment relevance into a single
ranking score for improved accuracy.
Opinion Search System Example
Zhang and Yu (2007) developed an opinion retrieval system, which won the 2007
TREC blog track. The system consists of two components:
1. Retrieval Component – Performs traditional information retrieval (IR)
by considering keywords and concepts (e.g., named entities, Wikipedia
entries). It expands queries using synonyms and pseudo-feedback to
improve search relevance.
2. Opinion Classification Component – Classifies retrieved documents as
opinionated or not using an SVM-based classifier trained on review and
factual data. Opinionated documents are further categorized as positive,
negative, or mixed.
Conclusion
Opinion retrieval techniques have evolved from basic document ranking to
advanced models integrating topic and sentiment relevance. These methods
enhance search accuracy, helping users find relevant, sentiment-rich content more
effectively.
UNIT-IV
Identifying intention, fake and quality of opinion: Detecting Fake or Deceptive
Opinions: Different Types of Spam - Supervised Fake Review Detection -
Supervised Yelp Data Experiment - Automated Discovery of Abnormal Patterns
– Model Based Behavioral Analysis - Group Spam Detection - Identifying
Reviewers with Multiple User ids - Exploiting Business in Reviews - Some
Future Research Directions. Quality of Reviews: Quality Prediction as a
Regression Problem - Other Methods - Some New Frontiers.
Opinion Spam Detection
Opinions from social media influence purchase decisions, elections, marketing,
and product design. Since positive opinions can lead to profits and fame, some
individuals and organizations attempt to manipulate public perception by posting
fake reviews or opinions without revealing their true motives. This deceptive
practice is known as opinion spamming, and those who engage in it are called
opinion spammers. Opinion spam can be particularly dangerous when it
influences social and political issues, as it can mislead the public and create false
narratives.
Opinion spam detection is a major challenge because, unlike other types of spam
(such as email or web spam), fake reviews and opinions are difficult to identify
just by reading them. Unlike link spam, which manipulates hyperlinks, or content
spam, which stuffs irrelevant keywords, opinion spam involves fabricated or
misleading statements that appear legitimate. Additionally, since opinion spam
often blends with genuine reviews, detecting it requires analysis beyond just the
text itself.
One of the key challenges in opinion spam detection is the lack of a clear way to
distinguish between real and fake reviews by simply reading them. For example,
a positive review written for a good restaurant can be copied and posted as a fake
review for a bad restaurant, making it impossible to determine its authenticity
without external data. This means that effective opinion spam detection must rely
on additional information such as user behavior, review patterns, and metadata
rather than just textual analysis.
Type 1 (Fake Reviews): Fake reviews are deceptive opinions written with hidden
motives rather than genuine experiences. They are often used to promote certain
products or services by posting undeservedly positive reviews or to harm
competitors by posting false negative opinions. This type of spam is the most
harmful as it misleads consumers and manipulates public perception.
Ott et al. (2011) used Amazon Mechanical Turk to crowdsource fake hotel
reviews. Turkers were asked to write promotional reviews as if they worked for
the hotels, while truthful reviews were taken from TripAdvisor. They
experimented with several classification techniques and found that unigram and
bigram-based text classification performed best. However, their dataset had
limitations, as Turkers were not actual spammers, and the 50/50 class distribution
did not reflect real-world conditions.
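The unigram and bigram representation used in such text classifiers can be sketched as:

```python
def ngram_features(text):
    """Extract unigram and bigram features from a review, the representation
    reported to work best in the setup described above."""
    words = text.lower().split()
    feats = set(words)                                        # unigrams
    feats.update(" ".join(p) for p in zip(words, words[1:]))  # bigrams
    return feats

feats = ngram_features("my stay was absolutely wonderful")
print("absolutely wonderful" in feats)  # True
```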
Overall, supervised spam detection shows promise, but the lack of high-quality
labeled data remains a challenge. Many studies rely on imperfect datasets, which
can affect the accuracy of detection models.
1. Finding Candidate Spammer Groups: Frequent itemset mining is first applied
to the reviewer data to find candidate groups of reviewers who have each
reviewed the same set of several products, since such co-occurrence is
unlikely to happen by chance.
2. Ranking Groups Based on Spam Indicators: Not all identified groups are
actual spammers. To refine the results, various behavioral indicators are
used, such as reviewing products within a short time frame, writing reviews
right after a product launch, having similar review content, and showing
rating deviations. A relational model called GSRank (Group Spam Rank)
was developed to rank candidate spammer groups based on these
indicators. An iterative algorithm was then applied to improve accuracy.
Different studies have used variations of these features, with some also
incorporating reviewer expertise, timeliness of reviews, and writing style to
improve accuracy.
This approach was tested on Ciao, a social review platform where users can trust
and follow others, making it an effective method for sites with social networks.
Other Approaches
Several alternative methods have been proposed:
• Classification-based methods: Some studies classify reviews as "helpful"
or "non-helpful" using content, social, and sentiment features.
• Addressing biases in helpfulness votes: Liu et al. (2007) argued that early
reviews and highly ranked reviews receive more votes, which may not
always reflect quality. They manually labeled reviews into categories and
trained an SVM classifier using features like informativeness, subjectivity,
and readability.
• Unsupervised ranking: Tsur and Rappoport (2009) introduced a method
where reviews are compared to a virtual "core review" and ranked based
on similarity.
• Personalized review quality prediction: Moghaddam et al. (2012) proposed
factorization models, arguing that review helpfulness varies by user
preference.
• Diversity in review selection: Some researchers suggested that top-ranked
reviews should not be redundant and should cover different aspects and
viewpoints of a product.
Conclusion
Determining review quality is a complex task that combines text analysis, user
feedback, social interactions, and ranking algorithms. While many approaches
have been explored, challenges remain, such as handling bias in user votes and
ensuring diverse, informative reviews are displayed first.