
Opinion Mining and Social Networks:
a Promising Match
Krzysztof Jędrzejewski, Mikołaj Morzy
Institute of Computing Science
Poznan University of Technology, Poznan, Poland
[email protected], [email protected]



Abstract: In this paper we discuss the role and importance of social networks as preferred environments for opinion mining, and sentiment analysis in particular. We begin by briefly describing selected properties of social networks that are relevant to opinion mining and we outline the general relationships between the two disciplines. We present the related work and provide basic definitions used in opinion mining. Then, we introduce our original method of opinion classification and we test the presented algorithm on real-world datasets acquired from popular Polish social networks, reporting on the results. The results are promising and soundly support the main thesis of the paper, namely, that social networks exhibit properties that make them very suitable for opinion mining activities.
Keywords: opinion mining, sentiment analysis, social computing, social networks
I. INTRODUCTION
Graphs and networks certainly rank among the most popular data representation models due to their universal applicability to various application domains. The need to analyze and mine interesting knowledge from graph and network structures has long been recognized, but only recently have advances in information systems enabled the analysis of graph structures at huge scales. Analysis of graph and network structures gained new momentum with the advent of social networks. While the analysis of social networks has been a field of intensive research, particularly in the domains of social sciences and psychology, economy or chemistry, it is the emergence of huge social networking services over the Web that spawned the research into large-scale structural properties of social networks. Social networks exhibit a very clear community structure. Such community structure partially stems from objective limitations (e.g., the internal organizational structure of a company can be closely represented by the ties within a particular social network) or, to some extent, may result from subjective user actions and activities (e.g., bonding with other people who share one's interests and hobbies). Unveiling the true structure of a social network and understanding the communities forming within the network is the key factor in understanding what the future structure of the network will be.
The main goal of social network analysis is the study of the structural properties of networks. Structural analysis of a social network investigates the properties of individual vertices and the global properties of the network as a whole. It answers two basic classes of questions about the network: what is the structural position of any given individual node, and what can be said about groups (communities) forming within the network. The main measure of a node's social power (also called member prestige) is centrality, which allows us to determine a node's relative and absolute importance in the network. There are several methods to determine a node's centrality, such as the degree centrality (the number of links that connect to a given node), the betweenness centrality (the number of shortest paths between any pair of nodes in the network that traverse a given node) or the closeness centrality (the mean of shortest-path lengths to other nodes in the network).
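The centrality measures above can be sketched in a few lines of Python; the four-node graph below is a hypothetical toy network of our own, not one of the datasets used later in the paper.

```python
from collections import deque

# Hypothetical toy network as an adjacency list (undirected).
graph = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b"], "d": ["b"]}

def degree_centrality(g, node):
    """Degree centrality: the number of links that connect to the node."""
    return len(g[node])

def closeness_centrality(g, node):
    """Closeness centrality: the mean of shortest-path lengths from the
    node to all other nodes, computed here with a breadth-first search."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in g[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    lengths = [d for n, d in dist.items() if n != node]
    return sum(lengths) / len(lengths)

print(degree_centrality(graph, "b"))     # 3
print(closeness_centrality(graph, "b"))  # 1.0
```

Node b has the highest degree and the lowest mean distance to the other nodes, so it would be considered the most central individual of this toy network under both measures.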
From the point of view of opinion mining, the ability to assess a node's prestige is essential, as it allows us to differentiate between the opinions of different individuals. More specifically, a node's prestige allows us to assign different weights to opinions and associate more importance with opinions expressed by prominent individuals. Another factor that is often considered in opinion mining is the identification of influential individuals. An influential individual does not necessarily have to be characterized by high degree centrality to influence the average opinion within the network. Usually, such individuals are characterized by high betweenness centrality, impacting the dissemination of opinion rather than the forming of the opinion. For instance, an individual with high betweenness centrality can stop a negative opinion from spreading through the network, or, on the other hand, she can amplify the opinion.
For psychological reasons, humans tend to form their opinions in such a way that the opinions conform with the norm established within a given social group. Thus, when mining opinions one has to take into consideration the influence of the context in which the opinion is forming, i.e. the social milieu of an individual. Social networks are highly effective in bolstering group formation
[2011 International Conference on Advances in Social Networks Analysis and Mining. 978-0-7695-4375-8/11 $26.00 © 2011 IEEE. DOI 10.1109/ASONAM.2011.123]
of similar individuals. Groups of nodes that share common
properties tend to get connected in the social network.
Communities are densely inter-connected and have fewer
connections to nodes from outside of a group. When
provided with the information on group membership of
an individual, the opinion mining algorithm can utilize this
knowledge to improve on the accuracy of opinion
prediction. This improvement stems from the fact that
the inclusion of community information in the opinion
mining algorithm allows for group-specific behaviour and
norms to be accounted for when trying to assess e.g.
the semantic orientation of an opinion.
Opinion mining is the domain of natural language
processing and text analytics that aims at the discovery and
extraction of subjective qualities from textual sources.
Opinion mining tasks can be generally classified into three
types. The first task is referred to as sentiment analysis and
aims at the establishment of the polarity of the given
source text (e.g., distinguishing between negative, neutral
and positive opinions). The second task consists in
identifying the degree of objectivity and subjectivity of
a text (i.e., the identification of factual data as opposed to
opinions). This task is sometimes referred to as opinion
extraction. The third task is aims at the discovery and/or
summarisation of explicit opinions on selected features of
the assessed product. Some authors refer to the this task as
sentiment analysis. All three classes of opinion mining
tasks can greatly benefit from additional data that may be
provided from the social network. Added knowledge may
include: a nodes centrality indexes, a nodes group
membership, nomenclature utilized within the group,
average group opinion on selected products, groups
coherence and cohesion, etc. All these variables enrich
opinion mining algorithms and provide additional
explanatory capabilities to constructed models.
This paper is organized as follows. In Section II we
present some related work on opinion mining. Section III
introduces basic concepts used in opinion mining. In
Section IV we present an original algorithm for
discovering opinion polarity. Section V describes the
datasets gathered from popular Polish social networks, and
the results of conducted experiments are reported in
Section VI. The paper concludes in Section VII with a
brief summary and a future work agenda.
II. RELATED WORK
Literature related to social network analysis is
extremely abundant and rich. The first proposals to
perform social network analysis originated in the domains
of social sciences and psychology [12] or economy [13].
Interestingly, much of this research rephrased what has
been previously discussed in physics within the context of
complex systems [14]. The most thorough summary of
social network analysis topics, models and algorithms can
be found in [17].
Opinion mining is a relatively new domain spanning the fields of data mining, machine learning and natural language processing. Sentiment analysis methods can be regarded as supervised [1][5] or unsupervised [6][15] learning methods, as well as information retrieval methods [16][18]. Many works concerning opinion mining present concepts based on text documents modelled as sets of words [1] or as vectors, where dimensions represent words and values are the weights of words in the document [2].
In the vast majority of sentiment analysis methods, information about the connotations of a word with the positive or the negative class is used to calculate the document's semantic orientation:

y(d) = \begin{cases} C_P, & sol(d) > 0 \\ C_N, & sol(d) < 0 \end{cases}   (1)

where

sol(d) = \frac{\sum_{t_i \in d} score(t_i)}{|d|}   (2)

or

sol(d) = \sum_{t_i \in d} score(t_i)   (3)

where t_i is the i-th term of the document d, |d| is the number of terms appearing in the document d, C_P and C_N are the positive and negative classes, respectively, and score() is a function that assigns positive or negative values to terms, depending on their relationship with the respective class.
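As a sketch, the classification rule of Eqs. (1)-(3) can be written as follows; the small score lexicon is a hypothetical stand-in for whichever term-scoring method is actually used.

```python
def sol(doc, score, average=True):
    """Semantic orientation of a document: the mean (Eq. 2) or sum (Eq. 3)
    of the scores of its terms; unknown terms contribute 0."""
    total = sum(score.get(term, 0.0) for term in doc)
    return total / len(doc) if average else total

def classify(doc, score):
    """Eq. 1: assign C_P if sol(d) > 0 and C_N if sol(d) < 0."""
    s = sol(doc, score)
    return "C_P" if s > 0 else "C_N" if s < 0 else None  # sol(d) = 0 undecided

lexicon = {"excellent": 1.0, "poor": -1.0}  # illustrative scores only
print(classify(["an", "excellent", "movie"], lexicon))  # C_P
```

Note that Eq. (1) leaves sol(d) = 0 unassigned; the sketch returns None in that case.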
Semantic orientations of individual terms are aggregated using a dictionary method [5]. This method uses two small sets of manually identified positive and negative adjectives, which serve as seed sets. New terms are subsequently added to these sets if they are linked by semantically loaded conjunctions such as "and", "but", "however", etc.
Some opinion mining algorithms use the pointwise mutual information measure to determine the semantic orientation of a term [3][4][6]. In this case the semantic orientation of a term is inferred from the association between the term and a word (or a set of words) assigned unambiguously to only one class (positive or negative), e.g. excellent and poor. The pointwise mutual information of the term t and the word w is defined as

PMI(t, w) = \log\left(\frac{p(t, w)}{p(t)\,p(w)}\right)   (4)

where p(t, w) is the joint probability of the term t and the word w occurring together, while p(t) and p(w) are probabilities from the individual distributions of t and w, assuming their independence.
The semantic orientation of the term t is defined as

SO_{PMI}(t) = PMI(t, w_{pos}) - PMI(t, w_{neg})   (5)

where PMI(t, w_{pos}) and PMI(t, w_{neg}) are values calculated according to formula (4) for the positive and negative classes, respectively.
As the individual probabilities of terms are difficult to compute, a heuristic is sometimes employed. This heuristic considers the number of documents in the database where the term t is placed near semantically loaded words. Then, the pointwise mutual information becomes

PMI(t) = \log\left(\frac{hits(t, w_{pos}) \cdot hits(w_{neg})}{hits(t, w_{neg}) \cdot hits(w_{pos})}\right)   (6)

where hits(t, w_{pos}) and hits(t, w_{neg}) denote the number of documents in which the term t occurs close to at least one word representing the positive and negative class, respectively, while hits(w_{pos}) and hits(w_{neg}) represent the number of documents in which these words occur.
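A minimal sketch of the hit-count heuristic of Eq. (6); the counts below are made-up illustrations, since in practice they come from querying a document collection for co-occurrences with the seed words.

```python
import math

def so_pmi(hits_t_pos, hits_t_neg, hits_pos, hits_neg):
    """Eq. 6: log of the ratio of co-occurrence hit counts with positive
    and negative seed words; a positive result indicates a positive
    semantic orientation of the term."""
    return math.log((hits_t_pos * hits_neg) / (hits_t_neg * hits_pos))

# A term seen near the positive seed word in 100 documents and near the
# negative seed word in 10, with both seed words occurring in 500 documents:
print(so_pmi(100, 10, 500, 500))  # log(10), roughly 2.30
```

Because the seed-word counts appear in both the numerator and denominator, equally frequent seed words cancel out and the orientation is driven by the co-occurrence ratio alone.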

Among methods based on the concept of supervised learning, similar assumptions are made by the score classification algorithm [1]. In this case, the term scoring function has the following form:

score(t) = \frac{p(t|C_P) - p(t|C_N)}{p(t|C_P) + p(t|C_N)}   (7)

where p(t|C_P) and p(t|C_N) are the conditional probabilities of the occurrence of the term t in the positive and negative class, respectively. These probabilities may be approximated by term occurrence frequencies in the training set.
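A minimal sketch of the scoring function (7), approximating the conditional probabilities by relative frequencies; the function name and counts are our own illustration.

```python
def score_term(n_pos, n_neg, size_pos, size_neg):
    """Eq. 7: (p(t|C_P) - p(t|C_N)) / (p(t|C_P) + p(t|C_N)), with the
    probabilities approximated by relative frequencies in the training set."""
    p_pos = n_pos / size_pos
    p_neg = n_neg / size_neg
    return (p_pos - p_neg) / (p_pos + p_neg)

# A term seen in 9 of 1000 positive and 3 of 200 negative documents:
print(score_term(9, 3, 1000, 200))   # -0.25
# A term seen only in the positive class saturates at +1, however rare:
print(score_term(1, 0, 1000, 200))   # 1.0
```

The second call illustrates the drawback discussed in Section IV: any term occurring in only one class receives the extreme value regardless of how often it occurs.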
Other popular concepts described in the literature on sentiment analysis are families of methods based on the use of the Naive Bayes classifier, e.g. [20], and Support Vector Machines [1][21]. For other popular models and algorithms, and other opinion mining tasks, e.g. opinion extraction, we refer the reader to [19].
III. BASIC CONCEPTS

Opinion mining algorithms utilize heavily the available
text mining techniques. In order to clarify the description
of our approach and the description of conducted
experiments we introduce some basic notions from
the domain of text processing and mining.
Lemmatisation is a process of identifying the lemma of
a word. Algorithms for performing this operation typically
use dictionaries, where they look up the primary form of
the word. Lemmatisation may find several different
lemmas for a given word, if the word is the inflected form
of many various lemmas. The use of lemmatisation reduces
the number of terms present in the corpus and allows
matching of words in documents, even if words tend to
occur in different grammatical forms. However, in the case of text classification problems, the use of lemmatisation may deteriorate classification accuracy, because words derived from one lemma may occur in different inflected forms depending on the document's affiliation to one of the classes.
Stemming is a process similar to lemmatisation. It aims
to extract the core of the word, referred to as the stem,
from the inflectional word forms. Stemming typically
involves removal and replacement of prefixes and suffixes.
The result of stemming does not need to be and often is not
a proper lemma. The best known stemming algorithm is
the Porter stemmer [7].
A stop-list is a set of words that should be removed at
early stage of text processing. In most cases, these are
conjunctions and other words which do not contribute
additional information to the content of the sentence.
Often, stop-list words are present in the sentence solely due to the requirements of the language's grammar. In many
cases the use of stop-lists improves accuracy and
performance of text document processing.
A term is a token generated from the document. It may be a word, a lemma, a stem or an n-gram. An n-gram is a sequence of n letters appearing in a document's content, e.g. the character string opinion mining may be split into the following 8-grams: opinion_, pinion_m, inion_mi, nion_min, ion_mini, on_minin, n_mining. N-gram representation of documents is often an alternative to term representation. N-grams are lossless, because the text may be rebuilt, e.g. with the use of algorithms for DNA sequencing [8]. The n-gram representation allows the same operations on document collections as the term representation, but in addition offers extended functionality (e.g., spelling corrections).
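The splitting described above can be sketched as follows (spaces are shown as underscores, matching the example in the text):

```python
def char_ngrams(text, n):
    """All overlapping character n-grams of a string, with spaces
    replaced by underscores for readability."""
    s = text.replace(" ", "_")
    return [s[i:i + n] for i in range(len(s) - n + 1)]

print(char_ngrams("opinion mining", 8))
# ['opinion_', 'pinion_m', 'inion_mi', 'nion_min', 'ion_mini', 'on_minin', 'n_mining']
```

A string of length m yields m - n + 1 overlapping n-grams, which is why the 14-character example above produces seven 8-grams.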
IV. OUR APPROACH
The method proposed in this paper for determining
terms semantic orientation is a variant of the method
used in [1]. The drawback of the original method is that it
assigns maximum or minimum value to all terms if they
occur in only one class, regardless of the number of
occurrences. Therefore, we have proposed an alternative
way of calculating the semantic orientation of a term. Our
method is based on the ratio of term occurence frequency
in documents assigned to positive and negative classes.
According to our approach the scoring function for
assigning positive and negative scores to terms becomes
scorc(t) = _
p
t
-1 , iff p

1
-[
1
p
t
-1 , iff p

< 1
(8)
where

p
t
=
p(t|C
P
)+s
p(t|C
N
)+s
(9)

where, p
t
is the raw semantic orientation of the term t,
p(t|C
P
) and p(t|C
N
) are conditional probabilities of
occurences of the term t in documents from positive and
601
negative classes, respectively, and is a small positive
value controlling for terms that appear in only one class. In
our experiments we have used the reciprocal of |C
*
| as
value, where C
*
denotes the majority class. We refer to
the original method presented in [1] as the score method,
and we refer to the above described modification as
the proportional method.
Example: Let us compute the token polarity evaluation in the way presented above. Let us assume the training set contains 1000 positive and 200 negative examples, and the token t occurred 9 times in positive examples and 3 times in negative examples. Then \varepsilon = 1/\max(200, 1000) = 0.001, p(t|C_P) = 9/1000 = 0.009 and p(t|C_N) = 3/200 = 0.015, so p_t = (0.009 + 0.001)/(0.015 + 0.001) = 0.625, thus score(t) = -(1/0.625 - 1) = -0.6.
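The proportional method (Eqs. 8-9) and the worked example above can be checked with a short sketch; the function name is our own.

```python
def proportional_score(n_pos, n_neg, size_pos, size_neg):
    """Eqs. 8-9: ratio-based semantic orientation with the smoothing
    term epsilon set to the reciprocal of the majority class size."""
    eps = 1.0 / max(size_pos, size_neg)
    p_t = (n_pos / size_pos + eps) / (n_neg / size_neg + eps)
    return p_t - 1 if p_t >= 1 else -(1.0 / p_t - 1)

# The example from the text: 9 hits in 1000 positive documents,
# 3 hits in 200 negative documents.
print(round(proportional_score(9, 3, 1000, 200), 6))  # -0.6
```

Unlike the score method, a term occurring in only one class here gets a finite value bounded by the class-size ratio rather than saturating at ±1.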
The score value of a term determined as above increases or decreases with the changing frequency of term occurrences in the positive or negative class, even if the term occurs in only one class. Similarly to the score method, the disadvantage of the proportional method is the noise resulting from an insufficient number of term instances in the training set. However, when the proportional method is used, the influence of the noise is limited in comparison to the score method. This limitation results from the use of the scaling value \varepsilon. The score value assigned to a term which occurs only once in the training set is limited by the ratio of the cardinalities of the classes, whereas the semantic orientation of terms characteristic of positive or negative documents is often orders of magnitude greater. To further reduce the impact of the noise on the effectiveness of the algorithm, we propose to add filtering by removing from the dictionary terms that occur in fewer than \varphi documents, where

\varphi = \left\lfloor \frac{|C_*|}{|C_\#|} \right\rfloor + 2   (10)

and C_* denotes the majority class and C_\# denotes the minority class in the training set.
Setting the threshold \varphi of term occurrences in the training set allows us to eliminate terms that are not characteristic of any of the document classes, i.e. terms for which the conditional probabilities of occurrence are similar for both classes, but which occurred too rarely in the training set to have their evaluation determined as equal or close to zero.
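A one-line sketch of the filtering threshold of Eq. (10); the function name and class sizes are illustrative.

```python
import math

def min_document_frequency(majority_size, minority_size):
    """Eq. 10: terms occurring in fewer than floor(|C*|/|C#|) + 2
    training documents are removed from the dictionary."""
    return math.floor(majority_size / minority_size) + 2

# E.g., with 1000 majority-class and 200 minority-class documents:
print(min_document_frequency(1000, 200))  # 7
```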
V. EXPERIMENTS
A. Test sets
The main objective of the experiments was to test the accuracy of the classification algorithm proposed in Section IV. We used collections of opinions harvested from the e-commerce site Merlin and two social networks, Znany lekarz and Ceneo. The first dataset is the collection
of movie reviews from the Merlin website. The reviewers
were grading movies using the scale from 1 to 5, where
the reviews with grades 1 or 2 are considered negative, and
the reviews with grades 4 and 5 are considered positive.
We have discarded neutral reviews with grade equal to 3.
The dataset consists of 1055 negative reviews and 9068
positive reviews.
The second dataset contains opinions on consumer
products aggregated by the website Ceneo. Among the
reviews graded from 0 to 5, we have chosen 16 674
positive reviews graded 4 or 5, and 793 negative reviews
with grades 0 or 1. Again, we have discarded all neutral
reviews.
The third dataset comes from the website Znany lekarz
which gathers opinions about physicians. We have
assumed that opinions associated with grades 1 and 2 on
a scale 1-6 are negative, and opinions with grades 5 and 6
indicate a positive feedback. The dataset contains 2380
negative opinions and 11 764 positive opinions. In addition
we have performed tests using an aggregated dataset
created by merging the three datasets. The aggregated
dataset contains 4228 negative opinions and 37 506
positive opinions.
B. Performance measures
To evaluate the effectiveness of the classification we
have used two measures: classification accuracy (A) and
binary classification quality (Q). The latter measure is
similar to the F1 measure, but takes into account precision
and recall achieved in both classes. These measures are
expressed by equations:

A =
t
p
+t
n
t
p
+]
p
+t
n
+]
n
(11)


= _
4
1
rcc
P
+
1
rcc
N
+
1
prcc
P
+
1
prcc
N
,iff u e {rcc
P
, rcc
N
, prcc
P
, prcc
N
]
u ,iff u e {rcc
P
, rcc
N
, prcc
P
, prcc
N
]
(12)

where
prcc
P
=
t
p
t
p
+]
p
prcc
N
=
t
n
t
n
+]
n
(13, 14)
rcc
P
=
t
p
t
p
+]
n
rcc
N
=
t
n
t
n
+]
p
(15, 16)

where t
p
and f
p
are true positives and false positives (i.e.,
numbers of positive examples from the test set classified
correctly and incorrectly), and t
n
and f
n
are true negatives
and false negatives (i.e., numbers of negative examples
from the test set classified correctly and incorrectly).
The binary classification quality measure Q, in contrast
to the accuracy of classification A, is not vulnerable
to a considerable disproportion between the sizes of
the positive and negative class. This vulnerability is
noticeable when the classifier prefers the majority class for
which the number of examples in the test set is many times
greater than the number of examples of the minority class.
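Both measures can be sketched directly from the confusion-matrix counts; the example counts below are illustrative, not taken from the experiments.

```python
def accuracy(tp, fp, tn, fn):
    """Eq. 11: fraction of all test examples classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def quality(tp, fp, tn, fn):
    """Eq. 12: harmonic mean of precision and recall in both classes,
    defined as 0 whenever any of the four components is 0."""
    prec_p = tp / (tp + fp) if tp + fp else 0.0
    prec_n = tn / (tn + fn) if tn + fn else 0.0
    rec_p = tp / (tp + fn) if tp + fn else 0.0
    rec_n = tn / (tn + fp) if tn + fp else 0.0
    if 0.0 in (prec_p, prec_n, rec_p, rec_n):
        return 0.0
    return 4.0 / (1 / prec_p + 1 / prec_n + 1 / rec_p + 1 / rec_n)

# A classifier that always predicts the majority (positive) class can
# score high accuracy while its quality Q collapses to zero:
print(accuracy(100, 10, 0, 0))  # about 0.91
print(quality(100, 10, 0, 0))   # 0.0
```

The final example shows why Q is the more informative measure on the heavily imbalanced datasets described above.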
C. Experiment setup
We have performed 10-fold cross-validation experiments with both document representations: based on terms and based on n-grams. As the baseline we have used the score method. In the case of representations based on terms we have tested classification performance both using lemmatisation and stemming, and without text pre-processing. For stemming we have used Stempel [10], for lemmatisation we have used morfologik-stemming [9]. We have also tested the impact of dictionary filtering and removal of stop-words. Stop-word lists were created based on Polish Wikipedia [11]. In the case of n-gram representations we have tested the dependency of classification performance on n-gram length and the impact of dictionary filtering. We have not considered the impact of character case used in the original text, i.e., in all experiments all letters in all documents were converted to lower case. When the assignment of a document to either of the classes was not possible due to the lack of available tokens, we were assuming that the document has been erroneously assigned.

VI. RESULTS
In this Section we present the results obtained by running all combinations of tests described in Section V.

Figure 1: Maximal and average values of accuracy and quality, achieved during the tests using a document representation based on terms. P - proportional method, S - score method, PF, SF - filtering of dictionary created using the proportional and score method, respectively.

The results show better performance of the proportional method over the score method. The removal of rare tokens tends to lower the quality and accuracy, which is possibly caused by an insufficient number of tokens in documents.
Table 1: Sums of ranks from descending rankings of algorithm configurations, calculated independently for each set based on values of quality Q and accuracy A. The value in each row is the sum of all ranking positions assigned to algorithm configurations.

Conf. element value         Q    A
No pre-processing         330  283
Stemming                  379  383
Lemmatization             491  534
Proportional method       184  170
Prop. method + filtering  286  262
Score method              359  292
Score method + filtering  371  476
with stop-words           637  629
w/o stop-words            563  571

A. Experiments on term representation
Table 1 shows the best performance of classification when no stemming or lemmatization was performed. We believe that the reasons for such behavior stem from the properties of the Polish language, where the vocabulary used to express positive and negative opinions strongly overlaps, but the frequency of specific grammatical forms may vary. For example, the word używać (to use) may occur more frequently in a negative context (Nie będę więcej używać tego produktu - I will never use this product again) than in a positive context (Używam tego produktu i jestem zadowolony - I am using this product and I am satisfied). The influence of the removal of stop words is minimal and ambiguous.

Figure 2: Average values of accuracy and quality, achieved during the tests using a document representation based on n-grams.

B. Experiments on n-gram representation
The length of the n-grams for which the best quality and accuracy is attained is 7 or 8. Maximum accuracy and quality is almost independent of the parameter n, but average results strongly point to long n-grams. Interestingly, among the n-grams having the highest values of semantic orientation there are n-grams spanning between words, e.g., czo odr (stanowczo odradzam - I strongly discourage) or dzo pol (bardzo polecam - I highly recommend).

Figure 3: Maximal values of accuracy and quality, achieved during the tests using a document representation based on n-grams.

VII. CONCLUSIONS
In this paper we have discussed the possibility and benefit of using the social network environment for opinion mining. We believe that social networks present a perfect solution to the problem of opinion acquisition and dissemination and may be perceived as natural enablers for opinion mining applications.
We have presented, as a proof of concept, examples of analysis that aim at gathering user opinions in two different application areas. Both experiments suggest that the social networks fuelling the websites in question provide relevant context for opinion mining. We are aware of the fact that we have not utilized the information from the social network directly in the opinion mining algorithm. Rather, we have merely tested the ability to attain high accuracy and quality of sentiment prediction using the data harvested from a social network site.
Our future work agenda includes the analysis of the user's reception of opinions contained in the text and further improvements of the presented algorithm. We expect to attain an improvement of classification performance due to the utilization of more contextual information derived from the social networks, namely, the information on relationships and connections between users. We also intend to develop an active learning strategy for this type of classification task.
VIII. BIBLIOGRAPHY
[1] K. Dave, S. Lawrence, and D. M. Pennock, Mining the peanut gallery: opinion extraction and semantic classification of product reviews, In Proceedings of the 12th international conference on World Wide Web, WWW '03, pages 519-528, New York, NY, USA, 2003. ACM
[2] R.F. Xu, K.F. Wong, and Y.Q. Xia, Coarse-Fine Opinion Mining - WIA in NTCIR-7 MOAT Task, Proc. of NTCIR-7, Japan, 2008
[3] G. Wang, K. Araki, Modifying SO-PMI for Japanese Weblog Opinion Mining by Using a Balancing Factor and Detecting Neutral Expressions, In Proceedings of NAACL HLT 2007, Companion Volume, pages 189-192, Rochester, NY, 2007
[4] G. Wang, K. Araki, A Graphic Reputation Analysis System for Mining Japanese Weblog Based on both Unstructured and Structured Information, In Advanced Information Networking and Applications - Workshops, AINAW 2008, 22nd International Conference on, pages 1240-1245, Okinawa, 2008
[5] V. Hatzivassiloglou, K.R. McKeown, Predicting the semantic orientation of adjectives, In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and the 8th Conference of the European Chapter of the ACL, pages 174-181, New Brunswick, NJ: Association for Computational Linguistics, 1997
[6] P. D. Turney, M. L. Littman, Unsupervised learning of semantic orientation from a Hundred-Billion-Word Corpus, 2002. [Online]. Available: http://arxiv.org/abs/cs/0212012
[7] M. F. Porter, An algorithm for suffix stripping, Program, vol. 14, no. 3, pp. 130-137, 1980. [Online]. Available: http://portal.acm.org/citation.cfm?id=275705
[8] J. Blazewicz, M. Kasprzak, W. Kuroczycki, Hybrid Genetic Algorithm for DNA Sequencing with Errors, Journal of Heuristics, vol. 8, no. 5, pp. 495-502, 2002. [Online]. Available: http://www.springerlink.com/content/c7y26bnn1mrvkx1d
[9] D. Weiss, M. Miłkowski, morfologik-stemming, [Online]. Available: http://morfologik.blogspot.com/, 2010
[10] A. Białecki, Stempel - Algorithmic Stemmer for Polish Language, [Online]. Available: http://www.getopt.org/stempel/, 2004
[11] Stop listy - Wikipedia, wolna encyklopedia (Stop lists - Wikipedia, the free encyclopedia), [Online]. Available: http://pl.wikipedia.org/wiki/Stop_listy, 2010
[12] J. Moreno, H. Jennings, Statistics of social configurations, Sociometry, pp. 342-374, 1938
[13] S. D. Berkovitz, Markets and market-areas: Some preliminary formulations, In B. Wellman & S. D. Berkovitz (Eds.), Social Structures: A Network Approach (pages 261-303), Cambridge, England: Cambridge University Press, 1988
[14] S. Bocaletti, V. Latora, Y. Moreno, M. Chavez, Complex networks: structure and dynamics, Physics Reports (424), pp. 175-308, 2006
[15] M. Hu, B. Liu, Mining opinion features in customer reviews, In AAAI'04: Proceedings of the 19th national conference on Artificial intelligence (2004), pp. 755-760
[16] W. Zhang, C. Yu, W. Meng, Opinion retrieval from blogs, In CIKM '07: Proceedings of the sixteenth ACM conference on information and knowledge management (2007), pp. 831-840
[17] S. Wassermann, K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, 1994
[18] A.M. Popescu, O. Etzioni, Extracting Product Features and Opinions from Reviews, Natural Language Processing and Text Mining, pp. 9-28, 2007
[19] B. Pang, L. Lee, Opinion Mining and Sentiment Analysis, Now Publishers Inc., 2008
[20] H. Yu, V. Hatzivassiloglou, Towards Answering Opinion Questions: Separating Facts from Opinions and Identifying the Polarity of Opinion Sentences, In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, pp. 129-136, 2003
[21] M. Gamon, Sentiment classification on customer feedback data: Noisy data, large feature vectors, and the role of linguistic analysis, In Proceedings of the International Conference on Computational Linguistics (COLING), 2004