Fake Reviews Detection Based On LDA
Shaohua Jia, Xianguo Zhang, Xinyue Wang, Yang Liu
Abstract—It is necessary for potential consumers to make decisions based on online reviews. However, their usefulness brings forth a curse: deceptive opinion spam. Deceptive opinion spam misleads potential customers and organizations reshaping their businesses, and prevents opinion-mining techniques from reaching accurate conclusions. Thus, the detection of fake reviews has become more and more pressing. In this work, we attempt to find out how to distinguish between fake reviews and non-fake reviews using linguistic features on the Yelp Filter Dataset. To our surprise, the linguistic features performed well. Further, we propose a method to extract features based on Latent Dirichlet Allocation. The experimental results prove that the method is effective.

Keywords—Review detection; Linguistic features; Latent Dirichlet Allocation

I. INTRODUCTION

With the dramatic increase of online reviews, review spam has come along, owing to the fact that there is no control and anyone can write anything on the web [1]. Reviews that describe an authentic post-purchase experience can help potential consumers find a satisfactory commodity and help businesses position themselves accurately. By contrast, review spam misleads both consumers and businesses. Thus, detection of review spam has become increasingly urgent and important.

There are generally three types of spam reviews. Type 1: untruthful opinions (also known as fake reviews); Type 2: reviews on brands only; Type 3: non-reviews [1].

In this paper, we aim to detect deceptive fake reviews by looking into the deep-level semantics of reviews. Our goal is to cast deceptive fake review detection as a binary classification task and to build classification models. We use term frequency, LDA, and word2vec to extract features, feed the features extracted from each review of our dataset into several machine learning models for classification, and finally compare the performance of the features across those models.

We perform our experiment on the Yelp dataset, as Yelp.com is a well-known large-scale online review site that filters fake or suspicious reviews, and the filtered reviews can be used as fake reviews in our experiment. In the end, we demonstrate the validity of our method through experiments.

The rest of the paper is organized as follows: in Section 2 we summarize related work; in Section 3 we discuss our dataset, features, and classifiers; we show the results and discussion in Section 4; finally, conclusions and future work are given in Section 5.

II. RELATED WORKS

There are many significant studies on how to classify authentic and fictitious reviews. Ott et al. collected 800 deceptive opinions via Mechanical Turk and 800 truthful opinions from TripAdvisor; integrating work from psychology and computational linguistics, they developed and compared three approaches to detecting deceptive opinion spam, ultimately developing an admirable classifier that is nearly 90% accurate on their gold-standard opinion spam dataset [2, 3]. It is commendable that Ott et al. released their gold-standard opinion spam dataset, which has had a great impact on the field of fake review detection. Nitin Jindal and Bing Liu deal with a restricted problem: identifying unusual review patterns that can represent suspicious behaviors of reviewers [4]. Snehasish Banerjee and Alton Y. K. Chua extract linguistic features to distinguish fake reviews by word n-grams, psycholinguistic deception words, part-of-speech distributions, review readability, and review writing style [5]. Heydari et al. focus on systematically analyzing and categorizing models that detect review spam [6]. Michael Crawford et al. provide a strong and comprehensive comparative study of current research on detecting review spam using various machine learning techniques [7]. Interestingly, Lim et al. proposed ranking and supervised methods to discover spammers, outperforming a baseline method based on helpfulness votes [8].

In terms of the Yelp Filter Dataset, Mengqi Yu found that sentiment features are very useful for rating prediction [9]. Dao Runa treated fake review detection as a binary classification task and built classification models by extracting semantic-based and relational-based features with several data mining techniques [10]. Boya Yu used a Support Vector Machine model to decipher the sentiment tendency of each review from word frequency; word scores generated from the SVM models were further processed into a polarity index indicating the significance of each word for specific types of restaurants [11].
TABLE I. RESULTS REPORTED IN [12] ON THE YELP FILTER DATASET (%)

                        Hotel                   Restaurant
Features                P     R     F1    A     P     R     F1    A
Word unigrams (WU)      62.9  76.6  68.9  65.6  64.3  76.3  69.7  66.9
WU + IG (top 2%)        62.4  76.7  68.8  64.9  64.1  76.1  69.5  66.5
Word bigrams (WB)       61.1  79.9  69.2  64.4  64.5  79.3  71.1  67.8
WB + POS bigrams        63.2  73.4  67.9  64.6  65.1  72.4  68.6  68.1
WB + Deep syntax        62.3  74.1  67.7  64.1  65.8  73.8  69.6  67.6
WB + POS seq. pat.      63.4  74.5  68.5  64.5  66.2  74.2  69.9  67.7
TABLE II. DATASET STATISTICS

Domain       Fake   Non-fake   %Fake   Total
Hotel         802       4876   14.1%    5678
Restaurant   8368      50114   14.3%   58517

TABLE III. NEW DATASET STATISTICS

Domain                 Fake   Non-fake   %Fake   Total
Hotel and restaurant   4017       8034    1/3    12051
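As context for the classifiers described next, the following is a minimal sketch of the term-frequency feature extraction outlined in the introduction. It assumes scikit-learn; the toy `reviews` and `labels` variables are illustrative placeholders, not the authors' data or code.

```python
# A minimal sketch, not the authors' pipeline: bag-of-words
# term-frequency features, assuming scikit-learn. The toy
# `reviews` and `labels` (1 = fake, 0 = non-fake) are placeholders.
from sklearn.feature_extraction.text import CountVectorizer

reviews = [
    "The room was clean but the parking lot was hard to find.",
    "Service was slow and the soup arrived cold.",
    "Decent food, although the portions were a bit small.",
    "Absolutely amazing, best hotel ever, perfect in every way!",
    "Incredible, a must visit, five stars, simply the best!",
    "Wonderful wonderful wonderful, everyone must come here!",
]
labels = [0, 0, 0, 1, 1, 1]

# One row per review, one column per vocabulary word.
vectorizer = CountVectorizer(lowercase=True, stop_words="english")
X = vectorizer.fit_transform(reviews)
y = labels
print(X.shape)  # (6, vocabulary_size)
```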
C. Classification and Evaluation

Features from the two approaches just introduced are used to train Support Vector Machine, Logistic Regression, and Multi-layer Perceptron classifiers. The classification results of the above-mentioned techniques are evaluated by accuracy, precision, recall, and F-measure.
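A minimal sketch of this training and evaluation protocol, assuming scikit-learn; `X` and `y` are the feature matrix and labels from the previous sketch, and the 80/20 split anticipates the setup described in Section IV.A.

```python
# Hedged sketch: train SVM, Logistic Regression, and Multi-layer
# Perceptron on an 80/20 split and report the four metrics used
# in the paper. `X` and `y` come from the previous sketch.
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

models = {
    "SVM": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Multi-layer Perceptron": MLPClassifier(max_iter=500),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(name,
          "A=%.3f" % accuracy_score(y_test, pred),
          "P=%.3f" % precision_score(y_test, pred, zero_division=0),
          "R=%.3f" % recall_score(y_test, pred, zero_division=0),
          "F1=%.3f" % f1_score(y_test, pred, zero_division=0))
```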
IV. RESULTS AND DISCUSSION

A. Results

In our experiment, we train SVM, Logistic Regression, and Multi-layer Perceptron models in Python 3.6. We choose 80% of the dataset as training data and 20% as testing data. The experimental results are shown in Table VII. We then compare our experimental results with the results in [12]; the comparison is shown in Fig. 1. The difference between processing the data with LDA and without LDA is shown in Fig. 2.

Table VII shows that SVM yielded an accuracy of 65.7% and LDA + SVM yielded a maximum accuracy of 67.9%, slightly lower than the 68.1% accuracy reported by Arjun Mukherjee on the Yelp Filter Dataset [12]. However, LDA + Logistic Regression yielded a maximum accuracy of 81.3%, clearly higher than the 68.1% accuracy, and LDA + Multi-layer Perceptron also yielded a maximum accuracy of 81.3%. The accuracy of LDA + Logistic Regression matches that of LDA + Multi-layer Perceptron, but on fake reviews the F1-score of LDA + Logistic Regression is slightly higher than the 71.1% F1-score of LDA + Multi-layer Perceptron.

In Fig. 1, the green line represents the LDA + Logistic Regression results of this paper, while the red and gray lines represent the best hotel and restaurant results in [12], respectively. The green line lies far above the red and gray lines, which indicates that the method of this paper is effective on the binary classification task.

In Fig. 2, the red line represents the accuracy with LDA and the green line represents the accuracy without LDA. The accuracy with LDA is slightly higher than the accuracy without LDA, which indicates the effectiveness of LDA in this experiment.

B. Discussion

LDA can extract topic words from a document, and to some extent the topic words can represent the whole document. Thus, we use LDA to extract topic words separately from fake reviews and from non-fake reviews, so that the extracted words better reflect the characteristics of each class. Then, when we count the term frequency of each word, the important words that reflect the characteristics of fake or non-fake reviews obtain a higher term frequency, which in turn increases the accuracy of the classification models. However, because the quantity of data is enormous and the number of topic words is far smaller, the accuracy with LDA is only slightly higher than the accuracy without LDA. A minimal sketch of this topic-word extraction is given below.
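The following sketch illustrates the per-class topic-word extraction just described, assuming gensim's LdaModel; the tokenization, `num_topics`, `topn`, and the toy review lists are illustrative assumptions, not the authors' reported settings.

```python
# A minimal sketch, not the authors' pipeline: extract topic words
# separately from fake and non-fake reviews with LDA (gensim).
from gensim.corpora import Dictionary
from gensim.models import LdaModel

def topic_words(docs, num_topics=2, topn=5):
    tokenized = [doc.lower().split() for doc in docs]
    dictionary = Dictionary(tokenized)
    corpus = [dictionary.doc2bow(tokens) for tokens in tokenized]
    lda = LdaModel(corpus, num_topics=num_topics,
                   id2word=dictionary, random_state=0)
    # Collect the top words of every topic.
    return [[word for word, _ in lda.show_topic(t, topn=topn)]
            for t in range(num_topics)]

# Toy stand-ins for the two review collections (cf. Tables IV and V).
fake_reviews = ["best hotel ever amazing", "must visit simply perfect"]
nonfake_reviews = ["the soup arrived cold", "parking was hard to find"]
print(topic_words(fake_reviews))
print(topic_words(nonfake_reviews))
```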
V. CONCLUSION AND FUTURE WORK

This paper performed a linguistic investigation of the nature of fake reviews in the commercial setting of Yelp.com. Our study shows that linguistic features yielded a respectable 81.3% accuracy, clearly higher than the 68.1% accuracy that Arjun Mukherjee reported for linguistic features on the Yelp Filter Dataset [12]. Meanwhile, the study proved the effectiveness of features extracted based on LDA.

A possible direction for future work is to explore why Logistic Regression and Multi-layer Perceptron achieve high accuracy while SVM does not. One hypothesis is that the sigmoid function has a decisive influence; this will be tested in future work.

TABLE IV. TOPIC-WORDS OF FAKE REVIEWS

Topic1        Topic2      Topic3         Topic4         Topic5
promise       park        stones         writing        cube
quality       comments    discarded      reserve        parings
pushy         ramps       split          injure         shined
rationalize   edge        eavesdrop      damn           pomp
podium        cliff       strict         autographed    bamboo
decorated     spray       breadth        hate           heroin
peeled        shots       settle         zealand        absurd
gulped        care        swirling       olfactory      unsalted

TABLE V. TOPIC-WORDS OF NON-FAKE REVIEWS

Topic1        Topic2      Topic3         Topic4         Topic5
extremely     decadent    confirmed      prospect       collective
burnt         entertain   duke           eaten          smiled
vaguely       hiccup      warm           previous       cultural
arrives       successor   pour           night          mystery
content       troubles    laugh          dish           smothering
unstuck       mustards    transmogrify   completely     observing
twists        brighter    care           recognizable   kindle
redefining    responds    school         notable        tire

TABLE VI. EXPERIMENT DATA

Domain                 Fake   Non-fake   %Fake   Total
Hotel and restaurant   4167       8234   33.6%   12401

TABLE VII. EXPERIMENTAL RESULTS (%)

                                                                       Non-fake            Fake
Classifier               Features                                A     P     R     F1      P     R    F1
Logistic Regression      LDA + Logistic Regression               81.3  85.2  87.2  86.2    73.0  69.5  71.2
                         Word2Vec + Logistic Regression          65.1  65.1  100   78.9    \     0     \
                         LDA + Word2Vec + Logistic Regression    65.8  66.8  100   80.1    \     0     \
Multi-layer Perceptron   Multi-layer Perceptron                  80.3  84.0  86.2  85.1    72.9  69.4  71.1
                         LDA + Multi-layer Perceptron            81.3  85.0  87.4  86.2    73.2  69.1  71.1
                         Word2Vec + Multi-layer Perceptron       65.5  65.5  100   79.2    \     0     \
                         LDA + Word2Vec + Multi-layer Perceptron 65.7  65.7  100   79.3    \     0     \
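The "\" entries and the recall values of 100 and 0 in the Word2Vec rows of Table VII are consistent with a classifier that labels every review as non-fake: non-fake recall is then 100%, fake recall is 0, and fake precision is undefined because no fake predictions are made. A toy check of this reading, assuming scikit-learn (this interpretation is ours, not the authors'):

```python
# Toy check of the degenerate-classifier reading of Table VII:
# predicting every review as non-fake gives non-fake recall 1.0,
# fake recall 0.0, and undefined fake precision (no fake predictions).
from sklearn.metrics import precision_score, recall_score

y_true = [0, 0, 0, 1, 1]  # 0 = non-fake, 1 = fake
y_pred = [0, 0, 0, 0, 0]  # degenerate: everything labeled non-fake

print(recall_score(y_true, y_pred, pos_label=0))                      # 1.0
print(recall_score(y_true, y_pred, pos_label=1))                      # 0.0
print(precision_score(y_true, y_pred, pos_label=1, zero_division=0))  # undefined, reported as 0
```

Under this reading, non-fake precision equals the non-fake prevalence of the test set, which matches the Word2Vec + Logistic Regression row, where accuracy and non-fake precision are both 65.1.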