A Machine Learning Approach To Building A Tourism
All content following this page was uploaded by Aarushi Phade on 17 March 2021.
International Journal of Computer Applications (0975 – 8887)
Volume 178 – No. 19, June 2019
text classification, for example the Multivariate Bernoulli model and the Multinomial model, are portrayed. The results state that the Multivariate Bernoulli algorithm performs well on small vocabulary sizes, while the Multinomial model performs better on large vocabulary sizes. The performance of Multinomial Naïve Bayes can be improved by using locally weighted learning [7].

In [6], various ways of building a recommendation list for the travel industry are examined. It states that, by using content-based scoring, a system can use typical tourist media information to bring score-based contents and their semantics into the general inference process.

In [8], ensemble classification is used to perform sentiment analysis on a Twitter dataset. Ensemble classification combines the outputs of different independent classifiers on a specific problem, and it beats traditional machine learning classifiers by 3-5%.

The paper in [9] implements a sentiment classification task on the Amazon Fine Food Reviews dataset and the Yelp Challenge dataset. James Berry compared two approaches: first, a conventional bag-of-words approach using Multinomial Naïve Bayes and Support Vector Machine classifiers, and second, a Long Short-Term Memory (LSTM) recurrent neural network with GloVe embeddings and self-learned Word2Vec embeddings. The paper concludes that the LSTM is the best-performing algorithm.

3. PROPOSED METHODOLOGY
3.1 Overview of the System
The proposed system aims to reduce the effort on the user's part. The system will create a recommendation list which is curated using the results of the analysis of numerous user reviews and the inputs given by the user. The deep learning algorithm will determine the extent to which the review of a place is positive or negative. Based on the result, the rank of the place in the recommendation list will be decided. A more positive result will rank the place higher and increase the chances of it being recommended, while a more negative result will rank the place lower, thereby decreasing the likelihood of it being recommended. Each of these places is categorized based on what it offers; for instance, the Taj Mahal, being a historical site, offers historical and heritage value. A user will enter their preferences. These include the type of location they want to visit (adventurous, historical, architectural, etc.), the number of people traveling and children (if any), and the number of days they plan to take the trip for. Based on these parameters and the user reviews for each place, the recommendation list will be generated uniquely for that user. It will be mapped to the individual user's requirements and a tailored trip will be generated. Thus, the user won't have to settle for the generalized plans that tour businesses generally offer.

This system works in two phases. In the first phase, the reviews and other relevant data are gathered, and an average rating is assigned to each place. In the second phase, the ratings assigned in the previous phase and other parameters taken from the user are used to generate a unique recommendation list. Thus, every user gets a tailored tour plan that actually considers their opinions.

3.2 Design of the Recommendation List
The crux of the system is the recommendation list, which maps the user's interests to ratings analyzed from reviews. Ratings, ambience, cleanliness, must-visit, nightlife, parking and peacefulness are the factors considered while generating the recommendation list. For features which a user would generally want, a full score is given. The following formula has been devised for the same:

Score(place=X) = 10*ambience + 10*cleanliness + 10*must-visit
               + nightlife*NightlifeUserValue
               + parking*ParkingUserValue
               + peacefulness*PeacefulnessUserValue
               + childSafety*ChildSafetyUserValue
               + 10*ratings

Here, the maximum value of each feature is 5, thus the maximum total score of any place will be 400 (eight terms of at most 10 × 5 = 50 each, which implies user values of at most 10). The features ambience, cleanliness, must-visit status and ratings are preferred by most users, so their weight is assumed to be the maximum. For the other features, the user's inputs determine the weight. The sum of these terms gives a holistic score for each tourist place, and arranging the scores in descending order generates the recommendation list.

3.3 Getting Data
Deep learning algorithms used for sentiment analysis require a vast dataset. For this reason, the Amazon Product Reviews dataset from Kaggle, which has 3.6 million reviews, has been used. From this dataset, 1 million reviews are taken: 600,131 positive reviews and the remaining 399,869 negative reviews. Along with these, reviews of different places have been gathered using the Google API. When one searches for a place on Google, the Google API returns data about that place along with the 5 most recent reviews for it. Additionally, openly accessible reviews on sites like TripAdvisor and Google are taken to build the dataset. Likewise, a survey was conducted to collect reviews about various places. Using these sources, a total of 30,000 reviews of different places were accumulated. These reviews are manually labelled as positive or negative.

From the above dataset, 1 million reviews from the Amazon Product Reviews dataset and 20,000 reviews of places are used as the training set for the algorithm, while the remaining 10,000 reviews of places are used for testing.

As a model for the recommendation system, the state of Goa in India is considered. 26 places from Goa are chosen. Values for features like ambience, cleanliness and peacefulness of each place are assigned manually by reading reviews.

3.4 Model Building
To find the best-performing models, the following machine learning and deep learning algorithms were considered and implemented on the Amazon Reviews dataset:

I) Bernoulli Naïve-Bayes
In the multivariate Bernoulli event model, features are independent Booleans (binary variables) describing inputs. Like the multinomial model, this model is popular for document classification tasks, where binary term occurrence features are used rather than term frequencies. If $x_i$ is a Boolean expressing the occurrence or absence of the $i$-th term from the vocabulary, then the likelihood of a document given a class $C_k$ is given by

$$p(x \mid C_k) = \prod_{i=1}^{n} p_{ki}^{x_i} (1 - p_{ki})^{(1 - x_i)}$$

where $p_{ki}$ is the probability of class $C_k$ generating the term $x_i$. This event model is especially popular for classifying short texts. It has the benefit of explicitly modelling the absence of terms. When implemented on the Amazon dataset, it had an accuracy of 82.75% and an f1-score of 0.83.

II) Multinomial Naïve-Bayes
With a multinomial event model, samples (feature vectors) represent the frequencies with which certain events have been
generated by a multinomial $(p_1, \ldots, p_n)$, where $p_i$ is the probability that event $i$ occurs (or $K$ such multinomials in the multiclass case). A feature vector $x = (x_1, \ldots, x_n)$ is then a histogram, with $x_i$ counting the number of times event $i$ was observed in a particular instance. This is the event model typically used for document classification, with events representing the occurrence of a word in a single document. The likelihood of observing a histogram $x$ is given by

$$p(x \mid C_k) = \frac{\left(\sum_{i=1}^{n} x_i\right)!}{\prod_{i=1}^{n} x_i!} \prod_{i=1}^{n} p_{ki}^{x_i}$$

where $p_{ki}$ is the probability of event $i$ under class $C_k$.

If a given class and feature value never occur together in the training data, then the frequency-based probability estimate will be zero. This is problematic because it will wipe out all information in the other probabilities when they are multiplied. Therefore, it is often desirable to incorporate a small-sample correction, called a pseudocount, in all probability estimates such that no probability is ever set to be exactly zero. This way of regularizing Naïve Bayes is called Laplace smoothing when the pseudocount is one.

Implementing this algorithm on the Amazon dataset yields an accuracy of 84.48% and an f1-score of 0.85.

III) Random Forest Classifier
A random forest is an ensemble of decision trees, each constructed from class-labelled training tuples, whose individual predictions are combined (for classification, typically by majority vote). A decision tree is a flow-chart-like structure, where each internal (non-leaf) node denotes a test on an attribute, each branch represents the outcome of a test, and each leaf (or terminal) node holds a class label. The topmost node in a tree is the root node. Classification and Regression Tree (CART), Iterative Dichotomiser 3 (ID3) and Chi-squared Automatic Interaction Detector (CHAID) are a few types of decision tree learning algorithms.

When trained on the Amazon Reviews dataset, this algorithm achieves an accuracy of 84.60% and an f1-score of 0.85.

IV) Convolutional Neural Network
A convolutional neural network consists of an input and an output layer, as well as multiple hidden layers. The hidden layers of a CNN typically consist of convolutional layers, ReLU layers (i.e. the activation function), pooling layers, fully connected layers and normalization layers.

The process is described as a convolution in neural networks by convention. Mathematically, it is a cross-correlation rather than a convolution. This only has significance for the indices in the matrix, and thus which weights are placed at which index.

Convolutional networks were inspired by biological processes in that the connectivity pattern between neurons resembles the organization of the animal visual cortex. Individual cortical neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. The receptive fields of different neurons partially overlap such that they cover the entire visual field.

As expected, the CNN model yielded an accuracy of 94.40%.

V) Long Short-Term Memory RNN
For the neural network approach, LSTM RNNs have been used because they generally perform better than traditional RNNs. A problem arises when using traditional RNNs for NLP tasks because the gradients from the objective function can vanish or explode after a few iterations of multiplying the weights of the network. For such reasons, simple RNNs have rarely been used for NLP tasks such as text classification [7]. In such a scenario, one can turn to another model in the RNN family such as the LSTM model. LSTMs are better suited to this task due to the presence of input gates, forget gates, and output gates, which control the flow of information through the network.

An accuracy of 94.56% was obtained using this algorithm on the Amazon Reviews dataset.

4. RESULTS
For the neural network approach, LSTM RNNs are used because they generally perform better than traditional RNNs at learning relationships.

Table 1. Results
Algorithm Used                  Accuracy
Bernoulli Naïve-Bayes           82.75%
Multinomial Naïve-Bayes         84.48%
Random Forest                   84.60%
Convolutional Neural Network    94.40%
Recurrent Neural Network        94.56%

As Table 1 shows, the Recurrent Neural Network performs the best.

5. CONCLUSION
Thus, to develop the recommendation list, various machine learning and deep learning algorithms have been discussed to analyze the reviews of the Amazon Reviews dataset. As can be seen from the evidence above, the Recurrent Neural Network proves to be the model which yields the highest accuracy, 94.56%. Thus, in this experiment a deep learning algorithm outperforms the machine learning algorithms and is consequently chosen to classify the user reviews. The proposed system will thus take the output of this analysis and map it to the user's interests.

In the proposed system, the reviews are looked at holistically. Breaking a review down based on multiple core properties may result in a more in-depth and accurate classification. For instance, in a review about a tourist spot, extracting features like parking availability, cleanliness and child safety may prove to be helpful and needs further exploration in the future.

6. ACKNOWLEDGEMENTS
We gratefully acknowledge Amazon Co. and Kaggle for making the dataset of the Amazon Product Reviews openly available.
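As an illustration of the scoring formula in Section 3.2, the following is a minimal Python sketch, not the authors' implementation. The `Place` class, the `score` and `recommend` functions, the example places and their feature values are all hypothetical. It also assumes that feature values lie between 0 and 5 and user values between 0 and 10, which is consistent with the stated maximum score of 400.

```python
from dataclasses import dataclass

FULL_WEIGHT = 10  # weight for features that most users are assumed to want


@dataclass
class Place:
    # Hypothetical container; every feature value is assumed to lie in 0..5.
    name: str
    ambience: float
    cleanliness: float
    must_visit: float
    nightlife: float
    parking: float
    peacefulness: float
    child_safety: float
    ratings: float


def score(place, user_values):
    """Holistic score of a place, following the Section 3.2 formula."""
    # Features assumed to matter to everyone get the full weight of 10.
    fixed = FULL_WEIGHT * (
        place.ambience + place.cleanliness + place.must_visit + place.ratings
    )
    # The remaining features are weighted by the user's own inputs (assumed 0..10).
    personal = (
        place.nightlife * user_values["nightlife"]
        + place.parking * user_values["parking"]
        + place.peacefulness * user_values["peacefulness"]
        + place.child_safety * user_values["child_safety"]
    )
    return fixed + personal


def recommend(places, user_values):
    """Return the places sorted by score, highest first."""
    return sorted(places, key=lambda p: score(p, user_values), reverse=True)


# Made-up example data for two places and one family-oriented user.
places = [
    Place(name="Beach A", ambience=4, cleanliness=3, must_visit=5, nightlife=5,
          parking=2, peacefulness=1, child_safety=2, ratings=4),
    Place(name="Fort B", ambience=5, cleanliness=4, must_visit=4, nightlife=1,
          parking=4, peacefulness=5, child_safety=4, ratings=4),
]
user = {"nightlife": 2, "parking": 8, "peacefulness": 10, "child_safety": 10}

for place in recommend(places, user):
    print(place.name, score(place, user))
```

With these made-up inputs, the quieter and more child-friendly place ranks first because the user weights peacefulness and child safety highly. A user with a high nightlife value would see the order change.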
IJCATM : www.ijcaonline.org 51