Restaurant Recommendation System Using Machine Learning
Ketan Mahajan et al., International Journal of Advanced Trends in Computer Science and Engineering, 10(3), May - June 2021, 1671 – 1675

ABSTRACT
When going out to a new restaurant or cafe, people usually use websites or applications to look up nearby places and then choose one based on its average rating. Most of the time, however, the average rating is not enough to predict the quality or hygiene of a restaurant, because different people have different perspectives and priorities when evaluating a restaurant. Many online businesses have now implemented personalized recommendation systems, which try to identify user preferences and then provide relevant products to enhance the user's experience. In turn, users can explore what they might like with convenience and ease because of the recommendation results. Finding an ideal restaurant can still be a struggle because the mainstream recommender apps have not yet adopted the personalized recommender approach. We took up this challenge and aim to build a prototype of a personalized recommender system that incorporates metadata, that is, the information provided by the online interactions of customers and restaurants (reviews), which gives a good idea of customers' satisfaction and taste as well as of the features of the restaurant. This approach improves the user's experience of finding a restaurant that suits their taste. This paper uses a package called lightfm (a Python library for implementing popular recommendation algorithms) and a dataset from Yelp. Among the different methods of filtering the data, we use hybrid filtering, which is a combination of content-based filtering (CBF) and collaborative filtering (CF), since the results from hybrid filtering are more accurate than those from CBF or CF alone. After training and testing on the data, hybrid filtering gives results in the form of personalized recommendations for users.

Key words: Restaurant recommendation system, Content-based filtering, Collaborative filtering, Hybrid filtering

1. INTRODUCTION
Going out to a new restaurant is a big challenge faced by people, as nowadays there are a lot of restaurants, and choosing the one that has good taste according to the needs of the person can be a nightmare. The opinions and feelings of the public about a restaurant's taste and hygiene greatly influence the user's opinion about the restaurant. If a product has many negative reviews, this affects the user's opinion of and trust in that product in a negative way. While exploring the available offers for a certain product, users appreciate being able to access items generated by a recommender system, as it saves time as well as money, but the recommendation system should only include the products that suit the user's preferences the most. In this paper we have implemented a recommendation system using hybrid filtering, which is the combination of content-based filtering (CBF) and collaborative filtering (CF). CBF models recommend based on the user's past behavior and not on other users' data. If there is not enough information, CBF will not be able to discriminate between the items properly and will not perform up to standard. On the other hand, CF looks at the user's interactions and tries to recommend items that are similar to the items the user interacted with. In CF, data sparsity problems occur because the interactions of many users are insufficient. There are therefore limitations to both methods. To avoid these, we use a hybrid approach that combines both methods to give sufficient results, and we also compare the results of the CF model with those of our hybrid model to see which performs best.

2. RELATED WORK
The available recommendation systems utilize techniques from various fields, such as machine learning, data mining, databases, statistics, similarity testing, etc. A recommendation system generates predictions of user satisfaction and/or recommends an item to a user. Generally, a recommendation system is implemented using one of three main conceptual approaches:
Collaborative filtering,
Content-based filtering,
Hybrid approach.
Content-based filtering generates a prediction from attributes of an item that the user prefers. Collaborative filtering is an algorithm that generates a prediction using the similarity of users' tastes and preferences. There are some critical problems in both concepts, such as limited content analysis, overspecialization, the new user problem, and the sparsity problem. Limited content analysis is a limitation of content-based filtering, which can only draw a conclusion from features that are explicitly associated with users. Overspecialization is an overfitting problem where the recommendation system provides high accuracy only on test data, but low accuracy on real data. New user problems arise from a lack of data with which to provide a new user with an accurate prediction. Several recommendation systems use machine learning techniques to reduce the impact of those problems. A hybrid approach combines the two concepts. The hybrid recommendation system utilizes various approaches, such as combining separate recommenders, adding content-based characteristics to collaborative models, using multi-criteria, etc.

3. DATASET
The dataset used for this project is taken from Yelp. It consists of three files that we converted to .csv, viz. users.csv, reviews.csv and Business.csv. Given below are the column names from each dataset. For more details, please check out the dataset. [2]

4. DATA PRE-PROCESSING
To build a recommendation system, we need an interaction matrix between users and items, metadata associated with customers that indicates their taste preferences, and metadata of restaurants that summarizes their characteristics.

The business dataset contains businesses of all categories from over 100 cities. We decided to filter the dataset down to one city, because considering different cities means the items (restaurants) have fewer interactions with each other, so we selected Toronto, which has 10,093 restaurants. Then we explored the restaurant attributes that would potentially be useful for recommendations. Since many attributes have missing values, we picked three attributes as item features: 1. the rating of the restaurant, 2. the review count of the restaurant, 3. the restaurant categories. Yelp usually assigns an average of 10 categories/tags to each item (restaurant), and in total, across all restaurants, 436 tags exist, such as breakfast, brunch, seafood, vegetarian, bars and so on. We selected up to 58 tags with the highest popularity, since including tags that appear only a few times across more than 10k restaurants would add more noise to the recommender. [3]
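The city filter and the selection of the most popular tags described in the data pre-processing step can be illustrated with a small sketch. This is not the authors' code: the miniature business table, its field names (`id`, `city`, `categories`) and the helper functions `select_top_tags` and `item_features` are hypothetical, and the cut-off of 2 tags stands in for the 58 tags used in the paper.

```python
from collections import Counter

# Hypothetical miniature of the business table; field names are assumptions.
businesses = [
    {"id": "b1", "city": "Toronto", "categories": ["brunch", "breakfast", "bars"]},
    {"id": "b2", "city": "Toronto", "categories": ["seafood", "bars"]},
    {"id": "b3", "city": "Toronto", "categories": ["brunch", "vegetarian"]},
    {"id": "b4", "city": "Montreal", "categories": ["brunch", "bars"]},
]


def select_top_tags(rows, city, top_k):
    """Keep one city's restaurants and return its top_k most frequent tags."""
    in_city = [row for row in rows if row["city"] == city]
    counts = Counter(tag for row in in_city for tag in row["categories"])
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return in_city, [tag for tag, _ in ranked[:top_k]]


def item_features(rows, top_tags):
    """Map each restaurant id to the popular tags it carries; rare tags are dropped."""
    keep = set(top_tags)
    return {row["id"]: [t for t in row["categories"] if t in keep] for row in rows}


toronto, top_tags = select_top_tags(businesses, city="Toronto", top_k=2)
features = item_features(toronto, top_tags)
print(top_tags)
print(features)
```

Dropping rare tags before building the item features keeps the feature set small, which matches the stated motivation of avoiding noise from tags that appear only a few times.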
Figure 4: TF-IDF
There were some cases where users give high ratings to every restaurant. In such cases we subtract the user's mean rating from each rating and classify the result as positive (value 1) or negative (value -1), with 0 for non-rated restaurants. In this paper, instead of predicting the user's rating for each item (restaurant), we focus on ranking the restaurants the user liked and disliked in the proper order, since predicted ratings lead to high variance as time passes. As the final data cleansing step, we selected user characteristics. We chose four non-sparse attributes, viz. the total number of written reviews, the number of useful reviews, whether the user is elite/active on Yelp, and the list of restaurants liked by the user. [3]

As an example, users who like South Indian restaurants would have embeddings similar to those of users who like North Indian restaurants, but these would not resemble the embeddings of Italian food in the vector space. The embeddings are estimated for every feature, and the sum of the embeddings across all features gives the representations for items and users.
III. Example of interaction matrix: user-movie ratings for a movie recommender (refer to the figure given below).
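The rating normalization from the pre-processing step, and the interaction matrix it produces, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the toy ratings, the restaurant ids and the function name `to_interactions` are hypothetical, and treating a rating equal to the user's mean as positive is an assumption, since the text does not say how ties are handled.

```python
# Hypothetical star ratings (1-5), keyed by user and then by restaurant.
ratings = {
    "u1": {"r1": 5, "r2": 4, "r3": 5},  # a user who rates everything highly
    "u2": {"r1": 2, "r3": 4},
}
restaurants = ["r1", "r2", "r3"]


def to_interactions(user_ratings, items):
    """Return {user: row}, where each row holds 1, -1 or 0 per restaurant."""
    matrix = {}
    for user, rated in user_ratings.items():
        mean = sum(rated.values()) / len(rated)
        row = []
        for item in items:
            if item not in rated:
                row.append(0)  # non-rated restaurant
            elif rated[item] >= mean:
                row.append(1)  # assumption: a rating equal to the mean counts as positive
            else:
                row.append(-1)  # below the user's own mean
        matrix[user] = row
    return matrix


interactions = to_interactions(ratings, restaurants)
for user, row in interactions.items():
    print(user, row)
```

Centering on each user's own mean is what lets a 4-star rating from a habitually generous user count as a negative signal, while the same 4 stars from a stricter user count as positive.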
6. MODEL ESTABLISHMENT
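To illustrate how a hybrid model of the kind used in this paper can score a user-restaurant pair, the sketch below represents each user and each restaurant as the sum of the embeddings of its features and ranks restaurants by the dot product of the two representations. It is a simplified illustration rather than the lightfm implementation: the embedding values are hand-picked instead of learned, bias terms are omitted, and all feature names and the helpers `represent`, `score` and `recommend` are hypothetical.

```python
# Hypothetical 2-dimensional feature embeddings; a trained model would learn these.
user_feature_emb = {
    "user:u1": [0.9, 0.1],  # identity feature (collaborative part)
    "elite": [0.2, 0.0],    # metadata feature (content part)
}
item_feature_emb = {
    "item:r1": [0.5, 0.0],
    "item:r2": [0.0, 0.5],
    "item:r3": [0.25, 0.25],
    "tag:brunch": [0.5, 0.0],
    "tag:bars": [0.0, 0.5],
}


def represent(features, table):
    """Sum the embeddings of all features that describe a user or an item."""
    dims = len(next(iter(table.values())))
    total = [0.0] * dims
    for name in features:
        for d, value in enumerate(table[name]):
            total[d] += value
    return total


def score(user_features, item_features):
    """Dot product of the summed user and item representations."""
    u = represent(user_features, user_feature_emb)
    v = represent(item_features, item_feature_emb)
    return sum(a * b for a, b in zip(u, v))


def recommend(user_features, catalog, known_positives, k):
    """Rank restaurants the user is not yet connected with and keep the top k."""
    candidates = [item for item in catalog if item not in known_positives]
    ranked = sorted(candidates, key=lambda item: score(user_features, catalog[item]), reverse=True)
    return ranked[:k]


catalog = {
    "r1": ["item:r1", "tag:brunch"],
    "r2": ["item:r2", "tag:bars"],
    "r3": ["item:r3", "tag:brunch", "tag:bars"],
}
top = recommend(["user:u1", "elite"], catalog, known_positives={"r1"}, k=2)
print(top)
```

With only the identity features (`user:...`, `item:...`) the sketch reduces to a pure collaborative model; adding metadata features such as tags is what makes it hybrid.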
II. Demo :-
The figure given below shows the restaurant recommendations for user 1 and user 17. The 'Known Positive' entries are the names of the restaurants that the user is connected with; on that basis, the system recommends 5 restaurants to each user.

REFERENCES
1. Mara-Renata Petrusel, Sergiu-George Limboi, "A Restaurants Recommendation System: Improving Rating Predictions using Sentiment Analysis", 21st International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, SYNASC-2019.
2. https://www.kaggle.com/datafiniti/hotel-reviews
3. Asyush Singh, "Solving business usecases by recommender system using lightFM", Towardsdatascience.com, 2018.
4. Nanthaphat Koetphrom, Panachai Charusangvittaya, Daricha Sutivong, "Comparing Filtering Techniques in Restaurant Recommendation System", Department