Journal of Xi'an University of Architecture & Technology Issn No : 1006-7930

LOCATION-BASED RECOMMENDER SYSTEM USING COLLABORATIVE FILTERING AND CLUSTERING

G. Senthil Kumar1, S. Anuja1, Keerthana Ravikumar2 and Avani Bhatnagar2

1&2 Assistant Professor, 3&4 Student
1,2,3,4 Department of Software Engineering, SRM Institute of Science and Technology,
Kattankulathur, Chennai-603203, India
Corresponding Author: Keerthana Ravikumar [email protected]

ABSTRACT:
Recommendation systems help users discover products and content by predicting the rating a
user would give to each product or content item and showcasing the items that are predicted to be
rated highly. These items can be places, books, movies, restaurants, or commodities on which users
can have different opinions. As the number of online services, items, and products increases day
by day, designing an efficient and effective recommendation system has become a significant task.
Recent research shows that online service recommendation focuses on two prominent approaches:
collaborative filtering and content-based recommendation. The content-based method relies on the
characteristics of items, whereas the collaborative filtering method takes into consideration the
user's past behavior and ratings to form decisions. Existing web service discovery approaches
rely on keyword-dominant web service search engines. These search engines have several
limitations, such as poor recommendation performance and a heavy dependence on correct and
complex queries from users. In this paper, an Agglomerative Hierarchical Clustering-based
Collaborative Filtering approach is proposed for effective recommendation. The proposed
recommender system recommends similar or more accurate places to customers in the same
cluster who share some similarity in preferences.

Keywords: User-Based Collaborative Filtering, Item-Based Collaborative Filtering,
Agglomerative Hierarchical Clustering.

1. INTRODUCTION:

The data available on the web can be structured, semi-structured, or unstructured. The
travel history of people can be vast and varied [1], which results in big data sets. Big

Volume XII, Issue III, 2020 Page No: 2878


Data refers to a collection of substantial data sets that grow exponentially with time and
that have to be stored and managed. Traditional methods are inefficient at managing such big
data because they are expensive, time-consuming, and poorly scalable [2][6]. In light of this
challenge, a Clustering-based Collaborative Filtering approach (Club CF) is proposed in this
paper; it aims at grouping similar services within the same clusters so that services can be
recommended collaboratively. Recommender systems in location-based social networks take
advantage of social and geographical influence to create customized point-of-interest (POI)
recommendations [1][4]. The social influence is obtained from similar users based on matching
visit histories [3]. In contrast, the geographical influence is obtained from the geographic
preferences of users [4], which can be observed after their arrival at different POIs [7].
However, this approach can fall short when a user moves to a new location where there is no
activity history [9].

We propose a system that models user preferences based on user ratings and
categories of POIs. This cluster-based recommendation system uses both types of Collaborative
Filtering, namely User-Based CF [3] and Item-Based CF [8]. The process includes a data
preprocessing stage, during which the datasets are retrieved, processed, and then grouped into
clusters of similar entities. Recommendations are then created for every cluster. The advantage
of this approach is that recommendations are produced quickly at runtime, because virtually
everything is pre-computed.

2. LITERATURE SURVEY:

The field of recommender systems mainly uses three methods: content-based,
collaborative, and hybrid recommendation approaches [2]. User-based collaborative filtering
identifies users similar to an active user, for whom the recommendations are to be provided;
then, using the user-item matrix, ratings for items not yet rated by the active user are
predicted and recommendations are made accordingly [3].
Item-based collaborative filtering identifies the relationships between items, unlike the
user-based approach, which considers the relationships between users. The similarities between
the items are calculated and then the recommendations are provided. This method is sometimes
more efficient than user-based CF, as relationships between items do not depend on the
fluctuating moods of users [8]. Many different approaches are used for different purposes, such
as state-of-the-art and time-aware recommender systems.

The state-of-the-art technique uses user-generated textual reviews and has two
branches: review-based user profile building and review-based product profile building.
The limitation of this system is that user-written reviews may not be very clear, and negative
comments can sometimes be interpreted as positive and vice versa. This can be overcome by
considering only the ratings of the users [2][3].
Users tend to visit different types of places in different time slots. For example, people
usually visit restaurants during the afternoon and pubs at night. Time-aware point-of-interest
recommendation is therefore introduced, based on geographical [6] and temporal influences
[4][10]. Personalized ranking and suggestions based on the user's history and preferences are
crucial [5]. Many real-world constraints, such as POI availability, traveling time, and
diversity, can interfere with providing accurate and efficient suggestions to the user [9].
Existing recommendation systems have several limitations, such as the cold start
problem, data scalability, and data sparsity [7]. The cold start problem occurs when a user is
new to the system and has no history or data from which items can be suggested. The data
sparsity problem occurs because users and items are abundant while ratings are comparatively
few [11]. Users do not usually rate immediately, which causes a lack of data. Data scalability
refers to rapidly growing data that cannot be controlled [12]: many new users and new items are
added to the datasets, and rating data for them is lacking. The proposed clustering methodology
can overcome these limitations and increase the efficiency of the system [13].

3. LOCATION BASED RECOMMENDER SYSTEM:


A recommender system filters items based on predictions of the ratings that users would give.
The proposed system uses Hierarchical Agglomerative Clustering to place similar items or users in
the same group and then recommends the places predicted by user-based collaborative filtering and
item-based collaborative filtering; the combination is referred to as Club CF. Clustering helps
reduce the size of the data to be processed by grouping similar items into the same clusters.

Initially, the data is preprocessed and converted into datasets from which the clusters are
created. A user-item matrix is generated, which is used to predict the missing ratings. The
distance metric used is the Euclidean distance between two user points. Collaborative filtering
is then applied to recommend items based on the ratings of users and items. The architecture of
the location-based recommender system is shown in Figure 1.

Figure 1 Location-based Recommender System

3.1. DATA PREPROCESSING:

In this module, the users' ratings are collected as a dataset, which is used for the
analysis. The ratings given by the users are saved in CSV format. This dataset is provided
as input to the mapper class, which reads the userID, placeID, and rating of each record into a
map. The userID is used as the key, whereas the placeID and rating are taken as the value. The
output of the mapper is given as input to the reducer. The reducer finds a particular cluster
for each user. The places rated by each user are grouped by the reducer; the preprocessing step
is shown in Figure 2.
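
The mapper and reducer roles described above can be illustrated with a minimal Python sketch. It is not the authors' Java implementation: the column names `userID`, `placeID`, and `rating`, the function names, and the sample rows are assumptions made for illustration, and the sketch covers only the grouping of places by user, not the cluster assignment.

```python
import csv
import io
from collections import defaultdict

def map_ratings(csv_text):
    """Mapper: emit (userID, (placeID, rating)) for every CSV record."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # Assumed column names; the key is the userID, the value is (placeID, rating).
        yield row["userID"], (row["placeID"], float(row["rating"]))

def reduce_ratings(pairs):
    """Reducer: group the emitted (placeID, rating) values by userID."""
    grouped = defaultdict(list)
    for user_id, value in pairs:
        grouped[user_id].append(value)
    return dict(grouped)

# Hypothetical sample data, used only to exercise the sketch.
SAMPLE = """userID,placeID,rating
u1,p1,4
u1,p2,5
u2,p1,3
"""

ratings_by_user = reduce_ratings(map_ratings(SAMPLE))
```

Here `ratings_by_user` maps each user to the list of places that user rated, which is the grouped form the later clustering and filtering steps work from.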

Figure 2 Data Pre-Processing

3.2. AGGLOMERATIVE CLUSTERING:

Clustering is an essential step in our approach. The clusters provide the neighborhood
of the active user (the user for whom the recommendations are made). Agglomerative clustering
is a bottom-up approach that starts with many small clusters and then merges them step by step
to create larger clusters. Dendrograms are the visual representation of this algorithm. The
algorithm proceeds as follows:

1. Treat every data point as one individual cluster at the start of the algorithm.

2. Merge two clusters at a time into a new cluster. The dissimilarity between each merged
group and the other groups is then computed. Some ways to compute it:

• Complete linkage - similarity between the farthest pair of points.
• Single linkage - similarity between the closest pair of points.
• Group average - average similarity between the members of the groups.
• Centroid similarity - every iteration merges the clusters with the most similar central points.

3. The grouping continues until all the points are incorporated into one cluster, as
shown in Figure 3.
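
The merge loop above can be sketched in a few lines of Python. This is an illustrative, unoptimized version under stated assumptions: users are represented as small rating vectors, dissimilarity is the Euclidean distance, only the single, complete, and group-average linkages are covered (centroid linkage is omitted), and the names `agglomerative` and `cluster_distance` are hypothetical.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def cluster_distance(c1, c2, points, linkage):
    """Dissimilarity between two clusters (lists of point indices)."""
    dists = [euclidean(points[i], points[j]) for i in c1 for j in c2]
    if linkage == "single":      # closest pair
        return min(dists)
    if linkage == "complete":    # farthest pair
        return max(dists)
    if linkage == "average":     # group average
        return sum(dists) / len(dists)
    raise ValueError(f"unknown linkage: {linkage}")

def agglomerative(points, n_clusters=1, linkage="average"):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # step 1: one cluster per point
    merges = []                                   # merge history (dendrogram data)
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(clusters[a], clusters[b], points, linkage)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]        # step 2: merge the closest pair
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters, merges

# Hypothetical rating vectors for four users.
points = [(5.0, 4.0), (4.5, 4.0), (1.0, 2.0), (1.5, 1.0)]
clusters, merges = agglomerative(points, n_clusters=2, linkage="complete")
```

Stopping at `n_clusters=2` instead of 1 corresponds to cutting the dendrogram at the level where two groups remain; the recorded `merges` list holds the information a dendrogram would display.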

Figure 3 – User Clustering

3.3 COLLABORATIVE FILTERING:

CF systems recommend products to target users based on the opinions of other users. These
systems employ statistical techniques to find a set of users, known as neighbors, who have a
history of agreeing with the target user, and use them to recommend products. For better
recommendations, our approach uses both User-based CF and Item-based CF.

In user-based collaborative filtering, a user-item matrix is built from the ratings of
the users on the items. Similarity scores between the users are calculated, and the users
similar to the active user are identified. The active user is then recommended items rated by
users with similar interests. Item-based collaborative filtering is very similar to user-based
collaborative filtering. Initially, the pairs of items rated by the same user are identified.
The similarity of their ratings across all users who rated both is measured. Then the
recommendations are generated and sorted by item. Our target is to generate recommendations
using both of these techniques together and obtain more effective results.

4. EXPERIMENTAL SETUP AND ANALYSIS:

J2EE is used to create a dynamic web project that uses several libraries and datasets. The
front end of the system collects user information, visited places, and ratings. This
information is preprocessed using the MapReduce algorithm and stored in datasets in CSV format.
MapReduce is a technique for handling computations on large amounts of data; here it is written
in Java and consists of a mapping step and a reducing step.
The mapping process breaks the data into key/value pairs, also known as tuples. The output
of the mapping step is then taken as input to the reducing step. The reducer's task is to reduce
the tuples into a smaller set of tuples. The reducing process involves shuffling the data,
sorting it, and storing it in datasets.
Algorithm 1: The Map Function
1. for each element m_ij of M do
2.     produce (key, value) pairs ((i,k), (M, j, m_ij)) for k = 1, 2, 3, ... up to the number of columns of N
3. for each element n_jk of N do
4.     produce (key, value) pairs ((i,k), (N, j, n_jk)) for i = 1, 2, 3, ... up to the number of rows of M
5. return the set of (key, value) pairs in which each key (i,k) has a list with values (M, j, m_ij)
and (N, j, n_jk) for all possible values of j

Algorithm 2: The Reduce Function
1. for each key (i,k) do
2.     sort the values that begin with M by j into listM
3.     sort the values that begin with N by j into listN
4.     multiply m_ij and n_jk for the jth value of each list
5.     sum up the products m_ij * n_jk over j
6.     return ((i,k), sum over j of m_ij * n_jk)
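
Algorithms 1 and 2 together compute the matrix product of M and N. The following single-process Python sketch mirrors the two phases; it assumes small dense matrices stored as lists of lists, uses zero-based indices, and simulates the shuffle step with a dictionary, so it only illustrates the data flow and is not a distributed implementation. All function names are hypothetical.

```python
from collections import defaultdict

def map_phase(M, N):
    """Algorithm 1: emit ((i, k), (tag, j, value)) pairs for M and N."""
    rows_m, cols_n = len(M), len(N[0])
    for i, row in enumerate(M):
        for j, m_ij in enumerate(row):
            for k in range(cols_n):          # m_ij contributes to every column k
                yield (i, k), ("M", j, m_ij)
    for j, row in enumerate(N):
        for k, n_jk in enumerate(row):
            for i in range(rows_m):          # n_jk contributes to every row i
                yield (i, k), ("N", j, n_jk)

def shuffle(pairs):
    """Group all values that share the same key (i, k)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Algorithm 2: sort both lists by j, multiply pairwise, and sum."""
    list_m = sorted((j, v) for tag, j, v in values if tag == "M")
    list_n = sorted((j, v) for tag, j, v in values if tag == "N")
    return key, sum(m * n for (_, m), (_, n) in zip(list_m, list_n))

def multiply(M, N):
    """Run map, shuffle, and reduce, then assemble the product matrix."""
    product = [[0] * len(N[0]) for _ in range(len(M))]
    for key, values in shuffle(map_phase(M, N)).items():
        (i, k), total = reduce_phase(key, values)
        product[i][k] = total
    return product

M = [[1, 2], [3, 4]]
N = [[5, 6], [7, 8]]
product = multiply(M, N)
```

Each key (i, k) identifies one cell of the result, so every reducer call is independent of the others, which is what allows the reduce step to be distributed.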

The correlation between users and places is represented in the form of a user-item matrix.
CF systems suggest places to active users based on the opinions of other users. These
systems use statistical techniques to find a set of users, referred to as neighbors, who have
a history of agreeing with the target user. Similarity measures such as cosine similarity and
Pearson correlation are used to calculate the similarities and identify the neighbors.

4.1 Cosine similarity:

$$\text{Similarity} = \cos(\theta) = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \times \sqrt{\sum_{i=1}^{n} B_i^2}}$$
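
As a small illustration of the cosine formula, the sketch below computes it for two rating vectors of equal length. Returning 0.0 when either vector has zero norm is an assumption made here to avoid division by zero; the paper does not state how that case is handled.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is treated as dissimilar
    return dot / (norm_a * norm_b)
```

Two users whose rating vectors point in the same direction, such as (1, 2, 3) and (2, 4, 6), obtain a similarity close to 1, while orthogonal vectors obtain 0.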

4.2 Pearson correlation or centered cosine similarity:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x}) \times (y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
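
The Pearson correlation is the cosine similarity applied after each vector has been centered on its own mean, which is why it is also called centered cosine similarity. A minimal sketch follows; returning 0.0 for a constant vector (zero variance) is an assumption of this sketch, not something the paper specifies.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length rating vectors."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("vectors must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]   # centered ratings of x
    dy = [v - mean_y for v in y]   # centered ratings of y
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    return num / den if den else 0.0  # assumption: zero variance gives 0.0
```

Centering removes the effect of users who rate everything high or everything low: the vectors (1, 2, 3) and (3, 4, 5) differ in level but have a correlation close to 1.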

4.3 In user-based CF, the predict function:

$$P_{a,i} = \bar{r}_a + \frac{\sum_{u \in K} (r_{u,i} - \bar{r}_u) \times w_{a,u}}{\sum_{u \in K} w_{a,u}}$$

$P_{a,i}$ refers to the prediction for the active user a for a place i, $\bar{r}_a$ and $\bar{r}_u$ are the
mean ratings of users a and u, $w_{a,u}$ refers to the similarity between users a and u, and K
refers to the neighborhood of most similar users.
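
The user-based prediction can be worked through on a toy example. The sketch below assumes that ratings are stored as a nested dictionary, that the neighborhood K and its similarity weights are already given, and that neighbors who have not rated the place are skipped; these choices, the function name, and the sample numbers are illustrative only. It also assumes positive weights, since the formula divides by their plain sum.

```python
def predict_user_based(ratings, weights, active, item):
    """Predict the rating of `item` by `active` from weighted neighbor deviations.

    ratings: {user: {item: rating}}; weights: {neighbor: similarity to active}.
    """
    def mean(user):
        values = ratings[user].values()
        return sum(values) / len(values)

    r_a = mean(active)
    num = 0.0
    den = 0.0
    for u, w in weights.items():
        if item in ratings[u]:               # assumption: skip neighbors without a rating
            num += (ratings[u][item] - mean(u)) * w
            den += w
    if den == 0:
        return r_a                           # fall back to the active user's mean
    return r_a + num / den

# Hypothetical ratings and precomputed similarities.
ratings = {
    "a":  {"p1": 4, "p2": 2},
    "u1": {"p1": 5, "p2": 3, "p3": 4},
    "u2": {"p1": 2, "p3": 1},
}
weights = {"u1": 0.8, "u2": 0.2}
prediction = predict_user_based(ratings, weights, "a", "p3")
```

In this example the active user's mean is 3, neighbor u1 rated p3 exactly at its own mean of 4, and neighbor u2 rated it 0.5 below its mean of 1.5, so the prediction is 3 + (0 × 0.8 − 0.5 × 0.2) / (0.8 + 0.2) = 2.9.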

In item-based CF, the similarities between pairs of items are computed using centered cosine
similarity. The prediction function for the rating of a place i by the active user a is:

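$$P_{a,i} = \frac{\sum_{j \in K} r_{a,j} \times w_{i,j}}{\sum_{j \in K} |w_{i,j}|}$$

$$w_{i,j} = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_i) \times (r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_i)^2}\,\sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_j)^2}}$$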

where $w_{i,j}$ refers to the similarity between places i and j, $\bar{r}_i$ is the mean rating of place i, and K
refers to the neighborhood of most similar places rated by user a. Using both user-based and item-based CF
together helps in overcoming issues such as scalability problems and in achieving better performance.
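
The item-based prediction can be sketched in the same style. The assumptions here are that U is the set of users who rated both places, that the item means are taken over those co-rating users, and that K consists of the k places rated by the active user with the highest similarity to the target place; the function names, the parameter k, and the sample ratings are hypothetical.

```python
import math

def item_similarity(ratings, i, j):
    """Centered cosine similarity between places i and j over co-rating users."""
    common = [u for u in ratings if i in ratings[u] and j in ratings[u]]
    if not common:
        return 0.0
    mean_i = sum(ratings[u][i] for u in common) / len(common)
    mean_j = sum(ratings[u][j] for u in common) / len(common)
    num = sum((ratings[u][i] - mean_i) * (ratings[u][j] - mean_j) for u in common)
    den_i = math.sqrt(sum((ratings[u][i] - mean_i) ** 2 for u in common))
    den_j = math.sqrt(sum((ratings[u][j] - mean_j) ** 2 for u in common))
    return num / (den_i * den_j) if den_i and den_j else 0.0

def predict_item_based(ratings, active, item, k=2):
    """Weighted average of the active user's ratings over the k most similar places."""
    sims = [(item_similarity(ratings, item, j), j)
            for j in ratings[active] if j != item]
    neighbors = sorted(sims, key=lambda s: s[0], reverse=True)[:k]
    num = sum(ratings[active][j] * w for w, j in neighbors)
    den = sum(abs(w) for w, _ in neighbors)
    return num / den if den else None        # None: no usable neighbor

# Hypothetical ratings; user "a" has not rated place p3.
ratings = {
    "a":  {"p1": 5, "p2": 1},
    "u1": {"p1": 4, "p2": 2, "p3": 5},
    "u2": {"p1": 2, "p2": 5, "p3": 1},
    "u3": {"p1": 5, "p2": 1, "p3": 4},
}
prediction = predict_item_based(ratings, "a", "p3", k=1)
```

With k = 1 the only neighbor is p1, which correlates positively with p3, so the prediction equals the active user's rating of p1. Larger neighborhoods can include negatively correlated places, which pull the prediction down.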

The above-mentioned algorithms are applied to a dataset of Yelp reviews. Yelp is a
business directory service and crowd-sourced review forum, i.e., a platform on which users
post reviews and rate the products and services provided to them. The large Yelp dataset can
be downloaded from Kaggle. Data for a large number of users is retrieved; the size of
the dataset is about 5,200,000 user reviews. After data cleaning and preprocessing, a user-item
matrix is created. During preprocessing, the data is extracted from the datasets
available for the project. The data for each user is processed into a set of keys and values
using a Java program.

The ratings that users gave to the places they visited are used to calculate the
similarities between users, which are then used to cluster the users and recommend places they
would like to visit according to user-based and item-based collaborative filtering. The
clusters are formed on the basis of similarities calculated using centered cosine. One matrix is
created from the datasets, with rows representing unique users and columns representing unique
items. This matrix is then separated into two different matrices on the basis of the estimated
similarities, one for user-based CF and the other for item-based CF.
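
The construction of the user-item matrix can be sketched as follows, assuming the preprocessed data is available as (user, place, rating) triples. Missing ratings are stored as None, and the dense list-of-lists layout is chosen only for readability; a dataset with millions of reviews would need a sparse representation instead. The function name and the sample triples are hypothetical.

```python
def build_user_item_matrix(triples):
    """Build a dense user-item matrix from (user, place, rating) triples."""
    users = sorted({u for u, _, _ in triples})    # one row per unique user
    items = sorted({p for _, p, _ in triples})    # one column per unique place
    row = {u: r for r, u in enumerate(users)}
    col = {p: c for c, p in enumerate(items)}
    matrix = [[None] * len(items) for _ in users] # None marks a missing rating
    for u, p, rating in triples:
        matrix[row[u]][col[p]] = rating
    return users, items, matrix

# Hypothetical triples, used only to exercise the sketch.
triples = [("u1", "p1", 4.0), ("u1", "p2", 5.0), ("u2", "p1", 3.0)]
users, items, matrix = build_user_item_matrix(triples)
```

The None entries are exactly the ratings that the prediction functions are meant to estimate.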

Predictions are made on the matrices for the ratings of items/places from the point of
view of a particular user. New places to visit are recommended from both matrices, which
improves the accuracy and efficiency of the recommendations. When both collaborative filtering
algorithms are applied together on the clusters, places can be recommended to the active user
efficiently. A larger amount of data leads to better results, and datasets with less sparsity
(which is handled by clustering) help improve the results.

5. CONCLUSION/ FUTURE WORK:

In this paper, an agglomerative hierarchical clustering-based collaborative filtering approach
is proposed for POI recommendation from large data sets. Before the CF technique is applied,
services are merged into clusters via the AHC algorithm. Then the rating similarities between
services in the same cluster are computed. Because clustering reduces the size of the data to be
processed, and the ratings of services within the same cluster are more relevant to one another
than to those in other clusters, Club CF costs less computational time. In future work,
the applicability of the proposed methodology will be examined on different types of datasets
to optimize the results. In addition, further kinds of multimedia information, such as video,
are likely to be integrated into POI popularity prediction and recommendation in our following work.

6. REFERENCES:

1. H. Yin, W. Wang, H. Wang, L. Chen, and X. Zhou, “Spatial-aware hierarchical collaborative
deep learning for POI recommendation,” IEEE Trans. Knowl. Data Eng., vol. 29, no. 11, pp.
2537–2551, 2017.
2. G. Adomavicius and A. Tuzhilin, “Toward the next generation of recommender systems: A
survey of the state-of-the-art and possible extensions,” IEEE Trans. Knowl. Data Eng., vol. 17,
no. 6, pp. 734–749, 2005.
3. Praveen Sundar, P.V., Ranjith, D., Vinoth Kumar, V. et al. Low power area efficient adaptive
FIR filter for hearing aids using distributed arithmetic architecture. Int J Speech Technol
(2020). https://doi.org/10.1007/s10772-020-09686-y
4. Q. Yuan, G. Cong, and A. Sun, “Graph-based point-of-interest recommendation with
geographical and temporal influences,” in CIKM, 2014, pp. 659–668.
5. M. Aliannejadi, I. Mele, and F. Crestani, “Personalized ranking for context-aware venue
suggestion,” in SAC, 2017, pp. 960–962.
6. J. Bao, Y. Zheng, D. Wilkie, and M. F. Mokbel, “Recommendations in location-based social
networks: a survey,” GeoInformatica, vol. 19, no. 3, pp. 525–565, 2015.
7. Umamaheswaran, S., Lakshmanan, R., Vinothkumar, V. et al. New and robust composite
micro structure descriptor (CMSD) for CBIR. International Journal of Speech
Technology (2019), doi:10.1007/s10772-019-09663-0.
8. B. M. Sarwar, G. Karypis, J. A. Konstan, and J. Riedl, “Item-based collaborative filtering
recommendation algorithms,” in WWW, 2001, pp. 285–295.
9. C. Zhang, H. Liang, and K. Wang, “Trip recommendation meets real-world constraints: POI
availability, diversity, and traveling time uncertainty,” ACM Trans. Inf. Syst., vol. 35, no. 1,
pp. 5:1–5:28, 2016.
10. N. Du, H. Dai, R. Trivedi, U. Upadhyay, M. Gomez-Rodriguez, and L. Song, “Recurrent
marked temporal point processes: Embedding event history to vector,” in KDD, 2016, pp.
1555–1564.
11. Karthikeyan, T., Sekaran, K., Ranjith, D., Vinoth Kumar, V., Balajee, J.M. (2019).
“Personalized Content Extraction and Text Classification Using Effective Web Scraping
Techniques,” International Journal of Web Portals (IJWP), 11(2), pp. 41–52.
12. Jayasuruthi, L., Shalini, A., Vinoth Kumar, V. (2018). “Application of rough set theory in
data mining market analysis using rough sets data explorer,” Journal of Computational and
Theoretical Nanoscience, 15(6-7), pp. 2126–2130.
13. Maithili, K., Vinothkumar, V., Latha, P. (2018). “Analyzing the security mechanisms to prevent
unauthorized access in cloud and network security,” Journal of Computational and Theoretical
Nanoscience, Vol. 15, pp. 2059–2063.
