Deep and Broad Learning On Content-Aware POI Recommendation
Fengjiao Wang∗ , Yongzhi Qu† , Lei Zheng∗ , Chun-Ta Lu∗ and Philip S. Yu∗
∗ Department of Computer Science
University of Illinois at Chicago
Chicago, Illinois, 60607, USA
Email: {fwang27,lzheng21,clu29,psyu}@uic.edu
† Wuhan University of Technology, China
Email: {quwong}@whut.edu.cn
Abstract—POI recommendation has attracted considerable research attention recently. Several key factors need to be modeled for effective POI recommendation: POI properties, user preference, and the sequential momentum of check-ins. The challenge lies in how to synergistically learn from multi-source heterogeneous data. Previous work models multi-source information in a flat manner, using either embedding-based methods or sequential prediction models in a cross-related space, which cannot produce mutually reinforcing results. In this paper, a deep and broad learning approach based on a Deep Context-aware POI Recommendation (DCPR) model is proposed to structurally learn POI and user characteristics. The proposed DCPR model includes three collaborative layers: a CNN layer for POI feature mining, an RNN layer for sequential dependency and user preference modeling, and an interactive layer based on matrix factorization to jointly optimize the overall model. Experiments over three data sets demonstrate that the DCPR model achieves significant improvement over state-of-the-art POI recommendation algorithms and other deep recommendation models.

Keywords—Spatial Temporal Modeling; Embedding; POI Recommendation

I. INTRODUCTION

As location-based applications rapidly gain popularity, a large volume of online content with geo-tagged information (check-ins) is created daily. Check-ins, as a direct channel connecting the online and offline worlds, aid the development of many personalized and location-based information services, such as personalized advertisement [1], local event promotion [2], [3], and city management improvement [4]. One of the core tasks for these services is Point Of Interest (POI) recommendation, since it not only helps users enrich their urban experiences but also facilitates the analysis of crowd mobility and communication.

Most of the prominent approaches to POI recommendation can be divided into three categories: 1) collaborative filtering, 2) sequential pattern modeling, and 3) context-aware recommendation. They are designed to learn three types of information, respectively: user preference, check-in sequences, and text information. Recently, some state-of-the-art models, such as PRME [5] and FPMC [6], try to learn two types of information simultaneously by modeling user preference and sequential patterns together. However, most of the extended variants of the prominent approaches still rely on the original architecture and integrate other information as side information. These algorithms have several drawbacks. First, existing POI recommendation algorithms mainly focus on information about users, such as user preference and users' check-in sequences, while ignoring the characteristics of POIs. Second, current algorithms typically model different sources of information with the same metric, such as distances in PRME and transition probabilities in FPMC. However, these symbolized features may not be suitable for handling different forms of dependencies. Third, they always model consecutive dependencies but ignore long-term dependencies in check-in sequences. Moreover, the above-mentioned models are all shallow models, which cannot capture the highly non-linear nature of sequential patterns.

Recently, researchers have taken the content information of POIs into consideration. Content information can be helpful in various ways. For instance, a user may search a POI's reviews or tips beforehand to decide whether she/he is interested in visiting the place. Therefore, in reality, POIs' reviews or tips can be part of the inputs that affect a user's check-in decision. Besides, context information can help identify semantically similar POIs, e.g., 'burgers' often appears in the reviews and descriptions of fast food shops. As shown in recent works [7], [8], [9], integrating context information can help alleviate the sparsity problem in POI recommendation. However, most of these works are based on traditional topic models that simply use bag-of-words features and ignore word order. Sentences with similar N-grams but totally different semantic meanings are hard to differentiate for bag-of-words based techniques [10]. Therefore, previous methods may not fully uncover the semantic information of POIs. Moreover, topic models can be easily affected by the scalability problem and also cannot handle new users and new POIs.

Due to the success of deep neural networks, researchers have also applied deep models to POI recommendation tasks. Among these, Recurrent Neural Networks (RNNs) are especially suitable for sequential prediction. Recently, [11] showed RNNs' superior performance on sequential click prediction. By concurrently modeling spatial and temporal patterns in LBSNs through the transition matrices of an RNN, [12] achieves promising performance improvement over matrix factorization based and Markov chain-based algorithms.

In order to broadly fuse different sources of information (user preference, check-in sequences, and text information), in
[Figure 1 image not recoverable from extraction. Legible labels: Check-in Behavior Learning (Ranking Loss; positive POI, negative POI); User Representation Learning (User Embeddings, POI Embeddings); POI Representation Learning (CNN; Word Embeddings; POIs; Reviews/tips).]
Fig. 1: Network Architecture. The architecture contains three components: 1) POI representation learning; 2) user representation learning; 3) check-in behavior learning.
[Figure residue: recurrent cell diagram with labels C_{t-1}, C_t, h_{t-1}, h_t, σ, tanh, ×, +. Caption not recoverable.]

document d_q, and ⊕ is the concatenation operator. Note that the n-th column of Φ corresponds to the embedding of the n-th word in document d_q.

Following the embedding function, three inner layers inside the CNN, namely a convolution layer, a max-pooling layer, and a fully connected layer, are built to learn feature vectors of POIs. The structure of the POI representation learning component is illustrated in Figure 2. Next, we explain these three layers in detail.
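The embedding, convolution, max-pooling, and fully connected steps named above can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function name `poi_feature_vector`, all parameter names, the random weights, the ReLU non-linearity, and the linear output layer are hypothetical choices made for the example. The sizes (embedding dimension 50, 100 filters of length 3, 50 output units) are borrowed from the experimental settings reported later in the paper.

```python
import numpy as np


def poi_feature_vector(word_ids, emb, conv_w, conv_b, fc_w, fc_b):
    """Map one POI document (a sequence of word ids) to a feature vector.

    Shapes (all names are hypothetical):
      emb    : (vocab_size, d)        word embedding table
      conv_w : (n_filters, d, width)  convolution filters
      conv_b : (n_filters,)
      fc_w   : (k, n_filters)         fully connected weights
      fc_b   : (k,)
    """
    # Embedding: Phi has one column per word of the document.
    phi = emb[np.asarray(word_ids)].T            # (d, T)
    _, n_words = phi.shape
    _, _, width = conv_w.shape
    if n_words < width:
        raise ValueError("document is shorter than the filter length")

    # Convolution layer: slide every filter over consecutive word windows.
    windows = np.stack(
        [phi[:, t:t + width] for t in range(n_words - width + 1)]
    )                                            # (T - width + 1, d, width)
    conv = np.einsum("tdw,fdw->tf", windows, conv_w) + conv_b
    conv = np.maximum(conv, 0.0)                 # ReLU, assumed for this sketch

    # Max-pooling layer: keep the strongest response of each filter.
    pooled = conv.max(axis=0)                    # (n_filters,)

    # Fully connected layer (kept linear here): project to the POI features.
    return fc_w @ pooled + fc_b                  # (k,)


# Toy usage with random weights; sizes follow the reported DCPR settings.
rng = np.random.default_rng(0)
d, n_filters, width, k, vocab = 50, 100, 3, 50, 20
emb = rng.normal(size=(vocab, d))
conv_w = rng.normal(scale=0.1, size=(n_filters, d, width))
conv_b = np.zeros(n_filters)
fc_w = rng.normal(scale=0.1, size=(k, n_filters))
fc_b = np.zeros(k)

doc = [3, 7, 1, 12, 5, 9]
x_j = poi_feature_vector(doc, emb, conv_w, conv_b, fc_w, fc_b)
print(x_j.shape)  # (50,)
```

Max-pooling over positions makes the output length independent of the document length, which is why reviews of different sizes can share one fully connected layer.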
[Figure residue: panels (a) Data distribution@4sq and (b) Data distribution@yelp. Images and caption not recoverable.]

TABLE I: Datasets

Dataset      Foursquare   Yelp      TIST
Users        74,140       30,367    266,909
POIs         104,844      25,728    3,680,126
Check-ins    418,081      146,456   33,263,631

model the conditional probability over POI j's latent features with a Gaussian distribution as

$$p(v_j \mid x_j, \lambda) = \mathcal{N}(v_j \mid x_j, \lambda_v I) \tag{14}$$

where I is a K × K identity matrix. Similarly, conditional
[Figure residue: six line-plot panels comparing FPMC, PRME, FM, RNN, and DCPR; the horizontal axis ranges from 0 to 20. Images and caption not recoverable.]
separate spaces. One embedding is based on sequential transition probability, while the other embedding is based on user preferences. Each user's top-N recommendation is based on a linear combination of the learned embeddings.
3) FM [21] refers to the Factorization Machine. It models pairwise interactions between all features. Note that, for the proposed problem, three types of features are constructed for FM: one-hot encodings of users, combinations of one-hot encodings of POIs in check-in sequences, and one-hot encodings of the POI checked in after the sequences.
4) RNN [11] is the state-of-the-art deep model for sequential prediction, based on recurrent neural networks.
5) CDL [22] jointly models text information with deep representation learning and user feedback with collaborative filtering.
6) DCPR is the method proposed in this paper.

To evaluate the performance of the different approaches, for each user we pick the first 80% of check-ins as training data, and the remaining 20% of check-ins are used as testing data. For the FPMC algorithm, the training data is further divided into 80% and 20% for training and validation, respectively. The learning rate is set to 0.005, the regularization parameter is set to 0.03, and the factorization dimension is set to 20. For the PRME algorithm, parameter α and the latent dimension are set to 0.02 and 60 respectively, following the setting in the original paper. For the RNN algorithm, the dimension of POIs' embeddings is set to 50, the number of neurons in the recurrent layer is set to 64, and cross entropy is employed as the loss function. For the proposed DCPR algorithm, the embedding dimension of POIs is set to 50. For the convolution layer, the number of filters is set to 100 and the filter length is set to 3. The number of neurons in the fully connected layer and in the recurrent layer is set to 50. Note that we use different latent dimensions for different comparison algorithms to optimize the performance for each case.

Three metrics are used to evaluate the performance of the compared methods. The output of each compared method is a list of all POIs ranked from high to low by the likelihood of the POI being checked in during the testing period. The first metric is Precision@N, which measures the percentage of correct predictions in the top-N ranked list. The second metric, Recall@N, measures the percentage of correct predictions in the top-N ground truth set. Note that the top-N ground truth set is constructed based on the time difference between the training check-in sequence and the testing check-ins: the smaller the time difference, the higher the POI's position in the top-N ground truth list. The third metric, F1-score@N, is the harmonic mean of the two above-mentioned metrics and gives a comprehensive evaluation of the compared methods.

B. Performance Comparison

Fig. 5 shows the performance of POI recommendation on the Foursquare and Yelp datasets with metrics Precision@N, Recall@N, and F1-score@N. N varies from 1 to 20. Four observations are made as follows.
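As a reading aid for the three metrics defined in the experimental setup, the sketch below computes Precision@N, Recall@N, and F1-score@N for a single user. It is an illustrative interpretation, not the authors' evaluation script: the function name `precision_recall_f1_at_n` and the toy POI names are hypothetical, and it assumes the test check-ins are already given in chronological order, so that the first N distinct test POIs form the top-N ground truth list.

```python
from typing import Hashable, Sequence, Tuple


def precision_recall_f1_at_n(
    ranked: Sequence[Hashable], truth: Sequence[Hashable], n: int
) -> Tuple[float, float, float]:
    """Precision@N, Recall@N and F1-score@N for one user.

    ranked: POIs sorted by predicted likelihood, most likely first.
    truth:  held-out POIs, earliest test check-in first (assumption).
    """
    if n <= 0:
        raise ValueError("n must be positive")

    # Keep the first N distinct entries of each list, preserving order.
    top_pred = list(dict.fromkeys(ranked))[:n]
    top_truth = list(dict.fromkeys(truth))[:n]
    if not top_pred or not top_truth:
        return 0.0, 0.0, 0.0

    hits = len(set(top_pred) & set(top_truth))
    if hits == 0:
        return 0.0, 0.0, 0.0

    precision = hits / len(top_pred)   # equals hits / N for a full ranking
    recall = hits / len(top_truth)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example with made-up POI names.
ranked = ["cafe", "gym", "park", "museum", "diner"]
truth = ["gym", "bar", "cafe"]
for n in (2, 3, 5):
    print(n, precision_recall_f1_at_n(ranked, truth, n))
```

Averaging these per-user values over all test users would give dataset-level curves as a function of N; whether the paper averages per user or pools all predictions is not stated in the surviving text.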
[Figure residue: three line-plot panels comparing FPMC and DCPR; the horizontal axis ranges from 0 to 20. Images and caption not recoverable.]
[Figure residue: six panels comparing FPMC, PRME, FM, RNN, and DCPR (with CDL added in the second row); the horizontal axis shows training-set size from 50% to 80%. Images and caption not recoverable.]
• DCPR consistently outperforms the other compared methods on the two datasets, as shown in Fig. 5. Although PRME achieves slightly better results on the Yelp dataset when N = 1, the proposed DCPR algorithm performs best in most general cases. PRME shows slightly higher results there because it uses a metric embedding technique to model sequential transition probability. The metric embedding technique is designed to learn the transition probability between consecutive check-ins, but it cannot model long-term sequential influences. In contrast, the proposed DCPR algorithm employs a special recurrent structure specifically to model long-term dependencies; therefore, DCPR wins for almost all values of N.
• FM usually performs well in rating prediction tasks, but it achieves inferior results compared to other methods in the POI recommendation task. Although FM captures all pairwise interactions between all features, the model is incapable of differentiating the importance of different feature interactions. Therefore, it is not able to focus on important feature interactions and ignore insignificant ones. In comparison, the proposed DCPR has different parts that specialize in modeling specific types of information, and it jointly learns the importance of each part in one loss function. Therefore, it achieves superior results compared to FM.
• The proposed DCPR outperforms the other two deep neural network based models. It can be seen from Fig. 5 that DCPR achieves much higher accuracy than both the typical RNN algorithm and CDL. Even though RNN tries to model check-in sequences, long-term dependencies may not be captured by deep recurrent neural networks. Also, RNN ignores text information and thus loses another source of information for tackling the problem. Although the CDL algorithm learns a deep representation of content information, it is not capable of modeling sequential influence. The proposed DCPR algorithm models text information and check-in sequences simultaneously, so it outperforms RNN and CDL by a big margin.
• Comparing the performance of the comparison methods on the three metrics, we observe that Precision@N and Recall@N always monotonically decrease or increase in all three datasets, while F1-score@N shows a non-monotonic trend: it increases first and then decreases. It is worth noting that DCPR almost always achieves the biggest improvement when the comparison algorithms are at their highest F1-score. For example, in the Foursquare dataset, Fig. 5(c), when N = 4, DCPR achieves a 13% improvement over FPMC and a 128% improvement over FM. Interestingly, for the Foursquare and Yelp datasets, almost all algorithms perform best when N = 4. This is probably because the Foursquare and Yelp datasets contain more users with short sequences.

From Fig. 5, we can conclude that FPMC is the best performing baseline method. Therefore, for the large-scale TIST dataset, we only compare the performance of the proposed DCPR algorithm with FPMC in Fig. 6. Since the TIST dataset does not contain text information, we adapt the proposed DCPR algorithm to only generate embeddings for POIs by omitting the convolution feature generating process. We can see that, for all three metrics, DCPR always outperforms FPMC by a big margin. For instance, for the F1-score@N metric, when N equals 12, DCPR achieves a 15.2% improvement over FPMC. For the F1-score@N metric, both algorithms perform best when N = 15. This is probably because the TIST dataset includes users with longer sequences than those of the Foursquare and Yelp datasets. It shows that the proposed DCPR is robust to varying sequence length. Also, compared to the performance on the Foursquare and Yelp datasets, the proposed DCPR algorithm achieves the largest improvement on the TIST dataset. This is probably because the proposed DCPR is especially good at modeling long-term dependencies and the average sequence length of the TIST dataset is much longer than that of the other two datasets.

The robustness of the proposed algorithm is also tested by varying the size of the training check-ins in the Foursquare and Yelp datasets. We pick N = 5 for illustration purposes. As can be seen in Fig. 7, the proposed DCPR always outperforms the other compared algorithms. For instance, in the Yelp dataset, when the size of the training data increases from 50% to 80%, FPMC's Recall@Top5 increases by 7.54%, while DCPR's Recall@Top5 increases by 14.92%. The evaluation results on the size of the training set indicate that DCPR is capable of producing high-quality embedding vectors of users and POIs.

Besides evaluating the proposed approach on the whole dataset with different metrics at the macro level, we also present a study of the performance of the compared approaches at the micro level. Specifically, we study the performance of the proposed algorithm on different user groups, where users are clustered according to the length of their check-in sequences. As an illustrative example, Fig. 8 shows the gain of DCPR over FPMC in Precision@5 and Recall@5. We pick users with modestly long sequences in the overall population for the Foursquare and Yelp datasets. At the same time, the population density of each group is shown to provide an in-depth understanding of the performance of the different algorithms. The population density of each group is indicated by the size of the orange marker in Fig. 8. First of all, in all of the user groups, DCPR achieves improvements larger than 10%. Interestingly, for both datasets, the highest improvement is always achieved when users have 11 or 12 check-ins. For instance, when the sequence length equals 11, the proposed algorithm achieves nearly 50% improvement over FPMC on the Foursquare dataset, and nearly 70% on the Yelp dataset. A possible reason for this observation is that feeding too long a sequence from the past may introduce more noise, while too short a sequence does not capture enough behavior information.

[Figure residue: panels (a) Precision@Top5-4sq; (b) Recall@Top5-4sq; (c) Precision@Top5-Yelp; (d) Recall@Top5-Yelp. Images not recoverable.]
Fig. 8: Gain over FPMC on Foursquare and Yelp datasets.

C. Sensitivity analysis

We perform the sensitivity analysis in Fig. 9 on two parameters: one is the number of convolution kernels n1, while the other is the number of latent recurrent features n2. These results are all based on the Yelp dataset due to space limitations. The upper two figures show results of n1, while the bottom two figures
display results of n2. The first column's figures display the analysis on Precision@5, while the second column's figures show the analysis on Recall@5. As can be seen, for parameter n1, values increase when it grows from 5 to 50; however, when it increases beyond 50, values stay almost the same. For parameter n2, the performance increases drastically as it grows from 5, and levels off once it reaches 50. Therefore, for the proposed DCPR algorithm, we set the number of convolution kernels to 100 and the number of recurrent features to 50.

[Figure residue: panels (a) Precision@5, n1; (b) Recall@5, n1; (c) Precision@5, n2; (d) Recall@5, n2. Images not recoverable.]
Fig. 9: Sensitivity analysis.

V. RELATED WORK

A. POI Recommendation

As in traditional recommender systems, the matrix factorization technique has been introduced in POI recommendation [23], [24]. Unlike item recommender systems, which employ explicit user feedback such as ratings, POI recommendation utilizes implicit user behavior (check-ins) as user feedback. Other implicit information has also been introduced, such as the location of check-in POIs, temporal information of check-ins, and social networks. Some recent works focus on leveraging geographical [23], [24] and social influences [24] and temporal effects. [24] combines users' preference, social influence, and geographical influence based on a matrix factorization framework. [23] proposes the GeoMF model, which jointly models geographical information and user preference. [25] introduces a ranking-based loss into the GeoMF model.

Sequential pattern mining has gained a lot of attention lately in personalized recommendation [6], [26]. Rendle et al. [6] propose the FPMC model, which constructs a personalized probability transition tensor based on Markov chains. A factorization model is then proposed to estimate the transition tensor. The FPMC model is extended by incorporating geographical constraints [27]. Embedding techniques [5], [28] have attracted a lot of research attention lately since they are capable of learning better representations for various tasks. The Personalized Ranking Metric Embedding (PRME) [5] model learns embeddings in two separate spaces, which model sequential transition probability and user preference. A Bayesian personalized ranking loss is introduced to combine the learned embeddings to predict future check-ins. Instead of learning POI representations only from previous check-ins, [28] proposes to learn representations from surrounding check-ins, inspired by skip-gram. [29] combines the skip-gram model with a Bayesian personalized ranking loss. Even though PRME also models sequential patterns and user preference, a simple linear combination of embeddings cannot explain the complex interaction between these two factors.

B. Context-Aware Recommendation

Although spatial, temporal, and social information have been investigated in POI recommendation, text information is relatively less explored [8], [9], [30]. Text information includes reviews, tags, tips, categories, etc. [8] proposes a topic and location aware probabilistic matrix factorization model using POI-associated tags. First, users' interest with respect to semantic topics is learned from the text information of POIs through a Latent Dirichlet Allocation (LDA) model. Then, the learned topic interests of users are compared with POIs' topic distributions to find potential POIs using probabilistic matrix factorization. Meanwhile, word-of-mouth opinions are considered in the above-mentioned factor-based model. Yang et al. [7] employ sentiment analysis techniques to extract users' preference from text information (tips). The preference inferred from contents is then considered simultaneously with the preference learned from users' check-in behavior. The factor analysis framework is also extended to model geographical influence [24]. Similar to the LDA model, [9] proposes a spatial topic model that simultaneously models spatial and content information in Twitter networks. [31] investigates personal and local preferences from POIs' contents. [30] exploits contents associated with POIs and comments written by users with weighted matrix factorization. [32] models personal preferences and sequential influence with a latent probabilistic generative model.

The above-mentioned models learn text similarity based only on lexical similarity. Two reviews can be semantically similar even when they have low lexical overlap, as English vocabulary is very diverse. These works ignore semantic meaning, which plays an important role in understanding POIs. In addition, topic modeling-based approaches can easily be affected by the sparsity problem and cannot cope with new users and POIs.

C. Deep Learning for Recommendation

Lately, neural network based methods have attracted a lot of attention, not only because they generate useful representations for various learning tasks but also because they deliver state-of-the-art performance on natural language processing and other sequential modeling tasks [11], [12]. Among these, Recurrent Neural Networks (RNNs) are especially good at modeling sequences [33], [34]. For example, [11] shows RNNs' superior performance on sequential click prediction. By concurrently modeling spatial and temporal patterns in LBSNs through the transition matrices of an RNN, [12] achieves promising improvement over matrix factorization-based and Markov chain-based algorithms.
Researchers have started to focus on employing neural network based models for traditional recommender systems [22], [35]. [35] proposes an item recommendation algorithm that jointly models users and items from reviews using deep neural networks.

As discussed above, while there are studies that model sequential patterns in check-in sequences and studies that model review text in item recommender systems, they do not address both challenges simultaneously. Instead of learning sequences with Markov chain-based models, the proposed DCPR model learns personalized sequential behaviors with the aid of an advanced deep model. Instead of relying only on topic modeling based models to handle review text, the proposed DCPR learns the semantic meaning and sentimental attitudes of reviews with a deep CNN model.

VI. CONCLUSION

This paper proposed a deep content-aware POI recommendation (DCPR) algorithm to tackle the problem of POI recommendation. Broad learning from multiple sources of information is utilized to solve this challenging problem. Specifically, text information associated with POIs and users' check-in sequences are modeled simultaneously. Furthermore, two different types of deep neural networks are combined in one architectural framework, with each one learning one information source, and finally a ranking-based loss is introduced to learn the users' overall check-in behaviors. The proposed DCPR model learns the different information sources discriminatively. Therefore, it can synergistically learn multi-source heterogeneous networks. To this end, it is a deep and broad learning model. Evaluation on three different real-world datasets demonstrated the effectiveness of the proposed approach. For future work, other side information such as temporal information and geographical information can be included in the proposed framework. Besides, the proposed deep framework can be further extended to event recommendation.

ACKNOWLEDGMENT

This work is supported in part by NSF through grants IIS-1526499, and CNS-1626432, and NSFC 61672313.

REFERENCES

[1] A. Agarwal, K. Hosanagar, and M. D. Smith, “Location, location, location: An analysis of profitability of position in online advertising markets,” JMR'11.
[2] R. Lee and K. Sumiya, “Measuring geographical regularities of crowd behaviors for Twitter-based geo-social event detection,” in SIGSPATIAL Workshop'10.
[3] T. Sakaki, M. Okazaki, and Y. Matsuo, “Earthquake shakes Twitter users: Real-time event detection by social sensors,” in WWW'10.
[4] C. Xia, R. Schwartz, K. Xie, A. Krebs, A. Langdon, J. Ting, and M. Naaman, “CityBeat: Real-time social media visualization of hyper-local city data,” in WWW'14.
[5] S. Feng, X. Li, Y. Zeng, G. Cong, Y. M. Chee, and Q. Yuan, “Personalized ranking metric embedding for next new POI recommendation,” in IJCAI'15.
[6] S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme, “Factorizing personalized Markov chains for next-basket recommendation,” in WWW'10.
[7] D. Yang, D. Zhang, Z. Yu, and Z. Wang, “A sentiment-enhanced personalized location recommendation system,” in HT'13.
[8] B. Liu and H. Xiong, “Point-of-interest recommendation in location based social networks with topic and location awareness,” in SDM'13.
[9] B. Hu and M. Ester, “Spatial topic modeling in online social media for location recommendation,” in RecSys'13.
[10] H. M. Wallach, “Topic modeling: Beyond bag-of-words,” in ICML'06.
[11] Y. Zhang, H. Dai, C. Xu, J. Feng, T. Wang, J. Bian, B. Wang, and T.-Y. Liu, “Sequential click prediction for sponsored search with recurrent neural networks,” in AAAI'14.
[12] Q. Liu, S. Wu, L. Wang, and T. Tan, “Predicting the next location: A recurrent model with spatial and temporal contexts,” in AAAI'16.
[13] W. W. Cohen, R. E. Schapire, and Y. Singer, “Learning to order things,” J. Artif. Int. Res., 1999.
[14] R. Salakhutdinov and A. Mnih, “Probabilistic matrix factorization,” in NIPS'08.
[15] V. Nair and G. E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in ICML'10.
[16] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in NIPS'12.
[17] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Comput., 1997.
[18] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, “BPR: Bayesian personalized ranking from implicit feedback,” in UAI'09.
[19] Y. LeCun, L. Bottou, G. B. Orr, and K.-R. Müller, “Efficient backprop,” in Neural Networks: Tricks of the Trade, 1998.
[20] D. Yang, D. Zhang, L. Chen, and B. Qu, “NationTelescope: Monitoring and visualizing large-scale collective behavior in LBSNs,” J. Network and Computer Applications, 2015.
[21] S. Rendle, “Factorization machines with libFM,” ACM Trans. Intell. Syst. Technol., 2012.
[22] H. Wang, N. Wang, and D.-Y. Yeung, “Collaborative deep learning for recommender systems,” in KDD'15.
[23] D. Lian, C. Zhao, X. Xie, G. Sun, E. Chen, and Y. Rui, “GeoMF: Joint geographical modeling and matrix factorization for point-of-interest recommendation,” in KDD'14.
[24] B. Liu, Y. Fu, Z. Yao, and H. Xiong, “Learning geographical preferences for point-of-interest recommendation,” in KDD'13.
[25] X. Li, G. Cong, X.-L. Li, T.-A. N. Pham, and S. Krishnaswamy, “Rank-GeoFM: A ranking based geographical factorization method for point of interest recommendation,” in SIGIR'15.
[26] J.-D. Zhang, C.-Y. Chow, and Y. Li, “LORE: Exploiting sequential influence for location recommendations,” in SIGSPATIAL'14.
[27] C. Cheng, H. Yang, M. R. Lyu, and I. King, “Where you like to go next: Successive point-of-interest recommendation,” in IJCAI'13.
[28] X. Liu, Y. Liu, and X. Li, “Exploring the context of locations for personalized location recommendations,” in IJCAI'16.
[29] S. Zhao, T. Zhao, I. King, and M. R. Lyu, “Geo-Teaser: Geo-temporal sequential embedding rank for point-of-interest recommendation,” in WWW Companion'17.
[30] H. Gao, J. Tang, X. Hu, and H. Liu, “Content-aware point of interest recommendation on location-based social networks,” in AAAI'15.
[31] H. Yin, Y. Sun, B. Cui, Z. Hu, and L. Chen, “LCARS: A location-content-aware recommender system,” in KDD'13.
[32] W. Wang, H. Yin, S. W. Sadiq, L. Chen, M. Xie, and X. Zhou, “SPORE: A sequential personalized spatial item recommender system,” in ICDE'16.
[33] A. Graves, “Generating sequences with recurrent neural networks,” CoRR'13.
[34] S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber, “Gradient flow in recurrent nets: the difficulty of learning long-term dependencies,” in A Field Guide to Dynamical Recurrent Neural Networks, 2001.
[35] L. Zheng, V. Noroozi, and P. S. Yu, “Joint deep modeling of users and items using reviews for recommendation,” in WSDM'17.