Graph Neural Networks For Social Recommendation: Wenqi Fan Yao Ma Qing Li
Dawei Yin
JD.com
2 THE PROPOSED FRAMEWORK
In this section, we will first introduce the definitions and notations used in this paper, next give an overview about the proposed framework, then detail each model component and finally discuss how to learn the model parameters.

Table 1: Notation

[Figure 2 diagram. Recoverable labels: User-to-item Interaction and Social Relations (inputs); Item Aggregation, Social Aggregation, and User Aggregation, each with an Attention Network (weights α, β, and μ respectively); User Embedding, Item Embedding, and Opinion Embedding; Item-space and Social-space; Concatenation; output r'.]
Figure 2: The overall architecture of the proposed model. It contains three major components: user modeling, item modeling,
and rating prediction.
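The aggregation blocks in Figure 2 can be read as weighted sums over a node's neighbors. The sketch below contrasts an attention-weighted aggregation with the mean-based aggregation used by the ablation variants. It is only an illustration under stated assumptions: the function names are hypothetical, and `dot_score` is a plain dot product standing in for the attention network, whose exact parameterization is not given in this section.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_aggregate(neighbors):
    """Mean-based aggregation: every neighbor vector gets weight 1/n."""
    n = len(neighbors)
    dim = len(neighbors[0])
    return [sum(vec[d] for vec in neighbors) / n for d in range(dim)]


def attention_aggregate(neighbors, target, score_fn):
    """Attention-based aggregation: weights are softmax-normalized scores."""
    weights = softmax([score_fn(vec, target) for vec in neighbors])
    dim = len(neighbors[0])
    agg = [sum(w * vec[d] for w, vec in zip(weights, neighbors))
           for d in range(dim)]
    return agg, weights


def dot_score(vec, target):
    """Hypothetical scoring function (assumption): a dot product in place
    of a learned attention network."""
    return sum(a * b for a, b in zip(vec, target))


# Toy neighbor representations and a target embedding (made-up values).
neighbors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [2.0, 0.0]

mean_vec = mean_aggregate(neighbors)
attn_vec, attn_weights = attention_aggregate(neighbors, target, dot_score)
print("mean:", mean_vec)
print("attention:", attn_vec, "weights:", attn_weights)
```

With a constant scoring function the attention weights become uniform and the result coincides with the mean-based aggregation, which is the sense in which the mean-based variants drop the attention mechanism.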
                         Algorithms
Training        Metrics  PMF     SoRec   SoReg   SocialMF  TrustMF  NeuMF   DeepSoR  GCMC+SN  GraphRec
Ciao (60%)      MAE      0.952   0.8489  0.8987  0.8353    0.7681   0.8251  0.7813   0.7697   0.7540
Ciao (60%)      RMSE     1.1967  1.0738  1.0947  1.0592    1.0543   1.0824  1.0437   1.0221   1.0093
Ciao (80%)      MAE      0.9021  0.8410  0.8611  0.8270    0.7690   0.8062  0.7739   0.7526   0.7387
Ciao (80%)      RMSE     1.1238  1.0652  1.0848  1.0501    1.0479   1.0617  1.0316   0.9931   0.9794
Epinions (60%)  MAE      1.0211  0.9086  0.9412  0.8965    0.8550   0.9097  0.8520   0.8602   0.8441
Epinions (60%)  RMSE     1.2739  1.1563  1.1936  1.1410    1.1505   1.1645  1.1135   1.1004   1.0878
Epinions (80%)  MAE      0.9952  0.8961  0.9119  0.8837    0.8410   0.9072  0.8383   0.8590   0.8168
Epinions (80%)  RMSE     1.2128  1.1437  1.1703  1.1328    1.1395   1.1476  1.0972   1.0711   1.0631
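For readers comparing the rows above, the following sketch shows how MAE and RMSE are conventionally computed and how a relative error reduction between two methods can be derived. The standard definitions are assumed here, the helper names are illustrative, and the toy ratings are made up; only the two RMSE values passed to `relative_improvement` come from the table (GCMC+SN and GraphRec on Ciao at 80%).

```python
import math


def mae(predicted, actual):
    """Mean absolute error between predicted and observed ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error between predicted and observed ratings."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def relative_improvement(baseline_error, model_error):
    """Percentage reduction in error relative to a baseline (lower is better)."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Toy ratings (made up for illustration, not from the datasets).
actual = [5.0, 3.0, 4.0, 1.0]
predicted = [4.5, 3.5, 4.0, 2.0]
print("MAE:", mae(predicted, actual))
print("RMSE:", rmse(predicted, actual))

# RMSE of GCMC+SN vs. GraphRec on Ciao (80%), taken from the table.
print("Relative RMSE reduction (%):", relative_improvement(0.9931, 0.9794))
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE, which is why the two metrics are reported side by side.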
[Figure 3 plots: MAE and RMSE panels comparing GraphRec-SN, GraphRec-Opinion, and GraphRec.]
Figure 3: Effect of social network and user opinions on Ciao and Epinions datasets.
[Plot residue: MAE and RMSE panels; the x-axis of the later panels lists embedding sizes 8, 16, 32, 64, 128, 256.]
compare GraphRec with its four variants: GraphRec-α, GraphRec-β, GraphRec-α&β, and GraphRec-µ. These four variants are defined in the following:
• GraphRec-α: The item attention α of GraphRec is eliminated when aggregating the opinion-aware interaction representations of items. This variant employs the mean-based aggregation function on item aggregation for modeling item-space user latent factors.
• GraphRec-β: The social attention β models users' tie strengths. In this variant, the social attention β of GraphRec is eliminated when aggregating a user's neighbors. This variant employs the mean-based aggregation function on social aggregation for modeling social-space user latent factors.
• GraphRec-α&β: This variant eliminates two attention mechanisms (item attention α and social attention β) on item aggregation and social aggregation for modeling user latent factors.
• GraphRec-µ: The user attention µ of GraphRec is eliminated when aggregating the opinion-aware interaction representations of users. This variant employs the mean-based aggregation function on user aggregation for modeling item latent factors.
The results of different attention mechanisms on GraphRec are shown in Figure 4. From the results, we have the following findings:
• Not all interacted items (purchase history) of one user contribute equally to the item-space user latent factor, and not all interacting users (buyers) are equally important for learning the item latent factor. Based on these assumptions,
our model considers these differences among users and items by using two different attention mechanisms (α and µ). From the results, we can observe that GraphRec-α and GraphRec-µ obtain worse performance than GraphRec. These results demonstrate the benefits of the attention mechanisms on item aggregation and user aggregation.
• As mentioned before, users are likely to share more similar tastes with strong ties than with weak ties. The attention mechanism β at social aggregation considers heterogeneous strengths of social relations. When the attention mechanism β is removed, the performance of GraphRec-β drops significantly. This justifies our assumption that during social aggregation, different social friends should have different influence on learning the social-space user latent factor. It is important to distinguish social relations with heterogeneous strengths.
To sum up, GraphRec can capture the heterogeneity in the aggregation operations of the proposed framework via attention mechanisms, which can boost the recommendation performance.

3.3.3 Effect of Embedding Size. In this subsection, we analyze the effect of the embedding size of user embedding p, item embedding q, and opinion embedding e on the performance of our model.
Figure 5 presents the performance comparison w.r.t. the embedding length of our proposed model on the Ciao and Epinions datasets. In general, as the embedding size increases, the performance first increases and then decreases. Increasing the embedding size from 8 to 64 improves the performance significantly. However, with an embedding size of 256, the performance of GraphRec degrades. This demonstrates that a larger embedding size provides more powerful representations. Nevertheless, if the embedding length is too large, the complexity of our model will significantly increase. Therefore, we need to find a proper embedding length in order to balance the trade-off between performance and complexity.

4 RELATED WORK
In this section, we briefly review some related work about social recommendation, deep neural network techniques employed for recommendation, and the advanced graph neural networks.
Exploiting social relations for recommendations has attracted significant attention in recent years [27, 28, 37]. One common assumption about these models is that a user's preference is similar to or influenced by the people around him/her (nearest neighbours), which is supported by social correlation theories [20, 21]. Along this line, SoRec [17] proposed a co-factorization method, which shares a common latent user-feature matrix factorized by ratings and by social relations. TrustMF [37] modeled mutual influence between users, and mapped users into two low-dimensional spaces, truster space and trustee space, by factorizing social trust networks. SoDimRec [30] first adopted a community detection algorithm to partition users into several clusters, and then exploited the heterogeneity of social relations and weak dependency connections for recommendation. Comprehensive overviews on social recommender systems can be found in surveys [29].
In recent years, deep neural network models have had a great impact on learning effective feature representations in various fields, such as speech recognition [12], Computer Vision (CV) [14] and Natural Language Processing (NLP) [4]. Some recent efforts have applied deep neural networks to recommendation tasks and shown promising results [41], but most of them used deep neural networks to model audio features of music [32], textual descriptions of items [3, 33], and visual content of images [40]. Besides, NeuMF [11] presented a Neural Collaborative Filtering framework to learn the non-linear interactions between users and items.
However, the application of deep neural networks in social recommender systems was rare until very recently. In particular, NSCR [35] extended the NeuMF [11] model to cross-domain social recommendations, i.e., recommending items of information domains to potential users of social networks, and presented a neural social collaborative ranking recommender system. However, the limitation is that NSCR requires users with one or more social network accounts (e.g., Facebook, Twitter, Instagram), which limits the data collection and its applications in practice. SMR-MNRL [42] developed social-aware movie recommendation in social media from the viewpoint of learning a multimodal heterogeneous network representation for ranking. They exploited the recurrent neural network and convolutional neural network to learn the representation of movies' textual descriptions and poster images, and adopted a random-walk based learning method into multimodal neural networks. All these works [35, 42] addressed the task of cross-domain social recommendation with ranking metrics, which is different from traditional social recommender systems.
The neural network approaches most related to our task are DLMF [6] and DeepSoR [8]. DLMF [6] used an auto-encoder on ratings to learn representations for initializing an existing matrix factorization. A two-phase trust-aware recommendation process is proposed to utilize deep neural networks in matrix factorization's initialization and to synthesize the user's interests and their trusted friends' interests together with the impact of the community effect based on matrix factorization for recommendations. DeepSoR [8] integrated neural networks for users' social relations into probabilistic matrix factorization. They first represented users using a pre-trained node embedding technique, and further exploited k-nearest neighbors to bridge user embedding features and the neural network.
More recently, Graph Neural Networks (GNNs) have been proven to be capable of learning on graph-structured data [2, 5, 7, 15, 25]. In the task of recommender systems, the user-item interaction contains the ratings on items by users, which is typical graph data. Therefore, GNNs have been proposed to solve the recommendation problem [1, 22, 39]. sRMGCNN [22] adopted GNNs to extract graph embeddings for users and items, and then combined them with a recurrent neural network to perform a diffusion process. GCMC [1] proposed a graph auto-encoder framework, which produced latent features of users and items through a form of differentiable message passing on the user-item graph. PinSage [39] proposed a random-walk graph neural network to learn embeddings for nodes in web-scale graphs. Despite the compelling success achieved by previous work, little attention has been paid to social recommendation with GNNs. In this paper, we propose a graph neural network for social recommendation to fill this gap.
5 CONCLUSION AND FUTURE WORK
We have presented a Graph Network model (GraphRec) to model social recommendation for rating prediction. Particularly, we provide a principled approach to jointly capture interactions and opinions in the user-item graph. Our experiments reveal that the opinion information plays a crucial role in improving model performance. In addition, GraphRec can differentiate tie strengths by considering the heterogeneous strengths of social relations. Experimental results on two real-world datasets show that GraphRec can outperform state-of-the-art baselines.
Currently we only incorporate the social graph into recommendation, while many real-world industries are associated with rich side information on users as well as items. For example, users and items are associated with rich attributes. Therefore, exploring graph neural networks for recommendation with attributes would be an interesting future direction. Beyond that, we currently treat both rating and social information as static. However, rating and social information are naturally dynamic. Hence, we will consider building dynamic graph neural networks for social recommendation.

ACKNOWLEDGMENTS
The work described in this paper has been supported, in part, by a general research fund from the Hong Kong Research Grants Council (project PolyU 1121417/17E), and an internal research grant from the Hong Kong Polytechnic University (project 1.9B0V). Yao Ma and Jiliang Tang are supported by the National Science Foundation (NSF) under grant numbers IIS-1714741, IIS-1715940 and CNS-1815636, and a grant from Criteo Faculty Research Award.

REFERENCES
[1] Rianne van den Berg, Thomas N Kipf, and Max Welling. 2017. Graph convolutional matrix completion. arXiv preprint arXiv:1706.02263 (2017).
[2] Michael M Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, and Pierre Vandergheynst. 2017. Geometric deep learning: going beyond euclidean data. IEEE Signal Processing Magazine 34, 4 (2017), 18–42.
[3] Chong Chen, Min Zhang, Yiqun Liu, and Shaoping Ma. 2018. Neural Attentional Rating Regression with Review-level Explanations. In Proceedings of the 27th International Conference on World Wide Web. 1583–1592.
[4] Hongshen Chen, Xiaorui Liu, Dawei Yin, and Jiliang Tang. 2017. A survey on dialogue systems: Recent advances and new frontiers. ACM SIGKDD Explorations Newsletter 19, 2 (2017), 25–35.
[5] Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. 2016. Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in Neural Information Processing Systems. 3844–3852.
[6] Shuiguang Deng, Longtao Huang, Guandong Xu, Xindong Wu, and Zhaohui Wu. 2017. On deep learning for trust-aware recommendations in social networks. IEEE Transactions on Neural Networks and Learning Systems 28, 5 (2017), 1164–1177.
[7] Tyler Derr, Yao Ma, and Jiliang Tang. 2018. Signed Graph Convolutional Networks. In 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 929–934.
[8] Wenqi Fan, Qing Li, and Min Cheng. 2018. Deep Modeling of Social Relations for Recommendation. In AAAI.
[9] Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 855–864.
[10] Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems. 1024–1034.
[11] Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural Collaborative Filtering. In Proceedings of the 26th International Conference on World Wide Web, WWW. 173–182.
[12] Geoffrey Hinton, Li Deng, Dong Yu, George E Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N Sainath, et al. 2012. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine 29, 6 (2012), 82–97.
[13] Mohsen Jamali and Martin Ester. 2010. A matrix factorization technique with trust propagation for recommendation in social networks. In Proceedings of the fourth ACM conference on Recommender systems. ACM, 135–142.
[14] Hamid Karimi, Jiliang Tang, and Yanen Li. 2018. Toward End-to-End Deception Detection in Videos. In 2018 IEEE Big Data. IEEE, 1278–1283.
[15] Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In International Conference on Learning Representations (ICLR).
[16] Yehuda Koren. 2008. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 426–434.
[17] Hao Ma, Haixuan Yang, Michael R Lyu, and Irwin King. 2008. Sorec: social recommendation using probabilistic matrix factorization. In Proceedings of the 17th ACM conference on Information and Knowledge Management. ACM, 931–940.
[18] Hao Ma, Dengyong Zhou, Chao Liu, Michael R Lyu, and Irwin King. 2011. Recommender systems with social regularization. In Proceedings of the fourth ACM international conference on Web Search and Data Mining. ACM, 287–296.
[19] Yao Ma, Suhang Wang, Charu C. Aggarwal, Dawei Yin, and Jiliang Tang. 2019. Multi-dimensional Graph Convolutional Networks. In Proceedings of the 2019 SIAM International Conference on Data Mining (SDM).
[20] Peter V Marsden and Noah E Friedkin. 1993. Network studies of social influence. Sociological Methods & Research 22, 1 (1993), 127–151.
[21] Miller McPherson, Lynn Smith-Lovin, and James M Cook. 2001. Birds of a feather: Homophily in social networks. Annual Review of Sociology 27, 1 (2001), 415–444.
[22] Federico Monti, Michael Bronstein, and Xavier Bresson. 2017. Geometric matrix completion with recurrent multi-graph neural networks. In Advances in Neural Information Processing Systems. 3700–3710.
[23] Paul Resnick and Hal R Varian. 1997. Recommender systems. Commun. ACM 40, 3 (1997), 56–58.
[24] Ruslan Salakhutdinov and Andriy Mnih. 2007. Probabilistic Matrix Factorization. In 21st Conference on Neural Information Processing Systems, Vol. 1. 2–1.
[25] David I Shuman, Sunil K Narang, Pascal Frossard, Antonio Ortega, and Pierre Vandergheynst. 2013. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine 30, 3 (2013), 83–98.
[26] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15, 1 (2014), 1929–1958.
[27] Jiliang Tang, Charu Aggarwal, and Huan Liu. 2016. Recommendations in signed social networks. In Proceedings of the 25th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 31–40.
[28] Jiliang Tang, Xia Hu, Huiji Gao, and Huan Liu. 2013. Exploiting local and global social context for recommendation. In IJCAI, Vol. 13. 2712–2718.
[29] Jiliang Tang, Xia Hu, and Huan Liu. 2013. Social recommendation: a review. Social Network Analysis and Mining 3, 4 (2013), 1113–1133.
[30] Jiliang Tang, Suhang Wang, Xia Hu, Dawei Yin, Yingzhou Bi, Yi Chang, and Huan Liu. 2016. Recommendation with Social Dimensions. In AAAI. 251–257.
[31] Tijmen Tieleman and Geoffrey Hinton. 2012. Lecture 6.5-RMSProp, COURSERA: Neural networks for machine learning. University of Toronto, Technical Report (2012).
[32] Aaron Van den Oord, Sander Dieleman, and Benjamin Schrauwen. 2013. Deep content-based music recommendation. In Advances in Neural Information Processing Systems. 2643–2651.
[33] Hao Wang, Naiyan Wang, and Dit-Yan Yeung. 2015. Collaborative deep learning for recommender systems. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1235–1244.
[34] Suhang Wang, Jiliang Tang, Yilin Wang, and Huan Liu. 2018. Exploring Hierarchical Structures for Recommender Systems. IEEE Transactions on Knowledge and Data Engineering 30, 6 (2018), 1022–1035.
[35] Xiang Wang, Xiangnan He, Liqiang Nie, and Tat-Seng Chua. 2017. Item silk road: Recommending items from information domains to social users. In Proceedings of the 40th International ACM SIGIR conference on Research and Development in Information Retrieval. ACM, 185–194.
[36] Rongjing Xiang, Jennifer Neville, and Monica Rogati. 2010. Modeling relationship strength in online social networks. In Proceedings of the 19th international conference on World wide web. ACM, 981–990.
[37] Bo Yang, Yu Lei, Jiming Liu, and Wenjie Li. 2017. Social collaborative filtering by trust. IEEE Transactions on Pattern Analysis and Machine Intelligence 39, 8 (2017), 1633–1647.
[38] Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, and Eduard Hovy. 2016. Hierarchical attention networks for document classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 1480–1489.
[39] Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L. Hamilton, and Jure Leskovec. 2018. Graph Convolutional Neural Networks for Web-Scale Recommender Systems. In KDD ’18. ACM, 974–983.
[40] Lili Zhao, Zhongqi Lu, Sinno Jialin Pan, and Qiang Yang. 2016. Matrix Factorization+ for Movie Recommendation. In IJCAI. 3945–3951.
[41] Xiangyu Zhao, Liang Zhang, Zhuoye Ding, Long Xia, Jiliang Tang, and Dawei Yin. 2018. Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning. In KDD ’18. ACM, 1040–1048.
[42] Zhou Zhao, Qifan Yang, Hanqing Lu, Tim Weninger, Deng Cai, Xiaofei He, and Yueting Zhuang. 2018. Social-aware movie recommendation via multimodal network learning. IEEE Transactions on Multimedia 20, 2 (2018), 430–440.