A Personalized Collaborative Filtering Recommendation System Based on Bi-Graph Embedding and Causal Reasoning
Abstract
1. Introduction
- Existing graph embedding algorithms struggle with entity- and relation-space representation, resulting in poor graph-embedded features.
- Current research on knowledge graph-based recommender systems focuses on using knowledge graphs to improve the search capability of recommender systems, or on using them as an additional modality for capturing features that provide extra information [12]. Some studies consider knowledge graph completion, but few address the bias introduced by graph embedding.
- Current recommendation algorithms increasingly aim to uncover user interests and provide personalized recommendations. Effectively identifying the volatility of user interests remains an open challenge in recommendation research.
2. Related Work
2.1. Collaborative Filtering-Based Recommendation Systems
2.2. Recommendation System Based on Knowledge Graph
3. Personalized Collaborative Filtering Recommender System Based on Bipartite Graph Embedding and Causal Inference
- Improve the ability of the CoFM model to handle 1-N, N-1, and N-N relations. TransE cannot represent these relations well, so we replace it with the TransR graph embedding model. TransR embeds entities and relations in separate spaces rather than in a single shared space, which lets it handle one-to-many, many-to-one, and many-to-many relations more accurately.
- Reconstruct the user feature representation using the backdoor adjustment of causal inference to eliminate the bias that user–item historical interactions introduce into the user features.
- Use symmetric KL divergence to fuse the predicted score of the traditional FM recommendation with the final predicted score of RCKFM, achieving adaptive, personalized user–item recommendation.
- The overall framework of the RCKFM model is shown in Figure 5, where the blue part represents the feature-learning section and the green part denotes the recommendation algorithm. The main functional steps of the RCKFM model are TransR bipartite graph embedding, backdoor adjustment for bias elimination, FM collaborative filtering, and symmetric KL divergence score aggregation:
- TransR dual graph embedding: The knowledge graph dataset and the user–item historical interaction dataset are the inputs of the whole system. The TransR model represents the structured information in the knowledge graph. To obtain both movie embeddings and user embeddings, we perform dual graph embedding for users and items: one graph embedding is built from the attributes and types of the movies, and another from the users’ attributes and historical interactions, yielding item feature representations and user feature representations, respectively. TransR is used to avoid the single-space limitation of TransE, in which entities and relations share one embedding space.
- Causal inference backdoor adjustment: After obtaining the embedding results, we adjust the user feature vectors with the backdoor adjustment method of causal inference to eliminate the bias that users’ historical interactions introduce into the user feature representation.
- FM collaborative filtering: The user–item historical interaction records are matched with the embedding results and used as the input of the FM model. The FM collaborative filtering model is trained twice, once before the backdoor adjustment and once after it, so that the two sets of predicted scores can be fused into a personalized score.
- KL divergence score aggregation: Using symmetric KL divergence, we measure from the interaction timestamps whether a user’s interest is variable, fuse the scores predicted before and after the backdoor adjustment, and finally obtain the top-10 recommendation list by ranking the fused predicted scores.
3.1. TransR Dual Graph Embeddings
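As a rough illustration of why the framework replaces TransE with TransR, the sketch below scores triples under both models using the standard scoring functions from the cited TransE and TransR papers. The vectors, the projection matrix `m_r`, and all function names are made-up assumptions for this example, not values or code from the RCKFM implementation.

```python
def project(vec, matrix):
    """Row vector times matrix: maps an entity vector into the relation space."""
    n_cols = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vec, matrix)) for j in range(n_cols)]


def transe_score(h, r, t):
    """TransE dissimilarity ||h + r - t||^2 (lower means more plausible)."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))


def transr_score(h, r, t, m_r):
    """TransR dissimilarity ||h M_r + r - t M_r||^2 in the relation space."""
    h_r, t_r = project(h, m_r), project(t, m_r)
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h_r, r, t_r))


# Hypothetical 1-N relation: one head entity linked to two distinct tails.
head = [0.0, 0.0, 0.0]
tail_a = [1.0, 0.0, 1.0]
tail_b = [1.0, 0.0, -1.0]

# TransE: entities and the relation share one 3-d space.
rel_shared = [1.0, 0.0, 0.0]
# TransR: a 2-d relation space reached through a 3x2 projection matrix.
rel_own = [1.0, 0.0]
m_r = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

print(transe_score(head, rel_shared, tail_a), transe_score(head, rel_shared, tail_b))
print(transr_score(head, rel_own, tail_a, m_r), transr_score(head, rel_own, tail_b, m_r))
```

In this toy setup TransE cannot give both tails a zero score unless the two tail vectors coincide, whereas the TransR projection maps both tails to the same point in the relation space while keeping them distinct in the entity space.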
3.2. Backdoor Adjustment of Causal Inference
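The following minimal sketch shows the generic backdoor adjustment formula, P(y | do(u)) = Σ_d P(d) · P(y | u, d), on a discrete confounder. It is only meant to illustrate the idea of removing the influence of historical interactions; the confounder groups, the probabilities, and the function name are hypothetical, and the way RCKFM applies the adjustment to user feature vectors is not reproduced here.

```python
def weighted_outcome(p_y_given_u_d, confounder_weights):
    """Return sum_d weight(d) * P(y | u, d) for one user u."""
    return sum(p * w for p, w in zip(p_y_given_u_d, confounder_weights))


# Hypothetical confounder d: which of two item groups dominates the history.
p_y_given_u_d = [0.8, 0.2]  # P(like | u, d) for each group (assumed values)
p_d_given_u = [0.9, 0.1]    # this user's own, skewed history distribution
p_d = [0.5, 0.5]            # population-level prior over the groups

# Plain conditioning weights outcomes by the user's biased history.
naive = weighted_outcome(p_y_given_u_d, p_d_given_u)
# Backdoor adjustment weights outcomes by the prior P(d) instead.
adjusted = weighted_outcome(p_y_given_u_d, p_d)

print(round(naive, 2), round(adjusted, 2))
```

The only difference between the two estimates is the weighting over the confounder: replacing P(d | u) with P(d) is what cuts the path from the user's history to the prediction in this toy case.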
3.3. Factorization Machine (FM) and Collaborative Filtering
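For readers unfamiliar with factorization machines, here is a minimal sketch of the second-order FM prediction from the cited Rendle paper, computed both with the naive pairwise sum and with the usual linear-time reformulation. The feature vector and parameter values are invented for illustration; in the framework described above the input would be built from the matched user and item embeddings.

```python
def fm_predict(x, w0, w, v):
    """Second-order FM: w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j.

    Uses the O(k * n) identity
    0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2].
    """
    n, k = len(x), len(v[0])
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise


def fm_predict_naive(x, w0, w, v):
    """Same model with the explicit double loop over feature pairs."""
    n = len(x)
    total = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            total += dot * x[i] * x[j]
    return total


# Toy parameters (assumed values): 3 features, 2 latent factors.
x = [1.0, 1.0, 0.0]
w0, w = 0.1, [0.2, 0.3, 0.4]
v = [[1.0, 0.0], [0.5, 1.0], [2.0, 2.0]]

print(fm_predict(x, w0, w, v), fm_predict_naive(x, w0, w, v))
```

Both functions should agree up to floating-point error; the first is the form normally used in practice because it avoids the quadratic loop over feature pairs.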
3.4. Aggregation of KL Divergence Ratings
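The sketch below illustrates one way a symmetric KL divergence could drive the score fusion. It compares a user's interest distribution in an earlier and a later time window, then uses the normalized divergence as a mixing weight between the score predicted before and after the backdoor adjustment. The windowing, the normalization by `max_drift`, and the direction of the weighting (more drift gives more weight to the adjusted score) are assumptions made for this example rather than the paper's exact rule.

```python
import math


def normalize(counts, eps=1e-12):
    """Turn raw counts into a smoothed probability distribution."""
    smoothed = [c + eps for c in counts]
    total = sum(smoothed)
    return [c / total for c in smoothed]


def symmetric_kl(p_counts, q_counts):
    """Symmetric KL divergence: KL(p || q) + KL(q || p)."""
    p, q = normalize(p_counts), normalize(q_counts)
    kl_pq = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log(qi / pi) for pi, qi in zip(p, q))
    return kl_pq + kl_qp


def fuse_scores(score_before, score_after, drift, max_drift):
    """Blend the two predicted scores with a drift-dependent weight in [0, 1]."""
    eta = min(drift / max_drift, 1.0) if max_drift > 0 else 0.0
    return (1.0 - eta) * score_before + eta * score_after


# Hypothetical genre counts for one user, split by timestamp.
early_window = [8, 1, 1]
recent_window = [1, 1, 8]

drift = symmetric_kl(early_window, recent_window)
# max_drift would normally come from all users; here it is an assumed constant.
fused = fuse_scores(score_before=0.2, score_after=0.8, drift=drift, max_drift=5.0)
print(drift, fused)
```

A user whose two windows have identical distributions gets a drift of zero and therefore keeps the unadjusted FM score in this sketch.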
4. Experiments and Results
4.1. Data Set
4.2. Baseline
4.3. Evaluation Metrics
- Precision@N: We defined precision as the ratio of the number of user-preferred items in the top-N recommendations to the total number of recommendations (N). For the multiple users in the test set, we averaged the precision over all users to obtain the final value. Its value ranges from 0 to 1, and a higher value usually indicates better performance. Precision@N is often abbreviated to P@N and is calculated as shown in Formula (28).
- Recall@N: We defined recall as the ratio of the number of user-preferred items in the top-N recommendations to the total number of user-preferred items (that is, the percentage of user-preferred items that were successfully recommended). Similarly, we averaged the recall over all users to obtain the final value. Its value ranges from 0 to 1, and a higher value usually indicates better performance. Recall@N is often abbreviated to R@N and is calculated as shown in Formula (29).
- NDCG@N: We defined normalized discounted cumulative gain as the cumulative gain over the first N positions, discounted by position and then normalized. As before, we averaged the normalized discounted cumulative gain over all users to obtain the final value. Its value ranges from 0 to 1, and a higher value usually indicates better performance. NDCG@N is often abbreviated to N@N and is calculated as shown in Formula (30).
- Hit Rate@N: We defined hit rate as the ratio of the number of users who were successfully recommended to the total number of users (a user counted as successfully recommended if the recommendation list contained at least one of that user’s preferred items). As with the other metrics, the final value was averaged over all users. Its value ranges from 0 to 1, and a higher value usually indicates better performance. Hit Rate@N is often abbreviated as HR@N and is calculated as shown in Formula (31).
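The four metrics described above can be sketched for a single user as follows. This is a minimal version under common conventions, in particular binary relevance and a log2 position discount for NDCG, which are assumptions here; the function names and the toy data are illustrative only.

```python
import math


def precision_at_n(recommended, relevant, n):
    """Share of the top-n recommendations that the user prefers."""
    hits = sum(1 for item in recommended[:n] if item in relevant)
    return hits / n


def recall_at_n(recommended, relevant, n):
    """Share of the user's preferred items found in the top-n list."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:n] if item in relevant)
    return hits / len(relevant)


def ndcg_at_n(recommended, relevant, n):
    """Binary-relevance NDCG with a log2(rank + 1) discount (ranks start at 1)."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(recommended[:n]) if item in relevant)
    ideal_hits = min(len(relevant), n)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def hit_at_n(recommended, relevant, n):
    """1.0 if at least one preferred item appears in the top-n list."""
    return 1.0 if any(item in relevant for item in recommended[:n]) else 0.0


def mean_over_users(per_user_values):
    """Average a per-user metric over all test users."""
    return sum(per_user_values) / len(per_user_values)


# Toy example for one user (hypothetical item ids).
recommended = ["a", "b", "c", "d"]
relevant = {"a", "d", "z"}
print(precision_at_n(recommended, relevant, 4),
      recall_at_n(recommended, relevant, 4),
      ndcg_at_n(recommended, relevant, 4),
      hit_at_n(recommended, relevant, 4))
```

Averaging `hit_at_n` over users with `mean_over_users` gives the hit rate as the fraction of users with at least one successful recommendation.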
5. Result
- Compared with the baseline models we adopted, the RCKFM model performs better on both datasets, surpassing every baseline on all evaluation metrics.
- Compared with the base model CoFM, RCKFM improves the evaluation metrics on the two datasets by 3.17% to 6.81%.
- To assess the effect of TransR graph embedding, we compared the CoFM model with the RFM model, which differ essentially in the Trans-series graph embedding method they use. After using the TransR model to embed relations and entities in different spaces, the metrics of the model improved.
- RCKFM is the main model in this paper. RCKFM outperforms RFM, indicating that adding the causal inference backdoor adjustment and the symmetric KL divergence fusion to the TransR-based model further improves performance.
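The reported improvement range can be checked directly from the CoFM and RCKFM rows of the results table. The short sketch below recomputes the relative improvement for each metric; the dictionary layout and function name are just one convenient way to organize the numbers.

```python
# Scores copied from the CoFM and RCKFM rows of the results table.
cofm = {
    ("MovieLens-1M", "P@10"): 0.1272, ("MovieLens-1M", "R@10"): 0.0728,
    ("MovieLens-1M", "N@10"): 0.0568, ("MovieLens-1M", "HR@10"): 0.2165,
    ("Douban", "P@10"): 0.0925, ("Douban", "R@10"): 0.0443,
    ("Douban", "N@10"): 0.0221, ("Douban", "HR@10"): 0.0896,
}
rckfm = {
    ("MovieLens-1M", "P@10"): 0.1324, ("MovieLens-1M", "R@10"): 0.0763,
    ("MovieLens-1M", "N@10"): 0.0586, ("MovieLens-1M", "HR@10"): 0.2281,
    ("Douban", "P@10"): 0.0964, ("Douban", "R@10"): 0.0467,
    ("Douban", "N@10"): 0.0234, ("Douban", "HR@10"): 0.0957,
}


def relative_improvement(new, base):
    """Relative gain of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


improvements = {
    key: round(relative_improvement(rckfm[key], cofm[key]), 2) for key in cofm
}
for key, value in improvements.items():
    print(key, f"{value:.2f}%")
print(min(improvements.values()), max(improvements.values()))
```

The smallest gain is for N@10 on MovieLens-1M and the largest for HR@10 on Douban, matching the 3.17% to 6.81% range stated above.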
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- He, X.; Liao, L.; Zhang, H.; Nie, L.; Hu, X.; Chua, T.S. Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; pp. 173–182.
- Jiang, W.; Chen, J.; Jiang, Y.; Xu, Y.; Wang, Y.; Tan, L.; Liang, G. A New Time-Aware Collaborative Filtering Intelligent Recommendation System. Comput. Mater. Contin. 2019, 61, 849–859.
- Wulam, A.; Wang, Y.; Zhang, D.; Sang, J.; Yang, A. A recommendation system based on fusing boosting model and DNN model. Comput. Mater. Contin. 2019, 60, 1003–1013.
- Bai, H.; Li, X.; He, L.; Jin, L.; Wang, C.; Jiang, Y. Recommendation algorithm based on probabilistic matrix factorization with adaboost. Comput. Mater. Contin. 2020, 65, 1591–1603.
- Gu, X.; Zhang, G. Energy-efficient computation offloading for vehicular edge computing networks. Comput. Commun. 2021, 166, 244–253.
- Lin, R.; Xie, T.; Luo, S.; Zhang, X.; Xiao, Y.; Moran, B.; Zukerman, M. Energy-efficient computation offloading in collaborative edge computing. IEEE Internet Things J. 2022, 9, 21305–21322.
- Karabetian, A.; Kiourtis, A.; Voulgaris, K.; Karamolegkos, P.; Poulakis, Y.; Mavrogiorgou, A.; Kyriazis, D. An Environmentally-sustainable Dimensioning Workbench towards Dynamic Resource Allocation in Cloud-computing Environments. In Proceedings of the 2022 13th International Conference on Information, Intelligence, Systems & Applications (IISA), Corfu, Greece, 18–20 July 2022; pp. 1–4.
- Wang, H.; Zhang, F.; Xie, X.; Guo, M. DKN: Deep knowledge-aware network for news recommendation. In Proceedings of the 2018 World Wide Web Conference, Lyon, France, 23–27 April 2018; pp. 1835–1844.
- Piao, G.; Breslin, J.G. Transfer learning for item recommendations and knowledge graph completion in item related domains via a co-factorization model. In The Semantic Web, Proceedings of the 15th International Conference, ESWC 2018, Heraklion, Greece, 3–7 June 2018; Proceedings 15; Springer: Berlin/Heidelberg, Germany, 2018; pp. 496–511.
- Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. Adv. Neural Inf. Process. Syst. 2013, 26.
- Lu, D.; Zhu, D.; Du, H.; Sun, Y.; Wang, Y.; Li, X.; Qu, R.; Cao, N.; Higgs, R. Fusion Recommendation System Based on Collaborative Filtering and Knowledge Graph. Comput. Syst. Sci. Eng. 2022, 42, 1133–1146.
- Chicaiza, J.; Valdiviezo-Diaz, P. A comprehensive survey of knowledge graph-based recommender systems: Technologies, development, and contributions. Information 2021, 12, 232.
- Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; Zhu, X. Learning entity and relation embeddings for knowledge graph completion. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015; Volume 29.
- Rendle, S. Factorization machines. In Proceedings of the 2010 IEEE International Conference on Data Mining, Sydney, Australia, 13–17 December 2010; pp. 995–1000.
- Sedhain, S.; Menon, A.K.; Sanner, S.; Xie, L. Autorec: Autoencoders meet collaborative filtering. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 111–112.
- Blondel, M.; Fujino, A.; Ueda, N.; Ishihata, M. Higher-order factorization machines. Adv. Neural Inf. Process. Syst. 2016, 29.
- Yu, X.; Ren, X.; Sun, Y.; Gu, Q.; Sturt, B.; Khandelwal, U.; Norick, B.; Han, J. Personalized entity recommendation: A heterogeneous information network approach. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining, New York, NY, USA, 24–28 February 2014; pp. 283–292.
- Wang, X.; He, X.; Cao, Y.; Liu, M.; Chua, T.S. Kgat: Knowledge graph attention network for recommendation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 950–958.
- Wang, Z.; Lin, G.; Tan, H.; Chen, Q.; Liu, X. CKAN: Collaborative knowledge-aware attentive network for recommender systems. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Xi’an, China, 25–30 July 2020; pp. 219–228.
- Huang, X.; Chen, H.; Zhang, Z. Design and Application of Deep Hash Embedding Algorithm with Fusion Entity Attribute Information. Entropy 2023, 25, 361.
- Wang, W.; Feng, F.; He, X.; Wang, X.; Chua, T.S. Deconfounded recommendation for alleviating bias amplification. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Singapore, 14–18 August 2021; pp. 1717–1725.
- Noia, T.D.; Ostuni, V.C.; Tomeo, P.; Sciascio, E.D. Sprank: Semantic path-based ranking for top-n recommendations using linked open data. ACM Trans. Intell. Syst. Technol. (TIST) 2016, 8, 1–34.
- Zhang, Y.; Ai, Q.; Chen, X.; Wang, P. Learning over knowledge-base embeddings for recommendation. arXiv 2018, arXiv:1803.06540.
- Morik, M.; Singh, A.; Hong, J.; Joachims, T. Controlling fairness and bias in dynamic learning-to-rank. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Xi’an, China, 25–30 July 2020; pp. 429–438.
User–Item Interaction | MovieLens-1M | Douban Dataset |
---|---|---|
User | 4297 | 2519 |
Item | 3883 | 34,893 |
Relationship | 6 | 17 |
Rating | 893,578 | 1,276,928 |
Average of Ratings | 210 | 507 |
Sparsity | 94.64% | 98.55% |
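The sparsity values in the table above are consistent with the usual definition, one minus the ratio of observed ratings to all possible user–item pairs. The short check below assumes that definition; the function name is illustrative.

```python
def sparsity(n_ratings, n_users, n_items):
    """Fraction of the user-item matrix that has no observed rating."""
    return 1.0 - n_ratings / (n_users * n_items)


# Counts taken from the dataset statistics table.
movielens = sparsity(n_ratings=893_578, n_users=4297, n_items=3883)
douban = sparsity(n_ratings=1_276_928, n_users=2519, n_items=34_893)
print(f"{movielens:.2%}", f"{douban:.2%}")
```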
User Preference | System Recommended | System Not Recommended |
---|---|---|
Like | True Positive (TP) | False Negative (FN) |
Dislike | False Positive (FP) | True Negative (TN) |
Dataset | MovieLens-1M | MovieLens-1M | MovieLens-1M | MovieLens-1M | Douban Dataset | Douban Dataset | Douban Dataset | Douban Dataset |
---|---|---|---|---|---|---|---|---|
Method | P@10 | R@10 | N@10 | HR@10 | P@10 | R@10 | N@10 | HR@10 |
FM | 0.1241 | 0.0682 | 0.0549 | 0.2084 | 0.0914 | 0.0418 | 0.0213 | 0.0853 |
CoFM | 0.1272 | 0.0728 | 0.0568 | 0.2165 | 0.0925 | 0.0443 | 0.0221 | 0.0896 |
RFM | 0.1297 | 0.0745 | 0.0576 | 0.2215 | 0.0955 | 0.0452 | 0.0225 | 0.0929 |
FairCo | 0.1306 | 0.0754 | 0.0581 | 0.2243 | 0.0959 | 0.0457 | 0.0228 | 0.0935 |
RCKFM | 0.1324 | 0.0763 | 0.0586 | 0.2281 | 0.0964 | 0.0467 | 0.0234 | 0.0957 |
Improvement over CoFM | 4.09% | 4.81% | 3.17% | 5.36% | 4.22% | 5.42% | 5.88% | 6.81% |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Huang, X.; Wang, J.; Cui, J. A Personalized Collaborative Filtering Recommendation System Based on Bi-Graph Embedding and Causal Reasoning. Entropy 2024, 26, 371. https://doi.org/10.3390/e26050371