Cross-Session Graph and Hypergraph Co-Guided Session-Based Recommendation

Li, Pingrong; Ma, Huifang

doi:10.3390/sym17030389


by Pingrong Li ¹,* and Huifang Ma ²

¹ School of E-Commerce, Longnan Normal University, Longnan 742500, China
² College of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China
* Author to whom correspondence should be addressed.

Symmetry 2025, 17(3), 389; https://doi.org/10.3390/sym17030389

Submission received: 9 February 2025 / Revised: 28 February 2025 / Accepted: 2 March 2025 / Published: 4 March 2025

(This article belongs to the Special Issue Symmetry/Asymmetry in Evolutionary Computation and Machine Learning)


Abstract:

Session-based recommendation (SBR) aims to predict a user’s next item of interest by analyzing their anonymous browsing patterns. While previous studies have demonstrated considerable efficacy, they may fall short when confronted with exceedingly sparse interaction data. This paper presents a novel approach, cross-session graph and hypergraph co-guided session-based recommendation (CGH-SBR), which adeptly forecasts subsequent items while upholding efficiency and precision. First, we construct a directed graph that captures sequential dependencies by modeling cross-session item transitions, alongside building a hypergraph that encapsulates higher-order relationships between items within sessions. Subsequently, we employ two distinct graph neural networks (GNNs) to learn item representations on these two graphs separately. Further, we innovate by integrating a symmetry-aware co-guided learning framework. This framework promotes the integration of diverse perspectives and facilitates mutual learning, leveraging the data’s symmetric properties to enhance the model’s pattern recognition capabilities. Comprehensive experimentation conducted on two public datasets showcases the outstanding performance and potential of the recommendation system presented by CGH-SBR.

Keywords:

session-based recommendation; cross-session graph; hypergraph; graph neural networks; co-guided mechanism

1. Introduction

Session-based recommendation (SBR) has become increasingly important across various online platforms, including E-commerce, social networks, and entertainment industries. Unlike traditional recommendation systems (RS) that depend on user profiles and extensive historical interaction data, SBR mines a user’s recent preferences by analyzing their anonymous browsing sessions, thereby predicting the next item they are likely to interact with [1,2]. This method is particularly valuable in overcoming the challenge of session data sparsity, which arises when information is either entirely missing (e.g., anonymous users) or insufficient (e.g., limited historical interaction records) [3,4].

Recent SBR research has largely been based on graph neural networks (GNNs), treating each session as a graph to learn intricate representations, with promising results [5,6,7]. Many of these studies [8,9] have integrated attention mechanisms to differentiate between long-term and short-term preferences using the current session’s data. Nevertheless, existing GNN-based SBR approaches continue to struggle with data sparsity, especially in the context of limited short-term interactions.

Despite their remarkable accomplishments, current GNN-based SBR methods remain constrained by inherent limitations, particularly stemming from data sparsity caused by the ephemeral nature of user interactions. While the multi-view approach has emerged as a promising paradigm to alleviate the effects of data sparsity, it still faces two fundamental challenges that warrant deeper exploration:

Firstly, the issue of incomplete view construction persists. Existing methods typically construct session graphs based solely on isolated user interactions, treating item transitions as simplistic pairwise relationships. Such an oversimplified manner of representation fails to capture the intricate, higher-order dependencies and complex interaction patterns among items within the same session. This limitation not only results in an incomplete characterization of user preferences but also undermines the ability to model nuanced user behavior effectively.

Secondly, the problem of suboptimal view integration remains underexplored. While multiple views offer complementary information, the fusion process often lacks sophisticated strategies for integrating heterogeneous information sources. Moreover, the inherent noise in observed interactions—such as accidental clicks or exploratory browsing of irrelevant items—poses additional challenges. Conventional methods frequently struggle to effectively suppress this noise while preserving meaningful patterns, leading to potential information dilution and degraded recommendation performance.

These challenges highlight the need for more sophisticated approaches that can better handle the complexity of user interactions and improve the robustness of multi-view SBR systems. Addressing the above challenges is crucial for advancing SBR and improving recommendation accuracy in scenarios with sparse interaction data. By developing more sophisticated view construction techniques and robust view integration strategies, it is possible to enhance the ability of SBR models to handle real-world complexities and provide more personalized recommendations.

To tackle the above challenges, we propose a cross-session graph and hypergraph co-guided session-based recommendation (CGH-SBR) framework. CGH-SBR is designed to capture not only cross-session item transitions but also the higher-order relationships among items within sessions. Initially, we create a directed graph to model cross-session item transitions, capturing sequential dependencies. Furthermore, we construct a hypergraph to model higher-order relationships within sessions. Subsequently, we utilize two distinct graph neural network (GNN) architectures to learn item representations on these graphs. Moreover, we have devised a co-guided learning framework that encourages the integration of diverse viewpoints, facilitating reciprocal learning between them. This method enhances the model’s ability to effectively merge information from various sources. Finally, we adopt a co-guiding mechanism to fuse the learned embeddings from multiple perspectives to derive a comprehensive user representation for generating SBR recommendations. We validate the efficacy of our proposed model through extensive experiments on two real-world datasets.

Note that the proposed CGH-SBR model distinguishes itself from other multi-view approaches in several key ways, particularly in how it integrates and leverages multiple graph structures and a co-guided learning mechanism. Unlike traditional multi-view approaches that often focus on pairwise interactions or static user-item relationships, CGH-SBR explicitly models both cross-session dependencies and intra-session high-order correlations. The directed graph captures sequential dependencies across different sessions, enabling the model to understand long-term user preferences. The hypergraph models higher-order correlations among items within a single session, allowing the model to capture complex relationships that go beyond pairwise interactions. This dual perspective ensures that the model accounts for both short-term and long-term user behavior, making it more robust and comprehensive than other methods. In addition, CGH-SBR introduces a co-guided learning framework that allows the two GNNs to collaborate and learn from each other. This framework encourages knowledge sharing between the two networks, enabling them to leverage complementary strengths. It also captures the interplay between cross-session dependencies and intra-session correlations, leading to more accurate predictions.

The major contributions of this paper are as follows:

We propose the construction of both a directed graph and a hypergraph to learn various item dependencies across entire sessions.

We introduce a co-guided learning scheme that integrates diverse perspectives and enables mutual learning among them, allowing the model to effectively assimilate information from different sources.

We present comprehensive experiments on two public datasets to showcase the effectiveness of our proposed CGH-SBR model.

The remainder of this paper is organized as follows: Section 2 presents the foundational concepts and a precise articulation of our research problem. Section 3 delineates the comprehensive framework and the constituent elements of our approach. Section 4 provides an analysis of the experimental results, and Section 5 concludes our study and offers insights into future work.

2. Preliminaries

The objective of SBR is to anticipate the subsequent actions of users based on their anonymous historical activity sequences. Below, we define the SBR problem.

Let $S = \{s_{1}, s_{2}, \dots, s_{|S|}\}$ represent a collection of sessions over an item set $V = \{v_{1}, v_{2}, \dots, v_{N}\}$, where $N$ is the number of items. An anonymous session $s_{t} = [v_{t,1}, v_{t,2}, \dots, v_{t,n}] \in S$ is a sequence of items ordered by timestamp, where $v_{t,j} \in V$ is the $j$-th clicked item in $s_{t}$ and $n$ is the length of the session, which may contain duplicated items. Let $X^{(0)}$ be the initial item embedding matrix, randomly initialized from a uniform distribution, in which each row is the embedding of one item. To capture high-order representations, we apply GNNs to the two constructed graphs to embed each item $v_{i} \in V$ into the latent space. In particular, let $X_{r}^{(l)}[i] \in ℝ^{d}$ denote the view-specific representation of item $v_{i}$ of dimension $d$ in the $l$-th layer under view $r \in \{c, h\}$, where $c$ and $h$ indicate the directed cross-session graph and the hypergraph, respectively. The final representation of the entire item set is denoted as $X \in ℝ^{N \times d}$. Each session $s_{t}$ is represented by an embedding vector, also written $s_{t}$, that combines the embeddings in $X$ of the items consumed in that session.

Formally, the model takes all session sequences as input; given a target session $s_{t}$, the recommendation framework returns a list of top-K candidate items for the next click $v_{t,n+1}$.

3. Methodology

In this section, we introduce our proposed model, cross-session graph and hypergraph co-guided session-based recommendation (CGH-SBR), which harnesses a novel multi-view co-guided mechanism to optimize session-based recommendations. The model effectively leverages both cross-session information and high-order item correlations. The architecture of our model is illustrated in Figure 1.

In essence, the CGH-SBR model proceeds as follows. First, a cross-session graph (CG) is generated to encapsulate the relationships between items across different sessions; concurrently, a hypergraph (HG) is built to represent the intricate high-order interactions among items. Next, the model applies a query-aware attention mechanism along with hypergraph graph neural networks (HGNNs) to extract and encode item correlations from both the CG and the HG into meaningful item representations (detailed in Section 3.1 and Section 3.2). Subsequently, a co-guided mechanism is introduced to capture the mutual relationships between different views and user interests (explained in Section 3.3). Finally, the prediction layer consolidates the session, item, and influence session embeddings, culminating in the prediction score for the target session-item pair (Section 3.4).

3.1. Multi-View Construction

In this section, we aim to construct two independent and complementary graphs by jointly considering the cross-session relationship and high-order item connection, i.e., a cross-session graph and a hypergraph. We argue that the former illustrates the cross-session item-level temporal relationships, whereas the hypergraph captures the high-order item correlations.

3.1.1. Cross-Session Graph Construction

To capture the pairwise item transitions across all sessions, we construct a cross-session graph from all users’ interaction sequences. This graph encapsulates the sequential relationships between items across different sessions.

In Figure 1b, three sessions are depicted. Initially, each individual session is transformed into a basic session graph. Subsequently, because items recur across sessions, the individual session graphs are merged to form a cross-session graph. The primary aim of extending the session graph to a cross-session graph is to integrate cross-session information into the learning of individual session representations. For instance, consider node v₂: within session S₂, the items preceding and succeeding v₂ are v₈ and v₁, respectively, and within session S₃, v₂ is succeeded by item v₅. Consequently, in the graph, v₂ has two outgoing edges, to v₁ and v₅, and one incoming edge, from v₈. Formally, the cross-session view, represented as $G_{C} = (V, E_{C})$, is a weighted directed graph that spans the whole set of sessions. The node set $V$ contains all item nodes, and the edge set $E_{C}$ contains all weighted directed edges. A triple $(v_{i}, v_{j}, w_{i,j})$ denotes that a user clicked item $v_{j}$ directly after item $v_{i}$ in some session, with $w_{i,j}$ the weight of the edge. The edge weight is computed as the frequency with which the item pair $(v_{i}, v_{j})$ co-occurs across different sessions, reflecting the popularity of the subsequent item.

By structuring cross-session item transitions into a directed graph, we can naturally leverage the strengths of GNNs to process sequential dependencies. GNNs excel at propagating information through nodes and edges, allowing the model to effectively capture long-term user preferences and sequential patterns that span multiple sessions. In addition, the hierarchical feature extraction is critical for understanding the nuanced relationships between items and predicting the next item in a sequence.
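The construction described in this subsection can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `build_cross_session_graph` and the toy sessions are hypothetical, and the weight is assumed to be the raw count of every occurrence of the transition, including repeats inside a single session.

```python
from collections import Counter

def build_cross_session_graph(sessions):
    """Return {(v_i, v_j): w_ij}, counting how often v_j directly follows v_i."""
    weights = Counter()
    for session in sessions:
        # Each pair of consecutive clicks in a session is one directed transition.
        for src, dst in zip(session, session[1:]):
            weights[(src, dst)] += 1
    return dict(weights)

# Hypothetical toy sessions; the item IDs are made up for illustration.
sessions = [
    ["v1", "v2", "v3"],
    ["v8", "v2", "v1"],
    ["v2", "v5", "v2", "v5"],
]
graph = build_cross_session_graph(sessions)

# Successors of v2 in the merged graph, collected from all sessions.
out_edges_v2 = sorted(dst for (src, dst) in graph if src == "v2")
```

Merging happens implicitly here: because edges are keyed by item pairs rather than by session, transitions observed in different sessions accumulate on the same edge.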

3.1.2. Hypergraph Construction

To capture complex relationships beyond pairwise interactions in SBR systems, following [10], we utilize a hypergraph $G_{H} = (V, E_{H})$ in which sessions are represented as hyperedges. As shown in Figure 1b, each hyperedge encompasses all items of a particular session. For example, hyperedge 1 contains all items of the session it represents. Specifically, we define each hyperedge as the set of items $\{v_{t,1}, v_{t,2}, \dots, v_{t,n}\}$ within a session, where each item $v_{t,j} \in V$ is a node in the hypergraph. Upon conversion to the hypergraph, every pair of items clicked within the same session becomes connected. It is important to note that this transforms session sequences into an undirected structure, in line with the view that items within a session are temporally related rather than strictly sequentially dependent. This methodology enables us to explicitly depict many-to-many high-order interactions. Consequently, the hypergraph effectively encapsulates high-order relationships at the item level.

Traditional graphs are limited in their ability to model higher-order correlations between multiple items within a single session. Hypergraphs, on the other hand, allow for the representation of complex relationships involving multiple items simultaneously, making them ideal for capturing high-order dependencies. Moreover, HGNNs enable the aggregation of information from multiple items within a session. This capability is essential for understanding the collective influence of items on user behavior and preferences. The non-linear nature of hypergraphs allows HGNNs to model intricate and non-trivial relationships between items, which are often overlooked by traditional pairwise interaction models.
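As a minimal sketch of the session-as-hyperedge construction above, the hypergraph can be stored as an item-by-session incidence matrix. The helper name `build_incidence_matrix` and the toy data are hypothetical; repeated clicks on the same item within a session are assumed to count once, since a hyperedge is a set.

```python
def build_incidence_matrix(sessions, items):
    """H[i][e] = 1 if item i occurs in session (hyperedge) e, else 0."""
    index = {item: i for i, item in enumerate(items)}
    H = [[0] * len(sessions) for _ in items]
    for e, session in enumerate(sessions):
        for item in set(session):  # duplicates inside a session count once
            H[index[item]][e] = 1
    return H

# Hypothetical toy data: four items, three sessions.
sessions = [["v1", "v2", "v3"], ["v2", "v4"], ["v3", "v4", "v3"]]
items = ["v1", "v2", "v3", "v4"]
H = build_incidence_matrix(sessions, items)

# With unit hyperedge weights, degrees are plain row and column sums.
node_degree = [sum(row) for row in H]          # sessions containing each item
edge_degree = [sum(col) for col in zip(*H)]    # distinct items in each session
```

Note that the click order inside a session is discarded, which matches the undirected treatment described in the text.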

3.2. Dual-Channel Encoding

To comprehensively capture the pairwise cross-session transitions and the intricate high-order correlations among items, we introduce GNNs and HGNNs, respectively, and encode the resulting item embeddings into session-level representations.

3.2.1. Cross-Session Graph Encoding

Initially, we utilize the GCN message-passing mechanism [11] to encapsulate the local context of the transitional signals between items within the cross-session graph. We explicitly articulate the encoding function as follows:

X_{c}^{(l + 1)} = σ (A, X_{c}^{(l)}, W^{(l)}) = σ ({\hat{D}}^{- \frac{1}{2}} \hat{A} {\hat{D}}^{- \frac{1}{2}} X_{c}^{(l)} W^{(l)})

(1)

where $X_{c}^{(l+1)} \in ℝ^{N \times d}$ represents the item representations updated by the $l$-th propagation layer. Each entry $a_{i,j}$ of the adjacency matrix $A$ is set to 1 if there exists a transition from item $v_{i}$ to $v_{j}$, and to 0 otherwise. To integrate the self-propagated signals, we add the identity matrix to the original adjacency matrix, resulting in $\hat{A} = A + I$. We further apply symmetric normalization to the information aggregation, ${\hat{D}}^{-\frac{1}{2}} \hat{A} {\hat{D}}^{-\frac{1}{2}}$, where $\hat{D}$ denotes the diagonal node degree matrix of $\hat{A}$. $σ(\cdot)$ is the ReLU activation function and $W^{(l)} \in ℝ^{d \times d}$ is the trainable weight matrix of the $l$-th propagation layer. After stacking $L$ such layers, we obtain the embeddings $X_{c}^{(L)}$ of the item set $V$ in the cross-session graph.
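To make Equation (1) concrete, the following sketch runs one propagation step on a three-node toy graph using plain Python lists. It is an illustration under stated assumptions rather than the authors' code: the helper names are hypothetical, the degree of a node is assumed to be the row sum of A + I, and the weight matrix is set to the identity so the effect of the normalized aggregation is easy to read.

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def normalized_adjacency(A):
    """Return D^-1/2 (A + I) D^-1/2, where D holds the row sums of A + I."""
    n = len(A)
    A_hat = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_hat]  # at least 1 because of the self-loops
    return [[A_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(A_norm, X, W):
    """One propagation step: ReLU(A_norm X W)."""
    Z = matmul(matmul(A_norm, X), W)
    return [[max(0.0, z) for z in row] for row in Z]

# Toy graph with transitions 0 -> 1 and 1 -> 2 (binary adjacency as in the text).
A = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
X0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # made-up initial embeddings, d = 2
W0 = [[1.0, 0.0], [0.0, 1.0]]               # identity weights for readability
X1 = gcn_layer(normalized_adjacency(A), X0, W0)
```

Node 2 has no outgoing transition, so after the step it keeps its own embedding through the self-loop, while nodes 0 and 1 mix their embedding with that of their successor.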

3.2.2. Hypergraph Encoding

To encapsulate complex interactions among items within a session, hypergraph graph neural networks (HGNNs) [12] are employed to refine item embeddings. Recognizing that diverse perspectives can impart varying significance to recommendation outcomes, it would be imprudent to broadcast the initial item embedding $X^{(0)}$ to all channels directly. Hence, to control the flow of information from the base item embedding $X^{(0)}$ into the HGNN channel, we introduce a prefilter equipped with self-gating units (SGUs), defined as follows:

X_{h}^{(0)} = f_{gate} (X^{(0)}) = X^{(0)} ⊙ σ (X^{(0)} W_{1} + b_{1})

(2)

where $W_{1} \in ℝ^{d \times d}$ and $b_{1} \in ℝ^{d}$ are parameters to be learned, ⊙ is the element-wise product, and σ is a non-linear activation function.
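The self-gating unit of Equation (2) is small enough to write out directly. In this sketch σ is assumed to be the logistic sigmoid (the text only says "non-linear activation"), and the function name `self_gate` and the toy values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def self_gate(X, W, b):
    """Compute X ⊙ sigmoid(X W + b) row by row."""
    gated = []
    for x in X:
        # Pre-activation of the gate for this item: (x W + b)_j.
        pre = [sum(x[k] * W[k][j] for k in range(len(x))) + b[j] for j in range(len(b))]
        gated.append([x_j * sigmoid(p_j) for x_j, p_j in zip(x, pre)])
    return gated

X0 = [[1.0, -2.0], [0.5, 0.0]]     # made-up base embeddings, d = 2
W1 = [[0.0, 0.0], [0.0, 0.0]]      # zero weights make every gate exactly 0.5
b1 = [0.0, 0.0]
Xh0 = self_gate(X0, W1, b1)
```

With zero parameters every dimension is simply halved; once W₁ and b₁ are trained, each dimension gets its own gate value, which is what lets the hypergraph channel filter the shared embedding.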

Building on the spectral hypergraph convolution presented by Feng et al. [12], the hypergraph convolution is defined as follows:

x_{i}^{(l + 1)} = \sum_{j = 1}^{N} \sum_{ε = 1}^{M} H_{i ε} H_{j ε} W_{ε ε} x_{j}^{(l)} P^{(l)}

(3)

where $P^{(l)} \in ℝ^{d \times d}$ is the learnable parameter matrix between two convolution layers and $H \in ℝ^{N \times M}$ is the incidence matrix, with $M$ the number of hyperedges. For $W_{εε}$, we assign the same weight value of 1 to each hyperedge. The matrix form of Equation (3) with row normalization can be expressed as follows:

X_{h}^{(l + 1)} = D^{- 1} H W_{2} B^{- 1} H^{T} X_{h}^{(l)} P^{(l)}

(4)

where $W_{2}$ is the diagonal matrix of hyperedge weights $W_{εε}$, and $D$ and $B$ are the diagonal degree matrices of the nodes and hyperedges, respectively. The hypergraph convolution can be understood as a two-stage refinement that performs an “item-session-item” feature transformation on the hypergraph structure. The multiplication $H^{T} X_{h}^{(l)}$ aggregates information from items to sessions, while pre-multiplying by $H$ aggregates information from sessions back to items. After passing the base embedding $X_{h}^{(0)}$ through $L$ hypergraph convolution layers, the item embeddings of all layers are averaged to obtain the final item embedding:

X_{h}^{*} = \frac{1}{L + 1} \sum_{l = 0}^{L} X_{h}^{(l)}

(5)
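The two-stage aggregation and the layer averaging of Equations (4) and (5) can be sketched as follows. Several simplifications are assumptions made for illustration: D and B are taken to be the diagonal node and hyperedge degree matrices, all hyperedge weights are 1, the learnable matrix P is omitted (treated as the identity), and every item is assumed to appear in at least one session. The function names are hypothetical.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def hypergraph_propagation(H):
    """Return D^-1 H B^-1 H^T for unit hyperedge weights."""
    n, m = len(H), len(H[0])
    node_deg = [sum(row) for row in H]
    edge_deg = [sum(H[i][e] for i in range(n)) for e in range(m)]
    return [[sum(H[i][e] * H[j][e] / edge_deg[e] for e in range(m)) / node_deg[i]
             for j in range(n)] for i in range(n)]

def hypergraph_encode(H, X0, num_layers):
    """Stack num_layers propagation steps and average layers 0..L."""
    prop = hypergraph_propagation(H)
    layers = [X0]
    for _ in range(num_layers):
        layers.append(matmul(prop, layers[-1]))  # P omitted, i.e. identity
    n, d = len(X0), len(X0[0])
    return [[sum(layer[i][j] for layer in layers) / len(layers) for j in range(d)]
            for i in range(n)]

# Toy hypergraph: 3 items, 2 sessions; item 1 belongs to both sessions.
H = [[1, 0], [1, 1], [0, 1]]
prop = hypergraph_propagation(H)
X_star = hypergraph_encode(H, [[2.0], [2.0], [2.0]], num_layers=3)
```

Each row of the propagation matrix sums to 1, so a layer replaces an item's embedding by a weighted average over the items that share a session with it; a constant input therefore stays constant, which is a handy sanity check.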

3.2.3. Session Representation Learning

To generate a session embedding that captures the preferences and dynamics of the session, we then aggregate the representations of all items in the session. Thus, the session embedding represents the collective behavior or interest of the user during that session.

For a given target session $s_{t} = [v_{t,1}, v_{t,2}, \dots, v_{t,n}] \in S$, the next step combines the representations of all its items into a session embedding. Since different items within the session may be of varying importance, we incorporate a soft-attention mechanism to refine the representation, ensuring that the session’s preferences are accurately captured:

s_{t}^{ω} = \sum_{i = 1}^{n} α_{i} X_{ω} [i]

(6)

α_{i} = q^{T} σ (W_{3} X_{ω} [n] + W_{4} X_{ω} [i] + b),

(7)

where $s_{t}^{ω}$, $ω \in \{h, c\}$, denotes the learned session representation under the hypergraph or the cross-session graph, $X_{ω}[i]$ is the embedding of the $i$-th item of the session in view $ω$ (so $X_{ω}[n]$ is the last clicked item), $q \in ℝ^{d}$ is a linear projection vector for generating the weight scalar $α_{i}$, and $W_{3}, W_{4} \in ℝ^{d \times d}$ and $b \in ℝ^{d}$ are learnable transformation matrices and a bias vector.
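A small sketch of the soft attention in Equations (6) and (7) is given below. It assumes that σ is the logistic sigmoid and that the weights α are used unnormalized, as the equations are printed; the function name `session_embedding` and the toy parameters are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def session_embedding(item_vecs, q, W3, W4, b):
    """Return (s, alphas) with s = sum_i alpha_i * x_i, following Eqs. (6)-(7)."""
    last_term = matvec(W3, item_vecs[-1])   # the last click acts as the query
    s = [0.0] * len(item_vecs[0])
    alphas = []
    for x in item_vecs:
        hidden = [sigmoid(u + v + c) for u, v, c in zip(last_term, matvec(W4, x), b)]
        alpha = sum(q_k * h_k for q_k, h_k in zip(q, hidden))
        alphas.append(alpha)
        s = [s_k + alpha * x_k for s_k, x_k in zip(s, x)]
    return s, alphas

zeros = [[0.0, 0.0], [0.0, 0.0]]
items_in_session = [[1.0, 2.0], [3.0, 4.0]]   # made-up item embeddings, d = 2
s_t, alphas = session_embedding(items_in_session, q=[1.0, 1.0],
                                W3=zeros, W4=zeros, b=[0.0, 0.0])
```

With all-zero weight matrices every hidden unit equals 0.5, so each α is 1 and the session embedding reduces to the plain sum of its item embeddings; trained parameters shift the weights toward items that matter given the last click.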

3.2.4. Design Discussion

In this section, we introduced a dual-channel encoding approach with the primary aim of fully exploiting the latent item correlation information in both views. This design offers the following distinct advantages:

Enhanced item interaction dependencies: Traditional graph-based SBR models using graph neural networks (GNNs) often encounter limitations related to single-session item transitions. To tackle this challenge, we introduce cross-session transitions to amplify the latent interaction among items. By merging the cross-session embeddings generated by GNNs with those from cross graphs, we significantly enhance the item interactions within the GNN framework, enabling a more comprehensive capture of contextual information.

High-order item correlation encoding: In contrast to conventional GNN architectures that predominantly focus on pairwise node information, the HGNNs we employ deliberately integrate high-order encoding into the embedding computation process. This inclusion supplements additional intra-session details, leading to a more holistic understanding of the underlying relationships within sessions. Through the incorporation of these design components, our proposed dual encoder surpasses the constraints of traditional GNN-based methods, providing a more comprehensive and effective approach for SBR.

3.3. Co-Guiding Schema

3.3.1. Co-Guiding Learning

As previously discussed in earlier sections, the cross-session embedding and hypergraph embedding are interdependent and collectively determine the accuracy of recommendation predictions. Therefore, after obtaining the embeddings from both perspectives, it is crucial to fuse them effectively to enable the model to extract and utilize insightful information about session properties. A straightforward summing of the embeddings cannot address the intricacies of this task.

In this section, we employ a co-guided learning framework [13] to delineate the intricate relationships between these two types of embeddings and facilitate their mutual enhancement. Specifically, we facilitate the interchange of information between cross-session embeddings and hypergraph embeddings through the concurrent updating of their respective weights. We merge the cross-session embedding with the hypergraph embedding in two distinct manners:

m_{c} = \tanh (W_{5} (s_{c} ⊙ s_{h}) + b_{c})

(8)

m_{j} = \tanh (W_{6} (s_{c} + s_{h}) + b_{j})

(9)

where $W_{5}, W_{6} \in ℝ^{d \times d}$ and $b_{c}, b_{j} \in ℝ^{d}$ are learnable transformation parameters. $m_{c}$ and $m_{j}$ represent the interactive relations between the cross-session embedding and the hypergraph embedding under different semantic spaces.

Then, we utilize a gating mechanism to further model the mutual relations between $s_{c}$ and $s_{h}$ as follows:

r_{c} = σ (W_{7} m_{c} + U_{1} m_{j})

(10)

r_{j} = σ (W_{8} m_{c} + U_{2} m_{j})

(11)

m_{c} = \tanh (W_{9} (r_{c} ⊙ s_{c}) + U_{3} ((1 - r_{j}) ⊙ s_{h}))

(12)

m_{j} = \tanh (W_{10} (r_{c} ⊙ s_{c}) + U_{4} ((1 - r_{j}) ⊙ s_{h}))

(13)

where $W_{i}, U_{j} \in ℝ^{d \times d}$. The gates $r_{c}, r_{j} \in ℝ^{d}$ are remember gates, which control how much of the interaction is retained when modeling the relations between the two embeddings. Additionally, we utilize the complement of $r_{c}$ ($r_{j}$), namely $1 - r_{c}$ ($1 - r_{j}$), to integrate the directed graph (or hypergraph) representations, thereby guiding the learning process, enhancing the semantic relevance of the two types of representations, and enriching the model’s understanding of relationships across sessions.

Finally, we obtain the enhanced cross-session embedding and hypergraph embedding as follows:

s_{c}^{u} = m_{c} ⊙ (s_{c} + m_{j})

(14)

s_{h}^{u} = m_{j} ⊙ (s_{h} + m_{c})

(15)

where $s_{c}^{u}$ and $s_{h}^{u}$ are the cross-session embedding and the hypergraph embedding, each enriched with the semantics of the other. Note that the two embeddings extract information from each other to guide the learning process, which enables our method to model the complex relations between the two channels when making recommendations.
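The following sketch traces Equations (8) to (15) step by step for a single session, following the equations exactly as printed (including the reuse of the names m_c and m_j in Equations (12) and (13)). It is not the authors' implementation: σ is assumed to be the logistic sigmoid, all parameters are random placeholders, and the function and dictionary key names are hypothetical.

```python
import math
import random

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def mul(a, b):
    return [x * y for x, y in zip(a, b)]

def tanh_vec(v):
    return [math.tanh(z) for z in v]

def sigmoid_vec(v):
    return [1.0 / (1.0 + math.exp(-z)) for z in v]

def co_guide(s_c, s_h, p):
    """Apply Eqs. (8)-(15) as printed; p maps parameter names to values."""
    m_c = tanh_vec(add(matvec(p["W5"], mul(s_c, s_h)), p["b_c"]))          # Eq. (8)
    m_j = tanh_vec(add(matvec(p["W6"], add(s_c, s_h)), p["b_j"]))          # Eq. (9)
    r_c = sigmoid_vec(add(matvec(p["W7"], m_c), matvec(p["U1"], m_j)))     # Eq. (10)
    r_j = sigmoid_vec(add(matvec(p["W8"], m_c), matvec(p["U2"], m_j)))     # Eq. (11)
    kept_c = mul(r_c, s_c)                        # r_c ⊙ s_c
    kept_h = mul([1.0 - r for r in r_j], s_h)     # (1 - r_j) ⊙ s_h
    m_c = tanh_vec(add(matvec(p["W9"], kept_c), matvec(p["U3"], kept_h)))   # Eq. (12)
    m_j = tanh_vec(add(matvec(p["W10"], kept_c), matvec(p["U4"], kept_h)))  # Eq. (13)
    return mul(m_c, add(s_c, m_j)), mul(m_j, add(s_h, m_c))                # Eqs. (14)-(15)

d = 4
rng = random.Random(0)  # fixed seed so the sketch is reproducible
names = ["W5", "W6", "W7", "W8", "W9", "W10", "U1", "U2", "U3", "U4"]
params = {name: [[rng.gauss(0.0, 0.1) for _ in range(d)] for _ in range(d)] for name in names}
params["b_c"] = [0.0] * d
params["b_j"] = [0.0] * d

s_c = [0.2, -0.1, 0.4, 0.0]   # made-up cross-session session embedding
s_h = [0.1, 0.3, -0.2, 0.5]   # made-up hypergraph session embedding
s_c_u, s_h_u = co_guide(s_c, s_h, params)
```

Because the tanh outputs lie strictly between -1 and 1, each enhanced embedding is a damped mixture: every component of s_c^u is bounded in magnitude by the corresponding component of s_c plus one, and likewise for s_h^u.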

3.3.2. Design Discussion

In this section, we present a co-guiding learning schema with the primary aim of fully exploiting the mutual correlation information within two embeddings. This design offers several benefits:

Diverse Perspective Integration: By using multiple GNNs, each potentially capturing different aspects of the data, the co-guiding mechanism allows for a more comprehensive understanding of the items and their relationships.

Mutual Learning: The co-guiding process facilitates the exchange of knowledge between the different GNNs, leading to a more robust and nuanced learning process. This mutual learning can help in uncovering hidden patterns that a single GNN might miss.

3.4. Model Optimization

Intuitively, the relevance of an item to the current session’s preferences determines its importance for recommendation. Once we have obtained the embeddings for each session, we concatenate the embeddings of a session learned through both channels. This allows us to calculate the score for each potential item, which is defined as follows:

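{\hat{z}}_{s_{t}, v_{k}} = \mathrm{MLP} (s_{t}^{c} \oplus s_{t}^{h} \oplus x_{c}^{k} \oplus x_{h}^{k}),

(16)

where ⊕ denotes concatenation, and $x_{c}^{k}$ and $x_{h}^{k}$ are the embeddings of candidate item $v_{k}$ in the two views. Next, we use the softmax function to obtain the model output as follows:

{\hat{y}}_{s_{t}, v_{k}} = \mathrm{softmax} ({\hat{z}}_{s_{t}, v_{k}}).

(17)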

For each session, the loss function is defined as the cross-entropy between the predicted outcomes and the ground truth:

L_{S} = - \sum_{s_{t} \in S} \sum_{v_{k} \in V} \left[ y_{s_{t}, v_{k}} \log ({\hat{y}}_{s_{t}, v_{k}}) + (1 - y_{s_{t}, v_{k}}) \log (1 - {\hat{y}}_{s_{t}, v_{k}}) \right]

(18)

where $y_{s_{t}, v_{k}}$ is the entry for item $v_{k}$ in the one-hot ground-truth vector of session $s_{t}$ in the real data.

Finally, the objective function of the SBR task is given as follows:

L = L_{S} + λ \| Θ \|_{2}^{2}

(19)

where $Θ$ is the set of model parameters, $λ$ is a hyperparameter, and $\| Θ \|_{2}^{2}$ is the L2 regularization term, weighted by $λ$, that prevents over-fitting.
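As a hedged illustration of Equations (17) to (19), the sketch below turns a vector of candidate scores into probabilities and evaluates the regularized objective. The scores stand in for the MLP outputs of Equation (16), which are not modeled here; the function names, the small `eps` guard against log(0), and the toy numbers are assumptions, while the default λ of 1e-5 is the value reported in the implementation details.

```python
import math

def softmax(scores):
    """Numerically stable softmax over the candidate scores (Eq. 17)."""
    m = max(scores)
    exps = [math.exp(z - m) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]

def session_loss(scores, target_index, eps=1e-12):
    """Cross-entropy of Eq. (18) for one session with a one-hot target."""
    probs = softmax(scores)
    loss = 0.0
    for k, p in enumerate(probs):
        y = 1.0 if k == target_index else 0.0
        loss -= y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps)
    return loss

def objective(batch_scores, batch_targets, params, lam=1e-5):
    """Eq. (19): summed session losses plus lam times the squared L2 norm."""
    data_term = sum(session_loss(s, t) for s, t in zip(batch_scores, batch_targets))
    l2_term = sum(w * w for w in params)
    return data_term + lam * l2_term

uniform_loss = session_loss([0.0, 0.0], target_index=0)
confident_loss = session_loss([5.0, 0.0, 0.0], target_index=0)
uninformed_loss = session_loss([0.0, 0.0, 0.0], target_index=0)
total = objective([[0.0, 0.0]], [0], params=[1.0, 2.0], lam=0.1)
```

For two equally scored candidates the loss equals 2 ln 2, and raising the score of the true next item lowers the loss, which is the behavior the training objective relies on.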

4. Experiments

To verify the efficacy of the proposed CGH-SBR model and the precision of its recommendation results, a series of experiments is conducted on two datasets. These experiments aim to address the following research questions:

RQ1: What are the performance benefits of the CGH-SBR model over existing SBR techniques?

RQ2: How impactful is each component of the CGH-SBR in ensuring accurate recommendation results?

RQ3: How do varying hyperparameters influence the performance of the CGH-SBR method?

Concretely, this section is structured as follows: We first provide a detailed description of the datasets employed for the experiments. We then introduce the baseline models and the experimental framework. Lastly, we outline the experiments designed to validate the proposed CGH-SBR, analyzing the results to demonstrate its effectiveness.

4.1. Datasets

For our evaluation, we utilize two distinct datasets [14,15]:

Tmall: Originating from the IJCAI-15 competition, the dataset comprises anonymized user shopping logs from the Tmall online shopping platform.

Retailrocket: This is a dataset from a Kaggle contest, published by an E-commerce company, containing users’ browsing activity over a six-month period.

The comprehensive statistical details for both datasets are presented in Table 1. This includes the number of training sessions, test sessions, items within the datasets, as well as the average length of sessions.

4.2. Baselines and Experimental Settings

Baselines

To thoroughly evaluate the efficacy of our proposed algorithm, we engaged in a comparative analysis with several renowned SBR algorithms from different research streams, detailed as follows:

(1): RNN-based Approach:

GRU4REC [8]: This model employs multiple stacked GRU layers to encode session sequences and utilizes a ranking loss for model training.

(2): Attention-based Approaches:

NARM [16]: This neural attentive model augments a recurrent encoder with an attention mechanism that weights the items of a session differently when encoding the sequence.

STAMP [17]: This approach replaces all RNN encoders from prior work with attention layers, aiming to better capture both current user interests and general user interests.

(3): GNN-based Approaches:

SR-GNN [9]: This model represents every session as a graph and then employs a gated graph neural network to capture the complex item transitions inside sessions.

FGNN [18]: This model converts sessions into a global graph and employs a graph attention layer to learn item representations.

S2-DHCN [10]: It designs two types of hypergraphs to learn inter- and intra-session information and employs self-supervised learning to enhance session-based recommendation.

COTREC [19]: This model takes into account the internal and external connectivity of sessions and frames it as a contrastive learning model for session-based recommendation.

KGCL [20]: This builds an item attribute hypergraph with item knowledge and develops HCNs to establish the associations among items with common attributes and encode the complex high-order information among items.

HEML [21]: This model extracts dynamic multiple interests from dual-scale sequential patterns and constructs a hypergraph to model global multi-order dependencies.

Evaluation Metrics

To evaluate the performance of the methods on the test set, we followed a common strategy for top-K recommendation and preference ranking. We used two widely adopted evaluation metrics, P@K and MRR@K, where K denotes the number of top recommendations considered.
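For readers unfamiliar with these metrics, the sketch below computes both under the definitions commonly used in session-based recommendation, which are assumed here since the text does not spell them out: P@K is the fraction of test sessions whose true next item appears among the top K recommendations, and MRR@K averages the reciprocal rank of the true item, counting 0 when it falls outside the top K. The function name and toy data are hypothetical.

```python
def precision_and_mrr_at_k(ranked_lists, targets, k):
    """Return (P@K, MRR@K) for ranked recommendation lists and true next items."""
    hits = 0
    reciprocal_rank_sum = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k = ranked[:k]
        if target in top_k:
            hits += 1
            reciprocal_rank_sum += 1.0 / (top_k.index(target) + 1)
    n = len(targets)
    return hits / n, reciprocal_rank_sum / n

# Three made-up test sessions: a hit at rank 1, a hit at rank 2, and a miss.
ranked_lists = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
targets = ["a", "a", "d"]
p_at_2, mrr_at_2 = precision_and_mrr_at_k(ranked_lists, targets, k=2)
```

P@K only asks whether the target is in the list, whereas MRR@K also rewards placing it near the top, so the two metrics together describe both coverage and ranking quality.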

Implementation Detail

Our proposed CGH-SBR model is implemented with PyTorch (version 0.4.0), a widely used open-source deep learning framework. Consistent with the setup in [3,20,21], we set the embedding size d to 100, the mini-batch size to 100, and the L2 regularization coefficient to $10^{-5}$ to mitigate overfitting. All parameters are initialized from a Gaussian distribution with a mean of 0 and a standard deviation of 0.1. We trained our model for 30 epochs with the Adam optimizer, using a learning rate of 0.001 and a decay rate of 0.96. A three-layer architecture yielded the best performance. For the baseline models, we adopted the best parameter configurations reported in their respective original papers and reported their results directly where available, as we utilized the same datasets and evaluation metrics.

4.3. Performance Comparison (RQ1)

First, we benchmarked the CGH-SBR model against the baseline algorithms; the results are presented in Table 2.

Our evaluation uses top-K metrics, following [20,21]. The best result in each column is shown in bold, and the best result among the baseline models is underlined. We also report the improvement rate of CGH-SBR over the second-best method, which quantifies the performance gain over the baselines on each metric. Overall, the following observations can be made from this table:
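As an illustration of how such an improvement rate can be computed, the sketch below assumes the usual relative-gain formula, (ours - best baseline) / best baseline × 100, and applies it to the Tmall columns of Table 2, where HEML is the second-best method on every metric. The formula and the function name are assumptions; the text does not spell out the exact definition.

```python
def relative_improvement(ours, best_baseline):
    """Relative gain of `ours` over `best_baseline`, in percent (assumed definition)."""
    return (ours - best_baseline) / best_baseline * 100.0


# Tmall results from Table 2: (CGH-SBR, HEML) for each metric.
tmall = {
    "P@10": (34.35, 32.38),
    "M@10": (19.86, 18.32),
    "P@20": (39.59, 38.48),
    "M@20": (21.08, 19.12),
}

for metric, (ours, baseline) in tmall.items():
    print(metric, round(relative_improvement(ours, baseline), 2))
```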

GRU4REC was the pioneering session-based model to utilize a recurrent architecture for capturing sequential data. However, a notable limitation of RNN-based approaches is ‘catastrophic forgetting’, where initial information is lost as sequences progress. The attention-based baselines, NARM and STAMP, integrate an attention mechanism that concentrates on the last item, treating it as the key element. This strategy overcomes the linear sequence processing of RNNs, as evidenced by their superior performance over GRU4REC. This is further supported by STAMP, which abandons the recurrent architecture altogether, places strong emphasis on the last item, and outperforms NARM on most metrics.
Among the GNN-based models, S2-DHCN and COTREC show consistent improvement over their respective original models, SR-GNN and FGNN, across all datasets. This indicates that contrastive learning does enhance recommendation performance. Moreover, KGCL and HEML exhibit consistently better performance than their counterparts across all datasets, suggesting that the hypergraph structure indeed contributes to the effectiveness of recommendations.
The CGH-SBR model consistently outperforms all baseline methods on all four metrics across the two benchmark datasets, which highlights its capability in session-based recommendation tasks. The success of CGH-SBR can be attributed to several factors. First, CGH-SBR leverages cross-session item transitions to capture long-term user preferences and sequential dependencies. Second, CGH-SBR goes further by modeling higher-order correlations among items within sessions, which can capture complex relationships involving multiple items simultaneously. This ability to account for multi-item contexts enhances the model’s capacity to understand nuanced user preferences and recommend items that align closely with their interests. Finally, the adoption of the co-guided learning framework further improves performance by fostering collaboration between the GNN and HGNN. By capturing the interplay between the cross-session graph and the hypergraph, the framework ensures that the model benefits from both local and global perspectives of user-item interactions.

4.4. Ablation Studies (RQ2)

In this subsection, we integrate FGNN as an additional framework to further evaluate the impact of each of the novel components we propose. In particular, FGNN employs gated graph neural networks (GNNs) to encapsulate the intricate cross-session transitions between items. Thus, in our experimental setup, we perform ablation studies on the individual components of both CGH-SBR and FGNN by removing or integrating the hypergraph channel and the co-guided learning schema, respectively. The results of these model configurations are detailed in Table 3.

From the table, we make the following observations:

(1) Effectiveness of dual-channel mechanism

The dual-channel architecture indeed boosts performance by explicitly accounting for both cross-item transitions and hypergraph high-order correlations.

As demonstrated in Table 3, the dual-channel design (Model-1) surpasses the FGNN baseline in performance. This improvement stems from our model’s capacity to encompass a comprehensive range of item features, incorporating both cross-item transitions and hypergraph high-order correlations, crucial for accurate and reliable recommendation results. Results from Model-3 and Model-4 indicate that excluding either channel leads to a decline in performance. This highlights that solely focusing on cross-item transitions or hypergraph high-order correlations is insufficient for the model to fully comprehend the intricate nature of item interactions. Consequently, our dual-channel approach effectively integrates both types of interactions, enabling the model to more adeptly capture complexities and enhance overall recommendation performance.

(2) Effectiveness of Co-guided Learning Schema

To evaluate the effectiveness of this framework, we integrate it into Model-1, resulting in the creation of Model-2. As illustrated in Table 3, the performance of FGNN is notably enhanced with the incorporation of the co-guided learning framework. Furthermore, when we substitute the co-guided learning framework with vector concatenation, as demonstrated by Model-5, there is a significant decrease in performance, indicating that the absence of the co-guided learning framework impedes the model’s ability to capture the reciprocal influence and synergistic effects between the two types of interactions. These results underscore that our model effectively learns distinct session representations by independently modeling the two interaction types.

In essence, the dual-channel architecture enables the accurate differentiation and representation of both interaction types within the model. Additionally, the co-guided learning framework boosts the model’s capability to capture the intricate relationships between these interactions, leading to superior performance.
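The ablation variants can be read as switches on the three components. The following sketch encodes the CGH-SBR variants with the component flags and Tmall P@20 values listed in Table 3 and computes how many points each removal costs relative to the full model. The `Variant` class and `drop_vs_full` helper are illustrative names introduced here, not part of the original implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    cross_session: bool
    hypergraph: bool
    co_guided: bool
    p_at_20: float  # Tmall P@20 as listed in Table 3


variants = [
    Variant("CGH-SBR", True, True, True, 39.59),
    Variant("Model-3", True, False, False, 33.18),  # cross-session channel only
    Variant("Model-4", False, True, False, 34.52),  # hypergraph channel only
    Variant("Model-5", True, True, False, 36.88),   # concatenation instead of co-guiding
]


def drop_vs_full(variants, full_name="CGH-SBR"):
    """Absolute P@20 loss of each ablated variant relative to the full model."""
    full = next(v for v in variants if v.name == full_name)
    return {v.name: round(full.p_at_20 - v.p_at_20, 2)
            for v in variants if v.name != full_name}


print(drop_vs_full(variants))
```

Read this way, the single-channel variants lose more than the variant that keeps both channels but replaces co-guided learning with concatenation, which matches the ordering discussed above.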

4.5. The Impact of Hyper-Parameters (RQ3)

In the subsequent sections, we conduct comprehensive experiments to ascertain the influence of various hyperparameters on the performance of the CGH-SBR model.

4.5.1. Hidden State Dimensionality d

We varied the embedding dimension d from 20 to 140. Figure 2 presents the results in terms of P@20 and M@20 on both datasets. Performance improves as the embedding dimension increases from 20 to 100. When the dimension is increased further from 100 to 140, performance declines slightly on both datasets. This decline may be attributed to overfitting, where an overly large embedding encodes more information than the model can effectively use. Notably, CGH-SBR is robust to changes in the embedding dimension, as only minor variations in performance are observed. This suggests that CGH-SBR is not very sensitive to the hidden state dimensionality.

4.5.2. Depth of Graph Convolution L

To assess the model’s performance with varying layer depths, experiments were conducted across different datasets. Figure 3 presents the experimental results for each dataset in terms of P@20 and M@20.

It is evident from Figure 3 that the optimal performance is achieved with a three-layer architecture for both datasets. It is important to note that as the depth of layers increases, so does the computational cost of the model, and performance begins to decline after reaching a certain threshold. This is likely due to the addition of more embedding propagation layers, which can introduce noise signals when modeling associations between items, leading to over-smoothing.
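The over-smoothing effect mentioned above can be illustrated on a toy graph. The sketch below applies a plain mean aggregation over each node's neighbours and itself, which is an assumed stand-in and not the propagation rule of CGH-SBR, and tracks how far apart the node embeddings remain as layers are stacked.

```python
def propagate(adj, feats):
    """One layer of mean aggregation over each node's neighbours and itself."""
    dim = len(feats[0])
    out = []
    for i, row in enumerate(adj):
        neigh = [j for j, a in enumerate(row) if a or j == i]
        out.append([sum(feats[j][d] for j in neigh) / len(neigh) for d in range(dim)])
    return out


def spread(feats):
    """Largest per-dimension gap between any two node embeddings."""
    dim = len(feats[0])
    return max(max(f[d] for f in feats) - min(f[d] for f in feats)
               for d in range(dim))


# Undirected path graph 0-1-2-3 with hypothetical 2-d item embeddings.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

for layers in range(7):
    print(layers, round(spread(feats), 4))
    feats = propagate(adj, feats)
```

As the number of layers grows, the spread shrinks toward zero, meaning the embeddings of different items become harder to tell apart. This is one plausible reading of why accuracy stops improving beyond a moderate depth.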

5. Conclusions and Future Work

In this study, we present a novel method termed cross-session graph and hypergraph co-guided session-based recommendation (CGH-SBR), which adeptly forecasts the next item with a focus on efficiency and precision. Initially, we structure cross-session item transitions into a directed graph that encapsulates sequential dependencies, and we also model higher-order correlations among items within sessions to construct a hypergraph. Next, we develop two specialized graph neural networks (GNNs) to extract distinct item representations from these dual graph structures. Following this, we introduce a co-guided learning framework that fosters the amalgamation of varied viewpoints and enables collaborative learning among them. Comprehensive experimentation on two benchmark datasets has validated its superiority, demonstrating its exceptional performance and potential in the domain of recommendation systems. However, the incorporation of cross-session dependencies and higher-order correlations into the model may result in increased computational overhead, especially when dealing with large-scale datasets. This potential for added complexity could constrain the model’s scalability in practical, real-world scenarios. Furthermore, although the capacity to capture higher-order correlations augments the model’s ability to represent complex relationships, it also raises the likelihood of overfitting, particularly when the training data lacks diversity or comprises a small sample size.

While our current approach leverages static graph structures, future work could explore dynamic graph modeling techniques to adapt to evolving user preferences and item relationships over time. Incorporating temporal information into both the directed graph and the hypergraph could further enhance the model’s ability to capture real-time user behavior. Moreover, the integration of additional data sources, such as user demographics, contextual information, and item attributes, could provide richer representations for both graphs. This would enable the model to consider more comprehensive factors when making recommendations, potentially improving personalization.

Author Contributions

P.L. is responsible for conceptualization, methodology, software validation and formal analysis. H.M. is responsible for supervision, review and editing, and funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the Industrial Support Project of Gansu Colleges (2022CYZC-11) and the National Natural Science Foundation of China (61762078).

Data Availability Statement

This study uses public datasets, which can be accessed at the following links: https://tianchi.aliyun.com/dataset/dataDetail?dataId=42 (accessed on 1 March 2025); https://www.kaggle.com/retailrocket/ecommerce-dataset (accessed on 1 March 2025).

Acknowledgments

I am grateful to the staff and facilities at the School of E-Commerce, Longnan Normal University, for their assistance and resources, which were instrumental in the completion of this work. Additionally, I appreciate the helpful feedback and suggestions provided by the anonymous reviewers, which greatly improved the quality of this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Xin, X.; Yang, L.; Zhao, Z.Q.; Ren, P.J.; Chen, Z.M.; Ma, J.; Ren, Z.C. On the effectiveness of unlearning in session-based recommendation. In Proceedings of the 17th ACM International Conference on Web Search and Data Mining, Merida, Mexico, 4–8 March 2024; pp. 855–863.
2. Wang, Z.Y.; Wei, W.; Zou, D.; Liu, Y.F.; Li, X.L.; Mao, X.L.; Qiu, M.H. Exploring global information for session-based recommendation. Pattern Recognit. 2024, 145, 109911.
3. Xin, L.; Li, Z.; Gao, Y.F.; Yang, J.F.; Cao, T.Y.; Wang, Z.Y.; Yin, B.; Song, Y.Q. Enhancing user intent capture in session-based recommendation with attribute patterns. Adv. Neural Inf. Process. Syst. 2024, 36, 30821–30839.
4. Anand, V.; Maurya, A. A survey on recommender systems using graph neural network. ACM Trans. Inf. Syst. 2025, 43, 1–49.
5. Bai, X.; Huang, Y.; Peng, H.; Yang, Q.; Wang, J.; Liu, Z. Spiking neural self-attention network for sequence recommendation. Appl. Soft Comput. 2025, 169, 112623.
6. Liang, S.; Zheng, Z.; Zhang, G.; Kong, Q. SPECN: Sequential patterns enhanced capsule network for sequential recommendation. Appl. Intell. 2015, 55, 204.
7. An, G.; Sun, J.; Yang, Y.; Sun, F. Enhancing collaborative information with contrastive learning for session-based recommendation. Inf. Process. Manag. 2024, 61, 103738.
8. Hidasi, B.; Karatzoglou, A.; Baltrunas, L.; Tikk, D. Session-based recommendations with recurrent neural networks. In Proceedings of the 4th International Conference on Learning Representations, San Juan, PR, USA, 2–4 May 2016.
9. Wu, S.; Tang, Y.; Zhu, Y.; Wang, L.; Xie, X.; Tan, T. Session-based Recommendation with Graph Neural Network. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 346–353.
10. Xia, X.; Yin, H.Z.; Yu, J.L.; Wang, Q.Y.; Cui, L.Z.; Zhang, X.L. Self-Supervised Hypergraph Convolutional Networks for Session-based Recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence, Palo Alto, CA, USA, 2–9 February 2021; pp. 4503–4511.
11. Choi, M.; Kim, H.; Cho, Y.; Lee, J. Multi-intent-aware session-based recommendation. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Washington, DC, USA, 14–18 July 2024; pp. 2532–2536.
12. Feng, Y.; You, H.; Zhang, Z.; Ji, R.; Gao, Y. Hypergraph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 3558–3565.
13. Wu, Z.; Ma, H.; Deng, B.; Li, Z.; Chang, L. Co-guided Dual-channel Graph Neural Networks for the prediction of compound-protein interaction. Appl. Soft Comput. 2024, 163, 111875.
14. Li, Z.; Yang, C.; Chen, Y.; Wang, X.; Chen, H.; Xu, G.; Yao, L.; Sheng, M. Graph and Sequential Neural Networks in Session-based Recommendation: A Survey. ACM Comput. Surv. 2024, 57, 1–37.
15. Su, J.; Chen, C.; Toh, D.; Huang, S. Evolving intra-and inter-session graph fusion for next item recommendation. Inf. Fusion 2025, 114, 102691.
16. Li, J.; Ren, P.; Chen, Z.; Ren, Z.; Lian, T.; Ma, J. Neural attentive session-based recommendation. In Proceedings of the ACM Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 1419–1428.
17. Liu, Q.; Zeng, Y.F.; Mokhosi, R.; Zhang, H.B. STAMP: Short-term attention/memory priority model for session-based recommendation. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD’18), London, UK, 19–23 August 2018; pp. 1831–1839.
18. Qiu, R.H.; Huang, Z.; Yin, H.Z.; Lin, J.J. Exploiting Cross-session Information for Session-based Recommendation with Graph Neural Networks. ACM Trans. Inf. Syst. (TOIS) 2020, 38, 1–23.
19. Xia, X.; Yin, H.; Yu, J.; Shao, Y.; Cui, L. Self-Supervised Graph Co-Training for Session-based Recommendation. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Gold Coast, QLD, Australia, 1–5 November 2021; pp. 2180–2190.
20. Zhang, X.; Ma, H.; Yang, F.; Li, Z.; Chang, L. KGCL: A Knowledge-enhanced Graph Contrastive learning framework for session-based recommendation. Eng. Appl. Artif. Intell. 2023, 124, 106512.
21. Li, Q.; Ma, H.; Jin, W.; Ji, Y.; Li, Z. Hypergraph-Enhanced Multi-interest Learning for multi-behavior sequential recommendation. Expert Syst. Appl. 2024, 255, 124497.

Figure 1. The overall framework of CGH-SBR: (a) the input session data; (b,c) the two graph construction processes; (d) the learning process.

Figure 2. Performance comparison for different hidden state dimensionalities d.

Figure 3. Performance comparison for different depths of graph convolution L.

Table 1. Dataset statistics.

	Tmall	Retailrocket
Number of training sessions	351,268	433,643
Number of test sessions	25,898	15,132
Number of items	40,728	36,968
Average length	6.69	5.43

Table 2. Performance of all compared methods on both datasets, with the best results in boldface and the best baseline results underlined.

Datasets	Tmall				Retailrocket
Metrics	P@10	M@10	P@20	M@20	P@10	M@10	P@20	M@20
GRU4REC	9.46	5.77	10.92	5.89	38.37	23.30	44.22	23.59
NARM	19.16	10.42	23.34	10.72	42.06	24.85	50.08	24.38
STAMP	22.61	13.13	26.44	13.37	42.97	24.62	50.97	25.38
SR-GNN	23.42	13.44	27.58	13.69	43.18	26.05	51.34	26.56
FGNN	27.95	14.98	33.32	15.51	45.38	26.50	52.67	26.91
S2-DHCN	26.21	14.59	31.43	15.02	46.12	26.83	53.67	27.32
COTREC	30.45	17.43	36.33	17.93	48.41	29.38	56.09	29.85
KGCL	31.35	17.76	37.53	18.07	48.89	29.79	56.98	30.51
HEML	32.38	18.32	38.48	19.12	49.92	30.62	57.25	32.68
CGH-SBR	34.35	19.86	39.59	21.08	50.89	32.79	58.98	34.51

Table 3. Ablation study results on both datasets. The best result is marked in bold.

Dataset	Model	Cross-Session	Hypergraph	Co-Guided Learning	P@10	M@10	P@20	M@20
Tmall	FGNN	√	-	-	27.95	14.98	33.32	15.51
Tmall	Model-1	√	√	-	29.12	17.25	35.21	17.25
Tmall	Model-2	√	√	√	33.24	18.25	38.01	20.02
Tmall	CGH-SBR	√	√	√	34.35	19.86	39.59	21.08
Tmall	Model-3	√	-	-	28.76	15.23	33.18	15.49
Tmall	Model-4	-	√	-	29.35	16.22	34.52	16.25
Tmall	Model-5	√	√	-	32.22	17.68	36.88	18.25
Retailrocket	FGNN	√	-	-	45.38	26.50	52.67	26.91
Retailrocket	Model-1	√	√	-	46.12	27.20	53.35	27.89
Retailrocket	Model-2	√	√	√	48.32	31.52	57.25	32.58
Retailrocket	CGH-SBR	√	√	√	50.89	32.79	58.98	34.51
Retailrocket	Model-3	√	-	-	45.32	26.65	52.36	27.03
Retailrocket	Model-4	-	√	-	46.12	27.12	53.31	27.04
Retailrocket	Model-5	√	√	-	48.32	30.58	55.32	32.57


© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Li, P.; Ma, H. Cross-Session Graph and Hypergraph Co-Guided Session-Based Recommendation. Symmetry 2025, 17, 389. https://doi.org/10.3390/sym17030389

