Two Tower LLM Recommendation

The document describes the development of a Two-Tower Recommendation System using Graph Neural Networks (GNNs), Large Language Models (LLMs), and Reinforcement Learning (RL) to enhance personalized recommendations for Yelp business data. It outlines the architecture, embedding enhancements, and training workflow, focusing on how user and item profiles are encoded and optimized for better recommendation accuracy. The system leverages advanced embeddings and contextual reasoning to improve user-item similarity and utilizes a Q-network for reinforcement learning to refine recommendation strategies.

Uploaded by

Noor Uddin

Two_Tower_Final

December 15, 2024

0.1 A Two-Tower Recommendation System Powered by GNNs, LLMs, and RL for Yelp Business Data

This notebook outlines the development of a personalized recommendation system utilizing both
Graph Neural Networks (GNNs) and Large Language Models (LLMs). We integrate Reinforcement
Learning (RL) to optimize recommendations and use advanced embeddings for richer user-item
profiles. This pipeline focuses on the following:
1. Constructing a Two-Tower Model with GNN-based user and item embeddings.
2. Leveraging LLMs to enhance user-item similarity via contextual reasoning.
3. Applying RL with an LLM-based reward estimator and chain-of-thought reasoning.

0.2 Technical Explanation

0.2.1 Two-Tower Architecture

The Two-Tower model consists of two independent towers:
1. User Tower: Encodes user profiles using GNNs and user embeddings.
2. Item Tower: Encodes item attributes using GNNs and item embeddings.
These towers generate latent representations that are combined to compute similarity scores. An
additional LLM Adapter refines the embeddings with contextual language features by projecting
the LLM embeddings into the same space as the GNN embeddings.
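The scoring step described above can be sketched in plain Python. This is a minimal illustration with hypothetical toy vectors, not the notebook's implementation: it assumes, as the TwoTowerModel later in the notebook does, that the adapted LLM vector is added to both tower outputs and that the score is their dot product.

```python
# Toy sketch of two-tower scoring; all vectors here are made-up examples.
def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def two_tower_score(user_gnn_vec, item_gnn_vec, llm_vec):
    # The same adapted LLM vector is added to both tower outputs.
    user_vec = add(user_gnn_vec, llm_vec)
    item_vec = add(item_gnn_vec, llm_vec)
    return dot(user_vec, item_vec)


score = two_tower_score([1.0, 0.0], [0.5, 0.5], [0.0, 1.0])
print(score)
```

Because the score is an unnormalized dot product, it is not bounded to the 1 to 5 rating range.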

0.2.2 Embedding Enhancements with LLMs

We employ pre-trained Sentence Transformers and GPT-2 for:
- Semantic Understanding: Sentence Transformers generate rich embeddings of user reviews and
item descriptions.
- Chain-of-Thought Reasoning: GPT-2 evaluates recommendations using contextual reasoning, which
improves semantic alignment between users and items.
The LLM embeddings are further transformed to a GNN-compatible space using a custom
TransformerAdapter.

0.2.3 Reinforcement Learning Optimization

Reinforcement learning fine-tunes the recommendation strategy. A Q-network learns to predict
optimal actions (recommendations) by maximizing long-term rewards based on user feedback.
Rewards are derived using:
1. Similarity Scores: From Sentence Transformers.
2. LLM-Generated Scores: Chain-of-thought reasoning provides detailed explanations and
suitability scores.
3. Metadata Signals: Additional contextual signals refine the reward mechanism.
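As a rough sketch of how the three signals could be combined, the snippet below averages a cosine similarity with a chain-of-thought score rescaled from the 1 to 5 range, then adds a small metadata bonus. The blending formula mirrors the RL training loop later in this notebook; the function names and toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def blended_reward(similarity, cot_score, metadata_signal=0.0):
    # Average the similarity with the CoT score rescaled to [0.2, 1.0],
    # then add the metadata bonus on top.
    return (similarity + cot_score / 5.0) / 2.0 + metadata_signal


similarity = cosine_similarity([1.0, 0.0], [1.0, 1.0])
print(blended_reward(similarity, cot_score=4.0, metadata_signal=0.05))
```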

0.2.4 Dataset and Graph Construction

Yelp data containing user reviews, ratings, and business details forms the core dataset. Users and
items are represented as nodes, while interactions (reviews and ratings) form edges in the graph.
Self-loops and bi-directional edges are added to enhance graph connectivity.
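A minimal pure-Python sketch of this construction follows, assuming interactions are given as (source, target) index pairs. It returns the two-row layout that PyTorch Geometric expects for edge_index, with forward edges first, then reversed edges, then self-loops; the helper name is hypothetical.

```python
def build_edge_index(interactions, num_nodes):
    """Return [sources, targets] with reverse edges and self-loops added."""
    # Keep only pairs whose endpoints are valid node indices.
    valid = [(u, v) for u, v in interactions if u < num_nodes and v < num_nodes]
    reverse = [(v, u) for u, v in valid]           # make every edge bi-directional
    loops = [(n, n) for n in range(num_nodes)]     # one self-loop per node
    edges = valid + reverse + loops
    sources = [s for s, _ in edges]
    targets = [t for _, t in edges]
    return [sources, targets]


print(build_edge_index([(0, 1)], num_nodes=2))
```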

0.2.5 Training Workflow

1. Two-Tower Training: The model minimizes prediction errors between generated scores and
actual user ratings using MSE loss.
2. RL Training: An ε-greedy policy balances exploration (discovering new preferences) and
exploitation (prioritizing learned preferences).
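The ε-greedy rule can be sketched in a few lines: with probability ε pick a random action, otherwise pick the action with the highest Q-value. The function name and the optional rng parameter are illustrative choices, not part of the notebook.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random index with probability epsilon, else the argmax index."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                        # explore
    return max(range(len(q_values)), key=lambda i: q_values[i])    # exploit


print(epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.1))
```

With ε = 0 the choice is always greedy; the notebook's training loop uses ε = 0.1.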

0.2.6 Query Mechanism

For a given user query, the system:
1. Encodes the query using the LLM Adapter.
2. Generates item embeddings via the item GNN.
3. Scores items using the Q-network and ranks them by predicted Q-values.
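The ranking step can be illustrated without any model: given one Q-value per item index and an index-to-ID mapping, return the IDs of the k highest-scoring items. The helper name and toy values below are hypothetical.

```python
def top_k_items(q_values, id_to_item, k=5):
    """Return the item IDs for the k largest Q-values, best first."""
    ranked = sorted(range(len(q_values)), key=lambda i: q_values[i], reverse=True)
    return [id_to_item[i] for i in ranked[:k]]


print(top_k_items([0.2, 0.9, 0.5], {0: "a", 1: "b", 2: "c"}, k=2))
```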

0.3 Step 1: Import libraries

The first step is to import all the necessary libraries. We use PyTorch for deep learning, Sentence
Transformers for generating embeddings, and Transformers for LLMs.
[ ]: !pip install transformers torch_geometric sentence_transformers

Requirement already satisfied: transformers in /usr/local/lib/python3.10/dist-packages (4.46.3)
Requirement already satisfied: torch_geometric in
/usr/local/lib/python3.10/dist-packages (2.6.1)
Requirement already satisfied: sentence_transformers in
/usr/local/lib/python3.10/dist-packages (3.2.1)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-
packages (from transformers) (3.16.1)
Requirement already satisfied: huggingface-hub<1.0,>=0.23.2 in
/usr/local/lib/python3.10/dist-packages (from transformers) (0.26.5)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-
packages (from transformers) (1.26.4)
Requirement already satisfied: packaging>=20.0 in
/usr/local/lib/python3.10/dist-packages (from transformers) (24.2)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (6.0.2)
Requirement already satisfied: regex!=2019.12.17 in
/usr/local/lib/python3.10/dist-packages (from transformers) (2024.9.11)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-
packages (from transformers) (2.32.3)
Requirement already satisfied: tokenizers<0.21,>=0.20 in
/usr/local/lib/python3.10/dist-packages (from transformers) (0.20.3)
Requirement already satisfied: safetensors>=0.4.1 in
/usr/local/lib/python3.10/dist-packages (from transformers) (0.4.5)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-
packages (from transformers) (4.66.6)
Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-
packages (from torch_geometric) (3.11.10)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages
(from torch_geometric) (2024.10.0)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages
(from torch_geometric) (3.1.4)
Requirement already satisfied: psutil>=5.8.0 in /usr/local/lib/python3.10/dist-
packages (from torch_geometric) (5.9.5)
Requirement already satisfied: pyparsing in /usr/local/lib/python3.10/dist-
packages (from torch_geometric) (3.2.0)
Requirement already satisfied: torch>=1.11.0 in /usr/local/lib/python3.10/dist-
packages (from sentence_transformers) (2.5.1+cu121)
Requirement already satisfied: scikit-learn in /usr/local/lib/python3.10/dist-
packages (from sentence_transformers) (1.5.2)
Requirement already satisfied: scipy in /usr/local/lib/python3.10/dist-packages
(from sentence_transformers) (1.13.1)
Requirement already satisfied: Pillow in /usr/local/lib/python3.10/dist-packages
(from sentence_transformers) (11.0.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in
/usr/local/lib/python3.10/dist-packages (from huggingface-
hub<1.0,>=0.23.2->transformers) (4.12.2)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-
packages (from torch>=1.11.0->sentence_transformers) (3.4.2)
Requirement already satisfied: sympy==1.13.1 in /usr/local/lib/python3.10/dist-
packages (from torch>=1.11.0->sentence_transformers) (1.13.1)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in
/usr/local/lib/python3.10/dist-packages (from
sympy==1.13.1->torch>=1.11.0->sentence_transformers) (1.3.0)
Requirement already satisfied: aiohappyeyeballs>=2.3.0 in
/usr/local/lib/python3.10/dist-packages (from aiohttp->torch_geometric) (2.4.4)
Requirement already satisfied: aiosignal>=1.1.2 in
/usr/local/lib/python3.10/dist-packages (from aiohttp->torch_geometric) (1.3.1)
Requirement already satisfied: async-timeout<6.0,>=4.0 in
/usr/local/lib/python3.10/dist-packages (from aiohttp->torch_geometric) (4.0.3)
Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-
packages (from aiohttp->torch_geometric) (24.2.0)
Requirement already satisfied: frozenlist>=1.1.1 in
/usr/local/lib/python3.10/dist-packages (from aiohttp->torch_geometric) (1.5.0)
Requirement already satisfied: multidict<7.0,>=4.5 in
/usr/local/lib/python3.10/dist-packages (from aiohttp->torch_geometric) (6.1.0)
Requirement already satisfied: propcache>=0.2.0 in
/usr/local/lib/python3.10/dist-packages (from aiohttp->torch_geometric) (0.2.1)
Requirement already satisfied: yarl<2.0,>=1.17.0 in
/usr/local/lib/python3.10/dist-packages (from aiohttp->torch_geometric) (1.18.3)
Requirement already satisfied: MarkupSafe>=2.0 in
/usr/local/lib/python3.10/dist-packages (from jinja2->torch_geometric) (3.0.2)
Requirement already satisfied: charset-normalizer<4,>=2 in
/usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.4.0)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-
packages (from requests->transformers) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in
/usr/local/lib/python3.10/dist-packages (from requests->transformers) (2.2.3)
Requirement already satisfied: certifi>=2017.4.17 in
/usr/local/lib/python3.10/dist-packages (from requests->transformers)
(2024.8.30)
Requirement already satisfied: joblib>=1.2.0 in /usr/local/lib/python3.10/dist-
packages (from scikit-learn->sentence_transformers) (1.4.2)
Requirement already satisfied: threadpoolctl>=3.1.0 in
/usr/local/lib/python3.10/dist-packages (from scikit-
learn->sentence_transformers) (3.5.0)

[ ]: import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
from torch_geometric.nn import GCNConv
from transformers import AutoTokenizer, AutoModel, AutoModelForCausalLM
from sklearn.model_selection import train_test_split
import random
from sentence_transformers import SentenceTransformer, util
from collections import defaultdict
import re

0.4 Step 2: Set device configuration

We configure the device to use GPU if available. This ensures faster computation for training and
inference.
[ ]: # Set device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

0.5 Step 3: Load pre-trained models

We load pre-trained models for:

1. Sentence Transformer embeddings for semantic representation.
2. BERT for tokenization and embeddings.
3. GPT-2 for chain-of-thought reasoning and reward estimation.
[ ]: # Load a sentence-transformer model for embeddings
model_name = "sentence-transformers/all-MiniLM-L6-v2"
st_model = SentenceTransformer(model_name)
st_model.to(device)

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_auth.py:94:
UserWarning:
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab
(https://huggingface.co/settings/tokens), set it as secret in your Google Colab
and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access
public models or datasets.
warnings.warn(

[ ]: SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with
Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token':
False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False,
'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens':
False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)

[ ]: # Define tokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
llm_model_uncased = AutoModel.from_pretrained("bert-base-uncased").eval().to(device)

[ ]: # Load a generative LLM for chain-of-thought reasoning (We will use GPT-2, but
# for better results you should use more advanced models)
llm_generator_tokenizer = AutoTokenizer.from_pretrained("gpt2")
llm_generator_model = AutoModelForCausalLM.from_pretrained("gpt2").eval().to(device)

[ ]: # We add pad token for GPT-2
if llm_generator_tokenizer.pad_token_id is None:
    llm_generator_tokenizer.pad_token = llm_generator_tokenizer.eos_token

0.6 Step 4: Adaptive embedding Transformer

The TransformerAdapter class maps embeddings from LLMs to a space compatible with GNN
embeddings. This alignment ensures smooth integration of language-based features with graph-
based features.
[ ]: class TransformerAdapter(nn.Module):
    def __init__(self, input_dim, embed_dim, num_heads=2, ff_dim=256, num_layers=1):
        super(TransformerAdapter, self).__init__()
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, dim_feedforward=ff_dim
        )
        self.transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.input_proj = nn.Linear(input_dim, embed_dim)

    def forward(self, x):
        x = self.input_proj(x).unsqueeze(1)  # [batch_size, 1, embed_dim]
        x = x.permute(1, 0, 2)               # [1, batch_size, embed_dim]
        x = self.transformer_encoder(x)      # [1, batch_size, embed_dim]
        x = x.permute(1, 0, 2).squeeze(1)    # [batch_size, embed_dim]
        return x

0.7 Step 5: Dataset preparation

The dataset is loaded from a CSV file of Yelp reviews. Each row holds a business ID, user ID,
review ID, date, star rating, review text, and vote counts (cool, useful, funny). We define the
YelpDataset class to process user and item data. The dataset aggregates user histories and
computes contextual embeddings using the Sentence Transformer model.
[ ]: data_path = 'https://raw.githubusercontent.com/MPAghababa/llms/main/two_tower/yelp.csv'
data = pd.read_csv(data_path)
data = data.dropna()
data.head(3)

[ ]:          business_id        date               review_id  stars  \
0  9yKzy9PApeiPPOUJEtnvkg  2011-01-26  fWKvX83p0-ka4JS3dc6E5A      5
1  ZRJwVLyzEJq1VAihDhYiow  2011-07-27  IjZ33sJrzXqU-0X6U8NwyA      5
2  6oRAC4uyJCsJl1X0WZpVSA  2012-06-14  IESLBzqUCLdSzSqm0eCSxQ      4

                                              text    type  \
0  My wife took me here on my birthday for breakf…  review
1  I have no idea why some people give bad review…  review
2  love the gyro plate. Rice is so good and I als…  review

                  user_id  cool  useful  funny
0  rLtl8ZkDX5vH5nAx9C3q5Q     2       5      0
1  0a2KyEL0d3Yb1V6aivbIuQ     0       0      0
2  0hT2KtfLiobPvh6cDC8JQg     0       1      0

[ ]: class YelpDataset(Dataset):
    def __init__(self, data, tokenizer, llm_model, max_length=128):
        self.data = data
        self.tokenizer = tokenizer
        self.llm_model = llm_model
        self.max_length = max_length

        # We build user profile embeddings by averaging all their reviews
        user_texts = defaultdict(list)
        for i, row in self.data.iterrows():
            user_texts[row['user_id']].append(row['text'])

        self.user_profile_embeddings = {}

        # We encode user histories using sentence-transformer (This will give
        # us contextualized user profiles)
        for uid, texts in user_texts.items():
            embeddings = st_model.encode(texts, convert_to_tensor=True, device=device)
            self.user_profile_embeddings[uid] = embeddings.mean(dim=0)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        row = self.data.iloc[idx]
        user_id = row['user_id']
        item_id = row['business_id']
        text = row['text']
        stars = row['stars']

        # Tokenized with the BERT tokenizer; not used further in this notebook
        tokens = self.tokenizer(
            text,
            max_length=self.max_length,
            padding='max_length',
            truncation=True,
            return_tensors="pt"
        )

        with torch.no_grad():
            embedding = st_model.encode([text], convert_to_tensor=True, device=device).squeeze(0)

        # Blend the review embedding with the user's averaged profile embedding
        user_profile_embedding = self.user_profile_embeddings[user_id]
        embedding = (embedding + user_profile_embedding) / 2.0

        return (
            torch.tensor(user_id, dtype=torch.long),
            torch.tensor(item_id, dtype=torch.long),
            embedding,
            torch.tensor(stars, dtype=torch.float),
        )
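The profile logic in this class reduces to two averaging steps, sketched here with plain lists in place of Sentence Transformer tensors. The helper names and the toy embeddings are hypothetical; only the arithmetic (mean over a user's reviews, then a 50/50 blend with the current review) follows the class above.

```python
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_user_profiles(rows):
    """rows: iterable of (user_id, review_embedding) pairs."""
    per_user = defaultdict(list)
    for user_id, embedding in rows:
        per_user[user_id].append(embedding)
    return {user_id: mean_vector(embs) for user_id, embs in per_user.items()}


def blend(review_embedding, profile_embedding):
    """Average a single review embedding with the user's profile embedding."""
    return [(r + p) / 2.0 for r, p in zip(review_embedding, profile_embedding)]


profiles = build_user_profiles([
    ("u1", [1.0, 0.0]),
    ("u1", [3.0, 2.0]),
    ("u2", [0.0, 4.0]),
])
print(blend([0.0, 1.0], profiles["u1"]))
```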

0.8 Step 6: GNN towers

The GNNTower class defines a simple Graph Convolutional Network to process user and item
embeddings. It generates graph-based features for the Two-Tower Model.
[ ]: class GNNTower(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(GNNTower, self).__init__()
        self.conv1 = GCNConv(input_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, output_dim)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        x = self.conv2(x, edge_index)
        return x
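For reference, the two-layer tower can be written in the standard graph-convolution notation that GCNConv follows. This is a sketch with bias terms omitted; the symbols $\Theta_1$ and $\Theta_2$ are just labels for the weights of conv1 and conv2, $X$ is the input node-feature matrix, and $A$ is the adjacency matrix implied by edge_index.

```latex
\begin{aligned}
\hat{A} &= A + I, \qquad \hat{D}_{ii} = \sum_j \hat{A}_{ij} \\
H^{(1)} &= \mathrm{ReLU}\!\left(\hat{D}^{-1/2}\,\hat{A}\,\hat{D}^{-1/2}\, X\, \Theta_1\right) \\
Z &= \hat{D}^{-1/2}\,\hat{A}\,\hat{D}^{-1/2}\, H^{(1)}\, \Theta_2
\end{aligned}
```

Each layer therefore mixes a node's features with a degree-normalized sum of its neighbors' features before applying the linear map.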

0.9 Step 7: Two-Tower model

The Two-Tower Model integrates the GNN embeddings for users and items with LLM-based
embeddings processed by the TransformerAdapter. The similarity scores between user and item
embeddings are used for recommendations.
[ ]: class TwoTowerModel(nn.Module):
    def __init__(self, user_gnn, item_gnn, num_users, num_items, embed_dim, llm_dim):
        super(TwoTowerModel, self).__init__()
        self.user_gnn = user_gnn
        self.item_gnn = item_gnn
        self.user_embed = nn.Embedding(num_users, embed_dim)
        self.item_embed = nn.Embedding(num_items, embed_dim)
        self.llm_adapter = TransformerAdapter(input_dim=llm_dim, embed_dim=embed_dim)

    def forward(self, user_ids, item_ids, llm_embeddings, user_edge_index, item_edge_index):
        user_ids = user_ids.to(device)
        item_ids = item_ids.to(device)
        llm_embeddings = llm_embeddings.to(device)
        user_edge_index = user_edge_index.to(device)
        item_edge_index = item_edge_index.to(device)

        # Run each GNN tower over the full embedding table
        user_features = self.user_gnn(self.user_embed.weight, user_edge_index)
        item_features = self.item_gnn(self.item_embed.weight, item_edge_index)
        llm_features = self.llm_adapter(llm_embeddings)

        # Add the adapted LLM features to both towers' outputs for this batch
        user_vectors = user_features[user_ids] + llm_features
        item_vectors = item_features[item_ids] + llm_features

        # Dot-product similarity per (user, item) pair
        scores = (user_vectors * item_vectors).sum(dim=1)
        return scores

0.10 Step 8: Reward estimation

The LLMRewardEstimator class is responsible for calculating rewards using semantic similarity and
chain-of-thought reasoning. This reward mechanism will be used in reinforcement learning.
[ ]: class LLMRewardEstimator:
    def __init__(self, tokenizer, llm_model):
        self.tokenizer = tokenizer
        self.llm_model = llm_model

    def estimate_reward(self, query, recommendation_text):
        with torch.no_grad():
            query_embedding = st_model.encode([query], convert_to_tensor=True, device=device)
            rec_embedding = st_model.encode([recommendation_text], convert_to_tensor=True, device=device)
            similarity = util.cos_sim(query_embedding, rec_embedding).item()
        return similarity

    def estimate_reward_cot(self, query, recommendation_text):
        prompt = (f"User query: {query}\n"
                  f"Recommended text: {recommendation_text}\n"
                  f"Explain why this recommendation is suitable for the user query."
                  f"Provide reasoning focusing on details like ambiance, food, and service."
                  f"End with a score between 1 and 5 based on suitability:\nReasoning:")
        input_ids = llm_generator_tokenizer.encode(prompt, return_tensors='pt').to(device)

        with torch.no_grad():
            outputs = llm_generator_model.generate(
                input_ids=input_ids,
                max_new_tokens=150,
                # temperature and top_p only take effect with do_sample=True;
                # as written, generation falls back to greedy decoding.
                temperature=0.7,
                top_p=0.9
            )

        # The decoded response includes the prompt, so the regex below can also
        # match the "1" and "5" that appear in the prompt's own instructions.
        response = llm_generator_tokenizer.decode(outputs[0], skip_special_tokens=True)
        scores = re.findall(r'\b[1-5]\b', response)
        final_score = float(scores[-1]) if scores else 3.0
        return final_score, response
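One detail of the score extraction is worth illustrating. Because the decoded response contains the prompt, and the prompt itself says "between 1 and 5", the regex always finds a match, so the 3.0 fallback is never reached. The sketch below, with a hypothetical helper and made-up strings, shows the difference between parsing the full text and parsing only the generated continuation.

```python
import re


def parse_score(text, default=3.0):
    """Return the last standalone digit 1-5 in text, or default if none."""
    matches = re.findall(r"\b[1-5]\b", text)
    return float(matches[-1]) if matches else default


prompt = "End with a score between 1 and 5 based on suitability:\nReasoning:"
continuation = " The food matches the query. Score: 4"
full_text = prompt + continuation

print(parse_score(full_text[len(prompt):]))  # parses only the continuation
print(parse_score(prompt))                   # matches the "5" inside the prompt
print(parse_score("no rating given"))        # falls back to the default
```

Slicing off the prompt before matching is one possible way to keep the fallback meaningful; the notebook itself parses the full response.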

0.11 Step 9: Reinforcement learning training

Reinforcement Learning is applied to fine-tune the model for better recommendations. A Q-Network
learns to optimize actions (recommendations) based on rewards. RL ensures the system balances
recommending safe options with discovering new user preferences.
[ ]: class QNetwork(nn.Module):
    def __init__(self, state_dim, action_dim, hidden_dim=128):
        super(QNetwork, self).__init__()
        self.fc1 = nn.Linear(state_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.fc3 = nn.Linear(hidden_dim, action_dim)

    def forward(self, state):
        state = state.to(device)
        x = torch.relu(self.fc1(state))
        x = torch.relu(self.fc2(x))
        q_values = self.fc3(x)
        return q_values

[ ]: # RL Training
def train_rl_llm(model, q_network, optimizer, replay_buffer, batch_size, gamma=0.99):
    # Wait until the buffer holds at least one full batch
    if len(replay_buffer) < batch_size:
        return

    batch = random.sample(replay_buffer, batch_size)
    states, actions, rewards, next_states = zip(*batch)

    states = torch.stack(states).to(device)
    actions = torch.tensor(actions, dtype=torch.long).to(device)
    rewards = torch.tensor(rewards, dtype=torch.float).to(device)
    next_states = torch.stack(next_states).to(device)

    # Q-values of the actions that were actually taken
    q_values = q_network(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    # Bootstrapped target: r + gamma * max_a' Q(s', a')
    with torch.no_grad():
        next_q_values = q_network(next_states).max(1)[0]
        target_q_values = rewards + gamma * next_q_values

    loss = nn.MSELoss()(q_values, target_q_values)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
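The update in this function is the usual one-step Q-learning target. A scalar sketch with hypothetical numbers makes the arithmetic explicit: the target is the reward plus the discounted best next-state Q-value, and the loss is the squared gap between the current Q-value and that target.

```python
def td_target(reward, next_q_values, gamma=0.99):
    """One-step Q-learning target: r + gamma * max over next-state Q-values."""
    return reward + gamma * max(next_q_values)


def squared_td_error(q_value, target):
    """Squared difference between the current estimate and the target."""
    return (q_value - target) ** 2


target = td_target(reward=1.0, next_q_values=[0.0, 2.0], gamma=0.5)
print(target, squared_td_error(1.5, target))
```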

0.12 Step 10: Querying the Two-Tower model

We query the Two-Tower Model to generate recommendations based on a user query. The LLM
adapter encodes the query, and the item GNN computes item features from the item embeddings.
To incorporate RL into the recommendation process, we rank items by the Q-values that the
trained Q-network predicts for the encoded query, so the learned reward signal guides which
items are recommended. In the code below, the final ranking uses only these Q-values.
[ ]: def query_two_tower_model_rl(model, q_network, query, tokenizer, llm_model,
                             item_edge_index, item_embeddings, id_to_item, k=5):
    model.eval()
    with torch.no_grad():
        # Encode the query and project it into the GNN embedding space
        query_embedding = st_model.encode([query], convert_to_tensor=True, device=device).squeeze(0)
        query_embedding = model.llm_adapter(query_embedding.unsqueeze(0)).squeeze(0)

        item_edge_index = item_edge_index.to(device)
        item_embeddings = item_embeddings.to(device)
        item_features = model.item_gnn(item_embeddings, item_edge_index)

        # Rank items by the Q-values predicted for the encoded query
        q_values = q_network(query_embedding.unsqueeze(0)).squeeze(0)
        top_k_indices = torch.topk(q_values, k=k).indices.cpu().numpy()
        top_k_items = [id_to_item[idx] for idx in top_k_indices]

    return top_k_items

0.13 Step 11: Main execution for training and testing

[ ]: data = data[:100]  # We use a subset of data for quick experimentation

# We map user and item IDs to indices
user_ids = data['user_id'].unique()
item_ids = data['business_id'].unique()
user_map = {uid: idx for idx, uid in enumerate(user_ids)}
item_map = {iid: idx for idx, iid in enumerate(item_ids)}
id_to_item = {idx: iid for iid, idx in item_map.items()}

# We filter and map IDs in the dataset
data = data[data['user_id'].isin(user_map.keys()) & data['business_id'].isin(item_map.keys())]
data['user_id'] = data['user_id'].map(user_map)
data['business_id'] = data['business_id'].map(item_map)
data = data.dropna(subset=['user_id', 'business_id']).reset_index(drop=True)

num_users = len(user_map)
num_items = len(item_map)
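The ID mapping above assigns each raw Yelp ID a contiguous index in order of first appearance and keeps a reverse map for decoding recommendations. A plain-Python sketch of the same idea, using a hypothetical helper and toy IDs:

```python
def build_index_maps(raw_ids):
    """Map each distinct ID to a contiguous index, in order of first appearance."""
    forward = {}
    for raw in raw_ids:
        if raw not in forward:
            forward[raw] = len(forward)
    reverse = {index: raw for raw, index in forward.items()}
    return forward, reverse


item_map_demo, id_to_item_demo = build_index_maps(["b", "a", "b"])
print(item_map_demo, id_to_item_demo)
```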

[ ]: # We create edge indices for graph representation
def create_edge_index(data, num_nodes):
    edges = torch.tensor(
        [[row['user_id'], row['business_id']] for _, row in data.iterrows()],
        dtype=torch.long
    )
    # Keep only edges whose endpoints are valid node indices
    valid_edges = edges[(edges[:, 0] < num_nodes) & (edges[:, 1] < num_nodes)]
    # Add the reversed edges, then transpose to shape [2, num_edges]
    edge_index = torch.cat([valid_edges, valid_edges.flip(1)], dim=0).t()
    # Add one self-loop per node
    self_loops = torch.arange(num_nodes, dtype=torch.long).unsqueeze(0).repeat(2, 1)
    edge_index = torch.cat([edge_index, self_loops], dim=1)
    return edge_index

user_edge_index = create_edge_index(data[['user_id', 'business_id']], num_users)
item_edge_index = create_edge_index(data[['business_id', 'user_id']], num_items)

user_edge_index = user_edge_index.to(device)
item_edge_index = item_edge_index.to(device)

[ ]: # Initialize reward estimator and model components
reward_estimator = LLMRewardEstimator(tokenizer, llm_model_uncased)

embed_dim = 128
llm_dim = 384
user_gnn = GNNTower(embed_dim, 64, embed_dim).to(device)
item_gnn = GNNTower(embed_dim, 64, embed_dim).to(device)
model = TwoTowerModel(user_gnn, item_gnn, num_users, num_items, embed_dim, llm_dim).to(device)

optimizer = optim.Adam(model.parameters(), lr=0.001)

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py:379:
UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False
because encoder_layer.self_attn.batch_first was not True(use batch_first for
better inference performance)
warnings.warn(

[ ]: # Split data and create datasets

train_data, test_data = train_test_split(data, test_size=0.2, random_state=27)
train_dataset = YelpDataset(train_data, tokenizer, llm_model_uncased)
test_dataset = YelpDataset(test_data, tokenizer, llm_model_uncased)

[ ]: # We define data loader and collation function
def collate_fn(batch):
    user_ids_batch = []
    item_ids_batch = []
    llm_embeddings = []
    stars_batch = []
    for (u, i, e, s) in batch:
        user_ids_batch.append(u)
        item_ids_batch.append(i)
        llm_embeddings.append(e)
        stars_batch.append(s)
    user_ids_batch = torch.stack(user_ids_batch)
    item_ids_batch = torch.stack(item_ids_batch)
    llm_embeddings = torch.stack(llm_embeddings)
    stars_batch = torch.stack(stars_batch)
    return user_ids_batch, item_ids_batch, llm_embeddings, stars_batch

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True, collate_fn=collate_fn)
test_loader = DataLoader(test_dataset, batch_size=32, collate_fn=collate_fn)

[ ]: #Train the Two-Tower model
for epoch in range(2):  # You should use more epochs
    model.train()
    total_loss = 0
    for user_ids_batch, item_ids_batch, llm_embeddings, stars in train_loader:
        user_ids_batch = user_ids_batch.to(device)
        item_ids_batch = item_ids_batch.to(device)
        llm_embeddings = llm_embeddings.to(device)
        stars = stars.to(device)
        optimizer.zero_grad()
        predictions = model(user_ids_batch, item_ids_batch, llm_embeddings,
                            user_edge_index, item_edge_index)
        loss = nn.MSELoss()(predictions, stars.float())
        loss.backward()
        optimizer.step()
        total_loss += loss.item()

    print(f"Epoch {epoch+1}, Loss: {total_loss:.4f}")

Epoch 1, Loss: 36187.9502

Epoch 2, Loss: 16775.4653
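The printed value is the sum of the per-batch MSE losses over an epoch, so its scale depends on the number of batches. A small sketch with hypothetical helpers and toy numbers shows the per-batch MSE and how dividing by the batch count would give a mean that is easier to compare across runs:

```python
def mse(predictions, targets):
    """Mean squared error over one batch."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


def mean_epoch_loss(batch_losses):
    """Average of the per-batch losses collected during one epoch."""
    return sum(batch_losses) / len(batch_losses)


batch_losses = [mse([3.0, 5.0], [4.0, 3.0]), mse([1.0, 1.0], [1.0, 2.0])]
print(sum(batch_losses), mean_epoch_loss(batch_losses))
```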

[ ]: #Initialize Q-Learning components

q_network = QNetwork(state_dim=embed_dim, action_dim=num_items).to(device)
q_optimizer = optim.Adam(q_network.parameters(), lr=0.001)
replay_buffer = []

[ ]: #Train RL model with LLM-based rewards
query_pool = [
    "I love cozy cafes with great coffee",
    "Looking for a family-friendly restaurant with vegan options",
    "Find me a cozy diner with a romantic ambiance and live music",
    "I want a budget-friendly Italian restaurant that serves gluten-free pasta",
    "Recommend a sushi place nearby with great reviews",
]  # You should add more queries

for episode in range(2):  # You should increase the number of episodes for better training
    query = random.choice(query_pool)
    item_texts = data['text'].tolist()

    query_embedding = st_model.encode([query], convert_to_tensor=True, device=device).squeeze(0)
    query_embedding = model.llm_adapter(query_embedding.unsqueeze(0)).squeeze(0)

    q_values = q_network(query_embedding.unsqueeze(0))
    if random.random() < 0.1:  # ε-greedy exploration
        action = random.randint(0, len(item_texts) - 1)
    else:
        action = torch.argmax(q_values).item()

    # Clamp the action to a valid index into item_texts
    if action >= len(item_texts):
        action = len(item_texts) - 1

    reward_base = reward_estimator.estimate_reward(query, item_texts[action])
    cot_score, cot_reasoning = reward_estimator.estimate_reward_cot(query, item_texts[action])

    # Blend similarity with the rescaled CoT score, then add a random metadata signal
    reward = (reward_base + (cot_score / 5.0)) / 2.0
    metadata_signal = random.uniform(0.0, 0.1)
    reward += metadata_signal

    next_state = query_embedding.detach()
    replay_buffer.append((query_embedding.detach(), action, reward, next_state))

    train_rl_llm(model, q_network, q_optimizer, replay_buffer, batch_size=4)

    print(f"Episode {episode+1}: Query: {query}, Action: {action}, Reward: {reward:.4f}, CoT Reasoning: {cot_reasoning}")

/usr/local/lib/python3.10/dist-
packages/transformers/generation/configuration_utils.py:590: UserWarning:
`do_sample` is set to `False`. However, `temperature` is set to `0.7` -- this
flag is only used in sample-based generation modes. You should set
`do_sample=True` or unset `temperature`.
warnings.warn(
/usr/local/lib/python3.10/dist-
packages/transformers/generation/configuration_utils.py:595: UserWarning:
`do_sample` is set to `False`. However, `top_p` is set to `0.9` -- this flag is
only used in sample-based generation modes. You should set `do_sample=True` or
unset `top_p`.
warnings.warn(
The attention mask and the pad token id were not set. As a consequence, you may
observe unexpected behavior. Please pass your input's `attention_mask` to obtain
reliable results.
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
The attention mask is not set and cannot be inferred from input because pad
token is same as eos token. As a consequence, you may observe unexpected
behavior. Please pass your input's `attention_mask` to obtain reliable results.
Episode 1: Query: I love cozy cafes with great coffee, Action: 94, Reward:
0.6985, CoT Reasoning: User query: I love cozy cafes with great coffee
Recommended text: I grew up on Empanadas in Panama and I have been hard pressed
to find anything close to them in the U.S.. today I found them!

A perfectly crunchy crust and the beef was beautifully spiced. Usually Empanadas
are bland and soggy. They did a great job on these.

I usually don't like rice, but the rice and black beans were wonderful.

Service was great! I'll be back!

Explain why this recommendation is suitable for the user query.Provide reasoning
focusing on details like ambiance, food, and service.End with a score between 1
and 5 based on suitability:
Reasoning: 1. The service was good. 2. The food was good. 3. The ambiance was
good. 4. The service was good. 5. The ambiance was good. 6. The service was
good. 7. The ambiance was good. 8. The service was good. 9. The service was
good. 10. The service was good. 11. The service was good. 12. The service was
good. 13. The service was good. 14. The service was good. 15. The service was
good. 16. The service was good. 17. The service was good. 18. The service was
good. 19. The service was good. 20. The service was good. 21. The service was
good.
The attention mask and the pad token id were not set. As a consequence, you may
observe unexpected behavior. Please pass your input's `attention_mask` to obtain
reliable results.
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
Episode 2: Query: I love cozy cafes with great coffee, Action: 80, Reward:
0.6014, CoT Reasoning: User query: I love cozy cafes with great coffee
Recommended text: The vibe exuding from this place is pure awesomeness.
Reminiscent of a trendy hipster coffee joint, this is actually a casual vegan
restaurant.

I am a pescatarian and unless I am eating seafood, I steer clear of meat, even
the mock kind usually however once in a while it's delicious… so why try it
here at Green I thought. The menu's style is comfort food, which as we all know
traditionally is heavy on the meat, sauces, and fat content… so at least if
you're going to be bad, you can do it with organic and pure ingredients.

So as I wanted to sample as much of the menu as possible, my lovely friend, her
fiance, and I shared a few items:

*Artichoke Gratine: The corn chips were amazing, lightly salted and crisp. The
dip was a bit too garlicky and runny for my liking. Ate a few bites of this but
could not see myself eating the entire thing solo.

*Spicy Buffalo "Wings": first things first… do not let looks dismay you…
true it looks gross but they taste legit! The flavor of the buffalo sauce was
perfect, although could have been spicier. And the cucumber ranch dipping sauce
was perfectly creamy and lightly flavored as to not overpower the "wings". This
dish is a must try!!

*Vegan Chili Fries: the fries are thin cut and tasty. The chili sauce was good,
at first, but I quickly got sick of the flavor. This could be because I was
never a huge chili fan even back when I ate meat. Hmm, I think you are better
off ordering the thyme fries.

*Crab Puffs: Perfectly crisp with a delicious creamy filling. Another must try!

Lastly my friend's fiance ordered that day's special which was a green chili
burrito… delicious and huge. A bit too much rice but besides that a great
option. It came with a side, which he ordered the curry pasta salad, mmm.

Green serves bowls, sandwiches, pizzas, salads… next time I am back in the
area I will be checking out more of the menu when craving "meat" and not my
usual tofu, seafood, veggie diet.

Might I add the service is friendly. Perfect place for a casual friend date.
Explain why this recommendation is suitable for the user query.Provide reasoning
focusing on details like ambiance, food, and service.End with a score between 1
and 5 based on suitability:
Reasoning: 1) The service is good, the food is good, and the service is good. 2)
The service is good, the food is good, and the service is good. 3) The service
is good, the food is good, and the service is good. 4) The service is good, the
food is good, and the service is good. 5) The service is good, the food is good,
and the service is good. 6) The service is good, the food is good, and the
service is good. 7) The service is good, the food is good, and the service is
good. 8) The service is good, the food is good, and the service is good. 9) The
service is good
#Step 12: Query Two-Tower model for recommendations
[ ]: # Example 1
user_query_1 = "I love quiet coffee shops with excellent Wi-Fi and great desserts."

top_recommendations = query_two_tower_model_rl(
    model=model,
    q_network=q_network,
    query=user_query_1,
    tokenizer=tokenizer,
    llm_model=llm_model_uncased,
    item_edge_index=item_edge_index,
    item_embeddings=model.item_embed.weight,
    id_to_item=id_to_item,
    k=5
)
print("Top Recommendations for Query 1 are:", top_recommendations)

Top Recommendations for Query 1 are: ['QGeliKMObpVZ3jP89--ZIg',
'_1QQZuf4zZOyFCvXc0o6Vg', '8ZwO9VuLDWJOXmtAdc7LXQ', '7SO_rX1F6rQEl-5s3wZxgQ',
'znBnrQNq1FdUt5aIGAbyuQ']

[ ]: # Example 2
user_query_2 = "Looking for a family-friendly restaurant with vegan options."

top_recommendations_2 = query_two_tower_model_rl(
    model=model,
    q_network=q_network,
    query=user_query_2,
    tokenizer=tokenizer,
    llm_model=llm_model_uncased,
    item_edge_index=item_edge_index,
    item_embeddings=model.item_embed.weight,
    id_to_item=id_to_item,
    k=5
)
print("Top Recommendations for Query 2 are:", top_recommendations_2)

Top Recommendations for Query 2 are: ['QGeliKMObpVZ3jP89--ZIg',
'8ZwO9VuLDWJOXmtAdc7LXQ', '_1QQZuf4zZOyFCvXc0o6Vg', '7SO_rX1F6rQEl-5s3wZxgQ',
'puy0PzIcCgR3KWJI7llBFQ']
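
The two queries ask for quite different things, yet the printed lists look very similar. A minimal sketch for quantifying that similarity is given here, using only the business IDs printed in the two outputs. The helper name `jaccard` and the variable names are illustrative and not part of the notebook.

```python
# Business IDs copied from the two printed outputs.
recs_query_1 = [
    "QGeliKMObpVZ3jP89--ZIg", "_1QQZuf4zZOyFCvXc0o6Vg",
    "8ZwO9VuLDWJOXmtAdc7LXQ", "7SO_rX1F6rQEl-5s3wZxgQ",
    "znBnrQNq1FdUt5aIGAbyuQ",
]
recs_query_2 = [
    "QGeliKMObpVZ3jP89--ZIg", "8ZwO9VuLDWJOXmtAdc7LXQ",
    "_1QQZuf4zZOyFCvXc0o6Vg", "7SO_rX1F6rQEl-5s3wZxgQ",
    "puy0PzIcCgR3KWJI7llBFQ",
]


def jaccard(a, b):
    """Jaccard similarity of two ID lists (hypothetical helper): |A & B| / |A | B|."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


shared = set(recs_query_1) & set(recs_query_2)
print("Shared items:", len(shared))  # 4 of the 5 IDs appear in both lists
print("Jaccard similarity:", round(jaccard(recs_query_1, recs_query_2), 3))  # 0.667
```

An overlap this high for two unrelated queries may mean the ranking depends more on item-side signals than on the query text. That is a hypothesis to check against more queries, not a conclusion the notebook draws.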

0.14 Remarks and suggestions for next steps

1. Scalability: Expand to the full datasets and incorporate larger LLMs such as GPT-4 or
specialized fine-tuned models.
2. Fine-Grained Evaluation: Evaluate the model with ranking metrics such as
Precision@k, Recall@k, and NDCG@k. Add precision-recall curves and a deeper
analysis of explainability metrics.
3. API Integration: Develop a FastAPI-based interface for real-time querying and evaluation.
4. Explainability Enhancements: Integrate visual tools to display reasoning paths and graph
relationships.
5. Enhanced Graph Learning: Incorporate advanced GNN architectures such as GraphSAGE
or Graph Attention Networks.
6. User-Centric Feedback Loop: Add mechanisms for users to rate recommendations, so that
their ratings can improve the RL reward signals.
7. Model Deployment: Use edge AI or containerization for scalable deployment.
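
The ranking metrics named in item 2 can be sketched in a few lines of plain Python. This is a minimal illustration that assumes binary relevance (an item is either relevant to the user or not). The function names and the toy item IDs are made up for the example and do not come from the notebook.

```python
import math


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so position 1 -> log2(2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Toy data (hypothetical IDs): a ranked list and the set of held-out relevant items.
recommended = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "z"}

print(precision_at_k(recommended, relevant, 5))  # 2 hits out of 5 -> 0.4
print(recall_at_k(recommended, relevant, 5))     # 2 of 3 relevant items found
print(ndcg_at_k(recommended, relevant, 5))
```

In practice `relevant` would come from a held-out split of each user's reviewed businesses, and the three values would be averaged over all evaluated users.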

0.14.1 Let’s connect and let me know if you have any comments.

https://www.linkedin.com/in/mpaghababa/

[ ]:
