
Developing a movie recommendation engine using large language models

Course: CMT-400 Dissertation project
Author: Joshua Oguntade (C2082222)
Degree: MSc, Advanced Computer Science
Supervisor: Prof. Irena Spasic
Date: 19th September 2023
School: School of Computer Science and Informatics, Cardiff University
Abstract
Recommendation systems have become part of our daily lives: we encounter one when we shop, listen to
music, watch videos, and even when we are choosing whom to date.

Despite the ubiquity and success of recommendation systems, they still face major challenges such as
the explainability problem.

This research proposes an implementation of a recommendation system that leverages the significant
capabilities of large language models to solve some of the problems faced by modern recommender
systems, converting user preferences into prompts for generating recommendation choices with relevant
explanations.

To achieve this, we fine-tuned a 13-billion-parameter language model to take user preferences in
natural language as input, with instructions to provide the recommendation choices that best match
those preferences and to provide explanations during inference. Our experimental results demonstrate
that our solution performs well in addressing the cold start and explainability problems encountered
in modern recommendation systems.

Acknowledgements
I would like to acknowledge the efforts and guidance of my supervisor, Professor Irena Spasic, whose
insights and feedback helped shape this project into a comprehensive and impactful piece of work. I would
also like to thank my family, whose support made all of this possible.
Table of Contents

Abstract
Acknowledgements
Table of Contents
1. Introduction
   1.1 Introduction
   1.2 Problem
      1.2.1 Cold start problem
      1.2.2 Sparsity problem
      1.2.3 Explainability problem
   1.3 Research objectives
   1.4 Scope and Limitations
   1.5 Organization
   1.6 Conclusion
2. Literature Review
   2.1 Introduction
   2.2 Recommender Systems
   2.3 Large language models
3. Architecture
   3.1 Introduction
   3.2 Architectural goals and objectives
      3.2.1 Goals
      3.2.2 Architectural objectives
   3.3 Architectural styles and patterns
   3.4 System components and modules
      3.4.1 The web service component
      3.4.2 The AI service component
      3.4.3 The LLM model component
   3.5 Data flow and communication
   3.6 Conclusion
4. Implementation
   4.1 Introduction
   4.2 Software requirements
   4.3 Software Process Model
   4.4 System Overview
   4.5 High-Level Design
      4.5.1 User Interaction Flow
      4.5.2 Language model Integration
      4.5.3 Postgres Database Integration
   4.6 Implementation
      4.6.1 Preferences page
      4.6.2 Recommendation results page
   4.7 Conclusion
5. Result and Evaluation
   5.1 Introduction
   5.2 Experimental setup
      5.2.1 Dataset
      5.2.2 Metrics
      5.2.3 Result
6. Reflection
   6.1 Introduction
   6.2 Motivations
   6.3 Challenges
   6.4 Lessons Learned
7. Conclusion and future work
   7.1 Introduction
   7.2 Contributions
   7.3 Future work
References
1. Introduction
1.1 Introduction
Since the early 1990s, researchers have been looking for ways to harness the opinions of people online in
an effort to distinguish useful information from noise (Jannach et al, 2010).

This has become particularly relevant given the amount of data generated in this century. According to IBM,
we currently generate 2.5 quintillion bytes of data daily (IBM, 2020).
Every minute, Spotify adds 13 new songs, 4.2 million videos are watched on YouTube, 46,700 photos are
posted on Instagram, and 3.6 million searches are run on Google (Schrage, 2020), and these numbers are
only going to keep rising.

This is why companies are increasingly adopting recommendation systems, to great success, as the first
line of defense against consumer over-choice: 80% of what people watch on Netflix comes from its
recommendation engine (Gomez-Uribe et al, 2016), Alibaba tripled its gross merchandise volume (GMV)
within a year thanks to its machine-learning-enhanced recommenders, and TikTok is the fastest-growing
social media platform because of its recommendation engine (Schrage, 2020).

Modern recommender systems have been leveraging the advances made in deep learning to improve their
performance by overcoming problems faced in conventional models (Zhang et al, 2019).
YouTube leverages a deep neural network-based recommendation algorithm for video recommendation
(Covington et al, 2016), Yahoo! News uses an RNN-based recommender system (Shumpei et al, 2017)
and Facebook uses a state-of-the-art deep learning recommendation model (DLRM) to tackle
personalization and recommendation tasks (Naumov et al, 2019).

1.2 Problem
Despite the success of various recommendation methods such as collaborative filtering (CF) (Herlocker et al,
1999), content-based methods (Balabanović et al, 1997), and deep learning-based methods (Covington et
al, 2016), modern recommender engines still face several challenges: the cold start problem
(Lika et al, 2014), the sparsity problem (Wang et al, 2018), and the explainability problem (Chen et al, 2022).

1.2.1 Cold start problem


The "Cold Start" problem is a common challenge in recommendation systems that arises when the system
encounters new users or items for which it lacks sufficient historical data. In other words, the system has
limited or no information about the preferences, behaviors, or attributes of these users or items. This
situation hampers the system's ability to generate accurate and relevant recommendations.

1.2.2 Sparsity problem


In most real-world scenarios, users interact with only a fraction of the available items, resulting in a sparse
user-item matrix in which most entries are missing. This sparsity poses a hurdle for recommender engines
that rely on historical data to make accurate predictions: without sufficient information to establish
meaningful patterns and relationships between users and items, the system struggles to identify hidden
preferences or similarities, which can lead to suboptimal or inaccurate recommendations.
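To make the sparsity problem concrete, the sketch below measures the fraction of missing entries in a toy user-item rating matrix. The values are illustrative only, not drawn from the dissertation's dataset:

```python
import numpy as np

# Hypothetical user-item rating matrix: rows are users, columns are movies.
# 0 marks a missing rating (no interaction), not a zero-star rating.
ratings = np.array([
    [5, 0, 0, 0, 3, 0],
    [0, 4, 0, 0, 0, 0],
    [0, 0, 0, 2, 0, 0],
    [1, 0, 0, 0, 0, 5],
])

observed = np.count_nonzero(ratings)          # interactions we actually have
sparsity = 1.0 - observed / ratings.size      # fraction of missing entries
print(f"Observed interactions: {observed} of {ratings.size}")
print(f"Sparsity: {sparsity:.1%}")            # here: 75.0% of entries missing
```

Real catalogues are far more extreme: with millions of items and a few hundred interactions per user, sparsity routinely exceeds 99%.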
1.2.3 Explainability problem
As recommendation systems become more sophisticated, users increasingly demand transparency and
insights into why certain recommendations are made. The explainability problem arises when users
struggle to understand the rationale behind the system's recommendations, leading to a lack of trust,
decreased user satisfaction, and potential privacy concerns.

Serendipitous recommendations are not sufficient; people need to be able to see and grasp the underlying
"why." "Recommendation explainability" has consequently become one of the most active areas of
recommender systems research (Schrage, 2020).

1.3 Research objectives


The primary research goal of this project is to design, develop, and evaluate a recommendation engine that
harnesses the power of a large language model to address the cold start and explainability problems of
traditional recommendation systems. This engine will aim to provide accurate, context-aware, and
explainable recommendations that enhance user satisfaction and engagement.

It will do this by providing answers to the following research questions:

• Can a large language model-based recommendation engine solve the cold start problem by
providing relevant recommendations without knowing the user’s movie history?
• Can a large language model-based recommendation engine solve the sparsity problem by providing
relevant recommendations without knowing the relationships between users and movies, such as
ratings and reviews?
• Can a large language model-based recommendation engine solve the explainability problem by
generating a relevant explanation for every recommendation choice it makes?

The objectives of the project are:

1. Architectural Design: Formulate an architectural design that seamlessly integrates a large
language model into the recommendation process. This design should account for real-time
responsiveness, scalability, and effective management of challenges like cold start and
explainability.
2. Language Model Integration: Seamlessly integrate a pre-trained large language model into the
recommendation engine. This integration should empower the engine to comprehend semantic
nuances within user interactions, item descriptions, and other types of data.
3. Recommendation Generation: Develop recommendation algorithms that capitalize on the insights
provided by the integrated language model. These algorithms should generate personalized content
suggestions, effectively addressing issues such as the cold start problem and explainability.
4. Explainability Enhancement: Enhance the explainability of recommendations through the
language model's capabilities by showing why a recommendation is made. This involves generating
clear and human-readable explanations for recommendations, fostering user understanding and
trust.
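The preference-to-prompt idea that underlies these objectives can be sketched as follows. The function name, preference fields, and prompt wording here are hypothetical, chosen for illustration rather than taken from the dissertation's actual implementation:

```python
# Hypothetical helper: converts structured user preferences into a
# natural-language prompt for a large language model. Names and wording
# are illustrative, not the dissertation's actual code.
def build_recommendation_prompt(preferences: dict, n: int = 3) -> str:
    likes = ", ".join(preferences.get("genres", []))
    mood = preferences.get("mood", "any")
    return (
        f"You are a movie recommender. The user enjoys {likes} films "
        f"and is in the mood for something {mood}. "
        f"Recommend {n} movies and, for each, explain in one sentence "
        f"why it matches these preferences."
    )

prompt = build_recommendation_prompt(
    {"genres": ["sci-fi", "thriller"], "mood": "suspenseful"}
)
print(prompt)
```

Because the prompt carries the user's stated preferences directly, no interaction history is required, which is precisely what makes this formulation attractive for the cold start problem.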

1.4 Scope and Limitations


This dissertation focuses on the development of a recommendation engine that harnesses the large
language model’s inference abilities (Pope et al, 2023) to enhance recommendation accuracy and
personalization by using its understanding of language and context to generate a recommendation that is
relevant and appropriate. The scope encompasses the entire software development lifecycle, from
architectural design to implementation and evaluation. While the integration of a large language model and
how it works is pivotal, this project does not concern itself with the development and training of a large
language model.

Some of the limitations of this project include:


1. Model Selection: The project will focus on a large language model available at the time of
research. The choice of the model might impact the generalizability of the results to other models.
2. Data Availability: The effectiveness of the recommendation engine heavily depends on the
availability and quality of data. The limited size of the dataset used may impact the accuracy and
performance of the engine.
3. Domain Specificity: The research focuses on generating recommendations using a movie dataset.
The recommendations and insights generated by the engine may not be directly applicable to other
domains or other types of data.
4. Privacy Considerations: The project does not delve deeply into user privacy concerns associated
with utilizing user-generated data for recommendation generation. Ensuring ethical data usage and
privacy protection is beyond the scope of this project.

1.5 Organization
The dissertation is structured into seven chapters, each focusing on a distinct aspect of building a
recommendation engine based on a large language model. Chapter 1 introduces the problem, research
questions, and objectives of the study. Chapter 2 delves into the literature review, presenting an overview
of related work on recommendation systems and large language models. Chapter 3 elaborates on the
architectural design and components of the proposed recommendation engine. Chapter 4 discusses the
implementation details. Chapter 5 presents the results and evaluation of the research. Chapter 6
reflects on the challenges and lessons learned while working on this project. Lastly, Chapter 7 concludes
the dissertation by summarizing the findings, discussing contributions, and highlighting potential avenues
for future research.

1.6 Conclusion
This dissertation seeks to leverage the reasoning capabilities of large language models to generate
effective content recommendations by providing acceptable explanations for recommendation choices. By
devising an architecture that seamlessly integrates these two domains, we aim to contribute to the field of
recommendation systems and software development alike. Through this journey, we endeavor to create a
recommendation engine that is robust in design and has a transformative impact on the field of
recommendation engine design.
2. Literature Review
2.1 Introduction
Recommender systems have become a mainstay application area of breakthrough research in machine
learning and natural language processing in areas such as e-commerce, internet music and video, news,
and other fields that generate enormous amounts of data (Amatriain & Basilico, 2016).

The breakthrough in the use of transformer-based architectures in building large language models has led
to impressive performance in the application of large language models in generative tasks in fields like
education (Milano et al, 2023), finance (Wu et al, 2023), and healthcare (Sallam, 2023).

This chapter summarizes the work and progress made in recommendation systems and the application of
deep learning technologies in building recommender systems.

This chapter also briefly explores large language models, the categories of large language models, how
large language models work and highlights earlier research done in the application of large language
models in the development of recommendation systems.

2.2 Recommender Systems


Recommender Systems (RSs) are software tools and techniques providing suggestions for items to be of
use to a user (Ricci, F., et al. 2011). Recommender systems try to predict what the most suitable products
or services are, based on the user’s preferences and constraints.
Recommender systems have become a key part of the everyday life of an average person.
They offer personalized suggestions designed to match user preferences in various application domains,
such as entertainment, e-commerce, transportation, relationships, job matching and many other important
fields (Schrage, 2020).
The basic idea of recommender systems is to use the interactions between users and items, together with
the associated item data such as item titles or descriptions, user profiles, and user reviews, to predict the
probability that users will like specified items and to select the items with the highest probability
scores (Covington, et al 2016).
More specifically, collaborative behaviors between users and items are used to design various
recommendation models, which can be further used to learn the representations of users and items
(Adomavicius et al, 2005).
There have been various approaches to building an effective recommender system including collaborative
filtering (Herlocker et al, 2000), content-based filtering (Thorat et al, 2015) or hybrid filtering (Sharma et al,
2022).
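As a minimal illustration of the collaborative filtering idea, the sketch below predicts a missing rating from user-to-user cosine similarity. The ratings, helper names, and similarity weighting are illustrative assumptions, not the method of any particular system cited above:

```python
import numpy as np

# Toy ratings matrix (users x items); 0 means unrated. Illustrative only.
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
])

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Predict user 0's rating for item 2 as a similarity-weighted average of
# the ratings given by other users who have rated item 2.
target_user, target_item = 0, 2
scores, weights = 0.0, 0.0
for u in range(R.shape[0]):
    if u == target_user or R[u, target_item] == 0:
        continue
    s = cosine_sim(R[target_user], R[u])
    scores += s * R[u, target_item]
    weights += abs(s)
pred = scores / weights if weights else 0.0
print(f"Predicted rating: {pred:.2f}")  # low: the most similar user disliked item 2
```

Note that this scheme has nothing to go on for a brand-new user (an all-zero row), which is exactly the cold start failure mode discussed in Chapter 1.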
With the tremendous success of the application of deep learning algorithms in different fields, there has
been a rise in the application of deep learning in the design and building of recommendation systems. Deep
learning has been used to effectively capture important user-item relationships and catch the intricate
relationships within the data itself in recommender systems. (Zhang et al, 2019).
The advantages of using deep learning algorithms in recommender systems include:
• Deep learning makes it possible to model complex and intricate user/item interaction patterns
using methods such as matrix factorization (Xiangnan et al, 2017).
• Deep learning networks make it possible to handle multimedia data processing and representation
learning from various sources (Zhang et al, 2019).
Some of the deep learning algorithms used in recommender systems include Deep structured semantic
model (Po-Sen et al, 2013), collaborative deep learning (Wang et al, 2015), collaborative deep ranking
(Ying et al, 2016), Graph convolutional network (Kipf et al, 2016) and many more.
However, despite the success of deep learning-based recommender systems, most existing advanced
recommender systems still face the issue of explainability (Afchar et al, 2022) and cold start problem (Lika
et al, 2014) which makes it difficult to provide accurate recommendations for both new items and new
users.
Recently, language models have been increasingly used in recommender systems due to their capacity to
comprehend and produce natural human language.
Furthermore, to take advantage of LLMs’ ability for language generation, researchers are exploring
transformer-based recommender systems that simultaneously make item recommendations and
generate explanations (Liu et al, 2023).

2.3 Large language models


Large language models are a type of Artificial Intelligence (AI) that are trained on a large amount of data
with trillions of tokens (Touvron et al, 2023) to understand the patterns and structures of natural languages
and mimic human intelligence. A token is the basic unit of text that a large language model uses to process
and generate language.
Large language models are based on the transformer architecture (a deep neural network design built on
a self-attention mechanism that dispenses with recurrence and convolutions entirely) to predict upcoming
words in text.
Some of the most prominent large language models released, such as GPT-3 (Brown et al,
2020), GPT-4 (OpenAI, 2023), LaMDA (Thoppilan et al, 2022), Llama (Touvron et al, 2023), and PaLM 2
(Anil et al, 2023), are based on the transformer architecture.

According to their architecture, there are three main categories of large language models:
I. Encoder-only models use bi-directional attention to process token sequences, considering both
the left and right context of each token to learn contextual representations. Examples include BERT
(Devlin et al, 2018).
II. Decoder-only models use a self-attention mechanism for one-directional word sequence
processing from left to right. Examples include GPT (Brown et al, 2020) and CPM (Zhang et al,
2021).
III. Encoder-decoder models handle any text-to-text task by converting every natural language
processing problem into a text generation problem. Examples include T5 (Raffel et al, 2020) and
ERNIE 3.0 (Sun et al, 2021).
These models are trained on enormous amounts of text data and use a combination of techniques such as
word embeddings, attention mechanisms, and layered neural networks to learn the relationships between
words and the context in which they are used.

Large language models work through the following steps:


1. Text preprocessing: The first step in training a large language model is to preprocess the text
data. This typically involves tokenizing the text into individual words or sub words, removing stop
words and punctuation, and other preprocessing techniques such as lemmatization and multiword
grouping (Camacho-Collados and Pilehvar, 2017).
2. Word Embeddings: Once the text data has been preprocessed, the next step is to create word
embeddings. Word embeddings are a way of representing words as vectors in a high-dimensional
space, such that similar words are close together and dissimilar words are farther apart. There are
two types of word embedding methods:
a. context-dependent embeddings method, which generates different embeddings for the same
word that is dependent on the context in which it is used.
b. context-independent method, also known as “classic” word embeddings, generate word
embeddings that are characterized by being unique and distinct for each word without
considering the word’s context. This technique is typically used in generating pre-trained
models for large language models. (Congcong Wang et al, 2021).
3. Encoding: The core architecture of large language models is the Transformer. Depending on the
model, the encoder, the decoder, or both process the input text through multiple layers of self-attention
mechanisms and feedforward neural networks (Vaswani et al, 2017); most modern generative models
are decoder-only. Self-attention allows the model to weigh the importance of each word in the
context of the entire input sequence, capturing long-range dependencies efficiently.
4. Attention Mechanism: The attention mechanism is a technique used in the encoder to allow the
model to focus on various parts of the input sequence as it processes it. The attention mechanism
computes a weighted sum of the input elements, where the weights are learned during training and
reflect the importance of each input element for the current output (Phoung and Hutter, 2022).
5. Pre-training: During the pre-training phase, the large language model is exposed to a massive
corpus of text data, often containing billions of words. In autoregressive (causal) language modeling,
the model learns to predict the next word in a sentence given the context of the preceding words. In
"masked language modeling," by contrast, some of the input tokens are randomly masked, and the
model's objective is to predict the original tokens from the surrounding context. Both tasks
encourage the model to understand the relationships between words and learn contextual
representations that capture the meaning and syntax of the language (Han et al, 2021).
6. Fine-tuning: After pre-training, the large language model is fine-tuned on specific downstream
tasks. Fine-tuning involves training the model on smaller datasets related to the specific task, such
as text classification, question answering, or language translation. During fine-tuning, the model's
weights are adjusted to perform well on the targeted task using the contextual knowledge acquired
during pre-training (Radford et al, 2018).
7. Inference and Output Generation: Once the model is fine-tuned, it can be used for inference on
new text data. For language generation tasks, the model generates coherent and contextually
relevant text based on a prompt or initial input (Radford et al, 2018). Inference often involves
techniques like beam search or sampling to generate diverse and high-quality output.
Figure 1: A diagram of the transformer architecture which large language models are based on (Vaswani et
al, 2017).
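The self-attention computation described in steps 3 and 4 can be sketched in a few lines of NumPy. This is a minimal single-head version without the learned projection matrices a real transformer applies, for illustration only:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stable
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al, 2017)
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

# Toy "embeddings" for a three-token sequence. In a real transformer, Q, K,
# and V come from learned linear projections of the token embeddings.
rng = np.random.default_rng(42)
X = rng.normal(size=(3, 4))
output, weights = scaled_dot_product_attention(X, X, X)

assert np.allclose(weights.sum(axis=1), 1.0)  # each row is a distribution
print(output.shape)  # one context-mixed vector per token
```

Each output row is a weighted mixture of all value vectors, which is how every token's representation comes to reflect the whole sequence.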

These models have revolutionized the field of AI by demonstrating unprecedented capabilities in
understanding and generating humanlike text.
One such remarkable capability is in-context learning (ICL) (Wei et al, 2023), a model’s capacity to
provide answers based on the input context as opposed to merely relying on internal knowledge obtained
through pre-training.
Works such as self-generated in-context learning (Kim et al, 2022) and learning to retrieve prompts for
in-context learning (Rubin et al, 2021) show that in-context learning allows language models to adapt their
responses based on prompts instead of generating generic responses when handling various tasks.
Another capability that makes language models remarkable is chain of thought prompting (Wei et al, 2022),
a method of breaking a problem down into a series of intermediate reasoning steps.
Current research, such as the Self-Taught Reasoner known as STaR (Zelikman et al, 2022), Three-hop Reasoning known as THOR (Fei et al, 2023), and Tab-CoT (Jin & Lu, 2023), continues to delve into how to enable language models to reason effectively and deliver more accurate responses.
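A chain-of-thought prompt can be as simple as a template that asks the model to show intermediate steps. The sketch below is illustrative; the template wording and helper name are my own, not taken from the cited papers:

```python
def chain_of_thought_prompt(question: str) -> str:
    """Wrap a question in a template that instructs the model to reason
    through intermediate steps before committing to a final answer."""
    return (
        f"Question: {question}\n"
        "Let's think step by step.\n"
        "1. Identify what the question is asking.\n"
        "2. Recall the relevant facts.\n"
        "3. Combine them to reach a conclusion.\n"
        "Answer:"
    )

prompt = chain_of_thought_prompt(
    "Which TV show about time travel should I watch after Dark?"
)
```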
With these powerful capabilities, it comes as no surprise that there is a growing body of research on using large language models to improve recommendation systems. Examples include modelling user preferences (Kang et al, 2023), analyzing user interactions to predict preferences (Bao et al, 2023; Chen, 2023), making recommendations explainable (Gao et al, 2023), and making recommendation systems more conversational (Deng et al, 2023).
3. Architecture
3.1 Introduction

In the following chapter, we will delve into the core elements of the software architecture for the
recommendation engine, examining the architectural styles and patterns that underpin its successful
implementation.

We will explore the intricacies of data handling, model deployment, and user interaction, all within the
context of our pursuit to craft a recommendation engine that leverages the transformative capabilities of
large language models.

3.2 Architectural goals and objectives


3.2.1 Goals
The goals I set out to achieve when designing the architecture of the proposed recommendation engine
are:
1. Scalability and performance: The architecture should seamlessly scale to handle increasing
amounts of data and user interactions while maintaining high performance levels. The system
should be able to accommodate a large amount of data without compromising on response times
and recommendation accuracy.
2. Real time recommendation generation: The architecture should allow real-time recommendation
generation based on user interactions. Users should receive personalized recommendations
promptly, enhancing their engagement and satisfaction.
3. Adaptability to Language Model Updates: The architecture should smoothly accommodate
changes and updates to the underlying large language model. As advancements in language
models emerge, the system should be able to integrate these improvements without causing major
disruptions.
4. Personalization accuracy: The architecture should effectively capture user preferences, contextual
nuances, and evolving interests to provide relevant and precise recommendations.

3.2.2 Architectural objectives


The following objectives were considered when designing the architecture of the software:
1. Component-based design: The system should be decomposed into distinct and loosely coupled
components, each responsible for specific tasks such as data preprocessing, model training, real-
time inference, and user interaction.
2. Event-Driven Data Flow: The system should implement an event-driven architecture that facilitates
seamless data flow and communication between components. This approach enables real-time
updates and asynchronous processing while minimizing bottlenecks.
3. API Design for Interactions: The system should make use of well-defined APIs for communication
between system components and external services. This ensures standardized interactions and
allows for easy integration with third-party applications.
4. User-Focused Interfaces: The system should have user interfaces that facilitate seamless
interactions and enhance user experience by gathering user preferences, displaying
recommendations, and enabling user feedback to refine future suggestions.

By addressing these architectural goals and objectives, the proposed recommendation engine can deliver personalized recommendations, leveraging the power of large language models to provide a reliable, scalable, and engaging user experience.
3.3 Architectural styles and patterns
Like many recommendation systems, the proposed recommendation platform is server based, and a client
is necessary to access the functionality. The system can support simple clients, like a web-based interface,
mobile applications or clients tailored to support specific features of the system.

At a high level, the proposed platform uses the n-tier architecture (Varma 2009) in its design. This
architecture is the de facto design pattern for web applications due to its simplicity, familiarity, and low cost
(Richards & Ford, 2020).

Accordingly, the proposed recommendation engine is a collection of layers, with each layer performing a specific role in the application:

i. Presentation layer: This layer handles all user interface and browser communication logic. Users
interact with the application using this layer and its main purpose is to provide an interface where
users can view recommended content, provide feedback, and engage with the system. This layer is
typically developed using HTML, CSS, JavaScript, Kotlin, Swift or Java languages.

ii. Application layer: This layer is responsible for executing specific business rules associated with
the request made by the user through the presentation layer. This layer processes user interactions,
coordinates recommendation generation, and manages real-time updates. This layer also leverages
insights from user profile and language model engine to produce tailored recommendations. The
application layer can also add, delete, or modify data in the data layer. The application layer is typically developed using Python, Java, Perl, PHP, or Ruby, and communicates with the data layer using protocols and styles such as REST (Wilde and Pautasso, 2011), SOAP (Box et al, 2000) or RPC (Remote Procedure Call) (Srinivasan, 1995).

iii. Database layer: The data management layer handles data storage and retrieval. It interacts with
databases, data sources, and external services to save user profiles, behavior logs, content
metadata, and other relevant data. This can be a relational database management system (Jatana et al, 2012) such as PostgreSQL, MySQL, MariaDB, Oracle, DB2, Informix or Microsoft SQL Server, or a NoSQL database server (Gajendran, 2012) such as Cassandra, CouchDB or MongoDB.
Figure 2: System Architecture for proposed recommendation engine

3.4 System components and modules


The following section discusses the components of the system and the interconnections among them in
more detail.
Figure 3: System components of proposed recommendation engine.

3.4.1 The web service component


This module creates and maintains user data and movie history, and handles user requests, including displaying movie recommendations.
The web service component is responsible for processing information and interacting with the database
layer to add, modify or delete data.

The web service component also interacts with the AI service component to generate recommendations by providing data such as the movie library and user preferences; the AI service component in turn interacts with the AI model component to generate recommendations based on the data provided.

The web service component receives the refined recommendations and processes them before releasing them as final recommendations to the user. This component also tracks the users' interactions with the system and the recommendations they receive.

3.4.2 The AI service component


This component serves as the pipeline that connects the web service engine to the LLM (Large Language
Model) model engine that generates the movie recommendations shown to the user.

The AI service component serves as an ETL pipeline (Vassiliadis, 2009) that extracts, transforms, and loads the user data into a machine-readable format and transfers it to the LLM model, which ingests the data and generates recommendations based on it.
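A minimal sketch of that ETL flow is shown below; the field names, text format, and in-memory queue are assumptions made for illustration rather than the platform's actual schema:

```python
def extract(user_record: dict) -> dict:
    """Extract only the preference fields relevant to recommendation."""
    return {
        "liked_shows": user_record.get("liked_shows", []),
        "genres": user_record.get("genres", []),
        "countries": user_record.get("countries", []),
    }

def transform(prefs: dict) -> str:
    """Transform structured preferences into the plain text the model ingests."""
    return (
        f"Liked shows: {', '.join(prefs['liked_shows']) or 'none'}. "
        f"Preferred genres: {', '.join(prefs['genres']) or 'any'}. "
        f"Preferred countries: {', '.join(prefs['countries']) or 'any'}."
    )

def load(model_input: str, queue: list) -> None:
    """Load the prepared text onto the queue consumed by the model service."""
    queue.append(model_input)

queue = []
raw = {"liked_shows": ["Dark"], "genres": ["Drama"], "countries": ["Japan"],
       "signup_ip": "ignored"}  # irrelevant fields are dropped by extract()
load(transform(extract(raw)), queue)
```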

The AI service component takes a set of candidate recommendations generated by the LLM model
component, and then reorders and removes items from the set to generate a refined recommendation. The
refined recommendation is then passed back to the web service engine which then passes it to the
presentation layer for display to the user.
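That refinement step amounts to filtering and re-ranking; a small sketch, with hypothetical candidate records and scores:

```python
def refine(candidates, already_seen, limit=5):
    """Drop candidates the user has already watched, re-rank the rest by
    score, and return a shortlist of at most `limit` items."""
    fresh = [c for c in candidates if c["title"] not in already_seen]
    ranked = sorted(fresh, key=lambda c: c["score"], reverse=True)
    return ranked[:limit]

candidates = [
    {"title": "Dark", "score": 0.91},
    {"title": "Lost", "score": 0.75},
    {"title": "Breaking Bad", "score": 0.88},
]
shortlist = refine(candidates, already_seen={"Dark"}, limit=2)
```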
3.4.3 The LLM model component
The LLM model component is the large language model engine responsible for generating
recommendations by picking a set of reasonable candidates for a recommendation based on user data
provided. Items are picked by applying one or more heuristics and adding the identified candidates to a set
which is then returned to the AI service component.

The LLM model component is responsible for tasks such as pre-training, fine-tuning, prompting and
completion (Fan et al, 2023).

Pre-training helps the LLM recognize and generate coherent responses by training on diverse, unlabeled data. Through this, the LLM model engine learns grammar, syntax, semantics, and human-like reasoning patterns.

Fine-tuning helps the LLM model engine to understand movie specific domain knowledge by training the
LLM model on task-specific datasets such as movie information and knowledge. This process allows the
LLM model engine to improve its performance in the recommendation domain.

Prompting helps LLMs (Large Language Models) achieve better results by applying text templates to the model input to adapt it to user-specific scenarios (Fan et al, 2023). This allows the LLM model engine to provide recommendations specific to the current user.

The LLM model engine returns recommendations in the form of completions (OpenAI, 2023) based on the
data and the prompt provided.

3.5 Data flow and communication


The efficient flow of data and seamless communication between the various components of the platform
are pivotal for its success. The intricacies of data movement, transformation, and interaction underpin the
platform’s ability to generate accurate and personalized recommendations. Here's how data flow and
communication are orchestrated within such a system:

Figure 4: Data flow within proposed recommendation engine

1. Data ingestion and collection: The data flow journey begins with the ingestion and collection of
data sources, including user preferences, movie content metadata, and textual content. These data
can originate from the user, databases, and external sources. The data is aggregated, and
metadata is extracted to form a comprehensive foundation for generating recommendations.
2. User interaction capture: User interactions, such as reviews, views, and ratings, are captured in
real time. These interactions provide valuable signals that shape the user profiles and influence the
recommendations. Capturing user feedback and engagement is an ongoing process that feeds into
the system's learning loop.
3. Preprocessing and cleansing: The data collected undergoes preprocessing to ensure its quality
and uniformity. This involves tasks such as data cleaning, normalization, and transformation.
Textual content is tokenized, and linguistic features are extracted to facilitate analysis by the
language model.
4. Large language model integration: The language model integration phase involves
communicating the preprocessed data to the language model for semantic analysis. This interaction
enables the model to understand context, sentiment, and nuances within the data. The language
model's outputs are then utilized in the recommendation process.
5. Real time recommendation generation: Based on the insights gained from the data, the language
model generates recommendations which are assessed and matched with user preferences.
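The preprocessing step (3) above can be illustrated with a minimal cleaning-and-tokenizing sketch; a production system would use the language model's own subword tokenizer rather than this whitespace/regex approach:

```python
import re
import unicodedata

def preprocess(text: str) -> list:
    """Normalize raw text and split it into lowercase word tokens."""
    text = unicodedata.normalize("NFKC", text)  # unify unicode variants
    text = re.sub(r"<[^>]+>", " ", text)        # strip stray HTML tags
    return re.findall(r"[a-z0-9']+", text.lower())

tokens = preprocess("<p>A mind-bending THRILLER about dreams.</p>")
# tokens == ['a', 'mind', 'bending', 'thriller', 'about', 'dreams']
```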

3.6 Conclusion

In this chapter, we have delved into the software architecture of the proposed recommendation platform.
The architectural decisions made are pivotal in shaping the effectiveness, scalability, and user satisfaction
of the recommendation platform. Through the exploration of architectural goals, design principles, and
specific components, we have laid the foundation for a robust and sophisticated system that leverages the
capabilities of modern AI technology.

By adopting a 3-tiered architecture, we have ensured a separation of concerns, enabling modular development and maintainability. The presentation, application, and data layers work in harmony, each contributing to the overall functionality of the recommendation engine.
4. Implementation
4.1 Introduction
This chapter contains a basic outline of the software engineering process and tools involved in the development of the recommendation engine. We describe the software used to develop the recommendation platform, including the programming languages, applications, and software frameworks used.

4.2 Software requirements


Software requirements are computing pre-requisites needed for the software to run on any computer and deliver the service required by the user (Bozyiğit et al, 2021). The recommendation engine must be able to do the following to satisfy the requirements:

• Users should be able to request movie recommendations without specifying their watch history.
• Users should be able to request movie recommendations by specifying the genres of movies that
they like.
• Users should be able to request movie recommendations by specifying the countries they prefer.
• Users should be able to see the reasons why a movie was recommended.
• The software should be able to handle enormous amounts of data and have low latency.

4.3 Software Process Model


A software process model helps developers to systematize and plan their software development process
from its initial feasibility study through to its deployment in the field and maintenance (Ruparelia, 2010).

I utilized the waterfall process model for this project due to the following reasons:
1. It provides a structured and sequential approach to software development.
2. Each phase is well-defined and follows a logical progression which provides simplicity and clarity.
3. Its clear-cut structure allows for thorough planning and documentation and focuses on
understanding the project requirement.
4. It provides a level of predictability in the development process that makes it easy to establish
realistic timelines.

Figure 5: Phases in waterfall process model. (Adobe 2022).


Alternative process models include agile approaches such as Scrum and Kanban. These models are great when working in a team, especially on projects with flexible requirements (Ruparelia, 2010).

4.4 System Overview


The movie recommendation platform is a web-based application accessible through standard web
browsers and mobile devices. The core components of the system include:

Frontend: The frontend, developed using HTML, CSS, and JavaScript and served by Ruby on Rails, provides an intuitive and visually appealing interface for users to interact with the platform. Through this interface, users can specify the genres and countries of origin of movies and receive personalized movie recommendations.

Backend: The backend, implemented in Python and Ruby on Rails, serves as the engine that processes
user requests, handles business logic, and communicates with the machine learning model.
The backend is powered by two different APIs:
• Ruby API: This API processes user requests, handles business logic, and saves and queries user data in the database. It connects the front end to the database and handles all user interactions. I chose Ruby as the language and Ruby on Rails as the framework for this API due to its built-in protection against common web vulnerabilities such as SQL injection, Cross-Site Scripting (XSS), and Cross-Site Request Forgery (CSRF), and its encouragement of RESTful API design through clean and predictable URL structures.
• Python API: This API handles the interaction between the platform and the machine learning engine that powers the movie recommendations. It powers the ETL pipeline that extracts, transforms, and loads data from the database to the model engine, and it handles vector storage and retrieval. I chose Python for this API due to its rich ecosystem and ease of use for machine learning projects, and the FastAPI web framework for its ability to handle multiple requests asynchronously, which is important to ensure that the system serves many concurrent requests with low latency.

Machine Learning Service: The movie recommendations are generated by the machine learning service
powered by Meta's Llama-2 language model.
This service takes user data such as movie preferences, viewing history, and movie profiles and converts them to embedding vectors that can be easily consumed by machine learning models and algorithms. The service also returns predicted completions, and the probabilities of alternatives, when given a prompt.

There are several popular language models available, including GPT-3 (Brown et al, 2020), GPT-4 (OpenAI, 2023), LaMDA (Thoppilan et al, 2022), LLaMA (Touvron et al, 2023), and PaLM 2 (Anil et al, 2023). Each model
has its strengths and weaknesses, and the choice between them depends on the specific use case and
technical requirements.

Llama-2 (Touvron et al, 2023) is a recent language model, released in July 2023, that has gained popularity due to its high performance and efficiency.

I chose to use Llama-2 over the other available language models for the following reasons:
1. Architecture: Llama-2 adopts the grouped-query attention technique (Ainslie et al, 2023) to speed up its inference process.
2. Efficiency: Llama-2 has been designed with efficiency in mind, using less computation and memory. This makes it particularly useful for deployment on resource-constrained devices or for applications where computational resources are limited.
3. Open source: Llama-2 is open source, which allows researchers and developers to modify and customize the model to fit their specific needs.
4. Pre-training: Llama-2 is pre-trained on 2 trillion tokens of data (Touvron et al, 2023), drawn largely from the internet, which helps it learn a rich set of semantic and syntactic features.

Postgres Database: Postgres is used to store the user data, movie information, and other relevant data necessary for generating accurate recommendations. It is also used to store the embedding vectors generated by the machine learning service.
There are several popular databases available including: MySQL, MongoDB, Oracle, Microsoft SQL server,
MariaDB, SQLite and many more.

I decided to use the Postgres database because of its support for vector similarity search (via the pgvector extension) and because of my previous experience working with it.
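The idea behind vector similarity search can be sketched in plain Python. The three-dimensional embeddings below are toy values (real embeddings have hundreds of dimensions); in Postgres, pgvector's `<=>` cosine-distance operator performs the analogous ordering inside the database:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query_vec, library, k=2):
    """Return the titles of the k embeddings most similar to the query,
    mirroring `ORDER BY embedding <=> :query LIMIT :k` with pgvector."""
    ranked = sorted(library.items(),
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [title for title, _ in ranked[:k]]

library = {
    "Dark": [0.9, 0.1, 0.0],
    "Friends": [0.0, 0.9, 0.4],
    "Stranger Things": [0.8, 0.2, 0.1],
}
matches = nearest([1.0, 0.0, 0.0], library, k=2)
```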

Figure 6: Implementation of an LLM based movie recommendation platform


4.5 High-Level Design

The high-level design outlines the interactions between the application components:

4.5.1 User Interaction Flow


1. Users access the movie recommendation platform through their web browsers or mobile devices.
2. The front end communicates the user preferences to the back-end service via Ruby REST API.
3. The Ruby backend processes the user input and stores relevant data in the Postgres database.
4. When a user requests movie recommendations, the backend sends a request to the language
model service to process the user's preferences and generate relevant responses which are shown
to the user in the form of movie recommendations through the frontend.

4.5.2 Language Model Integration


Llama-2 is integrated into the system as a fine-tuned machine learning model. The backend communicates
with Llama-2 using API calls to send user preferences and receive movie recommendations.
1. The Python backend extracts, transforms, and loads the user data into the language model, which converts the data to vector embeddings.
2. The generated vector embeddings are then saved in the Postgres database using the Python API.
3. To generate a movie recommendation, a user- or system-generated prompt is sent to the language model using the Python API.
4. The language model generates a response based on the prompt provided, which is shown to the
user via the backend.

4.5.3 Postgres Database Integration


The Postgres database stores user preferences, movie information, query prompts, and other relevant
data.
1. Movie data such as name, description, genre, and more is stored in the database via a REST API
call by the backend.
2. User actions such as preferences and query prompts are saved in the database when carried out by
the Ruby backend.
3. Embedding vectors generated by the language model are stored in the database by the Python
backend.
4. Embedding vectors stored in the database are queried and retrieved by the Python backend when
the language model service is generating a movie recommendation.
Figure 7: Database schema for the recommendation engine.

4.6 Implementation
The major pages in the platform are the preferences page and the recommendations result page.

4.6.1 Preferences page


The preferences page allows the user to set their preferences, including movies similar to the ones they are looking for, the genre they have in mind, and their countries of choice.

Figure 8: Preferences page – Blank state

The movie dropdown is dynamically populated with movie data from the TVMaze API. I implemented it by writing a typeahead function in JavaScript that takes the user input, searches the TVMaze API with it, and returns the results.
Figure 9: TV show typeahead function.

The genre dropdown is populated from a static array of genres such as Drama, Crime, Espionage and more.

Figure 10: An array of movie genres users can select from.

The countries dropdown is also populated by a typeahead function that automatically fetches a list of all countries using the countriesnow API (https://countriesnow.space/api/v0.1/countries/iso).
Figure 11: Countries typeahead function

Once a user chooses their preferences, they click on the "Give me recommendations" button and the API fetches the recommendation from the language model (Llama-2) service.
Figure 12: Preferences page (User selects preferences)

4.6.2 Recommendation results page


When a recommendation is generated, the result is displayed on the recommendations result page shown in Figure 15. However, before the recommendation is generated, the following steps happen:

• Prompt generation: The Ruby API generates a prompt using the user data and a pre-generated
prompt template.

Figure 13: Prompt generation function

• Completion: The Ruby API sends the generated prompt to the language model service through the
Python API. The language model generates a recommendation based on the prompt provided and
then streams the response back to the Python API.

Figure 14: Python API method interacting with the Language model service.

The recommendation generated by the language model service is parsed by the Python API and then
returned to the Ruby API which displays the recommendation result to the user.
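The parsing step might look like the sketch below, assuming the completion arrives as a numbered plain-text list of "Title - explanation" lines; the deployed system's actual response format may differ:

```python
import re

def parse_recommendations(completion: str) -> list:
    """Parse lines of the form '1. Title - explanation' into records.
    A simplification: titles containing a hyphen would be split incorrectly."""
    records = []
    for line in completion.splitlines():
        match = re.match(r"\s*\d+\.\s*(.+?)\s*-\s*(.+)", line)
        if match:
            records.append({"title": match.group(1),
                            "explanation": match.group(2)})
    return records

completion = ("1. Dark - A German sci-fi thriller about time travel.\n"
              "2. Lost - Survivors unravel a mysterious island.")
recs = parse_recommendations(completion)
```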
Figure 15: Recommendations result page

4.7 Conclusion
This chapter provided a comprehensive overview of the system implementation for the movie
recommendation platform. By harnessing the power of Llama-2, Python, Ruby on Rails, JavaScript, and
Postgres, the platform delivers personalized movie recommendations, enhancing the movie-watching
experience for users.
5. Result and Evaluation
5.1 Introduction
The major objective of this research is to implement a recommendation engine that leverages large language models to solve the problems of sparsity, cold start, and explainability in recommender systems. In this chapter, we revisit these objectives and assess the extent to which they have been met.

To evaluate our recommendation engine, we conducted an extensive usage experiment. Through this, we aim to answer the following research questions:

• Can the recommendation engine solve the cold start problem by providing relevant
recommendations without knowing the user’s movie history?
• Can the recommendation engine solve the sparsity problem by providing relevant recommendations
without knowing the relationship between user and movies such as ratings and reviews?
• Can the recommendation engine solve the explainability problem by generating a relevant explanation for every recommendation choice it makes?

5.2 Experimental setup


5.2.1 Dataset
I used the TV show dataset from TVMaze, an API provider with data about TV shows available globally. The dataset comprises 67,284 TV shows and was used to fine-tune the Llama-2 13B-parameter base model.
Each entry contains information about the TV show, such as its title, description, country of production, and genre.

5.2.2 Metrics
To evaluate the performance of the recommendation engine, I used the following metrics:

• User coverage: This is the percentage of users for whom the recommender was able to generate a recommendation list, out of the total number of potential users (Massa & Avesani, 2009). This is a good measure of how well the recommender solves the cold start problem, since it counts the number of cold start users who received a relevant recommendation.

Coverage_user = (u / U) * 100

where u is the number of users with a recommendation list and U is the total number of users.

• Explanation coverage: This is the percentage of recommendations for which explanations are provided. This is a good measure of how well the recommender solves the explainability problem, since it counts how many recommendation choices come with a good explanation of why they were chosen.

Coverage_explainability = (e / E) * 100

where e is the number of recommendation choices with an explanation and E is the total number of recommendation choices.
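Both metrics are straightforward to compute; a small sketch using illustrative counts:

```python
def user_coverage(users_with_list: int, total_users: int) -> float:
    """Coverage_user = (u / U) * 100."""
    return users_with_list / total_users * 100

def explanation_coverage(explained_choices: int, total_choices: int) -> float:
    """Coverage_explainability = (e / E) * 100."""
    return explained_choices / total_choices * 100

# Example: 10 users all receive lists; 900 of 1000 choices carry explanations.
u_cov = user_coverage(10, 10)            # 100.0
e_cov = explanation_coverage(900, 1000)  # 90.0
```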


5.2.3 Result
To evaluate the performance of the recommendation engine, I conducted 200 simulations by asking 10 testers to request recommendations using the platform.
In each simulation, a random user without an account provides some TV shows similar to the ones they would like to see and specifies the genre and the country of production. They then receive a recommendation list with five recommended choices, each with an explanation.

This generated 200 recommendation lists with 1000 recommendation choices.

Based on this, I got the following result:

Total simulations | User coverage | Explanation coverage
200               | 100%          | 100%

This means that each of the 10 users, none of whom had an account, received a recommendation list every time they used the recommender, and every choice in each list came with a relevant explanation of why it was recommended.

This demonstrates the effectiveness of the recommender in generating recommendations for new users and explaining its reasoning for every recommended choice, helping users see the rationale behind its selections.
6 Reflection
6.1 Introduction
In this chapter, I discuss the motivations behind choosing this project, the challenges faced during its
implementation, and the lessons learned throughout the process.

6.2 Motivations
The main motivation behind choosing this project was to explore the potential of using large language
models for recommendation systems. I have seen how recommendation systems have become part of our
everyday life with their application in shopping, entertainment, finance, and health.

Traditional methods of recommendation, such as collaborative filtering and content-based filtering, have
been successful but limited in their ability to help users understand the rationale behind their
recommendations. I wanted to investigate whether large language models could be used to improve the
explainability of recommendations.

6.3 Challenges
The implementation of a large language model-based recommendation system posed several challenges,
including data quality, training time, and time management.

Data Quality: The first challenge was collecting and preprocessing the dataset that captures relevant
details about movies to be used in finetuning the language model. The dataset needed to be
comprehensive, diverse, and representative of various countries and genres. Moreover, the data had to be
cleaned and normalized to ensure consistency and quality. I addressed this challenge by writing a Python script to fetch data from a movie API provider and create a JSONL file containing a training dataset of over 60,000 movie descriptions.
Training Time: The second challenge was fine-tuning the large language model. The model required significant computational resources and a substantial amount of time to fine-tune. To address this challenge, I used Replicate, a web platform that provides a high-performance computing cluster to train and fine-tune open-source language models using an optimized architecture that reduces training time.
Time management: The third challenge was balancing the demands of research, data gathering, technical implementation, and writing while working to a tight deadline. To address this challenge, I broke the project down into modules, which became easier to manage and track using a project plan (a Gantt chart in this case).

6.4 Lessons Learned


Throughout the project, I learned several valuable lessons that can be applied to future research projects.

While working on this project, I learnt how to manage a project using Gantt charts. This helped me manage my time effectively and ensure that I completed my research on schedule.
I also learnt how to conduct independent research: how to identify research gaps, formulate research questions, and design and implement a research study. I learnt how to critically evaluate existing research, identify patterns and trends, and synthesize information to inform my own research.

I learnt different ways to evaluate the performance of a recommendation system and how to choose
appropriate evaluation metrics.
7. Conclusion and future work
7.1 Introduction
In this research, we implemented a recommendation engine based on large language models to solve the cold start and explainability problems, two crucial problems faced by current recommendation algorithms (Lika et al, 2014; Wang et al, 2018).

In this chapter, we discuss the contributions of this research to the recommender systems field and highlight areas where further research and exploration can continue to advance the field.

7.2 Contributions
In implementing a recommender system based on large language models, this project has made several noteworthy contributions to the domain of recommender systems:

1. Novel architecture: This project developed and rigorously tested a novel recommendation architecture tailored to integrating large language models into recommender systems, with a focus on scalability and real-time responsiveness.
2. Cold start problem mitigation: By integrating large language models, this project shows how to mitigate the cold start problem in recommender systems. Our solution enables personalized recommendations even for users with no historical interaction data.
3. Enhanced explainability: This project shows how the integration of a large language model can help recommender systems provide clear and understandable explanations for their recommendations. This not only increases user trust but also fosters a deeper understanding of the underlying recommendation process.
4. Real-world applicability: While many earlier research works (Kang et al, 2023; Bao et al, 2023; Chen, 2023; Gao et al, 2023; Deng et al, 2023) focus on the theoretical implications of leveraging language models in recommender systems, this project demonstrates the practical implications by successfully deploying a large language model-based recommender system for movies, showing tangible benefits for the user experience.
These contributions collectively reflect the depth and breadth of this project's impact, from advancing the
theoretical foundations of language model backed recommender systems to providing practical solutions
that enhance user experiences and address critical challenges.

The significance of this work extends beyond this project, serving as a catalyst for continued research and
development in the quest for more effective, user-centric, and reliable recommendation systems in an ever-
evolving digital landscape.

7.3 Future work


While this project has made significant strides in building a large language model-based recommender
system, there are several promising avenues for future work and research that we would like to explore
further. These potential directions encompass both the refinement of existing approaches and the
exploration of emerging opportunities.

• Improving recommendation diversity and serendipity: The current recommender engine
focuses on improving user coverage and explanation coverage to solve the cold start and
explainability problems. I would like to improve the diversity and serendipity of the engine's
recommendations. Diversity is a measure of how dissimilar recommended items are for a user (Kunaver
& Požrl, 2017), while serendipity is a measure of how surprising relevant recommendations are
(Kotkov et al., 2016).
• Ethical and fair recommendations: I would like to investigate methods to ensure that
recommendation systems are fair, unbiased, and sensitive to ethical considerations.
• Reinforcement learning integration: In the current iteration of the recommender engine, single-
round recommendation is adopted, and no attention is paid to user feedback on the
recommendations (such as clicks, additions to a watch library, and viewing behavior). It is
therefore worth exploring how to integrate real-time user behaviors and interactions with the
system into the recommendation paradigm and construct a multi-round, conversation-based
recommendation model. Through multi-round conversational recommendation, not only can the
contextual learning abilities of language models be maximized, but more training prompts can
also be generated from the interactions, which can improve the recommendation results.
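As an illustrative sketch of the diversity metric described above, intra-list diversity can be computed as the average pairwise cosine dissimilarity over item embeddings of a recommendation list. The function and variable names here are illustrative, not part of the implemented engine:

```python
import numpy as np

def intra_list_diversity(item_embeddings: np.ndarray) -> float:
    """Average pairwise cosine dissimilarity of a recommendation list.

    item_embeddings: (n_items, dim) array of item vectors (illustrative).
    Returns 0 when all recommended items are identical in embedding space,
    and larger values as the list becomes more varied.
    """
    # Normalise rows so dot products become cosine similarities.
    norms = np.linalg.norm(item_embeddings, axis=1, keepdims=True)
    unit = item_embeddings / norms
    sim = unit @ unit.T  # pairwise cosine similarity matrix
    n = len(item_embeddings)
    # Average similarity over distinct pairs, excluding the diagonal.
    pair_sim = (sim.sum() - n) / (n * (n - 1))
    return 1.0 - pair_sim
```

A list of near-duplicate items scores close to 0, while a list of mutually orthogonal item vectors scores 1, making the metric easy to track alongside accuracy during evaluation.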
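A minimal sketch of the multi-round, feedback-aware loop proposed above might look as follows; `llm_complete` and `get_user_feedback` are hypothetical callables standing in for the model API and the client-side event stream, not functions of the implemented engine:

```python
def conversational_recommend(llm_complete, get_user_feedback, user_profile, rounds=3):
    """Multi-round recommendation sketch: each round feeds the prior
    replies and user feedback back into the prompt context."""
    history = [f"User preferences: {user_profile}"]
    recommendations = []
    for _ in range(rounds):
        prompt = "\n".join(history) + "\nRecommend one movie and briefly explain why:"
        reply = llm_complete(prompt)          # hypothetical LLM call
        feedback = get_user_feedback(reply)   # e.g. "clicked", "added to library"
        recommendations.append((reply, feedback))
        # Accumulate the conversation so later rounds condition on feedback.
        history.append(f"Assistant: {reply}")
        history.append(f"User feedback: {feedback}")
    return recommendations
```

The logged (prompt, reply, feedback) triples from such a loop could then serve as additional fine-tuning material, which is the second benefit of conversational recommendation noted above.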

In summary, this research project explores the possibility of solving the cold start and explainability
problems in the recommender systems field by applying large language models. I hope that this work can
inspire researchers to identify more opportunities for applying large language models to similar tasks and
further improve their performance in existing scenarios.
References
Schrage, M. 2020. Recommendation Engines. MIT Press.

Ricci, F., Rokach, L., Shapira, B. 2011. Introduction to Recommender Systems Handbook. Springer,
Boston, MA.

Covington, P., Adams, J. and Sargin, E., 2016, September. Deep neural networks for YouTube
recommendations. In Proceedings of the 10th ACM conference on recommender systems (pp. 191-198).

Adomavicius, G. and Tuzhilin, A., 2005. Toward the next generation of recommender systems: A survey of
the state-of-the-art and possible extensions. IEEE transactions on knowledge and data engineering, 17(6),
pp.734-749.

Pope, R., Douglas, S., Chowdhery, A., Devlin, J., Bradbury, J., Heek, J., Xiao, K., Agrawal, S. and Dean, J.,
2023. Efficiently scaling transformer inference. Proceedings of Machine Learning and Systems, 5.

Jonathan L. Herlocker, Joseph A. Konstan, and John Riedl. 2000. Explaining collaborative filtering
recommendations. In Proceedings of the 2000 ACM conference on Computer supported cooperative work
(CSCW '00). Association for Computing Machinery, New York, NY, USA, 241–250.

Thorat, P.B., Goudar, R.M. and Barve, S., 2015. Survey on collaborative filtering, content-based filtering
and hybrid recommendation system. International Journal of Computer Applications, 110(4), pp.31-36.

Sharma, S., Rana, V. and Malhotra, M., 2022. Automatic recommendation system based on hybrid filtering
algorithm. Education and Information Technologies, pp.1-16.

Shuai Zhang, Lina Yao, Aixin Sun, and Yi Tay. 2019. Deep Learning Based Recommender System: A
Survey and New Perspectives. ACM Computing Surveys 52, 1, Article 5 (January 2020), 38 pages.

Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural
Collaborative Filtering. In Proceedings of the 26th International Conference on World Wide Web (WWW
'17). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva,
CHE, 173–182.

Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry Heck. 2013. Learning deep
structured semantic models for web search using clickthrough data. In Proceedings of the 22nd ACM
international conference on Information & Knowledge Management (CIKM '13). Association for
Computing Machinery, New York, NY, USA.

Wang, H., Wang, N., and Yeung, D.Y., 2015, August. Collaborative deep learning for recommender
systems. In Proceedings of the 21st ACM SIGKDD international conference on knowledge discovery and
data mining (pp. 1235-1244).

Ying, H., Chen, L., Xiong, Y. and Wu, J., 2016. Collaborative deep ranking: A hybrid pair-wise
recommendation algorithm with implicit feedback. In Pacific-Asia conference on knowledge discovery and
data mining (pp. 555-567). Springer, Cham.

Kipf, T.N. and Welling, M., 2016. Semi-supervised classification with graph convolutional networks. arXiv
preprint arXiv:1609.02907.

Lika, B., Kolomvatsos, K. and Hadjiefthymiades, S., 2014. Facing the cold start problem in recommender
systems. Expert systems with applications, 41(4), pp.2065-2073.

Afchar, D., Melchiorre, A., Schedl, M., Hennequin, R., Epure, E. and Moussallam, M., 2022. Explainability in
music recommender systems. AI Magazine, 43(2), pp.190-208.

Liu, J., Liu, C., Lv, R., Zhou, K. and Zhang, Y., 2023. Is ChatGPT a good recommender? A preliminary
study. arXiv preprint arXiv:2304.10149.

Devlin, J., Chang, M.W., Lee, K. and Toutanova, K., 2018. Bert: Pre-training of deep bidirectional
transformers for language understanding. arXiv preprint arXiv:1810.04805.

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P.,
Sastry, G., Askell, A. and Agarwal, S., 2020. Language models are few-shot learners. Advances in neural
information processing systems, 33, pp.1877-1901.

Zhang, Z., Han, X., Zhou, H., Ke, P., Gu, Y., Ye, D., Qin, Y., Su, Y., Ji, H., Guan, J. and Qi, F., 2021. CPM:
A large-scale generative Chinese pre-trained language model. AI Open, 2, pp.93-99.

Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W. and Liu, P.J., 2020.
Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine
Learning Research, 21(1), pp.5485-5551.

Sun, Y., Wang, S., Feng, S., Ding, S., Pang, C., Shang, J., Liu, J., Chen, X., Zhao, Y., Lu, Y. and Liu, W.,
2021. Ernie 3.0: Large-scale knowledge enhanced pre-training for language understanding and generation.
arXiv preprint arXiv:2107.02137.

OpenAI, 2023. GPT-4 Technical Report. arXiv preprint arXiv:2303.08774.

Thoppilan, R., De Freitas, D., Hall, J., Shazeer, N., Kulshreshtha, A., Cheng, H.T., Jin, A., Bos, T., Baker,
L., Du, Y. and Li, Y., 2022. Lamda: Language models for dialog applications. arXiv preprint
arXiv:2201.08239.

Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.A., Lacroix, T., Rozière, B., Goyal, N., Hambro,
E., Azhar, F. and Rodriguez, A., 2023. Llama: Open and efficient foundation language models. arXiv
preprint arXiv:2302.13971.

Anil, R., Dai, A.M., Firat, O., Johnson, M., Lepikhin, D., Passos, A., Shakeri, S., Taropa, E., Bailey, P.,
Chen, Z. and Chu, E., 2023. Palm 2 technical report. arXiv preprint arXiv:2305.10403.

Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D. and Ma, T.,
2023. Larger language models do in-context learning differently. arXiv preprint arXiv:2303.03846.

Kim, H.J., Cho, H., Kim, J., Kim, T., Yoo, K.M. and Lee, S.G., 2022. Self-generated in-context learning:
Leveraging auto-regressive language models as a demonstration generator. arXiv preprint
arXiv:2206.08082.

Rubin, O., Herzig, J. and Berant, J., 2021. Learning to retrieve prompts for in-context learning. arXiv
preprint arXiv:2112.08633.
Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V. and Zhou, D., 2022. Chain-of-
thought prompting elicits reasoning in large language models. Advances in Neural Information Processing
Systems, 35, pp.24824-24837

Zelikman, E., Wu, Y., Mu, J. and Goodman, N., 2022. Star: Bootstrapping reasoning with reasoning.
Advances in Neural Information Processing Systems, 35, pp.15476-15488.

Fei, H., Li, B., Liu, Q., Bing, L., Li, F. and Chua, T.S., 2023. Reasoning Implicit Sentiment with Chain-of-
Thought Prompting. arXiv preprint arXiv:2305.11255.

Jin, Z. and Lu, W., 2023. Tab-CoT: Zero-shot Tabular Chain of Thought. arXiv preprint arXiv:2305.17812.

Wu, S., Irsoy, O., Lu, S., Dabravolski, V., Dredze, M., Gehrmann, S., Kambadur, P., Rosenberg, D. and
Mann, G., 2023. Bloomberggpt: A large language model for finance. arXiv preprint arXiv:2303.17564.

Sallam M. (2023). ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review on
the Promising Perspectives and Valid Concerns. Healthcare (Basel, Switzerland), 11(6), 887.
https://doi.org/10.3390/healthcare11060887

Milano, S., McGrane, J.A. & Leonelli, S. (2023). Large language models challenge the future of higher
education. Nat Mach Intell 5, 333–334. https://doi.org/10.1038/s42256-023-00644-2

Kang, W.C., Ni, J., Mehta, N., Sathiamoorthy, M., Hong, L., Chi, E. and Cheng, D.Z., 2023. Do LLMs
Understand User Preferences? Evaluating LLMs On User Rating Prediction. arXiv preprint
arXiv:2305.06474.

Bao, K., Zhang, J., Zhang, Y., Wang, W., Feng, F. and He, X., 2023. Tallrec: An effective and efficient
tuning framework to align large language model with recommendation. arXiv preprint arXiv:2305.00447.

Chen, Z., 2023. PALR: Personalization Aware LLMs for Recommendation. arXiv preprint arXiv:2305.07622.

Gao, Y., Sheng, T., Xiang, Y., Xiong, Y., Wang, H. and Zhang, J., 2023. Chat-rec: Towards interactive and
explainable llms-augmented recommender system. arXiv preprint arXiv:2303.14524.

Deng, Y., Zhang, W., Xu, W., Lei, W., Chua, T.S. and Lam, W., 2023. A unified multi-task learning
framework for multi-goal conversational recommender systems. ACM Transactions on Information
Systems, 41(3), pp.1-25.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł. and Polosukhin, I.,
2017. Attention is all you need. Advances in neural information processing systems, 30.

Camacho-Collados, J. and Pilehvar, M.T., 2017. On the role of text preprocessing in neural network
architectures: An evaluation study on text categorization and sentiment analysis. arXiv preprint
arXiv:1707.01780.

Congcong Wang, Paul Nulty, and David Lillis. 2021. A Comparative Study on Word Embeddings in Deep
Learning for Text Classification. In Proceedings of the 4th International Conference on Natural Language
Processing and Information Retrieval (NLPIR '20). Association for Computing Machinery, New York, NY,
USA, 37–46. https://doi.org/10.1145/3443279.3443304

Phuong, M. and Hutter, M., 2022. Formal algorithms for transformers. arXiv preprint arXiv:2207.09238.

Han, X., Zhang, Z., Ding, N., Gu, Y., Liu, X., Huo, Y., Qiu, J., Yao, Y., Zhang, A., Zhang, L. and Han, W.,
2021. Pre-trained models: Past, present and future. AI Open, 2, pp.225-250.

Radford, A., Narasimhan, K., Salimans, T. and Sutskever, I., 2018. Improving language understanding by
generative pre-training.

Xavier Amatriain and Justin Basilico. 2016. Past, Present, and Future of Recommender Systems: An
Industry Perspective. In Proceedings of the 10th ACM Conference on Recommender Systems (RecSys
'16). Association for Computing Machinery, New York, NY, USA, 211–214.
https://doi.org/10.1145/2959100.2959144

Vasudeva Varma, 2009. Software architecture: A case-based approach. New Delhi, India: Dorling
Kindersley.

Mark Richards, Neal Ford, 2020. Fundamentals of Software Architecture: An Engineering Approach.
O’Reilly Media.

Srinivasan, R., 1995. RPC: Remote procedure call protocol specification version 2 (No. rfc1831).

Box, D., Ehnebuske, D., Kakivaya, G., Layman, A., Mendelsohn, N., Nielsen, H.F., Thatte, S. and Winer,
D., 2000. Simple object access protocol (SOAP).

Wilde, E. and Pautasso, C. eds., 2011. REST: from research to practice. Springer Science & Business
Media.

Jatana, N., Puri, S., Ahuja, M., Kathuria, I. and Gosain, D., 2012. A survey and comparison of relational and
non-relational database. International Journal of Engineering Research & Technology, 1(6), pp.1-5.

Gajendran, S.K., 2012. A survey on nosql databases. University of Illinois.


Vassiliadis, P., 2009. A survey of extract–transform–load technology. International Journal of Data
Warehousing and Mining (IJDWM), 5(3), pp.1-27.

Fan, W., Zhao, Z., Li, J., Liu, Y., Mei, X., Wang, Y., Tang, J. and Li, Q., 2023. Recommender systems in the
era of large language models (llms). arXiv preprint arXiv:2307.02046

OpenAI API documentation. 2023. Completions. Available at:
https://platform.openai.com/docs/api-reference/completions [Accessed: 08 August 2023].

Herlocker, J.L., Konstan, J.A., Borchers, A. and Riedl, J., 1999, August. An algorithmic framework for
performing collaborative filtering. In Proceedings of the 22nd annual international ACM SIGIR conference
on Research and development in information retrieval (pp. 230-237).

Balabanović, M. and Shoham, Y., 1997. Fab: Content-based, collaborative recommendation.
Communications of the ACM, 40(3), pp.66-72.

Wang, J., Huang, P., Zhao, H., Zhang, Z., Zhao, B. and Lee, D.L., 2018, July. Billion-scale commodity
embedding for e-commerce recommendation in alibaba. In Proceedings of the 24th ACM SIGKDD
international conference on knowledge discovery & data mining (pp. 839-848).

Chen, X., Zhang, Y. and Wen, J.R., 2022. Measuring" why" in recommender systems: A comprehensive
survey on the evaluation of explainable recommendation. arXiv preprint arXiv:2202.06466.

Dietmar Jannach, Markus Zanker, Alexander Felfernig, and Gerhard Friedrich. 2010. Recommender
Systems: An Introduction.

IBM Newsroom, 2020. 5 Things to Know About IBM’s New Tape Storage World Record. Available at
https://newsroom.ibm.com/IBM-research?item=32682.

Carlos A Gomez-Uribe and Neil Hunt. 2016. The netflix recommender system: Algorithms, business value,
and innovation. TMIS 6, 4 (2016)

Naumov, M., Mudigere, D., Shi, H.J.M., Huang, J., Sundaraman, N., Park, J., Wang, X., Gupta, U., Wu,
C.J., Azzolini, A.G. and Dzhulgakov, D., 2019. Deep learning recommendation model for personalization
and recommendation systems. arXiv preprint arXiv:1906.00091.

Shumpei Okura, Yukihiro Tagami, Shingo Ono, and Akira Tajima. 2017. Embedding-based news
recommendation for millions of users. In Proceedings of the SIGKDD. ACM, Halifax, NS, 1933–1942

Bozyiğit, F., Aktaş, Ö. and Kılınç, D., 2021. Linking software requirements and conceptual models: A
systematic literature review. Engineering Science and Technology, an International Journal, 24(1), pp.71-
82.

Nayan B. Ruparelia. 2010. Software development lifecycle models. SIGSOFT Softw. Eng. Notes 35, 3 (May
2010), 8–13. https://doi.org/10.1145/1764810.1764814

Ainslie, J., Lee-Thorp, J., de Jong, M., Zemlyanskiy, Y., Lebrón, F. and Sanghai, S., 2023. GQA: Training
Generalized Multi-Query Transformer Models from Multi-Head Checkpoints. arXiv preprint
arXiv:2305.13245.

Massa, P. and Avesani, P., 2009. Trust metrics in recommender systems. In Computing with social trust
(pp. 259-285). London: Springer London.

Kunaver, M. and Požrl, T., 2017. Diversity in recommender systems–A survey. Knowledge-based systems,
123, pp.154-162.

Kotkov, D., Wang, S. and Veijalainen, J., 2016. A survey of serendipity in recommender systems.
Knowledge-Based Systems, 111, pp.180-192.
