RAG-based-Chatbot-using-LLMs
Abstract—Historically, Artificial Intelligence (AI) was used to understand and recommend information. Now, Generative AI can also help us create new content. Generative AI builds on existing technologies such as Large Language Models (LLMs), which are trained on large amounts of text and learn to predict the next word in a sentence. Generative AI can create not only new text but also images, videos, and audio. This project focuses on the implementation of a chatbot, based on the concepts of Generative AI and Large Language Models, that can answer queries about the content provided in PDFs. The primary technologies utilized include Python libraries like LangChain, PyTorch for model training, and Hugging Face's Transformers library for accessing pre-trained models like Llama2 and GPT-3.5 (Generative Pre-trained Transformer) architectures. The responses are generated using the Retrieval Augmented Generation (RAG) approach. The project aims to develop a chatbot that can generate sensible responses from data supplied as PDF files. The project demonstrates the capabilities and applications of advanced Natural Language Processing (NLP) techniques in creating conversational agents that can be deployed across various platforms in the corporation to enhance user interaction and support automated tasks.
Index Terms—Generative AI, Artificial Intelligence, Natural Language Processing, Large Language Model, Llama2, Transformers, Document Loaders, Retrieval Augmented Generation, Vector Database, Langchain, Chainlit
I. INTRODUCTION
In today's digital age, the demand for intelligent conversational agents, known as chatbots, has surged dramatically. These chatbots, powered by technologies such as Large Language Models (LLMs) and advanced Natural Language Processing (NLP) techniques, have changed how businesses and organizations interact with their customers and users. In line with this trend, the project aims to develop a chatbot utilizing LLMs and related technologies, specifically trained on a set of emails. Leveraging the Retrieval-Augmented Generation (RAG) approach within the Python programming language, the chatbot will be capable of understanding user queries, retrieving relevant information from a corpus of email data, and generating contextually appropriate responses. The utilization of LLMs such as Llama2, Llama3, Mistral, and GPT (Generative Pretrained Transformer), combined with the RAG architecture, offers strong capabilities in natural language understanding and generation. Training the chatbot on a specific set of emails ensures that it is tailored to the domain-specific needs and queries encountered in real-world email communications. This approach enables the chatbot to provide accurate and relevant responses to user inquiries, thereby enhancing user experience and streamlining communication processes.
By harnessing the power of LLMs, the project aims to create a chatbot that can understand natural language queries, generate contextually relevant responses, and provide valuable assistance to users within the company. The project is motivated by the desire to leverage recent AI advancements to enhance user interactions and streamline communication processes. Understanding LLMs and NLP is essential for developing advanced AI systems, chatbots, language models, and applications that require robust natural language understanding and generation capabilities. These technologies are changing how computers interact with and process human language, enabling a wide range of innovative applications across industries and opening many learning opportunities.
II. LITERATURE REVIEW
[1] The review suggests that chatbots can be deployed widely because of their accuracy, independence from human staffing, and 24x7 availability. In recent years, advancements in technologies such as Artificial Intelligence (AI), Big Data, and the Internet of Things (IoT) have revolutionized various industries. Among these innovations, chatbots, or conversational AIs, have emerged as a significant application. Chatbots, powered by AI and Natural Language Processing (NLP), simulate human conversation, offering automation and efficiency across diverse domains like education, healthcare, and business. Through a review of existing literature, this study explores the types, advantages, and disadvantages of chatbots, highlighting their versatility, accuracy, and ability to operate continuously without reliance on human resources.
[2] The paper presents a college inquiry chatbot as a solution to the difficulty of locating specific information on the college website, especially for non-affiliated visitors. While GUI and web-based interfaces are mainstream, alternative interfaces occasionally emerge to address specific needs. Powered by AI and NLP algorithms, the developed chatbot handles queries related to various college activities, including the examination cell, admission, academics, attendance, placement, and more.
[3] The paper discusses how the challenges posed by the pandemic have made accessing health-care services increasingly difficult. To address this issue, a chatbot application
the content generation process by facilitating access to relevant information from external sources. Retrievers accomplish this by employing techniques such as semantic search and information retrieval to identify and retrieve pertinent information based on user queries. By accessing external knowledge sources, Retrievers enhance the comprehensiveness and accuracy of the generated responses.
E. User Interaction with Large Language Model (LLM)
User interaction with the Large Language Model (LLM) is facilitated through a dedicated web-based interface. Upon receiving a user query, the system transforms it into a query embedding that captures the semantic content of the inquiry. Using this embedding, the system conducts a semantic search to retrieve pertinent context, and the LLM then generates a response that addresses the user's query. Repeating this process for each query keeps the interaction contextually relevant, enhancing user satisfaction and system usability.
IV. IMPLEMENTATION AND RESULTS
Python's versatility, combined with its robust community support and cross-platform compatibility, has made it widely used for training Large Language Models (LLMs). Python 3.8 or higher is used for development in this project.
PyTorch serves as the primary deep learning framework for model development and training. LangChain is a framework for building applications around LLMs rather than a deep learning framework in its own right. It provides tools and utilities tailored to such applications, including document loaders, text splitters, retrievers, and chains that connect prompts to models. LangChain aims to simplify the development and deployment of LLM applications by offering high-level abstractions and pre-built components for common tasks.
Transformers are the architectural backbone of LLMs, enabling them to process and understand text at scale. The Hugging Face Transformers library is an open-source library developed by Hugging Face, a company specializing in natural language processing (NLP) technologies. It provides easy-to-use interfaces for working with transformer-based models, including both pre-trained models and tools for fine-tuning them on custom datasets. The library supports a wide range of transformer architectures, including BERT, GPT, RoBERTa, T5, and more.
Chainlit is an open-source Python library for building web-based chat interfaces for LLM applications with minimal effort. It is designed to make it easy for developers to build interactive web apps without requiring expertise in web development.
Utilizing these technologies, a chatbot has been developed that takes a PDF as input and answers the user's queries about it. The developed web application has the following features:
• The web application enables users to upload any PDF file they wish to query. The PDF data is parsed to extract the relevant content. This involves removing unnecessary elements such as headers, footers, and any other extraneous details.
• The pre-processed text is segmented into smaller units, or chunks, to facilitate efficient processing and analysis. This segmentation helps in managing large volumes of text data. Embeddings, which are vector representations of the text, are generated using libraries like sentence-transformers. These embeddings encode the semantic meaning of the text, making it suitable for retrieval and generation tasks.
• The generated embeddings are stored in vector store databases like FAISS. These databases serve as repositories for the embeddings, allowing quick and efficient retrieval based on semantic similarity.
• The embeddings in the vector store enable the Retrieval-Augmented Generation (RAG) system to retrieve relevant information, enhancing the contextuality and accuracy of the chatbot's responses. When a user query is received, it is converted into a query embedding, which is used to perform a semantic search in the vector store to retrieve relevant context.
• The project utilizes a pre-trained LLM, the Llama2-7B model. The model is obtained through the Hugging Face Transformers library, which provides tools for fine-tuning and deployment.
• The retrieved context, along with the user query, is fed into the LLM to generate coherent and contextually relevant responses. The system uses RAG to integrate external knowledge sources seamlessly.
• A user-friendly web-based interface is developed using the Chainlit framework. This interface allows users to interact with the chatbot in real time.
Fig. 2. User interface for the Chatbot
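The chunking, embedding, retrieval, and prompt-assembly steps listed above can be illustrated with a minimal sketch. This is not the project's code. To stay runnable with the Python standard library alone, it swaps in a bag-of-words count vector for the sentence-transformers embeddings and a linear cosine-similarity scan for the FAISS index. All names (`chunk_text`, `embed`, `ToyVectorStore`, `build_prompt`), the chunk sizes, and the prompt wording are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of words (sizes are illustrative)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def embed(text):
    """Toy embedding: lowercase word counts.

    Assumption: stands in for a real sentence-embedding model.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory store with a linear scan (stand-in for a vector database)."""

    def __init__(self):
        self.items = []  # list of (chunk, embedding) pairs

    def add(self, chunks):
        self.items.extend((chunk, embed(chunk)) for chunk in chunks)

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, context_chunks):
    """Combine retrieved context and the user query into one LLM prompt."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


if __name__ == "__main__":
    store = ToyVectorStore()
    store.add([
        "Invoices are due within thirty days.",
        "The office is closed on public holidays.",
        "Refunds are processed in five business days.",
    ])
    question = "When are invoices due?"
    print(build_prompt(question, store.search(question, k=1)))
```

In a real deployment the string returned by `build_prompt` would be passed to the LLM. Replacing the word-count vectors with dense embeddings and the linear scan with an approximate nearest-neighbour index changes the quality and speed of retrieval, but not the overall flow.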
The implementation of the RAG framework significantly improved the chatbot's ability to provide accurate and contextually relevant responses. This approach used Reinforcement Learning to minimize the number of LLM tokens required, reducing the overall computational cost. The web-based interface provided a seamless and interactive user experience. Users could query the chatbot and receive prompt responses, enhancing their overall interaction with the system.
Figure 2 shows the chatbot interface through which users interact with the LLM. The chatbot responds to each query using the data provided in the PDF files.
V. CONCLUSION
The chatbot is designed to engage in natural language conversations, providing intelligent responses to queries related to uploaded PDFs. The chatbot is expected to answer queries based on the PDF data. The responses are generated using the Retrieval Augmented Generation (RAG) approach.
In conclusion, the implementation of the chatbot using LLMs and the RAG framework demonstrated the potential of advanced NLP techniques in creating efficient and effective conversational agents. The project achieved significant improvements in response accuracy and efficiency by employing the RAG framework, which integrated external knowledge sources to enrich the chatbot's contextual understanding. The use of a policy-based model for optimizing LLM token usage demonstrated substantial cost savings while maintaining high response quality. The results of this project highlight the effectiveness of combining LLMs with retrieval mechanisms to create conversational agents capable of handling complex queries. The chatbot not only automated routine query responses but also provided a scalable solution for future expansion and enhancement. The implementation sets a foundation for future research and development in the field of AI-driven conversational systems, paving the way for more sophisticated and efficient automated support solutions.
REFERENCES
[1] S. Meshram, N. Naik, M. VR, T. More and S. Kharche, "Conversational AI: Chatbots," 2021 International Conference on Intelligent Technologies (CONIT), Hubli, India, 2021, pp. 1-6, doi: 10.1109/CONIT51480.2021.9498508.
[2] Lalwani, Tarun and Bhalotia, Shashank and Pal, Ashish and Rathod, Vasundhara and Bisen, Shreya, Implementation of a Chatbot System using AI and NLP (May 31, 2018). International Journal of Innovative Research in Computer Science & Technology (IJIRCST) Volume-6, Issue-3, May-2018.
[3] Bal, Sauvik & Jash, Kiran & Mandal, Lopa. (2024). An Implementation of Machine Learning-Based Healthcare Chatbot for Disease Prediction (MIBOT). 10.1007/978-981-99-6866-4-32.
[4] Kulkarni, Mandar, Praveen Tangarajan, Kyung Kim, and Anusua Trivedi. "Reinforcement learning for optimizing RAG for domain chatbots." arXiv preprint arXiv:2401.06800 (2024).
[5] C. Jeong, "Generative AI service implementation using LLM application architecture: based on RAG model and LangChain framework," Journal of Intelligence and Information Systems, vol. 29, no. 4, pp. 129-164, Dec. 2023.
[6] Jeong, Cheonsu. (2023). A Study on the Implementation of Generative AI Services Using an Enterprise Data-Based LLM Application Architecture. Advances in Artificial Intelligence and Machine Learning. 3. 1588-1618. 10.54364/AAIML.2023.1191.
[7] Afzal, Anum & Kowsik, Alexander & Fani, Rajna & Matthes, Florian. (2024). Towards Optimizing and Evaluating a Retrieval Augmented QA Chatbot using LLMs with Human-in-the-Loop.
[8] Bacciu, A.; Cocunasu, F.; Siciliano, F.; Silvestri, F.; Tonellotto, N.; and Trappolini, G. 2023. RRAML: Reinforced Retrieval Augmented Machine Learning.
[9] Chen, Jiawei & Lin, Hongyu & Han, Xianpei & Sun, Le. (2024). Benchmarking Large Language Models in Retrieval-Augmented Generation. Proceedings of the AAAI Conference on Artificial Intelligence. 38. 17754-17762. 10.1609/aaai.v38i16.29728.
[10] Li, Xianzhi & Chan, Samuel & Zhu, Xiaodan & Pei, Yulong & Ma, Zhiqiang & Liu, Xiaomo & Shah, Sameena. (2023). Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? A Study on Several Typical Tasks. 408-422. 10.18653/v1/2023.emnlp-industry.39.
[11] Zhihan Lv, Generative artificial intelligence in the metaverse era, Cognitive Robotics, Volume 3, 2023, Pages 208-217, ISSN 2667-2413, https://doi.org/10.1016/j.cogr.2023.06.001. (https://www.sciencedirect.com/science/article/pii/S2667241323000198)
[12] Zant, Tijn & Kouw, Matthijs & Schomaker, Lambert. (2012). Generative Artificial Intelligence. 10.1007/978-3-642-31674-6-8.
[13] Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. "Attention is all you need." Advances in neural information processing systems 30 (2017).
[14] Rackauckas, Zackary. "RAG-Fusion: a New Take on Retrieval-Augmented Generation." arXiv preprint arXiv:2402.03367 (2024).
[15] Khan, Salman, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, and Mubarak Shah. "Transformers in vision: A survey." ACM computing surveys (CSUR) 54, no. 10s (2022): 1-41.
[16] Goldman, Sharon M. "Transformers." Journal of Consumer Marketing 27, no. 5 (2010): 469-473.
[17] Alan, Ahmet Yusuf, Enis Karaarslan, and Omer Aydin. "A RAG-based Question Answering System Proposal for Understanding Islam: MufassirQAS LLM." arXiv preprint arXiv:2401.15378 (2024).
[18] Feuerriegel, Stefan & Hartmann, Jochen & Janiesch, Christian & Zschech, Patrick. (2023). Generative AI.
[19] Quidwai, Mujahid Ali, and Alessandro Lagana. "A RAG Chatbot for Precision Medicine of Multiple Myeloma." medRxiv (2024): 2024-03.
[20] Brașoveanu, Adrian MP, and Răzvan Andonie. "Visualizing transformers for nlp: a brief survey." In 2020 24th International Conference Information Visualisation (IV), pp. 270-279. IEEE, 2020.
[21] Wolf, Thomas, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac et al. "Huggingface's transformers: State-of-the-art natural language processing." arXiv preprint arXiv:1910.03771 (2019).
[22] Fill, Hans-Georg & Fettke, Peter & Köpke, Julius. (2023). Conceptual Modeling and Large Language Models: Impressions From First Experiments With ChatGPT. Enterprise Modelling and Information Systems Architectures. 18. 1-15. 10.18417/emisa.18.3.
[23] Gao, Y., Xiong, Y., Gao, X., Jia, K., Pan, J., Bi, Y., Dai, Y., Sun, J. and Wang, H., 2023. Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997.
[24] Li, J., Yuan, Y. and Zhang, Z., 2024. Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases. arXiv preprint arXiv:2403.10446.