International Journal of Scientific Research in Engineering and Management (IJSREM)

Volume: 08 Issue: 06 | June - 2024 SJIF Rating: 8.448 ISSN: 2582-3930

RAG based Chatbot using LLMs


Ananya G Dr. Vanishree K
Department of ISE Department of ISE
R V College of Engineering R V College of Engineering
Bengaluru, India Bengaluru, India

Abstract—Historically, Artificial Intelligence (AI) was used to understand and recommend information. Now, Generative AI can also help us create new content. Generative AI builds on existing technologies, like Large Language Models (LLMs), which are trained on large amounts of text and learn to predict the next word in a sentence. Generative AI can create not only new text, but also images, videos, or audio. This project focuses on the implementation of a chatbot based on the concepts of Generative AI and Large Language Models which can answer queries regarding the content provided in PDFs. The primary technologies utilized include Python libraries like LangChain, PyTorch for model training, and Hugging Face’s Transformers library for accessing pre-trained models like Llama2 and GPT-3.5 (Generative Pre-trained Transformer) architectures. The responses are generated using the Retrieval Augmented Generation (RAG) approach. The project aims to develop a chatbot which can generate sensible responses from data in the form of PDF files. The project demonstrates the capabilities and applications of advanced Natural Language Processing (NLP) techniques in creating conversational agents that can be deployed across various platforms in the corporation to enhance user interaction and support automated tasks.

Index Terms—Generative AI, Artificial Intelligence, Natural Language Processing, Large Language Model, Llama2, Transformers, Document Loaders, Retrieval Augmented Generation, Vector Database, LangChain, Chainlit

I. INTRODUCTION
In today’s digital age, the demand for intelligent conversational agents, known as chatbots, has surged dramatically. These chatbots, powered by technologies such as Large Language Models (LLMs) and advanced Natural Language Processing (NLP) techniques, have changed how businesses and organizations interact with their customers and users. In line with this trend, the project aims to develop a chatbot utilizing LLMs and related technologies, specifically trained on a set of emails. Leveraging the Retrieval-Augmented Generation (RAG) approach within the Python programming language, the chatbot will be capable of understanding user queries, retrieving relevant information from a corpus of email data, and generating contextually appropriate responses. The utilization of LLMs such as Llama2, Llama3, Mistral, and GPT (Generative Pretrained Transformer), combined with the RAG architecture, offers strong capabilities in natural language understanding and generation. Training the chatbot on a specific set of emails tailors it to the domain-specific needs and queries encountered in real-world email communications. This approach enables the chatbot to provide accurate and relevant responses to user inquiries, thereby enhancing user experience and streamlining communication processes.

By harnessing the power of LLMs, the project aims to create a chatbot that can understand natural language queries, generate contextually relevant responses, and provide valuable assistance to users within the company. The project idea stems from the desire to leverage recent AI advancements to enhance user interactions and streamline communication processes. Understanding LLMs and NLP is essential for developing advanced AI systems, chatbots, language models, and applications that require robust natural language understanding and generation capabilities. These technologies are changing how computers interact with and process human language, enabling a wide range of innovative applications across industries and opening a wide range of learning opportunities.

II. LITERATURE REVIEW
[1] The review suggested that chatbots can be used everywhere because of their accuracy, lack of dependence on human resources, and 24x7 accessibility. In recent years, advancements in technologies such as Artificial Intelligence (AI), Big Data, and Internet of Things (IoT) have revolutionized various industries. Among these innovations, chatbots, or conversational AIs, have emerged as a significant application. Chatbots, powered by AI and Natural Language Processing (NLP), simulate human conversation, offering automation and efficiency across diverse domains like education, healthcare, and business. Through a review of existing literature, this study explores the types, advantages, and disadvantages of chatbots, highlighting their versatility, accuracy, and ability to operate continuously without reliance on human resources.
[2] The paper presents a college inquiry chatbot as a solution to the challenges of locating specific information on the college website, especially for non-affiliated visitors. While GUI and web-based interfaces are mainstream, alternative interfaces occasionally emerge to address specific needs. Powered by AI and NLP algorithms, the developed chatbot intelligently handles queries related to various college activities, including examination cell, admission, academics, attendance, placement, and more.
[3] The paper notes that, because of the challenges posed by the pandemic, accessing health-care services has become increasingly difficult. To address this issue, a chatbot application

© 2024, IJSREM | www.ijsrem.com DOI: 10.55041/IJSREM35600 | Page 1


leveraging Natural Language Processing (NLP) and machine learning concepts is proposed. This chatbot system, developed using supervised machine learning, aims to provide disease diagnosis and treatment recommendations, with detailed descriptions of various illnesses, before the user consults a doctor. The application features a GUI-based text assistant for user-friendly interaction, allowing users to input symptoms and risk factors for their condition. The chatbot then offers personalized suggestions, including analgesics and advice on when to seek in-person medical attention.
[4] This paper introduces a Retrieval Augmented Generation (RAG) approach for constructing a chatbot that addresses user queries using Frequently Asked Questions (FAQ) data. Leveraging Large Language Models (LLMs), particularly a paid ChatGPT model, the system utilizes contextual question answering capabilities acquired through training. The paper outlines the training of an in-house retrieval embedding model using infoNCE loss, showcasing its superior performance over a general-purpose public embedding model in terms of retrieval accuracy and Out-of-Domain (OOD) query detection. Furthermore, the paper explores the optimization of LLM token usage and associated costs using Reinforcement Learning (RL), proposing a policy-based model external to the RAG pipeline. This model interacts with the pipeline through policy actions, updating policies to optimize costs.
[5] This study addresses the challenge of integrating Large Language Models (LLMs) into corporate environments where internal data utilization is limited. It proposes a method for implementing generative AI services using LLMs within the LangChain framework. The study explores various strategies to leverage LLMs, focusing on fine-tuning and direct use of document information. It details information storage and retrieval methods, employing the Retrieval Augmented Generation (RAG) model for context recommendation and Question-Answering (QA) systems. By enhancing understanding of generative AI technology, the study enables active utilization of LLMs in corporate service implementation, offering valuable insights for practical applications.

III. METHODOLOGY
The methodology of the project involves preprocessing PDF data, segmenting text, and generating embeddings for semantic understanding. Within Retrieval Augmented Generation (RAG), Retrievers bridge generative models and external knowledge sources. Users interact with the LLM through a web application which integrates with the database and the generative model.

Fig. 1. Methodology for the proposed system

A. Preprocessing of PDF Data
The data is provided to the model in the form of PDF files. PDF documents contain text data that needs to be extracted for analysis. Text extraction techniques are used to convert the textual content of PDFs into a format that can be processed by the chatbot. Preprocessing steps may be applied to the extracted text to clean and standardize it. This can involve removing irrelevant content, such as headers, footers, and non-text elements, as well as handling special characters and formatting issues.

B. Text Segmentation and Embedding Generation
The extracted text undergoes segmentation into smaller units, enhancing the efficiency of subsequent processing and analysis. This segmentation divides the text into manageable chunks, enabling the system to focus on specific aspects of the information. Following segmentation, the system generates embeddings for the segmented text. Embeddings are numerical representations of text that capture the semantic meaning of the information. By encoding the underlying context and relationships within the text, embeddings enable the system to interpret and understand the content more effectively. This transformation of textual data into numerical vectors facilitates various downstream tasks, such as semantic search and contextually enriched content generation.

C. Creation of Vector Store Databases
Vector Store Databases serve as foundational components of the chatbot system, enabling efficient storage and retrieval of textual embeddings. These databases store the numerical representations of text, known as embeddings, which encapsulate the semantic meaning of the information. The embeddings stored in Vector Store Databases enable Retrieval-Augmented Generation (RAG) systems to retrieve and integrate relevant information into the generated outputs. RAG systems leverage the complementary strengths of retrieval-based and generation-based approaches to produce more contextually accurate and informative responses compared to traditional generation models.

D. Retrievers in RAG Framework
In the RAG framework, Retrievers serve as essential components that bridge the gap between the generative model and external knowledge sources. Their role is pivotal in enriching
the content generation process by facilitating access to relevant information from external sources. Retrievers accomplish this by employing techniques such as semantic search and information retrieval to identify and retrieve pertinent information based on user queries. By accessing external knowledge sources, Retrievers enhance the comprehensiveness and accuracy of the generated responses.

E. User Interaction with Large Language Model (LLM)
User interaction with the Large Language Model (LLM) is facilitated through a dedicated web-based interface. Upon receiving user queries, the system transforms them into query embeddings that encapsulate the semantic essence of the inquiries. Using these embeddings, the system conducts semantic searches to retrieve pertinent context, and the LLM then crafts responses that address the users’ queries. Through this iterative process, the system ensures effective and contextually relevant interactions, enhancing user satisfaction and system usability.

IV. IMPLEMENTATION AND RESULTS
Python’s versatility, combined with its robust community support and cross-platform compatibility, has made it widely used in training Large Language Models (LLMs). Python 3.x (Python 3.8 or higher) is used for development in this project.

PyTorch is used as the primary deep learning framework for model development and training. LangChain, used alongside it, is a framework for building applications around language models rather than a deep learning framework in its own right. It provides tools and utilities tailored for such applications, including document loaders, text splitters, vector store integrations, retrievers, and prompt and chain abstractions. LangChain aims to simplify the development and deployment of LLM-based applications by offering high-level abstractions and pre-built components for common tasks.

Transformers are the architectural backbone that powers LLMs, enabling them to process and understand text at scale. The Hugging Face Transformers library is an open-source library developed by Hugging Face, a company specializing in natural language processing (NLP) technologies. It provides easy-to-use interfaces for working with transformer-based models, including both pre-trained models and tools for fine-tuning them on custom datasets. The library supports a wide range of transformer architectures, including BERT, GPT, RoBERTa, T5, and more.

Chainlit is an open-source Python library for creating web-based conversational interfaces for machine learning and LLM projects with minimal effort. It is designed to make it easy for developers to build interactive web apps without requiring expertise in web development.

Utilizing the mentioned technologies, a chatbot has been developed which takes PDFs as input and answers queries asked by the user. The features of the developed web application are as follows:
• The web application enables users to upload any PDF file they wish to query. The PDF data undergoes parsing to extract the relevant content. This involves removing unnecessary elements such as headers, footers, and any other extraneous details.
• The pre-processed text is segmented into smaller units or chunks to facilitate efficient processing and analysis. This segmentation helps in managing large volumes of text data. Embeddings, which are vector representations of the text, are generated using libraries like sentence-transformers. These embeddings encode the semantic meaning of the text, making it suitable for retrieval and generation tasks.
• The generated embeddings are stored in vector store databases like FAISS. These databases serve as repositories for the embeddings, allowing quick and efficient retrieval based on semantic similarity.
• The embeddings in the vector store enable the Retrieval-Augmented Generation (RAG) system to retrieve relevant information, enhancing the contextuality and accuracy of the chatbot’s responses. When a user query is received, it is converted into query embeddings, which are used to perform a semantic search in the vector store to retrieve relevant context.
• The project utilizes a pre-trained LLM, the Llama2-7B model. The model is obtained through the Hugging Face Transformers library, which provides tools for fine-tuning and deployment.
• The retrieved context, along with the user query, is fed into the LLM to generate coherent and contextually relevant responses. The system uses RAG to integrate external knowledge sources seamlessly.
• A user-friendly web-based interface is developed using the Chainlit framework. This interface allows users to interact with the chatbot in real time.

Fig. 2. User interface for the Chatbot
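
The segmentation, embedding, storage, and retrieval steps described for the web application can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: a bag-of-words count vector stands in for a sentence-transformers embedding, a plain list stands in for a FAISS index, and the names `chunk_text`, `embed`, `cosine`, and `retrieve`, along with the sample text and the chunk size and overlap values, are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (assumed stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, store, k=2):
    """Return the k chunks whose vectors are most similar to the query vector."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


# Hypothetical sample text standing in for content extracted from a PDF.
document = (
    "Employees may work remotely up to three days per week. "
    "Requests for annual leave must be submitted two weeks in advance. "
    "The cafeteria is open from eight in the morning until four in the afternoon."
)
chunks = chunk_text(document, chunk_size=12, overlap=2)
store = [(chunk, embed(chunk)) for chunk in chunks]  # list used as a toy vector store
top = retrieve("When is the cafeteria open?", store, k=1)
print(top[0])
```

The overlap between consecutive chunks is a common design choice so that a sentence cut at a chunk boundary still appears largely intact in at least one chunk. In a real deployment the toy `embed` function and the list-based store would be replaced by an embedding model and a vector index.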

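Once relevant context has been retrieved, it is combined with the user query before being passed to the LLM. A minimal sketch of that prompt-assembly step follows. The template wording, the `build_prompt` name, and the character budget are assumptions made for illustration, and the call to the model itself is deliberately left out.

```python
def build_prompt(query, contexts, max_context_chars=1500):
    """Combine retrieved chunks and the user query into a single prompt string."""
    selected = []
    used = 0
    for i, chunk in enumerate(contexts, start=1):
        entry = f"[{i}] {chunk.strip()}"
        if used + len(entry) > max_context_chars:
            break  # stop before exceeding the assumed context budget
        selected.append(entry)
        used += len(entry)
    context_block = "\n".join(selected) if selected else "(no context retrieved)"
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query.strip()}\n"
        "Answer:"
    )


# Hypothetical query and retrieved chunk, for illustration only.
prompt = build_prompt(
    "When is the cafeteria open?",
    ["The cafeteria is open from eight in the morning until four in the afternoon."],
)
print(prompt)
```

Instructing the model to rely only on the supplied context is one common way to keep answers grounded in the uploaded documents; the exact wording would need tuning for the model in use.
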
The implementation of the RAG framework significantly improved the chatbot’s ability to provide accurate and contextually relevant responses. This approach used Reinforcement Learning to minimize the number of LLM tokens required, reducing the overall computational cost. The web-based interface provided a seamless and interactive user experience. Users could query the chatbot and receive prompt responses, enhancing their overall interaction with the system.

Figure 2 shows the chatbot interface through which users can interact with the LLM. The chatbot responds to each query using the data provided in the PDF files.

V. CONCLUSION
The chatbot is designed to engage in natural language conversations, providing intelligent responses to queries related to uploaded PDFs. The chatbot is expected to answer queries based on the PDF data. The responses are generated using the Retrieval Augmented Generation (RAG) approach.

In conclusion, the implementation of the chatbot using LLMs and the RAG framework demonstrated the potential of advanced NLP techniques in creating efficient and effective conversational agents. The project achieved significant improvements in response accuracy and efficiency by employing the RAG framework, which integrated external knowledge sources to enrich the chatbot’s contextual understanding. The use of a policy-based model for optimizing LLM token usage demonstrated substantial cost savings while maintaining high response quality. The results of this project highlight the effectiveness of combining LLMs with retrieval mechanisms to create sophisticated conversational agents capable of handling complex queries. The chatbot not only automated routine query responses but also provided a scalable solution for future expansion and enhancement. The implementation sets a foundation for future research and development in the field of AI-driven conversational systems, paving the way for more sophisticated and efficient automated support solutions.

REFERENCES
[1] S. Meshram, N. Naik, M. VR, T. More and S. Kharche, "Conversational AI: Chatbots," 2021 International Conference on Intelligent Technologies (CONIT), Hubli, India, 2021, pp. 1-6, doi: 10.1109/CONIT51480.2021.9498508.
[2] Lalwani, Tarun and Bhalotia, Shashank and Pal, Ashish and Rathod, Vasundhara and Bisen, Shreya, Implementation of a Chatbot System using AI and NLP (May 31, 2018). International Journal of Innovative Research in Computer Science & Technology (IJIRCST) Volume-6, Issue-3, May-2018.
[3] Bal, Sauvik & Jash, Kiran & Mandal, Lopa. (2024). An Implementation of Machine Learning-Based Healthcare Chabot for Disease Prediction (MIBOT). 10.1007/978-981-99-6866-4-32.
[4] Kulkarni, Mandar, Praveen Tangarajan, Kyung Kim, and Anusua Trivedi. "Reinforcement learning for optimizing rag for domain chatbots." arXiv preprint arXiv:2401.06800 (2024).
[5] C. Jeong, "Generative AI service implementation using LLM application architecture: based on RAG model and LangChain framework," Journal of Intelligence and Information Systems, vol. 29, no. 4, pp. 129–164, Dec. 2023.
[6] Jeong, Cheonsu. (2023). A Study on the Implementation of Generative AI Services Using an Enterprise Data-Based LLM Application Architecture. Advances in Artificial Intelligence and Machine Learning. 3. 1588-1618. 10.54364/AAIML.2023.1191.
[7] Afzal, Anum & Kowsik, Alexander & Fani, Rajna & Matthes, Florian. (2024). Towards Optimizing and Evaluating a Retrieval Augmented QA Chatbot using LLMs with Human-in-the-Loop.
[8] Bacciu, A.; Cocunasu, F.; Siciliano, F.; Silvestri, F.; Tonellotto, N.; and Trappolini, G. 2023. RRAML: Reinforced Retrieval Augmented Machine Learning.
[9] Chen, Jiawei & Lin, Hongyu & Han, Xianpei & Sun, Le. (2024). Benchmarking Large Language Models in Retrieval-Augmented Generation. Proceedings of the AAAI Conference on Artificial Intelligence. 38. 17754-17762. 10.1609/aaai.v38i16.29728.
[10] Li, Xianzhi & Chan, Samuel & Zhu, Xiaodan & Pei, Yulong & Ma, Zhiqiang & Liu, Xiaomo & Shah, Sameena. (2023). Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? A Study on Several Typical Tasks. 408-422. 10.18653/v1/2023.emnlp-industry.39.
[11] Zhihan Lv, Generative artificial intelligence in the metaverse era, Cognitive Robotics, Volume 3, 2023, Pages 208-217, ISSN 2667-2413, https://doi.org/10.1016/j.cogr.2023.06.001. (https://www.sciencedirect.com/science/article/pii/S2667241323000198)
[12] Zant, Tijn & Kouw, Matthijs & Schomaker, Lambert. (2012). Generative Artificial Intelligence. 10.1007/978-3-642-31674-6-8.
[13] Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. "Attention is all you need." Advances in neural information processing systems 30 (2017).
[14] Rackauckas, Zackary. "RAG-Fusion: a New Take on Retrieval-Augmented Generation." arXiv preprint arXiv:2402.03367 (2024).
[15] Khan, Salman, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, and Mubarak Shah. "Transformers in vision: A survey." ACM computing surveys (CSUR) 54, no. 10s (2022): 1-41.
[16] Goldman, Sharon M. "Transformers." Journal of Consumer Marketing 27, no. 5 (2010): 469-473.
[17] Alan, Ahmet Yusuf, Enis Karaarslan, and Omer Aydin. "A RAG-based Question Answering System Proposal for Understanding Islam: MufassirQAS LLM." arXiv preprint arXiv:2401.15378 (2024).
[18] Feuerriegel, Stefan & Hartmann, Jochen & Janiesch, Christian & Zschech, Patrick. (2023). Generative AI.
[19] Quidwai, Mujahid Ali, and Alessandro Lagana. "A RAG Chatbot for Precision Medicine of Multiple Myeloma." medRxiv (2024): 2024-03.
[20] Brașoveanu, Adrian MP, and Răzvan Andonie. "Visualizing transformers for nlp: a brief survey." In 2020 24th International Conference Information Visualisation (IV), pp. 270-279. IEEE, 2020.
[21] Wolf, Thomas, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac et al. "Huggingface’s transformers: State-of-the-art natural language processing." arXiv preprint arXiv:1910.03771 (2019).
[22] Fill, Hans-Georg & Fettke, Peter & Köpke, Julius. (2023). Conceptual Modeling and Large Language Models: Impressions From First Experiments With ChatGPT. Enterprise Modelling and Information Systems Architectures. 18. 1-15. 10.18417/emisa.18.3.
[23] Gao, Y., Xiong, Y., Gao, X., Jia, K., Pan, J., Bi, Y., Dai, Y., Sun, J. and Wang, H., 2023. Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997.
[24] Li, J., Yuan, Y. and Zhang, Z., 2024. Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases. arXiv preprint arXiv:2403.10446.