Conversational Text Extraction With Large Language Models Using Retrieval-Augmented Systems


Conference Paper · October 2024


All content following this page was uploaded by Soham Roy on 26 October 2024.


Conversational Text Extraction with Large Language Models Using Retrieval-Augmented Systems

Soham Roy
School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, India

Mitul Goswami
School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, India

Nisharg Nargund
School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, India

Suneeta Mohanty* (Corresponding Author)
School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, India

Prasant Kumar Pattnaik
School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, India

Abstract: This study introduces a system leveraging Large Language Models (LLMs) to extract text and enhance user interaction with PDF documents via a conversational interface. Utilizing Retrieval-Augmented Generation (RAG), the system provides informative responses to user inquiries while highlighting relevant passages within the PDF. Upon user upload, the system processes the PDF, employing sentence embeddings to create a document-specific vector store. This vector store enables efficient retrieval of pertinent sections in response to user queries. The LLM then engages in a conversational exchange, using the retrieved information to extract text and generate comprehensive, contextually aware answers. While our approach demonstrates competitive ROUGE values compared to existing state-of-the-art techniques for text extraction and summarization, we acknowledge that further qualitative evaluation is necessary to fully assess its effectiveness in real-world applications. The proposed system thus offers a valuable tool for researchers, students, and anyone seeking to efficiently extract knowledge and gain insights from documents through an intuitive question-answering interface.

Keywords: Large Language Model (LLM), Retrieval Augmented Generation, Embeddings, Text Extraction, ROUGE

I. INTRODUCTION
The ever-growing volume of digital documents, particularly PDFs, presents a significant challenge: efficiently extracting knowledge from their text-heavy content. Over the years, various tools and techniques have been developed to address this issue, from basic keyword search functionalities to more advanced text mining and natural language processing (NLP) algorithms [1]. Despite these advancements, many solutions still fall short of providing contextually relevant information quickly and accurately. The evolution of artificial intelligence and machine learning, particularly in the form of large language models, has revolutionized this process, enabling more sophisticated and efficient extraction of knowledge from vast repositories of digital documents [2].

Large language models (LLMs) have undergone significant evolution, transforming the landscape of natural language processing (NLP) and information retrieval. While the integration of Retrieval-Augmented Generation (RAG) with LLMs is noteworthy, it is essential to recognize that similar frameworks have been explored in existing literature. This paper aims to build upon these studies by providing a tailored application for document interaction. The advent of machine learning, particularly deep learning, marked a significant leap forward, with models like Word2Vec and GloVe introducing word embeddings that captured semantic relationships between words [3]. Furthermore, Transformers utilize self-attention mechanisms to process and understand text in parallel, rather than sequentially, enabling them to capture long-range dependencies and contextual information more effectively. BERT, with its bidirectional approach, improved the understanding of context within a text, while GPT, with its autoregressive nature, excelled in text generation [4][5]. However, while the use of these models has become widespread, the integration of retrieval-augmented generation for targeted PDF interaction remains under-explored. This work addresses that gap by combining large language models with document retrieval in a conversational interface, providing a more tailored application in the domain of document interaction.

These advancements in LLMs have significantly enhanced text extraction and data retrieval capabilities. This capability is particularly useful for handling the ever-growing volume of digital documents, enabling efficient knowledge extraction and insight generation [6].

Building on the advancements in LLMs, Retrieval-Augmented Generation (RAG) systems enhance the capability of these models by integrating a retrieval mechanism. RAG combines information retrieval and generative processes to produce highly accurate and contextually relevant responses [7]. In a RAG framework, the system first retrieves relevant passages from a large corpus of documents based on the user's query [8][9]. A generative model then conditions on the retrieved passages when producing its answer. This combination allows the model to generate responses that are both informed by a broad understanding of language and enriched with precise, relevant details from the retrieved content. This approach significantly improves the model's ability to handle complex queries and extract pertinent information from large datasets, making it an invaluable tool for efficient and accurate knowledge extraction. In this study, the authors
introduce a novel approach for text extraction that leverages an LLM system to enhance user interaction with documents via a conversational interface.

II. RELATED WORK
Recent advancements in document understanding and information extraction have been driven by deep learning techniques. Traditional keyword matching and rule-based methods often struggle with complex documents, while deep learning models provide more robust solutions capable of accurately handling intricate structures. For instance, M. Li et al. introduced the "BiomedRAG" model, which supervises retrieval in the biomedical domain through varied chunk database creation, enhancing prediction accuracy [10]. Similarly, C. Zakka et al. developed the "Almanac" framework, which retrieves medical guidelines and treatment advice, outperforming typical LLMs in factuality, completeness, user preference, and safety [11]. Additionally, K. Yang et al. introduced "LeanDojo," a RAG-based LLM that streamlines theorem proving with comprehensive toolkits and data [12]. P. Lewis et al. explored a fine-tuning recipe for RAG models, leveraging pre-trained parametric and non-parametric memory to improve language generation [13].

Z. Feng et al. proposed an iterative retrieval-generation collaborative framework that not only allows for the use of both parametric and non-parametric knowledge but also aids in the discovery of the correct reasoning path via retrieval-generation interactions, which is critical for tasks that require multi-step reasoning [14]. J. Miao et al. demonstrated the development of a specialized ChatGPT model connected with a RAG system that is intended to meet the KDIGO 2023 criteria for chronic kidney disease [15]. In a different domain, T. Hao et al. demonstrated the efficacy of leveraging attention processes in neural networks to focus on key areas of material for better question answering in language models [16]. Similarly, Y. Zhang et al. suggested a Multi-Modal Knowledge-aware Hierarchical Attention Network (MKHAN) to efficiently leverage a multi-modal knowledge graph (MKG) for explainable medical question answering [17]. However, these approaches are often tailored to specific use cases, lacking the generalizability required for broader document interaction tasks.

Our work builds upon these advancements by presenting a RAG-inspired system for the interactive exploration of user-uploaded PDF documents. We employ sentence embeddings to ensure efficient retrieval of relevant content. By integrating this context into the response generation process of the large language model (LLM), we create a more tailored and contextually rich user experience. This approach allows users to engage in focused conversations that explore the specific content and nuances of the uploaded PDFs, thereby enhancing the effectiveness and relevance of information retrieval and dialogue within the system.

III. RETRIEVAL AUGMENTED GENERATION
The Retrieval-Augmented Generation (RAG) architecture integrates information retrieval (IR) and generative modeling techniques, and is designed to enhance the precision and relevance of generated responses in natural language processing tasks. The RAG process commences with a retrieval component that sifts through a vast corpus of documents to pinpoint relevant passages aligned with the user's query. Traditional IR techniques like TF-IDF and BM25 evaluate the statistical relevance of terms across documents, prioritizing those that closely match the query [18]. Fig. 1 demonstrates the detailed RAG architecture. Advanced methods such as neural retrievers, exemplified by Dense Passage Retrieval (DPR), employ deep learning models to encode documents into dense embeddings, capturing semantic relationships and enhancing contextual understanding.

Fig. 1. RAG Architecture

Once relevant passages are identified, they undergo encoding into document embeddings. These embeddings encapsulate the semantic meaning and context of the retrieved text, employing techniques like sentence embeddings from models such as the Universal Sentence Encoder or BERT [19]. These embeddings serve as enriched inputs to the subsequent stage of the RAG architecture. Integration with a generative model, typically an LLM such as GPT, marks the next critical phase. The generative model utilizes the contextual information embedded in the document embeddings to produce responses that are not only grammatically accurate and fluent but also contextually aligned with the user's query [20]. By integrating detailed context from the embeddings, the generative model ensures that its responses are informed by both the broad linguistic knowledge it has learned during training and the specific details extracted from the retrieved passages. To further refine performance, the RAG architecture often involves fine-tuning the generative model on task-specific datasets [21].

IV. PROPOSED METHODOLOGY

A. Data Chunking
The integration of the PyPDF2 library enables efficient text extraction and management of PDF documents within the model. Initially, a PdfReader object is created to represent the entire PDF, facilitating seamless interaction with its content [22].
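The overlapping chunking used in this subsection, in which chunk n starts at character index (n - 1) × (chunk_size - chunk_overlap) and ends chunk_size characters later, can be illustrated with a minimal plain-Python sketch. The function name `chunk_text` and the dictionary keys are hypothetical and chosen only for illustration; the paper does not show the authors' splitting code, so this is a sketch under those assumptions rather than their implementation. The final chunk is clipped to the end of the text.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper: chunk n (1-based) starts at
    (n - 1) * (chunk_size - chunk_overlap) and ends chunk_size
    characters later, clipped to the length of the text.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need 0 <= chunk_overlap < chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    n = 1
    while True:
        start = (n - 1) * step           # starting index of chunk n
        if start >= len(text):
            break
        end = start + chunk_size         # ending index of chunk n (exclusive)
        chunks.append({
            "n": n,
            "start": start,
            "end": min(end, len(text)),
            "text": text[start:end],     # slicing clips at the end of the text
        })
        if end >= len(text):             # the text is fully covered
            break
        n += 1
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1):
        print(chunk)
```

With chunk_size = 4 and chunk_overlap = 1, consecutive chunks share exactly one character, which keeps a sentence that straddles a boundary partly visible in both chunks.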
Fig. 2. Workflow of the Proposed Model

To extract the raw text, the model iteratively traverses each page using a loop, employing the extract_text() method. The extracted texts are then consolidated into a single string variable, pdf_text, which captures the entire textual content of the PDF. Given the potentially large size of pdf_text, the model implements text chunking to improve computational efficiency. Equation (1) outlines the mathematical process for text chunking, where n is the position (index) of a chunk, chunk_size indicates the size of each chunk, and chunk_overlap defines the overlap between consecutive chunks.

$$ i_n = (n - 1) \times (\text{chunk\_size} - \text{chunk\_overlap}) \tag{1} $$

Equation (1) calculates the starting index $i_n$ for each chunk based on its position n, the chunk size, and the overlap, ensuring that each chunk overlaps with the previous one by a specified number of characters. To determine the ending index for each chunk, Equation (2) provides a formula where $j_n$ indicates the ending index of chunk n.

$$ j_n = i_n + \text{chunk\_size} \tag{2} $$

This approach allows for the systematic division of large text into smaller segments, facilitating easier processing and analysis in natural language processing tasks such as information retrieval, text summarization, and machine translation. Each chunk is associated with metadata to enrich the context and facilitate easier retrieval of specific text segments. Metadata is organized as a list of dictionaries, with each dictionary corresponding to a chunk in the text list. Typically, metadata includes a key-value pair where the key denotes the origin or source identifier of the chunk within the PDF (one part of the identifier signifies the page number, and "pl" denotes paragraph level). This approach allows for precise tracking and retrieval of information within the PDF document, enhancing the model's capability to handle and manipulate textual content effectively in various applications and user interactions.

B. Vector Embeddings For Efficient Retrieval
In preparing text for efficient retrieval, the model utilizes sentence embedding techniques to convert text chunks into numerical representations. This step is crucial for enabling fast and accurate retrieval of semantically similar sentences or passages from a document. Sentence embedding techniques are designed to map sentences from their original high-dimensional textual space into a lower-dimensional vector space. This transformation allows for efficient comparison and retrieval of sentences based on their semantic content. A widely used approach for generating these embeddings is employing pre-trained sentence transformers. In the model, the specific embedding used is "sentence-transformers/all-MiniLM-L6-v2". Such models are trained on extensive text corpora and have learned to encapsulate the semantic essence of sentences within vector representations [23]. The sentence embedding function is defined in Equation (3), where S is a sentence composed of a sequence of words, W is the embedding matrix for the vocabulary, and f is the sentence embedding function that maps a sentence S to a vector $s \in \mathbb{R}^d$.

$$ s = f(S) \tag{3} $$

This function f often involves multiple steps, including word embeddings, contextual embeddings using transformer models, and aggregation methods. Each word $\omega_i$ in the sentence S is mapped to a vector $w_i$ using the embedding matrix W in Equation (4), where $W[\omega_i]$ denotes the row of W corresponding to $\omega_i$.

$$ w_i = W[\omega_i] \tag{4} $$

To obtain a single fixed-dimensional vector representing the entire sentence, an aggregation method using mean pooling is applied to the contextual embeddings. Equation (5) computes the average of the contextual embeddings $h_i$ of all n words in the sentence, resulting in the final sentence embedding.

$$ s = \frac{1}{n} \sum_{i=1}^{n} h_i \tag{5} $$

The vector s in Equation (5) is the sentence embedding, which captures the meaning of the sentence in a way that allows for efficient comparison and retrieval in natural language processing tasks. The model uses the HuggingFaceEmbeddings class from the langchain_community.embeddings module to work with the pre-trained sentence transformer model. To load the model, the model name is specified along with any extra configuration options. The embeddings object generates vector representations for each text chunk using the compute_embeddings method, which takes a list of chunks as input and outputs corresponding embedding vectors that capture their meanings in numerical form. These vectors are then combined with metadata, which includes information about the source of each chunk within the PDF document. This integration results in a final list of document representations optimized for efficient retrieval. Consequently, the model can quickly and accurately locate relevant passages in response to user queries, leveraging the meanings captured in the embeddings along with contextual details from the metadata.

C. Building The Conversational Retrieval Chain (CRC)
The CRC involves several key components that work together to deliver a seamless user experience.

• Large Language Model
At the heart of the CRC is the LLM, which generates responses to user queries. The model utilizes the Groq LLM (llm_groq), integrated through the langchain_groq library.
This pre-trained LLM leverages its extensive knowledge base, derived from vast amounts of text data, to understand and answer user questions accurately. The LLM's capability to generate coherent and contextually appropriate responses makes it a crucial component of the CRC.

• Retriever
The retriever component is responsible for fetching relevant information from the document based on the user's query. The model employs the FAISS vector store from langchain_community.vectorstores to create a vector store using the document embeddings generated earlier. These embeddings transform text chunks into numerical representations that capture their semantic content. The vector store allows for efficient retrieval of document sections (chunks) that are semantically similar to the user's query. The as_retriever method of the vector store object is used to create a retriever object that integrates into the CRC, enabling precise and relevant information retrieval.

• Memory
Memory management is essential for preserving conversational context. The model employs the ConversationBufferMemory class from langchain.memory to store past user queries and LLM responses. This history is crucial for the LLM to reference prior interactions when generating current responses. The memory is set up with two keys: memory_key="chat_history" for the conversation history and output_key="answer" for the LLM's responses. This configuration facilitates more coherent and contextually aware interactions.

• Chain Configuration
To integrate these components, the ConversationalRetrievalChain.from_llm method is employed with specific parameters. The llm parameter is configured to utilize the Groq LLM object (llm_groq). The chain_type is designated as "stuff", meaning that the retrieved chunks are inserted together into a single prompt for the LLM. The retriever parameter is linked to the retriever object generated from the vector store, ensuring efficient retrieval of relevant document sections. The memory parameter is set to the conversation buffer memory object. Further, return_source_documents is set to True, instructing the chain to return the retrieved chunks along with the response. This ensures that answers are accompanied by the relevant context from the document.

D. User Interaction And Model Response
The model enables a natural and interactive conversation between users and their uploaded PDF documents. The process starts when users input their questions through a text field integrated into the Streamlit interface, ensuring that initiating queries is straightforward and accessible. Users type their questions into the provided input field (st.text_input), and upon submission, the system promptly captures the query and processes it using the retrieval chain. The chain.invoke method directs the query to subsequent stages of the workflow.

At the core of the model's functionality is the Conversational Retrieval Chain. To generate contextually rich responses, the chain first accesses the conversation history via the ConversationBufferMemory [25], which retains past user queries and responses, ensuring that the current interaction benefits from previous exchanges.

Subsequently, the system utilizes a retriever that operates on a pre-constructed document vector store, comprising embeddings of text chunks extracted from the PDF. The retriever searches for document sections that are semantically similar to the user's query, using cosine similarity to evaluate how closely related two vectors are within the embedding space. Equation (6) illustrates the concept of cosine similarity.

$$ \cos(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert} \tag{6} $$

Cosine similarity scores range from -1 (vectors pointing in opposite directions) to 1 (vectors pointing in the same direction). The model retrieves the document sections with the highest cosine similarity scores, indicating their relevance to the user's query. These retrieved sections, along with the conversation context, are then fed into the Groq LLM. By leveraging its pre-trained knowledge and the specific context from the retrieved text, the Groq LLM generates comprehensive and accurate responses to user questions.

When relevant document sections are retrieved, the model enhances responses by referencing these sources. It assigns unique identifiers to each retrieved section and may include these references in the response text. This method not only ensures transparency but also allows users to verify the information's origin. To improve usability, Streamlit expanders (st.expander) are used, enabling users to easily view the content of the retrieved document sections. By clicking on the corresponding source names, users can expand and read the specific excerpts that informed the LLM's response. This interactive feature allows users to explore the document content more deeply, enhancing their understanding and engagement with the material.

V. EXPERIMENTATIONS AND RESULTS
To assess the model's capability to navigate and summarize complex academic materials, we employed ROUGE (Recall-Oriented Understudy for Gisting Evaluation) scores, a widely accepted metric for evaluating the quality of automatically generated summaries against human-written references. However, relying solely on ROUGE metrics may not adequately reflect the system's interactive and conversational aspects. Therefore, future studies will include qualitative evaluations to examine user interaction quality and the model's effectiveness from an end-user perspective, ensuring a comprehensive assessment of its performance.

The evaluation utilized a carefully curated dataset comprising top-cited research articles, known for their dense information content, technical language, and intricate methodologies. These articles posed significant challenges, making them well-suited for rigorously testing the model's summarization capabilities. The article abstracts served as input reference summaries for calculating the ROUGE scores for each document. This analysis provided valuable insights into the model's proficiency in accurately capturing and summarizing
critical findings from highly technical and detailed research literature. ROUGE measures the overlap of n-grams between the generated text and the reference text, mathematically represented in Equation (6), where the maximum number of n-grams co-occurring in both the candidate and reference summaries is considered.

$$ \text{ROUGE-N} = \frac{\sum_{S \in \{\text{Summaries}\}} \sum_{\text{gram}_n \in S} \text{Count}_{\text{match}}(\text{gram}_n)}{\sum_{S \in \{\text{Summaries}\}} \sum_{\text{gram}_n \in S} \text{Count}(\text{gram}_n)} \tag{6} $$

The study specifically used ROUGE-1 (unigrams) and ROUGE-2 (bigrams) in the evaluation. ROUGE-L measures the longest common subsequence (LCS) between the candidate and reference summaries. It is calculated using Equations (7) and (8), followed by Equation (9), where X is the reference summary of length m, Y is the candidate summary of length n, R and P denote the ROUGE-L recall and precision, and β is typically set to favor recall (β > 1).

$$ \text{ROUGE-L}_{\text{Recall}} = \frac{LCS(X, Y)}{m} \tag{7} $$

$$ \text{ROUGE-L}_{\text{Precision}} = \frac{LCS(X, Y)}{n} \tag{8} $$

$$ \text{ROUGE-L}_{F} = \frac{(1 + \beta^2) \cdot R \cdot P}{R + \beta^2 \cdot P} \tag{9} $$

To evaluate the model performance, the authors tested it with a custom dataset of various research papers sourced from top research databases and analyzed the ROUGE scores of the generated answers. The relatively moderate ROUGE scores can be attributed to the model's focus on condensing extensive content into concise responses. This indicates the model's tendency to prioritize brevity and specificity over word-for-word overlap. The average scores obtained on the dataset are given in Table I.

TABLE I. PERFORMANCE METRICS OF THE MODEL
Performance Metric    Score Value (Average)
ROUGE-1               0.4604
ROUGE-2               0.3576
ROUGE-L               0.4283

The scores indicate that approximately 46% of individual words (ROUGE-1) and around 35% of bigram phrases (ROUGE-2) in the generated responses matched those found in the original documents. The ROUGE-L score, which lies between ROUGE-1 and ROUGE-2, demonstrates some preservation of word order while accommodating gaps and rephrasing. The relatively low ROUGE scores reflect the system's tendency to distill information into concise answers instead of replicating large text segments. Moreover, the complexity and dense information structure of the input research articles creates a high bar for any model aiming to balance conciseness with informativeness. Good summaries often rephrase ideas, leading to lower word-for-word matches but potentially better conveyance of key concepts. Furthermore, the system focuses on providing specific answers to questions, naturally leading to lower overlap with the full text of the documents. The significant length difference between focused answers and entire articles further contributes to the lower ROUGE scores. Table II compares the model performance with other SOTA approaches.

TABLE II. COMPARISON OF PROPOSED MODEL PERFORMANCE METRICS
Model                                   ROUGE-1   ROUGE-2   ROUGE-L
RAG-PDF (Our Model)                     0.4604    0.3576    0.4283
ML + RL ROUGE + Novel, With LM [26]     0.4019    0.1738    0.3752
COSUM [27]                              0.4908    0.2379    0.2834
Latent Semantic Analysis [28]           0.4621    0.2618    0.3479
EdgeSumm [29]                           0.5379    0.2858    0.4979
Generative Adversarial Network [30]     0.3992    0.1765    0.3671
TFRSP [31]                              0.2483    0.2874    0.2043

Table II presents a comparative analysis of various models based on ROUGE-1, ROUGE-2, and ROUGE-L scores, which evaluate summary quality against reference summaries. The RAG-PDF model demonstrates strong performance, achieving a ROUGE-1 score of 0.4604, ROUGE-2 score of 0.3576, and ROUGE-L score of 0.4283, indicating its effectiveness in capturing both individual words and longer sequences for coherent summaries.

While EdgeSumm excels in ROUGE-1 and ROUGE-L, its lower ROUGE-2 score reveals limitations in bigram coherence. Our model attains the highest ROUGE-2 score in the comparison (0.3576) while remaining competitive on ROUGE-1 and ROUGE-L, particularly for complex technical text. In contrast, the ML + RL ROUGE + Novel model shows poorer performance, especially in ROUGE-2 (0.1738) and ROUGE-L (0.3752), suggesting challenges in capturing bigram sequences. COSUM performs well in ROUGE-1 (0.4908) but lacks coherence in longer sequences, with lower ROUGE-2 (0.2379) and ROUGE-L (0.2834).

Latent Semantic Analysis is comparable to our model in ROUGE-1 (0.4621) but falls short in ROUGE-2 (0.2618) and ROUGE-L (0.3479). The Generative Adversarial Network model exhibits low scores across metrics, particularly in ROUGE-2 (0.1765). Lastly, the TFRSP model scores the lowest in ROUGE-1 (0.2483) and ROUGE-L (0.2043), indicating significant challenges in summary generation.

While ROUGE metrics provide useful insights, they may not fully capture user experience or interaction quality. Therefore, future work will focus on incorporating user-centered evaluations, including qualitative feedback and interaction analysis, to align the system's performance with real-world needs.

VI. CONCLUSION AND FUTURE WORK
The model introduces an approach for interacting with PDF documents via a conversational interface, harnessing the power of LLMs and RAG. This system enables users to extract valuable insights from complex and text-heavy materials effectively. One of its standout features is its focus on the specific content of uploaded PDFs, rather than relying on extensive external knowledge bases. By employing sentence embeddings, the model converts text chunks into numerical vectors and utilizes cosine similarity for efficient retrieval, aligning responses closely with user intent.
Performance evaluations reveal competitive ROUGE scores (0.4604 for ROUGE-1, 0.3576 for ROUGE-2, and 0.4283 for ROUGE-L), demonstrating the model's capability to capture essential content and structure while outperforming many existing models in summarization and question answering.

To enhance the practical application of this system, future work will aim to generalize its approach for a wider variety of document types. This will include refining the retrieval mechanism to accommodate diverse structures, such as legal, financial, and multimodal documents, thereby increasing the system's versatility in real-world scenarios. We also plan to incorporate reinforcement learning techniques to improve user interactions, allowing the model to adapt dynamically based on feedback. Exploring the incorporation of knowledge graphs and ontologies may also improve semantic understanding and contextualization. Furthermore, refining the model with user interaction data and reinforcement learning can facilitate more personalized responses, ensuring that the system continuously evolves to meet user needs effectively.

REFERENCES
[1] Khurana, D., Koli, A., Khatter, K. et al. Natural language processing: state of the art, current trends and challenges. Multimed Tools Appl 82, 3713–3744 (2023).
[2] L. R. Bahl, P. F. Brown, P. V. de Souza and R. L. Mercer, "A tree-based statistical language model for natural language speech recognition," in IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 7, pp. 1001-1008, July 1989.
[3] Curto, G., Jojoa Acosta, M.F., Comim, F. et al. Are AI systems biased against the poor? A machine learning analysis using Word2Vec and GloVe embeddings. AI & Soc 39, 617–632 (2024).
[4] X. Zheng, C. Zhang and P. C. Woodland, "Adapting GPT, GPT-2 and BERT Language Models for Speech Recognition," 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Cartagena, Colombia, 2021, pp. 162-168.
[5] Y. Qu, P. Liu, W. Song, L. Liu and M. Cheng, "A Text Generation and Prediction System: Pre-training on New Corpora Using BERT and GPT-2," 2020 IEEE 10th International Conference on Electronics Information and Emergency Communication (ICEIEC), Beijing, China, 2020, pp. 323-326.
[6] Wang, L., Ma, C., Feng, X. et al. A survey on large language model based autonomous agents. Front. Comput. Sci. 18, 186345 (2024).
[7] Xu, L., Lu, L., Liu, M. et al. Nanjing Yunjin intelligent question-answering system based on knowledge graphs and retrieval augmented generation technology. Herit Sci 12, 118 (2024).
[13] ... D. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. In Advances in Neural Information Processing Systems 33 (NeurIPS 2020).
[14] Z. Feng, X. Feng, D. Zhao, M. Yang and B. Qin, "Retrieval-Generation Synergy Augmented Large Language Models," ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, Republic of, 2024, pp. 11661-11665.
[15] Miao, J., Thongprayoon, C., Suppadungsuk, S., Garcia Valencia, O., & Cheungpasitporn, W. (2024). Integrating retrieval-augmented generation with large language models in nephrology: Advancing practical applications. Medicina.
[16] Hao, T., Li, X., He, Y. et al. Recent progress in leveraging deep learning methods for question answering. Neural Comput & Applic 34, 2765–2783 (2022).
[17] Zhang, Y., Qian, S., Fang, Q., & Xu, C. (2019). Multi-modal Knowledge-aware Hierarchical Attention Network for Explainable Medical Question Answering. Proceedings of the 27th ACM International Conference on Multimedia.
[18] Sawarkar, K., Mangal, A., & Solanki, S. R. (2024). Blended RAG: Improving RAG (Retriever-Augmented Generation) accuracy with semantic search and hybrid query-based retrievers. Information Retrieval, ArXiv.
[19] N. Arif, S. Latif and R. Latif, "Question Classification Using Universal Sentence Encoder and Deep Contextualized Transformer," 2021 14th International Conference on Developments in eSystems Engineering (DeSE), Sharjah, United Arab Emirates, 2021, pp. 206-211.
[20] Goswami. M, Panda. N, Mohanty. S, and Pattnaik. P. K, "Machine Learning Techniques and Routing Protocols in 5G and 6G Mobile Network Communication System - An Overview," 2023 7th International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 2023, pp. 1094-1101.
[21] Li, H., Su, Y., Cai, D., Wang, Y., & Liu, L. (2022). A survey on retrieval-augmented text generation. Computation and Language, ArXiv.
[22] Bui, D. D. A., Del Fiol, G., & Jonnalagadda, S. (2016). PDF text classification to leverage information extraction from publication reports. Journal of Biomedical Informatics, 61, 141-148.
[23] F. Heimerl, C. Kralj, T. Möller and M. Gleicher, "embComp: Visual Interactive Comparison of Vector Embeddings," in IEEE Transactions on Visualization and Computer Graphics, vol. 28, no. 8, pp. 2953-2969, 1 Aug. 2022.
[24] Goswami. M, Mohanty. S, and Pattnaik. P. K, Optimization of machine learning models through quantization and data bit reduction in healthcare datasets, Franklin Open, Volume 8, 2024.
[25] Singh, A. Ehtesham, S. Mahmud and J. -H. Kim, "Revolutionizing Mental Health Care through LangChain: A Journey with a Large Language Model," 2024 IEEE 14th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 2024, pp. 0073-0078.
[26] Kryściński, W., Paulus, R., Xiong, C., & Socher, R. (2018). Improving
[8] Louis, A., van Dijck, G., & Spanakis, G. (2024). Interpretable Long-
abstraction in text summarization, Computation and Language, ArXiv.
Form Legal Question Answering with Retrieval-Augmented Large
Language Models. Proceedings of the AAAI Conference on Artificial [27] Alguliyev, R.M., Aliguliyev, R.M., Isazade, N.R., Abdi, A., & Idris,
Intelligence, 38(20), 22266-22275. N.B. (2018). COSUM: Text summarization based on clustering and
optimization. Expert Systems, 36.
[9] Yang, X., Chen, A., PourNejatian, N. et al. A large language model for
electronic health records. npj Digit. Med. 5, 194 (2022). [28] Ozsoy, M. G., Alpaslan, F. N., & Cicekli, I. (2011). Text
summarization using Latent Semantic Analysis. Journal of Information
[10] Li, M., Kilicoglu, H., Xu, H., & Zhang, R. (2024). BiomedRAG: A
Science, 37(4), 405-417.
Retrieval Augmented Large Language Model for Biomedicine.
Computation and Language, ArXiv. [29] El-Kassas, W. S., Salama, C. R., Rafea, A. A., & Mohamed, H. K.
(2020). EdgeSumm: Graph-based framework for automatic text
[11] Hiesinger, W., Zakka, C., Chaurasia, A., Shad, R., Dalal, A., Kim, J.,
summarization. Information Processing & Management, 57(6).
Moor, M., Alexander, K., Ashley, E., Boyd, J., Boyd, K., Hirsch, K.,
Langlotz, C., & Nelson, J. (2023). Almanac: Retrieval-Augmented [30] Liu, L., Lu, Y., Yang, M., Qu, Q., Zhu, J., & Li, H. (2018). Generative
Language Models for Clinical Medicine. Research Square. Adversarial Network for Abstractive Text
Summarization. Proceedings of the AAAI Conference on Artificial
[12] Yang, K., Swope, A., Gu, A., Chalamala, R., Song, P., Yu, S., Godil,
Intelligence, 32(1).
S., Prenger, R. J., & Anandkumar, A. (2023). LeanDojo: Theorem
proving with retrieval-augmented language models. In Advances in [31] M. S M, R. M P, A. R E and E. S. G SR, "Text Summarization Using
Neural Information Processing Systems 36 (NeurIPS 2023). Text Frequency Ranking Sentence Prediction," 2020 4th International
Conference on Computer, Communication and Signal Processing
[13] Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N.,
(ICCCSP), Chennai, India, 2020, pp. 1-6
Küttler, H., Lewis, M., Yih, W., Rocktäschel, T., Riedel, S., & Kiela,

