The reason for this broad adoption likely lies in its ability to accurately
fetch and convey the information users seek. In an age overwhelmed by
information, it selectively provides the 'necessary' information.
Despite the significant progress made to date, numerous challenges
remain. One such challenge is the 'hallucination' phenomenon, in which
inaccurate information is provided. This issue stems from various
causes; a primary one is the misinterpretation of user intent, which
leads to irrelevant information being fetched.
1. Building large language models from scratch, which allows for clear
data context from the outset but comes with high construction costs.
Pre-Retrieval
Data granularity refers to the level of detail or precision of the data to be
searched by the RAG model to enhance the generation process; it is
decided and processed in advance, before the retrieval step.
The choice of data granularity affects the model’s performance and its
ability to generate accurate and contextually relevant text.
Fine-grained data can provide more specific and detailed information for
the generation task, while coarse-grained data can provide broader
context or general knowledge.
Chunking
This is the process of shaping the source data into a form suitable for
input to a large language model. Since the number of tokens that can be
fed into a large language model is limited, it is important to segment
the information into appropriately sized pieces.
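As a minimal sketch of this idea, the snippet below splits a document into overlapping chunks that stay under a token budget. The budget and overlap values are illustrative, and "tokens" are approximated by whitespace-separated words; a real pipeline would use the target model's own tokenizer.

```python
# Minimal chunking sketch: split text into overlapping, token-limited pieces.
# Word splitting stands in for real tokenization here (an assumption).

def chunk_text(text: str, max_tokens: int = 200, overlap: int = 20) -> list[str]:
    words = text.split()
    chunks = []
    step = max_tokens - overlap  # overlap preserves context across boundaries
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break
    return chunks

doc = ("word " * 450).strip()
print(len(chunk_text(doc)))  # 3 chunks of at most 200 "tokens" each
```

The overlap keeps sentences that straddle a chunk boundary partially visible in both neighboring chunks, which tends to help retrieval quality.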
If, in an hour-long conversation, one person speaks for 59 minutes and
the other for one minute, the conversation is dominated by the one
'inputting' information, resembling not an exchange but an infusion of
information. Chunk sizes should likewise be balanced so that no single
piece dominates the model's limited input.
Retrieval
This stage involves searching a document or text segment database to
find content related to the user’s query. It includes understanding the
intent and context of the query and selecting the most relevant
documents or texts from the database based on this understanding.
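A toy version of this step is shown below: stored chunk embeddings are ranked by cosine similarity to the query embedding. The vectors and texts are made up for illustration; a real system would obtain them from an embedding model and query a vector database.

```python
import math

# Toy retrieval: rank indexed chunks by cosine similarity to the query vector.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, index, k=2):
    ranked = sorted(index, key=lambda item: cosine(query_vec, item["vec"]),
                    reverse=True)
    return [item["text"] for item in ranked[:k]]

index = [
    {"text": "RAG fetches documents", "vec": [0.9, 0.1, 0.0]},
    {"text": "Graphs model relations", "vec": [0.1, 0.9, 0.0]},
    {"text": "Chunking splits text",  "vec": [0.8, 0.2, 0.1]},
]
print(retrieve([1.0, 0.0, 0.0], index, k=2))
# ['RAG fetches documents', 'Chunking splits text']
```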
Post-Retrieval
This stage processes the retrieved information to effectively integrate it
into the generation process. It may include summarizing the searched
text, selecting the most relevant facts, and refining the information to
better match the user’s query.
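A simple sketch of such post-processing, under assumed inputs: drop low-scoring chunks, deduplicate, and keep only what fits a context budget before handing the text to the generator. The score threshold and budget are illustrative values, not recommendations.

```python
# Post-retrieval sketch: filter, deduplicate, and fit chunks to a budget.
# Threshold and budget are illustrative assumptions.

def postprocess(ranked, min_score=0.5, budget_tokens=300):
    seen, kept, used = set(), [], 0
    for text, score in ranked:          # ranked: (chunk_text, relevance_score)
        if score < min_score or text in seen:
            continue
        cost = len(text.split())        # word count stands in for token count
        if used + cost > budget_tokens:
            break
        seen.add(text)
        kept.append(text)
        used += cost
    return kept

ranked = [("chunk A", 0.92), ("chunk A", 0.92), ("chunk B", 0.81), ("chunk C", 0.31)]
print(postprocess(ranked))  # ['chunk A', 'chunk B']
```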
2. Missed the Top Ranked Documents: The second issue arises when
documents related to the user's query are retrieved but are only
marginally relevant, leading to answers that fail to satisfy the user's
expectations. This stems primarily from the subjective nature of
choosing the "number of documents" to retrieve, which is a major
limitation. It is therefore necessary to run experiments to set this k
hyperparameter properly.
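One common way to run such experiments is to measure recall@k on a small labeled set (query to relevant document ids) and pick the smallest k that reaches a target recall. The rankings and labels below are made up purely for illustration.

```python
# Sketch of tuning the retrieval depth k via recall@k on labeled queries.
# The rankings and relevance labels are fabricated examples.

def recall_at_k(rankings, relevant, k):
    hits = sum(1 for q, ranked in rankings.items()
               if relevant[q] & set(ranked[:k]))
    return hits / len(rankings)

rankings = {"q1": ["d3", "d1", "d7"], "q2": ["d5", "d2", "d9"]}
relevant = {"q1": {"d1"}, "q2": {"d9"}}

for k in (1, 2, 3):
    print(k, recall_at_k(rankings, relevant, k))  # 0.0, then 0.5, then 1.0
```

A larger k raises recall but also the risk of pulling in marginal documents, which is exactly the trade-off described above.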
7. Incomplete: The seventh limitation is when, despite the ability to use the
context in generating answers, missing information leads to incomplete
responses to the user’s query.
Using a single database for both vectorDB and GraphDB allows for
semantic (GraphRAG) and vector (RAG) indexing within the same
database, facilitating verification of retrieval accuracy and enabling
improvements for inaccuracies.
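One hedged sketch of how the two retrieval paths might be compared or combined once they share a store: merge graph-based (semantic) and vector hits with a weighted score. The document ids, scores, and weighting are illustrative assumptions, not a prescribed method; a real system would issue the underlying queries to the database itself.

```python
# Illustrative hybrid merge of vector-search and graph-search results.
# Doc ids, scores, and alpha are assumptions for the sketch.

def merge_results(vector_hits, graph_hits, alpha=0.5):
    scores = {}
    for doc, s in vector_hits:          # (doc_id, similarity_score)
        scores[doc] = scores.get(doc, 0.0) + alpha * s
    for doc, s in graph_hits:           # (doc_id, graph_relevance_score)
        scores[doc] = scores.get(doc, 0.0) + (1 - alpha) * s
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = [("d1", 0.9), ("d2", 0.4)]
graph_hits = [("d2", 0.8), ("d3", 0.6)]
print(merge_results(vector_hits, graph_hits))  # ['d2', 'd1', 'd3']
```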
GraphRAG architecture
There are four modules for executing GraphRAG: Query Rewriting,
Augment, and, within Retrieval, Semantic Search and Similarity Search.
Query Rewriting
The user's query is rewritten in this step. When the user issues a query
to the engine, we can add useful additional context to it in the prompt
format. In this step, the query is restated to clarify the user's
intention.
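A minimal sketch of such a rewrite, using a fixed template: recent conversation turns are folded into the query before retrieval. The template wording and history handling are assumptions for illustration; in practice an LLM often performs this rewrite.

```python
# Query-rewriting sketch: expand the raw query with recent context and
# explicit instructions. Template text is an illustrative assumption.

def rewrite_query(query: str, history: list[str]) -> str:
    context = " ".join(history[-2:])  # keep only the most recent turns
    return (f"Context: {context}\n"
            f"Rewritten question: {query.strip().rstrip('?')}? "
            f"Answer using the retrieved documents only.")

print(rewrite_query("how does it scale",
                    ["We discussed GraphRAG.", "Neo4j stores the graph."]))
```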
Pre-Retrieval & Post-Retrieval
This phase involves contemplating what information to retrieve and how
to process that information once retrieved. During the Pre-Retrieval
phase, the focus is primarily on decisions related to setting the chunking
size, how to index, ensuring data is well-cleaned, and detecting and
removing any irrelevant data if present.
GraphRAG limitations
GraphRAG, like RAG, has clear limitations, which include how to form
graphs, generate queries for querying these graphs, and ultimately
decide how much information to retrieve based on these queries. The
main challenges are ‘query generation’, ‘reasoning boundary’, and
‘information extraction’. Particularly, the ‘reasoning boundary’ poses a
significant limitation as optimizing the amount of related information
can lead to overload during information retrieval, negatively impacting
the core aspect of GraphRAG, which is answer generation.
Applying GraphRAG
GraphRAG utilizes graph embeddings produced by a GNN (graph neural
network) alongside text embeddings when inferring the response to a
user query. This method, known as soft prompting, is a form of prompt
engineering. Prompt engineering can be divided into hard and soft
categories. Hard prompting uses explicitly written prompts, requiring
context to be added to user queries manually. Its downside is the
subjective nature of prompt creation, although it is straightforward to
implement.
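To make the contrast concrete, here is a hard-prompt sketch: the retrieved context is concatenated explicitly into the prompt string, whereas soft prompting would instead prepend learned embeddings (e.g. from a GNN) in vector space. The template text is an illustrative assumption.

```python
# Hard-prompting sketch: context is spliced into the prompt as plain text.
# The template wording is an assumption, not a prescribed format.

def hard_prompt(question: str, chunks: list[str]) -> str:
    context = "\n".join(f"- {c}" for c in chunks)
    return (f"Use only the context below to answer.\n"
            f"Context:\n{context}\n"
            f"Question: {question}\nAnswer:")

print(hard_prompt("What is GraphRAG?",
                  ["GraphRAG adds graph retrieval to RAG."]))
```

The manual template is exactly where the subjectivity mentioned above enters: its wording is a design choice made by the prompt author.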
When efforts like introducing BM25 for exact search in a hybrid search
approach, improving the ranking process, or fine-tuning for embedding
quality do not significantly enhance RAG performance, it might be worth
considering GraphRAG.
Conclusion
This post covered everything from RAG to GraphRAG, focusing on
methods like fine-tuning, building from scratch, prompt engineering,
and RAG to improve response quality. While RAG is acclaimed for
efficiently fetching related documents for answering queries at relatively
lower costs, it faces several limitations in the retrieval process. Advanced
RAG, or GraphRAG, emerges as a solution to overcome these limitations
by leveraging ‘semantic’ reasoning and retrieval. Key considerations for
effectively utilizing GraphRAG include information extraction techniques
to infer and generate connections between chunked data, knowledge
indexing for storage and retrieval, and models for generating graph
queries, such as the Cypher Generation Model. With new technologies
emerging daily, this post aims to serve as a resource on GraphRAG,
helping you become more familiar with this advanced approach. Thank
you for reading through this extensive discussion.