RAG Slide ENG
Haofen Wang
Tongji University
1. RAG Overview
Limitations of standalone LLMs that motivate RAG:
• Hallucination
• Outdated information
• Low efficiency in parameterizing knowledge
• Lack of in-depth knowledge in specialized domains
• Weak inferential capabilities
By attaching an external knowledge base, there is no need to retrain the entire large model for each specific task.
Ways to inject knowledge into an LLM:
• Prompt Engineering
• Retrieval-Augmented Generation
• Instruction Tuning / Fine-tuning
• Traceability: answers can be traced back to the retrieved sources
• Pre-Retrieval Process: retrieval confidence judgment
[Figure: evolution of RAG paradigms. Naive RAG: Retrieve → Read. Advanced RAG: Rewrite → Retrieve → Rerank → Read. Modular RAG: freely composed modules (Retrieve, Read, Rewrite, Rerank, Filter, Search, Predict, Generate, Demonstrate, Reflect, Aggregate), e.g. DSP (Demonstrate-Search-Predict, 2022), Rewrite-Retrieve-Read (2023), Retrieve-then-read (2023).]
Comparison of RAG Paradigms
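To make the Naive RAG loop in the figure concrete, here is a minimal sketch of the Retrieve → Read pattern. The bag-of-words retriever is a toy stand-in for a dense encoder, and call_llm is a hypothetical placeholder for any completion API, so this is an illustrative sketch rather than any system's actual implementation.

import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a dense encoder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API here")  # hypothetical stub

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def naive_rag(query: str, corpus: list[str]) -> str:
    context = "\n".join(retrieve(query, corpus))            # Retrieve
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return call_llm(prompt)                                 # Read / generate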
The three key questions of RAG
Key issue of RAG — What to retrieve (retrieval granularity)
[Figure: retrieval granularities arranged by level of structuration, from low to high, and by granularity, from coarse to meticulous: Token | kNN-LM 2019, Phrase | NPM 2023, Entity | EasE 2022, up to structured knowledge.]
• Fine-grained units (tokens, phrases) excel in handling long-tail and cross-domain issues with high computational efficiency, but require significant storage.
• Structured units (entities and above) offer richer semantic and structured information, but retrieval efficiency is lower and is limited by the quality of the KG.
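To ground the trade-off, a small sketch contrasting the two ends of the axis: raw chunk indexing versus triple extraction. extract_triples is a hypothetical stand-in (a real system would run OpenIE or prompt an LLM), and the hardcoded triple is purely illustrative.

text = "Marie Curie won the Nobel Prize in Physics in 1903."

# Chunk granularity: cheap to index and retrieve; the unit is a raw text span.
chunks = [text[i:i + 32] for i in range(0, len(text), 32)]

def extract_triples(t: str) -> list[tuple[str, str, str]]:
    # Hypothetical stand-in: a real system would run OpenIE or an LLM here,
    # so triple quality (and hence retrieval quality) depends on that step.
    return [("Marie Curie", "won", "Nobel Prize in Physics 1903")]

# Triple granularity: richer structure, but bound by extraction/KG quality.
triples = extract_triples(text)
print(chunks)
print(triples)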
Key issue of RAG — How to use the retrieved content
Integrate the retrieved information into different layers of the generation model during the inference process.
Integration position: injecting retrieval at the model's intermediate layers (interlayer integration) supports the retrieval of more knowledge blocks, but introduces additional complexity and must be trained (a schematic follows below).
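A heavily simplified schematic of interlayer integration, loosely in the spirit of RETRO-style cross-attention: decoder states attend over encoded retrieval chunks instead of seeing them in the prompt. The shapes, the single projection-free head, and the residual update are illustrative assumptions, not any specific model's architecture.

import numpy as np

rng = np.random.default_rng(0)
d = 64                             # hidden size (illustrative)
h = rng.normal(size=(10, d))       # decoder states for 10 generated tokens
chunks = rng.normal(size=(4, d))   # encodings of 4 retrieved chunks

def cross_attend(h: np.ndarray, chunks: np.ndarray) -> np.ndarray:
    # Tokens query the retrieved chunk encodings (single head, no projections).
    scores = h @ chunks.T / np.sqrt(d)                 # (10, 4) logits
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                  # softmax over chunks
    return h + w @ chunks                              # residual injection

# In a real model this layer carries learned projections, which is why
# interlayer integration must be trained (the slide's point).
h = cross_attend(h, chunks)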
Retrieval frequency (low → high):
• Conduct one search during the reasoning process: high efficiency, but low relevance of the retrieved documents.
• Adaptively conduct the search: balances efficiency and relevance, but might not yield the optimal solution.
• Retrieve once for every N tokens generated: a large amount of information, but low efficiency and redundant information (a sketch follows below).
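A sketch of the highest-frequency setting, re-retrieving after each N-token continuation so the evidence tracks the partial answer. retrieve and generate_tokens are hypothetical stand-ins for any retriever and any LLM that can continue a prompt, and the word count is a crude proxy for a token budget.

def retrieve(query: str) -> str:
    raise NotImplementedError("plug in your retriever")      # hypothetical stub

def generate_tokens(context: str, query: str, prefix: str, n: int) -> str:
    raise NotImplementedError("plug in your LLM")            # hypothetical stub

def rag_every_n(query: str, n: int = 16, max_words: int = 128) -> str:
    answer = ""
    while len(answer.split()) < max_words:
        context = retrieve(query + " " + answer)   # refresh the evidence
        chunk = generate_tokens(context, query, answer, n)
        if not chunk:
            break
        answer += " " + chunk
    return answer.strip()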
Overview of RAG Development
PART 03
Key Technologies and Evaluation
Techniques for Better RAG —— Data Indexing Optimization
● Chunk Optimization (a minimal sketch follows this list)
● Adding Metadata; Metadata Filtering/Enrichment
● Summary → Document: replace document retrieval with summary retrieval, not only retrieving the most directly relevant nodes but also exploring additional nodes associated with those nodes.
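As a concrete illustration of the chunk-optimization and metadata bullets above, a minimal sketch of fixed-size chunking with overlap, where each chunk carries metadata usable for later filtering. The 200-character size and 50-character overlap are illustrative assumptions, not recommended values.

def chunk_with_metadata(doc_id: str, text: str, size: int = 200, overlap: int = 50):
    chunks = []
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        piece = text[start:start + size]
        chunks.append({
            "text": piece,
            "metadata": {          # enables metadata filtering at query time
                "doc_id": doc_id,
                "char_start": start,
                "char_end": start + len(piece),
            },
        })
    return chunks

print(len(chunk_with_metadata("paper-01", "lorem ipsum " * 100)))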
• Cross-lingual text
• Structured data: Triples; Subgraphs | SUGRE [Kang et al., 2023]
• LLM memory: generated text and generated code | Selfmem [Cheng et al., 2023]
Techniques for Better RAG —— KG as a Retrieval Data Source
GraphRAG
Extract entities from the user's input query, then construct a subgraph to form context, and finally feed it into the large model for generation.
Implementation
1. Use an LLM (or other models) to extract key entities from the question.
2. Retrieve subgraphs based on those entities, expanding to a certain depth, such as 2 hops or even more.
3. Use the obtained context to generate the answer through the LLM (see the sketch below).
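A minimal sketch of these three steps over a toy adjacency map instead of a real graph store. extract_entities stands in for the LLM/NER step, the KG contents are made up for illustration, and the final generation call is left as a comment.

from collections import deque

KG = {  # toy knowledge graph: entity -> list of (relation, neighbor)
    "Tongji University": [("located_in", "Shanghai")],
    "Shanghai": [("part_of", "China")],
}

def extract_entities(question: str) -> list[str]:
    # Hypothetical stand-in: real systems prompt an LLM or run NER here.
    return [e for e in KG if e.lower() in question.lower()]

def k_hop_subgraph(seeds: list[str], hops: int = 2) -> list[tuple]:
    # Step 2: breadth-first expansion up to `hops` from the seed entities.
    triples, frontier, seen = [], deque((s, 0) for s in seeds), set(seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth >= hops:
            continue
        for rel, nbr in KG.get(node, []):
            triples.append((node, rel, nbr))
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return triples

question = "Where is Tongji University?"
context = "\n".join(f"{s} --{r}--> {o}"
                    for s, r, o in k_hop_subgraph(extract_entities(question)))
# Step 3: feed `context` plus the question to the LLM for generation.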
Techniques for Better RAG —— Query Optimization
Questions and answers do not always have high semantic similarity; adjusting the query can yield better retrieval results (a rewrite sketch follows the bullets below).
• Fine-tuning the embedding model according to domain-specific repositories and downstream tasks
• Fine-tuning an adapter module to align the embedding model with the retrieval repository
Techniques for Better RAG —— Retrieval Process Optimization
• Iterative: iteratively retrieving from the corpus to acquire more detailed and in-depth knowledge | IRCoT [Trivedi et al., 2022] (loop sketch below)
• Adaptive: the timing and scope of retrieval are dynamically determined by the LLM | Self-RAG [Asai et al., 2023]
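A sketch of the iterative pattern in the style of IRCoT: alternate one reasoning step with a fresh retrieval seeded by that step. Both stubs and the "ANSWER:" stopping convention are illustrative assumptions, not the paper's exact protocol.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM")            # hypothetical stub

def retrieve(query: str, k: int = 3) -> list[str]:
    raise NotImplementedError("plug in your retriever")      # hypothetical stub

def ircot_style(question: str, max_steps: int = 5) -> str:
    thoughts, docs = [], retrieve(question)
    for _ in range(max_steps):
        prompt = ("Documents:\n" + "\n".join(docs) +
                  f"\nQuestion: {question}\nReasoning so far: " +
                  " ".join(thoughts) + "\nNext step:")
        step = call_llm(prompt)
        thoughts.append(step)
        if "ANSWER:" in step:              # model signals it is done
            return step.split("ANSWER:", 1)[1].strip()
        docs = retrieve(step)              # new evidence for the new thought
    return " ".join(thoughts)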
Techniques for Better RAG —— Hybrid (RAG + Fine-tuning)
Retriever Fine-Tuning / Generator Fine-Tuning
• Goals: a highly adaptive, general-purpose retrieval plugin; augmenting generation with structural information integration.
• R-FT: minimizing the KL divergence between the retriever distribution and LLM preferences.
• LM-FT: maximizing the likelihood of the correct answer given retrieval-augmented instructions.
RA-DIT [Lin et al., 2023] (both objectives are written out below)
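The two objectives can be written out as follows, in the usual LM-supervised retrieval (LSR) notation, where x is the instruction, y the gold answer, c a retrieved chunk, and τ a temperature; these symbols are assumed here rather than taken from the slide.

% LM-FT: maximize the likelihood of the answer given the retrieval-augmented input
\mathcal{L}_{\text{LM-FT}} = -\,\mathbb{E}\left[\log p_{\mathrm{LM}}(y \mid c, x)\right]
% R-FT: pull the retriever distribution toward the LM's preferences over chunks
\mathcal{L}_{\text{R-FT}} = \mathrm{KL}\!\left( p_{R}(c \mid x) \,\middle\|\, p_{\mathrm{LSR}}(c \mid x, y) \right),
\quad p_{\mathrm{LSR}}(c \mid x, y) \propto \exp\!\left( \log p_{\mathrm{LM}}(y \mid c, x) / \tau \right)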
Summary of Related Research
Key Metrics & Capabilities
Key Metrics
• Answer Relevance: is the answer relevant to the query?
• Context Relevance: is the context of the retrieved documents relevant to the query?
Key Capabilities
• Noise Robustness: can the model extract useful information from noisy documents?
• Negative Rejection: when the required knowledge is not present in the retrieved documents, the model should refuse to answer.
Assessment Frameworks
TruLens, RAGAS, ARES: use an LLM as the judge (a judge sketch follows below), evaluating:
• Answer Fidelity
• Answer Relevance
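A minimal LLM-as-judge sketch for the two dimensions above. The judge prompt and the 1-5 scale are illustrative assumptions; frameworks such as TruLens, RAGAS, and ARES implement more careful variants of this idea.

JUDGE_PROMPT = """You are a strict evaluator. Score 1-5 and reply with only the number.
Criterion: {criterion}
Question: {question}
Context: {context}
Answer: {answer}"""

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM")            # hypothetical stub

def judge(question: str, context: str, answer: str) -> dict:
    scores = {}
    for name, criterion in {
        "answer_fidelity": "Is every claim in the answer supported by the context?",
        "answer_relevance": "Does the answer directly address the question?",
    }.items():
        reply = call_llm(JUDGE_PROMPT.format(
            criterion=criterion, question=question,
            context=context, answer=answer))
        scores[name] = int(reply.strip())   # parse the 1-5 score
    return scores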
[Table: comparison of RAG tools with columns Name / Pros / Cons; entries include AutoGen.]
RAG Industry Application Practices
• AI Toolchain Enhancement
• Video
• Code