RAG Ultimate Guide
Document Loading
The first step in building the ability to chat with your documents is loading the document. After that
we split the document into smaller pieces, create embeddings for those pieces, and store the
embeddings in a database that we can later query to answer user questions. Let's first learn to load a
PDF document. Here's how we can do this.
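Below is a minimal sketch of this step using LangChain's PyPDFLoader, which is just one common option (the import path differs slightly between LangChain versions, and the file name here is a placeholder):

```python
# pip install langchain-community pypdf
from langchain_community.document_loaders import PyPDFLoader  # older versions: langchain.document_loaders

# "my_book.pdf" is a placeholder path; point it at the document you want to chat with
loader = PyPDFLoader("my_book.pdf")
pages = loader.load()  # one Document per page, with the page number kept in metadata

print(len(pages))
print(pages[0].page_content[:200])  # a peek at the first page's text
print(pages[0].metadata)            # e.g. {'source': 'my_book.pdf', 'page': 0}
```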
Document Splitting
Now that we have loaded the document, we have a whole book that we want to chat with. Imagine
copying an entire book and pasting it into ChatGPT. Do you honestly think that would work? Of course
it won't, because LLMs can only process a certain amount of text at a go. This limit is what is
referred to as the context window.
The “context window” refers to the span of text or tokens that the model uses to generate or
understand a particular word or sequence of words in a given sentence or text. The context window
determines how much of the surrounding text the model takes into account when making predictions
or processing language.
The context window is limited in how many tokens (or words) it can take. For this reason we need
to split our loaded PDF file into what we call "chunks". Each chunk will be small enough to fit within the
context window. Let's do just this in Python.
[Diagram: documents are split into chunks, which are then passed on to be embedded]
Word Embeddings And Vector Stores
Creating Embeddings
Computers do not understand the natural languages that we humans speak. We humans can
understand letters and a variety of symbols; computers, on the other hand, cannot. They only
understand numeric values. So how do we convert the text from our document into a numeric
representation that the model can understand?
A word embedding, or word vector, is an approach with which we represent documents and words. It is
defined as a numeric vector input that allows words with similar meanings to have a similar
representation. It can approximate meaning and represent a word in a lower-dimensional space.
These embeddings can be trained much faster than hand-built models that use graph embeddings like
WordNet.
This is where embeddings come into play. In this article I'll be using Sentence Transformers, an
open-source embedding model. You can also pay for an OpenAI embedding model if you want to use
that; the trade-offs are not something I'll be covering here. Let's install
sentence_transformers.
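A minimal sketch of creating embeddings with sentence_transformers; `all-MiniLM-L6-v2` is just one small open-source model you could choose, and `chunks` comes from the splitting step above:

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer

# A small, widely used open-source embedding model that produces 384-dimensional vectors
model = SentenceTransformer("all-MiniLM-L6-v2")

texts = [chunk.page_content for chunk in chunks]  # raw text of each chunk
embeddings = model.encode(texts)                  # one numeric vector per chunk

print(embeddings.shape)  # (number_of_chunks, 384)
```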
[Diagram: document chunks are converted into numeric vectors and stored in a vector DB]
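To make these vectors searchable we store them in a vector database. As a sketch, the code below uses Chroma through LangChain (Chroma is just one option; any vector store would do). HuggingFaceEmbeddings wraps the same sentence-transformers model under the hood, and the resulting `vectordb` is what the retrieval methods below will query:

```python
# pip install langchain-community langchain-huggingface chromadb
from langchain_huggingface import HuggingFaceEmbeddings   # older versions: langchain_community.embeddings
from langchain_community.vectorstores import Chroma

embedding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Embed every chunk and persist the index to disk so we don't re-embed on every run
vectordb = Chroma.from_documents(
    documents=chunks,
    embedding=embedding,
    persist_directory="docs_db/",  # placeholder directory
)

# Plain semantic search: the 3 chunks whose vectors are closest to the question's vector
docs = vectordb.similarity_search("What is said about puppies?", k=3)
```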
Retrieval Methods
Maximal Marginal Relevance (MMR)
MMR not only gives you relevant context to work with, but also ensures diversity in the retrieved results.
How It Works
1. Query the vector store.
2. Use the fetch_k value to get the fetch_k most similar chunks. This is basically a semantic search.
3. From those fetch_k most similar chunks, choose the k most diverse context texts (see the sketch below).
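A minimal sketch of MMR retrieval against the `vectordb` built earlier; the k, fetch_k, and question values are purely illustrative:

```python
question = "What is said about puppies?"

# Option 1: call MMR search directly on the vector store:
# fetch the 10 most similar chunks, then keep the 3 most diverse of them
diverse_docs = vectordb.max_marginal_relevance_search(question, k=3, fetch_k=10)

# Option 2: wrap it as a retriever so it can be plugged into a chain later
mmr_retriever = vectordb.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 3, "fetch_k": 10},
)
diverse_docs = mmr_retriever.invoke(question)  # older versions: get_relevant_documents(question)
```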
Compression Technique
Compress the relevant documents into only the two or three sentences containing the exact information you
need, with the help of an LLM. From there we make a final call to the LLM, passing in the two
or three sentences that contain the exact information needed to answer the question. The downside of
this is the cost of the many LLM calls needed to "compress" the information we have
retrieved.
[Diagram: the question is used to query the vector DB, the retrieved documents are compressed using an LLM, and the compressed context is passed to the final LLM]
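A minimal sketch of this compression step using LangChain's ContextualCompressionRetriever with an LLMChainExtractor; ChatOpenAI stands in for whichever LLM you prefer:

```python
# pip install langchain langchain-openai
from langchain_openai import ChatOpenAI
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

llm = ChatOpenAI(temperature=0)               # the LLM that does the "compressing"
compressor = LLMChainExtractor.from_llm(llm)  # extracts only the relevant sentences from each hit

compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=vectordb.as_retriever(),   # the plain (or MMR) retriever from earlier
)

compressed_docs = compression_retriever.invoke("What is said about puppies?")
for d in compressed_docs:
    print(d.page_content)  # short extracted snippets instead of full chunks
```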
Self Query Retriever
This is used when the question asked, or the text we want to search the vector store against, does not
only deal with semantics but also with metadata about the content.
Example
What was mentioned about puppies in the paper between pages 3 and 6?
In this question we do not only want to answer the semantic question of what was said about
puppies. We also want to use the metadata and look at pages 3 to 6 specifically. This is where an
LLM helps to split the question into a semantic search term and a filter: "puppies" and "pages
3 and 6". This is what we call self-query.
Question Answering
Map Reduce
In map reduce, each document or chunk is sent to the LLM to obtain an initial answer; these
initial answers are then combined into one final answer. This involves more LLM calls.
[Diagram: each document is sent to the LLM individually and the per-document answers are combined into a final answer]
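A minimal sketch using LangChain's classic RetrievalQA chain with chain_type="map_reduce"; ChatOpenAI and the retriever settings are placeholders:

```python
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

qa_map_reduce = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(temperature=0),
    retriever=vectordb.as_retriever(),
    chain_type="map_reduce",   # answer each retrieved chunk separately, then combine the answers
)

result = qa_map_reduce.invoke({"query": "What is said about puppies?"})
print(result["result"])
```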
The downside of map reduce is that the result can sometimes be worse than the stuff
method, because the relevant information is spread over multiple documents. It is a bit
like looking for a needle in a haystack: the needle is there, just not easily
found among the many document chunks.
The advantage is that if the answer, or the information we are searching for, is spread
across multiple documents, we get a chance to look at each document and use it in
answering the question at hand without running out of room in the context window.
Refine
This simply takes each chunk of the document and passes it to the LLM alongside the prompt question. An
answer to that question is generated. From there, the next chunk is passed to the
LLM and the first answer is refined based on the information in this new chunk. This is
repeated for all the other documents until a final answer is obtained. The number of calls to the LLM is
proportional to the number of documents.
[Diagram: each document chunk is passed to the LLM in turn, and the running answer is refined chunk by chunk until the final LLM call produces the answer]
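The same RetrievalQA sketch as before, only switching the chain type to "refine":

```python
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

qa_refine = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(temperature=0),
    retriever=vectordb.as_retriever(),
    chain_type="refine",   # answer from the first chunk, then refine it with each following chunk
)

result = qa_refine.invoke({"query": "What is said about puppies?"})
print(result["result"])
```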
The downside of this approach is that we make a number of LLM calls that is directly
proportional to the number of documents we have retrieved. Hence, more documents means
more expense. And what if none of the documents we have retrieved contain the answer we
are looking for?
The advantage is that it produces better results compared to the `map reduce`
chain, because as we iterate over each document the final answer is fine-tuned.
For example, if the answer we are looking for is in document 4, we continually refine
our answer until we arrive at that "right" answer, something the map reduce technique
does not give us. All in all, this approach carries over more information than the
map reduce strategy, and more information generally means better results.