Basics of Retrieval-Augmented Generation or RAG
As interest in large language models, or LLMs, increases, developers are exploring ways to harness their
potential.
However, pre-trained LLMs might not perform optimally right out of the box for your business needs.
You may need to decide between model fine-tuning, a process where a pre-trained model is further trained on a new dataset without starting from scratch, and Retrieval-Augmented Generation to enhance performance.
In this episode, we will explore what RAG is and a pattern to implement RAG using Amazon Bedrock
foundation models and other AWS Services.
RAG can be particularly useful in developing applications, like Q&A chatbots, that securely interact with your internal knowledge bases or enterprise data sources.
Such an approach is more suitable than out-of-the-box LLMs, which may lack your enterprise-specific knowledge. Let's dive into understanding what Retrieval-Augmented Generation is.
Retrieval-Augmented Generation retrieves data from outside the foundation model and augments your prompt, the natural language text that asks the LLM to perform a specific task, by adding the relevant retrieved data as context.
It is composed of three components: retrieval, augmentation, and generation. Upon receiving a user query, relevant content is retrieved from external knowledge bases or other data sources based on the specifics of the query.
The retrieved contextual information is then appended to the original user query, creating an
augmented query to serve as the input to the foundation model.
The foundation model then generates a response based on the augmented query.
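To make this flow concrete, here is a minimal sketch in Python, assuming the boto3 SDK and the Amazon Bedrock Converse API; the retrieve_documents helper and the model ID are placeholders standing in for your own retrieval layer and model choice.

```python
import boto3

# Client for invoking foundation models on Amazon Bedrock.
bedrock = boto3.client("bedrock-runtime")

def retrieve_documents(query: str) -> list[str]:
    # Placeholder: look up passages relevant to the query from your knowledge base
    # (for example, via a vector-store similarity search).
    return ["<relevant passage 1>", "<relevant passage 2>"]

def answer_with_rag(query: str) -> str:
    # 1. Retrieval: fetch context relevant to the user query.
    context = "\n".join(retrieve_documents(query))

    # 2. Augmentation: prepend the retrieved context to the original query.
    augmented_prompt = (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

    # 3. Generation: send the augmented prompt to a Bedrock foundation model.
    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
        messages=[{"role": "user", "content": [{"text": augmented_prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]
```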
With this high-level flow in mind, let's cover a few different types of retrieval and see where RAG fits in. The three typical types of retrieval are: first, rule-based, in which unstructured data, such as documents, is fetched through keyword-based searches; second, transaction-based, where transactional data is retrieved from a database or an API; and third, semantic-based, where the model retrieves relevant documents based on text embeddings (a technique that converts text into numerical vectors representing the meaning and context of the words).
Semantic retrieval is where RAG is most applicable. First, let's further define embeddings and their relevance when implementing RAG.
Embedding refers to transforming data, like text, images, or audio, into numerical representations in a high-dimensional vector space using machine learning algorithms. This allows models to capture semantics, learn complex patterns, and use the vector representations for applications like search, classification, and natural language processing.
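As a toy illustration of why this matters, the sketch below compares made-up three-dimensional vectors using cosine similarity; real embedding models produce vectors with hundreds or thousands of dimensions, but the idea is the same: similar meaning, nearby vectors.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine similarity: closer to 1.0 means the vectors point in similar directions.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy embeddings for illustration only (not produced by a real model).
reset_password = [0.9, 0.1, 0.2]   # "How do I reset my password?"
recover_account = [0.8, 0.2, 0.1]  # "Steps to recover a forgotten password"
pizza_recipe = [0.1, 0.9, 0.7]     # "Best pizza dough recipe"

print(cosine_similarity(reset_password, recover_account))  # high  -> similar meaning
print(cosine_similarity(reset_password, pizza_recipe))     # lower -> unrelated
```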
Let's take a deeper look at an end-to-end RAG architecture leveraging AWS Services.
Step 1: First, you start with the selection of a large language model. Some considerations to keep in mind are use cases, context length, hosting, training data if applicable, customization, and license agreements. For this, you can use Amazon Bedrock, a fully managed service that offers a choice of high-performing foundation models from leading AI companies via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.
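If it helps while making that selection, here is a minimal sketch, assuming the boto3 SDK, of browsing the foundation models available in your Region; the Region and filter value are illustrative.

```python
import boto3

# Control-plane client for Amazon Bedrock (model listing, not inference).
bedrock = boto3.client("bedrock", region_name="us-east-1")

# List text-generation models with a few attributes relevant to selection.
models = bedrock.list_foundation_models(byOutputModality="TEXT")["modelSummaries"]
for model in models:
    print(model["modelId"], model["providerName"], model.get("inferenceTypesSupported"))
```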
Step 2: Now, with the LLM selection in place, you will start by identifying your knowledge sources and converting them into embeddings for a vector store. For embedding models, factors such as maximum input size, latency, output embedding size, ease of hosting, and accuracy are crucial considerations. Your options include Amazon Titan Embeddings, Cohere Embed, and other embedding models. For your query, generate its embedding using the same embedding model.
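As a rough sketch of what generating an embedding looks like, assuming the boto3 SDK and Amazon Titan Text Embeddings on Amazon Bedrock; treat the model ID as an example to verify for your Region.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def embed(text: str) -> list[float]:
    # Invoke the Titan embedding model; the same model must be used for both
    # your documents and your queries so the vectors are comparable.
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",  # example embedding model ID
        body=json.dumps({"inputText": text}),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())["embedding"]

query_vector = embed("What is our refund policy?")
print(len(query_vector))  # dimensionality of the embedding vector
```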
Step 3: Next are vector databases, which provide you the ability to store and retrieve vectors as high-dimensional points. With this, you can index the vectors generated by your embedding model into a vector database. When evaluating options, consider the nature of your data sources and formats, dimensions, the choice between fully managed and self-managed services, development complexity, and scalability. Available vector store options include the vector engine for Amazon OpenSearch Serverless, Amazon Kendra, and Amazon Aurora PostgreSQL with pgvector. Now, let's talk about orchestration for all of these components. Some available options are Knowledge Bases and Agents for Amazon Bedrock, LangChain, LlamaIndex, AWS Step Functions, as well as other open-source solutions.
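As one orchestration example, here is a rough sketch, assuming a knowledge base already created with Knowledge Bases for Amazon Bedrock and the boto3 SDK; the knowledge base ID and model ARN are placeholders for your own resources.

```python
import boto3

# Runtime client for Knowledge Bases and Agents for Amazon Bedrock.
agent_runtime = boto3.client("bedrock-agent-runtime")

# Retrieval, augmentation, and generation handled in a single managed call.
response = agent_runtime.retrieve_and_generate(
    input={"text": "What is our refund policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",  # placeholder
        },
    },
)
print(response["output"]["text"])
```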
In this episode, we discussed the basics of RAG and reference patterns for implementation. We covered the basics of embeddings and why they are important when implementing RAG in your applications.
Finally, we saw an end-to-end architecture using AWS Services, including Amazon Bedrock. Check out
the links in the description below for more details. Thank you for watching "Back to Basics". See you
next time.