
Introduction to RAG (Retrieval Augmented Generation) and Vector Database

Sachinsoni · 8 min read · Sep 15, 2024


Retrieval-Augmented Generation (RAG) is a powerful technique that enhances the capabilities of language models by combining two key processes: retrieving information and generating text. While traditional language models like GPT rely only on what they were trained on, RAG goes a step further by searching for relevant information from external sources, such as a database or documents, to help generate more accurate and detailed answers. This makes RAG especially useful in tasks where up-to-date or specialized knowledge is needed, such as answering questions or generating informative content.
(Image by CampusX)

Limitations of Large Language Models Before RAG:


Large Language Models (LLMs) like GPT-3 have impressive capabilities, but they come with
several key limitations:

1. Outdated Knowledge: LLMs can’t access new information after they’re trained. They
rely only on the data they were trained with, so they can’t provide real-time or up-to-
date information.

2. Factual Mistakes: LLMs can generate fluent text but sometimes give incorrect or
misleading answers, especially on less common or specialized topics.

3. Hallucination Problem: LLMs sometimes “hallucinate,” meaning they confidently generate information that sounds reasonable but is completely false. This happens because they rely only on patterns in their training data rather than real-time information.

4. No Access to External Information: LLMs can’t look up answers from external sources,
like the internet or a database, which limits their ability to provide specific or accurate
details on certain topics.

How RAG Solves These Issues


Retrieval-Augmented Generation (RAG) helps solve many of these problems by allowing
LLMs to fetch relevant information from external sources. Instead of relying solely on their
internal knowledge, RAG models can search databases or documents for real-time, up-to-
date information, which helps reduce hallucinations and improves the accuracy of
generated content. By integrating retrieval, RAG ensures that the model generates more
reliable, fact-based answers, even for specialized or complex queries.

Understanding Information Retrieval Before RAG


Before diving into Retrieval-Augmented Generation (RAG), it’s crucial to understand
Information Retrieval (IR), which plays a foundational role. As the name suggests,
information retrieval is about finding and extracting relevant data from large datasets.
Think of it like this: when we were kids, we used to answer questions based on a given
paragraph. We didn’t use the entire paragraph; instead, we picked the exact information
needed to answer the question. This is the essence of IR — retrieving only the relevant
information from massive collections, which could be text, images, audio, or even video.

(Image by CampusX)

The process of IR generally consists of a few key steps, the first of which is indexing. Indexing converts external data sources into numerical representations so that large datasets can be searched efficiently. At query time, if you’re looking for information on the “ICC Cricket World Cup 2023,” the IR system scans the index, ranks all related documents, and prioritizes the ones that contain the most relevant information.
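To make this concrete, here is a minimal sketch of indexing and ranking using TF-IDF vectors, a classic IR technique. It assumes scikit-learn is installed; the documents and query are illustrative, not from a real system:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative corpus (an assumption for this sketch).
documents = [
    "India won the ICC Cricket World Cup 2011 at home.",
    "The ICC Cricket World Cup 2023 was hosted by India.",
    "Apple announced a new iPhone at its annual event.",
]

# Indexing: convert every document into a numerical (TF-IDF) vector.
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)

# Retrieval: vectorize the query the same way and rank documents by similarity.
query_vector = vectorizer.transform(["ICC Cricket World Cup 2023"])
scores = cosine_similarity(query_vector, doc_vectors)[0]
for idx in scores.argsort()[::-1]:
    print(f"{scores[idx]:.3f}  {documents[idx]}")
```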

Workflow of RAG:
In Retrieval-Augmented Generation (RAG), the workflow revolves around three main
components: Retrieve, Augment, and Generate. Here’s a detailed breakdown of each
phase:

1. Retrieve
This phase is responsible for fetching relevant information from an external knowledge base, database, or document repository. The process begins with a query, usually derived from the user’s input or a given prompt (a minimal code sketch follows the steps below).

Embedding Model: The input query is first converted into vector embeddings using an
embedding model. This model maps the input into a numerical form that can be used
for similarity searches.

Vector Database: Once the query is embedded, it is sent to a Vector DB, which contains embeddings of documents, text data, or any relevant external information. This database is indexed for fast similarity search (cosine similarity is the measure most often used).

Retriever & Ranker: A retriever component then selects the top N documents or
relevant data points based on similarity. These documents are ranked in order of
relevance, typically using semantic search or other retrieval algorithms like sparse or
dense retrieval methods.
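Here is a minimal sketch of this Retrieve phase. It assumes the sentence-transformers package is installed; the model name and the toy corpus are illustrative choices:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Embedding model: maps text to dense vectors for similarity search.
model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "RAG combines retrieval with text generation.",
    "Vector databases store embeddings for similarity search.",
    "Bananas are rich in potassium.",
]
corpus_embeddings = model.encode(corpus, normalize_embeddings=True)

# Embed the query; with normalized vectors, cosine similarity is a dot product.
query_embedding = model.encode(["How does RAG work?"], normalize_embeddings=True)[0]
scores = corpus_embeddings @ query_embedding

# Retriever & Ranker: take the top-N documents, highest score first.
top_n = 2
retrieved_docs = [corpus[i] for i in np.argsort(scores)[::-1][:top_n]]
print(retrieved_docs)
```

In a production system, the corpus embeddings would live in a dedicated vector database rather than in memory, but the scoring logic is the same.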

2. Augment
In this phase, the retrieved information is used to provide additional context to the query or
prompt, enhancing the model’s understanding of the task.

Retrieved Context: The top N documents fetched in the retrieval stage are passed back
to the model as retrieved context. This information is appended or "augmented" with
the original user query to provide additional details and improve the relevance and
accuracy of the response.

The goal here is to leverage both the external knowledge base and the model’s trained knowledge to handle specific or unseen questions better, as the sketch below shows.
(Image source)
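A minimal sketch of the Augment step: stitch the retrieved passages into the prompt that will be sent to the LLM. The template is an assumption; real systems tune this formatting carefully:

```python
def build_augmented_prompt(query: str, retrieved_docs: list[str]) -> str:
    # Append the retrieved context to the original user query.
    context = "\n".join(f"- {doc}" for doc in retrieved_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

# retrieved_docs comes from the Retrieve sketch above.
prompt = build_augmented_prompt("How does RAG work?", retrieved_docs)
print(prompt)
```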

3. Generate
The final stage is responsible for generating the actual output, which combines the original
prompt/query with the augmented data from the retrieval phase.

LLMs (Large Language Models): The augmented prompt, along with the retrieved context, is passed to an LLM (e.g., GPT or another transformer-based model). The LLM processes the input and generates a response that is more context-aware and accurate, thanks to the extra information it received from the retrieval phase (see the sketch after this list).

Formatted Response: The output is returned as a formatted response, usually displayed in the user interface. The user query is enriched with additional, often domain-specific or up-to-date, information, addressing some of the inherent limitations of standard language models.
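A minimal sketch of the Generate step, using the OpenAI Python client as one example of a chat-capable LLM. The model name is an assumption, and an OPENAI_API_KEY environment variable is expected to be set:

```python
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{"role": "user", "content": prompt}],  # prompt from the Augment step
)
print(response.choices[0].message.content)  # the formatted, context-aware answer
```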

Visualizing the RAG Workflow:


The diagram below outlines the Retrieval-Augmented Generation (RAG) framework, showing how it integrates retrieval methods with a generative approach to improve text generation.
(Image source)

Explanation of the above diagram’s workflow:


A. Left Side (Retrieval Methods)
1. Private or Custom Data:

This represents a large corpus of documents or information that may not be part of the
trained model (such as private databases, custom datasets, or any source of external
knowledge).

2. Embedding Model:

An embedding model (like BERT or Sentence Transformers) is used to convert the documents and the user’s query into dense vector representations (embeddings). These vectors represent the semantic meaning of the documents and queries in high-dimensional space.

3. Vector DB:

The embeddings are stored in a Vector Database, which allows for efficient similarity
search. When a query is issued, the database retrieves the documents based on their
vector similarity to the query.

4. Retriever and Ranker:

Retriever: When a query is provided, the retriever pulls the top N most relevant
documents (based on vector similarity).
Ranker: The ranker ranks these documents based on relevance (using similarity
measures like cosine similarity).

5. Retrieved Context:

The top N most relevant documents or pieces of text are selected as the retrieved
context to be used by the generative model in the next step.

B. Right Side (Generative Approach)


1. External Data Source:

This highlights the limitation of purely generative models (such as LLMs) that do not
have access to new or external data beyond their training corpus. They may:

Not be up-to-date.

Suffer from hallucinations.

Lack specific domain knowledge.

2. Embedding Model:

This step represents how the query and retrieved documents are encoded into
embeddings to be processed by the generative model (e.g., LLMs).

3. LLM (Large Language Model):

The LLM (like GPT, BART, T5, etc.) takes the retrieved context (from the left side of the diagram) along with the original query and generates a response. This helps to improve the factual accuracy of the output by integrating external, relevant documents.

4. Formatted Response (User Interface):

The user receives the final output, which is more informed and factually correct, as it
integrates both generative capabilities and retrieved information.

So far, we have covered the basics of RAG. Now, let’s delve into the concept of Vector
Databases.

Understanding Vector Databases:


In today’s digital landscape, when you perform a search on Google, such as “calories in apple” versus “employees in Apple,” the search engine cleverly distinguishes between the fruit and the company. But have you ever wondered how Google achieves this? The answer lies in a technique known as semantic search.

Semantic search moves beyond simple keyword matching to understand the intent behind a user’s query and leverage context for more accurate results. At its core, semantic search relies on the concept of embeddings — numerical representations of text.
What Are Embeddings?
Embeddings transform words or sentences into numeric vectors. For instance, consider the
word “Apple.” In one context, it could refer to the fruit, while in another, it might denote the
tech company. To represent this, we create a vector that encodes features related to each
context. For “Apple” the fruit, the vector might reflect attributes like “fruit,” “sweet,” and
“edible.” For “Apple” the company, the vector would focus on “technology,” “company,” and
“innovation.”

(Image source)

These vectors allow us to capture the semantic similarity between words. For example, the
vectors for “Apple” and “orange” might show similarities in their fruit-related attributes,
while vectors for “Apple” and “Samsung” would highlight their similarities in the tech
context.
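The following toy illustration uses hand-crafted vectors rather than a real embedding model; each dimension stands for a feature such as “fruit” or “company,” mirroring the example above:

```python
import numpy as np

#                          fruit  sweet  tech  company
apple_fruit   = np.array([ 0.9,   0.8,   0.0,  0.1])
orange        = np.array([ 0.9,   0.7,   0.0,  0.0])
apple_company = np.array([ 0.0,   0.0,   0.9,  0.9])
samsung       = np.array([ 0.0,   0.0,   0.8,  0.9])

def cosine(a, b):
    # Cosine similarity: near 1.0 means similar meaning, near 0.0 unrelated.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(apple_fruit, orange))         # high: both fruits
print(cosine(apple_company, samsung))      # high: both tech companies
print(cosine(apple_fruit, apple_company))  # low: different meanings of "Apple"
```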

The Role of Vector Databases:


With thousands or even millions of embeddings to manage, storing and searching these
vectors efficiently becomes crucial.
(Image source)

Traditional relational databases might initially seem like a viable option. You’d generate
embeddings, store them in a SQL database, and then compare new query embeddings to
retrieve relevant results. However, this approach struggles with scalability and efficiency
when dealing with vast amounts of data.
(Image source)

To address these challenges, vector databases come into play. They are optimized for
storing and querying large-scale vector embeddings. Instead of linear search, which is
computationally expensive and slow for large datasets, vector databases use techniques like
indexing and locality-sensitive hashing (LSH) to speed up searches.

Locality-Sensitive Hashing (LSH):


LSH is a technique that partitions vectors into “buckets” based on their similarity. When a
search query is performed, it is hashed into one of these buckets, significantly reducing the
number of comparisons needed. Instead of comparing the query vector with every stored
vector, you only need to compare it with those in the same bucket. This method accelerates
search times and enhances efficiency.
(Image source)
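Here is a minimal sketch of one common LSH scheme, random hyperplanes, where vectors that fall on the same side of every hyperplane land in the same bucket. The dimensions, number of hyperplanes, and data are illustrative assumptions:

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(42)
dim, n_planes = 384, 8
planes = rng.normal(size=(n_planes, dim))  # random hyperplanes

def lsh_bucket(vector: np.ndarray) -> str:
    # One bit per hyperplane: which side of the plane the vector falls on.
    return "".join("1" if s > 0 else "0" for s in planes @ vector)

# Indexing: hash every stored vector into its bucket.
buckets = defaultdict(list)
stored = rng.normal(size=(1000, dim))
for i, vec in enumerate(stored):
    buckets[lsh_bucket(vec)].append(i)

# Querying: compare only against vectors in the query's bucket,
# instead of scanning all 1,000 stored vectors.
query = rng.normal(size=dim)
candidates = buckets[lsh_bucket(query)]
print(f"Comparing against {len(candidates)} candidates instead of {len(stored)}")
```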

Why Vector Databases?


Vector databases offer two primary benefits:

1. Faster Searches: By employing techniques like LSH, they can rapidly locate relevant
vectors.

2. Optimal Storage: They are designed to handle the unique requirements of vector data,
ensuring efficient storage and retrieval.

As vector databases continue to evolve, they are becoming increasingly essential for
applications involving semantic search, recommendation systems, and more.

I hope this overview helps you understand the fundamentals of vector databases and their
significance in modern data retrieval and search technologies.

References:

Research paper on RAG: Retrieval-Augmented Generation for Large Language Models: A Survey

Vector database YouTube video: https://youtu.be/72XgD322wZ8?si=KPFg30be_EBu7EUa

Vector database article: https://www.pinecone.io/learn/vector-database/

I trust this blog has enriched your understanding of Retrieval Augmented Generation (RAG).
If you found value in this content, I invite you to stay connected for more insightful posts.
Your time and interest are greatly appreciated. Thank you for reading!
