The Rise of Vector Databases in The Age of LLMs

The document discusses the rise of vector databases in the context of large language models (LLMs) and their applications. It highlights the evolution of data management systems, the importance of embeddings, and how vector databases enhance search capabilities and data retrieval processes. The author also outlines trade-offs in choosing vector databases and suggests various use cases beyond search, such as anomaly detection and recommendation systems.


The rise of vector databases in the age of LLMs

Farman Chauhan
LinkedIn/farmanchauhan215
Vector databases in 2022-23

In the last year or so, vector databases seem to be everywhere.

[Chart: worldwide Google Trends interest in these keywords, annotated with “ChatGPT goes viral” (Nov 2022) and “GPT-4 and ChatGPT plugins released” (Mar 2023)]

Source: https://trends.google.com/
Goals of this talk

1. Let’s make databases interesting to a broader audience!
○ Databases, and the choice thereof, power a host of interesting downstream applications
○ I myself never enjoyed the topic of databases until I began thinking about data modeling!

2. Let’s try to think about data as we do about mathematics
○ Data isn’t something we create; it already exists, and it must be discovered
○ Data is an artifact of the activity in the universe, and is embedded in space and time
○ Humans catalog, visualize and analyze data in ways we choose

3. Talk about embeddings and vector databases, and how LLMs tie them together
What is a database?

● On its own, data is unorganized, lacks context and doesn’t provide value
● Data, with context, is information
● A database is a system built to organize data and make it available as information
○ Storage
○ Management (CRUD)
○ Querying
Databases: (almost) as diverse as civilization itself

Real-world data is messy, unpredictable and has unbounded variety in shape/form.

Paradigms: SQL, NoSQL, NewSQL

NewSQL
● Combines the benefits of the SQL/NoSQL paradigms
● SQL-like query languages
● SQL-like ACID compliance
● NoSQL-like flexibility (no-schema)
● NoSQL-like horizontal scalability
The same data can be viewed differently (1)

Relational model: SQL

● ID 1 “knows” ID 8
● ID 2 “knows” ID 17
● Information is stored relationally, in multiple tables
● To query the relationships, the tables must be joined (see the sketch below)
● Ideal for transactional data that requires guaranteed consistency (e.g., financial transactions)
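
A minimal sketch of the relational view, using SQLite; the person/knows tables and their column names are hypothetical, not taken from the slides.

```python
import sqlite3

# Hypothetical schema: a person table plus a join table for the "knows" relationship.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE knows (person_id INTEGER, friend_id INTEGER);
    INSERT INTO person VALUES (1, 'Alice'), (2, 'Carol'), (8, 'Bob'), (17, 'Dave');
    INSERT INTO knows VALUES (1, 8), (2, 17);
""")

# Querying the relationship requires joining the tables.
rows = conn.execute("""
    SELECT a.name, b.name
    FROM knows k
    JOIN person a ON a.id = k.person_id
    JOIN person b ON b.id = k.friend_id
""").fetchall()
print(rows)  # [('Alice', 'Bob'), ('Carol', 'Dave')]
```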
The same data can be viewed differently (2)

Document model: NoSQL

● ID 1 “knows” ID 8
● ID 2 “knows” ID 17
● Relationships are stored redundantly, in a pairwise manner (see the sketch below)
● Ideal in cases where metadata fields are a mix of short-form/long-form numbers/text, whose structure isn’t always known upfront
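
A minimal sketch of the same records in the document model; the field names and values are hypothetical.

```python
# Each record is a self-contained document; the "knows" relationship is stored
# redundantly on both sides instead of being normalized into a separate table.
people = [
    {"id": 1, "name": "Alice", "knows": [8], "bio": "Free-form text of any length..."},
    {"id": 8, "name": "Bob", "knows": [1]},
    {"id": 2, "name": "Carol", "knows": [17], "tags": ["new", "unverified"]},
    {"id": 17, "name": "Dave", "knows": [2]},
]

# No join needed: a single lookup returns the record together with its relationships.
alice = next(p for p in people if p["id"] == 1)
print(alice["knows"])  # [8]
```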
The same data can be viewed differently (3)

Graph model: NoSQL

● ID 1 “knows” ID 8
● ID 2 “knows” ID 17
● What we view as a “row” in a SQL table is a “record” in a graph
● Nodes represent a concept/entity
● Edges represent how these concepts are related in the real world
● To query the relationship, we simply traverse between nodes (see the sketch below)
● Ideal when we want to analyze highly-connected data
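
A minimal sketch of the graph view, using the networkx library as a stand-in for a graph database (a real graph DB would expose a query language such as Cypher); the node IDs follow the slide, everything else is illustrative.

```python
import networkx as nx

# Nodes represent entities; edges represent the "knows" relationship.
G = nx.Graph()
G.add_edge(1, 8, relation="knows")
G.add_edge(2, 17, relation="knows")

# Querying the relationship is a traversal from a node to its neighbours (no joins).
print(list(G.neighbors(1)))       # [8]
print(G.edges[1, 8]["relation"])  # 'knows'
```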
Hybrid databases also exist

[Diagram: the Document, Graph and Relational models at the corners, with Document-Graph, Document-Relational and Relational-Graph hybrid models between them]

● The underlying unit of data may be a table or a document
● Relationships are natively defined between these units without requiring expensive/verbose joins
Do we need a fourth paradigm? (Hint: No)

A vector DB is a purpose-built DB that treats the vector data type as a first-class citizen
● In computer science, a “vector” is an array of numbers of size n
○ [0.3425, 0.4512, -0.3563, 0.0753, …] (n elements)
● “Embedding” → compressed representation (used interchangeably with “vector”)

[Diagram: a sentence transformer model gets a vector for each block of text, and similarity scores are obtained w.r.t. the source sentence; see the sketch below]
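
A minimal sketch of that pipeline with the sentence-transformers library; the model name (all-MiniLM-L6-v2) and the example sentences are assumptions for illustration.

```python
from sentence_transformers import SentenceTransformer, util

# Any sentence-transformer works; all-MiniLM-L6-v2 is small and cheap.
model = SentenceTransformer("all-MiniLM-L6-v2")

source = "Ground transportation at Boston airport"
blocks = [
    "Train to Boston city center",
    "Best restaurants in Denver",
    "Shuttle buses from Logan airport",
]

# Get a vector for each block of text, plus one for the source sentence.
source_vec = model.encode(source)
block_vecs = model.encode(blocks)

# Obtain similarity scores w.r.t. the source sentence.
scores = util.cos_sim(source_vec, block_vecs)
print(scores)  # higher score = more semantically similar
```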
Extending the capabilities of existing data model paradigms

Vector DBs can be viewed as a natural extension to SQL/NoSQL

[Diagram: the vector paradigm shown alongside the SQL and NoSQL paradigms]
Exact → Full-text → Semantic search

● Earlier, search required specifying exact keywords that exist in the data
○ New York City vs. New york
● Full-text search (e.g., Elasticsearch) allowed us to improve retrieval relevance by utilizing relative word frequencies
○ The relative frequencies of the terms “New” and “York” get us close enough to the user query
● However, terms that mean the same thing are not captured
○ Train vs. Light rail
● Vector databases enable LLMs to “understand” factual data
○ Vector spaces are the “language” of models like GPT-4, as well as of vector storage engines
○ The “knowledge” inside an LLM, just like the data in a vector DB, lives in vector space
Visualizing vector spaces in lower dimensions

● Each data point is transformed to its representation in vector space: in 3D space, each vector would have 3 dimensions, represented by 3 numbers
○ [0.3234, 0.4242, 0.0253]
● For text, the vectors are created via transformer models, which capture the semantics of language (not just word features)
● A user query is transformed to the same space, and the distance between it and the data points can be efficiently computed (see the sketch below)
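
A minimal NumPy sketch of that distance computation in a toy 3-D space; all vector values are made up for illustration.

```python
import numpy as np

# Toy 3-D "embeddings" for three data points.
data = np.array([
    [0.3234, 0.4242, 0.0253],
    [0.9012, 0.1123, 0.5534],
    [0.3100, 0.4390, 0.0411],
])

# The user query is embedded into the same 3-D space.
query = np.array([0.3300, 0.4300, 0.0300])

# Cosine similarity between the query and every data point.
sims = data @ query / (np.linalg.norm(data, axis=1) * np.linalg.norm(query))
print(sims.argsort()[::-1])  # indices of the data points, nearest first
```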
Realistic example: Higher-dimensional vector spaces

● Each dimension of a real sentence embedding represents its position in a higher-dimensional vector space
● Similar concepts (e.g., “ground transport” and “Boston”) have similar vector values
● Dissimilar concepts (“Toronto” vs. “Denver”) have dissimilar values

Source: https://txt.cohere.com/sentence-word-embeddings/
Trade-offs in choosing embedding models

The Massive Text Embedding Benchmark (MTEB) leaderboard is a good place to start!

Image credit: Vespa blog, https://blog.vespa.ai/bge-embedding-models-in-vespa-using-bfloat16/

💡 Note: The MTEB leaderboard considers only exact, exhaustive search — when coupling these models with ANN search, your results may not correspond with their rankings!
Vector indexes in practice (HNSW)

● Hierarchical Navigable Small World graphs (HNSW) is the index that powers search functionality in many vector databases (see the sketch below)
● It achieves a good balance of recall and latency, by rapidly narrowing down on the region of interest in vector space
● However, it can consume a fair amount of memory, so disk-based methods are becoming more important

[Figure: the example sentences “Train to Boston City Center” and “Ground transportation at Boston airport” located near each other in vector space]
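
A minimal sketch of building and querying an HNSW index with the hnswlib library (independent of any particular vector DB); the dimension and the M/ef values are illustrative, not recommendations from the slides.

```python
import hnswlib
import numpy as np

dim, num_elements = 384, 10_000

# Illustrative data: random vectors standing in for real sentence embeddings.
data = np.float32(np.random.random((num_elements, dim)))

# Build the HNSW index; M and ef_construction trade build time/memory for recall.
index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=num_elements, ef_construction=200, M=16)
index.add_items(data, np.arange(num_elements))

# ef controls the recall/latency trade-off at query time.
index.set_ef(50)

query = np.float32(np.random.random((1, dim)))
labels, distances = index.knn_query(query, k=5)
print(labels, distances)
```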
Upcoming vector indexes on-disk (Vamana)

● “Vamana” is a recently developed graph-based vector index, part of the DiskANN suite of ANN algorithms
● The original C++ on-disk implementation is challenging to adapt to existing DBs in a way that is efficient and scalable (ongoing work at LanceDB, Weaviate & others)
● It indexes data that’s too large to fit in memory, and because of its “inside-out” approach, it’s still efficient despite being entirely on-disk
Putting it all together: What makes a vector database
Long-term memory for ChatGPT via vector DBs

● OpenAI provides a ChatGPT retrieval plugin that connects to a variety of vector DBs
○ https://github.com/openai/chatgpt-retrieval-plugin
● It continually stores GPT’s responses to the user in every chat
● On the Nth day, when the user sends a query that requires historical context, retrieving the top-k similar chat entries for that user is trivial (see the sketch below)
● It’s possible to build a custom API that does this for LLMs other than OpenAI’s, too
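
A minimal, framework-free sketch of the idea rather than the plugin's actual code; the remember/recall helpers and the in-memory lists standing in for a vector DB are hypothetical.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
memory_texts, memory_vecs = [], []  # stands in for a real vector DB collection

def remember(turn: str) -> None:
    """Store a chat turn as an embedding (an upsert into a vector DB in practice)."""
    memory_texts.append(turn)
    memory_vecs.append(model.encode(turn, normalize_embeddings=True))

def recall(query: str, k: int = 3) -> list[str]:
    """Return the top-k stored turns most similar to the query."""
    q = model.encode(query, normalize_embeddings=True)
    sims = np.array(memory_vecs) @ q
    return [memory_texts[i] for i in np.argsort(sims)[::-1][:k]]

remember("User prefers the aisle seat on flights to Boston.")
remember("User asked about ground transportation at Boston airport.")
print(recall("How do I get from Logan airport into the city?", k=1))
```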
Retrieval Augmented Generation (RAG) 1: No vector DBs

● The user query is passed directly as a prompt to an LLM
● The LLM constructs a query for the database of choice (as specified via a single-shot prompt); see the sketch below
● A natural language response is generated to send back to the human
● Limitation: the generated query could be incorrect or return a null result
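
A minimal sketch of this flow, assuming a hypothetical call_llm helper standing in for any chat-completion API and a toy SQLite schema; none of these names come from the slides.

```python
import sqlite3

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: plug in your LLM provider's chat-completion call here.
    raise NotImplementedError

SINGLE_SHOT_PROMPT = """You write SQLite queries.
Schema: person(id, name, city)
Example: "Who lives in Boston?" -> SELECT name FROM person WHERE city = 'Boston';
Question: {question}
SQL:"""

def answer(question: str, conn: sqlite3.Connection) -> str:
    # The LLM constructs the database query itself...
    sql = call_llm(SINGLE_SHOT_PROMPT.format(question=question))
    # ...which may be malformed or return nothing: the key limitation of this approach.
    rows = conn.execute(sql).fetchall()
    return call_llm(f"Question: {question}\nRows: {rows}\nAnswer in natural language:")
```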
Retrieval Augmented Generation (RAG) 2: Vector DB-augmented

● Data is first stored in a vector database as embeddings
● The user query is first converted to vector form and the top-k most similar results are returned from the vector DB
● The top-k results are used as context to build a prompt for an LLM (alongside the user query); see the sketch below
● The LLM then only needs to look through the top-k results (not the whole dataset), and a generated response is sent back to the user
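
A minimal sketch of the vector DB-augmented flow; the documents, the retrieve/build_prompt helpers, and the in-memory NumPy array standing in for a real vector DB are all assumptions for illustration.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "The Silver Line bus runs from Logan airport to South Station.",
    "Boston's subway is known locally as the T.",
    "Denver's airport is about 25 miles from downtown.",
]
# Step 1: the data is stored as embeddings (a vector DB collection in practice).
doc_vecs = model.encode(docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    # Step 2: the query is embedded into the same space; the top-k results come back.
    q = model.encode(query, normalize_embeddings=True)
    top = np.argsort(doc_vecs @ q)[::-1][:k]
    return [docs[i] for i in top]

def build_prompt(query: str) -> str:
    # Step 3: the top-k results become the context alongside the user query.
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How do I get from Logan airport into the city?"))
```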
Trade-offs when choosing a vector database

1. Purpose-built or incumbent solution
2. On-prem or cloud
3. Indexing speed vs. query latency
4. Good recall vs. low latency
5. In-memory vs. on-disk index
6. Sparse vs. dense vectors
7. Keyword or vector search & retrieval (or hybrid)
8. Pre-filtering vs. post-filtering

A blog post on this is available on thedataquarry.com


Vector databases are not all about search!

● Vectors are truly multi-modal (text, images and audio)
● “Long-term memory for AI” is not the only use case
○ For the first time in history, we have a storage layer that speaks the same language as the query layer (i.e., “vectors”)
● Many more interesting applications are enabled by vector databases:
○ Data discovery with human feedback (when the keyword isn’t known in advance)
○ Recommendation systems (embed search query history over time, per user)
○ Anomaly detection (most dissimilar vectors; see the sketch below)

Further reading on the Qdrant blog: https://qdrant.tech/articles/vector-similarity-beyond-search/
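
A minimal sketch of the anomaly-detection idea (most dissimilar vectors); random vectors stand in for real embeddings, and the "farthest nearest neighbour" heuristic below is one simple way to operationalize it, not something prescribed by the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
vecs = rng.normal(size=(100, 384)).astype(np.float32)  # stand-ins for real embeddings
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)

sims = vecs @ vecs.T
np.fill_diagonal(sims, -np.inf)      # ignore self-similarity
nearest = sims.max(axis=1)           # similarity to each item's closest neighbour
anomalies = np.argsort(nearest)[:5]  # the items most dissimilar from everything else
print(anomalies)
```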


When it comes to embeddings, bigger ≠ better!

- OpenAI’s text-embedding-ada-002 produces vectors with 1536 dimensions
- Cohere’s embedding dimensions are anywhere from 512 to 4096
- sentence-transformers embeddings are 384, 512 or 768 dimensions

Always test the cheapest model (all-MiniLM-L6-v2) first, on your own data (see the sketch below).

Supabase observed pgvector with all-MiniLM-L6-v2 outperforming text-embedding-ada-002 by 78% when holding precision@10 constant at 0.99, all while using less memory.

Pre-training data distribution, database indexing and other optimizations dictate the outcome!
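
A minimal sketch of checking an embedding model's dimension and per-vector memory footprint before committing to it; it does not reproduce the Supabase benchmark above.

```python
from sentence_transformers import SentenceTransformer

# Check the embedding dimension and the rough per-vector memory cost.
model = SentenceTransformer("all-MiniLM-L6-v2")
dim = model.get_sentence_embedding_dimension()  # 384 for this model
print(dim, "dims ->", dim * 4, "bytes per float32 vector")

# For comparison, a 1536-dim model (e.g., text-embedding-ada-002) needs
# 1536 * 4 = 6144 bytes per vector: 4x the storage and index memory.
```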
An opinionated slide: My go-tos

How to choose a vector DB amongst the sea of options? After analyzing many trade-offs, my go-tos are the following (due to ease of setup and use):

First go-to:
● Open-source
● Built in Rust 🦀 (fast + lightweight)
● Client-server architecture
● Hosted cloud solution available
● Custom filtering algorithm (neither pre/post filter) + search-as-you-type
● Use as first choice wherever possible

Second go-to:
● Open-source
● Built in Rust 🦀 (fast + lightweight)
● Embedded, serverless
● DB is tightly coupled with application layer
● Fast disk-based search & retrieval for huge, out-of-memory data
● Keep an eye out for them later in 2023
Thank you!
Questions/comments?