
9/4/24, 9:30 AM Picking a vector database: a comparison and guide for 2023

Picking a vector database: a comparison and guide for 2023
In an era where semantic search and retrieval-augmented generation (RAG) are redefining our online interactions, the backbone
supporting these advancements is often overlooked: vector databases. If you're diving into applications like large language models,
RAG, or any platform leveraging semantic search, you're in the right place.

Picking a vector database can be hard. Scalability, latency, costs, and even compliance hinge on this choice. For those navigating
this terrain, I've set out to sift through the noise and compare the leading vector databases of 2023. I've included
the following vector databases in the comparison: Pinecone, Weaviate, Milvus, Qdrant, Chroma, Elasticsearch and PGvector. The
data behind the comparison comes from ANN Benchmarks, the docs and internal benchmarks of each vector database, and from
digging through open-source GitHub repos.

A comparison of leading vector databases

| | Pinecone | Weaviate | Milvus | Qdrant | Chroma | Elasticsearch | PGvector |
|---|---|---|---|---|---|---|---|
| Is open source | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| Self-host | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Cloud management | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | (✔️) |
| Purpose-built for Vectors | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Developer experience | 👍👍👍 | 👍👍 | 👍👍 | 👍👍 | 👍👍 | 👍 | 👍 |
| Community | Community page & events | 8k☆ github, 4k slack | 23k☆ github, 4k slack | 13k☆ github, 3k discord | 9k☆ github, 6k discord | 23k slack | 6k☆ github |
| Queries per second (using text nytimes-256-angular) | 150 (for p2, but more pods can be added) | 791 | 2406 | 326 | ? | 700-100 (from various reports) | 141 |
| Latency, ms (Recall/Percentile 95 (millis), nytimes-256-angular) | 1 (batched search, 0.99 recall, 200k SBERT) | 2 | 1 | 4 | ? | ? | 8 |
| Supported index types | ? | HNSW | Multiple (11 total) | HNSW | HNSW | HNSW | HNSW/IVFFlat |
| Hybrid Search (i.e. scalar filtering) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Disk index support | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| Role-based access control | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ | ❌ |
| Dynamic segment placement vs. static data sharding | ? | Static sharding | Dynamic segment placement | Static sharding | Dynamic segment placement | Static sharding | - |
| Free hosted tier | ✅ | ✅ | ✅ | (free self-hosted) | (free self-hosted) | (free self-hosted) | (varies) |
| Pricing (50k vectors @1536) | $70 | fr. $25 | fr. $65 | est. $9 | Varies | $95 | Varies |
| Pricing (20M vectors, 20M req. @768) | $227 ($2074 for high performance) | $1536 | fr. $309 ($2291 for high performance) | fr. $281 ($820 for high performance) | Varies | est. $1225 | Varies |

Navigating the terrain of vector databases in 2023 reveals a diverse array of options, each catering to different needs. The
comparison table gives the full picture, but here's a succinct summary to aid your decision:

1. Open-Source and hosted cloud: If you lean towards open-source solutions, Weaviate, Milvus, and Chroma emerge as top
contenders. Pinecone, although not open-source, shines with its developer experience and a robust fully hosted solution.

2. Performance: When it comes to raw performance in queries per second, Milvus takes the lead (2406), followed by Weaviate (791)
and Qdrant (326). In terms of latency, however, Pinecone and Milvus both offer impressive sub-2ms results. If multiple pods
are added for Pinecone, much higher QPS can be reached.

3. Community Strength: Milvus boasts the largest community presence, followed by Weaviate and Elasticsearch. A strong
community often translates to better support, enhancements, and bug fixes.

4. Scalability, advanced features and security: Role-based access control, a feature crucial for many enterprise applications, is
found in Pinecone, Milvus, and Elasticsearch. On the scaling front, dynamic segment placement is offered by Milvus and
Chroma, making them suitable for ever-evolving datasets. If you're in need of a database with a wide array of index types,
Milvus' support for 11 different types is unmatched. While hybrid search is well-supported across the board, Elasticsearch does
fall short in terms of disk index support.

5. Pricing: For startups or projects on a budget, Qdrant's estimated $9 pricing for 50k vectors is hard to beat. On the other end of
the spectrum, for larger projects requiring high performance, Pinecone and Milvus offer competitive pricing tiers.
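
The summary above amounts to filtering the comparison table by the features a project cannot do without. Below is a minimal sketch of that idea, not a recommendation engine: the boolean values are transcribed from four rows of the table (a 2023 snapshot), and the `VectorDB` class and `shortlist` helper are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorDB:
    """One column of the comparison table, reduced to four yes/no rows."""
    name: str
    open_source: bool
    self_host: bool
    rbac: bool
    disk_index: bool


# Values transcribed from the comparison table (2023 snapshot).
DATABASES = [
    VectorDB("Pinecone", False, False, True, True),
    VectorDB("Weaviate", True, True, False, True),
    VectorDB("Milvus", True, True, True, True),
    VectorDB("Qdrant", True, True, False, True),
    VectorDB("Chroma", True, True, False, True),
    VectorDB("Elasticsearch", False, True, True, False),
    VectorDB("PGvector", True, True, False, True),
]


def shortlist(databases, **required):
    """Return the names of databases whose attributes match every requirement."""
    return [
        db.name
        for db in databases
        if all(getattr(db, key) == value for key, value in required.items())
    ]


if __name__ == "__main__":
    # Example: open source is mandatory and role-based access control is needed.
    print(shortlist(DATABASES, open_source=True, rbac=True))
```

A real decision would also weigh the non-boolean rows (QPS, latency, pricing), which do not reduce to a simple filter like this.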

In conclusion, there's no one-size-fits-all when it comes to vector databases. The ideal choice varies based on specific project
needs, budget constraints, and personal preferences. This guide offers a comprehensive view of the top vector databases of
2023 and aims to simplify the decision-making process for developers. My choice? I'm testing out Pinecone and Milvus in the wild,
mostly because of their high performance, Milvus's strong community, and price flexibility at scale.

Emil Fröberg
co-founder of Vectorview

Sources
https://www.kdnuggets.com/2023/06/vector-databases-important-llms.html

https://ann-benchmarks.com/
https://qdrant.tech/benchmarks/
https://zilliz.com/comparison

GitHub and docs for each vector database

Appendix 1: explanation of comparison parameters


Is open source: Indicates if the software's source code is freely available to the public, allowing developers to review, modify,
and distribute the software.

Self-host: Specifies if the database can be hosted on a user's own infrastructure rather than being dependent on a third-party
cloud service.

Cloud management: Indicates if the provider offers an interface for managing the database in the cloud.

Purpose-built for Vectors: This means the database was specifically designed with vector storage and retrieval in mind, rather
than being a general database with added vector capabilities.

Developer experience: Evaluates how user-friendly and intuitive it is for developers to work with the database, considering
aspects like documentation, SDKs, and API design.

Community: Assesses the size and activity of the developer community around the database. A strong community often
indicates good support, contributions, and the potential for continued development.

Queries per second: How many queries the database can handle per second using a specific dataset for benchmarking (in this
case, the nytimes-256-angular dataset).

Latency: The delay (in milliseconds) between initiating a request and receiving a response. 95% of query latencies fall under the
specified time for the nytimes-256-angular dataset.
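
To make these two metrics concrete, here is a minimal sketch of how queries per second and 95th-percentile latency could be measured for any search function. It is an illustration only, not the ANN Benchmarks harness: it issues queries one at a time from a single client, and `search` and `queries` are placeholders for whatever client call and query set you use.

```python
import time


def p95(latencies_ms):
    """Nearest-rank 95th percentile: the smallest sample such that at least
    95% of all samples are less than or equal to it."""
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def benchmark(search, queries):
    """Run queries sequentially; return (queries per second, p95 latency in ms)."""
    latencies_ms = []
    start = time.perf_counter()
    for query in queries:
        t0 = time.perf_counter()
        search(query)
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed, p95(latencies_ms)


if __name__ == "__main__":
    # Stand-in workload; replace the lambda with a real client call.
    qps, latency = benchmark(lambda q: sum(range(1000)), list(range(200)))
    print(f"{qps:.0f} queries/s, p95 latency {latency:.3f} ms")
```

Batched or concurrent clients typically report higher throughput than a sequential loop like this one, which is one reason numbers from different sources are hard to compare directly.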

Supported index types: Refers to the various indexing techniques the database supports, which can influence search speed
and accuracy. Some vector databases may support multiple indexing types like HNSW, IVF, and more.

Hybrid Search: Determines if the database allows for combining traditional (scalar) queries with vector queries. This can be
crucial for applications that need to filter results based on non-vector criteria.
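
As a rough illustration of scalar filtering combined with vector search, the sketch below filters records on metadata first and then ranks the survivors by cosine similarity. It is a brute-force toy under stated assumptions: the record layout (`id`, `vector`, `meta`) and the `hybrid_search` function are hypothetical, and real databases combine filters with approximate indexes in their own ways.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def hybrid_search(records, query_vector, predicate, k=3):
    """Keep records whose metadata passes the scalar predicate, then return
    the ids of the k most similar vectors among them."""
    candidates = [r for r in records if predicate(r["meta"])]
    candidates.sort(
        key=lambda r: cosine_similarity(r["vector"], query_vector),
        reverse=True,
    )
    return [r["id"] for r in candidates[:k]]


# Hypothetical sample data: two-dimensional vectors with a "year" field.
RECORDS = [
    {"id": "a", "vector": [1.0, 0.0], "meta": {"year": 2023}},
    {"id": "b", "vector": [0.9, 0.1], "meta": {"year": 2021}},
    {"id": "c", "vector": [0.0, 1.0], "meta": {"year": 2023}},
]

if __name__ == "__main__":
    # Only 2023 records are eligible, so "b" is excluded despite being similar.
    print(hybrid_search(RECORDS, [1.0, 0.0], lambda m: m["year"] == 2023, k=2))
```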

Disk index support: Indicates if the database supports storing indexes on disk. This is essential for handling large datasets that
cannot fit into memory.

Role-based access control: Checks if the database has security mechanisms that allow permissions to be granted to specific
roles or users, enhancing data security.

Dynamic segment placement vs. static data sharding: Refers to how the database manages data distribution and scaling.
Dynamic segment placement allows for more flexible data distribution based on real-time needs, while static data sharding
divides data into predetermined segments.
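
The difference can be sketched with a toy model. This is only an illustration of the general idea and does not describe how Milvus, Chroma, or any other database in the table implements it; `static_shard` and `DynamicPlacer` are hypothetical names, and "least-loaded node" is one assumed placement policy among many.

```python
import zlib


def static_shard(key, num_shards):
    """Static sharding: a key's shard is fixed by its hash and the shard count,
    so changing num_shards generally remaps existing keys."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


class DynamicPlacer:
    """Toy dynamic placement: each new segment goes to the least-loaded node,
    so nodes added later start absorbing new data immediately."""

    def __init__(self, nodes):
        self.load = {node: 0 for node in nodes}
        self.placement = {}

    def add_node(self, node):
        self.load[node] = 0

    def place(self, segment_id, size):
        node = min(self.load, key=self.load.get)
        self.load[node] += size
        self.placement[segment_id] = node
        return node


if __name__ == "__main__":
    print(static_shard("doc-1", 4))
    placer = DynamicPlacer(["n1", "n2"])
    print(placer.place("s1", 10), placer.place("s2", 5))
    placer.add_node("n3")
    print(placer.place("s3", 1))
```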

Free hosted tier: Specifies if the database provider offers a free cloud-hosted version, allowing users to test or use the
database without initial investment.

Pricing (50k vectors @1536) and Pricing (20M vectors, 20M req. @768): Provides information on the cost associated with
storing and querying specific amounts of data, giving an insight into the database's cost-effectiveness for both small and large-
scale use cases.
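
For a sense of scale, the raw size of the two pricing scenarios can be estimated with simple arithmetic. The sketch below assumes each dimension is stored as a 4-byte float32, which the comparison does not state, and it ignores index overhead, metadata, and replication, so actual storage (and therefore pricing) will differ.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw payload size, assuming one float32 (4 bytes) per dimension."""
    return num_vectors * dimensions * bytes_per_value


small = raw_vector_bytes(50_000, 1536)      # 50k vectors @1536
large = raw_vector_bytes(20_000_000, 768)   # 20M vectors @768

print(f"{small / 1e6:.1f} MB")  # 307.2 MB
print(f"{large / 1e9:.2f} GB")  # 61.44 GB
```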
