Bring your own dense vectors
Stack Serverless
Elasticsearch enables you to store and search mathematical representations of your content called embeddings or vectors, which help machines understand and process your data more effectively. There are two types of representation (dense and sparse), which are suited to different types of queries and use cases (for example, finding similar images and content or storing expanded terms and weights).
In this introduction to vector search, you'll store and search for dense vectors. You'll also learn the syntax for searching these documents using a k-nearest neighbor (kNN) query.
- If you're using Elasticsearch Serverless, create a project that is optimized for vectors. To add the sample data, you must have a `developer` or `admin` predefined role or an equivalent custom role.
- If you're using Elastic Cloud Hosted or a self-managed cluster, start Elasticsearch and Kibana. The simplest method to complete the steps in this guide is to log in with a user that has the `superuser` built-in role.
To learn about role-based access control, check out User roles.
When you create vectors (or vectorize your data), you convert complex and nuanced content (such as text, videos, images, or audio) into multidimensional numerical representations. They must be stored in specialized data structures designed to ensure efficient similarity search and speedy vector distance calculations.
In this guide, you'll use documents that already have dense vector embeddings. To deploy a vector embedding model in Elasticsearch and generate vectors while ingesting and searching your data, refer to the links in Learn more.
This is an advanced use case that uses the `dense_vector` field type. Refer to Semantic search for an overview of your options for semantic search with Elasticsearch.
To learn about the differences between semantic search and vector search, go to AI-powered search.
Create an index with dense vector field mappings
Each document in our simple data set will have:
- A review: stored in a `review_text` field
- An embedding of that review: stored in a `review_vector` field, which is defined as a `dense_vector` data type.
Tip: The `dense_vector` type automatically uses `int8_hnsw` quantization by default to reduce the memory footprint required when searching float vectors. Learn more about balancing performance and accuracy in Dense vector quantization.

The following API request defines the `review_text` and `review_vector` fields:

PUT /amazon-reviews
{
  "mappings": {
    "properties": {
      "review_vector": {
        "type": "dense_vector",
        "dims": 8,
        "index": true,
        "similarity": "cosine"
      },
      "review_text": {
        "type": "text"
      }
    }
  }
}
- The `dims` parameter must match the length of the embedding vector. If not specified, `dims` will be dynamically calculated based on the first indexed document.
- The `index` parameter is set to `true` to enable the use of the `knn` query.
- The `similarity` parameter defines the similarity function used to compare the query vector to the document vectors. `cosine` is the default similarity function for `dense_vector` fields in Elasticsearch.
Here we're using an 8-dimensional embedding for readability. The vectors that neural network models work with can have hundreds or even thousands of dimensions that represent a point in a multi-dimensional space. Each vector dimension represents a feature or a characteristic of the unstructured data.
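To make the `cosine` similarity choice concrete, here is a minimal Python sketch (plain Python, no Elasticsearch required; the function name is ours) that compares two of the 8-dimensional vectors used in this guide:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|).
    # 1.0 means the vectors point in the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

positive = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
negative = [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]

print(cosine_similarity(positive, positive))  # identical vectors -> ~1.0
print(cosine_similarity(positive, negative))  # reversed vector -> noticeably lower
```

Note that cosine similarity compares direction, not magnitude, which is why it is a common default for text embeddings.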
Add documents with embeddings
First, index a single document to understand the document structure:
PUT /amazon-reviews/_doc/1
{
  "review_text": "This product is lifechanging! I'm telling all my friends about it.",
  "review_vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
}
- The size of the `review_vector` array is 8, matching the `dims` count specified in the mapping.
In a production scenario, you'll want to index many documents at once using the `_bulk` endpoint. Here's an example of indexing multiple documents in a single `_bulk` request:

POST /_bulk
{ "index": { "_index": "amazon-reviews", "_id": "2" } }
{ "review_text": "This product is amazing! I love it.", "review_vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8] }
{ "index": { "_index": "amazon-reviews", "_id": "3" } }
{ "review_text": "This product is terrible. I hate it.", "review_vector": [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1] }
{ "index": { "_index": "amazon-reviews", "_id": "4" } }
{ "review_text": "This product is great. I can do anything with it.", "review_vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8] }
{ "index": { "_index": "amazon-reviews", "_id": "5" } }
{ "review_text": "This product has ruined my life and the lives of my family and friends.", "review_vector": [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1] }
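The `_bulk` body is newline-delimited JSON: an action line followed by a source line for each document, terminated by a final newline. As a rough illustration, here's how such a payload could be assembled in Python (the `build_bulk_body` helper is hypothetical, not part of any Elasticsearch client):

```python
import json

def build_bulk_body(index_name, docs):
    # The _bulk API expects NDJSON: for each document, one action line
    # ({"index": {...}}) followed by one source line, ending with a newline.
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"

docs = {
    "2": {"review_text": "This product is amazing! I love it.",
          "review_vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]},
    "3": {"review_text": "This product is terrible. I hate it.",
          "review_vector": [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]},
}
body = build_bulk_body("amazon-reviews", docs)
print(body)
```

You would send this string as the request body of `POST /_bulk` with the `application/x-ndjson` content type.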
Now you can query these document vectors using a `knn` retriever. `knn` is a type of vector search, which finds the `k` most similar documents to a query vector.
Here we're using a raw vector for the query text for demonstration purposes:
POST /amazon-reviews/_search
{
"retriever": {
"knn": {
"field": "review_vector",
"query_vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
"k": 2,
"num_candidates": 5
}
}
}
- A raw vector serves as the query text in this example. In a real-world scenario, you'll need to generate vectors for queries using an embedding model.
- The `k` parameter specifies the number of results to return.
- The `num_candidates` parameter is optional. It limits the number of candidates returned by the search node. This can improve performance and reduce costs.
If you want to try a similar set of steps from an Elasticsearch client, check out the guided index workflow:
- If you're using Elasticsearch Serverless, go to Elasticsearch > Home, select the vector search workflow, and Create a vector optimized index.
- If you're using Elastic Cloud Hosted or a self-managed cluster, go to Elasticsearch > Home and click Create API index. Select the vector search workflow.
When you finish your tests and no longer need the sample data set, delete the index:
DELETE /amazon-reviews
In these simple examples, we're sending a raw vector for the query text.
In a real-world scenario you won't know the query text ahead of time.
You'll need to generate query vectors, on the fly, using the same embedding model that generated the document vectors.
For this you'll need to deploy a text embedding model in Elasticsearch and use the `query_vector_builder` parameter.
Alternatively, you can generate vectors client-side and send them directly with the search request.
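As a sketch of the server-side approach, the search body would include a `query_vector_builder` clause instead of a raw `query_vector`. The `model_id` below is a placeholder for whichever text embedding model you deployed:

```python
import json

# Sketch of a kNN search body that asks Elasticsearch to embed the query
# text server-side via query_vector_builder. The model_id is a placeholder.
search_body = {
    "knn": {
        "field": "review_vector",
        "query_vector_builder": {
            "text_embedding": {
                "model_id": "my-text-embedding-model",  # placeholder
                "model_text": "amazing product",
            }
        },
        "k": 2,
        "num_candidates": 5,
    }
}
print(json.dumps(search_body, indent=2))
```

At search time, Elasticsearch runs `model_text` through the referenced model and uses the resulting vector for the kNN search, so queries and documents are embedded by the same model.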
For an example of using pipelines to generate text embeddings, check out Tutorial: Dense and sparse workflows using ingest pipelines.
To learn about more search options, such as semantic, full-text, and hybrid, go to Search approaches.