Elasticsearch API
http://api.example.com
Elasticsearch provides REST APIs that are used by the UI components and can be called directly to configure and access Elasticsearch features.
Documentation source and versions
This documentation is derived from the 9.0
branch of the elasticsearch-specification repository. It is provided under license Attribution-NonCommercial-NoDerivatives 4.0 International.
This documentation contains work-in-progress information for future Elastic Stack releases.
Last update on Jun 6, 2025.
This API is provided under license Apache 2.0.
Create a behavioral analytics collection event
Deprecated
Technical preview
Path parameters
-
collection_name
string Required The name of the behavioral analytics collection.
-
event_type
string Required The analytics event type.
Values are
page_view
,search
, orsearch_click
.
Query parameters
-
debug
boolean Whether the response type has to include more details
POST _application/analytics/my_analytics_collection/event/search_click
{
"session": {
"id": "1797ca95-91c9-4e2e-b1bd-9c38e6f386a9"
},
"user": {
"id": "5f26f01a-bbee-4202-9298-81261067abbd"
},
"search":{
"query": "search term",
"results": {
"items": [
{
"document": {
"id": "123",
"index": "products"
}
}
],
"total_results": 10
},
"sort": {
"name": "relevance"
},
"search_application": "website"
},
"document":{
"id": "123",
"index": "products"
}
}
curl \
--request POST 'http://api.example.com/_application/analytics/{collection_name}/event/{event_type}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"session\": {\n \"id\": \"1797ca95-91c9-4e2e-b1bd-9c38e6f386a9\"\n },\n \"user\": {\n \"id\": \"5f26f01a-bbee-4202-9298-81261067abbd\"\n },\n \"search\":{\n \"query\": \"search term\",\n \"results\": {\n \"items\": [\n {\n \"document\": {\n \"id\": \"123\",\n \"index\": \"products\"\n }\n }\n ],\n \"total_results\": 10\n },\n \"sort\": {\n \"name\": \"relevance\"\n },\n \"search_application\": \"website\"\n },\n \"document\":{\n \"id\": \"123\",\n \"index\": \"products\"\n }\n}"'
{
"session": {
"id": "1797ca95-91c9-4e2e-b1bd-9c38e6f386a9"
},
"user": {
"id": "5f26f01a-bbee-4202-9298-81261067abbd"
},
"search":{
"query": "search term",
"results": {
"items": [
{
"document": {
"id": "123",
"index": "products"
}
}
],
"total_results": 10
},
"sort": {
"name": "relevance"
},
"search_application": "website"
},
"document":{
"id": "123",
"index": "products"
}
}
Compact and aligned text (CAT)
The compact and aligned text (CAT) APIs aim are intended only for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, it's recommend to use a corresponding JSON API.
All the cat commands accept a query string parameter help
to see all the headers and info they provide, and the /_cat
command alone lists all the available commands.
Get a document count
Get quick access to a document count for a data stream, an index, or an entire cluster. The document count only includes live documents, not deleted documents which have not yet been removed by the merge process.
IMPORTANT: CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the count API.
GET /_cat/count/my-index-000001?v=true&format=json
curl \
--request GET 'http://api.example.com/_cat/count' \
--header "Authorization: $API_KEY"
[
{
"epoch": "1475868259",
"timestamp": "15:24:20",
"count": "120"
}
]
[
{
"epoch": "1475868259",
"timestamp": "15:24:20",
"count": "121"
}
]
Get a document count
Get quick access to a document count for a data stream, an index, or an entire cluster. The document count only includes live documents, not deleted documents which have not yet been removed by the merge process.
IMPORTANT: CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the count API.
Path parameters
-
index
string | array[string] Required A comma-separated list of data streams, indices, and aliases used to limit the request. It supports wildcards (
*
). To target all data streams and indices, omit this parameter or use*
or_all
.
GET /_cat/count/my-index-000001?v=true&format=json
curl \
--request GET 'http://api.example.com/_cat/count/{index}' \
--header "Authorization: $API_KEY"
[
{
"epoch": "1475868259",
"timestamp": "15:24:20",
"count": "120"
}
]
[
{
"epoch": "1475868259",
"timestamp": "15:24:20",
"count": "121"
}
]
Get field data cache information
Get the amount of heap memory currently used by the field data cache on every data node in the cluster.
IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the nodes stats API.
Path parameters
-
fields
string | array[string] Required Comma-separated list of fields used to limit returned information. To retrieve all fields, omit this parameter.
Query parameters
-
bytes
string The unit used to display byte values.
Values are
b
,kb
,mb
,gb
,tb
, orpb
. -
fields
string | array[string] Comma-separated list of fields used to limit returned information.
-
h
string | array[string] List of columns to appear in the response. Supports simple wildcards.
-
s
string | array[string] List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting
:asc
or:desc
as a suffix to the column name.
GET /_cat/fielddata?v=true&fields=body&format=json
curl \
--request GET 'http://api.example.com/_cat/fielddata/{fields}' \
--header "Authorization: $API_KEY"
[
{
"id": "Nqk-6inXQq-OxUfOUI8jNQ",
"host": "127.0.0.1",
"ip": "127.0.0.1",
"node": "Nqk-6in",
"field": "body",
"size": "544b"
}
]
[
{
"id": "Nqk-6inXQq-OxUfOUI8jNQ",
"host": "1127.0.0.1",
"ip": "127.0.0.1",
"node": "Nqk-6in",
"field": "body",
"size": "544b"
},
{
"id": "Nqk-6inXQq-OxUfOUI8jNQ",
"host": "127.0.0.1",
"ip": "127.0.0.1",
"node": "Nqk-6in",
"field": "soul",
"size": "480b"
}
]
Get node information
Get information about the nodes in a cluster. IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the nodes info API.
Query parameters
-
bytes
string The unit used to display byte values.
Values are
b
,kb
,mb
,gb
,tb
, orpb
. -
full_id
boolean | string If
true
, return the full node ID. Iffalse
, return the shortened node ID. -
include_unloaded_segments
boolean If true, the response includes information from segments that are not loaded into memory.
-
h
string | array[string] A comma-separated list of columns names to display. It supports simple wildcards.
Supported values include:
build
(orb
): The Elasticsearch build hash. For example:5c03844
.completion.size
(orcs
,completionSize
): The size of completion. For example:0b
.cpu
: The percentage of recent system CPU used.disk.avail
(ord
,disk
,diskAvail
): The available disk space. For example:198.4gb
.disk.total
(ordt
,diskTotal
): The total disk space. For example:458.3gb
.disk.used
(ordu
,diskUsed
): The used disk space. For example:259.8gb
.disk.used_percent
(ordup
,diskUsedPercent
): The percentage of disk space used.fielddata.evictions
(orfe
,fielddataEvictions
): The number of fielddata cache evictions.fielddata.memory_size
(orfm
,fielddataMemory
): The fielddata cache memory used. For example:0b
.file_desc.current
(orfdc
,fileDescriptorCurrent
): The number of file descriptors used.file_desc.max
(orfdm
,fileDescriptorMax
): The maximum number of file descriptors.file_desc.percent
(orfdp
,fileDescriptorPercent
): The percentage of file descriptors used.flush.total
(orft
,flushTotal
): The number of flushes.flush.total_time
(orftt
,flushTotalTime
): The amount of time spent in flush.get.current
(orgc
,getCurrent
): The number of current get operations.get.exists_time
(orgeti
,getExistsTime
): The time spent in successful get operations. For example:14ms
.get.exists_total
(orgeto
,getExistsTotal
): The number of successful get operations.get.missing_time
(orgmti
,getMissingTime
): The time spent in failed get operations. For example:0s
.get.missing_total
(orgmto
,getMissingTotal
): The number of failed get operations.get.time
(orgti
,getTime
): The amount of time spent in get operations. For example:14ms
.get.total
(orgto
,getTotal
): The number of get operations.heap.current
(orhc
,heapCurrent
): The used heap size. For example:311.2mb
.heap.max
(orhm
,heapMax
): The total heap size. For example:4gb
.heap.percent
(orhp
,heapPercent
): The used percentage of total allocated Elasticsearch JVM heap. This value reflects only the Elasticsearch process running within the operating system and is the most direct indicator of its JVM, heap, or memory resource performance.http_address
(orhttp
): The bound HTTP address.id
(ornodeId
): The identifier for the node.indexing.delete_current
(oridc
,indexingDeleteCurrent
): The number of current deletion operations.indexing.delete_time
(oridti
,indexingDeleteTime
): The time spent in deletion operations. For example:2ms
.indexing.delete_total
(oridto
,indexingDeleteTotal
): The number of deletion operations.indexing.index_current
(oriic
,indexingIndexCurrent
): The number of current indexing operations.indexing.index_failed
(oriif
,indexingIndexFailed
): The number of failed indexing operations.indexing.index_failed_due_to_version_conflict
(oriifvc
,indexingIndexFailedDueToVersionConflict
): The number of indexing operations that failed due to version conflict.indexing.index_time
(oriiti
,indexingIndexTime
): The time spent in indexing operations. For example:134ms
.indexing.index_total
(oriito
,indexingIndexTotal
): The number of indexing operations.ip
(ori
): The IP address.jdk
(orj
): The Java version. For example:1.8.0
.load_1m
(orl
): The most recent load average. For example:0.22
.load_5m
(orl
): The load average for the last five minutes. For example:0.78
.load_15m
(orl
): The load average for the last fifteen minutes. For example:1.24
.mappings.total_count
(ormtc
,mappingsTotalCount
): The number of mappings, including runtime and object fields.mappings.total_estimated_overhead_in_bytes
(ormteo
,mappingsTotalEstimatedOverheadInBytes
): The estimated heap overhead, in bytes, of mappings on this node, which allows for 1KiB of heap for every mapped field.master
(orm
): Indicates whether the node is the elected master node. Returned values include*
(elected master) and-
(not elected master).merges.current
(ormc
,mergesCurrent
): The number of current merge operations.merges.current_docs
(ormcd
,mergesCurrentDocs
): The number of current merging documents.merges.current_size
(ormcs
,mergesCurrentSize
): The size of current merges. For example:0b
.merges.total
(ormt
,mergesTotal
): The number of completed merge operations.merges.total_docs
(ormtd
,mergesTotalDocs
): The number of merged documents.merges.total_size
(ormts
,mergesTotalSize
): The total size of merges. For example:0b
.merges.total_time
(ormtt
,mergesTotalTime
): The time spent merging documents. For example:0s
.name
(orn
): The node name.node.role
(orr
,role
,nodeRole
): The roles of the node. Returned values includec
(cold node),d
(data node),f
(frozen node),h
(hot node),i
(ingest node),l
(machine learning node),m
(master-eligible node),r
(remote cluster client node),s
(content node),t
(transform node),v
(voting-only node),w
(warm node), and-
(coordinating node only). For example,dim
indicates a master-eligible data and ingest node.pid
(orp
): The process identifier.port
(orpo
): The bound transport port number.query_cache.memory_size
(orqcm
,queryCacheMemory
): The used query cache memory. For example:0b
.query_cache.evictions
(orqce
,queryCacheEvictions
): The number of query cache evictions.query_cache.hit_count
(orqchc
,queryCacheHitCount
): The query cache hit count.query_cache.miss_count
(orqcmc
,queryCacheMissCount
): The query cache miss count.ram.current
(orrc
,ramCurrent
): The used total memory. For example:513.4mb
.ram.max
(orrm
,ramMax
): The total memory. For example:2.9gb
.ram.percent
(orrp
,ramPercent
): The used percentage of the total operating system memory. This reflects all processes running on the operating system instead of only Elasticsearch and is not guaranteed to correlate to its performance.refresh.total
(orrto
,refreshTotal
): The number of refresh operations.refresh.time
(orrti
,refreshTime
): The time spent in refresh operations. For example:91ms
.request_cache.memory_size
(orrcm
,requestCacheMemory
): The used request cache memory. For example:0b
.request_cache.evictions
(orrce
,requestCacheEvictions
): The number of request cache evictions.request_cache.hit_count
(orrchc
,requestCacheHitCount
): The request cache hit count.request_cache.miss_count
(orrcmc
,requestCacheMissCount
): The request cache miss count.script.compilations
(orscrcc
,scriptCompilations
): The number of total script compilations.script.cache_evictions
(orscrce
,scriptCacheEvictions
): The number of total compiled scripts evicted from cache.search.fetch_current
(orsfc
,searchFetchCurrent
): The number of current fetch phase operations.search.fetch_time
(orsfti
,searchFetchTime
): The time spent in fetch phase. For example:37ms
.search.fetch_total
(orsfto
,searchFetchTotal
): The number of fetch operations.search.open_contexts
(orso
,searchOpenContexts
): The number of open search contexts.search.query_current
(orsqc
,searchQueryCurrent
): The number of current query phase operations.search.query_time
(orsqti
,searchQueryTime
): The time spent in query phase. For example:43ms
.search.query_total
(orsqto
,searchQueryTotal
): The number of query operations.search.scroll_current
(orscc
,searchScrollCurrent
): The number of open scroll contexts.search.scroll_time
(orscti
,searchScrollTime
): The amount of time scroll contexts were held open. For example:2m
.search.scroll_total
(orscto
,searchScrollTotal
): The number of completed scroll contexts.segments.count
(orsc
,segmentsCount
): The number of segments.segments.fixed_bitset_memory
(orsfbm
,fixedBitsetMemory
): The memory used by fixed bit sets for nested object field types and type filters for types referred in join fields. For example:1.0kb
.segments.index_writer_memory
(orsiwm
,segmentsIndexWriterMemory
): The memory used by the index writer. For example:18mb
.segments.memory
(orsm
,segmentsMemory
): The memory used by segments. For example:1.4kb
.segments.version_map_memory
(orsvmm
,segmentsVersionMapMemory
): The memory used by the version map. For example:1.0kb
.shard_stats.total_count
(orsstc
,shards
,shardStatsTotalCount
): The number of shards assigned.suggest.current
(orsuc
,suggestCurrent
): The number of current suggest operations.suggest.time
(orsuti
,suggestTime
): The time spent in suggest operations.suggest.total
(orsuto
,suggestTotal
): The number of suggest operations.uptime
(oru
): The amount of node uptime. For example:17.3m
.version
(orv
): The Elasticsearch version. For example:9.0.0
.
-
s
string | array[string] A comma-separated list of column names or aliases that determines the sort order. Sorting defaults to ascending and can be changed by setting
:asc
or:desc
as a suffix to the column name. -
master_timeout
string The period to wait for a connection to the master node.
Values are
-1
or0
. -
time
string The unit used to display time values.
Values are
nanos
,micros
,ms
,s
,m
,h
, ord
.
GET /_cat/nodes?v=true&format=json
curl \
--request GET 'http://api.example.com/_cat/nodes' \
--header "Authorization: $API_KEY"
[
{
"ip": "127.0.0.1",
"heap.percent": "65",
"ram.percent": "99",
"cpu": "42",
"load_1m": "3.07",
"load_5m": null,
"load_15m": null,
"node.role": "cdfhilmrstw",
"master": "*",
"name": "mJw06l1"
}
]
[
{
"id": "veJR",
"ip": "127.0.0.1",
"port": "59938",
"v": "9.0.0",
"m": "*"
}
]
Get thread pool statistics
Get thread pool statistics for each node in a cluster. Returned information includes all built-in thread pools and custom thread pools. IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the nodes info API.
Path parameters
-
thread_pool_patterns
string | array[string] Required A comma-separated list of thread pool names used to limit the request. Accepts wildcard expressions.
Query parameters
-
h
string | array[string] List of columns to appear in the response. Supports simple wildcards.
-
s
string | array[string] List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting
:asc
or:desc
as a suffix to the column name. -
time
string The unit used to display time values.
Values are
nanos
,micros
,ms
,s
,m
,h
, ord
. -
local
boolean If
true
, the request computes the list of selected nodes from the local cluster state. Iffalse
the list of selected nodes are computed from the cluster state of the master node. In both cases the coordinating node will send requests for further information to each selected node. -
master_timeout
string Period to wait for a connection to the master node.
Values are
-1
or0
.
GET /_cat/thread_pool?format=json
curl \
--request GET 'http://api.example.com/_cat/thread_pool/{thread_pool_patterns}' \
--header "Authorization: $API_KEY"
[
{
"node_name": "node-0",
"name": "analyze",
"active": "0",
"queue": "0",
"rejected": "0"
},
{
"node_name": "node-0",
"name": "fetch_shard_started",
"active": "0",
"queue": "0",
"rejected": "0"
},
{
"node_name": "node-0",
"name": "fetch_shard_store",
"active": "0",
"queue": "0",
"rejected": "0"
},
{
"node_name": "node-0",
"name": "flush",
"active": "0",
"queue": "0",
"rejected": "0"
},
{
"node_name": "node-0",
"name": "write",
"active": "0",
"queue": "0",
"rejected": "0"
}
]
[
{
"id": "0EWUhXeBQtaVGlexUeVwMg",
"name": "generic",
"active": "0",
"rejected": "0",
"completed": "70"
}
]
Explain the shard allocations
Added in 5.0.0
Get explanations for shard allocations in the cluster. For unassigned shards, it provides an explanation for why the shard is unassigned. For assigned shards, it provides an explanation for why the shard is remaining on its current node and has not moved or rebalanced to another node. This API can be very useful when attempting to diagnose why a shard is unassigned or why a shard continues to remain on its current node when you might expect otherwise.
Query parameters
-
include_disk_info
boolean If true, returns information about disk usage and shard sizes.
-
include_yes_decisions
boolean If true, returns YES decisions in explanation.
-
master_timeout
string Period to wait for a connection to the master node.
Values are
-1
or0
.
Body
-
current_node
string Specifies the node ID or the name of the node to only explain a shard that is currently located on the specified node.
-
index
string -
primary
boolean If true, returns explanation for the primary shard for the given shard ID.
-
shard
number Specifies the ID of the shard that you would like an explanation for.
GET _cluster/allocation/explain
{
"index": "my-index-000001",
"shard": 0,
"primary": false,
"current_node": "my-node"
}
curl \
--request POST 'http://api.example.com/_cluster/allocation/explain' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"index\": \"my-index-000001\",\n \"shard\": 0,\n \"primary\": false,\n \"current_node\": \"my-node\"\n}"'
{
"index": "my-index-000001",
"shard": 0,
"primary": false,
"current_node": "my-node"
}
{
"index" : "my-index-000001",
"shard" : 0,
"primary" : true,
"current_state" : "unassigned",
"unassigned_info" : {
"reason" : "INDEX_CREATED",
"at" : "2017-01-04T18:08:16.600Z",
"last_allocation_status" : "no"
},
"can_allocate" : "no",
"allocate_explanation" : "Elasticsearch isn't allowed to allocate this shard to any of the nodes in the cluster. Choose a node to which you expect this shard to be allocated, find this node in the node-by-node explanation, and address the reasons which prevent Elasticsearch from allocating this shard there.",
"node_allocation_decisions" : [
{
"node_id" : "8qt2rY-pT6KNZB3-hGfLnw",
"node_name" : "node-0",
"transport_address" : "127.0.0.1:9401",
"roles" : ["data", "data_cold", "data_content", "data_frozen", "data_hot", "data_warm", "ingest", "master", "ml", "remote_cluster_client", "transform"],
"node_attributes" : {},
"node_decision" : "no",
"weight_ranking" : 1,
"deciders" : [
{
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"nonexistent_node\"]"
}
]
}
]
}
{
"index" : "my-index-000001",
"shard" : 0,
"primary" : true,
"current_state" : "unassigned",
"unassigned_info" : {
"at" : "2017-01-04T18:03:28.464Z",
"failed shard on node [mEKjwwzLT1yJVb8UxT6anw]: failed recovery, failure RecoveryFailedException",
"reason": "ALLOCATION_FAILED",
"failed_allocation_attempts": 5,
"last_allocation_status": "no",
},
"can_allocate": "no",
"allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions" : [
{
"node_id" : "3sULLVJrRneSg0EfBB-2Ew",
"node_name" : "node_t0",
"transport_address" : "127.0.0.1:9400",
"roles" : ["data_content", "data_hot"],
"node_decision" : "no",
"store" : {
"matching_size" : "4.2kb",
"matching_size_in_bytes" : 4325
},
"deciders" : [
{
"decider": "max_retry",
"decision" : "NO",
"explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [POST /_cluster/reroute?retry_failed] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2024-07-30T21:04:12.166Z], failed_attempts[5], failed_nodes[[mEKjwwzLT1yJVb8UxT6anw]], delayed=false, details[failed shard on node [mEKjwwzLT1yJVb8UxT6anw]: failed recovery, failure RecoveryFailedException], allocation_status[deciders_no]]]"
}
]
}
]
}
Get the cluster health status
Added in 1.3.0
You can also use the API to get the health status of only specified data streams and indices. For data streams, the API retrieves the health status of the stream’s backing indices.
The cluster health status is: green, yellow or red. On the shard level, a red status indicates that the specific shard is not allocated in the cluster. Yellow means that the primary shard is allocated but replicas are not. Green means that all shards are allocated. The index level status is controlled by the worst shard status.
One of the main benefits of the API is the ability to wait until the cluster reaches a certain high watermark health level. The cluster status is controlled by the worst index status.
Query parameters
-
expand_wildcards
string | array[string] Whether to expand wildcard expression to concrete indices that are open, closed or both.
Supported values include:
all
: Match any data stream or index, including hidden ones.open
: Match open, non-hidden indices. Also matches any non-hidden data stream.closed
: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.hidden
: Match hidden data streams and hidden indices. Must be combined withopen
,closed
, orboth
.none
: Wildcard expressions are not accepted.
Values are
all
,open
,closed
,hidden
, ornone
. -
level
string Can be one of cluster, indices or shards. Controls the details level of the health information returned.
Values are
cluster
,indices
, orshards
. -
local
boolean If true, the request retrieves information from the local node only. Defaults to false, which means information is retrieved from the master node.
-
master_timeout
string Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
Values are
-1
or0
. -
timeout
string Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
Values are
-1
or0
. -
wait_for_active_shards
number | string A number controlling to how many active shards to wait for, all to wait for all shards in the cluster to be active, or 0 to not wait.
Values are
all
orindex-setting
. -
wait_for_events
string Can be one of immediate, urgent, high, normal, low, languid. Wait until all currently queued events with the given priority are processed.
Values are
immediate
,urgent
,high
,normal
,low
, orlanguid
. -
wait_for_nodes
string | number The request waits until the specified number N of nodes is available. It also accepts >=N, <=N, >N and <N. Alternatively, it is possible to use ge(N), le(N), gt(N) and lt(N) notation.
-
wait_for_no_initializing_shards
boolean A boolean value which controls whether to wait (until the timeout provided) for the cluster to have no shard initializations. Defaults to false, which means it will not wait for initializing shards.
-
wait_for_no_relocating_shards
boolean A boolean value which controls whether to wait (until the timeout provided) for the cluster to have no shard relocations. Defaults to false, which means it will not wait for relocating shards.
-
wait_for_status
string One of green, yellow or red. Will wait (until the timeout provided) until the status of the cluster changes to the one provided or better, i.e. green > yellow > red. By default, will not wait for any status.
Supported values include:
green
(orGREEN
): All shards are assigned.yellow
(orYELLOW
): All primary shards are assigned, but one or more replica shards are unassigned. If a node in the cluster fails, some data could be unavailable until that node is repaired.red
(orRED
): One or more primary shards are unassigned, so some data is unavailable. This can occur briefly during cluster startup as primary shards are assigned.
Values are
green
,GREEN
,yellow
,YELLOW
,red
, orRED
.
GET _cluster/health
curl \
--request GET 'http://api.example.com/_cluster/health' \
--header "Authorization: $API_KEY"
{
"cluster_name" : "testcluster",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 1,
"active_shards" : 1,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 1,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 50.0
}
Reload the keystore on nodes in the cluster
Added in 6.5.0
Secure settings are stored in an on-disk keystore. Certain of these settings are reloadable. That is, you can change them on disk and reload them without restarting any nodes in the cluster. When you have updated reloadable secure settings in your keystore, you can use this API to reload those settings on each node.
When the Elasticsearch keystore is password protected and not simply obfuscated, you must provide the password for the keystore when you reload the secure settings. Reloading the settings for the whole cluster assumes that the keystores for all nodes are protected with the same password; this method is allowed only when inter-node communications are encrypted. Alternatively, you can reload the secure settings on each node by locally accessing the API and passing the node-specific Elasticsearch keystore password.
Query parameters
-
timeout
string Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
Values are
-1
or0
.
Body
-
secure_settings_password
string
POST _nodes/reload_secure_settings
{
"secure_settings_password": "keystore-password"
}
curl \
--request POST 'http://api.example.com/_nodes/reload_secure_settings' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"secure_settings_password\": \"keystore-password\"\n}"'
{
"secure_settings_password": "keystore-password"
}
{
"_nodes": {
"total": 1,
"successful": 1,
"failed": 0
},
"cluster_name": "my_cluster",
"nodes": {
"pQHNt5rXTTWNvUgOrdynKg": {
"name": "node-0"
}
}
}
Get node statistics
Get statistics for nodes in a cluster. By default, all stats are returned. You can limit the returned information by using metrics.
Path parameters
-
metric
string | array[string] Required Limit the information returned to the specified metrics
Query parameters
-
completion_fields
string | array[string] Comma-separated list or wildcard expressions of fields to include in fielddata and suggest statistics.
-
fielddata_fields
string | array[string] Comma-separated list or wildcard expressions of fields to include in fielddata statistics.
-
fields
string | array[string] Comma-separated list or wildcard expressions of fields to include in the statistics.
-
groups
boolean Comma-separated list of search groups to include in the search statistics.
-
include_segment_file_sizes
boolean If true, the call reports the aggregated disk usage of each one of the Lucene index files (only applies if segment stats are requested).
-
level
string Indicates whether statistics are aggregated at the cluster, index, or shard level.
Values are
cluster
,indices
, orshards
. -
timeout
string Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
Values are
-1
or0
. -
types
array[string] A comma-separated list of document types for the indexing index metric.
-
include_unloaded_segments
boolean If
true
, the response includes information from segments that are not loaded into memory.
curl \
--request GET 'http://api.example.com/_nodes/stats/{metric}' \
--header "Authorization: $API_KEY"
Get the cluster health
Added in 8.7.0
Get a report with the health status of an Elasticsearch cluster. The report contains a list of indicators that compose Elasticsearch functionality.
Each indicator has a health status of: green, unknown, yellow or red. The indicator will provide an explanation and metadata describing the reason for its current health status.
The cluster’s status is controlled by the worst indicator status.
In the event that an indicator’s status is non-green, a list of impacts may be present in the indicator result which detail the functionalities that are negatively affected by the health issue. Each impact carries with it a severity level, an area of the system that is affected, and a simple description of the impact on the system.
Some health indicators can determine the root cause of a health problem and prescribe a set of steps that can be performed in order to improve the health of the system. The root cause and remediation steps are encapsulated in a diagnosis. A diagnosis contains a cause detailing a root cause analysis, an action containing a brief description of the steps to take to fix the problem, the list of affected resources (if applicable), and a detailed step-by-step troubleshooting guide to fix the diagnosed problem.
NOTE: The health indicators perform root cause analysis of non-green health statuses. This can be computationally expensive when called frequently. When setting up automated polling of the API for health status, set verbose to false to disable the more expensive analysis logic.
Path parameters
-
feature
string | array[string] Required A feature of the cluster, as returned by the top-level health report API.
curl \
--request GET 'http://api.example.com/_health_report/{feature}' \
--header "Authorization: $API_KEY"
Connector
The connector and sync jobs APIs provide a convenient way to create and manage Elastic connectors and sync jobs in an internal index.
Connectors are Elasticsearch integrations for syncing content from third-party data sources, which can be deployed on Elastic Cloud or hosted on your own infrastructure.
This API provides an alternative to relying solely on Kibana UI for connector and sync job management. The API comes with a set of validations and assertions to ensure that the state representation in the internal index remains valid.
This API requires the manage_connector
privilege or, for read-only endpoints, the monitor_connector
privilege.
Check in a connector
Technical preview
Update the last_seen
field in the connector and set it to the current timestamp.
Path parameters
-
connector_id
string Required The unique identifier of the connector to be checked in
curl \
--request PUT 'http://api.example.com/_connector/{connector_id}/_check_in' \
--header "Authorization: $API_KEY"
{
"result": "updated"
}
Create or update a connector
Beta
Path parameters
-
connector_id
string Required The unique identifier of the connector to be created or updated. ID is auto-generated if not provided.
Body
-
description
string -
index_name
string -
is_native
boolean -
language
string -
name
string -
service_type
string
PUT _connector/my-connector
{
"index_name": "search-google-drive",
"name": "My Connector",
"service_type": "google_drive"
}
curl \
--request PUT 'http://api.example.com/_connector/{connector_id}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"index_name\": \"search-google-drive\",\n \"name\": \"My Connector\",\n \"service_type\": \"google_drive\"\n}"'
{
"index_name": "search-google-drive",
"name": "My Connector",
"service_type": "google_drive"
}
{
"index_name": "search-google-drive",
"name": "My Connector",
"description": "My Connector to sync data to Elastic index from Google Drive",
"service_type": "google_drive",
"language": "english"
}
{
"result": "created",
"id": "my-connector"
}
Body
-
description
string -
index_name
string -
is_native
boolean -
language
string -
name
string -
service_type
string
PUT _connector/my-connector
{
"index_name": "search-google-drive",
"name": "My Connector",
"service_type": "google_drive"
}
curl \
--request PUT 'http://api.example.com/_connector' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"index_name\": \"search-google-drive\",\n \"name\": \"My Connector\",\n \"service_type\": \"google_drive\"\n}"'
{
"index_name": "search-google-drive",
"name": "My Connector",
"service_type": "google_drive"
}
{
"index_name": "search-google-drive",
"name": "My Connector",
"description": "My Connector to sync data to Elastic index from Google Drive",
"service_type": "google_drive",
"language": "english"
}
{
"result": "created",
"id": "my-connector"
}
Claim a connector sync job
Technical preview
This action updates the job status to in_progress
and sets the last_seen
and started_at
timestamps to the current time.
Additionally, it can set the sync_cursor
property for the sync job.
This API is not intended for direct connector management by users. It supports the implementation of services that utilize the connector protocol to communicate with Elasticsearch.
To sync data using self-managed connectors, you need to deploy the Elastic connector service on your own infrastructure. This service runs automatically on Elastic Cloud for Elastic managed connectors.
Path parameters
-
connector_sync_job_id
string Required The unique identifier of the connector sync job.
Body
Required
-
sync_cursor
object The cursor object from the last incremental sync job. This should reference the
sync_cursor
field in the connector state for which the job runs. -
worker_hostname
string Required The host name of the current system that will run the job.
curl \
--request PUT 'http://api.example.com/_connector/_sync_job/{connector_sync_job_id}/_claim' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '{"sync_cursor":{},"worker_hostname":"string"}'
Create a connector sync job
Beta
Create a connector sync job document in the internal index and initialize its counters and timestamps with default values.
Body
Required
-
id
string Required -
job_type
string Values are
full
,incremental
, oraccess_control
. -
trigger_method
string Values are
on_demand
orscheduled
.
POST _connector/_sync_job
{
"id": "connector-id",
"job_type": "full",
"trigger_method": "on_demand"
}
curl \
--request POST 'http://api.example.com/_connector/_sync_job' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"id\": \"connector-id\",\n \"job_type\": \"full\",\n \"trigger_method\": \"on_demand\"\n}"'
{
"id": "connector-id",
"job_type": "full",
"trigger_method": "on_demand"
}
Delete auto-follow patterns
Added in 6.5.0
Delete a collection of cross-cluster replication auto-follow patterns.
Path parameters
-
name
string Required The auto-follow pattern collection to delete.
Query parameters
-
master_timeout
string The period to wait for a connection to the master node. If the master node is not available before the timeout expires, the request fails and returns an error. It can also be set to
-1
to indicate that the request should never timeout.Values are
-1
or0
.
DELETE /_ccr/auto_follow/my_auto_follow_pattern
curl \
--request DELETE 'http://api.example.com/_ccr/auto_follow/{name}' \
--header "Authorization: $API_KEY"
{
"acknowledged" : true
}
Resume an auto-follow pattern
Added in 7.5.0
Resume a cross-cluster replication auto-follow pattern that was paused. The auto-follow pattern will resume configuring following indices for newly created indices that match its patterns on the remote cluster. Remote indices created while the pattern was paused will also be followed unless they have been deleted or closed in the interim.
Path parameters
-
name
string Required The name of the auto-follow pattern to resume.
Query parameters
-
master_timeout
string The period to wait for a connection to the master node. If the master node is not available before the timeout expires, the request fails and returns an error. It can also be set to
-1
to indicate that the request should never timeout.Values are
-1
or0
.
POST /_ccr/auto_follow/my_auto_follow_pattern/resume
curl \
--request POST 'http://api.example.com/_ccr/auto_follow/{name}/resume' \
--header "Authorization: $API_KEY"
{
"acknowledged" : true
}
Get data stream lifecycle stats
Added in 8.12.0
Get statistics about the data streams that are managed by a data stream lifecycle.
GET _lifecycle/stats?human&pretty
curl \
--request GET 'http://api.example.com/_lifecycle/stats' \
--header "Authorization: $API_KEY"
{
"last_run_duration_in_millis": 2,
"last_run_duration": "2ms",
"time_between_starts_in_millis": 9998,
"time_between_starts": "9.99s",
"data_streams_count": 2,
"data_streams": [
{
"name": "my-data-stream",
"backing_indices_in_total": 2,
"backing_indices_in_error": 0
},
{
"name": "my-other-stream",
"backing_indices_in_total": 2,
"backing_indices_in_error": 1
}
]
}
Bulk index or delete documents
Perform multiple index
, create
, delete
, and update
actions in a single request.
This reduces overhead and can greatly increase indexing speed.
If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:
- To use the
create
action, you must have thecreate_doc
,create
,index
, orwrite
index privilege. Data streams support only thecreate
action. - To use the
index
action, you must have thecreate
,index
, orwrite
index privilege. - To use the
delete
action, you must have thedelete
orwrite
index privilege. - To use the
update
action, you must have theindex
orwrite
index privilege. - To automatically create a data stream or index with a bulk API request, you must have the
auto_configure
,create_index
, ormanage
index privilege. - To make the result of a bulk operation visible to search using the
refresh
parameter, you must have themaintenance
ormanage
index privilege.
Automatic data stream creation requires a matching index template with data stream enabled.
The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:
action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n
The index
and create
actions expect a source on the next line and have the same semantics as the op_type
parameter in the standard index API.
A create
action fails if a document with the same ID already exists in the target
An index
action adds or replaces a document as necessary.
NOTE: Data streams support only the create
action.
To update or delete a document in a data stream, you must target the backing index containing the document.
An update
action expects that the partial doc, upsert, and script and its options are specified on the next line.
A delete
action does not expect a source on the next line and has the same semantics as the standard delete API.
NOTE: The final line of data must end with a newline character (\n
).
Each newline character may be preceded by a carriage return (\r
).
When sending NDJSON data to the _bulk
endpoint, use a Content-Type
header of application/json
or application/x-ndjson
.
Because this format uses literal newline characters (\n
) as delimiters, make sure that the JSON actions and sources are not pretty printed.
If you provide a target in the request path, it is used for any actions that don't explicitly specify an _index
argument.
A note on the format: the idea here is to make processing as fast as possible.
As some of the actions are redirected to other shards on other nodes, only action_meta_data
is parsed on the receiving node side.
Client libraries using this protocol should try and strive to do something similar on the client side, and reduce buffering as much as possible.
There is no "correct" number of actions to perform in a single bulk request. Experiment with different settings to find the optimal size for your particular workload. Note that Elasticsearch limits the maximum size of a HTTP request to 100mb by default so clients must ensure that no request exceeds this size. It is not possible to index a single document that exceeds the size limit, so you must pre-process any such documents into smaller pieces before sending them to Elasticsearch. For instance, split documents into pages or chapters before indexing them, or store raw binary data in a system outside Elasticsearch and replace the raw data with a link to the external system in the documents that you send to Elasticsearch.
Client suppport for bulk requests
Some of the officially supported clients provide helpers to assist with bulk requests and reindexing:
- Go: Check out
esutil.BulkIndexer
- Perl: Check out
Search::Elasticsearch::Client::5_0::Bulk
andSearch::Elasticsearch::Client::5_0::Scroll
- Python: Check out
elasticsearch.helpers.*
- JavaScript: Check out
client.helpers.*
- .NET: Check out
BulkAllObservable
- PHP: Check out bulk indexing.
Submitting bulk requests with cURL
If you're providing text file input to curl
, you must use the --data-binary
flag instead of plain -d
.
The latter doesn't preserve newlines. For example:
$ cat requests
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
$ curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@requests"; echo
{"took":7, "errors": false, "items":[{"index":{"_index":"test","_id":"1","_version":1,"result":"created","forced_refresh":false}}]}
Optimistic concurrency control
Each index
and delete
action within a bulk API call may include the if_seq_no
and if_primary_term
parameters in their respective action and meta data lines.
The if_seq_no
and if_primary_term
parameters control how operations are run, based on the last modification to existing documents. See Optimistic concurrency control for more details.
Versioning
Each bulk item can include the version value using the version
field.
It automatically follows the behavior of the index or delete operation based on the _version
mapping.
It also support the version_type
.
Routing
Each bulk item can include the routing value using the routing
field.
It automatically follows the behavior of the index or delete operation based on the _routing
mapping.
NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing
setting enabled in the template.
Wait for active shards
When making bulk calls, you can set the wait_for_active_shards
parameter to require a minimum number of shard copies to be active before starting to process the bulk request.
Refresh
Control when the changes made by this request are visible to search.
NOTE: Only the shards that receive the bulk request will be affected by refresh.
Imagine a _bulk?refresh=wait_for
request with three documents in it that happen to be routed to different shards in an index with five shards.
The request will only wait for those three shards to refresh.
The other two shards that make up the index do not participate in the _bulk
request at all.
Query parameters
-
include_source_on_error
boolean True or false if to include the document source in the error message in case of parsing errors.
-
list_executed_pipelines
boolean If
true
, the response will include the ingest pipelines that were run for each index or create. -
pipeline
string The pipeline identifier to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to
_none
turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter. -
refresh
string If
true
, Elasticsearch refreshes the affected shards to make this operation visible to search. Ifwait_for
, wait for a refresh to make this operation visible to search. Iffalse
, do nothing with refreshes. Valid values:true
,false
,wait_for
.Values are
true
,false
, orwait_for
. -
routing
string A custom value that is used to route operations to a specific shard.
-
_source
boolean | string | array[string] Indicates whether to return the
_source
field (true
orfalse
) or contains a list of fields to return. -
_source_excludes
string | array[string] A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in
_source_includes
query parameter. If the_source
parameter isfalse
, this parameter is ignored. -
_source_includes
string | array[string] A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the
_source_excludes
query parameter. If the_source
parameter isfalse
, this parameter is ignored. -
timeout
string The period each action waits for the following operations: automatic index creation, dynamic mapping updates, and waiting for active shards. The default is
1m
(one minute), which guarantees Elasticsearch waits for at least the timeout before failing. The actual wait time could be longer, particularly when multiple waits occur.Values are
-1
or0
. -
wait_for_active_shards
number | string The number of shard copies that must be active before proceeding with the operation. Set to
all
or any positive integer up to the total number of shards in the index (number_of_replicas+1
). The default is1
, which waits for each primary shard to be active.Values are
all
orindex-setting
. -
require_alias
boolean If
true
, the request's actions must target an index alias. -
require_data_stream
boolean If
true
, the request's actions must target a data stream (existing or to be created).
POST _bulk
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
curl \
--request POST 'http://api.example.com/_bulk' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{ \"index\" : { \"_index\" : \"test\", \"_id\" : \"1\" } }\n{ \"field1\" : \"value1\" }\n{ \"delete\" : { \"_index\" : \"test\", \"_id\" : \"2\" } }\n{ \"create\" : { \"_index\" : \"test\", \"_id\" : \"3\" } }\n{ \"field1\" : \"value3\" }\n{ \"update\" : {\"_id\" : \"1\", \"_index\" : \"test\"} }\n{ \"doc\" : {\"field2\" : \"value2\"} }"'
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
{ "update" : {"_id" : "1", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"} }
{ "update" : { "_id" : "0", "_index" : "index1", "retry_on_conflict" : 3} }
{ "script" : { "source": "ctx._source.counter += params.param1", "lang" : "painless", "params" : {"param1" : 1}}, "upsert" : {"counter" : 1}}
{ "update" : {"_id" : "2", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"}, "doc_as_upsert" : true }
{ "update" : {"_id" : "3", "_index" : "index1", "_source" : true} }
{ "doc" : {"field" : "value"} }
{ "update" : {"_id" : "4", "_index" : "index1"} }
{ "doc" : {"field" : "value"}, "_source": true}
{ "update": {"_id": "5", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "update": {"_id": "6", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "create": {"_id": "7", "_index": "index1"} }
{ "my_field": "foo" }
{ "index" : { "_index" : "my_index", "_id" : "1", "dynamic_templates": {"work_location": "geo_point"}} }
{ "field" : "value1", "work_location": "41.12,-71.34", "raw_location": "41.12,-71.34"}
{ "create" : { "_index" : "my_index", "_id" : "2", "dynamic_templates": {"home_location": "geo_point"}} }
{ "field" : "value2", "home_location": "41.12,-71.34"}
{
"took": 30,
"errors": false,
"items": [
{
"index": {
"_index": "test",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"status": 201,
"_seq_no" : 0,
"_primary_term": 1
}
},
{
"delete": {
"_index": "test",
"_id": "2",
"_version": 1,
"result": "not_found",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"status": 404,
"_seq_no" : 1,
"_primary_term" : 2
}
},
{
"create": {
"_index": "test",
"_id": "3",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"status": 201,
"_seq_no" : 2,
"_primary_term" : 3
}
},
{
"update": {
"_index": "test",
"_id": "1",
"_version": 2,
"result": "updated",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"status": 200,
"_seq_no" : 3,
"_primary_term" : 4
}
}
]
}
{
"took": 486,
"errors": true,
"items": [
{
"update": {
"_index": "index1",
"_id": "5",
"status": 404,
"error": {
"type": "document_missing_exception",
"reason": "[5]: document missing",
"index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
"shard": "0",
"index": "index1"
}
}
},
{
"update": {
"_index": "index1",
"_id": "6",
"status": 404,
"error": {
"type": "document_missing_exception",
"reason": "[6]: document missing",
"index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
"shard": "0",
"index": "index1"
}
}
},
{
"create": {
"_index": "index1",
"_id": "7",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1,
"status": 201
}
}
]
}
{
"items": [
{
"update": {
"error": {
"type": "document_missing_exception",
"reason": "[5]: document missing",
"index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
"shard": "0",
"index": "index1"
}
}
},
{
"update": {
"error": {
"type": "document_missing_exception",
"reason": "[6]: document missing",
"index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
"shard": "0",
"index": "index1"
}
}
}
]
}
Get a document by its ID
Get a document and its source or stored fields from an index.
By default, this API is realtime and is not affected by the refresh rate of the index (when data will become visible for search).
In the case where stored fields are requested with the stored_fields
parameter and the document has been updated but is not yet refreshed, the API will have to parse and analyze the source to extract the stored fields.
To turn off realtime behavior, set the realtime
parameter to false.
Source filtering
By default, the API returns the contents of the _source
field unless you have used the stored_fields
parameter or the _source
field is turned off.
You can turn off _source
retrieval by using the _source
parameter:
GET my-index-000001/_doc/0?_source=false
If you only need one or two fields from the _source
, use the _source_includes
or _source_excludes
parameters to include or filter out particular fields.
This can be helpful with large documents where partial retrieval can save on network overhead
Both parameters take a comma separated list of fields or wildcard expressions.
For example:
GET my-index-000001/_doc/0?_source_includes=*.id&_source_excludes=entities
If you only want to specify includes, you can use a shorter notation:
GET my-index-000001/_doc/0?_source=*.id
Routing
If routing is used during indexing, the routing value also needs to be specified to retrieve a document. For example:
GET my-index-000001/_doc/2?routing=user1
This request gets the document with ID 2, but it is routed based on the user. The document is not fetched if the correct routing is not specified.
Distributed
The GET operation is hashed into a specific shard ID. It is then redirected to one of the replicas within that shard ID and returns the result. The replicas are the primary shard and its replicas within that shard ID group. This means that the more replicas you have, the better your GET scaling will be.
Versioning support
You can use the version
parameter to retrieve the document only if its current version is equal to the specified one.
Internally, Elasticsearch has marked the old document as deleted and added an entirely new document. The old version of the document doesn't disappear immediately, although you won't be able to access it. Elasticsearch cleans up deleted documents in the background as you continue to index more data.
Query parameters
-
preference
string The node or shard the operation should be performed on. By default, the operation is randomized between the shard replicas.
If it is set to
_local
, the operation will prefer to be run on a local allocated shard when possible. If it is set to a custom value, the value is used to guarantee that the same shards will be used for the same custom value. This can help with "jumping values" when hitting different shards in different refresh states. A sample value can be something like the web session ID or the user name. -
realtime
boolean If
true
, the request is real-time as opposed to near-real-time. -
refresh
boolean If
true
, the request refreshes the relevant shards before retrieving the document. Setting it totrue
should be done after careful thought and verification that this does not cause a heavy load on the system (and slow down indexing). -
routing
string A custom value used to route operations to a specific shard.
-
_source
boolean | string | array[string] Indicates whether to return the
_source
field (true
orfalse
) or lists the fields to return. -
_source_excludes
string | array[string] A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in
_source_includes
query parameter. If the_source
parameter isfalse
, this parameter is ignored. -
_source_includes
string | array[string] A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the
_source_excludes
query parameter. If the_source
parameter isfalse
, this parameter is ignored. -
stored_fields
string | array[string] A comma-separated list of stored fields to return as part of a hit. If no fields are specified, no stored fields are included in the response. If this field is specified, the
_source
parameter defaults tofalse
. Only leaf fields can be retrieved with thestored_field
option. Object fields can't be returned;if specified, the request fails. -
version
number The version number for concurrency control. It must match the current version of the document for the request to succeed.
-
version_type
string The version type.
Supported values include:
internal
: Use internal versioning that starts at 1 and increments with each update or delete.external
: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.external_gte
: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: Theexternal_gte
version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.force
: This option is deprecated because it can cause primary and replica shards to diverge.
Values are
internal
,external
,external_gte
, orforce
.
GET my-index-000001/_doc/0
curl \
--request GET 'http://api.example.com/{index}/_doc/{id}' \
--header "Authorization: $API_KEY"
{
"_index": "my-index-000001",
"_id": "0",
"_version": 1,
"_seq_no": 0,
"_primary_term": 1,
"found": true,
"_source": {
"@timestamp": "2099-11-15T14:12:12",
"http": {
"request": {
"method": "get"
},
"response": {
"status_code": 200,
"bytes": 1070000
},
"version": "1.1"
},
"source": {
"ip": "127.0.0.1"
},
"message": "GET /search HTTP/1.1 200 1070000",
"user": {
"id": "kimchy"
}
}
}
{
"_index": "my-index-000001",
"_id": "1",
"_version": 1,
"_seq_no" : 22,
"_primary_term" : 1,
"found": true,
"fields": {
"tags": [
"production"
]
}
}
{
"_index": "my-index-000001",
"_id": "2",
"_version": 1,
"_seq_no" : 13,
"_primary_term" : 1,
"_routing": "user1",
"found": true,
"fields": {
"tags": [
"env2"
]
}
}
Create or update a document in an index
Add a JSON document to the specified data stream or index and make it searchable. If the target is an index and the document already exists, the request updates the document and increments its version.
NOTE: You cannot use this API to send update requests for existing documents in a data stream.
If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:
- To add or overwrite a document using the
PUT /<target>/_doc/<_id>
request format, you must have thecreate
,index
, orwrite
index privilege. - To add a document using the
POST /<target>/_doc/
request format, you must have thecreate_doc
,create
,index
, orwrite
index privilege. - To automatically create a data stream or index with this API request, you must have the
auto_configure
,create_index
, ormanage
index privilege.
Automatic data stream creation requires a matching index template with data stream enabled.
NOTE: Replica shards might not all be started when an indexing operation returns successfully.
By default, only the primary is required. Set wait_for_active_shards
to change this default behavior.
Automatically create data streams and indices
If the request's target doesn't exist and matches an index template with a data_stream
definition, the index operation automatically creates the data stream.
If the target doesn't exist and doesn't match a data stream template, the operation automatically creates the index and applies any matching index templates.
NOTE: Elasticsearch includes several built-in index templates. To avoid naming collisions with these templates, refer to index pattern documentation.
If no mapping exists, the index operation creates a dynamic mapping. By default, new fields and objects are automatically added to the mapping if needed.
Automatic index creation is controlled by the action.auto_create_index
setting.
If it is true
, any index can be created automatically.
You can modify this setting to explicitly allow or block automatic creation of indices that match specified patterns or set it to false
to turn off automatic index creation entirely.
Specify a comma-separated list of patterns you want to allow or prefix each pattern with +
or -
to indicate whether it should be allowed or blocked.
When a list is specified, the default behaviour is to disallow.
NOTE: The action.auto_create_index
setting affects the automatic creation of indices only.
It does not affect the creation of data streams.
Optimistic concurrency control
Index operations can be made conditional and only be performed if the last modification to the document was assigned the sequence number and primary term specified by the if_seq_no
and if_primary_term
parameters.
If a mismatch is detected, the operation will result in a VersionConflictException
and a status code of 409
.
Routing
By default, shard placement — or routing — is controlled by using a hash of the document's ID value.
For more explicit control, the value fed into the hash function used by the router can be directly specified on a per-operation basis using the routing
parameter.
When setting up explicit mapping, you can also use the _routing
field to direct the index operation to extract the routing value from the document itself.
This does come at the (very minimal) cost of an additional document parsing pass.
If the _routing
mapping is defined and set to be required, the index operation will fail if no routing value is provided or extracted.
NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing
setting enabled in the template.
Distributed
The index operation is directed to the primary shard based on its route and performed on the actual node containing this shard. After the primary shard completes the operation, if needed, the update is distributed to applicable replicas.
Active shards
To improve the resiliency of writes to the system, indexing operations can be configured to wait for a certain number of active shard copies before proceeding with the operation.
If the requisite number of active shard copies are not available, then the write operation must wait and retry, until either the requisite shard copies have started or a timeout occurs.
By default, write operations only wait for the primary shards to be active before proceeding (that is to say wait_for_active_shards
is 1
).
This default can be overridden in the index settings dynamically by setting index.write.wait_for_active_shards
.
To alter this behavior per operation, use the wait_for_active_shards request
parameter.
Valid values are all or any positive integer up to the total number of configured copies per shard in the index (which is number_of_replicas
+1).
Specifying a negative value or a number greater than the number of shard copies will throw an error.
For example, suppose you have a cluster of three nodes, A, B, and C and you create an index index with the number of replicas set to 3 (resulting in 4 shard copies, one more copy than there are nodes).
If you attempt an indexing operation, by default the operation will only ensure the primary copy of each shard is available before proceeding.
This means that even if B and C went down and A hosted the primary shard copies, the indexing operation would still proceed with only one copy of the data.
If wait_for_active_shards
is set on the request to 3
(and all three nodes are up), the indexing operation will require 3 active shard copies before proceeding.
This requirement should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard.
However, if you set wait_for_active_shards
to all
(or to 4
, which is the same in this situation), the indexing operation will not proceed as you do not have all 4 copies of each shard active in the index.
The operation will timeout unless a new node is brought up in the cluster to host the fourth copy of the shard.
It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation starts.
After the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary.
The _shards
section of the API response reveals the number of shard copies on which replication succeeded and failed.
No operation (noop) updates
When updating a document by using this API, a new version of the document is always created even if the document hasn't changed.
If this isn't acceptable use the _update
API with detect_noop
set to true
.
The detect_noop
option isn't available on this API because it doesn’t fetch the old source and isn't able to compare it against the new source.
There isn't a definitive rule for when noop updates aren't acceptable. It's a combination of lots of factors like how frequently your data source sends updates that are actually noops and how many queries per second Elasticsearch runs on the shard receiving the updates.
Versioning
Each indexed document is given a version number.
By default, internal versioning is used that starts at 1 and increments with each update, deletes included.
Optionally, the version number can be set to an external value (for example, if maintained in a database).
To enable this functionality, version_type
should be set to external
.
The value provided must be a numeric, long value greater than or equal to 0, and less than around 9.2e+18
.
NOTE: Versioning is completely real time, and is not affected by the near real time aspects of search operations. If no version is provided, the operation runs without any version checks.
When using the external version type, the system checks to see if the version number passed to the index request is greater than the version of the currently stored document. If true, the document will be indexed and the new version number used. If the value provided is less than or equal to the stored document's version number, a version conflict will occur and the index operation will fail. For example:
PUT my-index-000001/_doc/1?version=2&version_type=external
{
"user": {
"id": "elkbee"
}
}
In this example, the operation will succeed since the supplied version of 2 is higher than the current document version of 1.
If the document was already updated and its version was set to 2 or higher, the indexing command will fail and result in a conflict (409 HTTP status code).
A nice side effect is that there is no need to maintain strict ordering of async indexing operations run as a result of changes to a source database, as long as version numbers from the source database are used.
Even the simple case of updating the Elasticsearch index using data from a database is simplified if external versioning is used, as only the latest version will be used if the index operations arrive out of order.
Path parameters
-
index
string Required The name of the data stream or index to target. If the target doesn't exist and matches the name or wildcard (
*
) pattern of an index template with adata_stream
definition, this request creates the data stream. If the target doesn't exist and doesn't match a data stream template, this request creates the index. You can check for existing targets with the resolve index API. -
id
string Required A unique identifier for the document. To automatically generate a document ID, use the
POST /<target>/_doc/
request format and omit this parameter.
Query parameters
-
if_primary_term
number Only perform the operation if the document has this primary term.
-
if_seq_no
number Only perform the operation if the document has this sequence number.
-
include_source_on_error
boolean True or false if to include the document source in the error message in case of parsing errors.
-
op_type
string Set to
create
to only index the document if it does not already exist (put if absent). If a document with the specified_id
already exists, the indexing operation will fail. The behavior is the same as using the<index>/_create
endpoint. If a document ID is specified, this paramater defaults toindex
. Otherwise, it defaults tocreate
. If the request targets a data stream, anop_type
ofcreate
is required.Supported values include:
index
: Overwrite any documents that already exist.create
: Only index documents that do not already exist.
Values are
index
orcreate
. -
pipeline
string The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, then setting the value to
_none
disables the default ingest pipeline for this request. If a final pipeline is configured it will always run, regardless of the value of this parameter. -
refresh
string If
true
, Elasticsearch refreshes the affected shards to make this operation visible to search. Ifwait_for
, it waits for a refresh to make this operation visible to search. Iffalse
, it does nothing with refreshes.Values are
true
,false
, orwait_for
. -
routing
string A custom value that is used to route operations to a specific shard.
-
timeout
string The period the request waits for the following operations: automatic index creation, dynamic mapping updates, waiting for active shards.
This parameter is useful for situations where the primary shard assigned to perform the operation might not be available when the operation runs. Some reasons for this might be that the primary shard is currently recovering from a gateway or undergoing relocation. By default, the operation will wait on the primary shard to become available for at least 1 minute before failing and responding with an error. The actual wait time could be longer, particularly when multiple waits occur.
Values are
-1
or0
. -
version
number An explicit version number for concurrency control. It must be a non-negative long number.
-
version_type
string The version type.
Supported values include:
internal
: Use internal versioning that starts at 1 and increments with each update or delete.external
: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.external_gte
: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: Theexternal_gte
version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.force
: This option is deprecated because it can cause primary and replica shards to diverge.
Values are
internal
,external
,external_gte
, orforce
. -
wait_for_active_shards
number | string The number of shard copies that must be active before proceeding with the operation. You can set it to
all
or any positive integer up to the total number of shards in the index (number_of_replicas+1
). The default value of1
means it waits for each primary shard to be active.Values are
all
orindex-setting
. -
require_alias
boolean If
true
, the destination must be an index alias.
POST my-index-000001/_doc/
{
"@timestamp": "2099-11-15T13:12:00",
"message": "GET /search HTTP/1.1 200 1070000",
"user": {
"id": "kimchy"
}
}
curl \
--request POST 'http://api.example.com/{index}/_doc/{id}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"@timestamp\": \"2099-11-15T13:12:00\",\n \"message\": \"GET /search HTTP/1.1 200 1070000\",\n \"user\": {\n \"id\": \"kimchy\"\n }\n}"'
{
"@timestamp": "2099-11-15T13:12:00",
"message": "GET /search HTTP/1.1 200 1070000",
"user": {
"id": "kimchy"
}
}
{
"@timestamp": "2099-11-15T13:12:00",
"message": "GET /search HTTP/1.1 200 1070000",
"user": {
"id": "kimchy"
}
}
{
"_shards": {
"total": 2,
"failed": 0,
"successful": 2
},
"_index": "my-index-000001",
"_id": "W0tpsmIBdwcYyG50zbta",
"_version": 1,
"_seq_no": 0,
"_primary_term": 1,
"result": "created"
}
{
"_shards": {
"total": 2,
"failed": 0,
"successful": 2
},
"_index": "my-index-000001",
"_id": "1",
"_version": 1,
"_seq_no": 0,
"_primary_term": 1,
"result": "created"
}
Delete documents
Added in 5.0.0
Deletes documents that match the specified query.
If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or alias:
read
delete
orwrite
You can specify the query criteria in the request URI or the request body using the same syntax as the search API. When you submit a delete by query request, Elasticsearch gets a snapshot of the data stream or index when it begins processing the request and deletes matching documents using internal versioning. If a document changes between the time that the snapshot is taken and the delete operation is processed, it results in a version conflict and the delete operation fails.
NOTE: Documents with a version equal to 0 cannot be deleted using delete by query because internal versioning does not support 0 as a valid version number.
While processing a delete by query request, Elasticsearch performs multiple search requests sequentially to find all of the matching documents to delete. A bulk delete request is performed for each batch of matching documents. If a search or bulk request is rejected, the requests are retried up to 10 times, with exponential back off. If the maximum retry limit is reached, processing halts and all failed requests are returned in the response. Any delete requests that completed successfully still stick, they are not rolled back.
You can opt to count version conflicts instead of halting and returning by setting conflicts
to proceed
.
Note that if you opt to count version conflicts the operation could attempt to delete more documents from the source than max_docs
until it has successfully deleted max_docs documents
, or it has gone through every document in the source query.
Throttling delete requests
To control the rate at which delete by query issues batches of delete operations, you can set requests_per_second
to any positive decimal number.
This pads each batch with a wait time to throttle the rate.
Set requests_per_second
to -1
to disable throttling.
Throttling uses a wait time between batches so that the internal scroll requests can be given a timeout that takes the request padding into account.
The padding time is the difference between the batch size divided by the requests_per_second
and the time spent writing.
By default the batch size is 1000
, so if requests_per_second
is set to 500
:
target_time = 1000 / 500 per second = 2 seconds
wait_time = target_time - write_time = 2 seconds - .5 seconds = 1.5 seconds
Since the batch is issued as a single _bulk
request, large batch sizes cause Elasticsearch to create many requests and wait before starting the next set.
This is "bursty" instead of "smooth".
Slicing
Delete by query supports sliced scroll to parallelize the delete process. This can improve efficiency and provide a convenient way to break the request down into smaller parts.
Setting slices
to auto
lets Elasticsearch choose the number of slices to use.
This setting will use one slice per shard, up to a certain limit.
If there are multiple source data streams or indices, it will choose the number of slices based on the index or backing index with the smallest number of shards.
Adding slices to the delete by query operation creates sub-requests which means it has some quirks:
- You can see these requests in the tasks APIs. These sub-requests are "child" tasks of the task for the request with slices.
- Fetching the status of the task for the request with slices only contains the status of completed slices.
- These sub-requests are individually addressable for things like cancellation and rethrottling.
- Rethrottling the request with
slices
will rethrottle the unfinished sub-request proportionally. - Canceling the request with
slices
will cancel each sub-request. - Due to the nature of
slices
each sub-request won't get a perfectly even portion of the documents. All documents will be addressed, but some slices may be larger than others. Expect larger slices to have a more even distribution. - Parameters like
requests_per_second
andmax_docs
on a request withslices
are distributed proportionally to each sub-request. Combine that with the earlier point about distribution being uneven and you should conclude that usingmax_docs
withslices
might not result in exactlymax_docs
documents being deleted. - Each sub-request gets a slightly different snapshot of the source data stream or index though these are all taken at approximately the same time.
If you're slicing manually or otherwise tuning automatic slicing, keep in mind that:
- Query performance is most efficient when the number of slices is equal to the number of shards in the index or backing index. If that number is large (for example, 500), choose a lower number as too many
slices
hurts performance. Settingslices
higher than the number of shards generally does not improve efficiency and adds overhead. - Delete performance scales linearly across available resources with the number of slices.
Whether query or delete performance dominates the runtime depends on the documents being reindexed and cluster resources.
Cancel a delete by query operation
Any delete by query can be canceled using the task cancel API. For example:
POST _tasks/r1A2WoRbTwKZ516z6NEs5A:36619/_cancel
The task ID can be found by using the get tasks API.
Cancellation should happen quickly but might take a few seconds. The get task status API will continue to list the delete by query task until this task checks that it has been cancelled and terminates itself.
Path parameters
-
index
string | array[string] Required A comma-separated list of data streams, indices, and aliases to search. It supports wildcards (
*
). To search all data streams or indices, omit this parameter or use*
or_all
.
Query parameters
-
allow_no_indices
boolean If
false
, the request returns an error if any wildcard expression, index alias, or_all
value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targetingfoo*,bar*
returns an error if an index starts withfoo
but no index starts withbar
. -
analyzer
string Analyzer to use for the query string. This parameter can be used only when the
q
query string parameter is specified. -
analyze_wildcard
boolean If
true
, wildcard and prefix queries are analyzed. This parameter can be used only when theq
query string parameter is specified. -
conflicts
string What to do if delete by query hits version conflicts:
abort
orproceed
.Supported values include:
abort
: Stop reindexing if there are conflicts.proceed
: Continue reindexing even if there are conflicts.
Values are
abort
orproceed
. -
default_operator
string The default operator for query string query:
AND
orOR
. This parameter can be used only when theq
query string parameter is specified.Values are
and
,AND
,or
, orOR
. -
df
string The field to use as default where no field prefix is given in the query string. This parameter can be used only when the
q
query string parameter is specified. -
expand_wildcards
string | array[string] The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as
open,hidden
.Supported values include:
all
: Match any data stream or index, including hidden ones.open
: Match open, non-hidden indices. Also matches any non-hidden data stream.closed
: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.hidden
: Match hidden data streams and hidden indices. Must be combined withopen
,closed
, orboth
.none
: Wildcard expressions are not accepted.
Values are
all
,open
,closed
,hidden
, ornone
. -
from
number Skips the specified number of documents.
-
lenient
boolean If
true
, format-based query failures (such as providing text to a numeric field) in the query string will be ignored. This parameter can be used only when theq
query string parameter is specified. -
max_docs
number The maximum number of documents to process. Defaults to all documents. When set to a value less then or equal to
scroll_size
, a scroll will not be used to retrieve the results for the operation. -
preference
string The node or shard the operation should be performed on. It is random by default.
-
refresh
boolean If
true
, Elasticsearch refreshes all shards involved in the delete by query after the request completes. This is different than the delete API'srefresh
parameter, which causes just the shard that received the delete request to be refreshed. Unlike the delete API, it does not supportwait_for
. -
request_cache
boolean If
true
, the request cache is used for this request. Defaults to the index-level setting. -
requests_per_second
number The throttle for this request in sub-requests per second.
-
routing
string A custom value used to route operations to a specific shard.
-
q
string A query in the Lucene query string syntax.
-
scroll
string The period to retain the search context for scrolling.
Values are
-1
or0
. -
scroll_size
number The size of the scroll request that powers the operation.
-
search_timeout
string The explicit timeout for each search request. It defaults to no timeout.
Values are
-1
or0
. -
search_type
string The type of the search operation. Available options include
query_then_fetch
anddfs_query_then_fetch
.Supported values include:
query_then_fetch
: Documents are scored using local term and document frequencies for the shard. This is usually faster but less accurate.dfs_query_then_fetch
: Documents are scored using global term and document frequencies across all shards. This is usually slower but more accurate.
Values are
query_then_fetch
ordfs_query_then_fetch
. -
slices
number | string The number of slices this task should be divided into.
Value is
auto
. -
sort
array[string] A comma-separated list of
<field>:<direction>
pairs. -
stats
array[string] The specific
tag
of the request for logging and statistical purposes. -
terminate_after
number The maximum number of documents to collect for each shard. If a query reaches this limit, Elasticsearch terminates the query early. Elasticsearch collects documents before sorting.
Use with caution. Elasticsearch applies this parameter to each shard handling the request. When possible, let Elasticsearch perform early termination automatically. Avoid specifying this parameter for requests that target data streams with backing indices across multiple data tiers.
-
timeout
string The period each deletion request waits for active shards.
Values are
-1
or0
. -
version
boolean If
true
, returns the document version as part of a hit. -
wait_for_active_shards
number | string The number of shard copies that must be active before proceeding with the operation. Set to
all
or any positive integer up to the total number of shards in the index (number_of_replicas+1
). Thetimeout
value controls how long each write request waits for unavailable shards to become available.Values are
all
orindex-setting
. -
wait_for_completion
boolean If
true
, the request blocks until the operation is complete. Iffalse
, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to cancel or get the status of the task. Elasticsearch creates a record of this task as a document at.tasks/task/${taskId}
. When you are done with a task, you should delete the task document so Elasticsearch can reclaim the space.
Body
Required
-
max_docs
number The maximum number of documents to delete.
-
query
object An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.
External documentation -
slice
object
POST /my-index-000001,my-index-000002/_delete_by_query
{
"query": {
"match_all": {}
}
}
curl \
--request POST 'http://api.example.com/{index}/_delete_by_query' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"query\": {\n \"match_all\": {}\n }\n}"'
{
"query": {
"match_all": {}
}
}
{
"query": {
"term": {
"user.id": "kimchy"
}
},
"max_docs": 1
}
{
"slice": {
"id": 0,
"max": 2
},
"query": {
"range": {
"http.response.bytes": {
"lt": 2000000
}
}
}
}
{
"query": {
"range": {
"http.response.bytes": {
"lt": 2000000
}
}
}
}
{
"took" : 147,
"timed_out": false,
"total": 119,
"deleted": 119,
"batches": 1,
"version_conflicts": 0,
"noops": 0,
"retries": {
"bulk": 0,
"search": 0
},
"throttled_millis": 0,
"requests_per_second": -1.0,
"throttled_until_millis": 0,
"failures" : [ ]
}
Check for a document source
Added in 5.4.0
Check whether a document source exists in an index. For example:
HEAD my-index-000001/_source/1
A document's source is not available if it is disabled in the mapping.
Query parameters
-
preference
string The node or shard the operation should be performed on. By default, the operation is randomized between the shard replicas.
-
realtime
boolean If
true
, the request is real-time as opposed to near-real-time. -
refresh
boolean If
true
, the request refreshes the relevant shards before retrieving the document. Setting it totrue
should be done after careful thought and verification that this does not cause a heavy load on the system (and slow down indexing). -
routing
string A custom value used to route operations to a specific shard.
-
_source
boolean | string | array[string] Indicates whether to return the
_source
field (true
orfalse
) or lists the fields to return. -
_source_excludes
string | array[string] A comma-separated list of source fields to exclude in the response.
-
_source_includes
string | array[string] A comma-separated list of source fields to include in the response.
-
version
number The version number for concurrency control. It must match the current version of the document for the request to succeed.
-
version_type
string The version type.
Supported values include:
internal
: Use internal versioning that starts at 1 and increments with each update or delete.external
: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.external_gte
: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: Theexternal_gte
version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.force
: This option is deprecated because it can cause primary and replica shards to diverge.
Values are
internal
,external
,external_gte
, orforce
.
curl \
--request HEAD 'http://api.example.com/{index}/_source/{id}' \
--header "Authorization: $API_KEY"
Get term vector information
Get information and statistics about terms in the fields of a particular document.
You can retrieve term vectors for documents stored in the index or for artificial documents passed in the body of the request.
You can specify the fields you are interested in through the fields
parameter or by adding the fields to the request body.
For example:
GET /my-index-000001/_termvectors/1?fields=message
Fields can be specified using wildcards, similar to the multi match query.
Term vectors are real-time by default, not near real-time.
This can be changed by setting realtime
parameter to false
.
You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields but term statistics are excluded.
Term information
- term frequency in the field (always returned)
- term positions (
positions: true
) - start and end offsets (
offsets: true
) - term payloads (
payloads: true
), as base64 encoded bytes
If the requested information wasn't stored in the index, it will be computed on the fly if possible. Additionally, term vectors could be computed for documents not even existing in the index, but instead provided by the user.
Start and end offsets assume UTF-16 encoding is being used. If you want to use these offsets in order to get the original text that produced this token, you should make sure that the string you are taking a sub-string of is also encoded using UTF-16.
Behaviour
The term and field statistics are not accurate.
Deleted documents are not taken into account.
The information is only retrieved for the shard the requested document resides in.
The term and field statistics are therefore only useful as relative measures whereas the absolute numbers have no meaning in this context.
By default, when requesting term vectors of artificial documents, a shard to get the statistics from is randomly selected.
Use routing
only to hit a particular shard.
Query parameters
-
fields
string | array[string] A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the
completion_fields
orfielddata_fields
parameters. -
field_statistics
boolean If
true
, the response includes:- The document count (how many documents contain this field).
- The sum of document frequencies (the sum of document frequencies for all terms in this field).
- The sum of total term frequencies (the sum of total term frequencies of each term in this field).
-
offsets
boolean If
true
, the response includes term offsets. -
payloads
boolean If
true
, the response includes term payloads. -
positions
boolean If
true
, the response includes term positions. -
preference
string The node or shard the operation should be performed on. It is random by default.
-
realtime
boolean If true, the request is real-time as opposed to near-real-time.
-
routing
string A custom value that is used to route operations to a specific shard.
-
term_statistics
boolean If
true
, the response includes:- The total term frequency (how often a term occurs in all documents).
- The document frequency (the number of documents containing the current term).
By default these values are not returned since term statistics can have a serious performance impact.
-
version
number If
true
, returns the document version as part of a hit. -
version_type
string The version type.
Supported values include:
internal
: Use internal versioning that starts at 1 and increments with each update or delete.external
: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.external_gte
: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: Theexternal_gte
version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.force
: This option is deprecated because it can cause primary and replica shards to diverge.
Values are
internal
,external
,external_gte
, orforce
.
Body
-
doc
object An artificial document (a document not present in the index) for which you want to retrieve term vectors.
-
filter
object -
per_field_analyzer
object Override the default per-field analyzer. This is useful in order to generate term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.
-
fields
string | array[string] -
field_statistics
boolean If
true
, the response includes:- The document count (how many documents contain this field).
- The sum of document frequencies (the sum of document frequencies for all terms in this field).
- The sum of total term frequencies (the sum of total term frequencies of each term in this field).
-
offsets
boolean If
true
, the response includes term offsets. -
payloads
boolean If
true
, the response includes term payloads. -
positions
boolean If
true
, the response includes term positions. -
term_statistics
boolean If
true
, the response includes:- The total term frequency (how often a term occurs in all documents).
- The document frequency (the number of documents containing the current term).
By default these values are not returned since term statistics can have a serious performance impact.
-
routing
string -
version
number -
version_type
string Values are
internal
,external
,external_gte
, orforce
.
GET /my-index-000001/_termvectors/1
{
"fields" : ["text"],
"offsets" : true,
"payloads" : true,
"positions" : true,
"term_statistics" : true,
"field_statistics" : true
}
curl \
--request POST 'http://api.example.com/{index}/_termvectors/{id}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"fields\" : [\"text\"],\n \"offsets\" : true,\n \"payloads\" : true,\n \"positions\" : true,\n \"term_statistics\" : true,\n \"field_statistics\" : true\n}"'
{
"fields" : ["text"],
"offsets" : true,
"payloads" : true,
"positions" : true,
"term_statistics" : true,
"field_statistics" : true
}
{
"doc" : {
"fullname" : "John Doe",
"text" : "test test test"
},
"fields": ["fullname"],
"per_field_analyzer" : {
"fullname": "keyword"
}
}
{
"doc": {
"plot": "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil."
},
"term_statistics": true,
"field_statistics": true,
"positions": false,
"offsets": false,
"filter": {
"max_num_terms": 3,
"min_term_freq": 1,
"min_doc_freq": 1
}
}
{
"fields" : ["text", "some_field_without_term_vectors"],
"offsets" : true,
"positions" : true,
"term_statistics" : true,
"field_statistics" : true
}
{
"doc" : {
"fullname" : "John Doe",
"text" : "test test test"
}
}
{
"_index": "my-index-000001",
"_id": "1",
"_version": 1,
"found": true,
"took": 6,
"term_vectors": {
"text": {
"field_statistics": {
"sum_doc_freq": 4,
"doc_count": 2,
"sum_ttf": 6
},
"terms": {
"test": {
"doc_freq": 2,
"ttf": 4,
"term_freq": 3,
"tokens": [
{
"position": 0,
"start_offset": 0,
"end_offset": 4,
"payload": "d29yZA=="
},
{
"position": 1,
"start_offset": 5,
"end_offset": 9,
"payload": "d29yZA=="
},
{
"position": 2,
"start_offset": 10,
"end_offset": 14,
"payload": "d29yZA=="
}
]
}
}
}
}
}
{
"_index": "my-index-000001",
"_version": 0,
"found": true,
"took": 6,
"term_vectors": {
"fullname": {
"field_statistics": {
"sum_doc_freq": 2,
"doc_count": 4,
"sum_ttf": 4
},
"terms": {
"John Doe": {
"term_freq": 1,
"tokens": [
{
"position": 0,
"start_offset": 0,
"end_offset": 8
}
]
}
}
}
}
}
{
"_index": "imdb",
"_version": 0,
"found": true,
"term_vectors": {
"plot": {
"field_statistics": {
"sum_doc_freq": 3384269,
"doc_count": 176214,
"sum_ttf": 3753460
},
"terms": {
"armored": {
"doc_freq": 27,
"ttf": 27,
"term_freq": 1,
"score": 9.74725
},
"industrialist": {
"doc_freq": 88,
"ttf": 88,
"term_freq": 1,
"score": 8.590818
},
"stark": {
"doc_freq": 44,
"ttf": 47,
"term_freq": 1,
"score": 9.272792
}
}
}
}
}
Get term vector information
Get information and statistics about terms in the fields of a particular document.
You can retrieve term vectors for documents stored in the index or for artificial documents passed in the body of the request.
You can specify the fields you are interested in through the fields
parameter or by adding the fields to the request body.
For example:
GET /my-index-000001/_termvectors/1?fields=message
Fields can be specified using wildcards, similar to the multi match query.
Term vectors are real-time by default, not near real-time.
This can be changed by setting realtime
parameter to false
.
You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields but term statistics are excluded.
Term information
- term frequency in the field (always returned)
- term positions (
positions: true
) - start and end offsets (
offsets: true
) - term payloads (
payloads: true
), as base64 encoded bytes
If the requested information wasn't stored in the index, it will be computed on the fly if possible. Additionally, term vectors could be computed for documents not even existing in the index, but instead provided by the user.
Start and end offsets assume UTF-16 encoding is being used. If you want to use these offsets in order to get the original text that produced this token, you should make sure that the string you are taking a sub-string of is also encoded using UTF-16.
Behaviour
The term and field statistics are not accurate.
Deleted documents are not taken into account.
The information is only retrieved for the shard the requested document resides in.
The term and field statistics are therefore only useful as relative measures whereas the absolute numbers have no meaning in this context.
By default, when requesting term vectors of artificial documents, a shard to get the statistics from is randomly selected.
Use routing
only to hit a particular shard.
Path parameters
-
index
string Required The name of the index that contains the document.
Query parameters
-
fields
string | array[string] A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the
completion_fields
orfielddata_fields
parameters. -
field_statistics
boolean If
true
, the response includes:- The document count (how many documents contain this field).
- The sum of document frequencies (the sum of document frequencies for all terms in this field).
- The sum of total term frequencies (the sum of total term frequencies of each term in this field).
-
offsets
boolean If
true
, the response includes term offsets. -
payloads
boolean If
true
, the response includes term payloads. -
positions
boolean If
true
, the response includes term positions. -
preference
string The node or shard the operation should be performed on. It is random by default.
-
realtime
boolean If true, the request is real-time as opposed to near-real-time.
-
routing
string A custom value that is used to route operations to a specific shard.
-
term_statistics
boolean If
true
, the response includes:- The total term frequency (how often a term occurs in all documents).
- The document frequency (the number of documents containing the current term).
By default these values are not returned since term statistics can have a serious performance impact.
-
version
number If
true
, returns the document version as part of a hit. -
version_type
string The version type.
Supported values include:
internal
: Use internal versioning that starts at 1 and increments with each update or delete.external
: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.external_gte
: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: Theexternal_gte
version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.force
: This option is deprecated because it can cause primary and replica shards to diverge.
Values are
internal
,external
,external_gte
, orforce
.
Body
-
doc
object An artificial document (a document not present in the index) for which you want to retrieve term vectors.
-
filter
object -
per_field_analyzer
object Override the default per-field analyzer. This is useful in order to generate term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.
-
fields
string | array[string] -
field_statistics
boolean If
true
, the response includes:- The document count (how many documents contain this field).
- The sum of document frequencies (the sum of document frequencies for all terms in this field).
- The sum of total term frequencies (the sum of total term frequencies of each term in this field).
-
offsets
boolean If
true
, the response includes term offsets. -
payloads
boolean If
true
, the response includes term payloads. -
positions
boolean If
true
, the response includes term positions. -
term_statistics
boolean If
true
, the response includes:- The total term frequency (how often a term occurs in all documents).
- The document frequency (the number of documents containing the current term).
By default these values are not returned since term statistics can have a serious performance impact.
-
routing
string -
version
number -
version_type
string Values are
internal
,external
,external_gte
, orforce
.
GET /my-index-000001/_termvectors/1
{
"fields" : ["text"],
"offsets" : true,
"payloads" : true,
"positions" : true,
"term_statistics" : true,
"field_statistics" : true
}
curl \
--request GET 'http://api.example.com/{index}/_termvectors' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"fields\" : [\"text\"],\n \"offsets\" : true,\n \"payloads\" : true,\n \"positions\" : true,\n \"term_statistics\" : true,\n \"field_statistics\" : true\n}"'
{
"fields" : ["text"],
"offsets" : true,
"payloads" : true,
"positions" : true,
"term_statistics" : true,
"field_statistics" : true
}
{
"doc" : {
"fullname" : "John Doe",
"text" : "test test test"
},
"fields": ["fullname"],
"per_field_analyzer" : {
"fullname": "keyword"
}
}
{
"doc": {
"plot": "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil."
},
"term_statistics": true,
"field_statistics": true,
"positions": false,
"offsets": false,
"filter": {
"max_num_terms": 3,
"min_term_freq": 1,
"min_doc_freq": 1
}
}
{
"fields" : ["text", "some_field_without_term_vectors"],
"offsets" : true,
"positions" : true,
"term_statistics" : true,
"field_statistics" : true
}
{
"doc" : {
"fullname" : "John Doe",
"text" : "test test test"
}
}
{
"_index": "my-index-000001",
"_id": "1",
"_version": 1,
"found": true,
"took": 6,
"term_vectors": {
"text": {
"field_statistics": {
"sum_doc_freq": 4,
"doc_count": 2,
"sum_ttf": 6
},
"terms": {
"test": {
"doc_freq": 2,
"ttf": 4,
"term_freq": 3,
"tokens": [
{
"position": 0,
"start_offset": 0,
"end_offset": 4,
"payload": "d29yZA=="
},
{
"position": 1,
"start_offset": 5,
"end_offset": 9,
"payload": "d29yZA=="
},
{
"position": 2,
"start_offset": 10,
"end_offset": 14,
"payload": "d29yZA=="
}
]
}
}
}
}
}
{
"_index": "my-index-000001",
"_version": 0,
"found": true,
"took": 6,
"term_vectors": {
"fullname": {
"field_statistics": {
"sum_doc_freq": 2,
"doc_count": 4,
"sum_ttf": 4
},
"terms": {
"John Doe": {
"term_freq": 1,
"tokens": [
{
"position": 0,
"start_offset": 0,
"end_offset": 8
}
]
}
}
}
}
}
{
"_index": "imdb",
"_version": 0,
"found": true,
"term_vectors": {
"plot": {
"field_statistics": {
"sum_doc_freq": 3384269,
"doc_count": 176214,
"sum_ttf": 3753460
},
"terms": {
"armored": {
"doc_freq": 27,
"ttf": 27,
"term_freq": 1,
"score": 9.74725
},
"industrialist": {
"doc_freq": 88,
"ttf": 88,
"term_freq": 1,
"score": 8.590818
},
"stark": {
"doc_freq": 44,
"ttf": 47,
"term_freq": 1,
"score": 9.272792
}
}
}
}
}
Query parameters
-
master_timeout
string Period to wait for a connection to the master node.
Values are
-1
or0
.
curl \
--request GET 'http://api.example.com/_enrich/policy' \
--header "Authorization: $API_KEY"
Get async EQL search results
Added in 7.9.0
Get the current status and available results for an async EQL search or a stored synchronous EQL search.
Path parameters
-
id
string Required Identifier for the search.
Query parameters
-
keep_alive
string Period for which the search and its results are stored on the cluster. Defaults to the keep_alive value set by the search’s EQL search API request.
Values are
-1
or0
. -
wait_for_completion_timeout
string Timeout duration to wait for the request to finish. Defaults to no timeout, meaning the request waits for complete search results.
Values are
-1
or0
.
curl \
--request GET 'http://api.example.com/_eql/search/{id}' \
--header "Authorization: $API_KEY"
Delete an async EQL search
Added in 7.9.0
Delete an async EQL search or a stored synchronous EQL search. The API also deletes results for the search.
Path parameters
-
id
string Required Identifier for the search to delete. A search ID is provided in the EQL search API's response for an async search. A search ID is also provided if the request’s
keep_on_completion
parameter istrue
.
curl \
--request DELETE 'http://api.example.com/_eql/search/{id}' \
--header "Authorization: $API_KEY"
Delete an async ES|QL query
Added in 8.13.0
If the query is still running, it is cancelled. Otherwise, the stored results are deleted.
If the Elasticsearch security features are enabled, only the following users can use this API to delete a query:
- The authenticated user that submitted the original query request
- Users with the
cancel_task
cluster privilege
Path parameters
-
id
string Required The unique identifier of the query. A query ID is provided in the ES|QL async query API response for a query that does not complete in the designated time. A query ID is also provided when the request was submitted with the
keep_on_completion
parameter set totrue
.
curl \
--request DELETE 'http://api.example.com/_query/async/{id}' \
--header "Authorization: $API_KEY"
Reset the features
Technical preview
Clear all of the state information stored in system indices by Elasticsearch features, including the security and machine learning indices.
WARNING: Intended for development and testing use only. Do not reset features on a production cluster.
Return a cluster to the same state as a new installation by resetting the feature state for all Elasticsearch features. This deletes all state information stored in system indices.
The response code is HTTP 200 if the state is successfully reset for all features. It is HTTP 500 if the reset operation failed for any feature.
Note that select features might provide a way to reset particular system indices. Using this API resets all features, both those that are built-in and implemented as plugins.
To list the features that will be affected, use the get features API.
IMPORTANT: The features installed on the node you submit this request to are the features that will be reset. Run on the master node if you have any doubts about which plugins are installed on individual nodes.
Query parameters
-
master_timeout
string Period to wait for a connection to the master node.
Values are
-1
or0
.
curl \
--request POST 'http://api.example.com/_features/_reset' \
--header "Authorization: $API_KEY"
{
"features" : [
{
"feature_name" : "security",
"status" : "SUCCESS"
},
{
"feature_name" : "tasks",
"status" : "SUCCESS"
}
]
}
Run a Fleet search
Technical preview
The purpose of the Fleet search API is to provide an API where the search will be run only after the provided checkpoint has been processed and is visible for searches inside of Elasticsearch.
Path parameters
-
index
string Required A single target to search. If the target is an index alias, it must resolve to a single index.
Query parameters
-
allow_no_indices
boolean -
analyzer
string -
analyze_wildcard
boolean -
batched_reduce_size
number -
ccs_minimize_roundtrips
boolean -
default_operator
string Values are
and
,AND
,or
, orOR
. -
df
string -
docvalue_fields
string | array[string] -
expand_wildcards
string | array[string] Supported values include:
all
: Match any data stream or index, including hidden ones.open
: Match open, non-hidden indices. Also matches any non-hidden data stream.closed
: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.hidden
: Match hidden data streams and hidden indices. Must be combined withopen
,closed
, orboth
.none
: Wildcard expressions are not accepted.
Values are
all
,open
,closed
,hidden
, ornone
. -
explain
boolean -
ignore_throttled
boolean -
lenient
boolean -
preference
string -
pre_filter_shard_size
number -
request_cache
boolean -
routing
string -
scroll
string A duration. Units can be
nanos
,micros
,ms
(milliseconds),s
(seconds),m
(minutes),h
(hours) andd
(days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.Values are
-1
or0
. -
search_type
string Supported values include:
query_then_fetch
: Documents are scored using local term and document frequencies for the shard. This is usually faster but less accurate.dfs_query_then_fetch
: Documents are scored using global term and document frequencies across all shards. This is usually slower but more accurate.
Values are
query_then_fetch
ordfs_query_then_fetch
. -
stats
array[string] -
stored_fields
string | array[string] -
suggest_field
string Specifies which field to use for suggestions.
-
suggest_mode
string Supported values include:
missing
: Only generate suggestions for terms that are not in the shard.popular
: Only suggest terms that occur in more docs on the shard than the original term.always
: Suggest any matching suggestions based on terms in the suggest text.
Values are
missing
,popular
, oralways
. -
suggest_size
number -
suggest_text
string The source text for which the suggestions should be returned.
-
terminate_after
number -
timeout
string A duration. Units can be
nanos
,micros
,ms
(milliseconds),s
(seconds),m
(minutes),h
(hours) andd
(days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.Values are
-1
or0
. -
track_total_hits
boolean | number Number of hits matching the query to count accurately. If true, the exact number of hits is returned at the cost of some performance. If false, the response does not include the total number of hits matching the query. Defaults to 10,000 hits.
-
track_scores
boolean -
typed_keys
boolean -
rest_total_hits_as_int
boolean -
version
boolean -
_source
boolean | string | array[string] Defines how to fetch a source. Fetching can be disabled entirely, or the source can be filtered. Used as a query parameter along with the
_source_includes
and_source_excludes
parameters. -
_source_excludes
string | array[string] -
_source_includes
string | array[string] -
seq_no_primary_term
boolean -
q
string -
size
number -
from
number -
sort
string | array[string] -
wait_for_checkpoints
array[number] A comma separated list of checkpoints. When configured, the search API will only be executed on a shard after the relevant checkpoint has become visible for search. Defaults to an empty list which will cause Elasticsearch to immediately execute the search.
-
allow_partial_search_results
boolean If true, returns partial results if there are shard request timeouts or shard failures. If false, returns an error with no partial results. Defaults to the configured cluster setting
search.default_allow_partial_results
, which is true by default.
Body
-
aggregations
object -
collapse
object External documentation -
explain
boolean If true, returns detailed information about score computation as part of a hit.
-
ext
object Configuration of search extensions defined by Elasticsearch plugins.
-
from
number Starting document offset. By default, you cannot page through more than 10,000 hits using the from and size parameters. To page through more hits, use the search_after parameter.
-
highlight
object -
track_total_hits
boolean | number Number of hits matching the query to count accurately. If true, the exact number of hits is returned at the cost of some performance. If false, the response does not include the total number of hits matching the query. Defaults to 10,000 hits.
-
indices_boost
array[object] Boosts the _score of documents from specified indices.
-
docvalue_fields
array[object] Array of wildcard (*) patterns. The request returns doc values for field names matching these patterns in the hits.fields property of the response.
-
min_score
number Minimum _score for matching documents. Documents with a lower _score are not included in search results and results collected by aggregations.
-
post_filter
object An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.
External documentation -
profile
boolean -
query
object An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.
External documentation rescore
object | array[object] -
script_fields
object Retrieve a script evaluation (based on different fields) for each hit.
-
search_after
array[number | string | boolean | null] A field value.
-
size
number The number of hits to return. By default, you cannot page through more than 10,000 hits using the from and size parameters. To page through more hits, use the search_after parameter.
-
slice
object _source
boolean | object Defines how to fetch a source. Fetching can be disabled entirely, or the source can be filtered.
-
fields
array[object] Array of wildcard (*) patterns. The request returns values for field names matching these patterns in the hits.fields property of the response.
-
suggest
object -
terminate_after
number Maximum number of documents to collect for each shard. If a query reaches this limit, Elasticsearch terminates the query early. Elasticsearch collects documents before sorting. Defaults to 0, which does not terminate query execution early.
-
timeout
string Specifies the period of time to wait for a response from each shard. If no response is received before the timeout expires, the request fails and returns an error. Defaults to no timeout.
-
track_scores
boolean If true, calculate and return document scores, even if the scores are not used for sorting.
-
version
boolean If true, returns document version as part of a hit.
-
seq_no_primary_term
boolean If true, returns sequence number and primary term of the last modification of each hit. See Optimistic concurrency control.
-
stored_fields
string | array[string] -
pit
object -
runtime_mappings
object -
stats
array[string] Stats groups to associate with the search. Each group maintains a statistics aggregation for its associated searches. You can retrieve these stats using the indices stats API.
curl \
--request POST 'http://api.example.com/{index}/_fleet/_fleet_search' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '{"aggregations":{},"collapse":{},"explain":true,"ext":{"additionalProperty1":{},"additionalProperty2":{}},"from":42.0,"highlight":{"":"plain","boundary_chars":"string","boundary_max_scan":42.0,"boundary_scanner":"chars","boundary_scanner_locale":"string","force_source":true,"fragmenter":"simple","fragment_size":42.0,"highlight_filter":true,"highlight_query":{},"max_fragment_length":42.0,"max_analyzed_offset":42.0,"no_match_size":42.0,"number_of_fragments":42.0,"options":{"additionalProperty1":{},"additionalProperty2":{}},"order":"score","phrase_limit":42.0,"post_tags":["string"],"pre_tags":["string"],"require_field_match":true,"tags_schema":"styled","encoder":"default","fields":{}},"track_total_hits":true,"indices_boost":[{"additionalProperty1":42.0,"additionalProperty2":42.0}],"docvalue_fields":[{"field":"string","format":"string","include_unmapped":true}],"min_score":42.0,"post_filter":{},"profile":true,"query":{},"rescore":{"window_size":42.0,"query":{"rescore_query":{},"query_weight":42.0,"rescore_query_weight":42.0,"score_mode":"avg"},"learning_to_rank":{"model_id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}}}},"script_fields":{"additionalProperty1":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true},"additionalProperty2":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true}},"search_after":[42.0],"size":42.0,"slice":{"field":"string","id":"string","max":42.0},"":true,"fields":[{"field":"string","format":"string","include_unmapped":true}],"suggest":{"text":"string"},"terminate_after":42.0,"timeout":"string","track_scores":true,"version":true,"seq_no_primary_term":true,"stored_fields":"string","pit":{"id":"string","keep_alive":"string"},"runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"stats":["string"]}'
Get component templates
Added in 7.8.0
Get information about component templates.
Path parameters
-
name
string Required Comma-separated list of component template names used to limit the request. Wildcard (
*
) expressions are supported.
Query parameters
-
flat_settings
boolean If
true
, returns settings in flat format. -
include_defaults
boolean Return all default configurations for the component template (default: false)
-
local
boolean If
true
, the request retrieves information from the local node only. Iffalse
, information is retrieved from the master node. -
master_timeout
string Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
Values are
-1
or0
.
curl \
--request GET 'http://api.example.com/_component_template/{name}' \
--header "Authorization: $API_KEY"
Get index recovery information
Get information about ongoing and completed shard recoveries for one or more indices. For data streams, the API returns information for the stream's backing indices.
All recoveries, whether ongoing or complete, are kept in the cluster state and may be reported on at any time.
Shard recovery is the process of initializing a shard copy, such as restoring a primary shard from a snapshot or creating a replica shard from a primary shard. When a shard recovery completes, the recovered shard is available for search and indexing.
Recovery automatically occurs during the following processes:
- When creating an index for the first time.
- When a node rejoins the cluster and starts up any missing primary shard copies using the data that it holds in its data path.
- Creation of new replica shard copies from the primary.
- Relocation of a shard copy to a different node in the same cluster.
- A snapshot restore operation.
- A clone, shrink, or split operation.
You can determine the cause of a shard recovery using the recovery or cat recovery APIs.
The index recovery API reports information about completed recoveries only for shard copies that currently exist in the cluster. It only reports the last recovery for each shard copy and does not report historical information about earlier recoveries, nor does it report information about the recoveries of shard copies that no longer exist. This means that if a shard copy completes a recovery and then Elasticsearch relocates it onto a different node then the information about the original recovery will not be shown in the recovery API.
Path parameters
-
index
string | array[string] Required Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (
*
). To target all data streams and indices, omit this parameter or use*
or_all
.
Query parameters
-
active_only
boolean If
true
, the response only includes ongoing shard recoveries. -
detailed
boolean If
true
, the response includes detailed information about shard recoveries.
GET /_recovery?human
curl \
--request GET 'http://api.example.com/{index}/_recovery' \
--header "Authorization: $API_KEY"
{
"index1" : {
"shards" : [ {
"id" : 0,
"type" : "SNAPSHOT",
"stage" : "INDEX",
"primary" : true,
"start_time" : "2014-02-24T12:15:59.716",
"start_time_in_millis": 1393244159716,
"stop_time" : "0s",
"stop_time_in_millis" : 0,
"total_time" : "2.9m",
"total_time_in_millis" : 175576,
"source" : {
"repository" : "my_repository",
"snapshot" : "my_snapshot",
"index" : "index1",
"version" : "{version}",
"restoreUUID": "PDh1ZAOaRbiGIVtCvZOMww"
},
"target" : {
"id" : "ryqJ5lO5S4-lSFbGntkEkg",
"host" : "my.fqdn",
"transport_address" : "my.fqdn",
"ip" : "10.0.1.7",
"name" : "my_es_node"
},
"index" : {
"size" : {
"total" : "75.4mb",
"total_in_bytes" : 79063092,
"reused" : "0b",
"reused_in_bytes" : 0,
"recovered" : "65.7mb",
"recovered_in_bytes" : 68891939,
"recovered_from_snapshot" : "0b",
"recovered_from_snapshot_in_bytes" : 0,
"percent" : "87.1%"
},
"files" : {
"total" : 73,
"reused" : 0,
"recovered" : 69,
"percent" : "94.5%"
},
"total_time" : "0s",
"total_time_in_millis" : 0,
"source_throttle_time" : "0s",
"source_throttle_time_in_millis" : 0,
"target_throttle_time" : "0s",
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time" : "0s",
"total_time_in_millis" : 0
},
"verify_index" : {
"check_index_time" : "0s",
"check_index_time_in_millis" : 0,
"total_time" : "0s",
"total_time_in_millis" : 0
}
} ]
}
}
{
"index1" : {
"shards" : [ {
"id" : 0,
"type" : "EXISTING_STORE",
"stage" : "DONE",
"primary" : true,
"start_time" : "2014-02-24T12:38:06.349",
"start_time_in_millis" : "1393245486349",
"stop_time" : "2014-02-24T12:38:08.464",
"stop_time_in_millis" : "1393245488464",
"total_time" : "2.1s",
"total_time_in_millis" : 2115,
"source" : {
"id" : "RGMdRc-yQWWKIBM4DGvwqQ",
"host" : "my.fqdn",
"transport_address" : "my.fqdn",
"ip" : "10.0.1.7",
"name" : "my_es_node"
},
"target" : {
"id" : "RGMdRc-yQWWKIBM4DGvwqQ",
"host" : "my.fqdn",
"transport_address" : "my.fqdn",
"ip" : "10.0.1.7",
"name" : "my_es_node"
},
"index" : {
"size" : {
"total" : "24.7mb",
"total_in_bytes" : 26001617,
"reused" : "24.7mb",
"reused_in_bytes" : 26001617,
"recovered" : "0b",
"recovered_in_bytes" : 0,
"recovered_from_snapshot" : "0b",
"recovered_from_snapshot_in_bytes" : 0,
"percent" : "100.0%"
},
"files" : {
"total" : 26,
"reused" : 26,
"recovered" : 0,
"percent" : "100.0%",
"details" : [ {
"name" : "segments.gen",
"length" : 20,
"recovered" : 20
}, {
"name" : "_0.cfs",
"length" : 135306,
"recovered" : 135306,
"recovered_from_snapshot": 0
}, {
"name" : "segments_2",
"length" : 251,
"recovered" : 251,
"recovered_from_snapshot": 0
}
]
},
"total_time" : "2ms",
"total_time_in_millis" : 2,
"source_throttle_time" : "0s",
"source_throttle_time_in_millis" : 0,
"target_throttle_time" : "0s",
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 71,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time" : "2.0s",
"total_time_in_millis" : 2025
},
"verify_index" : {
"check_index_time" : 0,
"check_index_time_in_millis" : 0,
"total_time" : "88ms",
"total_time_in_millis" : 88
}
} ]
}
}
Refresh an index
A refresh makes recent operations performed on one or more indices available for search. For data streams, the API runs the refresh operation on the stream’s backing indices.
By default, Elasticsearch periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds.
You can change this default interval with the index.refresh_interval
setting.
Refresh requests are synchronous and do not return a response until the refresh operation completes.
Refreshes are resource-intensive. To ensure good cluster performance, it's recommended to wait for Elasticsearch's periodic refresh rather than performing an explicit refresh when possible.
If your application workflow indexes documents and then runs a search to retrieve the indexed document, it's recommended to use the index API's refresh=wait_for
query parameter option.
This option ensures the indexing operation waits for a periodic refresh before running the search.
Query parameters
-
allow_no_indices
boolean If
false
, the request returns an error if any wildcard expression, index alias, or_all
value targets only missing or closed indices. This behavior applies even if the request targets other open indices. -
expand_wildcards
string | array[string] Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as
open,hidden
. Valid values are:all
,open
,closed
,hidden
,none
.Supported values include:
all
: Match any data stream or index, including hidden ones.open
: Match open, non-hidden indices. Also matches any non-hidden data stream.closed
: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.hidden
: Match hidden data streams and hidden indices. Must be combined withopen
,closed
, orboth
.none
: Wildcard expressions are not accepted.
Values are
all
,open
,closed
,hidden
, ornone
.
curl \
--request GET 'http://api.example.com/_refresh' \
--header "Authorization: $API_KEY"
curl \
--request GET 'http://api.example.com/_inference' \
--header "Authorization: $API_KEY"
Create a Google Vertex AI inference endpoint
Added in 8.15.0
Create an inference endpoint to perform an inference task with the googlevertexai
service.
Path parameters
-
task_type
string Required The type of the inference task that the model will perform.
Values are
rerank
ortext_embedding
. -
googlevertexai_inference_id
string Required The unique identifier of the inference endpoint.
Body
-
chunking_settings
object -
service
string Required Value is
googlevertexai
. -
service_settings
object Required -
task_settings
object
PUT _inference/text_embedding/google_vertex_ai_embeddingss
{
"service": "googlevertexai",
"service_settings": {
"service_account_json": "service-account-json",
"model_id": "model-id",
"location": "location",
"project_id": "project-id"
}
}
curl \
--request PUT 'http://api.example.com/_inference/{task_type}/{googlevertexai_inference_id}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"service\": \"googlevertexai\",\n \"service_settings\": {\n \"service_account_json\": \"service-account-json\",\n \"model_id\": \"model-id\",\n \"location\": \"location\",\n \"project_id\": \"project-id\"\n }\n}"'
{
"service": "googlevertexai",
"service_settings": {
"service_account_json": "service-account-json",
"model_id": "model-id",
"location": "location",
"project_id": "project-id"
}
}
{
"service": "googlevertexai",
"service_settings": {
"service_account_json": "service-account-json",
"project_id": "project-id"
}
}
Licensing
Licensing APIs enable you to manage your licenses.
Script
Use the script support APIs to get a list of supported script contexts and languages. Use the stored script APIs to manage stored scripts and search templates.
Cancel a task
Technical preview
WARNING: The task management API is new and should still be considered a beta feature. The API may change in ways that are not backwards compatible.
A task may continue to run for some time after it has been cancelled because it may not be able to safely stop its current activity straight away. It is also possible that Elasticsearch must complete its work on other tasks before it can process the cancellation. The get task information API will continue to list these cancelled tasks until they complete. The cancelled flag in the response indicates that the cancellation command has been processed and the task will stop as soon as possible.
To troubleshoot why a cancelled task does not complete promptly, use the get task information API with the ?detailed
parameter to identify the other tasks the system is running.
You can also use the node hot threads API to obtain detailed information about the work the system is doing instead of completing the cancelled task.
Query parameters
-
actions
string | array[string] A comma-separated list or wildcard expression of actions that is used to limit the request.
-
nodes
array[string] A comma-separated list of node IDs or names that is used to limit the request.
-
parent_task_id
string A parent task ID that is used to limit the tasks.
-
wait_for_completion
boolean If true, the request blocks until all found tasks are complete.
curl \
--request POST 'http://api.example.com/_tasks/_cancel' \
--header "Authorization: $API_KEY"
Get task information
Technical preview
Get information about a task currently running in the cluster.
WARNING: The task management API is new and should still be considered a beta feature. The API may change in ways that are not backwards compatible.
If the task identifier is not found, a 404 response code indicates that there are no resources that match the request.
Path parameters
-
task_id
string Required The task identifier.
Query parameters
-
timeout
string The period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
Values are
-1
or0
. -
wait_for_completion
boolean If
true
, the request blocks until the task has completed.
GET _tasks?actions=cluster:*
curl \
--request GET 'http://api.example.com/_tasks/{task_id}' \
--header "Authorization: $API_KEY"
{
"nodes" : {
"oTUltX4IQMOUUVeiohTt8A" : {
"name" : "H5dfFeA",
"transport_address" : "127.0.0.1:9300",
"host" : "127.0.0.1",
"ip" : "127.0.0.1:9300",
"tasks" : {
"oTUltX4IQMOUUVeiohTt8A:124" : {
"node" : "oTUltX4IQMOUUVeiohTt8A",
"id" : 124,
"type" : "direct",
"action" : "cluster:monitor/tasks/lists[n]",
"start_time_in_millis" : 1458585884904,
"running_time_in_nanos" : 47402,
"cancellable" : false,
"parent_task_id" : "oTUltX4IQMOUUVeiohTt8A:123"
},
"oTUltX4IQMOUUVeiohTt8A:123" : {
"node" : "oTUltX4IQMOUUVeiohTt8A",
"id" : 123,
"type" : "transport",
"action" : "cluster:monitor/tasks/lists",
"start_time_in_millis" : 1458585884904,
"running_time_in_nanos" : 236042,
"cancellable" : false
}
}
}
}
}
{
"nodes" : {
"r1A2WoRbTwKZ516z6NEs5A" : {
"name" : "r1A2WoR",
"transport_address" : "127.0.0.1:9300",
"host" : "127.0.0.1",
"ip" : "127.0.0.1:9300",
"attributes" : {
"testattr" : "test",
"portsfile" : "true"
},
"tasks" : {
"r1A2WoRbTwKZ516z6NEs5A:36619" : {
"node" : "r1A2WoRbTwKZ516z6NEs5A",
"id" : 36619,
"type" : "transport",
"action" : "indices:data/write/delete/byquery",
"status" : {
"total" : 6154,
"updated" : 0,
"created" : 0,
"deleted" : 3500,
"batches" : 36,
"version_conflicts" : 0,
"noops" : 0,
"retries": 0,
"throttled_millis": 0
},
"description" : ""
}
}
}
}
}