Authentication

Api key auth (http_api_key)

Elasticsearch APIs use key-based authentication. You must create an API key and use the encoded value in the request header. For example:

curl -X GET "${ES_URL}/_cat/indices?v=true" \
  -H "Authorization: ApiKey ${API_KEY}"

For more information about where to find API keys for the Elasticsearch endpoint (${ES_URL}) for a project, go to Get started with Elasticsearch Serverless.

Behavioral analytics

Get behavioral analytics collections Deprecated Technical preview

GET /_application/analytics/{name}

All methods and paths for this operation:

GET /_application/analytics

GET /_application/analytics/{name}

Path parameters

  • name array[string] Required

    A list of analytics collections to limit the returned information

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attribute Show * attribute object
      • event_data_stream object Required
        Hide event_data_stream attribute Show event_data_stream attribute object
        • name string Required
GET /_application/analytics/{name}
GET _application/analytics/my*
resp = client.search_application.get_behavioral_analytics(
    name="my*",
)
const response = await client.searchApplication.getBehavioralAnalytics({
  name: "my*",
});
response = client.search_application.get_behavioral_analytics(
  name: "my*"
)
$resp = $client->searchApplication()->getBehavioralAnalytics([
    "name" => "my*",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_application/analytics/my*"
client.searchApplication().getBehavioralAnalytics(g -> g
    .name("my*")
);
Response examples (200)
A successful response from `GET _application/analytics/my*`
{
  "my_analytics_collection": {
      "event_data_stream": {
          "name": "behavioral_analytics-events-my_analytics_collection"
      }
  },
  "my_analytics_collection2": {
      "event_data_stream": {
          "name": "behavioral_analytics-events-my_analytics_collection2"
      }
  }
}

Create a behavioral analytics collection Deprecated Technical preview

PUT /_application/analytics/{name}

Path parameters

  • name string Required

    The name of the analytics collection to be created or updated.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

    • name string Required
PUT /_application/analytics/{name}
PUT _application/analytics/my_analytics_collection
resp = client.search_application.put_behavioral_analytics(
    name="my_analytics_collection",
)
const response = await client.searchApplication.putBehavioralAnalytics({
  name: "my_analytics_collection",
});
response = client.search_application.put_behavioral_analytics(
  name: "my_analytics_collection"
)
$resp = $client->searchApplication()->putBehavioralAnalytics([
    "name" => "my_analytics_collection",
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_application/analytics/my_analytics_collection"
client.searchApplication().putBehavioralAnalytics(p -> p
    .name("my_analytics_collection")
);





Get aliases Generally available

GET /_cat/aliases/{name}

All methods and paths for this operation:

GET /_cat/aliases

GET /_cat/aliases/{name}

Get the cluster's index aliases, including filter and routing information. This API does not return data stream aliases.

IMPORTANT: CAT APIs are only intended for human consumption using the command line or the Kibana console. They are not intended for use by applications. For application consumption, use the aliases API.

Required authorization

  • Index privileges: view_index_metadata

Path parameters

  • name string | array[string]

    A comma-separated list of aliases to retrieve. Supports wildcards (*). To retrieve all aliases, omit this parameter or use * or _all.

Query parameters

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.

    Values are all, open, closed, hidden, or none.

  • master_timeout string

    The period to wait for a connection to the master node. If the master node is not available before the timeout expires, the request fails and returns an error. To indicated that the request should never timeout, you can set it to -1.

    Values are -1 or 0.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • alias string

      alias name

    • index string
    • filter string

      filter

    • routing.index string

      index routing

    • is_write_index string

      write index

GET _cat/aliases?format=json&v=true
resp = client.cat.aliases(
    format="json",
    v=True,
)
const response = await client.cat.aliases({
  format: "json",
  v: "true",
});
response = client.cat.aliases(
  format: "json",
  v: "true"
)
$resp = $client->cat()->aliases([
    "format" => "json",
    "v" => "true",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_cat/aliases?format=json&v=true"
client.cat().aliases();
Response examples (200)
A successful response from `GET _cat/aliases?format=json&v=true`. This response shows that `alias2` has configured a filter and `alias3` and `alias4` have routing configurations.
[
  {
    "alias": "alias1",
    "index": "test1",
    "filter": "-",
    "routing.index": "-",
    "routing.search": "-",
    "is_write_index": "true"
  },
  {
    "alias": "alias1",
    "index": "test1",
    "filter": "*",
    "routing.index": "-",
    "routing.search": "-",
    "is_write_index": "true"
  },
  {
    "alias": "alias3",
    "index": "test1",
    "filter": "-",
    "routing.index": "1",
    "routing.search": "1",
    "is_write_index": "true"
  },
  {
    "alias": "alias4",
    "index": "test1",
    "filter": "-",
    "routing.index": "2",
    "routing.search": "1,2",
    "is_write_index": "true"
  }
]




Get a document count Generally available

GET /_cat/count/{index}

All methods and paths for this operation:

GET /_cat/count

GET /_cat/count/{index}

Get quick access to a document count for a data stream, an index, or an entire cluster. The document count only includes live documents, not deleted documents which have not yet been removed by the merge process.

IMPORTANT: CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the count API.

Required authorization

  • Index privileges: read

Path parameters

  • index string | array[string] Required

    A comma-separated list of data streams, indices, and aliases used to limit the request. It supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • epoch number | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      One of:

      Time unit for seconds

    • timestamp string

      Time of day, expressed as HH:MM:SS

    • count string

      the document count

GET /_cat/count/my-index-000001?v=true&format=json
resp = client.cat.count(
    index="my-index-000001",
    v=True,
    format="json",
)
const response = await client.cat.count({
  index: "my-index-000001",
  v: "true",
  format: "json",
});
response = client.cat.count(
  index: "my-index-000001",
  v: "true",
  format: "json"
)
$resp = $client->cat()->count([
    "index" => "my-index-000001",
    "v" => "true",
    "format" => "json",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_cat/count/my-index-000001?v=true&format=json"
client.cat().count();
Response examples (200)
A successful response from `GET /_cat/count/my-index-000001?v=true&format=json`. It retrieves the document count for the `my-index-000001` data stream or index.
[
  {
    "epoch": "1475868259",
    "timestamp": "15:24:20",
    "count": "120"
  }
]
A successful response from `GET /_cat/count?v=true&format=json`. It retrieves the document count for all data streams and indices in the cluster.
[
  {
    "epoch": "1475868259",
    "timestamp": "15:24:20",
    "count": "121"
  }
]

Get CAT help Generally available

GET /_cat

Get help for the CAT APIs.

Responses

  • 200 application/json
GET /_cat
curl \
 --request GET 'http://api.example.com/_cat' \
 --header "Authorization: $API_KEY"

Get index information Generally available

GET /_cat/indices/{index}

All methods and paths for this operation:

GET /_cat/indices

GET /_cat/indices/{index}

Get high-level information about indices in a cluster, including backing indices for data streams.

Use this request to get the following information for each index in a cluster:

  • shard count
  • document count
  • deleted document count
  • primary store size
  • total store size of all shards, including shard replicas

These metrics are retrieved directly from Lucene, which Elasticsearch uses internally to power indexing and search. As a result, all document counts include hidden nested documents. To get an accurate count of Elasticsearch documents, use the cat count or count APIs.

CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use an index endpoint.

Required authorization

  • Index privileges: monitor
  • Cluster privileges: monitor

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • bytes string

    The unit used to display byte values.

    Values are b, kb, mb, gb, tb, or pb.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.

    Values are all, open, closed, hidden, or none.

  • health string

    The health status used to limit returned indices. By default, the response includes indices of any health status.

    Supported values include:

    • green (or GREEN): All shards are assigned.
    • yellow (or YELLOW): All primary shards are assigned, but one or more replica shards are unassigned. If a node in the cluster fails, some data could be unavailable until that node is repaired.
    • red (or RED): One or more primary shards are unassigned, so some data is unavailable. This can occur briefly during cluster startup as primary shards are assigned.
    • unknown
    • unavailable

    Values are green, GREEN, yellow, YELLOW, red, RED, unknown, or unavailable.

  • include_unloaded_segments boolean

    If true, the response includes information from segments that are not loaded into memory.

  • pri boolean

    If true, the response only includes information from primary shards.

  • time string

    The unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

  • master_timeout string

    Period to wait for a connection to the master node.

    Values are -1 or 0.

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • health string

      current health status

    • status string

      open/close status

    • index string

      index name

    • uuid string

      index uuid

    • pri string

      number of primary shards

    • rep string

      number of replica shards

    • docs.count string | null

      available docs

    • docs.deleted string | null

      deleted docs

    • creation.date string

      index creation date (millisecond value)

    • creation.date.string string

      index creation date (as string)

    • store.size string | null

      store size of primaries & replicas

    • pri.store.size string | null

      store size of primaries

    • dataset.size string | null

      total size of dataset (including the cache for partially mounted indices)

    • completion.size string

      size of completion

    • pri.completion.size string

      size of completion

    • fielddata.memory_size string

      used fielddata cache

    • pri.fielddata.memory_size string

      used fielddata cache

    • fielddata.evictions string

      fielddata evictions

    • pri.fielddata.evictions string

      fielddata evictions

    • query_cache.memory_size string

      used query cache

    • pri.query_cache.memory_size string

      used query cache

    • query_cache.evictions string

      query cache evictions

    • pri.query_cache.evictions string

      query cache evictions

    • request_cache.memory_size string

      used request cache

    • pri.request_cache.memory_size string

      used request cache

    • request_cache.evictions string

      request cache evictions

    • pri.request_cache.evictions string

      request cache evictions

    • request_cache.hit_count string

      request cache hit count

    • pri.request_cache.hit_count string

      request cache hit count

    • request_cache.miss_count string

      request cache miss count

    • pri.request_cache.miss_count string

      request cache miss count

    • flush.total string

      number of flushes

    • pri.flush.total string

      number of flushes

    • flush.total_time string

      time spent in flush

    • pri.flush.total_time string

      time spent in flush

    • get.current string

      number of current get ops

    • pri.get.current string

      number of current get ops

    • get.time string

      time spent in get

    • pri.get.time string

      time spent in get

    • get.total string

      number of get ops

    • pri.get.total string

      number of get ops

    • get.exists_time string

      time spent in successful gets

    • pri.get.exists_time string

      time spent in successful gets

    • get.exists_total string

      number of successful gets

    • pri.get.exists_total string

      number of successful gets

    • get.missing_time string

      time spent in failed gets

    • pri.get.missing_time string

      time spent in failed gets

    • get.missing_total string

      number of failed gets

    • pri.get.missing_total string

      number of failed gets

    • indexing.delete_current string

      number of current deletions

    • pri.indexing.delete_current string

      number of current deletions

    • indexing.delete_time string

      time spent in deletions

    • pri.indexing.delete_time string

      time spent in deletions

    • indexing.delete_total string

      number of delete ops

    • pri.indexing.delete_total string

      number of delete ops

    • indexing.index_current string

      number of current indexing ops

    • pri.indexing.index_current string

      number of current indexing ops

    • indexing.index_time string

      time spent in indexing

    • pri.indexing.index_time string

      time spent in indexing

    • indexing.index_total string

      number of indexing ops

    • pri.indexing.index_total string

      number of indexing ops

    • indexing.index_failed string

      number of failed indexing ops

    • pri.indexing.index_failed string

      number of failed indexing ops

    • merges.current string

      number of current merges

    • pri.merges.current string

      number of current merges

    • merges.current_docs string

      number of current merging docs

    • pri.merges.current_docs string

      number of current merging docs

    • merges.current_size string

      size of current merges

    • pri.merges.current_size string

      size of current merges

    • merges.total string

      number of completed merge ops

    • pri.merges.total string

      number of completed merge ops

    • merges.total_docs string

      docs merged

    • pri.merges.total_docs string

      docs merged

    • merges.total_size string

      size merged

    • pri.merges.total_size string

      size merged

    • merges.total_time string

      time spent in merges

    • pri.merges.total_time string

      time spent in merges

    • refresh.total string

      total refreshes

    • pri.refresh.total string

      total refreshes

    • refresh.time string

      time spent in refreshes

    • pri.refresh.time string

      time spent in refreshes

    • refresh.external_total string

      total external refreshes

    • pri.refresh.external_total string

      total external refreshes

    • refresh.external_time string

      time spent in external refreshes

    • pri.refresh.external_time string

      time spent in external refreshes

    • refresh.listeners string

      number of pending refresh listeners

    • pri.refresh.listeners string

      number of pending refresh listeners

    • search.fetch_current string

      current fetch phase ops

    • pri.search.fetch_current string

      current fetch phase ops

    • search.fetch_time string

      time spent in fetch phase

    • pri.search.fetch_time string

      time spent in fetch phase

    • search.fetch_total string

      total fetch ops

    • pri.search.fetch_total string

      total fetch ops

    • search.open_contexts string

      open search contexts

    • pri.search.open_contexts string

      open search contexts

    • search.query_current string

      current query phase ops

    • pri.search.query_current string

      current query phase ops

    • search.query_time string

      time spent in query phase

    • pri.search.query_time string

      time spent in query phase

    • search.query_total string

      total query phase ops

    • pri.search.query_total string

      total query phase ops

    • search.scroll_current string

      open scroll contexts

    • pri.search.scroll_current string

      open scroll contexts

    • search.scroll_time string

      time scroll contexts held open

    • pri.search.scroll_time string

      time scroll contexts held open

    • search.scroll_total string

      completed scroll contexts

    • pri.search.scroll_total string

      completed scroll contexts

    • segments.count string

      number of segments

    • pri.segments.count string

      number of segments

    • segments.memory string

      memory used by segments

    • pri.segments.memory string

      memory used by segments

    • segments.index_writer_memory string

      memory used by index writer

    • pri.segments.index_writer_memory string

      memory used by index writer

    • segments.version_map_memory string

      memory used by version map

    • pri.segments.version_map_memory string

      memory used by version map

    • segments.fixed_bitset_memory string

      memory used by fixed bit sets for nested object field types and export type filters for types referred in _parent fields

    • pri.segments.fixed_bitset_memory string

      memory used by fixed bit sets for nested object field types and export type filters for types referred in _parent fields

    • warmer.current string

      current warmer ops

    • pri.warmer.current string

      current warmer ops

    • warmer.total string

      total warmer ops

    • pri.warmer.total string

      total warmer ops

    • warmer.total_time string

      time spent in warmers

    • pri.warmer.total_time string

      time spent in warmers

    • suggest.current string

      number of current suggest ops

    • pri.suggest.current string

      number of current suggest ops

    • suggest.time string

      time spend in suggest

    • pri.suggest.time string

      time spend in suggest

    • suggest.total string

      number of suggest ops

    • pri.suggest.total string

      number of suggest ops

    • memory.total string

      total used memory

    • pri.memory.total string

      total user memory

    • search.throttled string

      indicates if the index is search throttled

    • bulk.total_operations string

      number of bulk shard ops

    • pri.bulk.total_operations string

      number of bulk shard ops

    • bulk.total_time string

      time spend in shard bulk

    • pri.bulk.total_time string

      time spend in shard bulk

    • bulk.total_size_in_bytes string

      total size in bytes of shard bulk

    • pri.bulk.total_size_in_bytes string

      total size in bytes of shard bulk

    • bulk.avg_time string

      average time spend in shard bulk

    • pri.bulk.avg_time string

      average time spend in shard bulk

    • bulk.avg_size_in_bytes string

      average size in bytes of shard bulk

    • pri.bulk.avg_size_in_bytes string

      average size in bytes of shard bulk

GET /_cat/indices/my-index-*?v=true&s=index&format=json
resp = client.cat.indices(
    index="my-index-*",
    v=True,
    s="index",
    format="json",
)
const response = await client.cat.indices({
  index: "my-index-*",
  v: "true",
  s: "index",
  format: "json",
});
response = client.cat.indices(
  index: "my-index-*",
  v: "true",
  s: "index",
  format: "json"
)
$resp = $client->cat()->indices([
    "index" => "my-index-*",
    "v" => "true",
    "s" => "index",
    "format" => "json",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_cat/indices/my-index-*?v=true&s=index&format=json"
client.cat().indices();
Response examples (200)
A successful response from `GET /_cat/indices/my-index-*?v=true&s=index&format=json`.
[
  {
    "health": "yellow",
    "status": "open",
    "index": "my-index-000001",
    "uuid": "u8FNjxh8Rfy_awN11oDKYQ",
    "pri": "1",
    "rep": "1",
    "docs.count": "1200",
    "docs.deleted": "0",
    "store.size": "88.1kb",
    "pri.store.size": "88.1kb",
    "dataset.size": "88.1kb"
  },
  {
    "health": "green",
    "status": "open",
    "index": "my-index-000002",
    "uuid": "nYFWZEO7TUiOjLQXBaYJpA ",
    "pri": "1",
    "rep": "0",
    "docs.count": "0",
    "docs.deleted": "0",
    "store.size": "260b",
    "pri.store.size": "260b",
    "dataset.size": "260b"
  }
]

Get data frame analytics jobs Generally available

GET /_cat/ml/data_frame/analytics/{id}

All methods and paths for this operation:

GET /_cat/ml/data_frame/analytics

GET /_cat/ml/data_frame/analytics/{id}

Get configuration and usage information about data frame analytics jobs.

IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get data frame analytics jobs statistics API.

Required authorization

  • Cluster privileges: monitor_ml

Path parameters

  • id string Required

    The ID of the data frame analytics to fetch

Query parameters

  • allow_no_match boolean

    Whether to ignore if a wildcard expression matches no configs. (This includes _all string or when no configs have been specified)

  • bytes string

    The unit in which to display byte values

    Values are b, kb, mb, gb, tb, or pb.

  • h string | array[string]

    Comma-separated list of column names to display.

    Supported values include:

    • assignment_explanation (or ae): Contains messages relating to the selection of a node.
    • create_time (or ct, createTime): The time when the data frame analytics job was created.
    • description (or d): A description of a job.
    • dest_index (or di, destIndex): Name of the destination index.
    • failure_reason (or fr, failureReason): Contains messages about the reason why a data frame analytics job failed.
    • id: Identifier for the data frame analytics job.
    • model_memory_limit (or mml, modelMemoryLimit): The approximate maximum amount of memory resources that are permitted for the data frame analytics job.
    • node.address (or na, nodeAddress): The network address of the node that the data frame analytics job is assigned to.
    • node.ephemeral_id (or ne, nodeEphemeralId): The ephemeral ID of the node that the data frame analytics job is assigned to.
    • node.id (or ni, nodeId): The unique identifier of the node that the data frame analytics job is assigned to.
    • node.name (or nn, nodeName): The name of the node that the data frame analytics job is assigned to.
    • progress (or p): The progress report of the data frame analytics job by phase.
    • source_index (or si, sourceIndex): Name of the source index.
    • state (or s): Current state of the data frame analytics job.
    • type (or t): The type of analysis that the data frame analytics job performs.
    • version (or v): The Elasticsearch version number in which the data frame analytics job was created.
  • s string | array[string]

    Comma-separated list of column names or column aliases used to sort the response.

    Supported values include:

    • assignment_explanation (or ae): Contains messages relating to the selection of a node.
    • create_time (or ct, createTime): The time when the data frame analytics job was created.
    • description (or d): A description of a job.
    • dest_index (or di, destIndex): Name of the destination index.
    • failure_reason (or fr, failureReason): Contains messages about the reason why a data frame analytics job failed.
    • id: Identifier for the data frame analytics job.
    • model_memory_limit (or mml, modelMemoryLimit): The approximate maximum amount of memory resources that are permitted for the data frame analytics job.
    • node.address (or na, nodeAddress): The network address of the node that the data frame analytics job is assigned to.
    • node.ephemeral_id (or ne, nodeEphemeralId): The ephemeral ID of the node that the data frame analytics job is assigned to.
    • node.id (or ni, nodeId): The unique identifier of the node that the data frame analytics job is assigned to.
    • node.name (or nn, nodeName): The name of the node that the data frame analytics job is assigned to.
    • progress (or p): The progress report of the data frame analytics job by phase.
    • source_index (or si, sourceIndex): Name of the source index.
    • state (or s): Current state of the data frame analytics job.
    • type (or t): The type of analysis that the data frame analytics job performs.
    • version (or v): The Elasticsearch version number in which the data frame analytics job was created.
  • time string

    Unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string
    • type string

      The type of analysis that the job performs.

    • create_time string

      The time when the job was created.

    • version string
    • source_index string
    • dest_index string
    • description string

      A description of the job.

    • model_memory_limit string

      The approximate maximum amount of memory resources that are permitted for the job.

    • state string

      The current status of the job.

    • failure_reason string

      Messages about the reason why the job failed.

    • progress string

      The progress report for the job by phase.

    • assignment_explanation string

      Messages related to the selection of a node.

    • node.id string
    • node.name string
    • node.ephemeral_id string
    • node.address string

      The network address of the assigned node.

GET /_cat/ml/data_frame/analytics/{id}
GET _cat/ml/data_frame/analytics?v=true&format=json
resp = client.cat.ml_data_frame_analytics(
    v=True,
    format="json",
)
const response = await client.cat.mlDataFrameAnalytics({
  v: "true",
  format: "json",
});
response = client.cat.ml_data_frame_analytics(
  v: "true",
  format: "json"
)
$resp = $client->cat()->mlDataFrameAnalytics([
    "v" => "true",
    "format" => "json",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_cat/ml/data_frame/analytics?v=true&format=json"
client.cat().mlDataFrameAnalytics();
Response examples (200)
A successful response from `GET _cat/ml/data_frame/analytics?v=true&format=json`.
[
  {
    "id": "classifier_job_1",
    "type": "classification",
    "create_time": "2020-02-12T11:49:09.594Z",
    "state": "stopped"
  },
    {
    "id": "classifier_job_2",
    "type": "classification",
    "create_time": "2020-02-12T11:49:14.479Z",
    "state": "stopped"
  },
  {
    "id": "classifier_job_3",
    "type": "classification",
    "create_time": "2020-02-12T11:49:16.928Z",
    "state": "stopped"
  },
  {
    "id": "classifier_job_4",
    "type": "classification",
    "create_time": "2020-02-12T11:49:19.127Z",
    "state": "stopped"
  },
  {
    "id": "classifier_job_5",
    "type": "classification",
    "create_time": "2020-02-12T11:49:21.349Z",
    "state": "stopped"
  }
]

Get datafeeds Generally available

GET /_cat/ml/datafeeds/{datafeed_id}

All methods and paths for this operation:

GET /_cat/ml/datafeeds

GET /_cat/ml/datafeeds/{datafeed_id}

Get configuration and usage information about datafeeds. This API returns a maximum of 10,000 datafeeds. If the Elasticsearch security features are enabled, you must have monitor_ml, monitor, manage_ml, or manage cluster privileges to use this API.

IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get datafeed statistics API.

Required authorization

  • Cluster privileges: monitor_ml

Path parameters

  • datafeed_id string Required

    A numerical character string that uniquely identifies the datafeed.

Query parameters

  • allow_no_match boolean

    Specifies what to do when the request:

    • Contains wildcard expressions and there are no datafeeds that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, the API returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

  • h string | array[string]

    Comma-separated list of column names to display.

    Supported values include:

    • ae (or assignment_explanation): For started datafeeds only, contains messages relating to the selection of a node.
    • bc (or buckets.count, bucketsCount): The number of buckets processed.
    • id: A numerical character string that uniquely identifies the datafeed.
    • na (or node.address, nodeAddress): For started datafeeds only, the network address of the node where the datafeed is started.
    • ne (or node.ephemeral_id, nodeEphemeralId): For started datafeeds only, the ephemeral ID of the node where the datafeed is started.
    • ni (or node.id, nodeId): For started datafeeds only, the unique identifier of the node where the datafeed is started.
    • nn (or node.name, nodeName): For started datafeeds only, the name of the node where the datafeed is started.
    • sba (or search.bucket_avg, searchBucketAvg): The average search time per bucket, in milliseconds.
    • sc (or search.count, searchCount): The number of searches run by the datafeed.
    • seah (or search.exp_avg_hour, searchExpAvgHour): The exponential average search time per hour, in milliseconds.
    • st (or search.time, searchTime): The total time the datafeed spent searching, in milliseconds.
    • s (or state): The status of the datafeed: starting, started, stopping, or stopped. If starting, the datafeed has been requested to start but has not yet started. If started, the datafeed is actively receiving data. If stopping, the datafeed has been requested to stop gracefully and is completing its final action. If stopped, the datafeed is stopped and will not receive data until it is re-started.
  • s string | array[string]

    Comma-separated list of column names or column aliases used to sort the response.

    Supported values include:

    • ae (or assignment_explanation): For started datafeeds only, contains messages relating to the selection of a node.
    • bc (or buckets.count, bucketsCount): The number of buckets processed.
    • id: A numerical character string that uniquely identifies the datafeed.
    • na (or node.address, nodeAddress): For started datafeeds only, the network address of the node where the datafeed is started.
    • ne (or node.ephemeral_id, nodeEphemeralId): For started datafeeds only, the ephemeral ID of the node where the datafeed is started.
    • ni (or node.id, nodeId): For started datafeeds only, the unique identifier of the node where the datafeed is started.
    • nn (or node.name, nodeName): For started datafeeds only, the name of the node where the datafeed is started.
    • sba (or search.bucket_avg, searchBucketAvg): The average search time per bucket, in milliseconds.
    • sc (or search.count, searchCount): The number of searches run by the datafeed.
    • seah (or search.exp_avg_hour, searchExpAvgHour): The exponential average search time per hour, in milliseconds.
    • st (or search.time, searchTime): The total time the datafeed spent searching, in milliseconds.
    • s (or state): The status of the datafeed: starting, started, stopping, or stopped. If starting, the datafeed has been requested to start but has not yet started. If started, the datafeed is actively receiving data. If stopping, the datafeed has been requested to stop gracefully and is completing its final action. If stopped, the datafeed is stopped and will not receive data until it is re-started.
  • time string

    The unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string

      The datafeed identifier.

    • state string

      Values are started, stopped, starting, or stopping.

    • assignment_explanation string

      For started datafeeds only, contains messages relating to the selection of a node.

    • buckets.count string

      The number of buckets processed.

    • search.count string

      The number of searches run by the datafeed.

    • search.time string

      The total time the datafeed spent searching, in milliseconds.

    • search.bucket_avg string

      The average search time per bucket, in milliseconds.

    • search.exp_avg_hour string

      The exponential average search time per hour, in milliseconds.

    • node.id string

      The unique identifier of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

    • node.name string

      The name of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

    • node.ephemeral_id string

      The ephemeral identifier of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

    • node.address string

      The network address of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

GET /_cat/ml/datafeeds/{datafeed_id}
GET _cat/ml/datafeeds?v=true&format=json
resp = client.cat.ml_datafeeds(
    v=True,
    format="json",
)
const response = await client.cat.mlDatafeeds({
  v: "true",
  format: "json",
});
response = client.cat.ml_datafeeds(
  v: "true",
  format: "json"
)
$resp = $client->cat()->mlDatafeeds([
    "v" => "true",
    "format" => "json",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_cat/ml/datafeeds?v=true&format=json"
client.cat().mlDatafeeds();
Response examples (200)
A successful response from `GET _cat/ml/datafeeds?v=true&format=json`.
[
  {
    "id": "datafeed-high_sum_total_sales",
    "state": "stopped",
    "buckets.count": "743",
    "search.count": "7"
  },
  {
    "id": "datafeed-low_request_rate",
    "state": "stopped",
    "buckets.count": "1457",
    "search.count": "3"
  },
  {
    "id": "datafeed-response_code_rates",
    "state": "stopped",
    "buckets.count": "1460",
    "search.count": "18"
  },
  {
    "id": "datafeed-url_scanning",
    "state": "stopped",
    "buckets.count": "1460",
    "search.count": "18"
  }
]

Get anomaly detection jobs Generally available

GET /_cat/ml/anomaly_detectors/{job_id}

All methods and paths for this operation:

GET /_cat/ml/anomaly_detectors

GET /_cat/ml/anomaly_detectors/{job_id}

Get configuration and usage information for anomaly detection jobs. This API returns a maximum of 10,000 jobs. If the Elasticsearch security features are enabled, you must have monitor_ml, monitor, manage_ml, or manage cluster privileges to use this API.

IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get anomaly detection job statistics API.

Required authorization

  • Cluster privileges: monitor_ml

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

Query parameters

  • allow_no_match boolean

    Specifies what to do when the request:

    • Contains wildcard expressions and there are no jobs that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, the API returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

  • bytes string

    The unit used to display byte values.

    Values are b, kb, mb, gb, tb, or pb.

  • h string | array[string]

    Comma-separated list of column names to display.

    Supported values include:

    • assignment_explanation (or ae): For open anomaly detection jobs only, contains messages relating to the selection of a node to run the job.
    • buckets.count (or bc, bucketsCount): The number of bucket results produced by the job.
    • buckets.time.exp_avg (or btea, bucketsTimeExpAvg): Exponential moving average of all bucket processing times, in milliseconds.
    • buckets.time.exp_avg_hour (or bteah, bucketsTimeExpAvgHour): Exponentially-weighted moving average of bucket processing times calculated in a 1 hour time window, in milliseconds.
    • buckets.time.max (or btmax, bucketsTimeMax): Maximum among all bucket processing times, in milliseconds.
    • buckets.time.min (or btmin, bucketsTimeMin): Minimum among all bucket processing times, in milliseconds.
    • buckets.time.total (or btt, bucketsTimeTotal): Sum of all bucket processing times, in milliseconds.
    • data.buckets (or db, dataBuckets): The number of buckets processed.
    • data.earliest_record (or der, dataEarliestRecord): The timestamp of the earliest chronologically input document.
    • data.empty_buckets (or deb, dataEmptyBuckets): The number of buckets which did not contain any data.
    • data.input_bytes (or dib, dataInputBytes): The number of bytes of input data posted to the anomaly detection job.
    • data.input_fields (or dif, dataInputFields): The total number of fields in input documents posted to the anomaly detection job. This count includes fields that are not used in the analysis. However, be aware that if you are using a datafeed, it extracts only the required fields from the documents it retrieves before posting them to the job.
    • data.input_records (or dir, dataInputRecords): The number of input documents posted to the anomaly detection job.
    • data.invalid_dates (or did, dataInvalidDates): The number of input documents with either a missing date field or a date that could not be parsed.
    • data.last (or dl, dataLast): The timestamp at which data was last analyzed, according to server time.
    • data.last_empty_bucket (or dleb, dataLastEmptyBucket): The timestamp of the last bucket that did not contain any data.
    • data.last_sparse_bucket (or dlsb, dataLastSparseBucket): The timestamp of the last bucket that was considered sparse.
    • data.latest_record (or dlr, dataLatestRecord): The timestamp of the latest chronologically input document.
    • data.missing_fields (or dmf, dataMissingFields): The number of input documents that are missing a field that the anomaly detection job is configured to analyze. Input documents with missing fields are still processed because it is possible that not all fields are missing.
    • data.out_of_order_timestamps (or doot, dataOutOfOrderTimestamps): The number of input documents that have a timestamp chronologically preceding the start of the current anomaly detection bucket offset by the latency window. This information is applicable only when you provide data to the anomaly detection job by using the post data API. These out of order documents are discarded, since jobs require time series data to be in ascending chronological order.
    • data.processed_fields (or dpf, dataProcessedFields): The total number of fields in all the documents that have been processed by the anomaly detection job. Only fields that are specified in the detector configuration object contribute to this count. The timestamp is not included in this count.
    • data.processed_records (or dpr, dataProcessedRecords): The number of input documents that have been processed by the anomaly detection job. This value includes documents with missing fields, since they are nonetheless analyzed. If you use datafeeds and have aggregations in your search query, the processed record count is the number of aggregation results processed, not the number of Elasticsearch documents.
    • data.sparse_buckets (or dsb, dataSparseBuckets): The number of buckets that contained few data points compared to the expected number of data points.
    • forecasts.memory.avg (or fmavg, forecastsMemoryAvg): The average memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.max (or fmmax, forecastsMemoryMax): The maximum memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.min (or fmmin, forecastsMemoryMin): The minimum memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.total (or fmt, forecastsMemoryTotal): The total memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.records.avg (or fravg, forecastsRecordsAvg): The average number of model_forecast` documents written for forecasts related to the anomaly detection job.
    • forecasts.records.max (or frmax, forecastsRecordsMax): The maximum number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.records.min (or frmin, forecastsRecordsMin): The minimum number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.records.total (or frt, forecastsRecordsTotal): The total number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.time.avg (or ftavg, forecastsTimeAvg): The average runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.max (or ftmax, forecastsTimeMax): The maximum runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.min (or ftmin, forecastsTimeMin): The minimum runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.total (or ftt, forecastsTimeTotal): The total runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.total (or ft, forecastsTotal): The number of individual forecasts currently available for the job.
    • id: Identifier for the anomaly detection job.
    • model.bucket_allocation_failures (or mbaf, modelBucketAllocationFailures): The number of buckets for which new entities in incoming data were not processed due to insufficient model memory.
    • model.by_fields (or mbf, modelByFields): The number of by field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.bytes (or mb, modelBytes): The number of bytes of memory used by the models. This is the maximum value since the last time the model was persisted. If the job is closed, this value indicates the latest size.
    • model.bytes_exceeded (or mbe, modelBytesExceeded): The number of bytes over the high limit for memory usage at the last allocation failure.
    • model.categorization_status (or mcs, modelCategorizationStatus): The status of categorization for the job: ok or warn. If ok, categorization is performing acceptably well (or not being used at all). If warn, categorization is detecting a distribution of categories that suggests the input data is inappropriate for categorization. Problems could be that there is only one category, more than 90% of categories are rare, the number of categories is greater than 50% of the number of categorized documents, there are no frequently matched categories, or more than 50% of categories are dead.
    • model.categorized_doc_count (or mcdc, modelCategorizedDocCount): The number of documents that have had a field categorized.
    • model.dead_category_count (or mdcc, modelDeadCategoryCount): The number of categories created by categorization that will never be assigned again because another category’s definition makes it a superset of the dead category. Dead categories are a side effect of the way categorization has no prior training.
    • model.failed_category_count (or mdcc, modelFailedCategoryCount): The number of times that categorization wanted to create a new category but couldn’t because the job had hit its model memory limit. This count does not track which specific categories failed to be created. Therefore, you cannot use this value to determine the number of unique categories that were missed.
    • model.frequent_category_count (or mfcc, modelFrequentCategoryCount): The number of categories that match more than 1% of categorized documents.
    • model.log_time (or mlt, modelLogTime): The timestamp when the model stats were gathered, according to server time.
    • model.memory_limit (or mml, modelMemoryLimit): The timestamp when the model stats were gathered, according to server time.
    • model.memory_status (or mms, modelMemoryStatus): The status of the mathematical models: ok, soft_limit, or hard_limit. If ok, the models stayed below the configured value. If soft_limit, the models used more than 60% of the configured memory limit and older unused models will be pruned to free up space. Additionally, in categorization jobs no further category examples will be stored. If hard_limit, the models used more space than the configured memory limit. As a result, not all incoming data was processed.
    • model.over_fields (or mof, modelOverFields): The number of over field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.partition_fields (or mpf, modelPartitionFields): The number of partition field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.rare_category_count (or mrcc, modelRareCategoryCount): The number of categories that match just one categorized document.
    • model.timestamp (or mt, modelTimestamp): The timestamp of the last record when the model stats were gathered.
    • model.total_category_count (or mtcc, modelTotalCategoryCount): The number of categories created by categorization.
    • node.address (or na, nodeAddress): The network address of the node that runs the job. This information is available only for open jobs.
    • node.ephemeral_id (or ne, nodeEphemeralId): The ephemeral ID of the node that runs the job. This information is available only for open jobs.
    • node.id (or ni, nodeId): The unique identifier of the node that runs the job. This information is available only for open jobs.
    • node.name (or nn, nodeName): The name of the node that runs the job. This information is available only for open jobs.
    • opened_time (or ot): For open jobs only, the elapsed time for which the job has been open.
    • state (or s): The status of the anomaly detection job: closed, closing, failed, opened, or opening. If closed, the job finished successfully with its model state persisted. The job must be opened before it can accept further data. If closing, the job close action is in progress and has not yet completed. A closing job cannot accept further data. If failed, the job did not finish successfully due to an error. This situation can occur due to invalid input data, a fatal error occurring during the analysis, or an external interaction such as the process being killed by the Linux out of memory (OOM) killer. If the job had irrevocably failed, it must be force closed and then deleted. If the datafeed can be corrected, the job can be closed and then re-opened. If opened, the job is available to receive and process data. If opening, the job open action is in progress and has not yet completed.
  • s string | array[string]

    Comma-separated list of column names or column aliases used to sort the response.

    Supported values include:

    • assignment_explanation (or ae): For open anomaly detection jobs only, contains messages relating to the selection of a node to run the job.
    • buckets.count (or bc, bucketsCount): The number of bucket results produced by the job.
    • buckets.time.exp_avg (or btea, bucketsTimeExpAvg): Exponential moving average of all bucket processing times, in milliseconds.
    • buckets.time.exp_avg_hour (or bteah, bucketsTimeExpAvgHour): Exponentially-weighted moving average of bucket processing times calculated in a 1 hour time window, in milliseconds.
    • buckets.time.max (or btmax, bucketsTimeMax): Maximum among all bucket processing times, in milliseconds.
    • buckets.time.min (or btmin, bucketsTimeMin): Minimum among all bucket processing times, in milliseconds.
    • buckets.time.total (or btt, bucketsTimeTotal): Sum of all bucket processing times, in milliseconds.
    • data.buckets (or db, dataBuckets): The number of buckets processed.
    • data.earliest_record (or der, dataEarliestRecord): The timestamp of the earliest chronologically input document.
    • data.empty_buckets (or deb, dataEmptyBuckets): The number of buckets which did not contain any data.
    • data.input_bytes (or dib, dataInputBytes): The number of bytes of input data posted to the anomaly detection job.
    • data.input_fields (or dif, dataInputFields): The total number of fields in input documents posted to the anomaly detection job. This count includes fields that are not used in the analysis. However, be aware that if you are using a datafeed, it extracts only the required fields from the documents it retrieves before posting them to the job.
    • data.input_records (or dir, dataInputRecords): The number of input documents posted to the anomaly detection job.
    • data.invalid_dates (or did, dataInvalidDates): The number of input documents with either a missing date field or a date that could not be parsed.
    • data.last (or dl, dataLast): The timestamp at which data was last analyzed, according to server time.
    • data.last_empty_bucket (or dleb, dataLastEmptyBucket): The timestamp of the last bucket that did not contain any data.
    • data.last_sparse_bucket (or dlsb, dataLastSparseBucket): The timestamp of the last bucket that was considered sparse.
    • data.latest_record (or dlr, dataLatestRecord): The timestamp of the latest chronologically input document.
    • data.missing_fields (or dmf, dataMissingFields): The number of input documents that are missing a field that the anomaly detection job is configured to analyze. Input documents with missing fields are still processed because it is possible that not all fields are missing.
    • data.out_of_order_timestamps (or doot, dataOutOfOrderTimestamps): The number of input documents that have a timestamp chronologically preceding the start of the current anomaly detection bucket offset by the latency window. This information is applicable only when you provide data to the anomaly detection job by using the post data API. These out of order documents are discarded, since jobs require time series data to be in ascending chronological order.
    • data.processed_fields (or dpf, dataProcessedFields): The total number of fields in all the documents that have been processed by the anomaly detection job. Only fields that are specified in the detector configuration object contribute to this count. The timestamp is not included in this count.
    • data.processed_records (or dpr, dataProcessedRecords): The number of input documents that have been processed by the anomaly detection job. This value includes documents with missing fields, since they are nonetheless analyzed. If you use datafeeds and have aggregations in your search query, the processed record count is the number of aggregation results processed, not the number of Elasticsearch documents.
    • data.sparse_buckets (or dsb, dataSparseBuckets): The number of buckets that contained few data points compared to the expected number of data points.
    • forecasts.memory.avg (or fmavg, forecastsMemoryAvg): The average memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.max (or fmmax, forecastsMemoryMax): The maximum memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.min (or fmmin, forecastsMemoryMin): The minimum memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.total (or fmt, forecastsMemoryTotal): The total memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.records.avg (or fravg, forecastsRecordsAvg): The average number of model_forecast` documents written for forecasts related to the anomaly detection job.
    • forecasts.records.max (or frmax, forecastsRecordsMax): The maximum number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.records.min (or frmin, forecastsRecordsMin): The minimum number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.records.total (or frt, forecastsRecordsTotal): The total number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.time.avg (or ftavg, forecastsTimeAvg): The average runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.max (or ftmax, forecastsTimeMax): The maximum runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.min (or ftmin, forecastsTimeMin): The minimum runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.total (or ftt, forecastsTimeTotal): The total runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.total (or ft, forecastsTotal): The number of individual forecasts currently available for the job.
    • id: Identifier for the anomaly detection job.
    • model.bucket_allocation_failures (or mbaf, modelBucketAllocationFailures): The number of buckets for which new entities in incoming data were not processed due to insufficient model memory.
    • model.by_fields (or mbf, modelByFields): The number of by field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.bytes (or mb, modelBytes): The number of bytes of memory used by the models. This is the maximum value since the last time the model was persisted. If the job is closed, this value indicates the latest size.
    • model.bytes_exceeded (or mbe, modelBytesExceeded): The number of bytes over the high limit for memory usage at the last allocation failure.
    • model.categorization_status (or mcs, modelCategorizationStatus): The status of categorization for the job: ok or warn. If ok, categorization is performing acceptably well (or not being used at all). If warn, categorization is detecting a distribution of categories that suggests the input data is inappropriate for categorization. Problems could be that there is only one category, more than 90% of categories are rare, the number of categories is greater than 50% of the number of categorized documents, there are no frequently matched categories, or more than 50% of categories are dead.
    • model.categorized_doc_count (or mcdc, modelCategorizedDocCount): The number of documents that have had a field categorized.
    • model.dead_category_count (or mdcc, modelDeadCategoryCount): The number of categories created by categorization that will never be assigned again because another category’s definition makes it a superset of the dead category. Dead categories are a side effect of the way categorization has no prior training.
    • model.failed_category_count (or mdcc, modelFailedCategoryCount): The number of times that categorization wanted to create a new category but couldn’t because the job had hit its model memory limit. This count does not track which specific categories failed to be created. Therefore, you cannot use this value to determine the number of unique categories that were missed.
    • model.frequent_category_count (or mfcc, modelFrequentCategoryCount): The number of categories that match more than 1% of categorized documents.
    • model.log_time (or mlt, modelLogTime): The timestamp when the model stats were gathered, according to server time.
    • model.memory_limit (or mml, modelMemoryLimit): The timestamp when the model stats were gathered, according to server time.
    • model.memory_status (or mms, modelMemoryStatus): The status of the mathematical models: ok, soft_limit, or hard_limit. If ok, the models stayed below the configured value. If soft_limit, the models used more than 60% of the configured memory limit and older unused models will be pruned to free up space. Additionally, in categorization jobs no further category examples will be stored. If hard_limit, the models used more space than the configured memory limit. As a result, not all incoming data was processed.
    • model.over_fields (or mof, modelOverFields): The number of over field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.partition_fields (or mpf, modelPartitionFields): The number of partition field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.rare_category_count (or mrcc, modelRareCategoryCount): The number of categories that match just one categorized document.
    • model.timestamp (or mt, modelTimestamp): The timestamp of the last record when the model stats were gathered.
    • model.total_category_count (or mtcc, modelTotalCategoryCount): The number of categories created by categorization.
    • node.address (or na, nodeAddress): The network address of the node that runs the job. This information is available only for open jobs.
    • node.ephemeral_id (or ne, nodeEphemeralId): The ephemeral ID of the node that runs the job. This information is available only for open jobs.
    • node.id (or ni, nodeId): The unique identifier of the node that runs the job. This information is available only for open jobs.
    • node.name (or nn, nodeName): The name of the node that runs the job. This information is available only for open jobs.
    • opened_time (or ot): For open jobs only, the elapsed time for which the job has been open.
    • state (or s): The status of the anomaly detection job: closed, closing, failed, opened, or opening. If closed, the job finished successfully with its model state persisted. The job must be opened before it can accept further data. If closing, the job close action is in progress and has not yet completed. A closing job cannot accept further data. If failed, the job did not finish successfully due to an error. This situation can occur due to invalid input data, a fatal error occurring during the analysis, or an external interaction such as the process being killed by the Linux out of memory (OOM) killer. If the job had irrevocably failed, it must be force closed and then deleted. If the datafeed can be corrected, the job can be closed and then re-opened. If opened, the job is available to receive and process data. If opening, the job open action is in progress and has not yet completed.
  • time string

    The unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string
    • state string

      Values are closing, closed, opened, failed, or opening.

    • opened_time string

      For open jobs only, the amount of time the job has been opened.

    • assignment_explanation string

      For open anomaly detection jobs only, contains messages relating to the selection of a node to run the job.

    • data.processed_records string

      The number of input documents that have been processed by the anomaly detection job. This value includes documents with missing fields, since they are nonetheless analyzed. If you use datafeeds and have aggregations in your search query, the processed_record_count is the number of aggregation results processed, not the number of Elasticsearch documents.

    • data.processed_fields string

      The total number of fields in all the documents that have been processed by the anomaly detection job. Only fields that are specified in the detector configuration object contribute to this count. The timestamp is not included in this count.

    • data.input_bytes number | string

    • data.input_records string

      The number of input documents posted to the anomaly detection job.

    • data.input_fields string

      The total number of fields in input documents posted to the anomaly detection job. This count includes fields that are not used in the analysis. However, be aware that if you are using a datafeed, it extracts only the required fields from the documents it retrieves before posting them to the job.

    • data.invalid_dates string

      The number of input documents with either a missing date field or a date that could not be parsed.

    • data.missing_fields string

      The number of input documents that are missing a field that the anomaly detection job is configured to analyze. Input documents with missing fields are still processed because it is possible that not all fields are missing. If you are using datafeeds or posting data to the job in JSON format, a high missing_field_count is often not an indication of data issues. It is not necessarily a cause for concern.

    • data.out_of_order_timestamps string

      The number of input documents that have a timestamp chronologically preceding the start of the current anomaly detection bucket offset by the latency window. This information is applicable only when you provide data to the anomaly detection job by using the post data API. These out of order documents are discarded, since jobs require time series data to be in ascending chronological order.

    • data.empty_buckets string

      The number of buckets which did not contain any data. If your data contains many empty buckets, consider increasing your bucket_span or using functions that are tolerant to gaps in data such as mean, non_null_sum or non_zero_count.

    • data.sparse_buckets string

      The number of buckets that contained few data points compared to the expected number of data points. If your data contains many sparse buckets, consider using a longer bucket_span.

    • data.buckets string

      The total number of buckets processed.

    • data.earliest_record string

      The timestamp of the earliest chronologically input document.

    • data.latest_record string

      The timestamp of the latest chronologically input document.

    • data.last string

      The timestamp at which data was last analyzed, according to server time.

    • data.last_empty_bucket string

      The timestamp of the last bucket that did not contain any data.

    • data.last_sparse_bucket string

      The timestamp of the last bucket that was considered sparse.

    • model.bytes number | string

    • model.memory_status string

      Values are ok, soft_limit, or hard_limit.

    • model.bytes_exceeded number | string

    • model.memory_limit string

      The upper limit for model memory usage, checked on increasing values.

    • model.by_fields string

      The number of by field values that were analyzed by the models. This value is cumulative for all detectors in the job.

    • model.over_fields string

      The number of over field values that were analyzed by the models. This value is cumulative for all detectors in the job.

    • model.partition_fields string

      The number of partition field values that were analyzed by the models. This value is cumulative for all detectors in the job.

    • model.bucket_allocation_failures string

      The number of buckets for which new entities in incoming data were not processed due to insufficient model memory. This situation is also signified by a hard_limit: memory_status property value.

    • model.categorization_status string

      Values are ok or warn.

    • model.categorized_doc_count string

      The number of documents that have had a field categorized.

    • model.total_category_count string

      The number of categories created by categorization.

    • model.frequent_category_count string

      The number of categories that match more than 1% of categorized documents.

    • model.rare_category_count string

      The number of categories that match just one categorized document.

    • model.dead_category_count string

      The number of categories created by categorization that will never be assigned again because another category’s definition makes it a superset of the dead category. Dead categories are a side effect of the way categorization has no prior training.

    • model.failed_category_count string

      The number of times that categorization wanted to create a new category but couldn’t because the job had hit its model_memory_limit. This count does not track which specific categories failed to be created. Therefore you cannot use this value to determine the number of unique categories that were missed.

    • model.log_time string

      The timestamp when the model stats were gathered, according to server time.

    • model.timestamp string

      The timestamp of the last record when the model stats were gathered.

    • forecasts.total string

      The number of individual forecasts currently available for the job. A value of one or more indicates that forecasts exist.

    • forecasts.memory.min string

      The minimum memory usage in bytes for forecasts related to the anomaly detection job.

    • forecasts.memory.max string

      The maximum memory usage in bytes for forecasts related to the anomaly detection job.

    • forecasts.memory.avg string

      The average memory usage in bytes for forecasts related to the anomaly detection job.

    • forecasts.memory.total string

      The total memory usage in bytes for forecasts related to the anomaly detection job.

    • forecasts.records.min string

      The minimum number of model_forecast documents written for forecasts related to the anomaly detection job.

    • forecasts.records.max string

      The maximum number of model_forecast documents written for forecasts related to the anomaly detection job.

    • forecasts.records.avg string

      The average number of model_forecast documents written for forecasts related to the anomaly detection job.

    • forecasts.records.total string

      The total number of model_forecast documents written for forecasts related to the anomaly detection job.

    • forecasts.time.min string

      The minimum runtime in milliseconds for forecasts related to the anomaly detection job.

    • forecasts.time.max string

      The maximum runtime in milliseconds for forecasts related to the anomaly detection job.

    • forecasts.time.avg string

      The average runtime in milliseconds for forecasts related to the anomaly detection job.

    • forecasts.time.total string

      The total runtime in milliseconds for forecasts related to the anomaly detection job.

    • node.id string
    • node.name string

      The name of the assigned node.

    • node.ephemeral_id string
    • node.address string

      The network address of the assigned node.

    • buckets.count string

      The number of bucket results produced by the job.

    • buckets.time.total string

      The sum of all bucket processing times, in milliseconds.

    • buckets.time.min string

      The minimum of all bucket processing times, in milliseconds.

    • buckets.time.max string

      The maximum of all bucket processing times, in milliseconds.

    • buckets.time.exp_avg string

      The exponential moving average of all bucket processing times, in milliseconds.

    • buckets.time.exp_avg_hour string

      The exponential moving average of bucket processing times calculated in a one hour time window, in milliseconds.

GET /_cat/ml/anomaly_detectors/{job_id}
GET _cat/ml/anomaly_detectors?h=id,s,dpr,mb&v=true&format=json
resp = client.cat.ml_jobs(
    h="id,s,dpr,mb",
    v=True,
    format="json",
)
const response = await client.cat.mlJobs({
  h: "id,s,dpr,mb",
  v: "true",
  format: "json",
});
response = client.cat.ml_jobs(
  h: "id,s,dpr,mb",
  v: "true",
  format: "json"
)
$resp = $client->cat()->mlJobs([
    "h" => "id,s,dpr,mb",
    "v" => "true",
    "format" => "json",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_cat/ml/anomaly_detectors?h=id,s,dpr,mb&v=true&format=json"
client.cat().mlJobs();
Response examples (200)
A successful response from `GET _cat/ml/anomaly_detectors?h=id,s,dpr,mb&v=true&format=json`.
[
  {
    "id": "high_sum_total_sales",
    "s": "closed",
    "dpr": "14022",
    "mb": "1.5mb"
  },
  {
    "id": "low_request_rate",
    "s": "closed",
    "dpr": "1216",
    "mb": "40.5kb"
  },
  {
    "id": "response_code_rates",
    "s": "closed",
    "dpr": "28146",
    "mb": "132.7kb"
  },
  {
    "id": "url_scanning",
    "s": "closed",
    "dpr": "28146",
    "mb": "501.6kb"
  }
]













Ping the cluster Generally available

HEAD /

Get information about whether the cluster is running.

Responses

  • 200 application/json
HEAD /
curl \
 --request HEAD 'http://api.example.com/' \
 --header "Authorization: $API_KEY"

Connector

The connector and sync jobs APIs provide a convenient way to create and manage Elastic connectors and sync jobs in an internal index. Connectors are Elasticsearch integrations for syncing content from third-party data sources, which can be deployed on Elastic Cloud or hosted on your own infrastructure. This API provides an alternative to relying solely on Kibana UI for connector and sync job management. The API comes with a set of validations and assertions to ensure that the state representation in the internal index remains valid. This API requires the manage_connector privilege or, for read-only endpoints, the monitor_connector privilege.

Check out the connector API tutorial

Check in a connector Technical preview

PUT /_connector/{connector_id}/_check_in

Update the last_seen field in the connector and set it to the current timestamp.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be checked in

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_check_in
PUT _connector/my-connector/_check_in
resp = client.connector.check_in(
    connector_id="my-connector",
)
const response = await client.connector.checkIn({
  connector_id: "my-connector",
});
response = client.connector.check_in(
  connector_id: "my-connector"
)
$resp = $client->connector()->checkIn([
    "connector_id" => "my-connector",
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_connector/my-connector/_check_in"
client.connector().checkIn(c -> c
    .connectorId("my-connector")
);
Response examples (200)
{
    "result": "updated"
}








Delete a connector Beta

DELETE /_connector/{connector_id}

Removes a connector and associated sync jobs. This is a destructive action that is not recoverable. NOTE: This action doesn’t delete any API keys, ingest pipelines, or data indices associated with the connector. These need to be removed manually.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be deleted

Query parameters

  • delete_sync_jobs boolean

    A flag indicating if associated sync jobs should be also removed. Defaults to false.

  • hard boolean

    A flag indicating if the connector should be hard deleted.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_connector/{connector_id}
DELETE _connector/my-connector-id&delete_sync_jobs=true
resp = client.connector.delete(
    connector_id="my-connector-id&delete_sync_jobs=true",
)
const response = await client.connector.delete({
  connector_id: "my-connector-id&delete_sync_jobs=true",
});
response = client.connector.delete(
  connector_id: "my-connector-id&delete_sync_jobs=true"
)
$resp = $client->connector()->delete([
    "connector_id" => "my-connector-id&delete_sync_jobs=true",
]);
curl -X DELETE -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_connector/my-connector-id&delete_sync_jobs=true"
client.connector().delete(d -> d
    .connectorId("my-connector-id&delete_sync_jobs=true")
);
Response examples (200)
{
    "acknowledged": true
}

Get all connectors Beta

GET /_connector

Get information about all connectors.

Query parameters

  • from number

    Starting offset (default: 0)

  • size number

    Specifies a max number of results to get

  • index_name string | array[string]

    A comma-separated list of connector index names to fetch connector documents for

  • connector_name string | array[string]

    A comma-separated list of connector names to fetch connector documents for

  • service_type string | array[string]

    A comma-separated list of connector service types to fetch connector documents for

  • include_deleted boolean

    A flag to indicate if the desired connector should be fetched, even if it was soft-deleted.

  • query string

    A wildcard query string that filters connectors with matching name, description or index name

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • results array[object] Required
      Hide results attributes Show results attributes object
      • api_key_id string
      • api_key_secret_id string
      • configuration object Required
        Hide configuration attribute Show configuration attribute object
      • custom_scheduling object Required
        Hide custom_scheduling attribute Show custom_scheduling attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • configuration_overrides object Required
            Hide configuration_overrides attributes Show configuration_overrides attributes object
            • max_crawl_depth number
            • sitemap_discovery_disabled boolean
            • domain_allowlist array[string]
            • sitemap_urls array[string]
            • seed_urls array[string]
          • enabled boolean Required
          • interval string Required
          • last_synced string
          • name string Required
      • deleted boolean Required
      • description string
      • features object
        Hide features attributes Show features attributes object
        • document_level_security object
          Hide document_level_security attribute Show document_level_security attribute object
          • enabled boolean Required
        • incremental_sync object
          Hide incremental_sync attribute Show incremental_sync attribute object
          • enabled boolean Required
        • native_connector_api_keys object
          Hide native_connector_api_keys attribute Show native_connector_api_keys attribute object
          • enabled boolean Required
        • sync_rules object
          Hide sync_rules attributes Show sync_rules attributes object
          • advanced object
            Hide advanced attribute Show advanced attribute object
            • enabled boolean Required
          • basic object
            Hide basic attribute Show basic attribute object
            • enabled boolean Required
      • filtering array[object] Required
        Hide filtering attributes Show filtering attributes object
        • active object Required
          Hide active attributes Show active attributes object
          • advanced_snippet object Required
          • rules array[object] Required
          • validation object Required
        • domain string
        • draft object Required
          Hide draft attributes Show draft attributes object
          • advanced_snippet object Required
          • rules array[object] Required
          • validation object Required
      • id string
      • index_name string | null

      • is_native boolean Required
      • language string
      • last_access_control_sync_error string
      • last_access_control_sync_scheduled_at string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • last_access_control_sync_status string

        Values are canceling, canceled, completed, error, in_progress, pending, or suspended.

      • last_deleted_document_count number
      • last_incremental_sync_scheduled_at string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • last_indexed_document_count number
      • last_seen string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • last_sync_error string
      • last_sync_scheduled_at string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • last_sync_status string

        Values are canceling, canceled, completed, error, in_progress, pending, or suspended.

      • last_synced string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • name string
      • pipeline object
        Hide pipeline attributes Show pipeline attributes object
        • extract_binary_content boolean Required
        • name string Required
        • reduce_whitespace boolean Required
        • run_ml_inference boolean Required
      • scheduling object Required
        Hide scheduling attributes Show scheduling attributes object
        • access_control object
          Hide access_control attributes Show access_control attributes object
          • enabled boolean Required
          • interval string Required

            The interval is expressed using the crontab syntax

        • full object
          Hide full attributes Show full attributes object
          • enabled boolean Required
          • interval string Required

            The interval is expressed using the crontab syntax

        • incremental object
          Hide incremental attributes Show incremental attributes object
          • enabled boolean Required
          • interval string Required

            The interval is expressed using the crontab syntax

      • service_type string
      • status string Required

        Values are created, needs_configuration, configured, connected, or error.

      • sync_cursor object
      • sync_now boolean Required
GET _connector
resp = client.connector.list()
const response = await client.connector.list();
response = client.connector.list
$resp = $client->connector()->list();
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_connector"
client.connector().list(l -> l);

Create a connector Beta

POST /_connector

Connectors are Elasticsearch integrations that bring content from third-party data sources, which can be deployed on Elastic Cloud or hosted on your own infrastructure. Elastic managed connectors (Native connectors) are a managed service on Elastic Cloud. Self-managed connectors (Connector clients) are self-managed on your infrastructure.

application/json

Body

  • description string
  • index_name string
  • is_native boolean
  • language string
  • name string
  • service_type string

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

    • id string Required
POST /_connector
curl \
 --request POST 'http://api.example.com/_connector' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"description":"string","index_name":"string","is_native":true,"language":"string","name":"string","service_type":"string"}'

Cancel a connector sync job Beta

PUT /_connector/_sync_job/{connector_sync_job_id}/_cancel

Cancel a connector sync job, which sets the status to cancelling and updates cancellation_requested_at to the current time. The connector service is then responsible for setting the status of connector sync jobs to cancelled.

Path parameters

  • connector_sync_job_id string Required

    The unique identifier of the connector sync job

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/_sync_job/{connector_sync_job_id}/_cancel
PUT _connector/_sync_job/my-connector-sync-job-id/_cancel
resp = client.connector.sync_job_cancel(
    connector_sync_job_id="my-connector-sync-job-id",
)
const response = await client.connector.syncJobCancel({
  connector_sync_job_id: "my-connector-sync-job-id",
});
response = client.connector.sync_job_cancel(
  connector_sync_job_id: "my-connector-sync-job-id"
)
$resp = $client->connector()->syncJobCancel([
    "connector_sync_job_id" => "my-connector-sync-job-id",
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_connector/_sync_job/my-connector-sync-job-id/_cancel"
client.connector().syncJobCancel(s -> s
    .connectorSyncJobId("my-connector-sync-job-id")
);

Get a connector sync job Beta

GET /_connector/_sync_job/{connector_sync_job_id}

Path parameters

  • connector_sync_job_id string Required

    The unique identifier of the connector sync job

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • cancelation_requested_at string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      One of:
    • canceled_at string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      One of:
    • completed_at string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      One of:
    • connector object Required
      Hide connector attributes Show connector attributes object
      • configuration object Required
        Hide configuration attribute Show configuration attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • category string
          • default_value number | string | boolean | null Required

            A scalar value.

          • depends_on array[object] Required
            Hide depends_on attributes Show depends_on attributes object
            • field string Required
            • value
          • display string Required

            Values are textbox, textarea, numeric, toggle, or dropdown.

          • label string Required
          • options array[object] Required
            Hide options attributes Show options attributes object
            • label string Required
            • value
          • order number
          • placeholder string
          • required boolean Required
          • sensitive boolean Required
          • type string

            Values are str, int, list, or bool.

          • ui_restrictions array[string]
          • validations array[object]
          • value object Required
      • filtering object Required
        Hide filtering attributes Show filtering attributes object
        • advanced_snippet object Required
          Hide advanced_snippet attributes Show advanced_snippet attributes object
          • created_at string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

            One of:
          • updated_at string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

            One of:
          • value object Required
        • rules array[object] Required
          Hide rules attributes Show rules attributes object
          • created_at string
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • id string Required
          • order number Required
          • policy string Required

            Values are exclude or include.

          • rule string Required

            Values are contains, ends_with, equals, regex, starts_with, >, or <.

          • updated_at string
          • value string Required
        • validation object Required
          Hide validation attributes Show validation attributes object
          • errors array[object] Required
            Hide errors attributes Show errors attributes object
            • ids array[string] Required
            • messages array[string] Required
          • state string Required

            Values are edited, invalid, or valid.

      • id string Required
      • index_name string Required
      • language string
      • pipeline object
        Hide pipeline attributes Show pipeline attributes object
        • extract_binary_content boolean Required
        • name string Required
        • reduce_whitespace boolean Required
        • run_ml_inference boolean Required
      • service_type string Required
      • sync_cursor object
    • created_at string | number Required

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      One of:
    • deleted_document_count number Required
    • error string
    • id string Required
    • indexed_document_count number Required
    • indexed_document_volume number Required
    • job_type string Required

      Values are full, incremental, or access_control.

    • last_seen string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      One of:
    • metadata object Required
      Hide metadata attribute Show metadata attribute object
      • * object Additional properties
    • started_at string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      One of:
    • status string Required

      Values are canceling, canceled, completed, error, in_progress, pending, or suspended.

    • total_document_count number Required
    • trigger_method string Required

      Values are on_demand or scheduled.

    • worker_hostname string
GET /_connector/_sync_job/{connector_sync_job_id}
GET _connector/_sync_job/my-connector-sync-job
resp = client.connector.sync_job_get(
    connector_sync_job_id="my-connector-sync-job",
)
const response = await client.connector.syncJobGet({
  connector_sync_job_id: "my-connector-sync-job",
});
response = client.connector.sync_job_get(
  connector_sync_job_id: "my-connector-sync-job"
)
$resp = $client->connector()->syncJobGet([
    "connector_sync_job_id" => "my-connector-sync-job",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_connector/_sync_job/my-connector-sync-job"
client.connector().syncJobGet(s -> s
    .connectorSyncJobId("my-connector-sync-job")
);




Get all connector sync jobs Beta

GET /_connector/_sync_job

Get information about all stored connector sync jobs listed by their creation date in ascending order.

Query parameters

  • from number

    Starting offset (default: 0)

  • size number

    Specifies a max number of results to get

  • status string

    A sync job status to fetch connector sync jobs for

    Values are canceling, canceled, completed, error, in_progress, pending, or suspended.

  • connector_id string

    A connector id to fetch connector sync jobs for

  • job_type string | array[string]

    A comma-separated list of job types to fetch the sync jobs for

    Supported values include: full, incremental, access_control

    Values are full, incremental, or access_control.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • results array[object] Required
      Hide results attributes Show results attributes object
      • cancelation_requested_at string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • canceled_at string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • completed_at string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • connector object Required
        Hide connector attributes Show connector attributes object
        • configuration object Required
          Hide configuration attribute Show configuration attribute object
          • * object Additional properties
            Hide * attributes Show * attributes object
            • category string
            • default_value
            • depends_on array[object] Required
            • display string Required

              Values are textbox, textarea, numeric, toggle, or dropdown.

            • label string Required
            • options array[object] Required
            • order number
            • placeholder string
            • required boolean Required
            • sensitive boolean Required
            • tooltip
            • type string

              Values are str, int, list, or bool.

            • ui_restrictions array[string]
            • validations array[object]
            • value object Required
        • filtering object Required
          Hide filtering attributes Show filtering attributes object
          • advanced_snippet object Required
            Hide advanced_snippet attributes Show advanced_snippet attributes object
            • created_at
            • updated_at
            • value object Required
          • rules array[object] Required
          • validation object Required
            Hide validation attributes Show validation attributes object
            • errors array[object] Required
            • state string Required

              Values are edited, invalid, or valid.

        • id string Required
        • index_name string Required
        • language string
        • pipeline object
          Hide pipeline attributes Show pipeline attributes object
          • extract_binary_content boolean Required
          • name string Required
          • reduce_whitespace boolean Required
          • run_ml_inference boolean Required
        • service_type string Required
        • sync_cursor object
      • created_at string | number Required

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • deleted_document_count number Required
      • error string
      • id string Required
      • indexed_document_count number Required
      • indexed_document_volume number Required
      • job_type string Required

        Values are full, incremental, or access_control.

      • last_seen string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • metadata object Required
        Hide metadata attribute Show metadata attribute object
        • * object Additional properties
      • started_at string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • status string Required

        Values are canceling, canceled, completed, error, in_progress, pending, or suspended.

      • total_document_count number Required
      • trigger_method string Required

        Values are on_demand or scheduled.

      • worker_hostname string
GET _connector/_sync_job?connector_id=my-connector-id&size=1
resp = client.connector.sync_job_list(
    connector_id="my-connector-id",
    size="1",
)
const response = await client.connector.syncJobList({
  connector_id: "my-connector-id",
  size: 1,
});
response = client.connector.sync_job_list(
  connector_id: "my-connector-id",
  size: "1"
)
$resp = $client->connector()->syncJobList([
    "connector_id" => "my-connector-id",
    "size" => "1",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_connector/_sync_job?connector_id=my-connector-id&size=1"
client.connector().syncJobList(s -> s
    .connectorId("my-connector-id")
    .size(1)
);




Activate the connector draft filter Technical preview

PUT /_connector/{connector_id}/_filtering/_activate

Activates the valid draft filtering for a connector.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_filtering/_activate
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_filtering/_activate' \
 --header "Authorization: $API_KEY"




Update the connector configuration Beta

PUT /_connector/{connector_id}/_configuration

Update the configuration field in the connector document.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_configuration
PUT _connector/my-spo-connector/_configuration
{
    "values": {
        "tenant_id": "my-tenant-id",
        "tenant_name": "my-sharepoint-site",
        "client_id": "foo",
        "secret_value": "bar",
        "site_collections": "*"
    }
}
resp = client.connector.update_configuration(
    connector_id="my-spo-connector",
    values={
        "tenant_id": "my-tenant-id",
        "tenant_name": "my-sharepoint-site",
        "client_id": "foo",
        "secret_value": "bar",
        "site_collections": "*"
    },
)
const response = await client.connector.updateConfiguration({
  connector_id: "my-spo-connector",
  values: {
    tenant_id: "my-tenant-id",
    tenant_name: "my-sharepoint-site",
    client_id: "foo",
    secret_value: "bar",
    site_collections: "*",
  },
});
response = client.connector.update_configuration(
  connector_id: "my-spo-connector",
  body: {
    "values": {
      "tenant_id": "my-tenant-id",
      "tenant_name": "my-sharepoint-site",
      "client_id": "foo",
      "secret_value": "bar",
      "site_collections": "*"
    }
  }
)
$resp = $client->connector()->updateConfiguration([
    "connector_id" => "my-spo-connector",
    "body" => [
        "values" => [
            "tenant_id" => "my-tenant-id",
            "tenant_name" => "my-sharepoint-site",
            "client_id" => "foo",
            "secret_value" => "bar",
            "site_collections" => "*",
        ],
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"values":{"tenant_id":"my-tenant-id","tenant_name":"my-sharepoint-site","client_id":"foo","secret_value":"bar","site_collections":"*"}}' "$ELASTICSEARCH_URL/_connector/my-spo-connector/_configuration"
client.connector().updateConfiguration(u -> u
    .connectorId("my-spo-connector")
    .values(Map.of("tenant_id", JsonData.fromJson("\"my-tenant-id\""),"tenant_name", JsonData.fromJson("\"my-sharepoint-site\""),"secret_value", JsonData.fromJson("\"bar\""),"client_id", JsonData.fromJson("\"foo\""),"site_collections", JsonData.fromJson("\"*\"")))
);
{
    "values": {
        "tenant_id": "my-tenant-id",
        "tenant_name": "my-sharepoint-site",
        "client_id": "foo",
        "secret_value": "bar",
        "site_collections": "*"
    }
}
{
    "values": {
        "secret_value": "foo-bar"
    }
}
Response examples (200)
{
  "result": "updated"
}

Update the connector error field Technical preview

PUT /_connector/{connector_id}/_error

Set the error field for the connector. If the error provided in the request body is non-null, the connector’s status is updated to error. Otherwise, if the error is reset to null, the connector status is updated to connected.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

  • error string | null Required

    One of:

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_error
PUT _connector/my-connector/_error
{
    "error": "Houston, we have a problem!"
}
resp = client.connector.update_error(
    connector_id="my-connector",
    error="Houston, we have a problem!",
)
const response = await client.connector.updateError({
  connector_id: "my-connector",
  error: "Houston, we have a problem!",
});
response = client.connector.update_error(
  connector_id: "my-connector",
  body: {
    "error": "Houston, we have a problem!"
  }
)
$resp = $client->connector()->updateError([
    "connector_id" => "my-connector",
    "body" => [
        "error" => "Houston, we have a problem!",
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"error":"Houston, we have a problem!"}' "$ELASTICSEARCH_URL/_connector/my-connector/_error"
client.connector().updateError(u -> u
    .connectorId("my-connector")
    .error("Houston, we have a problem!")
);
Request example
{
    "error": "Houston, we have a problem!"
}
Response examples (200)
{
  "result": "updated"
}

Update the connector filtering Beta

PUT /_connector/{connector_id}/_filtering

Update the draft filtering configuration of a connector and marks the draft validation state as edited. The filtering draft is activated once validated by the running Elastic connector service. The filtering property is used to configure sync rules (both basic and advanced) for a connector.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

  • filtering array[object]
    Hide filtering attributes Show filtering attributes object
    • active object Required
      Hide active attributes Show active attributes object
      • advanced_snippet object Required
        Hide advanced_snippet attributes Show advanced_snippet attributes object
        • created_at string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          One of:
        • updated_at string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          One of:
        • value object Required
      • rules array[object] Required
        Hide rules attributes Show rules attributes object
        • created_at string
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • id string Required
        • order number Required
        • policy string Required

          Values are exclude or include.

        • rule string Required

          Values are contains, ends_with, equals, regex, starts_with, >, or <.

        • updated_at string
        • value string Required
      • validation object Required
        Hide validation attributes Show validation attributes object
        • errors array[object] Required
          Hide errors attributes Show errors attributes object
          • ids array[string] Required
          • messages array[string] Required
        • state string Required

          Values are edited, invalid, or valid.

    • domain string
    • draft object Required
      Hide draft attributes Show draft attributes object
      • advanced_snippet object Required
        Hide advanced_snippet attributes Show advanced_snippet attributes object
        • created_at string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          One of:
        • updated_at string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          One of:
        • value object Required
      • rules array[object] Required
        Hide rules attributes Show rules attributes object
        • created_at string
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • id string Required
        • order number Required
        • policy string Required

          Values are exclude or include.

        • rule string Required

          Values are contains, ends_with, equals, regex, starts_with, >, or <.

        • updated_at string
        • value string Required
      • validation object Required
        Hide validation attributes Show validation attributes object
        • errors array[object] Required
          Hide errors attributes Show errors attributes object
          • ids array[string] Required
          • messages array[string] Required
        • state string Required

          Values are edited, invalid, or valid.

  • rules array[object]
    Hide rules attributes Show rules attributes object
    • created_at string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      One of:
    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • id string Required
    • order number Required
    • policy string Required

      Values are exclude or include.

    • rule string Required

      Values are contains, ends_with, equals, regex, starts_with, >, or <.

    • updated_at string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      One of:
    • value string Required
  • advanced_snippet object
    Hide advanced_snippet attributes Show advanced_snippet attributes object
    • created_at string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      One of:
    • updated_at string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      One of:
    • value object Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_filtering
PUT _connector/my-g-drive-connector/_filtering
{
    "rules": [
         {
            "field": "file_extension",
            "id": "exclude-txt-files",
            "order": 0,
            "policy": "exclude",
            "rule": "equals",
            "value": "txt"
        },
        {
            "field": "_",
            "id": "DEFAULT",
            "order": 1,
            "policy": "include",
            "rule": "regex",
            "value": ".*"
        }
    ]
}
resp = client.connector.update_filtering(
    connector_id="my-g-drive-connector",
    rules=[
        {
            "field": "file_extension",
            "id": "exclude-txt-files",
            "order": 0,
            "policy": "exclude",
            "rule": "equals",
            "value": "txt"
        },
        {
            "field": "_",
            "id": "DEFAULT",
            "order": 1,
            "policy": "include",
            "rule": "regex",
            "value": ".*"
        }
    ],
)
const response = await client.connector.updateFiltering({
  connector_id: "my-g-drive-connector",
  rules: [
    {
      field: "file_extension",
      id: "exclude-txt-files",
      order: 0,
      policy: "exclude",
      rule: "equals",
      value: "txt",
    },
    {
      field: "_",
      id: "DEFAULT",
      order: 1,
      policy: "include",
      rule: "regex",
      value: ".*",
    },
  ],
});
response = client.connector.update_filtering(
  connector_id: "my-g-drive-connector",
  body: {
    "rules": [
      {
        "field": "file_extension",
        "id": "exclude-txt-files",
        "order": 0,
        "policy": "exclude",
        "rule": "equals",
        "value": "txt"
      },
      {
        "field": "_",
        "id": "DEFAULT",
        "order": 1,
        "policy": "include",
        "rule": "regex",
        "value": ".*"
      }
    ]
  }
)
$resp = $client->connector()->updateFiltering([
    "connector_id" => "my-g-drive-connector",
    "body" => [
        "rules" => array(
            [
                "field" => "file_extension",
                "id" => "exclude-txt-files",
                "order" => 0,
                "policy" => "exclude",
                "rule" => "equals",
                "value" => "txt",
            ],
            [
                "field" => "_",
                "id" => "DEFAULT",
                "order" => 1,
                "policy" => "include",
                "rule" => "regex",
                "value" => ".*",
            ],
        ),
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"rules":[{"field":"file_extension","id":"exclude-txt-files","order":0,"policy":"exclude","rule":"equals","value":"txt"},{"field":"_","id":"DEFAULT","order":1,"policy":"include","rule":"regex","value":".*"}]}' "$ELASTICSEARCH_URL/_connector/my-g-drive-connector/_filtering"
client.connector().updateFiltering(u -> u
    .connectorId("my-g-drive-connector")
    .rules(List.of(FilteringRule.of(f -> f
            .field("file_extension")
            .id("exclude-txt-files")
            .order(0)
            .policy(FilteringPolicy.Exclude)
            .rule(FilteringRuleRule.Equals)
            .value("txt")),FilteringRule.of(f -> f
            .field("_")
            .id("DEFAULT")
            .order(1)
            .policy(FilteringPolicy.Include)
            .rule(FilteringRuleRule.Regex)
            .value(".*"))))
);
Request examples
{
    "rules": [
         {
            "field": "file_extension",
            "id": "exclude-txt-files",
            "order": 0,
            "policy": "exclude",
            "rule": "equals",
            "value": "txt"
        },
        {
            "field": "_",
            "id": "DEFAULT",
            "order": 1,
            "policy": "include",
            "rule": "regex",
            "value": ".*"
        }
    ]
}
{
    "advanced_snippet": {
        "value": [{
            "tables": [
                "users",
                "orders"
            ],
            "query": "SELECT users.id AS id, orders.order_id AS order_id FROM users JOIN orders ON users.id = orders.user_id"
        }]
    }
}
Response examples (200)
{
  "result": "updated"
}








Update the connector name and description Beta

PUT /_connector/{connector_id}/_name

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

  • name string
  • description string

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_name
PUT _connector/my-connector/_name
{
    "name": "Custom connector",
    "description": "This is my customized connector"
}
resp = client.connector.update_name(
    connector_id="my-connector",
    name="Custom connector",
    description="This is my customized connector",
)
const response = await client.connector.updateName({
  connector_id: "my-connector",
  name: "Custom connector",
  description: "This is my customized connector",
});
response = client.connector.update_name(
  connector_id: "my-connector",
  body: {
    "name": "Custom connector",
    "description": "This is my customized connector"
  }
)
$resp = $client->connector()->updateName([
    "connector_id" => "my-connector",
    "body" => [
        "name" => "Custom connector",
        "description" => "This is my customized connector",
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"name":"Custom connector","description":"This is my customized connector"}' "$ELASTICSEARCH_URL/_connector/my-connector/_name"
client.connector().updateName(u -> u
    .connectorId("my-connector")
    .description("This is my customized connector")
    .name("Custom connector")
);
Request example
{
    "name": "Custom connector",
    "description": "This is my customized connector"
}
Response examples (200)
{
  "result": "updated"
}




















Get data streams Generally available

GET /_data_stream/{name}

All methods and paths for this operation:

GET /_data_stream

GET /_data_stream/{name}

Get information about one or more data streams.

Required authorization

  • Index privileges: view_index_metadata

Path parameters

  • name string | array[string]

    Comma-separated list of data stream names used to limit the request. Wildcard (*) expressions are supported. If omitted, all data streams are returned.

Query parameters

  • expand_wildcards string | array[string]

    Type of data stream that wildcard patterns can match. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.

    Values are all, open, closed, hidden, or none.

  • include_defaults boolean Generally available

    If true, returns all relevant default configurations for the index template.

  • master_timeout string

    Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

    Values are -1 or 0.

  • verbose boolean

    Whether the maximum timestamp for each data stream should be calculated and returned.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • data_streams array[object] Required
      Hide data_streams attributes Show data_streams attributes object
      • _meta object
        Hide _meta attribute Show _meta attribute object
        • * object Additional properties
      • allow_custom_routing boolean

        If true, the data stream allows custom routing on write request.

      • failure_store object
        Hide failure_store attributes Show failure_store attributes object
        • enabled boolean Required
        • indices array[object] Required
          Hide indices attributes Show indices attributes object
          • index_name string Required
          • index_uuid string Required
          • ilm_policy string
          • managed_by string

            Values are Index Lifecycle Management, Data stream lifecycle, or Unmanaged.

          • prefer_ilm boolean

            Indicates if ILM should take precedence over DSL in case both are configured to manage this index.

          • index_mode string

            Values are standard, time_series, logsdb, or lookup.

        • rollover_on_write boolean Required
      • generation number Required

        Current generation for the data stream. This number acts as a cumulative count of the stream’s rollovers, starting at 1.

      • hidden boolean Required

        If true, the data stream is hidden.

      • ilm_policy string
      • next_generation_managed_by string Required

        Values are Index Lifecycle Management, Data stream lifecycle, or Unmanaged.

      • prefer_ilm boolean Required

        Indicates if ILM should take precedence over DSL in case both are configured to managed this data stream.

      • indices array[object] Required

        Array of objects containing information about the data stream’s backing indices. The last item in this array contains information about the stream’s current write index.

        Hide indices attributes Show indices attributes object
        • index_name string Required
        • index_uuid string Required
        • ilm_policy string
        • managed_by string

          Values are Index Lifecycle Management, Data stream lifecycle, or Unmanaged.

        • prefer_ilm boolean

          Indicates if ILM should take precedence over DSL in case both are configured to manage this index.

        • index_mode string

          Values are standard, time_series, logsdb, or lookup.

      • lifecycle object

        Data stream lifecycle with rollover can be used to display the configuration including the default rollover conditions, if asked.

        Hide lifecycle attributes Show lifecycle attributes object
        • data_retention string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • downsampling object
          Hide downsampling attribute Show downsampling attribute object
          • rounds array[object] Required

            The list of downsampling rounds to execute as part of this downsampling configuration

        • enabled boolean

          If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

          Default value is true.

        • rollover object
          Hide rollover attributes Show rollover attributes object
          • min_age string

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • max_age string
          • min_docs number
          • max_docs number
          • min_size
          • max_size
          • min_primary_shard_size
          • max_primary_shard_size
          • min_primary_shard_docs number
          • max_primary_shard_docs number
      • name string Required
      • replicated boolean

        If true, the data stream is created and managed by cross-cluster replication and the local cluster can not write into this data stream or change its mappings.

      • rollover_on_write boolean Required

        If true, the next write to this data stream will trigger a rollover first and the document will be indexed in the new backing index. If the rollover fails the indexing request will fail too.

      • settings object Required Additional properties
        Index settings
      • mappings object
        Hide mappings attributes Show mappings attributes object
        • all_field object
          Hide all_field attributes Show all_field attributes object
          • analyzer string Required
          • enabled boolean Required
          • omit_norms boolean Required
          • search_analyzer string Required
          • similarity string Required
          • store boolean Required
          • store_term_vector_offsets boolean Required
          • store_term_vector_payloads boolean Required
          • store_term_vector_positions boolean Required
          • store_term_vectors boolean Required
        • date_detection boolean
        • dynamic string

          Values are strict, runtime, true, or false.

        • dynamic_date_formats array[string]
        • dynamic_templates array[object]
        • _field_names object
          Hide _field_names attribute Show _field_names attribute object
          • enabled boolean Required
        • index_field object
          Hide index_field attribute Show index_field attribute object
          • enabled boolean Required
        • _meta object
          Hide _meta attribute Show _meta attribute object
          • * object Additional properties
        • numeric_detection boolean
        • properties object
        • _routing object
          Hide _routing attribute Show _routing attribute object
          • required boolean Required
        • _size object
          Hide _size attribute Show _size attribute object
          • enabled boolean Required
        • _source object
          Hide _source attributes Show _source attributes object
          • compress boolean
          • compress_threshold string
          • enabled boolean
          • excludes array[string]
          • includes array[string]
          • mode string

            Values are disabled, stored, or synthetic.

        • runtime object
          Hide runtime attribute Show runtime attribute object
          • * object Additional properties
            Hide * attributes Show * attributes object
            • fields object

              For type composite

            • fetch_fields array[object]

              For type lookup

            • format string

              A custom format for date type runtime fields.

            • input_field string

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • target_field string

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • target_index string
            • script object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • enabled boolean
        • subobjects string

          Values are true or false.

        • _data_stream_timestamp object
          Hide _data_stream_timestamp attribute Show _data_stream_timestamp attribute object
          • enabled boolean Required
      • status string Required

        Values are green, GREEN, yellow, YELLOW, red, RED, unknown, or unavailable.

      • system boolean Generally available

        If true, the data stream is created and managed by an Elastic stack component and cannot be modified through normal user interaction.

      • template string Required
      • timestamp_field object Required
        Hide timestamp_field attribute Show timestamp_field attribute object
        • name string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • index_mode string

        Values are standard, time_series, logsdb, or lookup.

GET _data_stream/my-data-stream
resp = client.indices.get_data_stream(
    name="my-data-stream",
)
const response = await client.indices.getDataStream({
  name: "my-data-stream",
});
response = client.indices.get_data_stream(
  name: "my-data-stream"
)
$resp = $client->indices()->getDataStream([
    "name" => "my-data-stream",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_data_stream/my-data-stream"
client.indices().getDataStream(g -> g
    .name("my-data-stream")
);
Response examples (200)
A successful response for retrieving information about a data stream.
{
  "data_streams": [
    {
      "name": "my-data-stream",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-my-data-stream-2099.03.07-000001",
          "index_uuid": "xCEhwsp8Tey0-FLNFYVwSg",
          "prefer_ilm": true,
          "ilm_policy": "my-lifecycle-policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-my-data-stream-2099.03.08-000002",
          "index_uuid": "PA_JquKGSiKcAKBA8DJ5gw",
          "prefer_ilm": true,
          "ilm_policy": "my-lifecycle-policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 2,
      "_meta": {
        "my-meta-field": "foo"
      },
      "status": "GREEN",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "template": "my-index-template",
      "ilm_policy": "my-lifecycle-policy",
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    },
    {
      "name": "my-data-stream-two",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-my-data-stream-two-2099.03.08-000001",
          "index_uuid": "3liBu2SYS5axasRt6fUIpA",
          "prefer_ilm": true,
          "ilm_policy": "my-lifecycle-policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 1,
      "_meta": {
        "my-meta-field": "foo"
      },
      "status": "YELLOW",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "template": "my-index-template",
      "ilm_policy": "my-lifecycle-policy",
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}
















































Update data streams Generally available

POST /_data_stream/_modify

Performs one or more data stream modification actions in a single atomic operation.

application/json

Body Required

  • actions array[object] Required

    Actions to perform.

    Hide actions attributes Show actions attributes object
    • add_backing_index object
      Hide add_backing_index attributes Show add_backing_index attributes object
      • data_stream string Required
      • index string Required
    • remove_backing_index object
      Hide remove_backing_index attributes Show remove_backing_index attributes object
      • data_stream string Required
      • index string Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST _data_stream/_modify
{
  "actions": [
    {
      "remove_backing_index": {
        "data_stream": "my-data-stream",
        "index": ".ds-my-data-stream-2023.07.26-000001"
      }
    },
    {
      "add_backing_index": {
        "data_stream": "my-data-stream",
        "index": ".ds-my-data-stream-2023.07.26-000001-downsample"
      }
    }
  ]
}
resp = client.indices.modify_data_stream(
    actions=[
        {
            "remove_backing_index": {
                "data_stream": "my-data-stream",
                "index": ".ds-my-data-stream-2023.07.26-000001"
            }
        },
        {
            "add_backing_index": {
                "data_stream": "my-data-stream",
                "index": ".ds-my-data-stream-2023.07.26-000001-downsample"
            }
        }
    ],
)
const response = await client.indices.modifyDataStream({
  actions: [
    {
      remove_backing_index: {
        data_stream: "my-data-stream",
        index: ".ds-my-data-stream-2023.07.26-000001",
      },
    },
    {
      add_backing_index: {
        data_stream: "my-data-stream",
        index: ".ds-my-data-stream-2023.07.26-000001-downsample",
      },
    },
  ],
});
response = client.indices.modify_data_stream(
  body: {
    "actions": [
      {
        "remove_backing_index": {
          "data_stream": "my-data-stream",
          "index": ".ds-my-data-stream-2023.07.26-000001"
        }
      },
      {
        "add_backing_index": {
          "data_stream": "my-data-stream",
          "index": ".ds-my-data-stream-2023.07.26-000001-downsample"
        }
      }
    ]
  }
)
$resp = $client->indices()->modifyDataStream([
    "body" => [
        "actions" => array(
            [
                "remove_backing_index" => [
                    "data_stream" => "my-data-stream",
                    "index" => ".ds-my-data-stream-2023.07.26-000001",
                ],
            ],
            [
                "add_backing_index" => [
                    "data_stream" => "my-data-stream",
                    "index" => ".ds-my-data-stream-2023.07.26-000001-downsample",
                ],
            ],
        ),
    ],
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"actions":[{"remove_backing_index":{"data_stream":"my-data-stream","index":".ds-my-data-stream-2023.07.26-000001"}},{"add_backing_index":{"data_stream":"my-data-stream","index":".ds-my-data-stream-2023.07.26-000001-downsample"}}]}' "$ELASTICSEARCH_URL/_data_stream/_modify"
client.indices().modifyDataStream(m -> m
    .actions(List.of(Action.of(a -> a
            .removeBackingIndex(r -> r
                .dataStream("my-data-stream")
                .index(".ds-my-data-stream-2023.07.26-000001")
        )),Action.of(ac -> ac
            .addBackingIndex(ad -> ad
                .dataStream("my-data-stream")
                .index(".ds-my-data-stream-2023.07.26-000001-downsample")
        ))))
);
Request example
An example body for a `POST _data_stream/_modify` request.
{
  "actions": [
    {
      "remove_backing_index": {
        "data_stream": "my-data-stream",
        "index": ".ds-my-data-stream-2023.07.26-000001"
      }
    },
    {
      "add_backing_index": {
        "data_stream": "my-data-stream",
        "index": ".ds-my-data-stream-2023.07.26-000001-downsample"
      }
    }
  ]
}

Bulk index or delete documents Generally available

PUT /{index}/_bulk

All methods and paths for this operation:

POST /_bulk

PUT /_bulk
POST /{index}/_bulk
PUT /{index}/_bulk

Perform multiple index, create, delete, and update actions in a single request. This reduces overhead and can greatly increase indexing speed.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To use the create action, you must have the create_doc, create, index, or write index privilege. Data streams support only the create action.
  • To use the index action, you must have the create, index, or write index privilege.
  • To use the delete action, you must have the delete or write index privilege.
  • To use the update action, you must have the index or write index privilege.
  • To automatically create a data stream or index with a bulk API request, you must have the auto_configure, create_index, or manage index privilege.
  • To make the result of a bulk operation visible to search using the refresh parameter, you must have the maintenance or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:

action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n

The index and create actions expect a source on the next line and have the same semantics as the op_type parameter in the standard index API. A create action fails if a document with the same ID already exists in the target An index action adds or replaces a document as necessary.

NOTE: Data streams support only the create action. To update or delete a document in a data stream, you must target the backing index containing the document.

An update action expects that the partial doc, upsert, and script and its options are specified on the next line.

A delete action does not expect a source on the next line and has the same semantics as the standard delete API.

NOTE: The final line of data must end with a newline character (\n). Each newline character may be preceded by a carriage return (\r). When sending NDJSON data to the _bulk endpoint, use a Content-Type header of application/json or application/x-ndjson. Because this format uses literal newline characters (\n) as delimiters, make sure that the JSON actions and sources are not pretty printed.

If you provide a target in the request path, it is used for any actions that don't explicitly specify an _index argument.

A note on the format: the idea here is to make processing as fast as possible. As some of the actions are redirected to other shards on other nodes, only action_meta_data is parsed on the receiving node side.

Client libraries using this protocol should try and strive to do something similar on the client side, and reduce buffering as much as possible.

There is no "correct" number of actions to perform in a single bulk request. Experiment with different settings to find the optimal size for your particular workload. Note that Elasticsearch limits the maximum size of a HTTP request to 100mb by default so clients must ensure that no request exceeds this size. It is not possible to index a single document that exceeds the size limit, so you must pre-process any such documents into smaller pieces before sending them to Elasticsearch. For instance, split documents into pages or chapters before indexing them, or store raw binary data in a system outside Elasticsearch and replace the raw data with a link to the external system in the documents that you send to Elasticsearch.

Client suppport for bulk requests

Some of the officially supported clients provide helpers to assist with bulk requests and reindexing:

  • Go: Check out esutil.BulkIndexer
  • Perl: Check out Search::Elasticsearch::Client::5_0::Bulk and Search::Elasticsearch::Client::5_0::Scroll
  • Python: Check out elasticsearch.helpers.*
  • JavaScript: Check out client.helpers.*
  • .NET: Check out BulkAllObservable
  • PHP: Check out bulk indexing.

Submitting bulk requests with cURL

If you're providing text file input to curl, you must use the --data-binary flag instead of plain -d. The latter doesn't preserve newlines. For example:

$ cat requests
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
$ curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@requests"; echo
{"took":7, "errors": false, "items":[{"index":{"_index":"test","_id":"1","_version":1,"result":"created","forced_refresh":false}}]}

Optimistic concurrency control

Each index and delete action within a bulk API call may include the if_seq_no and if_primary_term parameters in their respective action and meta data lines. The if_seq_no and if_primary_term parameters control how operations are run, based on the last modification to existing documents. See Optimistic concurrency control for more details.

Versioning

Each bulk item can include the version value using the version field. It automatically follows the behavior of the index or delete operation based on the _version mapping. It also support the version_type.

Routing

Each bulk item can include the routing value using the routing field. It automatically follows the behavior of the index or delete operation based on the _routing mapping.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Wait for active shards

When making bulk calls, you can set the wait_for_active_shards parameter to require a minimum number of shard copies to be active before starting to process the bulk request.

Refresh

Control when the changes made by this request are visible to search.

NOTE: Only the shards that receive the bulk request will be affected by refresh. Imagine a _bulk?refresh=wait_for request with three documents in it that happen to be routed to different shards in an index with five shards. The request will only wait for those three shards to refresh. The other two shards that make up the index do not participate in the _bulk request at all.

You might want to disable the refresh interval temporarily to improve indexing throughput for large bulk requests. Refer to the linked documentation for step-by-step instructions using the index settings API.

External documentation

Path parameters

  • index string Required

    The name of the data stream, index, or index alias to perform bulk actions on.

Query parameters

  • include_source_on_error boolean

    True or false if to include the document source in the error message in case of parsing errors.

  • list_executed_pipelines boolean

    If true, the response will include the ingest pipelines that were run for each index or create.

  • pipeline string

    The pipeline identifier to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, wait for a refresh to make this operation visible to search. If false, do nothing with refreshes. Valid values: true, false, wait_for.

    Values are true, false, or wait_for.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or contains a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter. If the _source parameter is false, this parameter is ignored.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • timeout string

    The period each action waits for the following operations: automatic index creation, dynamic mapping updates, and waiting for active shards. The default is 1m (one minute), which guarantees Elasticsearch waits for at least the timeout before failing. The actual wait time could be longer, particularly when multiple waits occur.

    Values are -1 or 0.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default is 1, which waits for each primary shard to be active.

    Values are all or index-setting.

  • require_alias boolean

    If true, the request's actions must target an index alias.

  • require_data_stream boolean

    If true, the request's actions must target a data stream (existing or to be created).

application/json

Body object Required

One of:
  • index object
    Hide index attributes Show index attributes object
    • _id string
    • _index string
    • routing string
    • if_primary_term number
    • if_seq_no number
    • version number
    • version_type string

      Values are internal, external, external_gte, or force.

    • dynamic_templates object

      A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      Hide dynamic_templates attribute Show dynamic_templates attribute object
      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • require_alias boolean

      If true, the request's actions must target an index alias.

      Default value is false.

  • create object
    Hide create attributes Show create attributes object
    • _id string
    • _index string
    • routing string
    • if_primary_term number
    • if_seq_no number
    • version number
    • version_type string

      Values are internal, external, external_gte, or force.

    • dynamic_templates object

      A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      Hide dynamic_templates attribute Show dynamic_templates attribute object
      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • require_alias boolean

      If true, the request's actions must target an index alias.

      Default value is false.

  • update object
    Hide update attributes Show update attributes object
    • _id string
    • _index string
    • routing string
    • if_primary_term number
    • if_seq_no number
    • version number
    • version_type string

      Values are internal, external, external_gte, or force.

    • require_alias boolean

      If true, the request's actions must target an index alias.

      Default value is false.

    • retry_on_conflict number

      The number of times an update should be retried in the case of a version conflict.

  • delete object
    Hide delete attributes Show delete attributes object
    • _id string
    • _index string
    • routing string
    • if_primary_term number
    • if_seq_no number
    • version number
    • version_type string

      Values are internal, external, external_gte, or force.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • errors boolean Required

      If true, one or more of the operations in the bulk request did not complete successfully.

    • items array[object] Required

      The result of each operation in the bulk request, in the order they were submitted.

      Hide items attribute Show items attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • _id string | null

          The document ID associated with the operation.

        • _index string Required

          The name of the index associated with the operation. If the operation targeted a data stream, this is the backing index into which the document was written.

        • status number Required

          The HTTP status code returned for the operation.

        • failure_store string

          Values are not_applicable_or_unknown, used, not_enabled, or failed.

        • error object

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

          Hide error attributes Show error attributes object
          • type string Required

            The type of error

          • reason string | null

            A human-readable explanation of the error, in English.

          • stack_trace string

            The server stack trace. Present only if the error_trace=true parameter was sent with the request.

          • caused_by object

            Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

          • root_cause array[object]

            Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

            Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

          • suppressed array[object]

            Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

            Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

        • _primary_term number

          The primary term assigned to the document for the operation. This property is returned only for successful operations.

        • result string

          The result of the operation. Successful values are created, deleted, and updated.

        • _seq_no number
        • _shards object
          Hide _shards attributes Show _shards attributes object
          • failed number Required
          • successful number Required
          • total number Required
          • failures array[object]
          • skipped number
        • _version number
        • forced_refresh boolean
        • get object
          Hide get attributes Show get attributes object
          • fields object
            Hide fields attribute Show fields attribute object
            • * object Additional properties
          • found boolean Required
          • _seq_no number
          • _primary_term number
          • _routing string
          • _source object
            Hide _source attribute Show _source attribute object
            • * object Additional properties
    • took number Required

      The length of time, in milliseconds, it took to process the bulk request.

    • ingest_took number
POST _bulk
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
resp = client.bulk(
    operations=[
        {
            "index": {
                "_index": "test",
                "_id": "1"
            }
        },
        {
            "field1": "value1"
        },
        {
            "delete": {
                "_index": "test",
                "_id": "2"
            }
        },
        {
            "create": {
                "_index": "test",
                "_id": "3"
            }
        },
        {
            "field1": "value3"
        },
        {
            "update": {
                "_id": "1",
                "_index": "test"
            }
        },
        {
            "doc": {
                "field2": "value2"
            }
        }
    ],
)
const response = await client.bulk({
  operations: [
    {
      index: {
        _index: "test",
        _id: "1",
      },
    },
    {
      field1: "value1",
    },
    {
      delete: {
        _index: "test",
        _id: "2",
      },
    },
    {
      create: {
        _index: "test",
        _id: "3",
      },
    },
    {
      field1: "value3",
    },
    {
      update: {
        _id: "1",
        _index: "test",
      },
    },
    {
      doc: {
        field2: "value2",
      },
    },
  ],
});
response = client.bulk(
  body: [
    {
      "index": {
        "_index": "test",
        "_id": "1"
      }
    },
    {
      "field1": "value1"
    },
    {
      "delete": {
        "_index": "test",
        "_id": "2"
      }
    },
    {
      "create": {
        "_index": "test",
        "_id": "3"
      }
    },
    {
      "field1": "value3"
    },
    {
      "update": {
        "_id": "1",
        "_index": "test"
      }
    },
    {
      "doc": {
        "field2": "value2"
      }
    }
  ]
)
$resp = $client->bulk([
    "body" => array(
        [
            "index" => [
                "_index" => "test",
                "_id" => "1",
            ],
        ],
        [
            "field1" => "value1",
        ],
        [
            "delete" => [
                "_index" => "test",
                "_id" => "2",
            ],
        ],
        [
            "create" => [
                "_index" => "test",
                "_id" => "3",
            ],
        ],
        [
            "field1" => "value3",
        ],
        [
            "update" => [
                "_id" => "1",
                "_index" => "test",
            ],
        ],
        [
            "doc" => [
                "field2" => "value2",
            ],
        ],
    ),
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '[{"index":{"_index":"test","_id":"1"}},{"field1":"value1"},{"delete":{"_index":"test","_id":"2"}},{"create":{"_index":"test","_id":"3"}},{"field1":"value3"},{"update":{"_id":"1","_index":"test"}},{"doc":{"field2":"value2"}}]' "$ELASTICSEARCH_URL/_bulk"
Run `POST _bulk` to perform multiple operations.
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
When you run `POST _bulk` and use the `update` action, you can use `retry_on_conflict` as a field in the action itself (not in the extra payload line) to specify how many times an update should be retried in the case of a version conflict.
{ "update" : {"_id" : "1", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"} }
{ "update" : { "_id" : "0", "_index" : "index1", "retry_on_conflict" : 3} }
{ "script" : { "source": "ctx._source.counter += params.param1", "lang" : "painless", "params" : {"param1" : 1}}, "upsert" : {"counter" : 1}}
{ "update" : {"_id" : "2", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"}, "doc_as_upsert" : true }
{ "update" : {"_id" : "3", "_index" : "index1", "_source" : true} }
{ "doc" : {"field" : "value"} }
{ "update" : {"_id" : "4", "_index" : "index1"} }
{ "doc" : {"field" : "value"}, "_source": true}
To return only information about failed operations, run `POST /_bulk?filter_path=items.*.error`.
{ "update": {"_id": "5", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "update": {"_id": "6", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "create": {"_id": "7", "_index": "index1"} }
{ "my_field": "foo" }
Run `POST /_bulk` to perform a bulk request that consists of index and create actions with the `dynamic_templates` parameter. The bulk request creates two new fields `work_location` and `home_location` with type `geo_point` according to the `dynamic_templates` parameter. However, the `raw_location` field is created using default dynamic mapping rules, as a text field in that case since it is supplied as a string in the JSON document.
{ "index" : { "_index" : "my_index", "_id" : "1", "dynamic_templates": {"work_location": "geo_point"}} }
{ "field" : "value1", "work_location": "41.12,-71.34", "raw_location": "41.12,-71.34"}
{ "create" : { "_index" : "my_index", "_id" : "2", "dynamic_templates": {"home_location": "geo_point"}} }
{ "field" : "value2", "home_location": "41.12,-71.34"}
Response examples (200)
{
   "took": 30,
   "errors": false,
   "items": [
      {
         "index": {
            "_index": "test",
            "_id": "1",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 0,
            "_primary_term": 1
         }
      },
      {
         "delete": {
            "_index": "test",
            "_id": "2",
            "_version": 1,
            "result": "not_found",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 404,
            "_seq_no" : 1,
            "_primary_term" : 2
         }
      },
      {
         "create": {
            "_index": "test",
            "_id": "3",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 2,
            "_primary_term" : 3
         }
      },
      {
         "update": {
            "_index": "test",
            "_id": "1",
            "_version": 2,
            "result": "updated",
            "_shards": {
                "total": 2,
                "successful": 1,
                "failed": 0
            },
            "status": 200,
            "_seq_no" : 3,
            "_primary_term" : 4
         }
      }
   ]
}
If you run `POST /_bulk` with operations that update non-existent documents, the operations cannot complete successfully. The API returns a response with an `errors` property value `true`. The response also includes an error object for any failed operations. The error object contains additional information about the failure, such as the error type and reason.
{
  "took": 486,
  "errors": true,
  "items": [
    {
      "update": {
        "_index": "index1",
        "_id": "5",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "_index": "index1",
        "_id": "6",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "create": {
        "_index": "index1",
        "_id": "7",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}
An example response from `POST /_bulk?filter_path=items.*.error`, which returns only information about failed operations.
{
  "items": [
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    }
  ]
}




























Check for a document source Generally available

HEAD /{index}/_source/{id}

Check whether a document source exists in an index. For example:

HEAD my-index-000001/_source/1

A document's source is not available if it is disabled in the mapping.

Required authorization

  • Index privileges: read
External documentation

Path parameters

  • index string Required

    A comma-separated list of data streams, indices, and aliases. It supports wildcards (*).

  • id string Required

    A unique identifier for the document.

Query parameters

  • preference string

    The node or shard the operation should be performed on. By default, the operation is randomized between the shard replicas.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • refresh boolean

    If true, the request refreshes the relevant shards before retrieving the document. Setting it to true should be done after careful thought and verification that this does not cause a heavy load on the system (and slow down indexing).

  • routing string

    A custom value used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or lists the fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude in the response.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response.

  • version number

    The version number for concurrency control. It must match the current version of the document for the request to succeed.

  • version_type string

    The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

Responses

  • 200 application/json
HEAD my-index-000001/_source/1
resp = client.exists_source(
    index="my-index-000001",
    id="1",
)
const response = await client.existsSource({
  index: "my-index-000001",
  id: 1,
});
response = client.exists_source(
  index: "my-index-000001",
  id: "1"
)
$resp = $client->existsSource([
    "index" => "my-index-000001",
    "id" => "1",
]);
curl --head -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/my-index-000001/_source/1"
client.existsSource(e -> e
    .id("1")
    .index("my-index-000001")
);

































Delete an enrich policy Generally available

DELETE /_enrich/policy/{name}

Deletes an existing enrich policy and its enrich index.

Path parameters

  • name string Required

    Enrich policy to delete.

Query parameters

  • master_timeout string

    Period to wait for a connection to the master node.

    Values are -1 or 0.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_enrich/policy/my-policy
resp = client.enrich.delete_policy(
    name="my-policy",
)
const response = await client.enrich.deletePolicy({
  name: "my-policy",
});
response = client.enrich.delete_policy(
  name: "my-policy"
)
$resp = $client->enrich()->deletePolicy([
    "name" => "my-policy",
]);
curl -X DELETE -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_enrich/policy/my-policy"
client.enrich().deletePolicy(d -> d
    .name("my-policy")
);




EQL

Event Query Language (EQL) is a query language for event-based time series data, such as logs, metrics, and traces.

Learn more about EQL search








Get the async EQL status Generally available

GET /_eql/search/status/{id}

Get the current status for an async EQL search or a stored synchronous EQL search without returning results.

Path parameters

  • id string Required

    Identifier for the search.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string Required
    • is_partial boolean Required

      If true, the search request is still executing. If false, the search is completed.

    • is_running boolean Required

      If true, the response does not contain complete search results. This could be because either the search is still running (is_running status is false), or because it is already completed (is_running status is true) and results are partial due to failures or timeouts.

    • start_time_in_millis number

      Time unit for milliseconds

    • expiration_time_in_millis number

      Time unit for milliseconds

    • completion_status number

      For a completed search shows the http status code of the completed search.

GET /_eql/search/status/FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=
resp = client.eql.get_status(
    id="FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=",
)
const response = await client.eql.getStatus({
  id: "FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=",
});
response = client.eql.get_status(
  id: "FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE="
)
$resp = $client->eql()->getStatus([
    "id" => "FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_eql/search/status/FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE="
client.eql().getStatus(g -> g
    .id("FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=")
);
Response examples (200)
A successful response for getting status information for an async EQL search.
{
  "id": "FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=",
  "is_running" : true,
  "is_partial" : true,
  "start_time_in_millis" : 1611690235000,
  "expiration_time_in_millis" : 1611690295000
}

Get EQL search results Generally available

POST /{index}/_eql/search

All methods and paths for this operation:

GET /{index}/_eql/search

POST /{index}/_eql/search

Returns search results for an Event Query Language (EQL) query. EQL assumes each document in a data stream or index corresponds to an event.

External documentation

Path parameters

  • index string | array[string] Required

    The name of the index to scope the operation

Query parameters

  • allow_no_indices boolean

    Whether to ignore if a wildcard indices expression resolves into no concrete indices. (This includes _all string or when no indices have been specified)

  • allow_partial_search_results boolean

    If true, returns partial results if there are shard failures. If false, returns an error with no partial results.

  • allow_partial_sequence_results boolean

    If true, sequence queries will return partial results in case of shard failures. If false, they will return no results at all. This flag has effect only if allow_partial_search_results is true.

  • expand_wildcards string | array[string]

    Whether to expand wildcard expression to concrete indices that are open, closed or both.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.

    Values are all, open, closed, hidden, or none.

  • ccs_minimize_roundtrips boolean

    Indicates whether network round-trips should be minimized as part of cross-cluster search requests execution

  • ignore_unavailable boolean

    If true, missing or closed indices are not included in the response.

  • keep_alive string

    Period for which the search and its results are stored on the cluster.

    Values are -1 or 0.

  • keep_on_completion boolean

    If true, the search and its results are stored on the cluster.

  • wait_for_completion_timeout string

    Timeout duration to wait for the request to finish. Defaults to no timeout, meaning the request waits for complete search results.

    Values are -1 or 0.

application/json

Body Required

  • query string Required

    EQL query you wish to run.

  • case_sensitive boolean
  • event_category_field string

    Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • tiebreaker_field string

    Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • timestamp_field string

    Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • fetch_size number
  • filter object | array[object]

    Query, written in Query DSL, used to filter the events on which the EQL query runs.

    One of:

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • keep_alive string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • keep_on_completion boolean
  • wait_for_completion_timeout string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • allow_partial_search_results boolean

    Allow query execution also in case of shard failures. If true, the query will keep running and will return results based on the available shards. For sequences, the behavior can be further refined using allow_partial_sequence_results

    Default value is true.

  • allow_partial_sequence_results boolean

    This flag applies only to sequences and has effect only if allow_partial_search_results=true. If true, the sequence query will return results based on the available shards, ignoring the others. If false, the sequence query will return successfully, but will always have empty results.

    Default value is false.

  • size number
  • fields object | array[object]

    Array of wildcard (*) patterns. The response returns values for field names matching these patterns in the fields property of each hit.

    One of:

    A reference to a field with formatting instructions on how to return the value

    Hide attributes Show attributes
    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • format string

      The format in which the values are returned.

    • include_unmapped boolean
  • result_position string

    Values are tail or head.

  • runtime_mappings object
    Hide runtime_mappings attribute Show runtime_mappings attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • fields object

        For type composite

        Hide fields attribute Show fields attribute object
        • * object Additional properties
          Hide * attribute Show * attribute object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • fetch_fields array[object]

        For type lookup

        Hide fetch_fields attributes Show fetch_fields attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • format string
      • format string

        A custom format for date type runtime fields.

      • input_field string

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • target_field string

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • target_index string
      • script object
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
      • type string Required

        Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

  • max_samples_per_key number

    By default, the response of a sample query contains up to 10 samples, with one sample per unique set of join keys. Use the size parameter to get a smaller or larger set of samples. To retrieve more than one sample per set of join keys, use the max_samples_per_key parameter. Pipes are not supported for sample queries.

    Default value is 1.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string
    • is_partial boolean

      If true, the response does not contain complete search results.

    • is_running boolean

      If true, the search request is still executing.

    • took number

      Time unit for milliseconds

    • timed_out boolean

      If true, the request timed out before completion.

    • hits object Required
      Hide hits attributes Show hits attributes object
      • total object
        Hide total attributes Show total attributes object
        • relation string Required

          Values are eq or gte.

        • value number Required
      • events array[object]

        Contains events matching the query. Each object represents a matching event.

        Hide events attributes Show events attributes object
        • _index string Required
        • _id string Required
        • _source object Required

          Original JSON body passed for the event at index time.

        • missing boolean

          Set to true for events in a timespan-constrained sequence that do not meet a given condition.

        • fields object
          Hide fields attribute Show fields attribute object
          • * array[object] Additional properties
      • sequences array[object]

        Contains event sequences matching the query. Each object represents a matching sequence. This parameter is only returned for EQL queries containing a sequence.

        Hide sequences attributes Show sequences attributes object
        • events array[object] Required

          Contains events matching the query. Each object represents a matching event.

          Hide events attributes Show events attributes object
          • _index string Required
          • _id string Required
          • _source object Required

            Original JSON body passed for the event at index time.

          • missing boolean

            Set to true for events in a timespan-constrained sequence that do not meet a given condition.

          • fields object
        • join_keys array[object]

          Shared field values used to constrain matches in the sequence. These are defined using the by keyword in the EQL query syntax.

    • shard_failures array[object]

      Contains information about shard failures (if any), in case allow_partial_search_results=true

      Hide shard_failures attributes Show shard_failures attributes object
      • index string
      • node string
      • reason object Required

        Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

        Hide reason attributes Show reason attributes object
        • type string Required

          The type of error

        • reason string | null

          A human-readable explanation of the error, in English.

        • stack_trace string

          The server stack trace. Present only if the error_trace=true parameter was sent with the request.

        • caused_by object

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

        • root_cause array[object]

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

        • suppressed array[object]

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

      • shard number Required
      • status string
GET /my-data-stream/_eql/search
{
  "query": """
    process where (process.name == "cmd.exe" and process.pid != 2013)
  """
}
resp = client.eql.search(
    index="my-data-stream",
    query="\n    process where (process.name == \"cmd.exe\" and process.pid != 2013)\n  ",
)
const response = await client.eql.search({
  index: "my-data-stream",
  query:
    '\n    process where (process.name == "cmd.exe" and process.pid != 2013)\n  ',
});
response = client.eql.search(
  index: "my-data-stream",
  body: {
    "query": "\n    process where (process.name == \"cmd.exe\" and process.pid != 2013)\n  "
  }
)
$resp = $client->eql()->search([
    "index" => "my-data-stream",
    "body" => [
        "query" => "\n    process where (process.name == \"cmd.exe\" and process.pid != 2013)\n  ",
    ],
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"query":"\n    process where (process.name == \"cmd.exe\" and process.pid != 2013)\n  "}' "$ELASTICSEARCH_URL/my-data-stream/_eql/search"
client.eql().search(s -> s
    .index("my-data-stream")
    .query(" process where (process.name == \"cmd.exe\" and process.pid != 2013) ")
);
Request examples
Run `GET /my-data-stream/_eql/search` to search for events that have a `process.name` of `cmd.exe` and a `process.pid` other than `2013`.
{
  "query": """
    process where (process.name == "cmd.exe" and process.pid != 2013)
  """
}
Run `GET /my-data-stream/_eql/search` to search for a sequence of events. The sequence starts with an event with an `event.category` of `file`, a `file.name` of `cmd.exe`, and a `process.pid` other than `2013`. It is followed by an event with an `event.category` of `process` and a `process.executable` that contains the substring `regsvr32`. These events must also share the same `process.pid` value.
{
  "query": """
    sequence by process.pid
      [ file where file.name == "cmd.exe" and process.pid != 2013 ]
      [ process where stringContains(process.executable, "regsvr32") ]
  """
}
Response examples (200)
{
  "is_partial": false,
  "is_running": false,
  "took": 6,
  "timed_out": false,
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "sequences": [
      {
        "join_keys": [
          2012
        ],
        "events": [
          {
            "_index": ".ds-my-data-stream-2099.12.07-000001",
            "_id": "AtOJ4UjUBAAx3XR5kcCM",
            "_source": {
              "@timestamp": "2099-12-06T11:04:07.000Z",
              "event": {
                "category": "file",
                "id": "dGCHwoeS",
                "sequence": 2
              },
              "file": {
                "accessed": "2099-12-07T11:07:08.000Z",
                "name": "cmd.exe",
                "path": "C:\\Windows\\System32\\cmd.exe",
                "type": "file",
                "size": 16384
              },
              "process": {
                "pid": 2012,
                "name": "cmd.exe",
                "executable": "C:\\Windows\\System32\\cmd.exe"
              }
            }
          },
          {
            "_index": ".ds-my-data-stream-2099.12.07-000001",
            "_id": "OQmfCaduce8zoHT93o4H",
            "_source": {
              "@timestamp": "2099-12-07T11:07:09.000Z",
              "event": {
                "category": "process",
                "id": "aR3NWVOs",
                "sequence": 4
              },
              "process": {
                "pid": 2012,
                "name": "regsvr32.exe",
                "command_line": "regsvr32.exe  /s /u /i:https://...RegSvr32.sct scrobj.dll",
                "executable": "C:\\Windows\\System32\\regsvr32.exe"
              }
            }
          }
        ]
      }
    ]
  }
}

ES|QL

The Elasticsearch Query Language (ES|QL) provides a powerful way to filter, transform, and analyze data stored in Elasticsearch, and in the future in other runtimes.

Learn more about ES|QL

Get a specific running ES|QL query information Technical preview

GET /_query/queries/{id}

Returns an object extended information about a running ES|QL query.

Required authorization

  • Cluster privileges: monitor_esql

Path parameters

  • id string Required

    The query ID

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id number Required
    • node string Required
    • start_time_millis number Required
    • running_time_nanos number Required
    • query string Required
    • coordinating_node string Required
    • data_nodes array[string] Required
GET /_query/queries/{id}
curl \
 --request GET 'http://api.example.com/_query/queries/{id}' \
 --header "Authorization: $API_KEY"




Run an ES|QL query Generally available

POST /_query

Get search results for an ES|QL (Elasticsearch query language) query.

External documentation

Query parameters

  • format string

    A short version of the Accept header, e.g. json, yaml.

    Values are csv, json, tsv, txt, yaml, cbor, smile, or arrow.

  • delimiter string

    The character to use between values within a CSV row. Only valid for the CSV format.

  • drop_null_columns boolean

    Should columns that are entirely null be removed from the columns and values portion of the results? Defaults to false. If true then the response will include an extra section under the name all_columns which has the name of all columns.

  • allow_partial_results boolean

    If true, partial results will be returned if there are shard failures, but the query can continue to execute on other clusters and shards. If false, the query will fail if there are any failures.

    To override the default behavior, you can set the esql.query.allow_partial_results cluster setting to false.

application/json

Body Required

  • columnar boolean

    By default, ES|QL returns results as rows. For example, FROM returns each individual document as one row. For the JSON, YAML, CBOR and smile formats, ES|QL can return the results in a columnar fashion where one row represents all the values of a certain column in the results.

  • filter object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • locale string
  • params array[number | string | boolean | null]

    To avoid any attempts of hacking or code injection, extract the values in a separate list of parameters. Use question mark placeholders (?) in the query string for each of the parameters.

  • profile boolean

    If provided and true the response will include an extra profile object with information on how the query was executed. This information is for human debugging and its format can change at any time but it can give some insight into the performance of each part of the query.

  • query string Required

    The ES|QL query API accepts an ES|QL query string in the query parameter, runs it, and returns the results.

  • tables object

    Tables to use with the LOOKUP operation. The top level key is the table name and the next level key is the column name.

    Hide tables attribute Show tables attribute object
  • include_ccs_metadata boolean

    When set to true and performing a cross-cluster query, the response will include an extra _clusters object with information about the clusters that participated in the search along with info such as shards count.

    Default value is false.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • took number

      Time unit for milliseconds

    • is_partial boolean
    • all_columns array[object]
      Hide all_columns attributes Show all_columns attributes object
      • name string Required
      • type string Required
    • columns array[object] Required
      Hide columns attributes Show columns attributes object
      • name string Required
      • type string Required
    • values array[array] Required

      A field value.

      A field value.

    • _clusters object
      Hide _clusters attributes Show _clusters attributes object
      • total number Required
      • successful number Required
      • running number Required
      • skipped number Required
      • partial number Required
      • failed number Required
      • details object Required
        Hide details attribute Show details attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • status string Required

            Values are running, successful, partial, skipped, or failed.

          • indices string Required
          • took number

            Time unit for milliseconds

          • _shards object
            Hide _shards attributes Show _shards attributes object
            • total number Required
            • successful number
            • skipped number
            • failed number
          • failures array[object]
            Hide failures attributes Show failures attributes object
            • shard number Required
            • index
            • node string
            • reason object Required

              Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

    • profile object

      Profiling information. Present if profile was true in the request. The contents of this field are currently unstable.

POST /_query
{
  "query": """
    FROM library,remote-*:library
    | EVAL year = DATE_TRUNC(1 YEARS, release_date)
    | STATS MAX(page_count) BY year
    | SORT year
    | LIMIT 5
  """,
  "include_ccs_metadata": true
}
resp = client.esql.query(
    query="\n    FROM library,remote-*:library\n    | EVAL year = DATE_TRUNC(1 YEARS, release_date)\n    | STATS MAX(page_count) BY year\n    | SORT year\n    | LIMIT 5\n  ",
    include_ccs_metadata=True,
)
const response = await client.esql.query({
  query:
    "\n    FROM library,remote-*:library\n    | EVAL year = DATE_TRUNC(1 YEARS, release_date)\n    | STATS MAX(page_count) BY year\n    | SORT year\n    | LIMIT 5\n  ",
  include_ccs_metadata: true,
});
response = client.esql.query(
  body: {
    "query": "\n    FROM library,remote-*:library\n    | EVAL year = DATE_TRUNC(1 YEARS, release_date)\n    | STATS MAX(page_count) BY year\n    | SORT year\n    | LIMIT 5\n  ",
    "include_ccs_metadata": true
  }
)
$resp = $client->esql()->query([
    "body" => [
        "query" => "\n    FROM library,remote-*:library\n    | EVAL year = DATE_TRUNC(1 YEARS, release_date)\n    | STATS MAX(page_count) BY year\n    | SORT year\n    | LIMIT 5\n  ",
        "include_ccs_metadata" => true,
    ],
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"query":"\n    FROM library,remote-*:library\n    | EVAL year = DATE_TRUNC(1 YEARS, release_date)\n    | STATS MAX(page_count) BY year\n    | SORT year\n    | LIMIT 5\n  ","include_ccs_metadata":true}' "$ELASTICSEARCH_URL/_query"
client.esql().query(q -> q
    .includeCcsMetadata(true)
    .query(" FROM library,remote-*:library | EVAL year = DATE_TRUNC(1 YEARS, release_date) | STATS MAX(page_count) BY year | SORT year | LIMIT 5 ")
);
Request example
Run `POST /_query` to get results for an ES|QL query.
{
  "query": """
    FROM library,remote-*:library
    | EVAL year = DATE_TRUNC(1 YEARS, release_date)
    | STATS MAX(page_count) BY year
    | SORT year
    | LIMIT 5
  """,
  "include_ccs_metadata": true
}

Explore graph analytics Generally available

POST /{index}/_graph/explore

All methods and paths for this operation:

GET /{index}/_graph/explore

POST /{index}/_graph/explore

Extract and summarize information about the documents and terms in an Elasticsearch data stream or index. The easiest way to understand the behavior of this API is to use the Graph UI to explore connections. An initial request to the _explore API contains a seed query that identifies the documents of interest and specifies the fields that define the vertices and connections you want to include in the graph. Subsequent requests enable you to spider out from one more vertices of interest. You can exclude vertices that have already been returned.

External documentation

Path parameters

  • index string | array[string] Required

    Name of the index.

Query parameters

  • routing string

    Custom value used to route operations to a specific shard.

  • timeout string

    Specifies the period of time to wait for a response from each shard. If no response is received before the timeout expires, the request fails and returns an error. Defaults to no timeout.

    Values are -1 or 0.

application/json

Body

  • connections object
    Hide connections attributes Show connections attributes object
    • connections object
    • query object

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • vertices array[object] Required

      Contains the fields you are interested in.

      Hide vertices attributes Show vertices attributes object
      • exclude array[string]

        Prevents the specified terms from being included in the results.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • include array[object]

        Identifies the terms of interest that form the starting points from which you want to spider out.

        Hide include attributes Show include attributes object
        • boost number
        • term string Required
      • min_doc_count number

        Specifies how many documents must contain a pair of terms before it is considered to be a useful connection. This setting acts as a certainty threshold.

        Default value is 3.

      • shard_min_doc_count number

        Controls how many documents on a particular shard have to contain a pair of terms before the connection is returned for global consideration.

        Default value is 2.

      • size number

        Specifies the maximum number of vertex terms returned for each field.

        Default value is 5.

  • controls object
    Hide controls attributes Show controls attributes object
    • sample_diversity object
      Hide sample_diversity attributes Show sample_diversity attributes object
      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • max_docs_per_value number Required
    • sample_size number

      Each hop considers a sample of the best-matching documents on each shard. Using samples improves the speed of execution and keeps exploration focused on meaningfully-connected terms. Very small values (less than 50) might not provide sufficient weight-of-evidence to identify significant connections between terms. Very large sample sizes can dilute the quality of the results and increase execution times.

      Default value is 100.

    • timeout string

      A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • use_significance boolean Required

      Filters associated terms so only those that are significantly associated with your query are included.

  • query object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • vertices array[object]

    Specifies one or more fields that contain the terms you want to include in the graph as vertices.

    Hide vertices attributes Show vertices attributes object
    • exclude array[string]

      Prevents the specified terms from being included in the results.

    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • include array[object]

      Identifies the terms of interest that form the starting points from which you want to spider out.

      Hide include attributes Show include attributes object
      • boost number
      • term string Required
    • min_doc_count number

      Specifies how many documents must contain a pair of terms before it is considered to be a useful connection. This setting acts as a certainty threshold.

      Default value is 3.

    • shard_min_doc_count number

      Controls how many documents on a particular shard have to contain a pair of terms before the connection is returned for global consideration.

      Default value is 2.

    • size number

      Specifies the maximum number of vertex terms returned for each field.

      Default value is 5.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • connections array[object] Required
      Hide connections attributes Show connections attributes object
      • doc_count number Required
      • source number Required
      • target number Required
      • weight number Required
    • failures array[object] Required
      Hide failures attributes Show failures attributes object
      • index string
      • node string
      • reason object Required

        Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

        Hide reason attributes Show reason attributes object
        • type string Required

          The type of error

        • reason string | null

          A human-readable explanation of the error, in English.

        • stack_trace string

          The server stack trace. Present only if the error_trace=true parameter was sent with the request.

        • caused_by object

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

        • root_cause array[object]

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

        • suppressed array[object]

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

      • shard number Required
      • status string
    • timed_out boolean Required
    • took number Required
    • vertices array[object] Required
      Hide vertices attributes Show vertices attributes object
      • depth number Required
      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • term string Required
      • weight number Required
POST clicklogs/_graph/explore
{
  "query": {
    "match": {
      "query.raw": "midi"
    }
  },
  "vertices": [
    {
      "field": "product"
    }
  ],
  "connections": {
    "vertices": [
      {
        "field": "query.raw"
      }
    ]
  }
}
resp = client.graph.explore(
    index="clicklogs",
    query={
        "match": {
            "query.raw": "midi"
        }
    },
    vertices=[
        {
            "field": "product"
        }
    ],
    connections={
        "vertices": [
            {
                "field": "query.raw"
            }
        ]
    },
)
const response = await client.graph.explore({
  index: "clicklogs",
  query: {
    match: {
      "query.raw": "midi",
    },
  },
  vertices: [
    {
      field: "product",
    },
  ],
  connections: {
    vertices: [
      {
        field: "query.raw",
      },
    ],
  },
});
response = client.graph.explore(
  index: "clicklogs",
  body: {
    "query": {
      "match": {
        "query.raw": "midi"
      }
    },
    "vertices": [
      {
        "field": "product"
      }
    ],
    "connections": {
      "vertices": [
        {
          "field": "query.raw"
        }
      ]
    }
  }
)
$resp = $client->graph()->explore([
    "index" => "clicklogs",
    "body" => [
        "query" => [
            "match" => [
                "query.raw" => "midi",
            ],
        ],
        "vertices" => array(
            [
                "field" => "product",
            ],
        ),
        "connections" => [
            "vertices" => array(
                [
                    "field" => "query.raw",
                ],
            ),
        ],
    ],
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"query":{"match":{"query.raw":"midi"}},"vertices":[{"field":"product"}],"connections":{"vertices":[{"field":"query.raw"}]}}' "$ELASTICSEARCH_URL/clicklogs/_graph/explore"
client.graph().explore(e -> e
    .connections(c -> c
        .vertices(v -> v
            .field("query.raw")
        )
    )
    .index("clicklogs")
    .query(q -> q
        .match(m -> m
            .field("query.raw")
            .query(FieldValue.of("midi"))
        )
    )
    .vertices(v -> v
        .field("product")
    )
);
Request example
Run `POST clicklogs/_graph/explore` for a basic exploration An initial graph explore query typically begins with a query to identify strongly related terms. Seed the exploration with a query. This example is searching `clicklogs` for people who searched for the term `midi`.Identify the vertices to include in the graph. This example is looking for product codes that are significantly associated with searches for `midi`. Find the connections. This example is looking for other search terms that led people to click on the products that are associated with searches for `midi`.
{
  "query": {
    "match": {
      "query.raw": "midi"
    }
  },
  "vertices": [
    {
      "field": "product"
    }
  ],
  "connections": {
    "vertices": [
      {
        "field": "query.raw"
      }
    ]
  }
}









Delete component templates Generally available

DELETE /_component_template/{name}

Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.

Required authorization

  • Cluster privileges: manage_index_templates

Path parameters

  • name string | array[string] Required

    Comma-separated list or wildcard expression of component template names used to limit the request.

Query parameters

  • master_timeout string

    Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

    Values are -1 or 0.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

    Values are -1 or 0.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_component_template/{name}
DELETE _component_template/template_1
resp = client.cluster.delete_component_template(
    name="template_1",
)
const response = await client.cluster.deleteComponentTemplate({
  name: "template_1",
});
response = client.cluster.delete_component_template(
  name: "template_1"
)
$resp = $client->cluster()->deleteComponentTemplate([
    "name" => "template_1",
]);
curl -X DELETE -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_component_template/template_1"
client.cluster().deleteComponentTemplate(d -> d
    .name("template_1")
);

Check component templates Generally available

HEAD /_component_template/{name}

Returns information about whether a particular component template exists.

Path parameters

  • name string | array[string] Required

    Comma-separated list of component template names used to limit the request. Wildcard (*) expressions are supported.

Query parameters

  • master_timeout string

    Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

    Values are -1 or 0.

  • local boolean

    If true, the request retrieves information from the local node only. Defaults to false, which means information is retrieved from the master node.

Responses

  • 200 application/json
HEAD /_component_template/{name}
curl \
 --request HEAD 'http://api.example.com/_component_template/{name}' \
 --header "Authorization: $API_KEY"

Add an index block Generally available

PUT /{index}/_block/{block}

Add an index block to an index. Index blocks limit the operations allowed on an index by blocking specific operation types.

Path parameters

  • index string Required

    A comma-separated list or wildcard expression of index names used to limit the request. By default, you must explicitly name the indices you are adding blocks to. To allow the adding of blocks to indices with _all, *, or other wildcard expressions, change the action.destructive_requires_name setting to false. You can update this setting in the elasticsearch.yml file or by using the cluster update settings API.

  • block string

    The block type to add to the index.

    Supported values include:

    • metadata: Disable metadata changes, such as closing the index.
    • read: Disable read operations.
    • read_only: Disable write operations and metadata changes.
    • write: Disable write operations. However, metadata changes are still allowed.

    Values are metadata, read, read_only, or write.

Query parameters

  • allow_no_indices boolean

    If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.

    Values are all, open, closed, hidden, or none.

  • ignore_unavailable boolean

    If false, the request returns an error if it targets a missing or closed index.

  • master_timeout string

    The period to wait for the master node. If the master node is not available before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

    Values are -1 or 0.

  • timeout string

    The period to wait for a response from all relevant nodes in the cluster after updating the cluster metadata. If no response is received before the timeout expires, the cluster metadata update still applies but the response will indicate that it was not completely acknowledged. It can also be set to -1 to indicate that the request should never timeout.

    Values are -1 or 0.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • acknowledged boolean Required
    • shards_acknowledged boolean Required
    • indices array[object] Required
      Hide indices attributes Show indices attributes object
      • name string Required
      • blocked boolean Required
PUT /my-index-000001/_block/write
resp = client.indices.add_block(
    index="my-index-000001",
    block="write",
)
const response = await client.indices.addBlock({
  index: "my-index-000001",
  block: "write",
});
response = client.indices.add_block(
  index: "my-index-000001",
  block: "write"
)
$resp = $client->indices()->addBlock([
    "index" => "my-index-000001",
    "block" => "write",
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/my-index-000001/_block/write"
client.indices().addBlock(a -> a
    .block(IndicesBlockOptions.Write)
    .index("my-index-000001")
);
Response examples (200)
A successful response from `PUT /my-index-000001/_block/write`, which adds an index block to an index.'
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "indices" : [ {
    "name" : "my-index-000001",
    "blocked" : true
  } ]
}

Remove an index block Generally available

DELETE /{index}/_block/{block}

Remove an index block from an index. Index blocks limit the operations allowed on an index by blocking specific operation types.

Required authorization

  • Index privileges: manage

Path parameters

  • index string Required

    A comma-separated list or wildcard expression of index names used to limit the request. By default, you must explicitly name the indices you are removing blocks from. To allow the removal of blocks from indices with _all, *, or other wildcard expressions, change the action.destructive_requires_name setting to false. You can update this setting in the elasticsearch.yml file or by using the cluster update settings API.

  • block string Required

    The block type to remove from the index.

    Supported values include:

    • metadata: Disable metadata changes, such as closing the index.
    • read: Disable read operations.
    • read_only: Disable write operations and metadata changes.
    • write: Disable write operations. However, metadata changes are still allowed.

    Values are metadata, read, read_only, or write.

Query parameters

  • allow_no_indices boolean

    If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.

    Values are all, open, closed, hidden, or none.

  • ignore_unavailable boolean

    If false, the request returns an error if it targets a missing or closed index.

  • master_timeout string

    The period to wait for the master node. If the master node is not available before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

    Values are -1 or 0.

  • timeout string

    The period to wait for a response from all relevant nodes in the cluster after updating the cluster metadata. If no response is received before the timeout expires, the cluster metadata update still applies but the response will indicate that it was not completely acknowledged. It can also be set to -1 to indicate that the request should never timeout.

    Values are -1 or 0.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • acknowledged boolean Required
    • indices array[object] Required
      Hide indices attributes Show indices attributes object
      • name string Required
      • unblocked boolean
      • exception object

        Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

        Hide exception attributes Show exception attributes object
        • type string Required

          The type of error

        • reason string | null

          A human-readable explanation of the error, in English.

        • stack_trace string

          The server stack trace. Present only if the error_trace=true parameter was sent with the request.

        • caused_by object

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

        • root_cause array[object]

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

        • suppressed array[object]

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

DELETE /my-index-000001/_block/write
resp = client.indices.remove_block(
    index="my-index-000001",
    block="write",
)
const response = await client.indices.removeBlock({
  index: "my-index-000001",
  block: "write",
});
response = client.indices.remove_block(
  index: "my-index-000001",
  block: "write"
)
$resp = $client->indices()->removeBlock([
    "index" => "my-index-000001",
    "block" => "write",
]);
curl -X DELETE -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/my-index-000001/_block/write"
client.indices().removeBlock(a -> a
    .block(IndicesBlockOptions.Write)
    .index("my-index-000001")
);
Response examples (200)
A successful response from `DELETE /my-index-000001/_block/write`, which removes an index block from an index.'
{
  "acknowledged" : true,
  "indices" : [ {
    "name" : "my-index-000001",
    "unblocked" : true
  } ]
}

Get tokens from text analysis Generally available

POST /{index}/_analyze

All methods and paths for this operation:

GET /_analyze

POST /_analyze
GET /{index}/_analyze
POST /{index}/_analyze

The analyze API performs analysis on a text string and returns the resulting tokens.

Generating excessive amount of tokens may cause a node to run out of memory. The index.analyze.max_token_count setting enables you to limit the number of tokens that can be produced. If more than this limit of tokens gets generated, an error occurs. The _analyze endpoint without a specified index will always use 10000 as its limit.

Required authorization

  • Index privileges: index
External documentation

Path parameters

  • index string Required

    Index used to derive the analyzer. If specified, the analyzer or field parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer.

Query parameters

  • index string

    Index used to derive the analyzer. If specified, the analyzer or field parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer.

application/json

Body

  • analyzer string

    The name of the analyzer that should be applied to the provided text. This could be a built-in analyzer, or an analyzer that’s been configured in the index.

  • attributes array[string]

    Array of token attributes used to filter the output of the explain parameter.

  • char_filter array

    Array of character filters used to preprocess characters before the tokenizer.

    External documentation
  • explain boolean

    If true, the response includes token attributes and additional details.

    Default value is false.

  • field string

    Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • filter array

    Array of token filters used to apply after the tokenizer.

    External documentation
  • normalizer string

    Normalizer to use to convert text into a single token.

  • text string | array[string]

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • detail object
      Hide detail attributes Show detail attributes object
      • analyzer object
        Hide analyzer attributes Show analyzer attributes object
        • name string Required
        • tokens array[object] Required
          Hide tokens attributes Show tokens attributes object
          • bytes string Required
          • end_offset number Required
          • keyword boolean
          • position number Required
          • positionLength number Required
          • start_offset number Required
          • termFrequency number Required
          • token string Required
          • type string Required
      • charfilters array[object]
        Hide charfilters attributes Show charfilters attributes object
        • filtered_text array[string] Required
        • name string Required
      • custom_analyzer boolean Required
      • tokenfilters array[object]
        Hide tokenfilters attributes Show tokenfilters attributes object
        • name string Required
        • tokens array[object] Required
          Hide tokens attributes Show tokens attributes object
          • bytes string Required
          • end_offset number Required
          • keyword boolean
          • position number Required
          • positionLength number Required
          • start_offset number Required
          • termFrequency number Required
          • token string Required
          • type string Required
      • tokenizer object
        Hide tokenizer attributes Show tokenizer attributes object
        • name string Required
        • tokens array[object] Required
          Hide tokens attributes Show tokens attributes object
          • bytes string Required
          • end_offset number Required
          • keyword boolean
          • position number Required
          • positionLength number Required
          • start_offset number Required
          • termFrequency number Required
          • token string Required
          • type string Required
    • tokens array[object]
      Hide tokens attributes Show tokens attributes object
      • end_offset number Required
      • position number Required
      • positionLength number
      • start_offset number Required
      • token string Required
      • type string Required
GET /_analyze
{
  "analyzer": "standard",
  "text": "this is a test"
}
resp = client.indices.analyze(
    analyzer="standard",
    text="this is a test",
)
const response = await client.indices.analyze({
  analyzer: "standard",
  text: "this is a test",
});
response = client.indices.analyze(
  body: {
    "analyzer": "standard",
    "text": "this is a test"
  }
)
$resp = $client->indices()->analyze([
    "body" => [
        "analyzer" => "standard",
        "text" => "this is a test",
    ],
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"analyzer":"standard","text":"this is a test"}' "$ELASTICSEARCH_URL/_analyze"
client.indices().analyze(a -> a
    .analyzer("standard")
    .text("this is a test")
);
You can apply any of the built-in analyzers to the text string without specifying an index.
{
  "analyzer": "standard",
  "text": "this is a test"
}
If the text parameter is provided as array of strings, it is analyzed as a multi-value field.
{
  "analyzer": "standard",
  "text": [
    "this is a test",
    "the second text"
  ]
}
You can test a custom transient analyzer built from tokenizers, token filters, and char filters. Token filters use the filter parameter.
{
  "tokenizer": "keyword",
  "filter": [
    "lowercase"
  ],
  "char_filter": [
    "html_strip"
  ],
  "text": "this is a <b>test</b>"
}
Custom tokenizers, token filters, and character filters can be specified in the request body.
{
  "tokenizer": "whitespace",
  "filter": [
    "lowercase",
    {
      "type": "stop",
      "stopwords": [
        "a",
        "is",
        "this"
      ]
    }
  ],
  "text": "this is a test"
}
Run `GET /analyze_sample/_analyze` to run an analysis on the text using the default index analyzer associated with the `analyze_sample` index. Alternatively, the analyzer can be derived based on a field mapping.
{
  "field": "obj1.field1",
  "text": "this is a test"
}
Run `GET /analyze_sample/_analyze` and supply a normalizer for a keyword field if there is a normalizer associated with the specified index.
{
  "normalizer": "my_normalizer",
  "text": "BaR"
}
If you want to get more advanced details, set `explain` to `true`. It will output all token attributes for each token. You can filter token attributes you want to output by setting the `attributes` option. NOTE: The format of the additional detail information is labelled as experimental in Lucene and it may change in the future.
{
  "tokenizer": "standard",
  "filter": [
    "snowball"
  ],
  "text": "detailed output",
  "explain": true,
  "attributes": [
    "keyword"
  ]
}
Response examples (200)
A successful response for an analysis with `explain` set to `true`.
{
  "detail": {
    "custom_analyzer": true,
    "charfilters": [],
    "tokenizer": {
      "name": "standard",
      "tokens": [
        {
          "token": "detailed",
          "start_offset": 0,
          "end_offset": 8,
          "type": "<ALPHANUM>",
          "position": 0
        },
        {
          "token": "output",
          "start_offset": 9,
          "end_offset": 15,
          "type": "<ALPHANUM>",
          "position": 1
        }
      ]
    },
    "tokenfilters": [
      {
        "name": "snowball",
        "tokens": [
          {
            "token": "detail",
            "start_offset": 0,
            "end_offset": 8,
            "type": "<ALPHANUM>",
            "position": 0,
            "keyword": false
          },
          {
            "token": "output",
            "start_offset": 9,
            "end_offset": 15,
            "type": "<ALPHANUM>",
            "position": 1,
            "keyword": false
          }
        ]
      }
    ]
  }
}

Get index information Generally available

GET /{index}

Get information about one or more indices. For data streams, the API returns information about the stream’s backing indices.

Required authorization

  • Index privileges: view_index_metadata,manage

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and index aliases used to limit the request. Wildcard expressions (*) are supported.

Query parameters

  • allow_no_indices boolean

    If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • expand_wildcards string | array[string]

    Type of index that wildcard expressions can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.

    Values are all, open, closed, hidden, or none.

  • flat_settings boolean

    If true, returns settings in flat format.

  • ignore_unavailable boolean

    If false, requests that target a missing index return an error.

  • include_defaults boolean

    If true, return all default settings in the response.

  • local boolean

    If true, the request retrieves information from the local node only. Defaults to false, which means information is retrieved from the master node.

  • master_timeout string

    Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

    Values are -1 or 0.

  • features string | array[string] Generally available

    Return only information on specified index features

    Supported values include: aliases, mappings, settings

    Values are aliases, mappings, or settings.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object
      Hide * attributes Show * attributes object
      • aliases object
        Hide aliases attribute Show aliases attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • filter object

            An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

            External documentation
          • index_routing string
          • is_hidden boolean

            If true, the alias is hidden. All indices for the alias must have the same is_hidden value.

            Default value is false.

          • is_write_index boolean

            If true, the index is the write index for the alias.

            Default value is false.

          • routing string
          • search_routing string
      • mappings object
        Hide mappings attributes Show mappings attributes object
        • all_field object
          Hide all_field attributes Show all_field attributes object
          • analyzer string Required
          • enabled boolean Required
          • omit_norms boolean Required
          • search_analyzer string Required
          • similarity string Required
          • store boolean Required
          • store_term_vector_offsets boolean Required
          • store_term_vector_payloads boolean Required
          • store_term_vector_positions boolean Required
          • store_term_vectors boolean Required
        • date_detection boolean
        • dynamic string

          Values are strict, runtime, true, or false.

        • dynamic_date_formats array[string]
        • dynamic_templates array[object]
        • _field_names object
          Hide _field_names attribute Show _field_names attribute object
          • enabled boolean Required
        • index_field object
          Hide index_field attribute Show index_field attribute object
          • enabled boolean Required
        • _meta object
          Hide _meta attribute Show _meta attribute object
          • * object Additional properties
        • numeric_detection boolean
        • properties object
        • _routing object
          Hide _routing attribute Show _routing attribute object
          • required boolean Required
        • _size object
          Hide _size attribute Show _size attribute object
          • enabled boolean Required
        • _source object
          Hide _source attributes Show _source attributes object
          • compress boolean
          • compress_threshold string
          • enabled boolean
          • excludes array[string]
          • includes array[string]
          • mode string

            Values are disabled, stored, or synthetic.

        • runtime object
          Hide runtime attribute Show runtime attribute object
          • * object Additional properties
            Hide * attributes Show * attributes object
            • fields object

              For type composite

              Hide fields attribute Show fields attribute object
              • * object Additional properties
            • fetch_fields array[object]

              For type lookup

            • format string

              A custom format for date type runtime fields.

            • input_field string

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • target_field string

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • target_index string
            • script object
              Hide script attributes Show script attributes object
              • source
              • id string
              • params object

                Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

              • lang
              • options object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • enabled boolean
        • subobjects string

          Values are true or false.

        • _data_stream_timestamp object
          Hide _data_stream_timestamp attribute Show _data_stream_timestamp attribute object
          • enabled boolean Required
      • settings object Additional properties
        Index settings
      • defaults object Additional properties
        Index settings
      • data_stream string
      • lifecycle object

        Data stream lifecycle denotes that a data stream is managed by the data stream lifecycle and contains the configuration.

        Hide lifecycle attributes Show lifecycle attributes object
        • data_retention string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • downsampling object
          Hide downsampling attribute Show downsampling attribute object
          • rounds array[object] Required

            The list of downsampling rounds to execute as part of this downsampling configuration

            Hide rounds attributes Show rounds attributes object
            • after string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • config object Required
        • enabled boolean

          If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

          Default value is true.

GET /my-index-000001
resp = client.indices.get(
    index="my-index-000001",
)
const response = await client.indices.get({
  index: "my-index-000001",
});
response = client.indices.get(
  index: "my-index-000001"
)
$resp = $client->indices()->get([
    "index" => "my-index-000001",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/my-index-000001"
client.indices().get(g -> g
    .index("my-index-000001")
);

Create an index Generally available

PUT /{index}

You can use the create index API to add a new index to an Elasticsearch cluster. When creating an index, you can specify the following:

  • Settings for the index.
  • Mappings for fields in the index.
  • Index aliases

Wait for active shards

By default, index creation will only return a response to the client when the primary copies of each shard have been started, or the request times out. The index creation response will indicate what happened. For example, acknowledged indicates whether the index was successfully created in the cluster, while shards_acknowledged indicates whether the requisite number of shard copies were started for each shard in the index before timing out. Note that it is still possible for either acknowledged or shards_acknowledged to be false, but for the index creation to be successful. These values simply indicate whether the operation completed before the timeout. If acknowledged is false, the request timed out before the cluster state was updated with the newly created index, but it probably will be created sometime soon. If shards_acknowledged is false, then the request timed out before the requisite number of shards were started (by default just the primaries), even if the cluster state was successfully updated to reflect the newly created index (that is to say, acknowledged is true).

You can change the default of only waiting for the primary shards to start through the index setting index.write.wait_for_active_shards. Note that changing this setting will also affect the wait_for_active_shards value on all subsequent write operations.

Required authorization

  • Index privileges: create_index,manage

Path parameters

  • index string Required

    Name of the index you wish to create. Index names must meet the following criteria:

    • Lowercase only
    • Cannot include \, /, *, ?, ", <, >, |, (space character), ,, or #
    • Indices prior to 7.0 could contain a colon (:), but that has been deprecated and will not be supported in later versions
    • Cannot start with -, _, or +
    • Cannot be . or ..
    • Cannot be longer than 255 bytes (note thtat it is bytes, so multi-byte characters will reach the limit faster)
    • Names starting with . are deprecated, except for hidden indices and internal indices managed by plugins

Query parameters

  • master_timeout string

    Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

    Values are -1 or 0.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

    Values are -1 or 0.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1).

    Values are all or index-setting.

application/json

Body

  • aliases object

    Aliases for the index.

    Hide aliases attribute Show aliases attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • filter object

        An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

        External documentation
      • index_routing string
      • is_hidden boolean

        If true, the alias is hidden. All indices for the alias must have the same is_hidden value.

        Default value is false.

      • is_write_index boolean

        If true, the index is the write index for the alias.

        Default value is false.

      • routing string
      • search_routing string
  • mappings object
    Hide mappings attributes Show mappings attributes object
    • all_field object
      Hide all_field attributes Show all_field attributes object
      • analyzer string Required
      • enabled boolean Required
      • omit_norms boolean Required
      • search_analyzer string Required
      • similarity string Required
      • store boolean Required
      • store_term_vector_offsets boolean Required
      • store_term_vector_payloads boolean Required
      • store_term_vector_positions boolean Required
      • store_term_vectors boolean Required
    • date_detection boolean
    • dynamic string

      Values are strict, runtime, true, or false.

    • dynamic_date_formats array[string]
    • dynamic_templates array[object]
    • _field_names object
      Hide _field_names attribute Show _field_names attribute object
      • enabled boolean Required
    • index_field object
      Hide index_field attribute Show index_field attribute object
      • enabled boolean Required
    • _meta object
      Hide _meta attribute Show _meta attribute object
      • * object Additional properties
    • numeric_detection boolean
    • properties object
    • _routing object
      Hide _routing attribute Show _routing attribute object
      • required boolean Required
    • _size object
      Hide _size attribute Show _size attribute object
      • enabled boolean Required
    • _source object
      Hide _source attributes Show _source attributes object
      • compress boolean
      • compress_threshold string
      • enabled boolean
      • excludes array[string]
      • includes array[string]
      • mode string

        Values are disabled, stored, or synthetic.

    • runtime object
      Hide runtime attribute Show runtime attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • input_field string

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • target_field string

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • target_index string
        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • enabled boolean
    • subobjects string

      Values are true or false.

    • _data_stream_timestamp object
      Hide _data_stream_timestamp attribute Show _data_stream_timestamp attribute object
      • enabled boolean Required
  • settings object Additional properties
    Index settings

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • index string Required
    • shards_acknowledged boolean Required
    • acknowledged boolean Required
PUT /my-index-000001
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 2
  }
}
resp = client.indices.create(
    index="my-index-000001",
    settings={
        "number_of_shards": 3,
        "number_of_replicas": 2
    },
)
const response = await client.indices.create({
  index: "my-index-000001",
  settings: {
    number_of_shards: 3,
    number_of_replicas: 2,
  },
});
response = client.indices.create(
  index: "my-index-000001",
  body: {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 2
    }
  }
)
$resp = $client->indices()->create([
    "index" => "my-index-000001",
    "body" => [
        "settings" => [
            "number_of_shards" => 3,
            "number_of_replicas" => 2,
        ],
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"settings":{"number_of_shards":3,"number_of_replicas":2}}' "$ELASTICSEARCH_URL/my-index-000001"
client.indices().create(c -> c
    .index("my-index-000001")
    .settings(s -> s
        .numberOfShards("3")
        .numberOfReplicas("2")
    )
);
This request specifies the `number_of_shards` and `number_of_replicas`.
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 2
  }
}
You can provide mapping definitions in the create index API requests.
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "field1": { "type": "text" }
    }
  }
}
You can provide mapping definitions in the create index API requests. Index alias names also support date math.
{
  "aliases": {
    "alias_1": {},
    "alias_2": {
      "filter": {
        "term": {
          "user.id": "kimchy"
        }
      },
      "routing": "shard-1"
    }
  }
}
















Get index templates Generally available

GET /_index_template/{name}

All methods and paths for this operation:

GET /_index_template

GET /_index_template/{name}

Get information about one or more index templates.

Required authorization

  • Cluster privileges: manage_index_templates

Path parameters

  • name string Required

    Comma-separated list of index template names used to limit the request. Wildcard (*) expressions are supported.

Query parameters

  • local boolean

    If true, the request retrieves information from the local node only. Defaults to false, which means information is retrieved from the master node.

  • flat_settings boolean

    If true, returns settings in flat format.

  • master_timeout string

    Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

    Values are -1 or 0.

  • include_defaults boolean Generally available

    If true, returns all relevant default configurations for the index template.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • index_templates array[object] Required
      Hide index_templates attributes Show index_templates attributes object
      • name string Required
      • index_template object Required
        Hide index_template attributes Show index_template attributes object
        • index_patterns string | array[string] Required
        • composed_of array[string] Required

          An ordered list of component template names. Component templates are merged in the order specified, meaning that the last component template specified has the highest precedence.

        • template object
          Hide template attributes Show template attributes object
          • aliases object

            Aliases to add. If the index template includes a data_stream object, these are data stream aliases. Otherwise, these are index aliases. Data stream aliases ignore the index_routing, routing, and search_routing options.

            Hide aliases attribute Show aliases attribute object
            • * object Additional properties
          • mappings object
            Hide mappings attributes Show mappings attributes object
            • all_field object
            • date_detection boolean
            • dynamic string

              Values are strict, runtime, true, or false.

            • dynamic_date_formats array[string]
            • dynamic_templates array[object]
            • _field_names object
            • index_field object
            • _meta object
            • numeric_detection boolean
            • properties object
            • _routing object
            • _size object
            • _source object
            • runtime object
            • enabled boolean
            • subobjects string

              Values are true or false.

            • _data_stream_timestamp object
          • settings object Additional properties
            Index settings
          • lifecycle object
          • data_stream_options object | string | null

            One of:

            Data stream options template contains the same information as DataStreamOptions but allows them to be set explicitly to null.

        • version number
        • priority number

          Priority to determine index template precedence when a new data stream or index is created. The index template with the highest priority is chosen. If no priority is specified the template is treated as though it is of priority 0 (lowest priority). This number is not automatically generated by Elasticsearch.

        • _meta object
          Hide _meta attribute Show _meta attribute object
          • * object Additional properties
        • allow_auto_create boolean
        • data_stream object
          Hide data_stream attributes Show data_stream attributes object
          • hidden boolean

            If true, the data stream is hidden.

            Default value is false.

          • allow_custom_routing boolean

            If true, the data stream supports custom routing.

            Default value is false.

        • deprecated boolean Generally available

          Marks this index template as deprecated. When creating or updating a non-deprecated index template that uses deprecated components, Elasticsearch will emit a deprecation warning.

        • ignore_missing_component_templates string | array[string]
GET _index_template/*?filter_path=index_templates.name,index_templates.index_template.index_patterns,index_templates.index_template.data_stream
resp = client.indices.get_index_template(
    name="*",
    filter_path="index_templates.name,index_templates.index_template.index_patterns,index_templates.index_template.data_stream",
)
const response = await client.indices.getIndexTemplate({
  name: "*",
  filter_path:
    "index_templates.name,index_templates.index_template.index_patterns,index_templates.index_template.data_stream",
});
response = client.indices.get_index_template(
  name: "*",
  filter_path: "index_templates.name,index_templates.index_template.index_patterns,index_templates.index_template.data_stream"
)
$resp = $client->indices()->getIndexTemplate([
    "name" => "*",
    "filter_path" => "index_templates.name,index_templates.index_template.index_patterns,index_templates.index_template.data_stream",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_index_template/*?filter_path=index_templates.name,index_templates.index_template.index_patterns,index_templates.index_template.data_stream"












Get aliases Generally available

GET /{index}/_alias/{name}

All methods and paths for this operation:

GET /_alias

GET /_alias/{name}
GET /{index}/_alias
GET /{index}/_alias/{name}

Retrieves information for one or more data stream or index aliases.

Required authorization

  • Index privileges: view_index_metadata

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams or indices used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

  • name string | array[string] Required

    Comma-separated list of aliases to retrieve. Supports wildcards (*). To retrieve all aliases, omit this parameter or use * or _all.

Query parameters

  • allow_no_indices boolean

    If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.

    Values are all, open, closed, hidden, or none.

  • ignore_unavailable boolean

    If false, the request returns an error if it targets a missing or closed index.

  • master_timeout string

    Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

    Values are -1 or 0.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attribute Show * attribute object
      • aliases object Required
        Hide aliases attribute Show aliases attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • filter object

            An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

            External documentation
          • index_routing string

            Value used to route indexing operations to a specific shard. If specified, this overwrites the routing value for indexing operations.

          • is_write_index boolean

            If true, the index is the write index for the alias.

            Default value is false.

          • routing string

            Value used to route indexing and search operations to a specific shard.

          • search_routing string

            Value used to route search operations to a specific shard. If specified, this overwrites the routing value for search operations.

          • is_hidden boolean Generally available

            If true, the alias is hidden. All indices for the alias must have the same is_hidden value.

            Default value is false.

GET _alias
resp = client.indices.get_alias()
const response = await client.indices.getAlias();
response = client.indices.get_alias
$resp = $client->indices()->getAlias();
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_alias"
client.indices().getAlias(g -> g);
































Simulate an index Generally available

POST /_index_template/_simulate_index/{name}

Get the index configuration that would be applied to the specified index from an existing index template.

Required authorization

  • Cluster privileges: manage_index_templates

Path parameters

  • name string Required

    Name of the index to simulate

Query parameters

  • create boolean

    Whether the index template we optionally defined in the body should only be dry-run added if new or can also replace an existing one

  • cause string

    User defined reason for dry-run creating the new template for simulation purposes

  • master_timeout string

    Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

    Values are -1 or 0.

  • include_defaults boolean Generally available

    If true, returns all relevant default configurations for the index template.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • overlapping array[object]
      Hide overlapping attributes Show overlapping attributes object
      • name string Required
      • index_patterns array[string] Required
    • template object Required
      Hide template attributes Show template attributes object
      • aliases object Required
        Hide aliases attribute Show aliases attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • filter object

            An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

            External documentation
          • index_routing string
          • is_hidden boolean

            If true, the alias is hidden. All indices for the alias must have the same is_hidden value.

            Default value is false.

          • is_write_index boolean

            If true, the index is the write index for the alias.

            Default value is false.

          • routing string
          • search_routing string
      • mappings object Required
        Hide mappings attributes Show mappings attributes object
        • all_field object
          Hide all_field attributes Show all_field attributes object
          • analyzer string Required
          • enabled boolean Required
          • omit_norms boolean Required
          • search_analyzer string Required
          • similarity string Required
          • store boolean Required
          • store_term_vector_offsets boolean Required
          • store_term_vector_payloads boolean Required
          • store_term_vector_positions boolean Required
          • store_term_vectors boolean Required
        • date_detection boolean
        • dynamic string

          Values are strict, runtime, true, or false.

        • dynamic_date_formats array[string]
        • dynamic_templates array[object]
        • _field_names object
          Hide _field_names attribute Show _field_names attribute object
          • enabled boolean Required
        • index_field object
          Hide index_field attribute Show index_field attribute object
          • enabled boolean Required
        • _meta object
          Hide _meta attribute Show _meta attribute object
          • * object Additional properties
        • numeric_detection boolean
        • properties object
        • _routing object
          Hide _routing attribute Show _routing attribute object
          • required boolean Required
        • _size object
          Hide _size attribute Show _size attribute object
          • enabled boolean Required
        • _source object
          Hide _source attributes Show _source attributes object
          • compress boolean
          • compress_threshold string
          • enabled boolean
          • excludes array[string]
          • includes array[string]
          • mode string

            Values are disabled, stored, or synthetic.

        • runtime object
          Hide runtime attribute Show runtime attribute object
          • * object Additional properties
            Hide * attributes Show * attributes object
            • fields object

              For type composite

              Hide fields attribute Show fields attribute object
              • * object Additional properties
            • fetch_fields array[object]

              For type lookup

            • format string

              A custom format for date type runtime fields.

            • input_field string

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • target_field string

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • target_index string
            • script object
              Hide script attributes Show script attributes object
              • source
              • id string
              • params object

                Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

              • lang
              • options object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • enabled boolean
        • subobjects string

          Values are true or false.

        • _data_stream_timestamp object
          Hide _data_stream_timestamp attribute Show _data_stream_timestamp attribute object
          • enabled boolean Required
      • settings object Required Additional properties
        Index settings
POST /_index_template/_simulate_index/{name}
POST /_index_template/_simulate_index/my-index-000001
resp = client.indices.simulate_index_template(
    name="my-index-000001",
)
const response = await client.indices.simulateIndexTemplate({
  name: "my-index-000001",
});
response = client.indices.simulate_index_template(
  name: "my-index-000001"
)
$resp = $client->indices()->simulateIndexTemplate([
    "name" => "my-index-000001",
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_index_template/_simulate_index/my-index-000001"
client.indices().simulateIndexTemplate(s -> s
    .name("my-index-000001")
);
Response examples (200)
A successful response from `POST /_index_template/_simulate_index/my-index-000001`.
{
  "template" : {
    "settings" : {
      "index" : {
        "number_of_shards" : "2",
        "number_of_replicas" : "0",
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        }
      }
    },
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        }
      }
    },
    "aliases" : { }
  },
  "overlapping" : [
    {
      "name" : "template_1",
      "index_patterns" : [
        "my-index-*"
      ]
    }
  ]
}












Inference

Inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

















Perform inference on the service Generally available

POST /_inference/{task_type}/{inference_id}

All methods and paths for this operation:

POST /_inference/{inference_id}

POST /_inference/{task_type}/{inference_id}

This API enables you to use machine learning models to perform specific tasks on data that you provide as an input. It returns a response with the results of the tasks. The inference endpoint you use can perform one specific task that has been defined when the endpoint was created with the create inference API.

For details about using this API with a service, such as Amazon Bedrock, Anthropic, or HuggingFace, refer to the service-specific documentation.


The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

Required authorization

  • Cluster privileges: monitor_inference

Path parameters

  • task_type string Required

    The type of inference task that the model performs.

    Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

  • inference_id string Required

    The unique identifier for the inference endpoint.

Query parameters

  • timeout string

    The amount of time to wait for the inference request to complete.

    Values are -1 or 0.

application/json

Body

  • query string

    The query input, which is required only for the rerank task. It is not required for other tasks.

  • input string | array[string] Required

    The text on which you want to perform the inference task. It can be a single string or an array.


    Inference endpoints for the completion task type currently only support a single string as input.

  • input_type string

    Specifies the input data type for the text embedding model. The input_type parameter only applies to Inference Endpoints with the text_embedding task type. Possible values include:

    • SEARCH
    • INGEST
    • CLASSIFICATION
    • CLUSTERING Not all services support all values. Unsupported values will trigger a validation exception. Accepted values depend on the configured inference service, refer to the relevant service-specific documentation for more info.


    The input_type parameter specified on the root level of the request body will take precedence over the input_type parameter specified in task_settings.

  • task_settings object

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • text_embedding_bytes array[object]

      The text embedding result object for byte representation

      Hide text_embedding_bytes attribute Show text_embedding_bytes attribute object
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding_bits array[object]

      The text embedding result object for byte representation

      Hide text_embedding_bits attribute Show text_embedding_bits attribute object
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding array[object]

      The text embedding result object

      Hide text_embedding attribute Show text_embedding attribute object
      • embedding array[number] Required

        Text Embedding results are represented as Dense Vectors of floats.

    • sparse_embedding array[object]
      Hide sparse_embedding attribute Show sparse_embedding attribute object
      • embedding object Required

        Sparse Embedding tokens are represented as a dictionary of string to double.

        Hide embedding attribute Show embedding attribute object
        • * number Additional properties
    • completion array[object]

      The completion result object

      Hide completion attribute Show completion attribute object
      • result string Required
    • rerank array[object]

      The rerank result object representing a single ranked document id: the original index of the document in the request relevance_score: the relevance_score of the document relative to the query text: Optional, the text of the document, if requested

      Hide rerank attributes Show rerank attributes object
      • index number Required
      • relevance_score number Required
      • text string
POST /_inference/{task_type}/{inference_id}
curl \
 --request POST 'http://api.example.com/_inference/{task_type}/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"query":"string","input":"string","input_type":"string","task_settings":{}}'

Delete an inference endpoint Generally available

DELETE /_inference/{task_type}/{inference_id}

All methods and paths for this operation:

DELETE /_inference/{inference_id}

DELETE /_inference/{task_type}/{inference_id}

Path parameters

  • task_type string Required

    The task type

    Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

  • inference_id string Required

    The inference identifier.

Query parameters

  • dry_run boolean

    When true, the endpoint is not deleted and a list of ingest processors which reference this endpoint is returned.

  • force boolean

    When true, the inference endpoint is forcefully deleted even if it is still being used by ingest processors or semantic text fields.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object

    Acknowledged response. For dry_run, contains the list of pipelines which reference the inference endpoint

    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

    • pipelines array[string] Required
DELETE /_inference/{task_type}/{inference_id}
DELETE /_inference/sparse_embedding/my-elser-model
resp = client.inference.delete(
    task_type="sparse_embedding",
    inference_id="my-elser-model",
)
const response = await client.inference.delete({
  task_type: "sparse_embedding",
  inference_id: "my-elser-model",
});
response = client.inference.delete(
  task_type: "sparse_embedding",
  inference_id: "my-elser-model"
)
$resp = $client->inference()->delete([
    "task_type" => "sparse_embedding",
    "inference_id" => "my-elser-model",
]);
curl -X DELETE -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_inference/sparse_embedding/my-elser-model"
client.inference().delete(d -> d
    .inferenceId("my-elser-model")
    .taskType(TaskType.SparseEmbedding)
);








Create an Anthropic inference endpoint Generally available

PUT /_inference/{task_type}/{anthropic_inference_id}

Create an inference endpoint to perform an inference task with the anthropic service.

Required authorization

  • Cluster privileges: manage_inference

Path parameters

  • task_type string

    The task type. The only valid task type for the model to perform is completion.

    Value is completion.

  • anthropic_inference_id string Required

    The unique identifier of the inference endpoint.

Query parameters

  • timeout string

    Specifies the amount of time to wait for the inference endpoint to be created.

    Values are -1 or 0.

application/json

Body

  • chunking_settings object

    Chunking configuration object

    Hide chunking_settings attributes Show chunking_settings attributes object
    • max_chunk_size number

      The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      Default value is 250.

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      Default value is 100.

    • sentence_overlap number

      The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      Default value is 1.

    • separator_group string Required

      This parameter is only applicable when using the recursive chunking strategy.

      Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

      Using this parameter is an alternative to manually specifying a custom separators list.

    • separators array[string] Required

      A list of strings used as possible split points when chunking text with the recursive strategy.

      Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

      After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

    • strategy string

      The chunking strategy: sentence, word, none or recursive.

      • If strategy is set to recursive, you must also specify:

        • max_chunk_size
        • either separators orseparator_group

      Learn more about different chunking strategies in the linked documentation.

      Default value is sentence.

      External documentation
  • service string Required

    Value is anthropic.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key for the Anthropic API.

    • model_id string Required

      The name of the model to use for the inference task. Refer to the Anthropic documentation for the list of supported models.

    • rate_limit object

      This setting helps to minimize the number of rate limit errors returned from the service.

      Hide rate_limit attribute Show rate_limit attribute object
      • requests_per_minute number

        The number of requests allowed per minute. By default, the number of requests allowed per minute is set by each service as follows:

        • alibabacloud-ai-search service: 1000
        • anthropic service: 50
        • azureaistudio service: 240
        • azureopenai service and task type text_embedding: 1440
        • azureopenai service and task type completion: 120
        • cohere service: 10000
        • elastic service and task type chat_completion: 240
        • googleaistudio service: 360
        • googlevertexai service: 30000
        • hugging_face service: 3000
        • jinaai service: 2000
        • mistral service: 240
        • openai service and task type text_embedding: 3000
        • openai service and task type completion: 500
        • voyageai service: 2000
        • watsonxai service: 120
  • task_settings object
    Hide task_settings attributes Show task_settings attributes object
    • max_tokens number Required

      For a completion task, it is the maximum number of tokens to generate before stopping.

    • temperature number

      For a completion task, it is the amount of randomness injected into the response. For more details about the supported range, refer to Anthropic documentation.

      External documentation
    • top_k number

      For a completion task, it specifies to only sample from the top K options for each subsequent token. It is recommended for advanced use cases only. You usually only need to use temperature.

    • top_p number

      For a completion task, it specifies to use Anthropic's nucleus sampling. In nucleus sampling, Anthropic computes the cumulative distribution over all the options for each subsequent token in decreasing probability order and cuts it off once it reaches the specified probability. You should either alter temperature or top_p, but not both. It is recommended for advanced use cases only. You usually only need to use temperature.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • chunking_settings object

      Chunking configuration object

      Hide chunking_settings attributes Show chunking_settings attributes object
      • max_chunk_size number

        The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

        Default value is 250.

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

        Default value is 100.

      • sentence_overlap number

        The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

        Default value is 1.

      • separator_group string Required

        This parameter is only applicable when using the recursive chunking strategy.

        Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

        Using this parameter is an alternative to manually specifying a custom separators list.

      • separators array[string] Required

        A list of strings used as possible split points when chunking text with the recursive strategy.

        Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

        After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

      • strategy string

        The chunking strategy: sentence, word, none or recursive.

        • If strategy is set to recursive, you must also specify:

          • max_chunk_size
          • either separators orseparator_group

        Learn more about different chunking strategies in the linked documentation.

        Default value is sentence.

        External documentation
    • service string Required

      The service type

    • service_settings object Required
    • task_settings object
    • inference_id string Required

      The inference Id

    • task_type string Required

      Value is completion.

PUT /_inference/{task_type}/{anthropic_inference_id}
PUT _inference/completion/anthropic_completion
{
    "service": "anthropic",
    "service_settings": {
        "api_key": "Anthropic-Api-Key",
        "model_id": "Model-ID"
    },
    "task_settings": {
        "max_tokens": 1024
    }
}
resp = client.inference.put(
    task_type="completion",
    inference_id="anthropic_completion",
    inference_config={
        "service": "anthropic",
        "service_settings": {
            "api_key": "Anthropic-Api-Key",
            "model_id": "Model-ID"
        },
        "task_settings": {
            "max_tokens": 1024
        }
    },
)
const response = await client.inference.put({
  task_type: "completion",
  inference_id: "anthropic_completion",
  inference_config: {
    service: "anthropic",
    service_settings: {
      api_key: "Anthropic-Api-Key",
      model_id: "Model-ID",
    },
    task_settings: {
      max_tokens: 1024,
    },
  },
});
response = client.inference.put(
  task_type: "completion",
  inference_id: "anthropic_completion",
  body: {
    "service": "anthropic",
    "service_settings": {
      "api_key": "Anthropic-Api-Key",
      "model_id": "Model-ID"
    },
    "task_settings": {
      "max_tokens": 1024
    }
  }
)
$resp = $client->inference()->put([
    "task_type" => "completion",
    "inference_id" => "anthropic_completion",
    "body" => [
        "service" => "anthropic",
        "service_settings" => [
            "api_key" => "Anthropic-Api-Key",
            "model_id" => "Model-ID",
        ],
        "task_settings" => [
            "max_tokens" => 1024,
        ],
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"service":"anthropic","service_settings":{"api_key":"Anthropic-Api-Key","model_id":"Model-ID"},"task_settings":{"max_tokens":1024}}' "$ELASTICSEARCH_URL/_inference/completion/anthropic_completion"
client.inference().put(p -> p
    .inferenceId("anthropic_completion")
    .taskType(TaskType.Completion)
    .inferenceConfig(i -> i
        .service("anthropic")
        .serviceSettings(JsonData.fromJson("{\"api_key\":\"Anthropic-Api-Key\",\"model_id\":\"Model-ID\"}"))
        .taskSettings(JsonData.fromJson("{\"max_tokens\":1024}"))
    )
);
Request example
Run `PUT _inference/completion/anthropic_completion` to create an inference endpoint that performs a completion task.
{
    "service": "anthropic",
    "service_settings": {
        "api_key": "Anthropic-Api-Key",
        "model_id": "Model-ID"
    },
    "task_settings": {
        "max_tokens": 1024
    }
}




Create an Azure OpenAI inference endpoint Generally available

PUT /_inference/{task_type}/{azureopenai_inference_id}

Create an inference endpoint to perform an inference task with the azureopenai service.

The list of chat completion models that you can choose from in your Azure OpenAI deployment include:

The list of embeddings models that you can choose from in your deployment can be found in the Azure models documentation.

Required authorization

  • Cluster privileges: manage_inference

Path parameters

  • task_type string

    The type of the inference task that the model will perform. NOTE: The chat_completion task type only supports streaming and only through the _stream API.

    Values are completion or text_embedding.

  • azureopenai_inference_id string Required

    The unique identifier of the inference endpoint.

Query parameters

  • timeout string

    Specifies the amount of time to wait for the inference endpoint to be created.

    Values are -1 or 0.

application/json

Body

  • chunking_settings object

    Chunking configuration object

    Hide chunking_settings attributes Show chunking_settings attributes object
    • max_chunk_size number

      The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      Default value is 250.

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      Default value is 100.

    • sentence_overlap number

      The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      Default value is 1.

    • separator_group string Required

      This parameter is only applicable when using the recursive chunking strategy.

      Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

      Using this parameter is an alternative to manually specifying a custom separators list.

    • separators array[string] Required

      A list of strings used as possible split points when chunking text with the recursive strategy.

      Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

      After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

    • strategy string

      The chunking strategy: sentence, word, none or recursive.

      • If strategy is set to recursive, you must also specify:

        • max_chunk_size
        • either separators orseparator_group

      Learn more about different chunking strategies in the linked documentation.

      Default value is sentence.

      External documentation
  • service string Required

    Value is azureopenai.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string

      A valid API key for your Azure OpenAI account. You must specify either api_key or entra_id. If you do not provide either or you provide both, you will receive an error when you try to create your model.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • api_version string Required

      The Azure API version ID to use. It is recommended to use the latest supported non-preview version.

    • deployment_id string Required

      The deployment name of your deployed models. Your Azure OpenAI deployments can be found though the Azure OpenAI Studio portal that is linked to your subscription.

      External documentation
    • entra_id string

      A valid Microsoft Entra token. You must specify either api_key or entra_id. If you do not provide either or you provide both, you will receive an error when you try to create your model.

      External documentation
    • rate_limit object

      This setting helps to minimize the number of rate limit errors returned from the service.

      Hide rate_limit attribute Show rate_limit attribute object
      • requests_per_minute number

        The number of requests allowed per minute. By default, the number of requests allowed per minute is set by each service as follows:

        • alibabacloud-ai-search service: 1000
        • anthropic service: 50
        • azureaistudio service: 240
        • azureopenai service and task type text_embedding: 1440
        • azureopenai service and task type completion: 120
        • cohere service: 10000
        • elastic service and task type chat_completion: 240
        • googleaistudio service: 360
        • googlevertexai service: 30000
        • hugging_face service: 3000
        • jinaai service: 2000
        • mistral service: 240
        • openai service and task type text_embedding: 3000
        • openai service and task type completion: 500
        • voyageai service: 2000
        • watsonxai service: 120
    • resource_name string Required

      The name of your Azure OpenAI resource. You can find this from the list of resources in the Azure Portal for your subscription.

      External documentation
  • task_settings object
    Hide task_settings attribute Show task_settings attribute object
    • user string

      For a completion or text_embedding task, specify the user issuing the request. This information can be used for abuse detection.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • chunking_settings object

      Chunking configuration object

      Hide chunking_settings attributes Show chunking_settings attributes object
      • max_chunk_size number

        The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

        Default value is 250.

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

        Default value is 100.

      • sentence_overlap number

        The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

        Default value is 1.

      • separator_group string Required

        This parameter is only applicable when using the recursive chunking strategy.

        Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

        Using this parameter is an alternative to manually specifying a custom separators list.

      • separators array[string] Required

        A list of strings used as possible split points when chunking text with the recursive strategy.

        Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

        After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

      • strategy string

        The chunking strategy: sentence, word, none or recursive.

        • If strategy is set to recursive, you must also specify:

          • max_chunk_size
          • either separators orseparator_group

        Learn more about different chunking strategies in the linked documentation.

        Default value is sentence.

        External documentation
    • service string Required

      The service type

    • service_settings object Required
    • task_settings object
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are text_embedding or completion.

PUT /_inference/{task_type}/{azureopenai_inference_id}
PUT _inference/text_embedding/azure_openai_embeddings
{
    "service": "azureopenai",
    "service_settings": {
        "api_key": "Api-Key",
        "resource_name": "Resource-name",
        "deployment_id": "Deployment-id",
        "api_version": "2024-02-01"
    }
}
resp = client.inference.put(
    task_type="text_embedding",
    inference_id="azure_openai_embeddings",
    inference_config={
        "service": "azureopenai",
        "service_settings": {
            "api_key": "Api-Key",
            "resource_name": "Resource-name",
            "deployment_id": "Deployment-id",
            "api_version": "2024-02-01"
        }
    },
)
const response = await client.inference.put({
  task_type: "text_embedding",
  inference_id: "azure_openai_embeddings",
  inference_config: {
    service: "azureopenai",
    service_settings: {
      api_key: "Api-Key",
      resource_name: "Resource-name",
      deployment_id: "Deployment-id",
      api_version: "2024-02-01",
    },
  },
});
response = client.inference.put(
  task_type: "text_embedding",
  inference_id: "azure_openai_embeddings",
  body: {
    "service": "azureopenai",
    "service_settings": {
      "api_key": "Api-Key",
      "resource_name": "Resource-name",
      "deployment_id": "Deployment-id",
      "api_version": "2024-02-01"
    }
  }
)
$resp = $client->inference()->put([
    "task_type" => "text_embedding",
    "inference_id" => "azure_openai_embeddings",
    "body" => [
        "service" => "azureopenai",
        "service_settings" => [
            "api_key" => "Api-Key",
            "resource_name" => "Resource-name",
            "deployment_id" => "Deployment-id",
            "api_version" => "2024-02-01",
        ],
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"service":"azureopenai","service_settings":{"api_key":"Api-Key","resource_name":"Resource-name","deployment_id":"Deployment-id","api_version":"2024-02-01"}}' "$ELASTICSEARCH_URL/_inference/text_embedding/azure_openai_embeddings"
client.inference().put(p -> p
    .inferenceId("azure_openai_embeddings")
    .taskType(TaskType.TextEmbedding)
    .inferenceConfig(i -> i
        .service("azureopenai")
        .serviceSettings(JsonData.fromJson("{\"api_key\":\"Api-Key\",\"resource_name\":\"Resource-name\",\"deployment_id\":\"Deployment-id\",\"api_version\":\"2024-02-01\"}"))
    )
);
Request examples
Run `PUT _inference/text_embedding/azure_openai_embeddings` to create an inference endpoint that performs a `text_embedding` task. You do not specify a model, as it is defined already in the Azure OpenAI deployment.
{
    "service": "azureopenai",
    "service_settings": {
        "api_key": "Api-Key",
        "resource_name": "Resource-name",
        "deployment_id": "Deployment-id",
        "api_version": "2024-02-01"
    }
}
Run `PUT _inference/completion/azure_openai_completion` to create an inference endpoint that performs a `completion` task.
{
    "service": "azureopenai",
    "service_settings": {
        "api_key": "Api-Key",
        "resource_name": "Resource-name",
        "deployment_id": "Deployment-id",
        "api_version": "2024-02-01"
    }
}




Create a custom inference endpoint Generally available

PUT /_inference/{task_type}/{custom_inference_id}

The custom service gives more control over how to interact with external inference services that aren't explicitly supported through dedicated integrations. The custom service gives you the ability to define the headers, url, query parameters, request body, and secrets. The custom service supports the template replacement functionality, which enables you to define a template that can be replaced with the value associated with that key. Templates are portions of a string that start with ${ and end with }. The parameters secret_parameters and task_settings are checked for keys for template replacement. Template replacement is supported in the request, headers, url, and query_parameters. If the definition (key) is not found for a template, an error message is returned. In case of an endpoint definition like the following:

PUT _inference/text_embedding/test-text-embedding
{
  "service": "custom",
  "service_settings": {
     "secret_parameters": {
          "api_key": "<some api key>"
     },
     "url": "...endpoints.huggingface.cloud/v1/embeddings",
     "headers": {
         "Authorization": "Bearer ${api_key}",
         "Content-Type": "application/json"
     },
     "request": "{\"input\": ${input}}",
     "response": {
         "json_parser": {
             "text_embeddings":"$.data[*].embedding[*]"
         }
     }
  }
}

To replace ${api_key} the secret_parameters and task_settings are checked for a key named api_key.


Templates should not be surrounded by quotes.

Pre-defined templates:

  • ${input} refers to the array of input strings that comes from the input field of the subsequent inference requests.
  • ${input_type} refers to the input type translation values.
  • ${query} refers to the query field used specifically for reranking tasks.
  • ${top_n} refers to the top_n field available when performing rerank requests.
  • ${return_documents} refers to the return_documents field available when performing rerank requests.

Required authorization

  • Cluster privileges: manage_inference

Path parameters

  • task_type string

    The type of the inference task that the model will perform.

    Values are text_embedding, sparse_embedding, rerank, or completion.

  • custom_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • chunking_settings object

    Chunking configuration object

    Hide chunking_settings attributes Show chunking_settings attributes object
    • max_chunk_size number

      The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      Default value is 250.

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      Default value is 100.

    • sentence_overlap number

      The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      Default value is 1.

    • separator_group string Required

      This parameter is only applicable when using the recursive chunking strategy.

      Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

      Using this parameter is an alternative to manually specifying a custom separators list.

    • separators array[string] Required

      A list of strings used as possible split points when chunking text with the recursive strategy.

      Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

      After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

    • strategy string

      The chunking strategy: sentence, word, none or recursive.

      • If strategy is set to recursive, you must also specify:

        • max_chunk_size
        • either separators orseparator_group

      Learn more about different chunking strategies in the linked documentation.

      Default value is sentence.

      External documentation
  • service string Required

    Value is custom.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • headers object

      Specifies the HTTPS header parameters – such as Authentication or Contet-Type – that are required to access the custom service. For example:

      "headers":{
        "Authorization": "Bearer ${api_key}",
        "Content-Type": "application/json;charset=utf-8"
      }
      
    • input_type object

      Specifies the input type translation values that are used to replace the ${input_type} template in the request body. For example:

      "input_type": {
        "translation": {
          "ingest": "do_ingest",
          "search": "do_search"
        },
        "default": "a_default"
      },
      

      If the subsequent inference requests come from a search context, the search key will be used and the template will be replaced with do_search. If it comes from the ingest context do_ingest is used. If it's a different context that is not specified, the default value will be used. If no default is specified an empty string is used. translation can be:

      • classification
      • clustering
      • ingest
      • search
    • query_parameters object

      Specifies the query parameters as a list of tuples. The arrays inside the query_parameters must have two items, a key and a value. For example:

      "query_parameters":[
        ["param_key", "some_value"],
        ["param_key", "another_value"],
        ["other_key", "other_value"]
      ]
      

      If the base url is https://www.elastic.co it results in: https://www.elastic.co?param_key=some_value&param_key=another_value&other_key=other_value.

    • request object Required
      Hide request attribute Show request attribute object
      • content string Required

        The body structure of the request. It requires passing in the string-escaped result of the JSON format HTTP request body. For example:

        "request": "{\"input\":${input}}"
        


        The content string needs to be a single line except when using the Kibana console.

    • response object Required
      Hide response attribute Show response attribute object
      • json_parser object Required

        Specifies the JSON parser that is used to parse the response from the custom service. Different task types require different json_parser parameters. For example:

        # text_embedding
        # For a response like this:
        
        {
         "object": "list",
         "data": [
             {
               "object": "embedding",
               "index": 0,
               "embedding": [
                   0.014539449,
                   -0.015288644
               ]
             }
         ],
         "model": "text-embedding-ada-002-v2",
         "usage": {
             "prompt_tokens": 8,
             "total_tokens": 8
         }
        }
        
        # the json_parser definition should look like this:
        
        "response":{
          "json_parser":{
            "text_embeddings":"$.data[*].embedding[*]"
          }
        }
        
        # sparse_embedding
        # For a response like this:
        
        {
          "request_id": "75C50B5B-E79E-4930-****-F48DBB392231",
          "latency": 22,
          "usage": {
             "token_count": 11
          },
          "result": {
             "sparse_embeddings": [
                {
                  "index": 0,
                  "embedding": [
                    {
                      "token_id": 6,
                      "weight": 0.101
                    },
                    {
                      "token_id": 163040,
                      "weight": 0.28417
                    }
                  ]
                }
             ]
          }
        }
        
        # the json_parser definition should look like this:
        
        "response":{
          "json_parser":{
            "token_path":"$.result.sparse_embeddings[*].embedding[*].token_id",
            "weight_path":"$.result.sparse_embeddings[*].embedding[*].weight"
          }
        }
        
        # rerank
        # For a response like this:
        
        {
          "results": [
            {
              "index": 3,
              "relevance_score": 0.999071,
              "document": "abc"
            },
            {
              "index": 4,
              "relevance_score": 0.7867867,
              "document": "123"
            },
            {
              "index": 0,
              "relevance_score": 0.32713068,
              "document": "super"
            }
          ],
        }
        
        # the json_parser definition should look like this:
        
        "response":{
          "json_parser":{
            "reranked_index":"$.result.scores[*].index",    // optional
            "relevance_score":"$.result.scores[*].score",
            "document_text":"xxx"    // optional
          }
        }
        
        # completion
        # For a response like this:
        
        {
         "id": "chatcmpl-B9MBs8CjcvOU2jLn4n570S5qMJKcT",
         "object": "chat.completion",
         "created": 1741569952,
         "model": "gpt-4.1-2025-04-14",
         "choices": [
           {
            "index": 0,
            "message": {
              "role": "assistant",
              "content": "Hello! How can I assist you today?",
              "refusal": null,
              "annotations": []
            },
            "logprobs": null,
            "finish_reason": "stop"
          }
         ]
        }
        
        # the json_parser definition should look like this:
        
        "response":{
          "json_parser":{
            "completion_result":"$.choices[*].message.content"
          }
        }
        
    • secret_parameters object Required

      Specifies secret parameters, like api_key or api_token, that are required to access the custom service. For example:

      "secret_parameters":{
        "api_key":"<api_key>"
      }
      
    • url string

      The URL endpoint to use for the requests.

  • task_settings object
    Hide task_settings attribute Show task_settings attribute object
    • parameters object

      Specifies parameters that are required to run the custom service. The parameters depend on the model your custom service uses. For example:

      "task_settings":{
        "parameters":{
          "input_type":"query",
          "return_token":true
        }
      }
      

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • chunking_settings object

      Chunking configuration object

      Hide chunking_settings attributes Show chunking_settings attributes object
      • max_chunk_size number

        The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

        Default value is 250.

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

        Default value is 100.

      • sentence_overlap number

        The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

        Default value is 1.

      • separator_group string Required

        This parameter is only applicable when using the recursive chunking strategy.

        Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

        Using this parameter is an alternative to manually specifying a custom separators list.

      • separators array[string] Required

        A list of strings used as possible split points when chunking text with the recursive strategy.

        Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

        After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

      • strategy string

        The chunking strategy: sentence, word, none or recursive.

        • If strategy is set to recursive, you must also specify:

          • max_chunk_size
          • either separators orseparator_group

        Learn more about different chunking strategies in the linked documentation.

        Default value is sentence.

        External documentation
    • service string Required

      The service type

    • service_settings object Required
    • task_settings object
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are text_embedding, sparse_embedding, rerank, or completion.

PUT /_inference/{task_type}/{custom_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{custom_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"chunking_settings":{"max_chunk_size":250,"overlap":100,"sentence_overlap":1,"separator_group":"string","separators":["string"],"strategy":"sentence"},"service":"custom","service_settings":{"headers":{},"input_type":{},"query_parameters":{},"request":{"content":"string"},"response":{"json_parser":{}},"secret_parameters":{},"url":"string"},"task_settings":{"parameters":{}}}'




Create an Elasticsearch inference endpoint Generally available

PUT /_inference/{task_type}/{elasticsearch_inference_id}

Create an inference endpoint to perform an inference task with the elasticsearch service.


Your Elasticsearch deployment contains preconfigured ELSER and E5 inference endpoints, you only need to create the enpoints using the API if you want to customize the settings.

If you use the ELSER or the E5 model through the elasticsearch service, the API request will automatically download and deploy the model if it isn't downloaded yet.


You might see a 502 bad gateway error in the response when using the Kibana Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the Machine Learning UI. If using the Python client, you can set the timeout parameter to a higher value.

After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Required authorization

  • Cluster privileges: manage_inference

Path parameters

  • task_type string

    The type of the inference task that the model will perform.

    Values are rerank, sparse_embedding, or text_embedding.

  • elasticsearch_inference_id string Required

    The unique identifier of the inference endpoint. The must not match the model_id.

Query parameters

  • timeout string

    Specifies the amount of time to wait for the inference endpoint to be created.

    Values are -1 or 0.

application/json

Body

  • chunking_settings object

    Chunking configuration object

    Hide chunking_settings attributes Show chunking_settings attributes object
    • max_chunk_size number

      The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      Default value is 250.

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      Default value is 100.

    • sentence_overlap number

      The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      Default value is 1.

    • separator_group string Required

      This parameter is only applicable when using the recursive chunking strategy.

      Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

      Using this parameter is an alternative to manually specifying a custom separators list.

    • separators array[string] Required

      A list of strings used as possible split points when chunking text with the recursive strategy.

      Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

      After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

    • strategy string

      The chunking strategy: sentence, word, none or recursive.

      • If strategy is set to recursive, you must also specify:

        • max_chunk_size
        • either separators orseparator_group

      Learn more about different chunking strategies in the linked documentation.

      Default value is sentence.

      External documentation
  • service string Required

    Value is elasticsearch.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • adaptive_allocations object
      Hide adaptive_allocations attributes Show adaptive_allocations attributes object
      • enabled boolean

        Turn on adaptive_allocations.

        Default value is false.

      • max_number_of_allocations number

        The maximum number of allocations to scale to. If set, it must be greater than or equal to min_number_of_allocations.

      • min_number_of_allocations number

        The minimum number of allocations to scale to. If set, it must be greater than or equal to 0. If not defined, the deployment scales to 0.

    • deployment_id string

      The deployment identifier for a trained model deployment. When deployment_id is used the model_id is optional.

    • model_id string Required

      The name of the model to use for the inference task. It can be the ID of a built-in model (for example, .multilingual-e5-small for E5) or a text embedding model that was uploaded by using the Eland client.

      External documentation
    • num_allocations number

      The total number of allocations that are assigned to the model across machine learning nodes. Increasing this value generally increases the throughput. If adaptive allocations are enabled, do not set this value because it's automatically set.

    • num_threads number Required

      The number of threads used by each model allocation during inference. This setting generally increases the speed per inference request. The inference process is a compute-bound process; threads_per_allocations must not exceed the number of available allocated processors per node. The value must be a power of 2. The maximum value is 32.

  • task_settings object
    Hide task_settings attribute Show task_settings attribute object
    • return_documents boolean

      For a rerank task, return the document instead of only the index.

      Default value is true.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • chunking_settings object

      Chunking configuration object

      Hide chunking_settings attributes Show chunking_settings attributes object
      • max_chunk_size number

        The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

        Default value is 250.

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

        Default value is 100.

      • sentence_overlap number

        The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

        Default value is 1.

      • separator_group string Required

        This parameter is only applicable when using the recursive chunking strategy.

        Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

        Using this parameter is an alternative to manually specifying a custom separators list.

      • separators array[string] Required

        A list of strings used as possible split points when chunking text with the recursive strategy.

        Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

        After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

      • strategy string

        The chunking strategy: sentence, word, none or recursive.

        • If strategy is set to recursive, you must also specify:

          • max_chunk_size
          • either separators orseparator_group

        Learn more about different chunking strategies in the linked documentation.

        Default value is sentence.

        External documentation
    • service string Required

      The service type

    • service_settings object Required
    • task_settings object
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, or rerank.

PUT /_inference/{task_type}/{elasticsearch_inference_id}
PUT _inference/sparse_embedding/my-elser-model
{
    "service": "elasticsearch",
    "service_settings": {
        "adaptive_allocations": { 
        "enabled": true,
        "min_number_of_allocations": 1,
        "max_number_of_allocations": 4
        },
        "num_threads": 1,
        "model_id": ".elser_model_2" 
    }
}
resp = client.inference.put(
    task_type="sparse_embedding",
    inference_id="my-elser-model",
    inference_config={
        "service": "elasticsearch",
        "service_settings": {
            "adaptive_allocations": {
                "enabled": True,
                "min_number_of_allocations": 1,
                "max_number_of_allocations": 4
            },
            "num_threads": 1,
            "model_id": ".elser_model_2"
        }
    },
)
const response = await client.inference.put({
  task_type: "sparse_embedding",
  inference_id: "my-elser-model",
  inference_config: {
    service: "elasticsearch",
    service_settings: {
      adaptive_allocations: {
        enabled: true,
        min_number_of_allocations: 1,
        max_number_of_allocations: 4,
      },
      num_threads: 1,
      model_id: ".elser_model_2",
    },
  },
});
response = client.inference.put(
  task_type: "sparse_embedding",
  inference_id: "my-elser-model",
  body: {
    "service": "elasticsearch",
    "service_settings": {
      "adaptive_allocations": {
        "enabled": true,
        "min_number_of_allocations": 1,
        "max_number_of_allocations": 4
      },
      "num_threads": 1,
      "model_id": ".elser_model_2"
    }
  }
)
$resp = $client->inference()->put([
    "task_type" => "sparse_embedding",
    "inference_id" => "my-elser-model",
    "body" => [
        "service" => "elasticsearch",
        "service_settings" => [
            "adaptive_allocations" => [
                "enabled" => true,
                "min_number_of_allocations" => 1,
                "max_number_of_allocations" => 4,
            ],
            "num_threads" => 1,
            "model_id" => ".elser_model_2",
        ],
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"service":"elasticsearch","service_settings":{"adaptive_allocations":{"enabled":true,"min_number_of_allocations":1,"max_number_of_allocations":4},"num_threads":1,"model_id":".elser_model_2"}}' "$ELASTICSEARCH_URL/_inference/sparse_embedding/my-elser-model"
client.inference().put(p -> p
    .inferenceId("my-elser-model")
    .taskType(TaskType.SparseEmbedding)
    .inferenceConfig(i -> i
        .service("elasticsearch")
        .serviceSettings(JsonData.fromJson("{\"adaptive_allocations\":{\"enabled\":true,\"min_number_of_allocations\":1,\"max_number_of_allocations\":4},\"num_threads\":1,\"model_id\":\".elser_model_2\"}"))
    )
);
Run `PUT _inference/sparse_embedding/my-elser-model` to create an inference endpoint that performs a `sparse_embedding` task. The `model_id` must be the ID of one of the built-in ELSER models. The API will automatically download the ELSER model if it isn't already downloaded and then deploy the model.
{
    "service": "elasticsearch",
    "service_settings": {
        "adaptive_allocations": { 
        "enabled": true,
        "min_number_of_allocations": 1,
        "max_number_of_allocations": 4
        },
        "num_threads": 1,
        "model_id": ".elser_model_2" 
    }
}
Run `PUT _inference/rerank/my-elastic-rerank` to create an inference endpoint that performs a rerank task using the built-in Elastic Rerank cross-encoder model. The `model_id` must be `.rerank-v1`, which is the ID of the built-in Elastic Rerank model. The API will automatically download the Elastic Rerank model if it isn't already downloaded and then deploy the model. Once deployed, the model can be used for semantic re-ranking with a `text_similarity_reranker` retriever.
{
    "service": "elasticsearch",
    "service_settings": {
        "model_id": ".rerank-v1", 
        "num_threads": 1,
        "adaptive_allocations": { 
        "enabled": true,
        "min_number_of_allocations": 1,
        "max_number_of_allocations": 4
        }
    }
}
Run `PUT _inference/text_embedding/my-e5-model` to create an inference endpoint that performs a `text_embedding` task. The `model_id` must be the ID of one of the built-in E5 models. The API will automatically download the E5 model if it isn't already downloaded and then deploy the model.
{
    "service": "elasticsearch",
    "service_settings": {
        "num_allocations": 1,
        "num_threads": 1,
        "model_id": ".multilingual-e5-small" 
    }
}
Run `PUT _inference/text_embedding/my-msmarco-minilm-model` to create an inference endpoint that performs a `text_embedding` task with a model that was uploaded by Eland.
{
    "service": "elasticsearch",
    "service_settings": {
        "num_allocations": 1,
        "num_threads": 1,
        "model_id": "msmarco-MiniLM-L12-cos-v5" 
    }
}
Run `PUT _inference/text_embedding/my-e5-model` to create an inference endpoint that performs a `text_embedding` task and to configure adaptive allocations. The API request will automatically download the E5 model if it isn't already downloaded and then deploy the model.
{
    "service": "elasticsearch",
    "service_settings": {
        "adaptive_allocations": {
        "enabled": true,
        "min_number_of_allocations": 3,
        "max_number_of_allocations": 10
        },
        "num_threads": 1,
        "model_id": ".multilingual-e5-small"
    }
}
Run `PUT _inference/sparse_embedding/use_existing_deployment` to use an already existing model deployment when creating an inference endpoint.
{
    "service": "elasticsearch",
    "service_settings": {
        "deployment_id": ".elser_model_2"
    }
}
Response examples (200)
A successful response from `PUT _inference/sparse_embedding/use_existing_deployment`. It contains the model ID and the threads and allocations settings from the model deployment.
{
  "inference_id": "use_existing_deployment",
  "task_type": "sparse_embedding",
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 2,
    "num_threads": 1,
    "model_id": ".elser_model_2",
    "deployment_id": ".elser_model_2"
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 250,
    "sentence_overlap": 1
  }
}
















Create an JinaAI inference endpoint Generally available

PUT /_inference/{task_type}/{jinaai_inference_id}

Create an inference endpoint to perform an inference task with the jinaai service.

To review the available rerank models, refer to https://jina.ai/reranker. To review the available text_embedding models, refer to the https://jina.ai/embeddings/.

Required authorization

  • Cluster privileges: manage_inference

Path parameters

  • task_type string

    The type of the inference task that the model will perform.

    Values are rerank or text_embedding.

  • jinaai_inference_id string Required

    The unique identifier of the inference endpoint.

Query parameters

  • timeout string

    Specifies the amount of time to wait for the inference endpoint to be created.

    Values are -1 or 0.

application/json

Body

  • chunking_settings object

    Chunking configuration object

    Hide chunking_settings attributes Show chunking_settings attributes object
    • max_chunk_size number

      The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      Default value is 250.

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      Default value is 100.

    • sentence_overlap number

      The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      Default value is 1.

    • separator_group string Required

      This parameter is only applicable when using the recursive chunking strategy.

      Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

      Using this parameter is an alternative to manually specifying a custom separators list.

    • separators array[string] Required

      A list of strings used as possible split points when chunking text with the recursive strategy.

      Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

      After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

    • strategy string

      The chunking strategy: sentence, word, none or recursive.

      • If strategy is set to recursive, you must also specify:

        • max_chunk_size
        • either separators orseparator_group

      Learn more about different chunking strategies in the linked documentation.

      Default value is sentence.

      External documentation
  • service string Required

    Value is jinaai.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key of your JinaAI account.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • model_id string

      The name of the model to use for the inference task. For a rerank task, it is required. For a text_embedding task, it is optional.

    • rate_limit object

      This setting helps to minimize the number of rate limit errors returned from the service.

      Hide rate_limit attribute Show rate_limit attribute object
      • requests_per_minute number

        The number of requests allowed per minute. By default, the number of requests allowed per minute is set by each service as follows:

        • alibabacloud-ai-search service: 1000
        • anthropic service: 50
        • azureaistudio service: 240
        • azureopenai service and task type text_embedding: 1440
        • azureopenai service and task type completion: 120
        • cohere service: 10000
        • elastic service and task type chat_completion: 240
        • googleaistudio service: 360
        • googlevertexai service: 30000
        • hugging_face service: 3000
        • jinaai service: 2000
        • mistral service: 240
        • openai service and task type text_embedding: 3000
        • openai service and task type completion: 500
        • voyageai service: 2000
        • watsonxai service: 120
    • similarity string

      Values are cosine, dot_product, or l2_norm.

  • task_settings object
    Hide task_settings attributes Show task_settings attributes object
    • return_documents boolean

      For a rerank task, return the doc text within the results.

    • task string

      Values are classification, clustering, ingest, or search.

    • top_n number

      For a rerank task, the number of most relevant documents to return. It defaults to the number of the documents. If this inference endpoint is used in a text_similarity_reranker retriever query and top_n is set, it must be greater than or equal to rank_window_size in the query.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • chunking_settings object

      Chunking configuration object

      Hide chunking_settings attributes Show chunking_settings attributes object
      • max_chunk_size number

        The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

        Default value is 250.

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

        Default value is 100.

      • sentence_overlap number

        The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

        Default value is 1.

      • separator_group string Required

        This parameter is only applicable when using the recursive chunking strategy.

        Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

        Using this parameter is an alternative to manually specifying a custom separators list.

      • separators array[string] Required

        A list of strings used as possible split points when chunking text with the recursive strategy.

        Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

        After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

      • strategy string

        The chunking strategy: sentence, word, none or recursive.

        • If strategy is set to recursive, you must also specify:

          • max_chunk_size
          • either separators orseparator_group

        Learn more about different chunking strategies in the linked documentation.

        Default value is sentence.

        External documentation
    • service string Required

      The service type

    • service_settings object Required
    • task_settings object
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are text_embedding or rerank.

PUT /_inference/{task_type}/{jinaai_inference_id}
PUT _inference/text_embedding/jinaai-embeddings
{
    "service": "jinaai",
    "service_settings": {
        "model_id": "jina-embeddings-v3",
        "api_key": "JinaAi-Api-key"
    }
}
resp = client.inference.put(
    task_type="text_embedding",
    inference_id="jinaai-embeddings",
    inference_config={
        "service": "jinaai",
        "service_settings": {
            "model_id": "jina-embeddings-v3",
            "api_key": "JinaAi-Api-key"
        }
    },
)
const response = await client.inference.put({
  task_type: "text_embedding",
  inference_id: "jinaai-embeddings",
  inference_config: {
    service: "jinaai",
    service_settings: {
      model_id: "jina-embeddings-v3",
      api_key: "JinaAi-Api-key",
    },
  },
});
response = client.inference.put(
  task_type: "text_embedding",
  inference_id: "jinaai-embeddings",
  body: {
    "service": "jinaai",
    "service_settings": {
      "model_id": "jina-embeddings-v3",
      "api_key": "JinaAi-Api-key"
    }
  }
)
$resp = $client->inference()->put([
    "task_type" => "text_embedding",
    "inference_id" => "jinaai-embeddings",
    "body" => [
        "service" => "jinaai",
        "service_settings" => [
            "model_id" => "jina-embeddings-v3",
            "api_key" => "JinaAi-Api-key",
        ],
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"service":"jinaai","service_settings":{"model_id":"jina-embeddings-v3","api_key":"JinaAi-Api-key"}}' "$ELASTICSEARCH_URL/_inference/text_embedding/jinaai-embeddings"
client.inference().put(p -> p
    .inferenceId("jinaai-embeddings")
    .taskType(TaskType.TextEmbedding)
    .inferenceConfig(i -> i
        .service("jinaai")
        .serviceSettings(JsonData.fromJson("{\"model_id\":\"jina-embeddings-v3\",\"api_key\":\"JinaAi-Api-key\"}"))
    )
);
Request examples
Run `PUT _inference/text_embedding/jinaai-embeddings` to create an inference endpoint for text embedding tasks using the JinaAI service.
{
    "service": "jinaai",
    "service_settings": {
        "model_id": "jina-embeddings-v3",
        "api_key": "JinaAi-Api-key"
    }
}
Run `PUT _inference/rerank/jinaai-rerank` to create an inference endpoint for rerank tasks using the JinaAI service.
{
    "service": "jinaai",
    "service_settings": {
        "api_key": "JinaAI-Api-key",
        "model_id": "jina-reranker-v2-base-multilingual"
    },
    "task_settings": {
        "top_n": 10,
        "return_documents": true
    }
}








Create a VoyageAI inference endpoint Generally available

PUT /_inference/{task_type}/{voyageai_inference_id}

Create an inference endpoint to perform an inference task with the voyageai service.

Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Required authorization

  • Cluster privileges: manage_inference

Path parameters

  • task_type string

    The type of the inference task that the model will perform.

    Values are text_embedding or rerank.

  • voyageai_inference_id string Required

    The unique identifier of the inference endpoint.

Query parameters

  • timeout string

    Specifies the amount of time to wait for the inference endpoint to be created.

    Values are -1 or 0.

application/json

Body

  • chunking_settings object

    Chunking configuration object

    Hide chunking_settings attributes Show chunking_settings attributes object
    • max_chunk_size number

      The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      Default value is 250.

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      Default value is 100.

    • sentence_overlap number

      The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      Default value is 1.

    • separator_group string Required

      This parameter is only applicable when using the recursive chunking strategy.

      Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

      Using this parameter is an alternative to manually specifying a custom separators list.

    • separators array[string] Required

      A list of strings used as possible split points when chunking text with the recursive strategy.

      Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

      After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

    • strategy string

      The chunking strategy: sentence, word, none or recursive.

      • If strategy is set to recursive, you must also specify:

        • max_chunk_size
        • either separators orseparator_group

      Learn more about different chunking strategies in the linked documentation.

      Default value is sentence.

      External documentation
  • service string Required

    Value is voyageai.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • dimensions number

      The number of dimensions for resulting output embeddings. This setting maps to output_dimension in the VoyageAI documentation. Only for the text_embedding task type.

      External documentation
    • model_id string Required

      The name of the model to use for the inference task. Refer to the VoyageAI documentation for the list of available text embedding and rerank models.

      External documentation
    • rate_limit object

      This setting helps to minimize the number of rate limit errors returned from the service.

      Hide rate_limit attribute Show rate_limit attribute object
      • requests_per_minute number

        The number of requests allowed per minute. By default, the number of requests allowed per minute is set by each service as follows:

        • alibabacloud-ai-search service: 1000
        • anthropic service: 50
        • azureaistudio service: 240
        • azureopenai service and task type text_embedding: 1440
        • azureopenai service and task type completion: 120
        • cohere service: 10000
        • elastic service and task type chat_completion: 240
        • googleaistudio service: 360
        • googlevertexai service: 30000
        • hugging_face service: 3000
        • jinaai service: 2000
        • mistral service: 240
        • openai service and task type text_embedding: 3000
        • openai service and task type completion: 500
        • voyageai service: 2000
        • watsonxai service: 120
    • embedding_type number

      The data type for the embeddings to be returned. This setting maps to output_dtype in the VoyageAI documentation. Permitted values: float, int8, bit. int8 is a synonym of byte in the VoyageAI documentation. bit is a synonym of binary in the VoyageAI documentation. Only for the text_embedding task type.

      External documentation
  • task_settings object
    Hide task_settings attributes Show task_settings attributes object
    • input_type string

      Type of the input text. Permitted values: ingest (maps to document in the VoyageAI documentation), search (maps to query in the VoyageAI documentation). Only for the text_embedding task type.

    • return_documents boolean

      Whether to return the source documents in the response. Only for the rerank task type.

      Default value is false.

    • top_k number

      The number of most relevant documents to return. If not specified, the reranking results of all documents will be returned. Only for the rerank task type.

    • truncation boolean

      Whether to truncate the input texts to fit within the context length.

      Default value is true.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • chunking_settings object

      Chunking configuration object

      Hide chunking_settings attributes Show chunking_settings attributes object
      • max_chunk_size number

        The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

        Default value is 250.

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

        Default value is 100.

      • sentence_overlap number

        The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

        Default value is 1.

      • separator_group string Required

        This parameter is only applicable when using the recursive chunking strategy.

        Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

        Using this parameter is an alternative to manually specifying a custom separators list.

      • separators array[string] Required

        A list of strings used as possible split points when chunking text with the recursive strategy.

        Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

        After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

      • strategy string

        The chunking strategy: sentence, word, none or recursive.

        • If strategy is set to recursive, you must also specify:

          • max_chunk_size
          • either separators orseparator_group

        Learn more about different chunking strategies in the linked documentation.

        Default value is sentence.

        External documentation
    • service string Required

      The service type

    • service_settings object Required
    • task_settings object
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are text_embedding or rerank.

PUT /_inference/{task_type}/{voyageai_inference_id}
PUT _inference/text_embedding/openai-embeddings
{
    "service": "voyageai",
    "service_settings": {
        "model_id": "voyage-3-large",
        "dimensions": 512
    }
}
resp = client.inference.put(
    task_type="text_embedding",
    inference_id="openai-embeddings",
    inference_config={
        "service": "voyageai",
        "service_settings": {
            "model_id": "voyage-3-large",
            "dimensions": 512
        }
    },
)
const response = await client.inference.put({
  task_type: "text_embedding",
  inference_id: "openai-embeddings",
  inference_config: {
    service: "voyageai",
    service_settings: {
      model_id: "voyage-3-large",
      dimensions: 512,
    },
  },
});
response = client.inference.put(
  task_type: "text_embedding",
  inference_id: "openai-embeddings",
  body: {
    "service": "voyageai",
    "service_settings": {
      "model_id": "voyage-3-large",
      "dimensions": 512
    }
  }
)
$resp = $client->inference()->put([
    "task_type" => "text_embedding",
    "inference_id" => "openai-embeddings",
    "body" => [
        "service" => "voyageai",
        "service_settings" => [
            "model_id" => "voyage-3-large",
            "dimensions" => 512,
        ],
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"service":"voyageai","service_settings":{"model_id":"voyage-3-large","dimensions":512}}' "$ELASTICSEARCH_URL/_inference/text_embedding/openai-embeddings"
client.inference().put(p -> p
    .inferenceId("openai-embeddings")
    .taskType(TaskType.TextEmbedding)
    .inferenceConfig(i -> i
        .service("voyageai")
        .serviceSettings(JsonData.fromJson("{\"model_id\":\"voyage-3-large\",\"dimensions\":512}"))
    )
);
Request examples
Run `PUT _inference/text_embedding/voyageai-embeddings` to create an inference endpoint that performs a `text_embedding` task. The embeddings created by requests to this endpoint will have 512 dimensions.
{
    "service": "voyageai",
    "service_settings": {
        "model_id": "voyage-3-large",
        "dimensions": 512
    }
}
Run `PUT _inference/rerank/voyageai-rerank` to create an inference endpoint that performs a `rerank` task.
{
    "service": "voyageai",
    "service_settings": {
        "model_id": "rerank-2"
    }
}

















Get cluster info Generally available

GET /

Get basic build, version, and cluster information.

Required authorization

  • Cluster privileges: monitor

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • cluster_name string Required
    • cluster_uuid string Required
    • name string Required
    • tagline string Required
    • version object Required
      Hide version attributes Show version attributes object
      • build_date string | number Required

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • build_flavor string Required

        The build flavor. For example, default.

      • build_hash string Required

        The Elasticsearch Git commit's SHA hash.

      • build_snapshot boolean Required

        Indicates whether the Elasticsearch build was a snapshot.

      • build_type string Required

        The build type that corresponds to how Elasticsearch was installed. For example, docker, rpm, or tar.

      • lucene_version string Required
      • minimum_index_compatibility_version string Required
      • minimum_wire_compatibility_version string Required
      • number string Required

        The Elasticsearch version number.

GET /
resp = client.info()
const response = await client.info();
response = client.info
$resp = $client->info();
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/"
client.info();
Response examples (200)
A successful response from `GET /`s.
{
  "name": "instance-0000000000",
  "cluster_name": "my_test_cluster",
  "cluster_uuid": "5QaxoN0pRZuOmWSxstBBwQ",
  "version": {
    "build_date": "2024-02-01T13:07:13.727175297Z",
    "minimum_wire_compatibility_version": "7.17.0",
    "build_hash": "6185ba65d27469afabc9bc951cded6c17c21e3f3",
    "number": "8.12.1",
    "lucene_version": "9.9.2",
    "minimum_index_compatibility_version": "7.0.0",
    "build_flavor": "default",
    "build_snapshot": false,
    "build_type": "docker"
  },
  "tagline": "You Know, for Search"
}

Ingest

Ingest APIs enable you to manage tasks and resources related to ingest pipelines and processors.























































Delete events from a calendar Generally available

DELETE /_ml/calendars/{calendar_id}/events/{event_id}

Path parameters

  • calendar_id string Required

    A string that uniquely identifies a calendar.

  • event_id string Required

    Identifier for the scheduled event. You can obtain this identifier by using the get calendar events API.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/calendars/{calendar_id}/events/{event_id}
DELETE _ml/calendars/planned-outages/events/LS8LJGEBMTCMA-qz49st
resp = client.ml.delete_calendar_event(
    calendar_id="planned-outages",
    event_id="LS8LJGEBMTCMA-qz49st",
)
const response = await client.ml.deleteCalendarEvent({
  calendar_id: "planned-outages",
  event_id: "LS8LJGEBMTCMA-qz49st",
});
response = client.ml.delete_calendar_event(
  calendar_id: "planned-outages",
  event_id: "LS8LJGEBMTCMA-qz49st"
)
$resp = $client->ml()->deleteCalendarEvent([
    "calendar_id" => "planned-outages",
    "event_id" => "LS8LJGEBMTCMA-qz49st",
]);
curl -X DELETE -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_ml/calendars/planned-outages/events/LS8LJGEBMTCMA-qz49st"
client.ml().deleteCalendarEvent(d -> d
    .calendarId("planned-outages")
    .eventId("LS8LJGEBMTCMA-qz49st")
);
Response examples (200)
A successful response when deleting a calendar event.
{
  "acknowledged": true
}








































Delete an anomaly detection job Generally available

DELETE /_ml/anomaly_detectors/{job_id}

All job configuration, model state and results are deleted. It is not currently possible to delete multiple jobs using wildcards or a comma separated list. If you delete a job that has a datafeed, the request first tries to delete the datafeed. This behavior is equivalent to calling the delete datafeed API with the same timeout and force parameters as the delete job request.

Required authorization

  • Cluster privileges: manage_ml

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

Query parameters

  • force boolean

    Use to forcefully delete an opened job; this method is quicker than closing and deleting the job.

  • delete_user_annotations boolean

    Specifies whether annotations that have been added by the user should be deleted along with any auto-generated annotations when the job is reset.

  • wait_for_completion boolean

    Specifies whether the request should return immediately or wait until the job deletion completes.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/anomaly_detectors/{job_id}
DELETE _ml/anomaly_detectors/total-requests
resp = client.ml.delete_job(
    job_id="total-requests",
)
const response = await client.ml.deleteJob({
  job_id: "total-requests",
});
response = client.ml.delete_job(
  job_id: "total-requests"
)
$resp = $client->ml()->deleteJob([
    "job_id" => "total-requests",
]);
curl -X DELETE -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_ml/anomaly_detectors/total-requests"
client.ml().deleteJob(d -> d
    .jobId("total-requests")
);
Response examples (200)
A successful response when deleting an anomaly detection job.
{
  "acknowledged": true
}
A successful response when deleting an anomaly detection job asynchronously. When the `wait_for_completion` query parameter is set to `false`, the response contains an identifier for the job deletion task.
{
  "task": "oTUltX4IQMOUUVeiohTt8A:39"
}




Force buffered data to be processed Deprecated Generally available

POST /_ml/anomaly_detectors/{job_id}/_flush

The flush jobs API is only applicable when sending data for analysis using the post data API. Depending on the content of the buffer, then it might additionally calculate new results. Both flush and close operations are similar, however the flush is more efficient if you are expecting to send more data for analysis. When flushing, the job remains open and is available to continue analyzing data. A close operation additionally prunes and persists the model state to disk and the job must be opened again before analyzing further data.

Required authorization

  • Cluster privileges: manage_ml

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

Query parameters

  • advance_time string | number

    Specifies to advance to a particular time value. Results are generated and the model is updated for data from the specified time interval.

  • calc_interim boolean

    If true, calculates the interim results for the most recent bucket or all buckets within the latency period.

  • end string | number

    When used in conjunction with calc_interim and start, specifies the range of buckets on which to calculate interim results.

  • skip_time string | number

    Specifies to skip to a particular time value. Results are not generated and the model is not updated for data from the specified time interval.

  • start string | number

    When used in conjunction with calc_interim, specifies the range of buckets on which to calculate interim results.

application/json

Body

  • advance_time string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    One of:
  • calc_interim boolean

    Refer to the description for the calc_interim query parameter.

  • end string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    One of:
  • skip_time string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    One of:
  • start string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    One of:

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • flushed boolean Required
    • last_finalized_bucket_end number

      Provides the timestamp (in milliseconds since the epoch) of the end of the last bucket that was processed.

POST /_ml/anomaly_detectors/{job_id}/_flush
POST _ml/anomaly_detectors/low_request_rate/_flush
{
  "calc_interim": true
}
resp = client.ml.flush_job(
    job_id="low_request_rate",
    calc_interim=True,
)
const response = await client.ml.flushJob({
  job_id: "low_request_rate",
  calc_interim: true,
});
response = client.ml.flush_job(
  job_id: "low_request_rate",
  body: {
    "calc_interim": true
  }
)
$resp = $client->ml()->flushJob([
    "job_id" => "low_request_rate",
    "body" => [
        "calc_interim" => true,
    ],
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"calc_interim":true}' "$ELASTICSEARCH_URL/_ml/anomaly_detectors/low_request_rate/_flush"
client.ml().flushJob(f -> f
    .calcInterim(true)
    .jobId("low_request_rate")
);
Request example
An example body for a `POST _ml/anomaly_detectors/low_request_rate/_flush` request.
{
  "calc_interim": true
}












Get anomaly detection job stats Generally available

GET /_ml/anomaly_detectors/{job_id}/_stats

All methods and paths for this operation:

GET /_ml/anomaly_detectors/_stats

GET /_ml/anomaly_detectors/{job_id}/_stats

Required authorization

  • Cluster privileges: monitor_ml

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job. It can be a job identifier, a group name, a comma-separated list of jobs, or a wildcard expression. If you do not specify one of these options, the API returns information for all anomaly detection jobs.

Query parameters

  • allow_no_match boolean

    Specifies what to do when the request:

    1. Contains wildcard expressions and there are no jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    If true, the API returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • jobs array[object] Required
      Hide jobs attributes Show jobs attributes object
      • assignment_explanation string

        For open anomaly detection jobs only, contains messages relating to the selection of a node to run the job.

      • data_counts object Required
        Hide data_counts attributes Show data_counts attributes object
        • bucket_count number Required
        • earliest_record_timestamp number
        • empty_bucket_count number Required
        • input_bytes number Required
        • input_field_count number Required
        • input_record_count number Required
        • invalid_date_count number Required
        • job_id string Required
        • last_data_time number
        • latest_empty_bucket_timestamp number
        • latest_record_timestamp number
        • latest_sparse_bucket_timestamp number
        • latest_bucket_timestamp number
        • log_time number
        • missing_field_count number Required
        • out_of_order_timestamp_count number Required
        • processed_field_count number Required
        • processed_record_count number Required
        • sparse_bucket_count number Required
      • forecasts_stats object Required
        Hide forecasts_stats attributes Show forecasts_stats attributes object
        • memory_bytes object
          Hide memory_bytes attributes Show memory_bytes attributes object
          • avg number Required
          • max number Required
          • min number Required
          • total number Required
        • processing_time_ms object
          Hide processing_time_ms attributes Show processing_time_ms attributes object
          • avg number Required
          • max number Required
          • min number Required
          • total number Required
        • records object
          Hide records attributes Show records attributes object
          • avg number Required
          • max number Required
          • min number Required
          • total number Required
        • status object
          Hide status attribute Show status attribute object
          • * number Additional properties
        • total number Required
        • forecasted_jobs number Required
      • job_id string Required

        Identifier for the anomaly detection job.

      • model_size_stats object Required
        Hide model_size_stats attributes Show model_size_stats attributes object
        • bucket_allocation_failures_count number Required
        • job_id string Required
        • log_time string | number Required

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          One of:
        • memory_status string Required

          Values are ok, soft_limit, or hard_limit.

        • model_bytes number | string Required

        • model_bytes_exceeded number | string

        • model_bytes_memory_limit number | string

        • output_memory_allocator_bytes number | string

        • peak_model_bytes number | string

        • assignment_memory_basis string
        • result_type string Required
        • total_by_field_count number Required
        • total_over_field_count number Required
        • total_partition_field_count number Required
        • categorization_status string Required

          Values are ok or warn.

        • categorized_doc_count number Required
        • dead_category_count number Required
        • failed_category_count number Required
        • frequent_category_count number Required
        • rare_category_count number Required
        • total_category_count number Required
        • timestamp number
      • open_time string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • state string Required

        Values are closing, closed, opened, failed, or opening.

      • timing_stats object Required
        Hide timing_stats attributes Show timing_stats attributes object
        • average_bucket_processing_time_ms number

          Time unit for fractional milliseconds

        • bucket_count number Required
        • exponential_average_bucket_processing_time_ms number

          Time unit for fractional milliseconds

        • exponential_average_bucket_processing_time_per_hour_ms number

          Time unit for fractional milliseconds

        • job_id string Required
        • total_bucket_processing_time_ms number

          Time unit for fractional milliseconds

        • maximum_bucket_processing_time_ms number

          Time unit for fractional milliseconds

        • minimum_bucket_processing_time_ms number

          Time unit for fractional milliseconds

      • deleting boolean

        Indicates that the process of deleting the job is in progress but not yet completed. It is only reported when true.

GET /_ml/anomaly_detectors/{job_id}/_stats
GET _ml/anomaly_detectors/low_request_rate/_stats
resp = client.ml.get_job_stats(
    job_id="low_request_rate",
)
const response = await client.ml.getJobStats({
  job_id: "low_request_rate",
});
response = client.ml.get_job_stats(
  job_id: "low_request_rate"
)
$resp = $client->ml()->getJobStats([
    "job_id" => "low_request_rate",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_ml/anomaly_detectors/low_request_rate/_stats"
client.ml().getJobStats(g -> g
    .jobId("low_request_rate")
);












Reset an anomaly detection job Generally available

POST /_ml/anomaly_detectors/{job_id}/_reset

All model state and results are deleted. The job is ready to start over as if it had just been created. It is not currently possible to reset multiple jobs using wildcards or a comma separated list.

Required authorization

  • Cluster privileges: manage_ml

Path parameters

  • job_id string Required

    The ID of the job to reset.

Query parameters

  • wait_for_completion boolean

    Should this request wait until the operation has completed before returning.

  • delete_user_annotations boolean

    Specifies whether annotations that have been added by the user should be deleted along with any auto-generated annotations when the job is reset.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_ml/anomaly_detectors/{job_id}/_reset
POST _ml/anomaly_detectors/total-requests/_reset
resp = client.ml.reset_job(
    job_id="total-requests",
)
const response = await client.ml.resetJob({
  job_id: "total-requests",
});
response = client.ml.reset_job(
  job_id: "total-requests"
)
$resp = $client->ml()->resetJob([
    "job_id" => "total-requests",
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_ml/anomaly_detectors/total-requests/_reset"
client.ml().resetJob(r -> r
    .jobId("total-requests")
);

Start datafeeds Generally available

POST /_ml/datafeeds/{datafeed_id}/_start

A datafeed must be started in order to retrieve data from Elasticsearch. A datafeed can be started and stopped multiple times throughout its lifecycle.

Before you can start a datafeed, the anomaly detection job must be open. Otherwise, an error occurs.

If you restart a stopped datafeed, it continues processing input data from the next millisecond after it was stopped. If new data was indexed for that exact millisecond between stopping and starting, it will be ignored.

When Elasticsearch security features are enabled, your datafeed remembers which roles the last user to create or update it had at the time of creation or update and runs the query using those same roles. If you provided secondary authorization headers when you created or updated the datafeed, those credentials are used instead.

Required authorization

  • Cluster privileges: manage_ml

Path parameters

  • datafeed_id string Required

    A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

Query parameters

  • end string | number

    The time that the datafeed should end, which can be specified by using one of the following formats:

    • ISO 8601 format with milliseconds, for example 2017-01-22T06:00:00.000Z
    • ISO 8601 format without milliseconds, for example 2017-01-22T06:00:00+00:00
    • Milliseconds since the epoch, for example 1485061200000

    Date-time arguments using either of the ISO 8601 formats must have a time zone designator, where Z is accepted as an abbreviation for UTC time. When a URL is expected (for example, in browsers), the + used in time zone designators must be encoded as %2B. The end time value is exclusive. If you do not specify an end time, the datafeed runs continuously.

  • start string | number

    The time that the datafeed should begin, which can be specified by using the same formats as the end parameter. This value is inclusive. If you do not specify a start time and the datafeed is associated with a new anomaly detection job, the analysis starts from the earliest time for which data is available. If you restart a stopped datafeed and specify a start value that is earlier than the timestamp of the latest processed record, the datafeed continues from 1 millisecond after the timestamp of the latest processed record.

  • timeout string

    Specifies the amount of time to wait until a datafeed starts.

    Values are -1 or 0.

application/json

Body

  • end string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    One of:
  • start string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    One of:
  • timeout string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • node string | array[string] Required

    • started boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_ml/datafeeds/{datafeed_id}/_start
POST _ml/datafeeds/datafeed-low_request_rate/_start
{
  "start": "2019-04-07T18:22:16Z"
}
resp = client.ml.start_datafeed(
    datafeed_id="datafeed-low_request_rate",
    start="2019-04-07T18:22:16Z",
)
const response = await client.ml.startDatafeed({
  datafeed_id: "datafeed-low_request_rate",
  start: "2019-04-07T18:22:16Z",
});
response = client.ml.start_datafeed(
  datafeed_id: "datafeed-low_request_rate",
  body: {
    "start": "2019-04-07T18:22:16Z"
  }
)
$resp = $client->ml()->startDatafeed([
    "datafeed_id" => "datafeed-low_request_rate",
    "body" => [
        "start" => "2019-04-07T18:22:16Z",
    ],
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"start":"2019-04-07T18:22:16Z"}' "$ELASTICSEARCH_URL/_ml/datafeeds/datafeed-low_request_rate/_start"
client.ml().startDatafeed(s -> s
    .datafeedId("datafeed-low_request_rate")
    .start(DateTime.of("2019-04-07T18:22:16Z"))
);
Request example
An example body for a `POST _ml/datafeeds/datafeed-low_request_rate/_start` request.
{
  "start": "2019-04-07T18:22:16Z"
}













































Stop data frame analytics jobs Generally available

POST /_ml/data_frame/analytics/{id}/_stop

A data frame analytics job can be started and stopped multiple times throughout its lifecycle.

Required authorization

  • Cluster privileges: manage_ml

Path parameters

  • id string Required

    Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

Query parameters

  • allow_no_match boolean

    Specifies what to do when the request:

    1. Contains wildcard expressions and there are no data frame analytics jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value is true, which returns an empty data_frame_analytics array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • force boolean

    If true, the data frame analytics job is stopped forcefully.

  • timeout string

    Controls the amount of time to wait until the data frame analytics job stops. Defaults to 20 seconds.

    Values are -1 or 0.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • stopped boolean Required
POST /_ml/data_frame/analytics/{id}/_stop
POST _ml/data_frame/analytics/loganalytics/_stop
resp = client.ml.stop_data_frame_analytics(
    id="loganalytics",
)
const response = await client.ml.stopDataFrameAnalytics({
  id: "loganalytics",
});
response = client.ml.stop_data_frame_analytics(
  id: "loganalytics"
)
$resp = $client->ml()->stopDataFrameAnalytics([
    "id" => "loganalytics",
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_ml/data_frame/analytics/loganalytics/_stop"
client.ml().stopDataFrameAnalytics(s -> s
    .id("loganalytics")
);

Update a data frame analytics job Generally available

POST /_ml/data_frame/analytics/{id}/_update

Required authorization

  • Index privileges: read,create_index,manage,index,view_index_metadata
  • Cluster privileges: manage_ml

Path parameters

  • id string Required

    Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

application/json

Body Required

  • description string

    A description of the job.

  • model_memory_limit string

    The approximate maximum amount of memory resources that are permitted for analytical processing. If your elasticsearch.yml file contains an xpack.ml.max_model_memory_limit setting, an error occurs when you try to create data frame analytics jobs that have model_memory_limit values greater than that setting.

    Default value is 1gb.

  • max_num_threads number

    The maximum number of threads to be used by the analysis. Using more threads may decrease the time necessary to complete the analysis at the cost of using more CPU. Note that the process may use additional threads for operational functionality other than the analysis itself.

    Default value is 1.

  • allow_lazy_start boolean

    Specifies whether this job can start when there is insufficient machine learning node capacity for it to be immediately assigned to a node.

    Default value is false.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • authorization object
      Hide authorization attributes Show authorization attributes object
      • api_key object
        Hide api_key attributes Show api_key attributes object
        • id string Required

          The identifier for the API key.

        • name string Required

          The name of the API key.

      • roles array[string]

        If a user ID was used for the most recent update to the job, its roles at the time of the update are listed in the response.

      • service_account string

        If a service account was used for the most recent update to the job, the account name is listed in the response.

    • allow_lazy_start boolean Required
    • analysis object Required
      Hide analysis attributes Show analysis attributes object
      • classification object
        Hide classification attributes Show classification attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • downsample_factor number

          Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • early_stopping_enabled boolean

          Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

          Default value is true.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • eta_growth_rate_per_tree number

          Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • feature_bag_fraction number

          Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
          • frequency_encoding object
          • multi_encoding object
          • n_gram_encoding object
          • one_hot_encoding object
          • target_mean_encoding object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • max_optimization_rounds_per_hyperparameter number

          Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • max_trees number

          Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • num_top_feature_importance_values number

          Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

          Default value is 0.

        • prediction_field_name string

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • randomize_seed number

          Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • soft_tree_depth_limit number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • soft_tree_depth_tolerance number

          Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • training_percent string | number

        • class_assignment_objective string
        • num_top_classes number

          Defines the number of categories for which the predicted probabilities are reported. It must be non-negative or -1. If it is -1 or greater than the total number of categories, probabilities are reported for all categories; if you have a large number of categories, there could be a significant effect on the size of your destination index. NOTE: To use the AUC ROC evaluation method, num_top_classes must be set to -1 or a value greater than or equal to the total number of categories.

          Default value is 2.

      • outlier_detection object
        Hide outlier_detection attributes Show outlier_detection attributes object
        • compute_feature_influence boolean

          Specifies whether the feature influence calculation is enabled.

          Default value is true.

        • feature_influence_threshold number

          The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1.

          Default value is 0.1.

        • method string

          The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

          Default value is ensemble.

        • n_neighbors number

          Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

        • outlier_fraction number

          The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

        • standardization_enabled boolean

          If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

          Default value is true.

      • regression object
        Hide regression attributes Show regression attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • downsample_factor number

          Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • early_stopping_enabled boolean

          Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

          Default value is true.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • eta_growth_rate_per_tree number

          Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • feature_bag_fraction number

          Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
          • frequency_encoding object
          • multi_encoding object
          • n_gram_encoding object
          • one_hot_encoding object
          • target_mean_encoding object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • max_optimization_rounds_per_hyperparameter number

          Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • max_trees number

          Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • num_top_feature_importance_values number

          Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

          Default value is 0.

        • prediction_field_name string

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • randomize_seed number

          Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • soft_tree_depth_limit number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • soft_tree_depth_tolerance number

          Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • training_percent string | number

        • loss_function string

          The loss function used during regression. Available options are mse (mean squared error), msle (mean squared logarithmic error), huber (Pseudo-Huber loss).

          Default value is mse.

        • loss_function_parameter number

          A positive number that is used as a parameter to the loss_function.

    • analyzed_fields object
      Hide analyzed_fields attributes Show analyzed_fields attributes object
      • includes array[string]

        An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

      • excludes array[string]

        An array of strings that defines the fields that will be included in the analysis.

    • create_time number Required
    • description string
    • dest object Required
      Hide dest attributes Show dest attributes object
      • index string Required
      • results_field string

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • id string Required
    • max_num_threads number Required
    • model_memory_limit string Required
    • source object Required
      Hide source attributes Show source attributes object
      • index string | array[string] Required
      • runtime_mappings object
        Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • input_field string

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • target_field string

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • target_index string
          • script object
            Hide script attributes Show script attributes object
            • id string
            • params object

              Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

              Hide params attribute Show params attribute object
              • * object Additional properties
            • lang string

              Any of:

              Values are painless, expression, mustache, or java.

            • options object
              Hide options attribute Show options attribute object
              • * string Additional properties
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • _source object
        Hide _source attributes Show _source attributes object
        • includes array[string]

          An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

        • excludes array[string]

          An array of strings that defines the fields that will be included in the analysis.

      • query object

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {}}.

        Query DSL
    • version string Required
POST /_ml/data_frame/analytics/{id}/_update
POST _ml/data_frame/analytics/loganalytics/_update
{
  "model_memory_limit": "200mb"
}
resp = client.ml.update_data_frame_analytics(
    id="loganalytics",
    model_memory_limit="200mb",
)
const response = await client.ml.updateDataFrameAnalytics({
  id: "loganalytics",
  model_memory_limit: "200mb",
});
response = client.ml.update_data_frame_analytics(
  id: "loganalytics",
  body: {
    "model_memory_limit": "200mb"
  }
)
$resp = $client->ml()->updateDataFrameAnalytics([
    "id" => "loganalytics",
    "body" => [
        "model_memory_limit" => "200mb",
    ],
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"model_memory_limit":"200mb"}' "$ELASTICSEARCH_URL/_ml/data_frame/analytics/loganalytics/_update"
client.ml().updateDataFrameAnalytics(u -> u
    .id("loganalytics")
    .modelMemoryLimit("200mb")
);
Request example
An example body for a `POST _ml/data_frame/analytics/loganalytics/_update` request.
{
  "model_memory_limit": "200mb"
}

Get trained model configuration info Generally available

GET /_ml/trained_models/{model_id}

All methods and paths for this operation:

GET /_ml/trained_models

GET /_ml/trained_models/{model_id}

Required authorization

  • Cluster privileges: monitor_ml

Path parameters

  • model_id string | array[string] Required

    The unique identifier of the trained model or a model alias.

    You can get information for multiple trained models in a single API request by using a comma-separated list of model IDs or a wildcard expression.

Query parameters

  • allow_no_match boolean

    Specifies what to do when the request:

    • Contains wildcard expressions and there are no models that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, it returns an empty array when there are no matches and the subset of results when there are partial matches.

  • decompress_definition boolean

    Specifies whether the included model definition should be returned as a JSON map (true) or in a custom compressed format (false).

  • exclude_generated boolean

    Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.

  • from number

    Skips the specified number of models.

  • include string

    A comma delimited string of optional fields to include in the response body.

    Supported values include:

    • definition: Includes the model definition.
    • feature_importance_baseline: Includes the baseline for feature importance values.
    • hyperparameters: Includes the information about hyperparameters used to train the model. This information consists of the value, the absolute and relative importance of the hyperparameter as well as an indicator of whether it was specified by the user or tuned during hyperparameter optimization.
    • total_feature_importance: Includes the total feature importance for the training data set. The baseline and total feature importance values are returned in the metadata field in the response body.
    • definition_status: Includes the model definition status.

    Values are definition, feature_importance_baseline, hyperparameters, total_feature_importance, or definition_status.

  • size number

    Specifies the maximum number of models to obtain.

  • tags string | array[string]

    A comma delimited string of tags. A trained model can have many tags, or none. When supplied, only trained models that contain all the supplied tags are returned.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • trained_model_configs array[object] Required

      An array of trained model resources, which are sorted by the model_id value in ascending order.

      Hide trained_model_configs attributes Show trained_model_configs attributes object
      • model_id string Required
      • model_type string

        Values are tree_ensemble, lang_ident, or pytorch.

      • tags array[string] Required

        A comma delimited string of tags. A trained model can have many tags, or none.

      • version string
      • compressed_definition string
      • created_by string

        Information on the creator of the trained model.

      • create_time string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • default_field_map object

        Any field map described in the inference configuration takes precedence.

        Hide default_field_map attribute Show default_field_map attribute object
        • * string Additional properties
      • description string

        The free-text description of the trained model.

      • estimated_heap_memory_usage_bytes number

        The estimated heap usage in bytes to keep the trained model in memory.

      • estimated_operations number

        The estimated number of operations to use the trained model.

      • fully_defined boolean

        True if the full model definition is present.

      • inference_config object

        Inference configuration provided when storing the model config

        Hide inference_config attributes Show inference_config attributes object
        • regression object
          Hide regression attributes Show regression attributes object
          • results_field string

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • num_top_feature_importance_values number

            Specifies the maximum number of feature importance values per document.

            Default value is 0.

        • classification object
          Hide classification attributes Show classification attributes object
          • num_top_classes number

            Specifies the number of top class predictions to return. Defaults to 0.

          • num_top_feature_importance_values number

            Specifies the maximum number of feature importance values per document.

            Default value is 0.

          • prediction_field_type string

            Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided 1.0 is transformed to true and 0.0 to false.

          • results_field string

            The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • top_classes_results_field string

            Specifies the field to which the top classes are written. Defaults to top_classes.

        • text_classification object

          Text classification configuration options

          Hide text_classification attributes Show text_classification attributes object
          • num_top_classes number

            Specifies the number of top class predictions to return. Defaults to 0.

          • tokenization object

            Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
            • bert
            • bert_ja
            • mpnet
            • roberta
            • xlm_roberta
          • results_field string

            The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • classification_labels array[string]

            Classification labels to apply other than the stored labels. Must have the same deminsions as the default configured labels

          • vocabulary object
            Hide vocabulary attribute Show vocabulary attribute object
            • index string Required
        • zero_shot_classification object

          Zero shot classification configuration options

          Hide zero_shot_classification attributes Show zero_shot_classification attributes object
          • tokenization object

            Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
            • bert
            • bert_ja
            • mpnet
            • roberta
            • xlm_roberta
          • hypothesis_template string

            Hypothesis template used when tokenizing labels for prediction

            Default value is "This example is {}.".

          • classification_labels array[string] Required

            The zero shot classification labels indicating entailment, neutral, and contradiction Must contain exactly and only entailment, neutral, and contradiction

          • results_field string

            The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • multi_label boolean

            Indicates if more than one true label exists.

            Default value is false.

          • labels array[string]

            The labels to predict.

        • fill_mask object

          Fill mask inference options

          Hide fill_mask attributes Show fill_mask attributes object
          • mask_token string

            The string/token which will be removed from incoming documents and replaced with the inference prediction(s). In a response, this field contains the mask token for the specified model/tokenizer. Each model and tokenizer has a predefined mask token which cannot be changed. Thus, it is recommended not to set this value in requests. However, if this field is present in a request, its value must match the predefined value for that model/tokenizer, otherwise the request will fail.

          • num_top_classes number

            Specifies the number of top class predictions to return. Defaults to 0.

          • tokenization object

            Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
            • bert
            • bert_ja
            • mpnet
            • roberta
            • xlm_roberta
          • results_field string

            The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • vocabulary object Required
            Hide vocabulary attribute Show vocabulary attribute object
            • index string Required
        • learning_to_rank object
          Hide learning_to_rank attributes Show learning_to_rank attributes object
          • default_params object
            Hide default_params attribute Show default_params attribute object
            • * object Additional properties
          • feature_extractors array[object]
          • num_top_feature_importance_values number Required
        • ner object

          Named entity recognition options

          Hide ner attributes Show ner attributes object
          • tokenization object

            Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
            • bert
            • bert_ja
            • mpnet
            • roberta
            • xlm_roberta
          • results_field string

            The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • classification_labels array[string]

            The token classification labels. Must be IOB formatted tags

          • vocabulary object
            Hide vocabulary attribute Show vocabulary attribute object
            • index string Required
        • pass_through object

          Pass through configuration options

          Hide pass_through attributes Show pass_through attributes object
          • tokenization object

            Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
            • bert
            • bert_ja
            • mpnet
            • roberta
            • xlm_roberta
          • results_field string

            The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • vocabulary object
            Hide vocabulary attribute Show vocabulary attribute object
            • index string Required
        • text_embedding object

          Text embedding inference options

          Hide text_embedding attributes Show text_embedding attributes object
          • embedding_size number

            The number of dimensions in the embedding output

          • tokenization object

            Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
            • bert
            • bert_ja
            • mpnet
            • roberta
            • xlm_roberta
          • results_field string

            The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • vocabulary object Required
            Hide vocabulary attribute Show vocabulary attribute object
            • index string Required
        • text_expansion object

          Text expansion inference options

          Hide text_expansion attributes Show text_expansion attributes object
          • tokenization object

            Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
            • bert
            • bert_ja
            • mpnet
            • roberta
            • xlm_roberta
          • results_field string

            The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • vocabulary object Required
            Hide vocabulary attribute Show vocabulary attribute object
            • index string Required
        • question_answering object

          Question answering inference options

          Hide question_answering attributes Show question_answering attributes object
          • num_top_classes number

            Specifies the number of top class predictions to return. Defaults to 0.

          • tokenization object

            Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
            • bert
            • bert_ja
            • mpnet
            • roberta
            • xlm_roberta
          • results_field string

            The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • max_answer_length number

            The maximum answer length to consider

      • input object Required
        Hide input attribute Show input attribute object
        • field_names array[string] Required

          An array of input field names for the model.

      • license_level string

        The license level of the trained model.

      • metadata object
        Hide metadata attributes Show metadata attributes object
        • model_aliases array[string]
        • feature_importance_baseline object

          An object that contains the baseline for feature importance values. For regression analysis, it is a single value. For classification analysis, there is a value for each class.

          Hide feature_importance_baseline attribute Show feature_importance_baseline attribute object
          • * string Additional properties
        • hyperparameters array[object]

          List of the available hyperparameters optimized during the fine_parameter_tuning phase as well as specified by the user.

          Hide hyperparameters attributes Show hyperparameters attributes object
          • absolute_importance number

            A positive number showing how much the parameter influences the variation of the loss function. For hyperparameters with values that are not specified by the user but tuned during hyperparameter optimization.

          • name string Required
          • relative_importance number

            A number between 0 and 1 showing the proportion of influence on the variation of the loss function among all tuned hyperparameters. For hyperparameters with values that are not specified by the user but tuned during hyperparameter optimization.

          • supplied boolean Required

            Indicates if the hyperparameter is specified by the user (true) or optimized (false).

          • value number Required

            The value of the hyperparameter, either optimized or specified by the user.

        • total_feature_importance array[object]

          An array of the total feature importance for each feature used from the training data set. This array of objects is returned if data frame analytics trained the model and the request includes total_feature_importance in the include request parameter.

          Hide total_feature_importance attributes Show total_feature_importance attributes object
          • feature_name string Required
          • importance array[object] Required

            A collection of feature importance statistics related to the training data set for this particular feature.

          • classes array[object] Required

            If the trained model is a classification model, feature importance statistics are gathered per target class value.

      • model_size_bytes number | string

      • model_package object
        Hide model_package attributes Show model_package attributes object
        • create_time number

          Time unit for milliseconds

        • description string
        • inference_config object
          Hide inference_config attribute Show inference_config attribute object
          • * object Additional properties
        • metadata object
          Hide metadata attribute Show metadata attribute object
          • * object Additional properties
        • minimum_version string
        • model_repository string
        • model_type string
        • packaged_model_id string Required
        • platform_architecture string
        • prefix_strings object
          Hide prefix_strings attributes Show prefix_strings attributes object
          • ingest string

            String prepended to input at ingest

        • size number | string

        • sha256 string
        • tags array[string]
        • vocabulary_file string
      • location object
        Hide location attribute Show location attribute object
        • index object Required
          Hide index attribute Show index attribute object
          • name string Required
      • platform_architecture string
      • prefix_strings object
        Hide prefix_strings attributes Show prefix_strings attributes object
        • ingest string

          String prepended to input at ingest

GET /_ml/trained_models/{model_id}
GET _ml/trained_models/
resp = client.ml.get_trained_models()
const response = await client.ml.getTrainedModels();
response = client.ml.get_trained_models
$resp = $client->ml()->getTrainedModels();
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_ml/trained_models/"
client.ml().getTrainedModels(g -> g);




Delete an unreferenced trained model Generally available

DELETE /_ml/trained_models/{model_id}

The request deletes a trained inference model that is not referenced by an ingest pipeline.

Required authorization

  • Cluster privileges: manage_ml

Path parameters

  • model_id string Required

    The unique identifier of the trained model.

Query parameters

  • force boolean

    Forcefully deletes a trained model that is referenced by ingest pipelines or has a started deployment.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

    Values are -1 or 0.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/trained_models/{model_id}
DELETE _ml/trained_models/regression-job-one-1574775307356
resp = client.ml.delete_trained_model(
    model_id="regression-job-one-1574775307356",
)
const response = await client.ml.deleteTrainedModel({
  model_id: "regression-job-one-1574775307356",
});
response = client.ml.delete_trained_model(
  model_id: "regression-job-one-1574775307356"
)
$resp = $client->ml()->deleteTrainedModel([
    "model_id" => "regression-job-one-1574775307356",
]);
curl -X DELETE -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_ml/trained_models/regression-job-one-1574775307356"
client.ml().deleteTrainedModel(d -> d
    .modelId("regression-job-one-1574775307356")
);
Response examples (200)
A successful response when deleting an existing trained inference model.
{
  "acknowledged": true
}












Evaluate a trained model Generally available

POST /_ml/trained_models/{model_id}/_infer

Path parameters

  • model_id string Required

    The unique identifier of the trained model.

Query parameters

  • timeout string

    Controls the amount of time to wait for inference results.

    Values are -1 or 0.

application/json

Body Required

  • docs array[object] Required

    An array of objects to pass to the model for inference. The objects should contain a fields matching your configured trained model input. Typically, for NLP models, the field name is text_field. Currently, for NLP models, only a single value is allowed.

    Hide docs attribute Show docs attribute object
    • * object Additional properties
  • inference_config object
    Hide inference_config attributes Show inference_config attributes object
    • regression object
      Hide regression attributes Show regression attributes object
      • results_field string

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • num_top_feature_importance_values number

        Specifies the maximum number of feature importance values per document.

        Default value is 0.

    • classification object
      Hide classification attributes Show classification attributes object
      • num_top_classes number

        Specifies the number of top class predictions to return. Defaults to 0.

      • num_top_feature_importance_values number

        Specifies the maximum number of feature importance values per document.

        Default value is 0.

      • prediction_field_type string

        Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided 1.0 is transformed to true and 0.0 to false.

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • top_classes_results_field string

        Specifies the field to which the top classes are written. Defaults to top_classes.

    • text_classification object
      Hide text_classification attributes Show text_classification attributes object
      • num_top_classes number

        Specifies the number of top class predictions to return. Defaults to 0.

      • tokenization object
        Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • classification_labels array[string]

        Classification labels to apply other than the stored labels. Must have the same deminsions as the default configured labels

    • zero_shot_classification object
      Hide zero_shot_classification attributes Show zero_shot_classification attributes object
      • tokenization object
        Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • multi_label boolean

        Update the configured multi label option. Indicates if more than one true label exists. Defaults to the configured value.

      • labels array[string] Required

        The labels to predict.

    • fill_mask object
      Hide fill_mask attributes Show fill_mask attributes object
      • num_top_classes number

        Specifies the number of top class predictions to return. Defaults to 0.

      • tokenization object
        Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

    • ner object
      Hide ner attributes Show ner attributes object
      • tokenization object
        Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

    • pass_through object
      Hide pass_through attributes Show pass_through attributes object
      • tokenization object
        Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

    • text_embedding object
      Hide text_embedding attributes Show text_embedding attributes object
      • tokenization object
        Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

    • text_expansion object
      Hide text_expansion attributes Show text_expansion attributes object
      • tokenization object
        Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

    • question_answering object
      Hide question_answering attributes Show question_answering attributes object
      • question string Required

        The question to answer given the inference context

      • num_top_classes number

        Specifies the number of top class predictions to return. Defaults to 0.

      • tokenization object
        Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • max_answer_length number

        The maximum answer length to consider for extraction

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • inference_results array[object] Required
      Hide inference_results attributes Show inference_results attributes object
      • entities array[object]

        If the model is trained for named entity recognition (NER) tasks, the response contains the recognized entities.

        Hide entities attributes Show entities attributes object
        • class_name string Required
        • class_probability number Required
        • entity string Required
        • start_pos number Required
        • end_pos number Required
      • is_truncated boolean

        Indicates whether the input text was truncated to meet the model's maximum sequence length limit. This property is present only when it is true.

      • predicted_value number | string | boolean | null | array[number | string | boolean | null] | array[number | string | boolean | null | array]

        If the model is trained for a text classification or zero shot classification task, the response is the predicted class. For named entity recognition (NER) tasks, it contains the annotated text output. For fill mask tasks, it contains the top prediction for replacing the mask token. For text embedding tasks, it contains the raw numerical text embedding values. For regression models, its a numerical value For classification models, it may be an integer, double, boolean or string depending on prediction type

      • predicted_value_sequence string

        For fill mask tasks, the response contains the input text sequence with the mask token replaced by the predicted value. Additionally

      • prediction_probability number

        Specifies a probability for the predicted value.

      • prediction_score number

        Specifies a confidence score for the predicted value.

      • top_classes array[object]

        For fill mask, text classification, and zero shot classification tasks, the response contains a list of top class entries.

        Hide top_classes attributes Show top_classes attributes object
        • class_name string Required
        • class_probability number Required
        • class_score number Required
      • warning string

        If the request failed, the response contains the reason for the failure.

      • feature_importance array[object]

        The feature importance for the inference results. Relevant only for classification or regression models

        Hide feature_importance attributes Show feature_importance attributes object
        • feature_name string Required
        • importance number
        • classes array[object]
POST /_ml/trained_models/{model_id}/_infer
POST _ml/trained_models/lang_ident_model_1/_infer
{
  "docs":[{"text": "The fool doth think he is wise, but the wise man knows himself to be a fool."}]
}
resp = client.ml.infer_trained_model(
    model_id="lang_ident_model_1",
    docs=[
        {
            "text": "The fool doth think he is wise, but the wise man knows himself to be a fool."
        }
    ],
)
const response = await client.ml.inferTrainedModel({
  model_id: "lang_ident_model_1",
  docs: [
    {
      text: "The fool doth think he is wise, but the wise man knows himself to be a fool.",
    },
  ],
});
response = client.ml.infer_trained_model(
  model_id: "lang_ident_model_1",
  body: {
    "docs": [
      {
        "text": "The fool doth think he is wise, but the wise man knows himself to be a fool."
      }
    ]
  }
)
$resp = $client->ml()->inferTrainedModel([
    "model_id" => "lang_ident_model_1",
    "body" => [
        "docs" => array(
            [
                "text" => "The fool doth think he is wise, but the wise man knows himself to be a fool.",
            ],
        ),
    ],
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"docs":[{"text":"The fool doth think he is wise, but the wise man knows himself to be a fool."}]}' "$ELASTICSEARCH_URL/_ml/trained_models/lang_ident_model_1/_infer"
client.ml().inferTrainedModel(i -> i
    .docs(Map.of("text", JsonData.fromJson("\"The fool doth think he is wise, but the wise man knows himself to be a fool.\"")))
    .modelId("lang_ident_model_1")
);
Request example
An example body for a `POST _ml/trained_models/lang_ident_model_1/_infer` request.
{
  "docs":[{"text": "The fool doth think he is wise, but the wise man knows himself to be a fool."}]
}








Start a trained model deployment Generally available

POST /_ml/trained_models/{model_id}/deployment/_start

It allocates the model to every machine learning node.

Required authorization

  • Cluster privileges: manage_ml

Path parameters

  • model_id string Required

    The unique identifier of the trained model. Currently, only PyTorch models are supported.

Query parameters

  • cache_size number | string

    The inference cache size (in memory outside the JVM heap) per node for the model. The default value is the same size as the model_size_bytes. To disable the cache, 0b can be provided.

  • number_of_allocations number

    The number of model allocations on each node where the model is deployed. All allocations on a node share the same copy of the model in memory but use a separate set of threads to evaluate the model. Increasing this value generally increases the throughput. If this setting is greater than the number of hardware threads it will automatically be changed to a value less than the number of hardware threads. If adaptive_allocations is enabled, do not set this value, because it’s automatically set.

  • priority string

    The deployment priority.

    Values are normal or low.

  • queue_capacity number

    Specifies the number of inference requests that are allowed in the queue. After the number of requests exceeds this value, new requests are rejected with a 429 error.

  • threads_per_allocation number

    Sets the number of threads used by each model allocation during inference. This generally increases the inference speed. The inference process is a compute-bound process; any number greater than the number of available hardware threads on the machine does not increase the inference speed. If this setting is greater than the number of hardware threads it will automatically be changed to a value less than the number of hardware threads.

  • timeout string

    Specifies the amount of time to wait for the model to deploy.

    Values are -1 or 0.

  • wait_for string

    Specifies the allocation status to wait for before returning.

    Supported values include:

    • started: The trained model is started on at least one node.
    • starting: Trained model deployment is starting but it is not yet deployed on any nodes.
    • fully_allocated: Trained model deployment has started on all valid nodes.

    Values are started, starting, or fully_allocated.

application/json

Body

  • adaptive_allocations object
    Hide adaptive_allocations attributes Show adaptive_allocations attributes object
    • enabled boolean Required

      If true, adaptive_allocations is enabled

    • min_number_of_allocations number

      Specifies the minimum number of allocations to scale to. If set, it must be greater than or equal to 0. If not defined, the deployment scales to 0.

    • max_number_of_allocations number

      Specifies the maximum number of allocations to scale to. If set, it must be greater than or equal to min_number_of_allocations.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • assignment object Required
      Hide assignment attributes Show assignment attributes object
      • adaptive_allocations object | string | null

        One of:
        Hide attributes Show attributes
        • enabled boolean Required

          If true, adaptive_allocations is enabled

        • min_number_of_allocations number

          Specifies the minimum number of allocations to scale to. If set, it must be greater than or equal to 0. If not defined, the deployment scales to 0.

        • max_number_of_allocations number

          Specifies the maximum number of allocations to scale to. If set, it must be greater than or equal to min_number_of_allocations.

      • assignment_state string Required

        Values are started, starting, stopping, or failed.

      • max_assigned_allocations number
      • reason string
      • routing_table object Required

        The allocation state for each node.

        Hide routing_table attribute Show routing_table attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • reason string

            The reason for the current state. It is usually populated only when the routing_state is failed.

          • routing_state string Required

            Values are failed, started, starting, stopped, or stopping.

          • current_allocations number Required

            Current number of allocations.

          • target_allocations number Required

            Target number of allocations.

      • start_time string | number Required

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • task_parameters object Required
        Hide task_parameters attributes Show task_parameters attributes object
POST /_ml/trained_models/{model_id}/deployment/_start
POST _ml/trained_models/elastic__distilbert-base-uncased-finetuned-conll03-english/deployment/_start?wait_for=started&timeout=1m
resp = client.ml.start_trained_model_deployment(
    model_id="elastic__distilbert-base-uncased-finetuned-conll03-english",
    wait_for="started",
    timeout="1m",
)
const response = await client.ml.startTrainedModelDeployment({
  model_id: "elastic__distilbert-base-uncased-finetuned-conll03-english",
  wait_for: "started",
  timeout: "1m",
});
response = client.ml.start_trained_model_deployment(
  model_id: "elastic__distilbert-base-uncased-finetuned-conll03-english",
  wait_for: "started",
  timeout: "1m"
)
$resp = $client->ml()->startTrainedModelDeployment([
    "model_id" => "elastic__distilbert-base-uncased-finetuned-conll03-english",
    "wait_for" => "started",
    "timeout" => "1m",
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_ml/trained_models/elastic__distilbert-base-uncased-finetuned-conll03-english/deployment/_start?wait_for=started&timeout=1m"
client.ml().startTrainedModelDeployment(s -> s
    .modelId("elastic__distilbert-base-uncased-finetuned-conll03-english")
    .timeout(t -> t
        .offset(1)
    )
    .waitFor(DeploymentAllocationState.Started)
);

Stop a trained model deployment Generally available

POST /_ml/trained_models/{model_id}/deployment/_stop

Required authorization

  • Cluster privileges: manage_ml

Path parameters

  • model_id string Required

    The unique identifier of the trained model.

Query parameters

  • allow_no_match boolean

    Specifies what to do when the request: contains wildcard expressions and there are no deployments that match; contains the _all string or no identifiers and there are no matches; or contains wildcard expressions and there are only partial matches. By default, it returns an empty array when there are no matches and the subset of results when there are partial matches. If false, the request returns a 404 status code when there are no matches or only partial matches.

  • force boolean

    Forcefully stops the deployment, even if it is used by ingest pipelines. You can't use these pipelines until you restart the model deployment.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • stopped boolean Required
POST /_ml/trained_models/{model_id}/deployment/_stop
POST _ml/trained_models/my_model_for_search/deployment/_stop
resp = client.ml.stop_trained_model_deployment(
    model_id="my_model_for_search",
)
const response = await client.ml.stopTrainedModelDeployment({
  model_id: "my_model_for_search",
});
response = client.ml.stop_trained_model_deployment(
  model_id: "my_model_for_search"
)
$resp = $client->ml()->stopTrainedModelDeployment([
    "model_id" => "my_model_for_search",
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_ml/trained_models/my_model_for_search/deployment/_stop"
client.ml().stopTrainedModelDeployment(s -> s
    .modelId("my_model_for_search")
);

Update a trained model deployment Beta

POST /_ml/trained_models/{model_id}/deployment/_update

Required authorization

  • Cluster privileges: manage_ml

Path parameters

  • model_id string Required

    The unique identifier of the trained model. Currently, only PyTorch models are supported.

Query parameters

  • number_of_allocations number

    The number of model allocations on each node where the model is deployed. All allocations on a node share the same copy of the model in memory but use a separate set of threads to evaluate the model. Increasing this value generally increases the throughput. If this setting is greater than the number of hardware threads it will automatically be changed to a value less than the number of hardware threads.

application/json

Body

  • number_of_allocations number

    The number of model allocations on each node where the model is deployed. All allocations on a node share the same copy of the model in memory but use a separate set of threads to evaluate the model. Increasing this value generally increases the throughput. If this setting is greater than the number of hardware threads it will automatically be changed to a value less than the number of hardware threads. If adaptive_allocations is enabled, do not set this value, because it’s automatically set.

    Default value is 1.

  • adaptive_allocations object
    Hide adaptive_allocations attributes Show adaptive_allocations attributes object
    • enabled boolean Required

      If true, adaptive_allocations is enabled

    • min_number_of_allocations number

      Specifies the minimum number of allocations to scale to. If set, it must be greater than or equal to 0. If not defined, the deployment scales to 0.

    • max_number_of_allocations number

      Specifies the maximum number of allocations to scale to. If set, it must be greater than or equal to min_number_of_allocations.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • assignment object Required
      Hide assignment attributes Show assignment attributes object
      • adaptive_allocations object | string | null

        One of:
        Hide attributes Show attributes
        • enabled boolean Required

          If true, adaptive_allocations is enabled

        • min_number_of_allocations number

          Specifies the minimum number of allocations to scale to. If set, it must be greater than or equal to 0. If not defined, the deployment scales to 0.

        • max_number_of_allocations number

          Specifies the maximum number of allocations to scale to. If set, it must be greater than or equal to min_number_of_allocations.

      • assignment_state string Required

        Values are started, starting, stopping, or failed.

      • max_assigned_allocations number
      • reason string
      • routing_table object Required

        The allocation state for each node.

        Hide routing_table attribute Show routing_table attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • reason string

            The reason for the current state. It is usually populated only when the routing_state is failed.

          • routing_state string Required

            Values are failed, started, starting, stopped, or stopping.

          • current_allocations number Required

            Current number of allocations.

          • target_allocations number Required

            Target number of allocations.

      • start_time string | number Required

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        One of:
      • task_parameters object Required
        Hide task_parameters attributes Show task_parameters attributes object
POST /_ml/trained_models/{model_id}/deployment/_update
POST _ml/trained_models/elastic__distilbert-base-uncased-finetuned-conll03-english/deployment/_update
{
  "number_of_allocations": 4
}
resp = client.ml.update_trained_model_deployment(
    model_id="elastic__distilbert-base-uncased-finetuned-conll03-english",
    number_of_allocations=4,
)
const response = await client.ml.updateTrainedModelDeployment({
  model_id: "elastic__distilbert-base-uncased-finetuned-conll03-english",
  number_of_allocations: 4,
});
response = client.ml.update_trained_model_deployment(
  model_id: "elastic__distilbert-base-uncased-finetuned-conll03-english",
  body: {
    "number_of_allocations": 4
  }
)
$resp = $client->ml()->updateTrainedModelDeployment([
    "model_id" => "elastic__distilbert-base-uncased-finetuned-conll03-english",
    "body" => [
        "number_of_allocations" => 4,
    ],
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"number_of_allocations":4}' "$ELASTICSEARCH_URL/_ml/trained_models/elastic__distilbert-base-uncased-finetuned-conll03-english/deployment/_update"
client.ml().updateTrainedModelDeployment(u -> u
    .modelId("elastic__distilbert-base-uncased-finetuned-conll03-english")
    .numberOfAllocations(4)
);
Request example
An example body for a `POST _ml/trained_models/elastic__distilbert-base-uncased-finetuned-conll03-english/deployment/_update` request.
{
  "number_of_allocations": 4
}

Query rules

Query rules enable you to configure per-query rules that are applied at query time to queries that match the specific rule. Query rules are organized into rulesets, collections of query rules that are matched against incoming queries. Query rules are applied using the rule query. If a query matches one or more rules in the ruleset, the query is re-written to apply the rules before searching. This allows pinning documents for only queries that match a specific term.

Learn more about the rule query

































Get a script or search template Generally available

GET /_scripts/{id}

Retrieves a stored script or search template.

Required authorization

  • Cluster privileges: manage

Path parameters

  • id string Required

    The identifier for the stored script or search template.

Query parameters

  • master_timeout string

    The period to wait for the master node. If the master node is not available before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

    Values are -1 or 0.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • _id string Required
    • found boolean Required
    • script object
      Hide script attributes Show script attributes object
      • lang string Required

        Any of:

        Values are painless, expression, mustache, or java.

      • options object
        Hide options attribute Show options attribute object
        • * string Additional properties
      • source string | object Required

        One of:
GET _scripts/my-search-template
resp = client.get_script(
    id="my-search-template",
)
const response = await client.getScript({
  id: "my-search-template",
});
response = client.get_script(
  id: "my-search-template"
)
$resp = $client->getScript([
    "id" => "my-search-template",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_scripts/my-search-template"
client.getScript(g -> g
    .id("my-search-template")
);




Create or update a script or search template Generally available

POST /_scripts/{id}/{context}

All methods and paths for this operation:

PUT /_scripts/{id}

POST /_scripts/{id}
PUT /_scripts/{id}/{context}
POST /_scripts/{id}/{context}

Creates or updates a stored script or search template.

Required authorization

  • Cluster privileges: manage
External documentation

Path parameters

  • id string Required

    The identifier for the stored script or search template. It must be unique within the cluster.

  • context string Required

    The context in which the script or search template should run. To prevent errors, the API immediately compiles the script or template in this context.

Query parameters

  • context string

    The context in which the script or search template should run. To prevent errors, the API immediately compiles the script or template in this context. If you specify both this and the <context> path parameter, the API uses the request path parameter.

  • master_timeout string

    The period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

    Values are -1 or 0.

  • timeout string

    The period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

    Values are -1 or 0.

application/json

Body Required

  • script object Required
    Hide script attributes Show script attributes object
    • lang string Required

      Any of:

      Values are painless, expression, mustache, or java.

    • options object
      Hide options attribute Show options attribute object
      • * string Additional properties
    • source string | object Required

      One of:

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_scripts/{id}/{context}
PUT _scripts/my-search-template
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "message": "{{query_string}}"
        }
      },
      "from": "{{from}}",
      "size": "{{size}}"
    }
  }
}
resp = client.put_script(
    id="my-search-template",
    script={
        "lang": "mustache",
        "source": {
            "query": {
                "match": {
                    "message": "{{query_string}}"
                }
            },
            "from": "{{from}}",
            "size": "{{size}}"
        }
    },
)
const response = await client.putScript({
  id: "my-search-template",
  script: {
    lang: "mustache",
    source: {
      query: {
        match: {
          message: "{{query_string}}",
        },
      },
      from: "{{from}}",
      size: "{{size}}",
    },
  },
});
response = client.put_script(
  id: "my-search-template",
  body: {
    "script": {
      "lang": "mustache",
      "source": {
        "query": {
          "match": {
            "message": "{{query_string}}"
          }
        },
        "from": "{{from}}",
        "size": "{{size}}"
      }
    }
  }
)
$resp = $client->putScript([
    "id" => "my-search-template",
    "body" => [
        "script" => [
            "lang" => "mustache",
            "source" => [
                "query" => [
                    "match" => [
                        "message" => "{{query_string}}",
                    ],
                ],
                "from" => "{{from}}",
                "size" => "{{size}}",
            ],
        ],
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"script":{"lang":"mustache","source":{"query":{"match":{"message":"{{query_string}}"}},"from":"{{from}}","size":"{{size}}"}}}' "$ELASTICSEARCH_URL/_scripts/my-search-template"
Request examples
Run `PUT _scripts/my-search-template` to create a search template.
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "message": "{{query_string}}"
        }
      },
      "from": "{{from}}",
      "size": "{{size}}"
    }
  }
}
Run `PUT _scripts/my-stored-script` to create a stored script.
{
  "script": {
    "lang": "painless",
    "source": "Math.log(_score * 2) + params['my_modifier']"
  }
}




Search





























Count search results Generally available

GET /{index}/_count

All methods and paths for this operation:

POST /_count

GET /_count
POST /{index}/_count
GET /{index}/_count

Get the number of documents matching a query.

The query can be provided either by using a simple query string as a parameter, or by defining Query DSL within the request body. The query is optional. When no query is provided, the API uses match_all to count all the documents.

The count API supports multi-target syntax. You can run a single count API search across multiple data streams and indices.

The operation is broadcast across all shards. For each shard ID group, a replica is chosen and the search is run against it. This means that replicas increase the scalability of the count.

Required authorization

  • Index privileges: read

Path parameters

  • index string | array[string] Required

    A comma-separated list of data streams, indices, and aliases to search. It supports wildcards (*). To search all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • allow_no_indices boolean

    If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • analyzer string

    The analyzer to use for the query string. This parameter can be used only when the q query string parameter is specified.

  • analyze_wildcard boolean

    If true, wildcard and prefix queries are analyzed. This parameter can be used only when the q query string parameter is specified.

  • default_operator string

    The default operator for query string query: AND or OR. This parameter can be used only when the q query string parameter is specified.

    Values are and, AND, or, or OR.

  • df string

    The field to use as a default when no field prefix is given in the query string. This parameter can be used only when the q query string parameter is specified.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.

    Values are all, open, closed, hidden, or none.

  • ignore_throttled boolean Deprecated

    If true, concrete, expanded, or aliased indices are ignored when frozen.

  • ignore_unavailable boolean

    If false, the request returns an error if it targets a missing or closed index.

  • lenient boolean

    If true, format-based query failures (such as providing text to a numeric field) in the query string will be ignored. This parameter can be used only when the q query string parameter is specified.

  • min_score number

    The minimum _score value that documents must have to be included in the result.

  • preference string

    The node or shard the operation should be performed on. By default, it is random.

  • routing string

    A custom value used to route operations to a specific shard.

  • terminate_after number

    The maximum number of documents to collect for each shard. If a query reaches this limit, Elasticsearch terminates the query early. Elasticsearch collects documents before sorting.

    IMPORTANT: Use with caution. Elasticsearch applies this parameter to each shard handling the request. When possible, let Elasticsearch perform early termination automatically. Avoid specifying this parameter for requests that target data streams with backing indices across multiple data tiers.

  • q string

    The query in Lucene query string syntax. This parameter cannot be used with a request body.

application/json

Body

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • _shards object Required
      Hide _shards attributes Show _shards attributes object
      • failed number Required
      • successful number Required
      • total number Required
      • failures array[object]
        Hide failures attributes Show failures attributes object
        • index string
        • node string
        • reason object Required

          Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

          Hide reason attributes Show reason attributes object
          • type string Required

            The type of error

          • reason string | null

            A human-readable explanation of the error, in English.

          • stack_trace string

            The server stack trace. Present only if the error_trace=true parameter was sent with the request.

          • caused_by object

            Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

          • root_cause array[object]

            Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

            Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

          • suppressed array[object]

            Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

            Cause and details about a request failure. This class defines the properties common to all error types. Additional details are also provided, that depend on the error type.

        • shard number Required
        • status string
      • skipped number
GET /my-index-000001/_count
{
  "query" : {
    "term" : { "user.id" : "kimchy" }
  }
}
resp = client.count(
    index="my-index-000001",
    query={
        "term": {
            "user.id": "kimchy"
        }
    },
)
const response = await client.count({
  index: "my-index-000001",
  query: {
    term: {
      "user.id": "kimchy",
    },
  },
});
response = client.count(
  index: "my-index-000001",
  body: {
    "query": {
      "term": {
        "user.id": "kimchy"
      }
    }
  }
)
$resp = $client->count([
    "index" => "my-index-000001",
    "body" => [
        "query" => [
            "term" => [
                "user.id" => "kimchy",
            ],
        ],
    ],
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"query":{"term":{"user.id":"kimchy"}}}' "$ELASTICSEARCH_URL/my-index-000001/_count"
client.count(c -> c
    .index("my-index-000001")
    .query(q -> q
        .term(t -> t
            .field("user.id")
            .value(FieldValue.of("kimchy"))
        )
    )
);
Request example
Run `GET /my-index-000001/_count?q=user:kimchy`. Alternatively, run `GET /my-index-000001/_count` with the same query in the request body. Both requests count the number of documents in `my-index-000001` with a `user.id` of `kimchy`.
{
  "query" : {
    "term" : { "user.id" : "kimchy" }
  }
}
Response examples (200)
A successful response from `GET /my-index-000001/_count?q=user:kimchy`.
{
  "count": 1,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  }
}

















































Create or update a search application Beta

PUT /_application/search_application/{name}

Required authorization

  • Index privileges: manage
  • Cluster privileges: manage_search_application

Path parameters

  • name string Required

    The name of the search application to be created or updated.

Query parameters

  • create boolean

    If true, this request cannot replace or update existing Search Applications.

application/json

Body Required

  • indices array[string] Required

    Indices that are part of the Search Application.

  • analytics_collection_name string
  • template object
    Hide template attribute Show template attribute object
    • script object Required
      Hide script attributes Show script attributes object
      • source string | object

        One of:
      • id string
      • params object

        Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

        Hide params attribute Show params attribute object
        • * object Additional properties
      • lang string

        Any of:

        Values are painless, expression, mustache, or java.

      • options object
        Hide options attribute Show options attribute object
        • * string Additional properties

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_application/search_application/{name}
PUT _application/search_application/my-app
{
  "indices": [ "index1", "index2" ],
  "template": {
    "script": {
      "source": {
        "query": {
          "query_string": {
            "query": "{{query_string}}",
            "default_field": "{{default_field}}"
          }
        }
      },
      "params": {
        "query_string": "*",
        "default_field": "*"
      }
    },
    "dictionary": {
      "properties": {
        "query_string": {
          "type": "string"
        },
        "default_field": {
          "type": "string",
          "enum": [
            "title",
            "description"
          ]
        },
        "additionalProperties": false
      },
      "required": [
        "query_string"
      ]
    }
  }
}
resp = client.search_application.put(
    name="my-app",
    search_application={
        "indices": [
            "index1",
            "index2"
        ],
        "template": {
            "script": {
                "source": {
                    "query": {
                        "query_string": {
                            "query": "{{query_string}}",
                            "default_field": "{{default_field}}"
                        }
                    }
                },
                "params": {
                    "query_string": "*",
                    "default_field": "*"
                }
            },
            "dictionary": {
                "properties": {
                    "query_string": {
                        "type": "string"
                    },
                    "default_field": {
                        "type": "string",
                        "enum": [
                            "title",
                            "description"
                        ]
                    },
                    "additionalProperties": False
                },
                "required": [
                    "query_string"
                ]
            }
        }
    },
)
const response = await client.searchApplication.put({
  name: "my-app",
  search_application: {
    indices: ["index1", "index2"],
    template: {
      script: {
        source: {
          query: {
            query_string: {
              query: "{{query_string}}",
              default_field: "{{default_field}}",
            },
          },
        },
        params: {
          query_string: "*",
          default_field: "*",
        },
      },
      dictionary: {
        properties: {
          query_string: {
            type: "string",
          },
          default_field: {
            type: "string",
            enum: ["title", "description"],
          },
          additionalProperties: false,
        },
        required: ["query_string"],
      },
    },
  },
});
response = client.search_application.put(
  name: "my-app",
  body: {
    "indices": [
      "index1",
      "index2"
    ],
    "template": {
      "script": {
        "source": {
          "query": {
            "query_string": {
              "query": "{{query_string}}",
              "default_field": "{{default_field}}"
            }
          }
        },
        "params": {
          "query_string": "*",
          "default_field": "*"
        }
      },
      "dictionary": {
        "properties": {
          "query_string": {
            "type": "string"
          },
          "default_field": {
            "type": "string",
            "enum": [
              "title",
              "description"
            ]
          },
          "additionalProperties": false
        },
        "required": [
          "query_string"
        ]
      }
    }
  }
)
$resp = $client->searchApplication()->put([
    "name" => "my-app",
    "body" => [
        "indices" => array(
            "index1",
            "index2",
        ),
        "template" => [
            "script" => [
                "source" => [
                    "query" => [
                        "query_string" => [
                            "query" => "{{query_string}}",
                            "default_field" => "{{default_field}}",
                        ],
                    ],
                ],
                "params" => [
                    "query_string" => "*",
                    "default_field" => "*",
                ],
            ],
            "dictionary" => [
                "properties" => [
                    "query_string" => [
                        "type" => "string",
                    ],
                    "default_field" => [
                        "type" => "string",
                        "enum" => array(
                            "title",
                            "description",
                        ),
                    ],
                    "additionalProperties" => false,
                ],
                "required" => array(
                    "query_string",
                ),
            ],
        ],
    ],
]);
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"indices":["index1","index2"],"template":{"script":{"source":{"query":{"query_string":{"query":"{{query_string}}","default_field":"{{default_field}}"}}},"params":{"query_string":"*","default_field":"*"}},"dictionary":{"properties":{"query_string":{"type":"string"},"default_field":{"type":"string","enum":["title","description"]},"additionalProperties":false},"required":["query_string"]}}}' "$ELASTICSEARCH_URL/_application/search_application/my-app"
client.searchApplication().put(p -> p
    .name("my-app")
    .searchApplication(s -> s
        .indices(List.of("index1","index2"))
        .template(t -> t
            .script(sc -> sc
                .source(so -> so
                    .scriptTemplate(scr -> scr
                        .query(q -> q
                            .queryString(qu -> qu
                                .defaultField("{{default_field}}")
                                .query("{{query_string}}")
                            )
                        )
                    )
                )
                .params(Map.of("default_field", JsonData.fromJson("\"*\""),"query_string", JsonData.fromJson("\"*\"")))
            )
        )
    )
);
Request example
Run `PUT _application/search_application/my-app` to create or update a search application called `my-app`. When the dictionary parameter is specified, the search application search API will perform the following parameter validation: it accepts only the `query_string` and `default_field` parameters; it verifies that `query_string` and `default_field` are both strings; it accepts `default_field` only if it takes the values title or description. If the parameters are not valid, the search application search API will return an error.
{
  "indices": [ "index1", "index2" ],
  "template": {
    "script": {
      "source": {
        "query": {
          "query_string": {
            "query": "{{query_string}}",
            "default_field": "{{default_field}}"
          }
        }
      },
      "params": {
        "query_string": "*",
        "default_field": "*"
      }
    },
    "dictionary": {
      "properties": {
        "query_string": {
          "type": "string"
        },
        "default_field": {
          "type": "string",
          "enum": [
            "title",
            "description"
          ]
        },
        "additionalProperties": false
      },
      "required": [
        "query_string"
      ]
    }
  }
}