Perform chat completion inference | Elasticsearch Serverless API documentation

Get tokens from text analysis Generally available

POST /{index}/_analyze

Api key auth

All methods and paths for this operation:

GET /_analyze

POST /_analyze

GET /{index}/_analyze

POST /{index}/_analyze
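
The four method and path combinations above differ only in whether an index is named. A minimal sketch of choosing the path, assuming a hypothetical helper `analyze_path` that is not part of any Elastic client:

```python
from typing import Optional
from urllib.parse import quote


def analyze_path(index: Optional[str] = None) -> str:
    """Return the request path for the analyze API.

    Without an index the cluster-level path is used; otherwise the
    index name is percent-encoded and placed in front of `_analyze`.
    """
    if index is None:
        return "/_analyze"
    return f"/{quote(index, safe='')}/_analyze"


print(analyze_path())                  # /_analyze
print(analyze_path("analyze_sample"))  # /analyze_sample/_analyze
```

Either path can be requested with GET or POST, as listed above.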

The analyze API performs analysis on a text string and returns the resulting tokens.

Generating an excessive number of tokens may cause a node to run out of memory. The index.analyze.max_token_count setting lets you limit the number of tokens that can be produced. If more tokens than this limit are generated, an error occurs. The _analyze endpoint without a specified index always uses 10000 as its limit.
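
As a rough illustration, the limit can be expressed as an index settings body. The helper name `max_token_count_settings` is hypothetical, and sending the body through the index settings update API is an assumption rather than something this page states:

```python
import json


def max_token_count_settings(limit: int) -> dict:
    """Build a settings body that sets index.analyze.max_token_count."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Nested form of the dotted setting name index.analyze.max_token_count.
    return {"index": {"analyze": {"max_token_count": limit}}}


body = max_token_count_settings(5000)
print(json.dumps(body))  # {"index": {"analyze": {"max_token_count": 5000}}}
```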

Required authorization

Index privileges: index

Path parameters

index string Required

Index used to derive the analyzer. If specified, the analyzer or field parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer.

Query parameters

index string

Index used to derive the analyzer. If specified, the analyzer or field parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer.

application/json

Body

analyzer string

The name of the analyzer that should be applied to the provided text. This could be a built-in analyzer, or an analyzer that’s been configured in the index.
attributes array[string]

Array of token attributes used to filter the output of the explain parameter.
char_filter array

Array of character filters used to preprocess characters before the tokenizer.

explain boolean

If true, the response includes token attributes and additional details.

Default value is false.
field string

Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
filter array

Array of token filters used to apply after the tokenizer.

normalizer string

Normalizer to use to convert text into a single token.
text string | array[string]

Text to analyze. If an array of strings is provided, it is analyzed as a multi-value field.

One of: string, array[string]

Responses

200 application/json
Response attributes (object):
- detail object
  - analyzer object
    - name string Required
    - tokens array[object] Required
      - bytes string Required
      - end_offset number Required
      - keyword boolean
      - position number Required
      - positionLength number Required
      - start_offset number Required
      - termFrequency number Required
      - token string Required
      - type string Required
  - charfilters array[object]
    - filtered_text array[string] Required
    - name string Required
  - custom_analyzer boolean Required
  - tokenfilters array[object]
    - name string Required
    - tokens array[object] Required
      - bytes string Required
      - end_offset number Required
      - keyword boolean
      - position number Required
      - positionLength number Required
      - start_offset number Required
      - termFrequency number Required
      - token string Required
      - type string Required
  - tokenizer object
    - name string Required
    - tokens array[object] Required
      - bytes string Required
      - end_offset number Required
      - keyword boolean
      - position number Required
      - positionLength number Required
      - start_offset number Required
      - termFrequency number Required
      - token string Required
      - type string Required
- tokens array[object]
  - end_offset number Required
  - position number Required
  - positionLength number
  - start_offset number Required
  - token string Required
  - type string Required
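
A short sketch of reading the top-level `tokens` array from a response. The sample payload is hand-written to match the attribute list above, and `token_spans` is a hypothetical helper:

```python
import json

# Hand-written sample shaped like the documented `tokens` array.
sample = """
{"tokens": [
  {"token": "this", "start_offset": 0, "end_offset": 4, "type": "<ALPHANUM>", "position": 0},
  {"token": "is", "start_offset": 5, "end_offset": 7, "type": "<ALPHANUM>", "position": 1}
]}
"""


def token_spans(response: dict) -> list:
    """Return (token, start_offset, end_offset) for each token, if any.

    `tokens` is not marked Required, so a response that only carries
    `detail` yields an empty list.
    """
    return [
        (t["token"], t["start_offset"], t["end_offset"])
        for t in response.get("tokens", [])
    ]


spans = token_spans(json.loads(sample))
print(spans)  # [('this', 0, 4), ('is', 5, 7)]
```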

POST /{index}/_analyze

GET /_analyze
{
  "analyzer": "standard",
  "text": "this is a test"
}

resp = client.indices.analyze(
    analyzer="standard",
    text="this is a test",
)

const response = await client.indices.analyze({
  analyzer: "standard",
  text: "this is a test",
});

response = client.indices.analyze(
  body: {
    "analyzer": "standard",
    "text": "this is a test"
  }
)

$resp = $client->indices()->analyze([
    "body" => [
        "analyzer" => "standard",
        "text" => "this is a test",
    ],
]);

curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"analyzer":"standard","text":"this is a test"}' "$ELASTICSEARCH_URL/_analyze"

client.indices().analyze(a -> a
    .analyzer("standard")
    .text("this is a test")
);

Request examples

You can apply any of the built-in analyzers to the text string without specifying an index.

{
  "analyzer": "standard",
  "text": "this is a test"
}

If the text parameter is provided as array of strings, it is analyzed as a multi-value field.

{
  "analyzer": "standard",
  "text": [
    "this is a test",
    "the second text"
  ]
}

You can test a custom transient analyzer built from tokenizers, token filters, and char filters. Token filters use the filter parameter.

{
  "tokenizer": "keyword",
  "filter": [
    "lowercase"
  ],
  "char_filter": [
    "html_strip"
  ],
  "text": "this is a <b>test</b>"
}

Custom tokenizers, token filters, and character filters can be specified in the request body.

{
  "tokenizer": "whitespace",
  "filter": [
    "lowercase",
    {
      "type": "stop",
      "stopwords": [
        "a",
        "is",
        "this"
      ]
    }
  ],
  "text": "this is a test"
}
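
To make the order of the stages concrete, here is a local approximation of the request above: the tokenizer runs first, then each token filter in the order listed. This is only a sketch in plain Python, not how Elasticsearch implements these components, and it ignores offsets and positions:

```python
from typing import Iterable, List


def whitespace_tokenize(text: str) -> List[str]:
    # Approximates the whitespace tokenizer: split on runs of whitespace.
    return text.split()


def lowercase(tokens: Iterable[str]) -> List[str]:
    # Approximates the lowercase token filter.
    return [t.lower() for t in tokens]


def stop(tokens: Iterable[str], stopwords: Iterable[str]) -> List[str]:
    # Approximates a stop token filter with an explicit stopword list.
    blocked = set(stopwords)
    return [t for t in tokens if t not in blocked]


def analyze_locally(text: str, stopwords: Iterable[str]) -> List[str]:
    tokens = whitespace_tokenize(text)  # "tokenizer": "whitespace"
    tokens = lowercase(tokens)          # first entry in "filter"
    return stop(tokens, stopwords)      # second entry in "filter"


print(analyze_locally("this is a test", ["a", "is", "this"]))  # ['test']
```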

Run `GET /analyze_sample/_analyze` to run an analysis on the text using the default index analyzer associated with the `analyze_sample` index. Alternatively, the analyzer can be derived based on a field mapping.

{
  "field": "obj1.field1",
  "text": "this is a test"
}

Run `GET /analyze_sample/_analyze` and supply a normalizer for a keyword field if there is a normalizer associated with the specified index.

{
  "normalizer": "my_normalizer",
  "text": "BaR"
}

If you want to get more advanced details, set `explain` to `true`. It will output all token attributes for each token. You can filter token attributes you want to output by setting the `attributes` option. NOTE: The format of the additional detail information is labelled as experimental in Lucene and it may change in the future.

{
  "tokenizer": "standard",
  "filter": [
    "snowball"
  ],
  "text": "detailed output",
  "explain": true,
  "attributes": [
    "keyword"
  ]
}

Response examples (200)

A successful response for an analysis with `explain` set to `true`.

{
  "detail": {
    "custom_analyzer": true,
    "charfilters": [],
    "tokenizer": {
      "name": "standard",
      "tokens": [
        {
          "token": "detailed",
          "start_offset": 0,
          "end_offset": 8,
          "type": "<ALPHANUM>",
          "position": 0
        },
        {
          "token": "output",
          "start_offset": 9,
          "end_offset": 15,
          "type": "<ALPHANUM>",
          "position": 1
        }
      ]
    },
    "tokenfilters": [
      {
        "name": "snowball",
        "tokens": [
          {
            "token": "detail",
            "start_offset": 0,
            "end_offset": 8,
            "type": "<ALPHANUM>",
            "position": 0,
            "keyword": false
          },
          {
            "token": "output",
            "start_offset": 9,
            "end_offset": 15,
            "type": "<ALPHANUM>",
            "position": 1,
            "keyword": false
          }
        ]
      }
    ]
  }
}
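
One way to use the `explain` output is to compare what the tokenizer produced with what the last token filter produced. The sketch below uses a trimmed copy of the response above. `rewritten_tokens` is a hypothetical helper, and pairing tokens by `position` assumes at most one token per position:

```python
import json

# Trimmed copy of the `detail` object from the response example.
explain_detail = json.loads("""
{
  "tokenizer": {"name": "standard", "tokens": [
    {"token": "detailed", "position": 0},
    {"token": "output", "position": 1}
  ]},
  "tokenfilters": [{"name": "snowball", "tokens": [
    {"token": "detail", "position": 0, "keyword": false},
    {"token": "output", "position": 1, "keyword": false}
  ]}]
}
""")


def rewritten_tokens(detail: dict) -> list:
    """Pair tokenizer output with the last token filter's output by position."""
    before = {t["position"]: t["token"] for t in detail["tokenizer"]["tokens"]}
    filters = detail.get("tokenfilters", [])
    if not filters:
        return []
    after = {t["position"]: t["token"] for t in filters[-1]["tokens"]}
    return [
        (before[p], after[p])
        for p in sorted(before)
        if p in after and before[p] != after[p]
    ]


print(rewritten_tokens(explain_detail))  # [('detailed', 'detail')]
```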

Perform chat completion inference Generally available

POST /_inference/chat_completion/{inference_id}/_stream

Api key auth

The chat completion inference API enables real-time responses for chat completion tasks by delivering answers incrementally, reducing response times during computation. It only works with the chat_completion task type for openai and elastic inference services.

NOTE: The chat_completion task type is only available within the _stream API and only supports streaming. The Chat completion inference API and the Stream inference API differ in their response structure and capabilities. The Chat completion inference API provides more comprehensive customization options through more fields and function calling support. If you use the openai, hugging_face or the elastic service, use the Chat completion inference API.

Path parameters

inference_id string Required

The inference ID.

Query parameters

timeout string

Specifies the amount of time to wait for the inference request to complete.

Accepts a duration string. The special values -1 and 0 are also accepted.

application/json

Body Required

messages array[object] Required

A list of objects representing the conversation. Requests should generally only add new messages from the user (role user). The other message roles (assistant, system, or tool) should generally only be copied from the response to a previous completion request, such that the messages array is built up throughout a conversation.

Each object represents one part of the conversation.
- content string | array[object]
  
  One of: string, array[object]
- role string Required
  
  The role of the message author. Valid values are user, assistant, system, and tool.
- tool_call_id string
- tool_calls array[object]
  Only for assistant role messages. The tool calls generated by the model. If it's specified, the content field is optional. Example:
  
  { "tool_calls": [ { "id": "call_KcAjWtAww20AihPHphUh46Gd", "type": "function", "function": { "name": "get_current_weather", "arguments": "{\"location\":\"Boston, MA\"}" } } ] }
  Each item is a tool call generated by the model, with the following attributes:
  
  id string Required
  
  function object Required
  
  The function that the model called.
  
  arguments string Required
  
  The arguments to call the function with in JSON format.
  
  name string Required
  
  The name of the function to call.
  
  type string Required
  
  The type of the tool call.
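
Because `arguments` is a JSON document carried inside a string, it has to be decoded separately from the surrounding message. A minimal sketch using the tool call from the example above; `decode_tool_call` and `tool_result_message` are hypothetical helper names:

```python
import json

tool_call = {
    "id": "call_KcAjWtAww20AihPHphUh46Gd",
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\":\"Boston, MA\"}",
    },
}


def decode_tool_call(call: dict):
    """Return the function name and its arguments decoded from JSON."""
    function = call["function"]
    return function["name"], json.loads(function["arguments"])


def tool_result_message(call: dict, content: str) -> dict:
    """Build the tool role message that answers a given tool call."""
    return {"role": "tool", "content": content, "tool_call_id": call["id"]}


name, args = decode_tool_call(tool_call)
print(name, args)  # get_current_weather {'location': 'Boston, MA'}
print(tool_result_message(tool_call, "The weather is cold"))
```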
model string

The ID of the model to use.
max_completion_tokens number

The upper bound limit for the number of tokens that can be generated for a completion request.
stop array[string]

A sequence of strings to control when the model should stop generating additional tokens.
temperature number

The sampling temperature to use.
tool_choice string | object
Controls which tool is called by the model. One of: string, CompletionToolChoice object.
Attributes of the CompletionToolChoice object:

type string Required

The type of the tool.

function object Required

The tool choice function.

name string Required

The name of the function to call.
tools array[object]
A list of tools that the model can call. Example:
```
{
  "tools": [
      {
          "type": "function",
          "function": {
              "name": "get_price_of_item",
              "description": "Get the current price of an item",
              "parameters": {
                  "type": "object",
                  "properties": {
                      "item": {
                          "id": "12345"
                      },
                      "unit": {
                          "type": "currency"
                      }
                  }
              }
          }
      }
  ]
}
```
- type string Required
  
  The type of tool.
- function object Required
  
  The completion tool function definition.
  description string
  
  A description of what the function does. This is used by the model to choose when and how to call the function.
  
  name string Required
  
  The name of the function.
  
  parameters object
  
  The parameters the function accepts. This should be formatted as a JSON object.
  
  strict boolean
  
  Whether to enable schema adherence when generating the function call.
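
The `tools` and `tool_choice` shapes described above can be assembled with two small helpers. This is a sketch only: `function_tool` and `choose_function` are hypothetical names, and optional attributes are left out of the body when not supplied:

```python
from typing import Any, Dict, Optional


def function_tool(
    name: str,
    description: Optional[str] = None,
    parameters: Optional[Dict[str, Any]] = None,
    strict: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build one entry for the `tools` array."""
    function: Dict[str, Any] = {"name": name}  # the only Required attribute
    if description is not None:
        function["description"] = description
    if parameters is not None:
        function["parameters"] = parameters
    if strict is not None:
        function["strict"] = strict
    return {"type": "function", "function": function}


def choose_function(name: str) -> Dict[str, Any]:
    """Build the object form of `tool_choice` for a named function."""
    return {"type": "function", "function": {"name": name}}


tool = function_tool("get_current_price", "Get the current price of an item")
body = {"tools": [tool], "tool_choice": choose_function("get_current_price")}
print(body["tool_choice"])
```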
top_p number

Nucleus sampling, an alternative to sampling with temperature.

Responses

200 application/json

POST /_inference/chat_completion/{inference_id}/_stream

POST _inference/chat_completion/openai-completion/_stream
{
  "model": "gpt-4o",
  "messages": [
      {
          "role": "user",
          "content": "What is Elastic?"
      }
  ]
}

resp = client.inference.chat_completion_unified(
    inference_id="openai-completion",
    chat_completion_request={
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": "What is Elastic?"
            }
        ]
    },
)

const response = await client.inference.chatCompletionUnified({
  inference_id: "openai-completion",
  chat_completion_request: {
    model: "gpt-4o",
    messages: [
      {
        role: "user",
        content: "What is Elastic?",
      },
    ],
  },
});

response = client.inference.chat_completion_unified(
  inference_id: "openai-completion",
  body: {
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "What is Elastic?"
      }
    ]
  }
)

$resp = $client->inference()->chatCompletionUnified([
    "inference_id" => "openai-completion",
    "body" => [
        "model" => "gpt-4o",
        "messages" => array(
            [
                "role" => "user",
                "content" => "What is Elastic?",
            ],
        ),
    ],
]);

curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"model":"gpt-4o","messages":[{"role":"user","content":"What is Elastic?"}]}' "$ELASTICSEARCH_URL/_inference/chat_completion/openai-completion/_stream"

client.inference().chatCompletionUnified(c -> c
    .inferenceId("openai-completion")
    .chatCompletionRequest(ch -> ch
        .messages(m -> m
            .content(co -> co
                .string("What is Elastic?")
            )
            .role("user")
        )
        .model("gpt-4o")
    )
);

Request examples

Run `POST _inference/chat_completion/openai-completion/_stream` to perform a chat completion on the example question with streaming.

{
  "model": "gpt-4o",
  "messages": [
      {
          "role": "user",
          "content": "What is Elastic?"
      }
  ]
}

Run `POST _inference/chat_completion/openai-completion/_stream` to perform a chat completion using an Assistant message with `tool_calls`.

{
  "messages": [
      {
          "role": "assistant",
          "content": "Let's find out what the weather is",
          "tool_calls": [ 
              {
                  "id": "call_KcAjWtAww20AihPHphUh46Gd",
                  "type": "function",
                  "function": {
                      "name": "get_current_weather",
                      "arguments": "{\"location\":\"Boston, MA\"}"
                  }
              }
          ]
      },
      { 
          "role": "tool",
          "content": "The weather is cold",
          "tool_call_id": "call_KcAjWtAww20AihPHphUh46Gd"
      }
  ]
}
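
The `messages` array is meant to grow over the course of a conversation: new user input is appended, and assistant, system, or tool messages are copied in unchanged. A minimal sketch of that bookkeeping, assuming a hypothetical `Conversation` class; the opening user question is invented for illustration:

```python
import copy
from typing import Any, Dict, List, Optional

Message = Dict[str, Any]


class Conversation:
    """Accumulates the messages array across completion requests."""

    def __init__(self) -> None:
        self.messages: List[Message] = []

    def add_user(self, content: str) -> None:
        # New input from the caller enters with the user role.
        self.messages.append({"role": "user", "content": content})

    def add_copied(self, message: Message) -> None:
        # assistant, system, and tool messages are copied verbatim so that
        # fields such as tool_calls and tool_call_id survive unchanged.
        if message.get("role") not in {"assistant", "system", "tool"}:
            raise ValueError("expected an assistant, system, or tool message")
        self.messages.append(copy.deepcopy(message))

    def request_body(self, model: Optional[str] = None) -> Dict[str, Any]:
        body: Dict[str, Any] = {"messages": copy.deepcopy(self.messages)}
        if model is not None:
            body["model"] = model
        return body


conv = Conversation()
conv.add_user("What is the weather in Boston?")  # hypothetical opening question
conv.add_copied({
    "role": "assistant",
    "content": "Let's find out what the weather is",
    "tool_calls": [{
        "id": "call_KcAjWtAww20AihPHphUh46Gd",
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "arguments": "{\"location\":\"Boston, MA\"}",
        },
    }],
})
conv.add_copied({
    "role": "tool",
    "content": "The weather is cold",
    "tool_call_id": "call_KcAjWtAww20AihPHphUh46Gd",
})
body = conv.request_body(model="gpt-4o")
print([m["role"] for m in body["messages"]])  # ['user', 'assistant', 'tool']
```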

Run `POST _inference/chat_completion/openai-completion/_stream` to perform a chat completion using a User message with `tools` and `tool_choice`.

{
  "messages": [
      {
          "role": "user",
          "content": [
              {
                  "type": "text",
                  "text": "What's the price of a scarf?"
              }
          ]
      }
  ],
  "tools": [
      {
          "type": "function",
          "function": {
              "name": "get_current_price",
              "description": "Get the current price of an item",
              "parameters": {
                  "type": "object",
                  "properties": {
                      "item": {
                          "id": "123"
                      }
                  }
              }
          }
      }
  ],
  "tool_choice": {
      "type": "function",
      "function": {
          "name": "get_current_price"
      }
  }
}

Response examples (200)

A successful response when performing a chat completion task using a User message with `tools` and `tool_choice`.

event: message
data: {"chat_completion":{"id":"chatcmpl-Ae0TWsy2VPnSfBbv5UztnSdYUMFP3","choices":[{"delta":{"content":"","role":"assistant"},"index":0}],"model":"gpt-4o-2024-08-06","object":"chat.completion.chunk"}}

event: message
data: {"chat_completion":{"id":"chatcmpl-Ae0TWsy2VPnSfBbv5UztnSdYUMFP3","choices":[{"delta":{"content":"Elastic"},"index":0}],"model":"gpt-4o-2024-08-06","object":"chat.completion.chunk"}}

event: message
data: {"chat_completion":{"id":"chatcmpl-Ae0TWsy2VPnSfBbv5UztnSdYUMFP3","choices":[{"delta":{"content":" is"},"index":0}],"model":"gpt-4o-2024-08-06","object":"chat.completion.chunk"}}

(...)

event: message
data: {"chat_completion":{"id":"chatcmpl-Ae0TWsy2VPnSfBbv5UztnSdYUMFP3","choices":[],"model":"gpt-4o-2024-08-06","object":"chat.completion.chunk","usage":{"completion_tokens":28,"prompt_tokens":16,"total_tokens":44}}} 

event: message
data: [DONE]
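
A streamed response like the one above can be reassembled by reading each `data:` line, concatenating the `delta.content` fragments, and stopping at `[DONE]`. The following is a sketch under the assumption that every event arrives as a single `data:` line, as in the example. `collect_stream` is a hypothetical helper and the sample lines are abbreviated:

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Return the concatenated content and the usage object, if one was sent."""
    parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank separators
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)["chat_completion"]
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage


sample = [
    "event: message",
    'data: {"chat_completion":{"choices":[{"delta":{"content":"","role":"assistant"},"index":0}]}}',
    "",
    "event: message",
    'data: {"chat_completion":{"choices":[{"delta":{"content":"Elastic"},"index":0}]}}',
    "",
    "event: message",
    'data: {"chat_completion":{"choices":[{"delta":{"content":" is"},"index":0}]}}',
    "",
    "event: message",
    'data: {"chat_completion":{"choices":[],"usage":{"completion_tokens":28,"prompt_tokens":16,"total_tokens":44}}}',
    "",
    "event: message",
    "data: [DONE]",
]

text, usage = collect_stream(sample)
print(repr(text))  # 'Elastic is'
print(usage)       # {'completion_tokens': 28, 'prompt_tokens': 16, 'total_tokens': 44}
```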
