Create an Amazon SageMaker inference endpoint
Generally available
Path parameters
- task_type: The type of inference task that the model will perform. Values are text_embedding, completion, chat_completion, sparse_embedding, or rerank.
- amazonsagemaker_inference_id: The unique identifier of the inference endpoint.
Query parameters
- Specifies the amount of time to wait for the inference endpoint to be created. Values are -1 or 0.
PUT /_inference/{task_type}/{amazonsagemaker_inference_id}
curl \
--request PUT 'http://api.example.com/_inference/{task_type}/{amazonsagemaker_inference_id}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '{"chunking_settings":{"max_chunk_size":250,"overlap":100,"sentence_overlap":1,"separator_group":"string","separators":["string"],"strategy":"sentence"},"service":"amazon_sagemaker","service_settings":{"access_key":"string","endpoint_name":"string","api":"openai","region":"string","secret_key":"string","target_model":"string","target_container_hostname":"string","inference_component_name":"string","batch_size":256,"dimensions":42.0},"task_settings":{"custom_attributes":"string","enable_explanations":"string","inference_id":"string","session_id":"string","target_variant":"string"}}'
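The request above uses placeholders for both path parameters and lists every body field at once. As a rough sketch of how the pieces fit together, the script below fills in the path for a text_embedding endpoint and sends only a subset of the fields shown above. The inference ID my-sagemaker-embeddings and the helper name create_endpoint are hypothetical, the "string" values are placeholders carried over from the example, and whether this subset is sufficient for a given deployment is an assumption to check against the service's requirements.

```shell
#!/bin/sh
# Sketch only: the identifiers below are illustrative, not values from the API reference.
ES_URL="http://api.example.com"            # host taken from the example request
TASK_TYPE="text_embedding"                 # one of the documented task types
INFERENCE_ID="my-sagemaker-embeddings"     # hypothetical endpoint identifier

# Reject task types that the path parameter does not list.
case "$TASK_TYPE" in
  text_embedding|completion|chat_completion|sparse_embedding|rerank) ;;
  *) echo "unsupported task type: $TASK_TYPE" >&2; exit 1 ;;
esac

# Substitute both path parameters into the endpoint path.
REQUEST_URL="${ES_URL}/_inference/${TASK_TYPE}/${INFERENCE_ID}"

# Subset of the body fields from the example; "string" values are placeholders.
REQUEST_BODY=$(cat <<'EOF'
{
  "service": "amazon_sagemaker",
  "service_settings": {
    "access_key": "string",
    "secret_key": "string",
    "endpoint_name": "string",
    "region": "string",
    "api": "openai"
  }
}
EOF
)

# Hypothetical helper: sends the PUT request with the same headers as the example.
create_endpoint() {
  curl --request PUT "$REQUEST_URL" \
    --header "Authorization: $API_KEY" \
    --header "Content-Type: application/json" \
    --data "$REQUEST_BODY"
}

# Nothing is sent until the helper is called explicitly:
# create_endpoint
```

Keeping the URL and body in variables makes it easy to print and inspect them before calling create_endpoint against a real cluster.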