Create a preset from a chat-completions request body

Creates a preset (or a new version of an existing one) from an inference request body. Only fields that overlap with the preset config are persisted; other fields (e.g. messages, stream, prompt) are silently ignored.

Authentication

AuthorizationBearer
API key as bearer token in Authorization header

Path parameters

slugstringRequired>=1 character

URL-safe slug identifying the preset. Created if it does not exist.

Request

This endpoint expects an object.
messageslist of objectsRequired
List of messages for the conversation
cache_controlobjectOptional
Enable automatic prompt caching. When set at the top level, the system automatically applies cache breakpoints to the last cacheable block in the request. Currently supported for Anthropic Claude models.
debugobjectOptional

Debug options for inspecting request transformations (streaming only)

frequency_penaltydouble or nullOptional

Frequency penalty (-2.0 to 2.0)

image_configstring or double or list of anyOptional
logit_biasmap from strings to doubles or nullOptional
Token logit bias adjustments
logprobsboolean or nullOptional
Return log probabilities
max_completion_tokensinteger or nullOptional
Maximum tokens in completion
max_tokensinteger or nullOptional

Maximum tokens (deprecated, use max_completion_tokens). Note: some providers enforce a minimum of 16.

metadatamap from strings to stringsOptional

Key-value pairs for additional object information (max 16 pairs, 64 char keys, 512 char values)

modalitieslist of enumsOptional
Output modalities for the response. Supported values are "text", "image", and "audio".
Allowed values:
modelstringOptional
Model to use for completion
modelslist of stringsOptional
Models to use for completion
parallel_tool_callsboolean or nullOptional
Whether to enable parallel function calling during tool use. When true, the model may generate multiple tool calls in a single response.
pluginslist of objectsOptional
Plugins you want to enable for this request, including their settings.
presence_penaltydouble or nullOptional

Presence penalty (-2.0 to 2.0)

providerobjectOptional
When multiple model providers are available, optionally indicate your routing preference.
reasoningobjectOptional
Configuration options for reasoning models
response_formatobjectOptional
Response format configuration
routeanyOptional
seedinteger or nullOptional
Random seed for deterministic outputs
service_tierenum or nullOptional
The service tier to use for processing this request.
Allowed values:
session_idstringOptional<=256 characters

A unique identifier for grouping related requests (e.g., a conversation or agent workflow) for observability. If provided in both the request body and the x-session-id header, the body value takes precedence. Maximum of 256 characters.

stopstring or list of strings or anyOptional

Stop sequences (up to 4)

stop_server_tools_whenlist of objectsOptional

Stop conditions for the server-tool agent loop. Any condition firing halts the loop (OR logic). When set, this overrides max_tool_calls.

streambooleanOptionalDefaults to false
Enable streaming response
stream_optionsobjectOptional
Streaming configuration options
temperaturedouble or nullOptional

Sampling temperature (0-2)

tool_choiceenum or objectOptional
Tool choice configuration
toolslist of objectsOptional
Available tools for function calling
top_logprobsinteger or nullOptional

Number of top log probabilities to return (0-20)

top_pdouble or nullOptional

Nucleus sampling parameter (0-1)

traceobjectOptional

Metadata for observability and tracing. Known keys (trace_id, trace_name, span_name, generation_name, parent_span_id) have special handling. Additional keys are passed through as custom metadata to configured broadcast destinations.

userstringOptional
Unique user identifier

Response

Preset created or updated successfully.
dataobject
A preset with its currently designated version.

Errors

400
Bad Request Error
401
Unauthorized Error
403
Forbidden Error
404
Not Found Error
409
Conflict Error
500
Internal Server Error