The Traffic Policy configuration reference for the AI Gateway action.

Supported phases

on_http_request

Type

ai-gateway

Basic structure

on_http_request:
  - type: ai-gateway
    config:
      max_input_tokens: 4096
      max_output_tokens: 8192
      headers: {}
      query_params: {}
      body: {}
      on_error: "halt"
      total_timeout: "5m"
      per_request_timeout: "30s"
      providers: []
      only_allow_configured_providers: false
      only_allow_configured_models: false
      model_selection:
        strategy: []
      api_key_selection:
        strategy: []

Configuration fields

max_input_tokens
integer

Maximum number of tokens allowed in the prompt and context. Requests exceeding this limit will be rejected.

No limit is applied if not specified. Maximum allowed value is 500,000.

max_input_tokens: 4096
max_output_tokens
integer

Maximum number of tokens allowed in the completion response.

No limit is applied if not specified. Maximum allowed value is 500,000.

max_output_tokens: 2048
headers

Additional HTTP headers to include in requests to AI providers.

headers:
  X-Custom-Header: "value"
  X-Request-ID: "${req.id}"
query_params

Additional query parameters to append to provider requests.

query_params:
  api_version: "2023-10-01"
body

Additional JSON fields to merge into the request body.

body:
  temperature: 0.7
  top_p: 0.9
on_error
enum
default:halt

Behavior when all failover attempts are exhausted.

Supported values

halt (default) - Stop processing and return an error to the client
continue - Continue to the next action in the Traffic Policy

on_error: "continue"
total_timeout
string
default:5m

Maximum total time for all failover attempts across all models and keys. Must be specified as a duration string (for example, "2m", "90s").

total_timeout: "2m"
per_request_timeout
string
default:30s

Timeout for a single request to a provider. Must be specified as a duration string (for example, "45s", "1m").

per_request_timeout: "45s"
providers
array

List of AI provider configurations. When empty, all built-in providers are allowed in passthrough mode.

See Provider Configuration below for detailed field definitions.

providers:
  - id: "openai"
    api_keys:
      - value: ${secrets.get('openai', 'key-one')}
only_allow_configured_providers
boolean
default:false

When true, only providers explicitly listed in providers are allowed. Requests to other providers are rejected with an error.

only_allow_configured_providers: true
only_allow_configured_models
boolean
default:false

When true, only models explicitly listed in provider configurations are allowed. Requests for other models are rejected.

only_allow_configured_models: true
model_selection
object

Strategy for selecting model candidates using CEL expressions. The first strategy that returns models wins; subsequent strategies are used only if earlier ones return no models.

See Model Selection Strategies for details and CEL Functions Reference for available functions.

model_selection:
  strategy:
    - "ai.models.filter(m, m.provider_id == 'openai')"
    - "ai.models"
api_key_selection
object

Strategy for selecting API keys using CEL expressions. Enables intelligent key selection based on metrics like quota usage and error rates.

When not specified, keys are tried in the order listed. See CEL Functions Reference for available functions.

api_key_selection:
  strategy:
    - "ai.keys.filter(k, k.quota.remaining_requests > 100)"
    - "ai.keys.filter(k, k.error_rate.rate_limit < 0.1)"
    - "ai.keys"

Provider configuration

Each provider in the providers array supports these fields:
providers[].id
string
Required

Provider identifier. Use built-in names (openai, anthropic, google, deepseek) or custom names for self-hosted providers.

- id: "openai"
providers[].id_aliases
array of strings

Alternative identifiers for this provider. Allows clients to reference the same provider by different names.

- id: "custom-openai"
  id_aliases: ["openai", "gpt"]
providers[].base_url
string

Custom endpoint URL for self-hosted or alternative provider endpoints. Required for custom providers.

- id: "ollama"
  base_url: "https://ollama.internal.company.com"
providers[].display_name
string

Human-readable name for the provider.

providers[].description
string

Description of the provider.

providers[].website
string

Provider’s website URL.
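
These descriptive fields are optional and purely informational. A combined example (the values shown are illustrative):

- id: "openai"
  display_name: "OpenAI"
  description: "Primary OpenAI account for production traffic"
  website: "https://openai.com"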

providers[].disabled
boolean
default:false

Temporarily disable this provider without removing its configuration.

- id: "openai"
  disabled: true
providers[].metadata
object

Custom metadata for tracking and organization. Not sent to providers. Available in selection strategies via m.getMetadata().

- id: "openai"
  metadata:
    team: "ml-platform"
    environment: "production"
providers[].api_keys
array

List of API keys for this provider. Keys are tried in order for automatic failover.

api_keys:
  - value: ${secrets.get('openai', 'primary')}
  - value: ${secrets.get('openai', 'backup')}
providers[].models
array

List of model configurations for this provider. See Model Configuration below.

API key configuration

Each API key in providers[].api_keys supports:
providers[].api_keys[].value
string
Required

The API key value. Use secrets.get() for secure storage.

api_keys:
  - value: ${secrets.get('openai', 'key-one')}

Model configuration

Each model in providers[].models supports:
providers[].models[].id
string
Required

Model identifier as recognized by the provider.

models:
  - id: "gpt-4o"
providers[].models[].id_aliases
array of strings

Alternative identifiers for this model.

models:
  - id: "gpt-4o-2024-11-20"
    id_aliases: ["gpt-4o", "gpt-4-latest"]
providers[].models[].author_id
string

ID of the model author (for third-party models).
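
For example, a self-hosted model served by one provider but authored by another (the author value here is hypothetical):

models:
  - id: "llama3-70b"
    author_id: "meta"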

providers[].models[].display_name
string

Human-readable name for the model.

providers[].models[].description
string

Description of the model.

providers[].models[].disabled
boolean
default:false

Temporarily disable this model.

models:
  - id: "gpt-3.5-turbo"
    disabled: true
providers[].models[].metadata
object

Custom metadata for the model. Available in selection strategies.

models:
  - id: "gpt-4o"
    metadata:
      tier: "premium"
      approved: true
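
Assuming the metadata above, a model selection strategy could reference these values via m.getMetadata() (the key names are illustrative, taken from the example):

model_selection:
  strategy:
    - "ai.models.filter(m, m.getMetadata()['tier'] == 'premium')"
    - "ai.models"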
providers[].models[].input_modalities
array of strings

Input types supported by the model (for example, "text", "image", "audio").

providers[].models[].output_modalities
array of strings

Output types supported by the model.

providers[].models[].max_context_window
integer

Maximum context window size in tokens.

providers[].models[].max_output_tokens
integer

Maximum output tokens the model can generate.

providers[].models[].supported_features
array of strings

Features supported by the model (for example, "tool-calling", "coding").
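
The capability fields above can be combined on a single model entry. A sketch (the values are illustrative, not authoritative for any particular provider or model):

models:
  - id: "gpt-4o"
    input_modalities: ["text", "image"]
    output_modalities: ["text"]
    max_context_window: 128000
    max_output_tokens: 16384
    supported_features: ["tool-calling"]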

Complete example

on_http_request:
  - type: ai-gateway
    config:
      max_input_tokens: 4096
      max_output_tokens: 2048
      total_timeout: "3m"
      per_request_timeout: "30s"
      on_error: "halt"
      only_allow_configured_providers: true
      only_allow_configured_models: true
      
      providers:
        - id: "openai"
          metadata:
            team: "ml"
          api_keys:
            - value: ${secrets.get('openai', 'primary')}
            - value: ${secrets.get('openai', 'backup')}
            - value: ${secrets.get('openai', 'emergency')}
          models:
            - id: "gpt-4o"
              metadata:
                approved: true
            - id: "gpt-4o-mini"
              metadata:
                approved: true
        
        - id: "anthropic"
          api_keys:
            - value: ${secrets.get('anthropic', 'key')}
          models:
            - id: "claude-3-5-sonnet-20241022"
        
        - id: "ollama-internal"
          base_url: "https://ollama.company.internal"
          models:
            - id: "llama3-70b"
      
      model_selection:
        strategy:
          - "ai.models.filter(m, m.provider_id == 'openai')"
          - "ai.models.filter(m, m.provider_id == 'anthropic')"
          - "ai.models.filter(m, m.provider_id == 'ollama-internal')"
      
      api_key_selection:
        strategy:
          # Prefer keys with remaining quota
          - "ai.keys.filter(k, k.quota.remaining_requests > 100)"
          # Fall back to keys with low error rates
          - "ai.keys.filter(k, k.error_rate.rate_limit < 0.1)"
          # Fall back to all keys
          - "ai.keys"