Model selection strategies let you customize how the AI Gateway chooses which model to use for requests. Using CEL (Common Expression Language) expressions, you can filter, sort, and prioritize models based on performance metrics, cost, features, and custom metadata.

When to use selection strategies

Selection strategies are useful when:
  • Clients use ngrok/auto or omit the model field
  • You want to prefer certain models over others
  • You need performance-based routing (lowest latency, lowest error rate)
  • You want cost optimization (cheapest models first)
  • You need feature-based filtering (only models with tool calling)

Basic configuration

Define strategies in your Traffic Policy:
on_http_request:
  - type: ai-gateway
    config:
      model_selection:
        strategy:
          - "ai.models.filter(m, m.provider_id == 'openai')"
          - "ai.models"

How strategies execute

Strategies execute in order until one returns at least one model:
model_selection:
  strategy:
    - "ai.models.filter(m, m.provider_id == 'openai')"   # Try first
    - "ai.models.filter(m, m.provider_id == 'anthropic')" # Try if first returns empty
    - "ai.models"                                          # Fallback to all models
If a strategy returns no models, the gateway tries the next strategy. Always include a fallback strategy (like ai.models) to ensure requests don’t fail.
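The first-non-empty semantics above can be sketched in plain Python (an illustrative model only; the gateway evaluates CEL expressions, and the lambda strategies here are hypothetical stand-ins):

```python
# Illustrative sketch: each strategy maps the full model list to a
# (possibly empty) selection. The gateway uses the first strategy
# that returns at least one model.
def select_models(models, strategies):
    for strategy in strategies:
        selected = strategy(models)
        if selected:          # first non-empty result wins
            return selected
    return []                 # every strategy returned empty

models = [
    {"id": "gpt-4o", "provider_id": "openai"},
    {"id": "claude-3-5-sonnet", "provider_id": "anthropic"},
]

strategies = [
    lambda ms: [m for m in ms if m["provider_id"] == "mistral"],    # empty, skipped
    lambda ms: [m for m in ms if m["provider_id"] == "anthropic"],  # matches
    lambda ms: ms,                                                  # fallback: all models
]

print([m["id"] for m in select_models(models, strategies)])
# -> ['claude-3-5-sonnet']
```

Without the final `ai.models`-style fallback, a request for which every strategy returns empty would fail.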

Client model priority

When clients specify models in their request, selection strategies act as filters only:
  • Strategies can filter OUT models but cannot ADD models the client didn’t request
  • The client’s preferred order is preserved (model field first, then models array entries)
  • If strategies filter out all client-specified models, the request fails with an error
This ensures clients get predictable behavior—if they ask for gpt-4o, they won’t unexpectedly get claude-3-5-sonnet even if your strategy prefers Anthropic.
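The filter-only behavior can be sketched as follows (illustrative Python, not the gateway's implementation; model names are examples):

```python
# Illustrative sketch: when the client names models, strategies act as
# an allow-list. Models are kept in the client's preferred order, and
# the request fails if nothing survives the filter.
def apply_client_filter(client_models, strategy_selected_ids):
    allowed = set(strategy_selected_ids)
    kept = [m for m in client_models if m in allowed]
    if not kept:
        raise ValueError("all client-requested models were filtered out")
    return kept  # client order preserved, nothing added

# Client asked for gpt-4o first, then gpt-4o-mini.
client = ["gpt-4o", "gpt-4o-mini"]

# Strategy selected these; its ordering does not override the client's.
strategy = ["gpt-4o-mini", "claude-3-5-sonnet", "gpt-4o"]

print(apply_client_filter(client, strategy))
# -> ['gpt-4o', 'gpt-4o-mini']
```

Note that claude-3-5-sonnet is dropped even though the strategy selected it: the client never asked for it.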

Common patterns

Provider priority

Prefer a specific provider, with fallbacks:
model_selection:
  strategy:
    - "ai.models.filter(m, m.provider_id == 'openai')"
    - "ai.models.filter(m, m.provider_id == 'anthropic')"
    - "ai.models"

Cost optimization

Prefer cheaper models:
model_selection:
  strategy:
    - "ai.models.filter(m, m.id.contains('mini'))"
    - "ai.models.sortBy('price')"
    - "ai.models"

Performance-based

Prefer low-latency, reliable models:
model_selection:
  strategy:
    - "ai.models.filter(m, m.metrics.global.latency.upstream_ms_p95 < 1500 && m.metrics.global.error_rate.total < 0.01)"
    - "ai.models"

Feature-based

Only models with specific capabilities:
model_selection:
  strategy:
    - "ai.models.filter(m, 'tool-calling' in m.supported_features)"
    - "ai.models"

Geographic filtering

Only models in specific regions:
model_selection:
  strategy:
    - "ai.models.inCountryCode('US')"
    - "ai.models"

Metadata-based

Use custom metadata for filtering:
providers:
  - id: "openai"
    models:
      - id: "gpt-4o"
        metadata:
          tier: "premium"
          approved: true
      - id: "gpt-4o-mini"
        metadata:
          tier: "budget"
          approved: true

model_selection:
  strategy:
    - "ai.models.filter(m, m.metadata.tier == 'budget')"
    - "ai.models"

Known models only

Reject unknown pass-through models (only allow models in the catalog):
model_selection:
  strategy:
    - "ai.models.filter(m, m.known)"
This prevents clients from requesting arbitrary model names that get passed through to providers.
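Since this configuration deliberately omits a fallback, selection fails when no known model matches. A sketch of that behavior (illustrative only):

```python
# Illustrative sketch: a single strategy with no fallback. If every
# model is filtered out (here: unknown pass-through models), selection
# fails instead of forwarding an arbitrary model name to a provider.
def known_only(models):
    selected = [m for m in models if m["known"]]
    if not selected:
        raise ValueError("no known models matched; request is rejected")
    return selected

models = [
    {"id": "gpt-4o", "known": True},
    {"id": "totally-made-up-model", "known": False},
]

print([m["id"] for m in known_only(models)])
# -> ['gpt-4o']
```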

Available functions

Filtering

| Function | Description | Example |
| --- | --- | --- |
| filter(predicate) | Filter by condition | ai.models.filter(m, m.provider_id == 'openai') |
| only(ids) | Include only specific models | ai.models.only(['gpt-4o', 'claude-3-5-sonnet']) |
| ignore(ids) | Exclude specific models | ai.models.ignore(['gpt-3.5-turbo']) |
| onlyProviders(ids) | Include only specific providers | ai.models.onlyProviders(['openai', 'anthropic']) |
| ignoreProviders(ids) | Exclude specific providers | ai.models.ignoreProviders(['google']) |
| onlyAuthors(ids) | Include only specific authors | ai.models.onlyAuthors(['openai']) |
| ignoreAuthors(ids) | Exclude specific authors | ai.models.ignoreAuthors(['meta']) |

Geographic

| Function | Description | Example |
| --- | --- | --- |
| inRegion(code) | Models available in region | ai.models.inRegion('us-east-1') |
| inCountryCode(code) | Models available in country | ai.models.inCountryCode('US') |

Cost

| Function | Description | Example |
| --- | --- | --- |
| underCost(type, max) | Models under price threshold | ai.models.underCost('text.input', 1.0) |
| sortBy('price') | Sort by price (cheapest first) | ai.models.sortBy('price') |

Selection

| Function | Description | Example |
| --- | --- | --- |
| random() | Select one random model | ai.models.random() |
| randomize() | Shuffle model order | ai.models.randomize() |
| [index] | Select by index | ai.models.filter(...)[0] |

Lookup

| Function | Description | Example |
| --- | --- | --- |
| get(providerId, modelId) | Get specific model | ai.models.get('openai', 'gpt-4o') |
| getMetadata(key) | Get metadata value | m.getMetadata('tier') |

Available model variables

When using filter(), these variables are available on each model m:
| Variable | Type | Description |
| --- | --- | --- |
| m.id | string | Model identifier |
| m.provider_id | string | Provider identifier |
| m.author_id | string | Model author identifier |
| m.display_name | string | Human-readable name |
| m.known | boolean | Whether this model is in the catalog (false for unknown pass-through models) |
| m.custom | boolean | Whether this is a custom-configured model |
| m.metadata | object | Custom metadata from config |
| m.input_modalities | list | Input types ("text", "image", etc.) |
| m.output_modalities | list | Output types |
| m.supported_features | list | Features ("tool-calling", "coding", etc.) |
| m.max_context_window | number | Maximum context window size |
| m.max_output_tokens | number | Maximum output tokens |

Metrics variables

Access performance metrics via m.metrics:
# Global metrics (all ngrok traffic)
m.metrics.global.request_count
m.metrics.global.latency.upstream_ms_avg
m.metrics.global.latency.upstream_ms_p95
m.metrics.global.error_rate.total
m.metrics.global.error_rate.timeout
m.metrics.global.error_rate.rate_limit

# Account-scoped metrics
m.metrics.account.request_count
m.metrics.account.token.provider_input
m.metrics.account.token.provider_output

# Endpoint-scoped metrics
m.metrics.endpoint.request_count
m.metrics.endpoint.latency.upstream_ms_avg

CEL operators

| Operator | Description | Example |
| --- | --- | --- |
| ==, != | Equality | m.provider_id == 'openai' |
| <, >, <=, >= | Comparison | m.metrics.global.latency.upstream_ms_avg < 1000 |
| && | Logical AND | m.provider_id == 'openai' && m.id.contains('gpt-4') |
| \|\| | Logical OR | m.id.contains('mini') \|\| m.id.contains('turbo') |
| ! | Logical NOT | !m.custom |
| in | List membership | 'image' in m.input_modalities |

String functions

| Function | Description | Example |
| --- | --- | --- |
| contains() | Check substring | m.id.contains('gpt') |
| startsWith() | Check prefix | m.id.startsWith('gpt-4') |
| endsWith() | Check suffix | m.id.endsWith('-turbo') |

Complete example

on_http_request:
  - type: ai-gateway
    config:
      providers:
        - id: "openai"
          metadata:
            cost_per_1k_input: 0.03
        - id: "anthropic"
        - id: "google"
      
      model_selection:
        strategy:
          # Prefer fast, reliable OpenAI models
          - "ai.models.filter(m, m.provider_id == 'openai' && m.metrics.global.error_rate.total < 0.01)"
          # Fall back to any fast model
          - "ai.models.filter(m, m.metrics.global.latency.upstream_ms_p95 < 2000)"
          # Fall back to any model
          - "ai.models"

Next steps