Configure multiple providers for automatic failover when your primary provider experiences issues.
Basic example
```yaml
on_http_request:
  - type: ai-gateway
    config:
      only_allow_configured_providers: true
      providers:
        - id: openai
          api_keys:
            - value: ${secrets.get('openai', 'key-one')}
        - id: anthropic
          api_keys:
            - value: ${secrets.get('anthropic', 'key')}
```
How it works
When the primary provider fails, the gateway automatically tries the next provider:
1. Request arrives for compatible models
2. Try OpenAI → Timeout
3. Automatically try Anthropic → Success ✓
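The retry behavior above can be sketched as a simple loop. This is an illustrative sketch of the idea, not the gateway's actual implementation; the stub functions and `ProviderError` type are hypothetical.

```python
# Sketch of provider failover: try each provider in order until one succeeds.
# Everything here (ProviderError, the stubs) is illustrative, not gateway code.

class ProviderError(Exception):
    pass

def call_with_failover(providers, request):
    errors = []
    for provider in providers:
        try:
            return provider(request)          # first success wins
        except ProviderError as exc:          # timeout, rate limit, 5xx, ...
            errors.append(exc)                # record the failure and fall through
    raise ProviderError(f"all providers failed: {errors}")

# Simulate the sequence above: OpenAI times out, Anthropic answers.
def openai_stub(req):
    raise ProviderError("openai: timeout")

def anthropic_stub(req):
    return {"provider": "anthropic", "content": "Hello!"}

result = call_with_failover([openai_stub, anthropic_stub], {"prompt": "Hello!"})
# result["provider"] == "anthropic"
```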
Important: The client must specify compatible models for cross-provider failover to work:
```typescript
const res = await openai.chat.completions.create({
  model: "gpt-4o",
  models: ["claude-3-5-sonnet-20241022"], // Fallback models
  messages: [{ role: "user", content: "Hello!" }],
});
```
Three-provider setup
Add multiple providers for maximum reliability:
```yaml
on_http_request:
  - type: ai-gateway
    config:
      only_allow_configured_providers: true
      providers:
        - id: openai
          api_keys:
            - value: ${secrets.get('openai', 'key')}
        - id: anthropic
          api_keys:
            - value: ${secrets.get('anthropic', 'key')}
        - id: google
          api_keys:
            - value: ${secrets.get('google', 'key')}
```
Provider order
Providers are tried in alphabetical order, not in the order they are configured. To control the order, use model_selection.strategy:
```yaml
on_http_request:
  - type: ai-gateway
    config:
      only_allow_configured_providers: true
      providers:
        - id: openai
        - id: anthropic
        - id: google
      model_selection:
        strategy:
          - "ai.models.filter(m, m.provider_id == 'openai')"
          - "ai.models.filter(m, m.provider_id == 'anthropic')"
          - "ai.models.filter(m, m.provider_id == 'google')"
```
Combining multi-key and multi-provider
Maximum resilience with multiple keys per provider and explicit ordering:
```yaml
on_http_request:
  - type: ai-gateway
    config:
      only_allow_configured_providers: true
      providers:
        - id: openai
          api_keys:
            - value: ${secrets.get('openai', 'key-one')}
            - value: ${secrets.get('openai', 'key-two')}
        - id: anthropic
          api_keys:
            - value: ${secrets.get('anthropic', 'key-one')}
            - value: ${secrets.get('anthropic', 'key-two')}
      model_selection:
        strategy:
          # Try OpenAI models first
          - "ai.models.filter(m, m.provider_id == 'openai')"
          # Then fall back to Anthropic models
          - "ai.models.filter(m, m.provider_id == 'anthropic')"
```
Failover cascade:
1. openai/key-one → Rate limited
2. openai/key-two → Success ✓
If both OpenAI keys fail:
3. anthropic/key-one → Success ✓
Note: The model_selection.strategy ensures providers are tried in the specified order, not alphabetically. The only_allow_configured_providers option restricts requests to only the configured providers.
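The cascade amounts to exhausting all keys of one provider before moving to the next. A minimal sketch of that attempt order, using the key names from the config above:

```python
# Sketch of key-then-provider failover order. Provider and key names mirror
# the config above; the flattening logic is illustrative, not gateway internals.

providers = [
    ("openai", ["key-one", "key-two"]),
    ("anthropic", ["key-one", "key-two"]),
]

def attempt_order(providers):
    """All keys of a provider are tried before moving to the next provider."""
    return [f"{provider}/{key}" for provider, keys in providers for key in keys]

print(attempt_order(providers))
# ['openai/key-one', 'openai/key-two', 'anthropic/key-one', 'anthropic/key-two']
```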
Use model selection strategies to prefer providers based on metrics:
```yaml
on_http_request:
  - type: ai-gateway
    config:
      only_allow_configured_providers: true
      providers:
        - id: openai
        - id: anthropic
        - id: google
      model_selection:
        strategy:
          # Prefer low-latency models
          - "ai.models.filter(m, m.metrics.global.latency.upstream_ms_p95 < 2000)"
          # Prefer reliable models
          - "ai.models.filter(m, m.metrics.global.error_rate.total < 0.02)"
          # Fall back to any model
          - "ai.models"
```
Regional failover
For providers that offer regional availability, you can use the custom providers feature to add specific regions:
```yaml
on_http_request:
  - type: ai-gateway
    config:
      only_allow_configured_providers: true
      providers:
        - id: openai
          api_keys:
            - value: ${secrets.get('openai', 'us-east')}
          metadata:
            region: "us-east"
        - id: openai-eu
          base_url: "https://eu.api.openai.com"
          id_aliases: ["openai"]
          api_keys:
            - value: ${secrets.get('openai', 'eu-west')}
          metadata:
            region: "eu-west"
      model_selection:
        strategy:
          - "ai.models.filter(m, m.metadata.region == 'us-east')"
          - "ai.models.filter(m, m.metadata.region == 'eu-west')"
```
Cost optimization
Prefer cheaper providers with fallback to premium:
```yaml
on_http_request:
  - type: ai-gateway
    config:
      only_allow_configured_providers: true
      providers:
        - id: deepseek # Cost-effective primary
        - id: openai # Premium fallback
        - id: anthropic # Alternative premium
      model_selection:
        strategy:
          # Prefer mini/turbo models
          - "ai.models.filter(m, m.id.contains('mini') || m.id.contains('turbo'))"
          # Fall back to any model
          - "ai.models"
```
Real-world production example
Enterprise setup with multiple providers:
```yaml
on_http_request:
  - type: ai-gateway
    config:
      only_allow_configured_providers: true
      per_request_timeout: "45s"
      total_timeout: "3m"
      on_error: "halt"
      providers:
        # Primary: OpenAI with 3 keys
        - id: openai
          api_keys:
            - value: ${secrets.get('openai', 'prod-1')}
            - value: ${secrets.get('openai', 'prod-2')}
            - value: ${secrets.get('openai', 'prod-3')}
          metadata:
            tier: "primary"
        # Secondary: Anthropic with 2 keys
        - id: anthropic
          api_keys:
            - value: ${secrets.get('anthropic', 'prod-1')}
            - value: ${secrets.get('anthropic', 'prod-2')}
          metadata:
            tier: "secondary"
        # Tertiary: Google with 1 key
        - id: google
          api_keys:
            - value: ${secrets.get('google', 'prod')}
          metadata:
            tier: "tertiary"
      model_selection:
        strategy:
          - "ai.models.filter(m, m.metrics.global.error_rate.total < 0.05)"
          - "ai.models.filter(m, m.metrics.global.latency.upstream_ms_p95 < 3000)"
          - "ai.models"
```
The patterns in this example provide:
- 6 total provider API keys across 3 providers
- Automatic failover at both the key and provider levels
- Performance-based model selection
- Up to 3 minutes of retry attempts
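The timeout arithmetic behind that last point: with a 45-second per-request timeout and a 3-minute total budget, at most four fully timed-out attempts fit (worst case, ignoring per-attempt overhead):

```python
# How many worst-case (fully timed-out) attempts fit in the total budget,
# using the per_request_timeout and total_timeout values from the config above.
per_request_timeout_s = 45
total_timeout_s = 3 * 60

max_full_attempts = total_timeout_s // per_request_timeout_s
print(max_full_attempts)  # 4
```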
Client configuration
For cross-provider failover, clients must specify multiple models:
```typescript
// TypeScript
const res = await openai.chat.completions.create({
  model: "gpt-4o",
  models: [
    "claude-3-5-sonnet-20241022",
    "gemini-2.5-pro",
  ],
  messages: [...],
});
```
```python
# Python (openai-python v1; the legacy openai.ChatCompletion API is removed,
# and the non-standard `models` field is passed through extra_body)
response = client.chat.completions.create(
    model="gpt-4o",
    extra_body={"models": ["claude-3-5-sonnet-20241022", "gemini-2.5-pro"]},
    messages=[...],
)
```
Best practices
- Configure at least 2 providers for reliability
- Order providers by preference (fastest/cheapest first)
- Use multiple keys per provider for key-level failover
- Monitor provider metrics to optimize order
- Test failover regularly to ensure it works
- Set appropriate timeouts to fail fast
See also