The AI Gateway can route requests to any OpenAI-compatible endpoint beyond the built-in providers. This enables you to use self-hosted models, private deployments, or alternative AI services.

Requirements

Custom providers must:
  1. Expose an OpenAI-compatible API - Same request/response format as OpenAI
  2. Be reachable from ngrok - Either via HTTPS URL or ngrok internal endpoint
  3. Support the endpoints you need - /v1/chat/completions, /v1/embeddings, etc.
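
A quick compatibility check is to send a standard OpenAI-style chat completion request directly to the service. The URL and key below are placeholders for your own service:
curl https://api.my-service.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MY_SERVICE_API_KEY" \
  -d '{"model": "my-model", "messages": [{"role": "user", "content": "Hello"}]}'
If the service returns an OpenAI-style chat completion object, it can sit behind the gateway.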

URL requirements

The base_url configuration has specific requirements based on the URL type:
URL Type                | Scheme               | Example
External HTTPS          | https:// only        | https://api.custom.com/v1
ngrok internal endpoint | http:// or https://  | https://my-service.internal
HTTP URLs are only allowed for ngrok .internal endpoints. All external URLs must use HTTPS.

External HTTPS URLs

For publicly accessible services with valid TLS certificates:
providers:
  - id: "my-provider"
    base_url: "https://api.my-service.com/v1"
    api_keys:
      - value: ${secrets.get('my-provider', 'api-key')}

ngrok internal endpoints

For services running behind ngrok (local services, private networks), use internal endpoints:
providers:
  - id: "my-local-service"
    base_url: "https://my-service.internal"
    # No API key needed if your service doesn't require one
Internal endpoints let you:
  • Route to local services without public exposure
  • Connect to other ngrok endpoints in your account
  • Use HTTP for services without TLS
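
For example, a model server listening locally on port 8000 can be placed on an internal endpoint with the ngrok CLI (the URL is a placeholder):
ngrok http 8000 --url https://my-service.internal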

Basic configuration

on_http_request:
  - type: ai-gateway
    config:
      providers:
        - id: "my-custom-provider"
          base_url: "https://my-ai-service.example.com/v1"
          api_keys:
            - value: ${secrets.get('custom', 'api-key')}
          models:
            - id: "my-model"

Configuration fields

Field      | Required | Description
id         | Yes      | Unique identifier for the provider
base_url   | Yes      | Base URL for the provider's API
api_keys   | No       | API keys for authentication
models     | No       | List of available models
id_aliases | No       | Alternative names for the provider
metadata   | No       | Custom metadata for selection strategies
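
The optional fields compose as in this sketch; the alias and metadata values here are purely illustrative:
providers:
  - id: "my-provider"
    base_url: "https://my-service.internal"
    id_aliases:
      - "legacy-provider"   # requests addressed to this provider id also route here
    metadata:
      region: "us-east"     # custom value available to selection strategies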

Defining models

Specify which models your custom provider offers:
providers:
  - id: "my-provider"
    base_url: "https://my-service.internal"
    models:
      - id: "llama3-70b"
      - id: "mistral-7b"
      - id: "codellama-34b"

Model metadata

Add metadata to models for use in selection strategies:
providers:
  - id: "my-provider"
    base_url: "https://my-service.internal"
    models:
      - id: "llama3-70b"
        metadata:
          gpu: "A100"
          quantization: "none"
          max_context: 8192
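
Selection strategies can then target this metadata. A hypothetical filter, assuming model metadata is exposed on m.metadata in the same expressions used for provider_id under Failover patterns below:
model_selection:
  strategy:
    # Assumed accessor syntax; prefer models served on A100 GPUs
    - "ai.models.filter(m, m.metadata['gpu'] == 'A100')"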

Authentication

With provider API keys

If your service requires authentication:
providers:
  - id: "my-provider"
    base_url: "https://my-service.example.com"
    api_keys:
      - value: ${secrets.get('my-provider', 'api-key')}

Without provider API keys

Some self-hosted services don’t require authentication:
providers:
  - id: "my-provider"
    base_url: "https://my-service.internal"
    # No api_keys needed

Custom headers

For services requiring non-standard authentication:
on_http_request:
  - type: ai-gateway
    config:
      headers:
        X-Custom-Auth: ${secrets.get('my-provider', 'token')}
      providers:
        - id: "my-provider"
          base_url: "https://my-service.example.com"

Timeouts

Self-hosted models can be slower than cloud providers. Adjust timeouts as needed:
on_http_request:
  - type: ai-gateway
    config:
      per_request_timeout: "120s"
      total_timeout: "5m"
      providers:
        - id: "my-provider"
          base_url: "https://my-service.internal"

Restricting access

Allow only your custom provider and block cloud providers:
on_http_request:
  - type: ai-gateway
    config:
      only_allow_configured_providers: true
      only_allow_configured_models: true
      providers:
        - id: "my-provider"
          base_url: "https://my-service.internal"
          models:
            - id: "llama3"
            - id: "mistral"

Failover patterns

Custom provider with cloud fallback

Use your self-hosted model as primary with cloud backup:
on_http_request:
  - type: ai-gateway
    config:
      providers:
        - id: "my-provider"
          base_url: "https://my-service.internal"
          models:
            - id: "llama3"
        
        - id: "openai"
          api_keys:
            - value: ${secrets.get('openai', 'api-key')}
      
      model_selection:
        strategy:
          - "ai.models.filter(m, m.provider_id == 'my-provider')"
          - "ai.models.filter(m, m.provider_id == 'openai')"
The first strategy that returns models wins: if your custom provider has matching models, only those are tried, and OpenAI is used only when no custom provider models match. For cross-provider failover when requests fail, have clients specify multiple models in the request, e.g. models: ["my-provider:llama3", "openai:gpt-4o"].
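
A client-side failover request might look like the following; the gateway URL is a placeholder:
curl https://my-gateway.example.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"models": ["my-provider:llama3", "openai:gpt-4o"], "messages": [{"role": "user", "content": "Hello"}]}'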

Multiple custom providers

Load balance across multiple self-hosted instances:
on_http_request:
  - type: ai-gateway
    config:
      only_allow_configured_providers: true
      providers:
        - id: "inference-1"
          base_url: "https://inference-1.internal"
          models:
            - id: "llama3"
        
        - id: "inference-2"
          base_url: "https://inference-2.internal"
          models:
            - id: "llama3"
      
      model_selection:
        strategy:
          # Randomize across configured providers only
          - "ai.models.random()"

Troubleshooting

Connection refused

  • Verify the base_url is correct and reachable
  • For internal endpoints, ensure the ngrok tunnel is running
  • Check firewall rules allow traffic
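
For external URLs, a command-line reachability check can rule out DNS and TLS problems; the path assumes the standard OpenAI model-listing endpoint:
curl -v https://api.my-service.com/v1/models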

HTTPS required error

External URLs must use HTTPS. For local services, use ngrok internal endpoints:
# Expose local service with internal endpoint
ngrok http 8000 --url https://my-service.internal

Authentication errors

  • Verify API key is correct in secrets
  • Check if the service requires specific headers
  • Some services expect Authorization: Bearer <key>, while others use an Api-Key header
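
Testing both header styles directly against the service (URL and key are placeholders) shows which one it expects:
# OpenAI-style bearer token
curl -H "Authorization: Bearer $MY_KEY" https://api.my-service.com/v1/models
# Api-Key header used by some services
curl -H "Api-Key: $MY_KEY" https://api.my-service.com/v1/models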

Model not found

  • Verify the model ID matches exactly what the service expects
  • Check if the model is loaded/downloaded on the service
  • Some services require specific model name formats

Integration guides

Step-by-step setup instructions for specific platforms: