
AI Gateway API Keys

What are AI Gateway API Keys?

AI Gateway API Keys are keys you create to authenticate requests to your AI Gateway. When you use one, ngrok handles the provider API keys for you—you don’t need accounts with OpenAI or Anthropic. See AI Gateway API Keys for details.

How are AI Gateway API Keys different from ngrok API keys?

AI Gateway API Keys authenticate requests to AI Gateway endpoints—they’re used by your application to make AI requests. ngrok API Keys authenticate to the ngrok management API (api.ngrok.com)—they’re used for managing ngrok resources like endpoints, domains, and secrets.

Can I create multiple AI Gateway API Keys?

Yes. Create separate keys for each client or application. This enables independent revocation and per-client usage tracking.

What happens if I lose my API key?

The token is only shown once at creation time. If you lose it, delete the old key and create a new one.

Are AI Gateway API Keys scoped to specific endpoints?

Yes. Each key is scoped to a specific AI Gateway endpoint via the required endpoint_id field. A key can only be used with the endpoint it was created for.

Which providers work with AI Gateway API Keys?

Currently OpenAI and Anthropic are supported with ngrok-managed keys. For other providers (Google, DeepSeek, etc.), use Bring Your Own Keys.

Credits & billing

Which plans support AI Gateway?

AI Gateway requires the Pay-as-you-go plan. Free and Hobbyist plans cannot use it.

What are AI Gateway credits?

AI Gateway credits are a prepaid balance that funds AI Gateway API Key usage. Credits cover both the processing fee and upstream provider costs. BYOK does not require credits—you pay providers directly. See Credits.

What’s the minimum credit purchase?

$5.00. Purchase credits directly from the dashboard.

Do credits expire?

Yes. Credits expire 365 days after purchase.

What happens when credits run out?

AI Gateway API Key requests stop working. Clients receive HTTP 403 with error ERR_NGROK_4026. BYOK requests are not affected. Purchase more credits to restore access.
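If your client needs to distinguish credit exhaustion from other authorization failures, it can check for the status and error code described above. A minimal sketch (the helper name and response shape are illustrative; only the 403 status and ERR_NGROK_4026 code come from the documentation):

```typescript
// Returns true when a gateway response indicates exhausted AI Gateway credits.
// Assumes the error code appears somewhere in the response body text.
function isCreditsExhausted(status: number, body: string): boolean {
  return status === 403 && body.includes("ERR_NGROK_4026");
}

// Example: decide whether to surface a billing alert instead of retrying.
const shouldTopUp = isCreditsExhausted(
  403,
  '{"error":"ERR_NGROK_4026: credits exhausted"}'
);
```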

General

Does ngrok provide AI provider API keys?

Yes. When you use AI Gateway API Keys, ngrok manages the provider keys for you—no provider accounts needed. Currently supported for OpenAI and Anthropic. For other providers, you bring your own keys.

Configuration and setup

Why am I getting “provider not allowed”?

This error occurs when only_allow_configured_providers: true is set and you’re trying to use a provider that isn’t explicitly configured. Solution:
only_allow_configured_providers: true
providers:
  - id: openai      # Add your provider here
    api_keys:
      - value: ${secrets.get('openai', 'key')}
Or set only_allow_configured_providers: false to allow all providers.

Why am I getting “model unknown”?

This error can occur for several reasons:
  1. Model restrictions enabled - only_allow_configured_models: true is set and the model isn’t explicitly listed
  2. Missing provider prefix - For unknown models (not in catalog), you must include a provider prefix like openai:new-model
  3. Provider not configured - The provider for the unknown model isn’t configured or allowed
Solutions: Add the model to your configuration:
only_allow_configured_models: true
providers:
  - id: openai
    models:
      - id: "gpt-4o"      # Add your model here
      - id: "gpt-4o-mini"
Or use a provider prefix for unknown models:
{
  "model": "openai:gpt-5-preview",
  "messages": [{"role": "user", "content": "Hello"}]
}
Or set only_allow_configured_models: false to allow all models.

How do I limit which models users can access?

Use model restrictions:
only_allow_configured_models: true
providers:
  - id: openai
    models:
      - id: "gpt-4o-mini"  # Only allow specific models

Can I use the gateway with the Vercel AI SDK?

Yes:
import { createOpenAI } from "@ai-sdk/openai"

const openai = createOpenAI({
  baseURL: "https://your-ai-gateway.ngrok.app/v1",
  apiKey: "<your AI Gateway API Key>", // or a provider key if you bring your own
})
See SDK Integration for details.

Failover

How does failover work?

When a request fails, the gateway tries the next candidate in its list. This includes failures from:
  • Timeouts
  • HTTP errors (4xx, 5xx)
  • Connection failures
Failover order:
  1. Next API key for the same model/provider
  2. Next model candidate (from model selection or client models array)
The gateway never retries the same model/key combination—it always moves to the next candidate. See Error Handling for details.
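The ordering above can be thought of as building a flat candidate list: every key for the first model, then every key for the next model, with no model/key pair repeated. This is an illustrative model of the behavior, not the gateway's actual implementation:

```typescript
interface Candidate {
  model: string;
  key: string;
}

// Expand models (in priority order) and their keys into the attempt order
// described above: exhaust keys for one model before moving to the next model.
function failoverOrder(models: { model: string; keys: string[] }[]): Candidate[] {
  const out: Candidate[] = [];
  for (const m of models) {
    for (const key of m.keys) {
      out.push({ model: m.model, key }); // each model/key pair appears exactly once
    }
  }
  return out;
}
```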

How long does the gateway try before giving up?

Controlled by total_timeout (default: 6 minutes):
total_timeout: "3m"  # Allow up to 3 minutes for all attempts

Can I disable automatic failover?

Not directly, but you can limit the attempts by:
  • Configuring only one API key per provider
  • Having clients specify a single model
  • Setting short timeouts:
per_request_timeout: "5s"
total_timeout: "10s"

Will the gateway failover to a different provider automatically?

Yes, if the client specifies multiple models that span different providers:
  1. Using the models array: "models": ["openai:gpt-4o", "anthropic:claude-3-5-sonnet-20241022"]
  2. Or using just the model name (like "model": "gpt-4o") when selection strategies return candidates from multiple providers
The gateway automatically tries alternative providers when the primary fails. You can also specify fallback models in the request body:
{
  "model": "gpt-4o",
  "models": ["claude-3-5-sonnet-20241022"],
  "messages": [{"role": "user", "content": "Hello"}]
}
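To sanity-check the shape of such a request, you can construct the body programmatically. A sketch (the field names match the JSON above; the helper itself is illustrative):

```typescript
// Build an OpenAI-style chat request body with cross-provider fallback models.
function withFallbacks(primary: string, fallbacks: string[], content: string) {
  return {
    model: primary,    // tried first
    models: fallbacks, // tried in order if the primary fails
    messages: [{ role: "user", content }],
  };
}

const body = withFallbacks("gpt-4o", ["claude-3-5-sonnet-20241022"], "Hello");
```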

Keys and authentication

What’s the easiest way to authenticate requests?

Use AI Gateway API Keys. Create a key and use it in your SDK; authorization is built in, so no Traffic Policy configuration is needed. See Securing Your Gateway.
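If you're calling the gateway with raw fetch rather than an SDK, the key is typically sent as a Bearer token. A sketch (the header scheme here is an assumption based on OpenAI-compatible clients; check Securing Your Gateway for the exact scheme):

```typescript
// Build headers for an OpenAI-compatible request through the gateway.
// Assumes the AI Gateway API Key is sent as a standard Bearer token.
function gatewayHeaders(apiKey: string): Record<string, string> {
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
}
```

Pass the result as the `headers` option of a `fetch` call against your gateway's `/v1/chat/completions` URL.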

Do I need to configure provider API keys in the gateway?

Not if you’re using AI Gateway API Keys—ngrok handles provider keys for you. If you want to use your own provider keys, see Bring Your Own Keys.

How do I secure my gateway when using my own provider keys (BYOK)?

When you configure your own provider keys in the gateway, your endpoint becomes publicly accessible. You need to add your own authorization layer. See Securing Endpoints (BYOK) for complete examples.

Can I use different keys for different teams?

Yes. Create separate AI Gateway API Keys for each team or client. Each key has its own description, metadata, and last_used tracking. For BYOK, configure multiple provider keys and use separate endpoints or providers for team-based tracking.

How do I rotate API keys?

AI Gateway API Keys: Delete the old key and create a new one. Update your clients with the new key. BYOK provider keys: Add the new key, deploy, then remove the old key. See Managing Provider Keys for details.

Performance and costs

Does the gateway add latency?

Yes, but it's minimal: roughly 10-15ms for parsing, token counting, and routing. Provider response time dominates total latency.

How are tokens counted?

Using tiktoken (OpenAI’s tokenizer) for estimation, with provider-reported counts used when available. Token counts are available in metrics and event destinations.

Does the gateway charge for token usage?

When using AI Gateway API Keys, usage is deducted from your prepaid credits. Credits cover both ngrok’s processing fee and the upstream provider cost (ngrok pays OpenAI/Anthropic on your behalf). When using BYOK, there are no credit charges—you pay providers directly.

Can I cache responses to reduce costs?

Currently no. Caching may be added in future versions.

Security and privacy

Is my data stored by ngrok?

No, unless Traffic Inspector is enabled. By default:
  • Request/response bodies processed in-memory only
  • Data immediately discarded after processing
  • Only metadata (token counts, latencies) retained
See Traffic Inspector for details.

Can I use the gateway for sensitive data?

Yes, with considerations:
  • Disable Traffic Inspector in production
  • Review your compliance requirements
  • Understand that data still goes to AI providers

How do I redact PII automatically?

Use Traffic Policy’s find-and-replace actions before the AI Gateway action:
on_http_request:
  - actions:
      - type: request-body-find-replace
        config:
          replacements:
            - from: "\\b\\d{3}-\\d{2}-\\d{4}\\b"  # SSN pattern
              to: "[REDACTED]"
      - type: ai-gateway
        config:
          # …
For streaming responses, use sse-find-replace in on_event_stream_message. See Modifying Requests and Modifying Responses for details.
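You can sanity-check the pattern locally before deploying it. A sketch that applies the same SSN regex in TypeScript (the Traffic Policy action does this server-side; this is only for testing the expression):

```typescript
// The same SSN pattern used in the Traffic Policy snippet above.
// Note: the YAML form is double-escaped; in a regex literal it is written once.
const ssn = /\b\d{3}-\d{2}-\d{4}\b/g;

function redact(text: string): string {
  return text.replace(ssn, "[REDACTED]");
}

// redact("SSN: 123-45-6789") → "SSN: [REDACTED]"
```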

Models and providers

Which providers are supported?

Built-in support for:
  • OpenAI
  • Anthropic
  • Groq
  • Google
  • DeepSeek
  • OpenRouter
  • Hyperbolic
  • InceptionLabs
  • Inference.net
Plus any self-hosted OpenAI or Anthropic Claude API compatible endpoint. See the Model Catalog for the full list.

Can I use self-hosted models?

Yes, you can configure a custom provider:
providers:
  - id: ollama
    base_url: "https://ollama.internal"
    models:
      - id: "llama3"
See Custom Providers for details.

How do I know which models are available?

Built-in providers have pre-configured model catalogs. For custom providers, you must specify models manually.

What is ngrok/auto?

ngrok/auto is a special model name that tells the gateway to choose the model based on your model selection strategy:
{
  "model": "ngrok/auto",
  "messages": [{"role": "user", "content": "Hello"}]
}
This is equivalent to omitting the model field. Use it when you want the gateway to select the best model based on latency, cost, or other criteria you’ve configured.

Can clients use models not in the catalog?

Yes, if a client includes a provider prefix (like openai:new-model), the gateway passes the request through to that provider even if the model isn’t in the catalog. This lets clients use new models immediately. To restrict this behavior, use a model selection strategy:
model_selection:
  strategy:
    - "ai.models.filter(m, m.known)"  # Only allow catalog models
The m.known field is false for pass-through models not in the catalog.

Why isn’t my custom model working?

Ensure:
  1. Your endpoint is OpenAI or Anthropic Claude API compatible
  2. The model name is configured correctly
  3. Authentication is set up
  4. The endpoint is reachable from ngrok

Monitoring and debugging

How do I view request metrics?

View in the ngrok dashboard:
  1. Navigate to your AI Gateway endpoint
  2. Click “Metrics” tab
Or export to external systems via Log Exporting.

Why are my requests failing silently?

Check:
  1. Traffic Inspector (if enabled) for error details
  2. Event destinations for error logs
  3. Provider-level error metrics

How do I debug which provider was used?

Enable event destinations at the endpoint level to stream detailed request logs. Configure event destinations in your ngrok endpoint configuration (not in the ai-gateway action config).

Can I see individual request/response bodies?

Yes, enable Traffic Inspector in development (not recommended for production with sensitive data).

Advanced usage

Can I implement custom routing logic?

Yes, using CEL expressions in model selection strategies:
model_selection:
  strategy:
    - "ai.models.filter(m, m.metrics.global.latency.upstream_ms_avg < 1000)"
    - "ai.models.random()"
See Model Selection Strategies.

How do I prioritize certain models or providers?

Use selection strategies to define preference order:
model_selection:
  strategy:
    - "ai.models.filter(m, m.provider_id == 'openai')"
    - "ai.models.filter(m, m.provider_id == 'anthropic')"
    - "ai.models"
The first strategy that returns models wins. If OpenAI models exist, only those are tried. Anthropic is only considered if no OpenAI models match.
Strategies control which models are considered, not failover order. For cross-provider failover when requests fail, have clients specify multiple models: models: ["openai:gpt-4o", "anthropic:claude-3-5-sonnet-20241022"].
See Model Selection Strategies.
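The "first strategy that returns models wins" rule can be modeled as evaluating filters in order and stopping at the first non-empty result. An illustrative sketch (the real CEL expressions run inside the gateway; this only mirrors their selection semantics):

```typescript
interface Model {
  provider_id: string;
  id: string;
}

// Apply candidate filters in order; the first filter that yields any models wins.
// The trailing return mirrors a final catch-all "ai.models" strategy.
function selectModels(models: Model[], filters: ((m: Model) => boolean)[]): Model[] {
  for (const f of filters) {
    const picked = models.filter(f);
    if (picked.length > 0) return picked;
  }
  return models;
}
```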

Can I route based on request content?

Model selection is based on model/provider, but you can use Traffic Policy expressions to route different requests to different configurations:
on_http_request:
  - expressions:
      - req.headers['x-priority'][0] == 'high'
    actions:
      - type: ai-gateway
        config:
          model_selection:
            strategy:
              - "ai.models.filter(m, m.provider_id == 'openai')"
  - actions:
      - type: ai-gateway
        config:
          # Default configuration
You cannot currently route based on the message content itself.

Does the gateway support streaming?

Yes, streaming is fully supported. The gateway forwards SSE streams transparently.
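On the client side, a streamed response arrives as standard `data:` lines. A minimal parser sketch (provider-agnostic and purely illustrative; real SDKs handle this for you):

```typescript
// Extract the JSON payloads from a chunk of SSE text.
// Ignores the terminal "[DONE]" sentinel used by OpenAI-style streams.
function parseSSEChunk(chunk: string): string[] {
  return chunk
    .split("\n")
    .filter((line) => line.startsWith("data: "))
    .map((line) => line.slice("data: ".length))
    .filter((payload) => payload !== "[DONE]");
}
```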

Troubleshooting

The gateway is timing out

Increase timeout values:
per_request_timeout: "60s"
total_timeout: "5m"
Or check provider health metrics.

I’m getting rate limited even with multiple keys

Configure multiple API keys for the provider so the gateway can fail over automatically:
providers:
  - id: openai
    api_keys:
      - value: ${secrets.get('openai', 'key-one')}
      - value: ${secrets.get('openai', 'key-two')}
Keys are tried in order when the previous key fails.

Failover isn’t working

For key failover (same provider, different keys):
  • Configure multiple keys for the provider
  • Keys are tried in order when one fails
For provider failover (different providers):
  • Client must specify fallback models using the models array
  • Or the same model must be available from multiple providers
{
  "model": "gpt-4o",
  "models": ["claude-3-5-sonnet-20241022"],
  "messages": [{"role": "user", "content": "Hello"}]
}
For model selection failover:
  • Configure multiple strategies in model_selection
  • Strategies are tried in order until one returns models
