Configuration and setup
Why am I getting “provider not allowed”?
This error occurs when `only_allow_configured_providers: true` is set and you’re trying to use a provider that isn’t explicitly configured.
Solution: set `only_allow_configured_providers: false` to allow all providers, or add the provider to your configuration. A sketch follows.
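A minimal Traffic Policy sketch, assuming the flag sits at the top level of the `ai-gateway` action’s config (the nesting under `providers` is an assumption; see the Configuration Schema for the exact layout):

```yaml
on_http_request:
  - actions:
      - type: ai-gateway
        config:
          # Keep true and list each allowed provider explicitly,
          # or set to false to allow any provider.
          only_allow_configured_providers: true
          providers:
            openai: {}   # illustrative: marks OpenAI as a configured provider
```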
Why am I getting “model unknown”?
This error can occur for several reasons:
- Model restrictions enabled: `only_allow_configured_models: true` is set and the model isn’t explicitly listed
- Missing provider prefix: for unknown models (not in the catalog), you must include a provider prefix like `openai:new-model`
- Provider not configured: the provider for the unknown model isn’t configured or allowed
Solution: set `only_allow_configured_models: false` to allow all models.
How do I limit which models users can access?
Use model restrictions so that only explicitly configured models are accepted; a sketch follows.
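A hedged sketch (the `providers`/`models` nesting is an assumption; consult the Configuration Schema for the exact layout):

```yaml
on_http_request:
  - actions:
      - type: ai-gateway
        config:
          # Reject any model that isn't listed below.
          only_allow_configured_models: true
          providers:
            openai:
              models:
                - gpt-4o        # illustrative model list
                - gpt-4o-mini
```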
Can I use the gateway with the Vercel AI SDK?
Yes, the gateway is fully compatible: it exposes an OpenAI-compatible API, so point the SDK’s base URL at your gateway endpoint.
Failover
How does failover work?
When a request fails, the gateway tries the next candidate in its list. Failures that trigger failover include:
- Timeouts
- HTTP errors (4xx, 5xx)
- Connection failures
Candidates are tried in this order:
- Next API key for the same model/provider
- Next model candidate (from model selection or the client `models` array)
How long does the gateway try before giving up?
This is controlled by `total_timeout` (default: 5 minutes):
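A hedged sketch (duration syntax and field placement are assumptions; see the Configuration Schema):

```yaml
on_http_request:
  - actions:
      - type: ai-gateway
        config:
          # Total time budget across all failover attempts before
          # the gateway gives up (default is 5 minutes).
          total_timeout: 10m
```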
Can I disable automatic failover?
Not directly, but you can limit the number of attempts by:
- Configuring only one API key per provider
- Having clients specify a single model
- Setting short timeouts, as in the sketch after this list
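For example, inside the `ai-gateway` action’s config (same caveats as above about exact syntax):

```yaml
config:
  # A short overall budget leaves little room for retries,
  # effectively limiting failover attempts.
  total_timeout: 10s
```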
Will the gateway failover to a different provider automatically?
Yes, if the client specifies multiple models that span different providers:
- Using the `models` array: `"models": ["openai:gpt-4o", "anthropic:claude-3-5-sonnet-20241022"]`
- Or using just the model name (like `"model": "gpt-4o"`) when selection strategies return candidates from multiple providers
Keys and authentication
Do I need to configure API keys in the gateway?
No, it’s optional. By default (passthrough mode), the gateway forwards whatever key your SDK sends. Configure keys in the gateway (sketched after this list) for:
- Key rotation and failover
- Hiding keys from clients
- Key-level metrics
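A hedged sketch of configuring provider keys (the `providers`/`api_keys` nesting and field names are assumptions; see the Configuration Schema):

```yaml
on_http_request:
  - actions:
      - type: ai-gateway
        config:
          providers:
            openai:
              api_keys:
                - key: "<OPENAI_KEY_A>"
                - key: "<OPENAI_KEY_B>"   # a second key enables rotation and failover
```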
How do I secure my gateway when using server-side API keys?
When you configure API keys in the gateway, your endpoint becomes a publicly accessible proxy with your credentials attached. Add authorization to protect it; a sketch follows.
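A hedged sketch using ngrok’s `basic-auth` Traffic Policy action in front of the gateway (OAuth or JWT validation actions slot in the same way; the gateway config itself is elided):

```yaml
on_http_request:
  - actions:
      # Reject requests without valid credentials before they
      # ever reach the AI Gateway action.
      - type: basic-auth
        config:
          credentials:
            - "team:a-long-random-password"
      - type: ai-gateway
        config: {}   # your gateway configuration, as in the examples above
```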
Can I use different keys for different teams?
Yes, configure multiple keys and use metadata; a sketch follows.
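A hedged sketch (the `api_keys`/`metadata` nesting is an assumption; `team` is an arbitrary metadata label):

```yaml
config:
  providers:
    openai:
      api_keys:
        - key: "<TEAM_ALPHA_KEY>"
          metadata:
            team: alpha   # surfaces in key-level metrics and events
        - key: "<TEAM_BETA_KEY>"
          metadata:
            team: beta
```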
How do I rotate API keys?
Add the new key, deploy, then remove the old key; a sketch follows.
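Sketched as a config fragment (same caveats about field names):

```yaml
# Step 1: add the new key alongside the old one, then deploy.
api_keys:
  - key: "<OLD_KEY>"   # step 2: remove this entry once the new key is live
  - key: "<NEW_KEY>"
```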
Performance and costs
Does the gateway add latency?
Yes, but the overhead is minimal (~10–15 ms) for parsing, token counting, and routing. Provider response time dominates total latency.
How are tokens counted?
Using tiktoken (OpenAI’s tokenizer) for estimation, with provider-reported counts used when available. Token counts are available in metrics and event destinations.
Does the gateway charge for token usage?
No, you pay providers only for actual token usage. The gateway itself doesn’t charge per token. Check ngrok’s pricing for gateway usage costs.
Can I cache responses to reduce costs?
Currently no. Caching may be added in future versions.
Security and privacy
Is my data stored by ngrok?
No, unless Traffic Inspector is enabled. By default:
- Request/response bodies are processed in memory only
- Data is immediately discarded after processing
- Only metadata (token counts, latencies) is retained
Can I use the gateway for sensitive data?
Yes, with considerations:
- Disable Traffic Inspector in production
- Review your compliance requirements
- Understand that data still goes to AI providers
How do I redact PII automatically?
Use Traffic Policy’s find-and-replace actions before the AI Gateway action (a sketch follows); for streaming responses, use `sse-find-replace` in `on_event_stream_message`. See Modifying Requests and Modifying Responses for details.
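A heavily hedged sketch; the action type and config fields below are illustrative placeholders, so check the linked guides for the real find-and-replace schema:

```yaml
on_http_request:
  - actions:
      # Illustrative action name and fields; see Modifying Requests
      # for the actual schema.
      - type: find-and-replace
        config:
          from: "\\b\\d{3}-\\d{2}-\\d{4}\\b"   # e.g. US SSN-shaped strings
          to: "[REDACTED]"
      - type: ai-gateway
        config: {}
```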
Models and providers
Which providers are supported?
Built-in support for:
- OpenAI
- Anthropic
- DeepSeek
- OpenRouter
- Hyperbolic
- Inception Labs
- Inference Net
Can I use self-hosted models?
Yes, you can configure a custom provider; a sketch follows.
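A hedged sketch (the provider name, `base_url`, and `models` fields are illustrative; the endpoint must be OpenAI-compatible):

```yaml
config:
  providers:
    my-llm:                                             # illustrative provider name
      base_url: "https://llm.internal.example.com/v1"   # field name assumed
      models:
        - llama-3.1-70b   # custom providers need models listed explicitly
```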
How do I know which models are available?
Built-in providers have pre-configured model catalogs. For custom providers, you must specify models manually.
What is ngrok/auto?
`ngrok/auto` is a special model name that tells the gateway to choose the model based on your model selection strategy:
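Clients send `"model": "ngrok/auto"` in the request body, and the gateway picks from your `model_selection` strategies. A hedged config sketch (the strategy shape, and any field on `m` other than `m.known`, are assumptions):

```yaml
config:
  model_selection:
    # Strategies are evaluated in order until one returns candidates.
    - filter: "m.provider == 'openai'"   # CEL over candidate models `m`
```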
Can clients use models not in the catalog?
Yes, if a client includes a provider prefix (like `openai:new-model`), the gateway passes the request through to that provider even if the model isn’t in the catalog. This lets clients use new models immediately.
To restrict this behavior, use a model selection strategy:
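A hedged sketch that keeps only catalog models (strategy field names are assumptions; `m.known` comes from the surrounding docs):

```yaml
config:
  model_selection:
    # Drop pass-through models: m.known is false for models
    # not in the catalog.
    - filter: "m.known"
```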
The `m.known` field is `false` for pass-through models not in the catalog.
Why isn’t my custom model working?
Ensure that:
- Your endpoint is OpenAI-compatible
- The model name is configured correctly
- Authentication is set up
- The endpoint is reachable from ngrok
Monitoring and debugging
How do I view request metrics?
View them in the ngrok dashboard:
- Navigate to your AI Gateway endpoint
- Click the “Metrics” tab
Why are my requests failing silently?
Check:
- Traffic Inspector (if enabled) for error details
- Event destinations for error logs
- Provider-level error metrics
How do I debug which provider was used?
Enable event destinations at the endpoint level to stream detailed request logs. Configure event destinations in your ngrok endpoint configuration (not in the ai-gateway action config).
Can I see individual request/response bodies?
Yes, enable Traffic Inspector in development (not recommended for production with sensitive data).
Advanced usage
Can I implement custom routing logic?
Yes, using CEL expressions in model selection strategies; a sketch follows.
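A hedged sketch (the strategy shape and any `m.*` fields besides `m.known` are assumptions):

```yaml
config:
  model_selection:
    # Prefer known OpenAI models; otherwise accept any catalog model.
    - filter: "m.provider == 'openai' && m.known"
    - filter: "m.known"
```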
How do I prioritize certain models or providers?
Use selection strategies to define a preference order, as sketched below. Strategies control which models are considered, not failover order. For cross-provider failover when requests fail, have clients specify multiple models: `models: ["openai:gpt-4o", "anthropic:claude-3-5-sonnet-20241022"]`.
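A hedged sketch of an ordered preference (same caveats as above about the strategy schema):

```yaml
config:
  model_selection:
    # Evaluated top to bottom; the first strategy that returns
    # models wins.
    - filter: "m.provider == 'anthropic'"   # preferred
    - filter: "m.provider == 'openai'"      # fallback
```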
Can I route based on request content?
Model selection is based on model/provider, but you can use Traffic Policy expressions to route different requests to different gateway configurations; a sketch follows.
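A hedged sketch using Traffic Policy `expressions` (the exact CEL variables available are an assumption; consult the Traffic Policy reference):

```yaml
on_http_request:
  - expressions:
      - "req.url.contains('/premium')"   # illustrative CEL predicate
    actions:
      - type: ai-gateway
        config: {}   # premium configuration
  - actions:
      - type: ai-gateway
        config: {}   # default configuration
```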
Does the gateway support streaming?
Yes, streaming is fully supported. The gateway forwards SSE streams transparently.
Troubleshooting
The gateway is timing out
Increase timeout values; a sketch follows.
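For example, raising `total_timeout` above its 5-minute default (syntax is illustrative):

```yaml
config:
  total_timeout: 15m   # raise the overall budget above the 5-minute default
```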
I’m getting rate limited even with multiple keys
Ensure the provider is actually configured with multiple keys so the gateway can fail over automatically; a sketch follows.
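A hedged sketch (the `api_keys` nesting is an assumption, as in the keys examples above):

```yaml
config:
  providers:
    openai:
      api_keys:
        - key: "<KEY_A>"
        - key: "<KEY_B>"   # the gateway rotates to the next key on rate limits
```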
Failover isn’t working
For key failover (same provider, different keys):
- Configure multiple keys for the provider
- Keys are tried in order when one fails
For model failover (across models or providers):
- The client must specify fallback models using the `models` array
- Or the same model must be available from multiple providers
For strategy-based selection:
- Configure multiple strategies in `model_selection`
- Strategies are tried in order until one returns models
How do I get support?
- Check ngrok documentation
- Contact ngrok support
- Join ngrok community forums
See also
- Quickstart - Get started quickly
- Configuration Schema - All configuration options
- Error Handling - Understanding failures
- Examples - Common use cases