Configure multiple keys per provider so the gateway fails over automatically when a key hits a rate limit or encounters an error. This works with both attached provider keys (recommended) and Traffic Policy keys (custom providers only).
Basic example
How it works
When a request fails with the first key, the gateway automatically tries the next key.
Benefits
- No downtime when hitting rate limits
- Automatic failover without manual intervention
- Load distribution across multiple billing accounts
- Increased capacity by combining quotas
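The behavior described under How it works can be pictured with a short simulation. This is a minimal Python sketch, not the gateway's implementation: `send_with_failover`, `KeyFailure`, `fake_send`, and the key names are all hypothetical, and the real gateway may apply additional rules about which errors trigger failover.

```python
from dataclasses import dataclass


class KeyFailure(Exception):
    """Raised by the hypothetical send function on a rate limit or error."""


@dataclass
class Result:
    key: str
    body: str


def send_with_failover(keys, send):
    """Try each key in order and return the first successful result.

    `send` is a hypothetical callable that takes a key and returns a
    response body, or raises KeyFailure.
    """
    failures = []
    for key in keys:
        try:
            return Result(key=key, body=send(key))
        except KeyFailure as exc:
            # Remember why this key failed, then move on to the next one.
            failures.append((key, str(exc)))
    raise RuntimeError(f"all {len(keys)} keys failed: {failures}")


def fake_send(key):
    # Simulated upstream: the first key is rate limited, the others work.
    if key == "key-primary":
        raise KeyFailure("429 rate limited")
    return f"ok via {key}"


result = send_with_failover(["key-primary", "key-backup", "key-tertiary"], fake_send)
print(result.key)  # prints "key-backup"
```

The same loop handles two keys, three keys, or more; the third key is only reached if the first two both fail.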
Three-key failover
Add more keys for additional resilience.
Multiple providers with multiple keys
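To make the combined ordering concrete, the Python sketch below lists (provider, key) attempts for two providers with two keys each. It rests on two assumptions the text does not spell out: that every key for one provider is tried before the gateway moves to the next provider, and that a custom order can be modeled as a plain list. The `attempt_order` helper, the `provider_order` parameter, and the key names are hypothetical and are not the syntax of `model_selection.strategy`.

```python
def attempt_order(provider_keys, provider_order=None):
    """Return (provider, key) pairs in the order they would be attempted.

    provider_keys: dict mapping a provider name to its ordered list of keys.
    provider_order: optional explicit provider order; alphabetical if omitted.
    """
    if provider_order is None:
        # Default described in the text: providers in alphabetical order.
        provider_order = sorted(provider_keys)
    # Assumption: all keys of one provider are tried before the next provider.
    return [(p, k) for p in provider_order for k in provider_keys[p]]


keys = {
    "openai": ["openai-key-1", "openai-key-2"],
    "anthropic": ["anthropic-key-1", "anthropic-key-2"],
}

default_order = attempt_order(keys)
custom_order = attempt_order(keys, provider_order=["openai", "anthropic"])

print(default_order[0])  # ('anthropic', 'anthropic-key-1')
print(custom_order[0])   # ('openai', 'openai-key-1')
```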
Combine multi-key and multi-provider failover for maximum resilience.
For cross-provider failover, clients must specify multiple models in the request or use a model selection strategy. Providers are tried in alphabetical order by default; use model_selection.strategy to specify a custom order.
Real-world scenario
High-traffic production application.
Best practices
- Use at least two keys per provider, ideally three, for reliability
- Order keys by capacity, highest quota first
- Use different billing accounts for true isolation
- Monitor usage to identify when keys need rotation
- Rotate keys regularly for security
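As a small illustration of ordering keys by capacity, the Python sketch below sorts a set of keys by quota and adds up the combined capacity. The key names and requests-per-minute figures are invented for the example, and the total is only an upper bound that assumes each key draws on a separate quota.

```python
# Hypothetical keys with made-up per-minute quotas.
keys = [
    {"name": "team-b", "requests_per_minute": 500},
    {"name": "team-a", "requests_per_minute": 3000},
    {"name": "team-c", "requests_per_minute": 1000},
]

# Highest quota first, so the largest key absorbs most traffic before failover.
ordered = sorted(keys, key=lambda k: k["requests_per_minute"], reverse=True)
names = [k["name"] for k in ordered]

# Upper bound on capacity if every key has its own independent quota.
combined = sum(k["requests_per_minute"] for k in keys)

print(names)     # ['team-a', 'team-c', 'team-b']
print(combined)  # 4500
```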
See also
- Multi-Provider Failover: Failover across providers
- Attaching Provider Keys: Attach and manage provider keys on your AI Gateway API Key
- Managing Provider API Keys: Key rotation and secrets for custom providers
- Configuring Providers: Full provider setup