Requirements
Custom providers must:
- Expose a supported API - Same request/response format as OpenAI or Anthropic Claude
- Be reachable from ngrok - Either via HTTPS URL or ngrok internal endpoint
- Support the endpoints you need - `/v1/chat/completions`, `/v1/embeddings`, `/v1/messages`, etc.
URL requirements
The `base_url` configuration has specific requirements based on the URL type:
| URL Type | Scheme | Example |
|---|---|---|
| External HTTPS | `https://` only | `https://api.custom.com/v1` |
| ngrok internal endpoint | `http://` or `https://` | `https://my-service.internal` |
External HTTPS URLs
For publicly accessible services with valid TLS certificates, point `base_url` directly at the service's HTTPS URL.

ngrok Internal Endpoints
For services running behind ngrok (local services, private networks), use internal endpoints to:
- Route to local services without public exposure
- Connect to other ngrok endpoints in your account
- Use HTTP for services without TLS
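A sketch of a provider entry pointing at an internal endpoint. The `providers:` list wrapper and the `local-llm` id are illustrative assumptions; the `base_url` reuses the example URL from the table above:

```yaml
providers:
  - id: local-llm                         # hypothetical identifier
    base_url: https://my-service.internal # ngrok internal endpoint
```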
Basic configuration
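A minimal sketch, assuming provider entries live under a `providers:` list (the wrapper is illustrative; field names come from the table below). Only `id` and `base_url` are required:

```yaml
providers:
  - id: my-provider                     # unique identifier
    base_url: https://api.custom.com/v1 # external URLs must use HTTPS
```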
Configuration fields
| Field | Required | Description |
|---|---|---|
| `id` | Yes | Unique identifier for the provider |
| `base_url` | Yes | Base URL for the provider's API |
| `supported_api_surfaces` | No | API formats this provider supports (default: `openai`) |
| `api_keys` | No | API keys for authentication |
| `models` | No | List of available models |
| `id_aliases` | No | Alternative names for the provider |
| `metadata` | No | Custom metadata for selection strategies |
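Putting several optional fields together, a fuller entry might look like the following sketch (the `providers:` wrapper, the alias, and the metadata key are illustrative assumptions; `supported_api_surfaces` is shown in its simplest string form):

```yaml
providers:
  - id: my-provider
    id_aliases: ["my-llm"]              # hypothetical alternative name
    base_url: https://api.custom.com/v1
    supported_api_surfaces: ["openai"]  # the default
    models:
      - id: llama3                      # model id the service expects
        metadata:
          tier: local                   # hypothetical key for selection strategies
```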
Defining models
Specify which models your custom provider offers in the `models` list.

Model metadata
Add metadata to models for use in selection strategies.

Parameter filtering
By default, no parameter filtering occurs for custom providers or models. Every parameter your client sends is forwarded as-is. If your provider or model rejects parameters that clients commonly include, you can opt into filtering using either or both of the mechanisms below.

Provider surface allowlist
Use `supported_params` on a `supported_api_surfaces` entry to declare the exact set of parameters the provider accepts for that surface. Any top-level request body parameter not on this list will be removed before the request is forwarded.
For example, if the allowlist omits them, requests containing `logit_bias` or `parallel_tool_calls` will have those fields removed before being forwarded to `my-provider`.
`supported_params` only takes effect when the gateway has matched a known surface (`chat-completions`, `responses`, or `messages`). The list must be non-empty to activate filtering.

Model-specific denylist
Use `unsupported_params` on a model to declare parameters that model doesn't accept. These are removed regardless of what the provider surface allows, and regardless of which surface was matched.
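A sketch of a model-level denylist (the `providers:` wrapper and the ids are illustrative):

```yaml
providers:
  - id: my-provider
    base_url: https://api.custom.com/v1
    models:
      - id: llama3
        unsupported_params:       # always stripped for this model
          - logit_bias
          - parallel_tool_calls
```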
Combining both
The two mechanisms are independent and can be used together. Provider-level filtering (allowlist) runs first, then model-level filtering (denylist) runs second. A common pattern is to set `supported_params` at the provider level for all models, then use `unsupported_params` on individual models that have additional restrictions:
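One possible sketch of this pattern (the `surface` entry shape and the ids are assumptions): the provider allowlist applies to every model, and `llama3` additionally drops `tools`:

```yaml
providers:
  - id: my-provider
    base_url: https://api.custom.com/v1
    supported_api_surfaces:
      - surface: chat-completions  # assumed entry shape
        supported_params: [model, messages, temperature, max_tokens, tools]
    models:
      - id: llama3
        unsupported_params: [tools]  # this model also rejects tool calling
```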
Authentication
With provider API keys
If your service requires authentication, reference the key in the provider's `api_keys` field.

Without provider API keys
Some self-hosted services don't require authentication; in that case, omit `api_keys` entirely.

Custom headers
For services requiring non-standard authentication, configure the custom headers the service expects.

Timeouts
Self-hosted models can be slower than cloud providers. Adjust timeouts as needed.

Restricting access
Allow only your custom provider and block cloud providers.

Failover patterns
Custom provider with cloud fallback
Use your self-hosted model as primary with cloud backup.

The first strategy that returns models wins. If your custom provider has matching models, only those are tried. OpenAI is only used if no custom provider models match. For cross-provider failover when requests fail, have clients specify multiple models: `models: ["my-provider:llama3", "openai:gpt-4o"]`.

Multiple custom providers
Load balance across multiple self-hosted instances.

Troubleshooting
Connection refused
- Verify the `base_url` is correct and reachable
- For internal endpoints, ensure the ngrok tunnel is running
- Check firewall rules allow traffic
HTTPS required error
External URLs must use HTTPS. For local services, use ngrok internal endpoints instead.

Authentication errors
- Verify API key is correct in secrets
- Check if the service requires specific headers
- Some services use `Authorization: Bearer` while others use `Api-Key`
Model not found
- Verify the model ID matches exactly what the service expects
- Check if the model is loaded/downloaded on the service
- Some services require specific model name formats
Integration guides
Step-by-step setup instructions for specific platforms:
- Ollama - Run open-source models locally with Ollama
- LM Studio - Desktop app for local model inference
- vLLM - High-performance inference server
- Azure OpenAI - Microsoft's OpenAI service