Before you start
Before creating an AI Gateway endpoint, you’ll need:
- An ngrok account
- Your ngrok auth token
- An ngrok API key (for API-based setup)
Option 1: Cloud Endpoint
Cloud Endpoints are persistent, always-on endpoints managed via the dashboard or API. They’re ideal for production deployments where you want centralized control.
Using the dashboard
- Go to dashboard.ngrok.com/endpoints/new/cloud to create a new Cloud Endpoint
- Choose a URL for your endpoint (for example, https://your-ai-subdomain.ngrok.app)
- Add the following Traffic Policy:
- Click Save to create your endpoint
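A minimal policy for the dashboard editor might look like the sketch below. It assumes the `ai-gateway` action needs no `config` block to run with its defaults; treat that as an assumption and check the action's reference before relying on it.

```yaml
# Minimal sketch: hand every HTTP request on this endpoint to the ai-gateway action.
# Assumption: with no config block, the action falls back to its defaults.
on_http_request:
  - actions:
      - type: ai-gateway
```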
In passthrough mode, clients send their own provider API key in the Authorization header.
After saving, your endpoint will appear in the AI Gateways section of the dashboard because it uses the ai-gateway action.
Using the API
Create a Traffic Policy file named traffic-policy.yaml.
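As a rough sketch, the file only needs a rule that invokes the `ai-gateway` action. Running the action without a `config` block is an assumption here, not something this page confirms.

```yaml
# traffic-policy.yaml (sketch)
# Assumption: the ai-gateway action works with its defaults when no config is given.
on_http_request:
  - actions:
      - type: ai-gateway
```

The contents of this file are what you supply as the endpoint's Traffic Policy when you create the Cloud Endpoint through the API.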
Option 2: Agent Endpoint
Agent Endpoints are created by the ngrok agent and exist for the lifetime of the agent process. They’re ideal for development, testing, or deployments where the endpoint should go offline when your application stops.
Using the Agent CLI
Create a Traffic Policy file named traffic-policy.yaml.
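For an Agent Endpoint, the policy file can be equally small. The sketch below assumes the `ai-gateway` action runs with default settings when no `config` block is present.

```yaml
# traffic-policy.yaml (sketch for an Agent Endpoint)
# Each incoming HTTP request is passed to the ai-gateway action.
# Assumption: omitting config leaves the action on its defaults.
on_http_request:
  - actions:
      - type: ai-gateway
```

The agent can load a policy file with its `--traffic-policy-file` flag, for example `ngrok http 8080 --url https://your-ai-subdomain.ngrok.app --traffic-policy-file traffic-policy.yaml`, where port 8080 is only a placeholder upstream.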
Using the Agent configuration file
Add the Traffic Policy to your ngrok configuration file, ngrok.yml.
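One way this could look is sketched below, assuming the version 3 agent configuration format. The endpoint name, the authtoken placeholder, and the upstream port are illustrative values, and whether an upstream must be declared for an AI Gateway endpoint is an assumption.

```yaml
# ngrok.yml (sketch, agent configuration version 3)
version: 3
agent:
  authtoken: YOUR_NGROK_AUTHTOKEN        # placeholder, replace with your auth token
endpoints:
  - name: ai-gateway                     # hypothetical endpoint name
    url: https://your-ai-subdomain.ngrok.app
    upstream:
      url: 8080                          # placeholder upstream (assumption)
    traffic_policy:
      on_http_request:
        - actions:
            - type: ai-gateway           # assumption: defaults apply without config
```

With a definition like this, `ngrok start ai-gateway` would start the endpoint by name.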
Option 3: Kubernetes Operator
If you’re using the ngrok Kubernetes Operator, create an AI Gateway endpoint with a custom resource defined in ai-gateway.yaml.
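The sketch below shows what such a resource might contain. It assumes the operator's `CloudEndpoint` kind with an inline policy under `spec.trafficPolicy.policy`; the resource name is hypothetical, and the resource kind to use for this setup is an assumption, so compare it against the CRDs installed in your cluster.

```yaml
# ai-gateway.yaml (sketch)
apiVersion: ngrok.k8s.ngrok.com/v1alpha1
kind: CloudEndpoint                      # assumption: a Cloud Endpoint managed by the operator
metadata:
  name: ai-gateway                       # hypothetical resource name
spec:
  url: https://your-ai-subdomain.ngrok.app
  trafficPolicy:
    policy:                              # inline Traffic Policy
      on_http_request:
        - actions:
            - type: ai-gateway           # assumption: defaults apply without config
```

Applying it with `kubectl apply -f ai-gateway.yaml` would hand the resource to the operator.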
Testing your endpoint
Once your endpoint is created, test it with a simple request. In passthrough mode, provide your provider API key with the request.
Configuring server-side API keys
Once your gateway is working, you can configure server-side API keys for failover and to avoid exposing provider keys to clients. Store your keys in Vaults & Secrets, then reference them in your Traffic Policy. With several keys configured across providers such as OpenAI and Anthropic:
- The gateway uses your server-side keys instead of requiring clients to provide their own
- If one key hits rate limits, the gateway automatically tries the next one
- If OpenAI fails entirely, requests can fail over to Anthropic
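The behavior listed above could come from a policy shaped like the following sketch. The `providers`, `id`, `api_keys`, and `value` field names are assumptions about the `ai-gateway` action's configuration schema, and the vault and secret names passed to `secrets.get` are hypothetical; substitute the names you created in Vaults & Secrets.

```yaml
# Sketch: server-side keys for two providers (field names are assumptions).
on_http_request:
  - actions:
      - type: ai-gateway
        config:
          providers:
            # Two OpenAI keys, so a rate-limited key can be followed by the next one.
            - id: openai
              api_keys:
                - value: "${secrets.get('openai', 'key-one')}"   # hypothetical vault/secret names
                - value: "${secrets.get('openai', 'key-two')}"
            # A second provider, giving requests somewhere to fail over to.
            - id: anthropic
              api_keys:
                - value: "${secrets.get('anthropic', 'key-one')}"
```

Keeping the keys in a vault and resolving them at request time means the policy file itself never contains a provider secret.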
When to use each option
| Option | Best For |
|---|---|
| Cloud Endpoint (Dashboard) | Quick setup, non-technical users, prototyping |
| Cloud Endpoint (API) | Production deployments, infrastructure-as-code, CI/CD |
| Agent Endpoint | Development, testing, temporary deployments |
| Kubernetes Operator | Kubernetes-native deployments, GitOps workflows |
Next steps
- Configuring Providers - Set up providers, models, and API keys
- Model Selection Strategies - Define custom routing logic
- SDK Integration - Connect your application