This guide shows how to create an AI Gateway endpoint manually. For a simpler, faster setup, follow the dashboard quickstart guide instead.

Before you start

Before creating an AI Gateway endpoint, you’ll need:

  • An ngrok account
  • An ngrok API key ($NGROK_API_KEY) if you create Cloud Endpoints via the API, or the ngrok agent installed with an authtoken ($NGROK_AUTHTOKEN) if you use Agent Endpoints
  • An API key from at least one AI provider (for example, $OPENAI_API_KEY), either for clients to send in passthrough mode or to store as a server-side key

Option 1: Cloud Endpoint

Cloud Endpoints are persistent, always-on endpoints managed via the dashboard or API. They’re ideal for production deployments where you want centralized control.

Using the dashboard

  1. Go to dashboard.ngrok.com/endpoints/new/cloud to create a new Cloud Endpoint
  2. Choose a URL for your endpoint (for example, https://your-ai-subdomain.ngrok.app)
  3. Add the following Traffic Policy:
on_http_request:
  - type: ai-gateway
    config: {}
  4. Click Save to create your endpoint
With an empty config, the gateway operates in passthrough mode: it routes requests to providers and uses the API key from the client’s Authorization header.
After saving, your endpoint will appear in the AI Gateways section of the dashboard since it uses the ai-gateway action.

Using the API

Create a Traffic Policy file:
traffic-policy.yaml
on_http_request:
  - type: ai-gateway
    config: {}
Create the Cloud Endpoint using the ngrok CLI:
ngrok api endpoints create \
  --api-key $NGROK_API_KEY \
  --url https://your-ai-subdomain.ngrok.app \
  --type cloud \
  --bindings public \
  --traffic-policy-file traffic-policy.yaml
Or using cURL:
curl -X POST https://api.ngrok.com/endpoints \
  -H "Authorization: Bearer $NGROK_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Ngrok-Version: 2" \
  -d '{
    "url": "https://your-ai-subdomain.ngrok.app",
    "type": "cloud",
    "bindings": ["public"],
    "traffic_policy": "on_http_request:\n  - type: ai-gateway\n    config: {}"
  }'
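The same request in Python, as a minimal sketch using the requests library (only the URL, headers, and fields shown in the cURL example above are assumed; the response is printed as-is):
import os

import requests

# The Traffic Policy is sent as a YAML string, exactly as in the cURL example.
TRAFFIC_POLICY = "on_http_request:\n  - type: ai-gateway\n    config: {}"

response = requests.post(
    "https://api.ngrok.com/endpoints",
    headers={
        "Authorization": f"Bearer {os.environ['NGROK_API_KEY']}",
        "Content-Type": "application/json",
        "Ngrok-Version": "2",
    },
    json={
        "url": "https://your-ai-subdomain.ngrok.app",
        "type": "cloud",
        "bindings": ["public"],
        "traffic_policy": TRAFFIC_POLICY,
    },
)
response.raise_for_status()
print(response.json())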

Option 2: Agent Endpoint

Agent Endpoints are created by the ngrok agent and exist for the lifetime of the agent process. They’re ideal for development, testing, or deployments where the endpoint should go offline when your application stops.

Using the Agent CLI

Create a Traffic Policy file:
traffic-policy.yaml
on_http_request:
  - type: ai-gateway
    config: {}
Start the agent with the Traffic Policy:
ngrok http 8080 --url https://your-ai-subdomain.ngrok.app --traffic-policy-file traffic-policy.yaml
When using an Agent Endpoint as an AI Gateway, the upstream port (8080 here) doesn’t matter, since the ai-gateway action handles all requests before they reach your upstream. You can use any port number.

Using the agent configuration file

Add the Traffic Policy to your ngrok configuration file:
ngrok.yml
version: 3
agent:
  authtoken: $NGROK_AUTHTOKEN
endpoints:
  - name: ai-gateway
    url: https://your-ai-subdomain.ngrok.app
    upstream:
      url: http://localhost:8080  # Not used when ai-gateway handles requests
    traffic_policy:
      on_http_request:
        - type: ai-gateway
          config: {}
Start the agent:
ngrok start ai-gateway
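To confirm the endpoint came online, you can list your endpoints through the ngrok API. A minimal sketch with Python and requests; the endpoints key in the list response is an assumption about the response shape:
import os

import requests

response = requests.get(
    "https://api.ngrok.com/endpoints",
    headers={
        "Authorization": f"Bearer {os.environ['NGROK_API_KEY']}",
        "Ngrok-Version": "2",
    },
)
response.raise_for_status()
for endpoint in response.json().get("endpoints", []):  # assumed list key
    print(endpoint.get("url"), endpoint.get("type"))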

Option 3: Kubernetes Operator

If you’re using the ngrok Kubernetes Operator, create an AI Gateway endpoint using a custom resource:
ai-gateway.yaml
apiVersion: ngrok.k8s.io/v1alpha1
kind: CloudEndpoint
metadata:
  name: ai-gateway
spec:
  url: https://your-ai-subdomain.ngrok.app
  trafficPolicy:
    on_http_request:
      - type: ai-gateway
        config: {}
Apply the resource:
kubectl apply -f ai-gateway.yaml
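If you manage cluster resources from Python rather than kubectl, the same manifest can be applied with the official kubernetes client. This is a sketch under two assumptions: that the CloudEndpoint CRD’s plural is cloudendpoints, and that the resource lives in the default namespace:
import yaml
from kubernetes import client, config

config.load_kube_config()  # uses your current kubeconfig context

with open("ai-gateway.yaml") as f:
    manifest = yaml.safe_load(f)

# CloudEndpoint is a custom resource, so it goes through CustomObjectsApi.
api = client.CustomObjectsApi()
api.create_namespaced_custom_object(
    group="ngrok.k8s.io",
    version="v1alpha1",
    namespace="default",      # assumed namespace
    plural="cloudendpoints",  # assumed plural for the CloudEndpoint CRD
    body=manifest,
)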

Testing your endpoint

Once your endpoint is created, test it with a simple request. In passthrough mode, provide your provider API key:
curl https://your-ai-subdomain.ngrok.app/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Or with the OpenAI SDK:
from openai import OpenAI

client = OpenAI(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="sk-..."  # Your OpenAI API key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
Once you configure server-side API keys, clients won’t need to provide their own keys.

Configuring server-side API keys

Once your gateway is working, you can configure server-side API keys for failover and to avoid exposing provider keys to clients. Store your keys in Vaults & Secrets, then reference them in your Traffic Policy:
on_http_request:
  - type: ai-gateway
    config:
      providers:
        - id: openai
          api_keys:
            - value: ${secrets.get('openai', 'primary-key')}
            - value: ${secrets.get('openai', 'backup-key')}
        - id: anthropic
          api_keys:
            - value: ${secrets.get('anthropic', 'api-key')}
With this configuration:
  • The gateway uses your server-side keys instead of requiring clients to provide their own
  • If one key hits rate limits, the gateway automatically tries the next one
  • If OpenAI fails entirely, requests can fail over to Anthropic
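With server-side keys in place, clients call the gateway exactly as before but without a real provider key. A minimal sketch with the OpenAI SDK; the placeholder api_key value is an assumption, since the SDK requires a non-empty string even though the gateway supplies the real key:
from openai import OpenAI

client = OpenAI(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="unused",  # placeholder; the gateway injects the server-side key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)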
When you configure server-side API keys, your endpoint becomes publicly accessible. Anyone with the URL can make requests using your keys. See Securing Your Gateway to add authorization.
See Managing API Keys for detailed instructions on setting up vaults and secrets.

When to use each option

Option                        Best For
Cloud Endpoint (Dashboard)    Quick setup, non-technical users, prototyping
Cloud Endpoint (API)          Production deployments, infrastructure-as-code, CI/CD
Agent Endpoint                Development, testing, temporary deployments
Kubernetes Operator           Kubernetes-native deployments, GitOps workflows

Next steps