Use a model you run yourself

Use a custom provider when you want the AI Gateway to call a model running on your machine, private network, or cloud GPU. Custom providers can point to self-hosted tools like Ollama, vLLM, LM Studio, or any OpenAI- or Anthropic-compatible API.

What you’ll need

Your model server must:

Expose a supported API surface, such as OpenAI Chat Completions, OpenAI Responses, or Anthropic Messages.
Be reachable from the AI Gateway.
Have at least one model ID configured on the provider.

Connect a local model with an internal endpoint

If your model runs on your machine or private network, expose it with an ngrok internal endpoint. For example, if your model server listens on port 11434:

ngrok http 11434 --url https://my-ollama.internal

Use the internal endpoint URL as the custom provider base URL.

Internal endpoints (.internal domains) are private to your ngrok account, so they’re not reachable from the public internet. Create the endpoint in the same ngrok account that your AI Gateway uses; otherwise, the gateway can’t reach it.

Create a custom provider

In app.ngrok.ai:

Go to Providers.
Open the Custom tab.
Select Add provider.
Choose External URL or Local.
Enter a provider ID, base URL, API format, and model IDs.
Add a provider key if the upstream requires authentication.

Create a custom provider with the API

curl -X POST https://api.ngrok.ai/providers \
  -H "Authorization: Bearer $AI_GATEWAY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "providerId": "my-ollama",
    "name": "Local Ollama",
    "baseUrl": "https://my-ollama.internal",
    "supportedApiSurfaces": [
      { "format": "openai", "surface": "chat-completions" }
    ],
    "models": [{ "modelId": "llama3.2" }]
  }'

See the Providers API reference for every field.
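
If you prefer to script provider creation, the same request can be sketched in Python with only the standard library. The endpoint, header names, and JSON fields are copied from the curl example above; the function name build_create_provider_request and the "example-key" placeholder are illustrative assumptions, and the sketch only builds the request without sending it.

```python
import json
import urllib.request

# Endpoint and field names are copied from the curl example above.
PROVIDERS_URL = "https://api.ngrok.ai/providers"


def build_create_provider_request(api_key: str, provider: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a custom provider."""
    return urllib.request.Request(
        PROVIDERS_URL,
        data=json.dumps(provider).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


provider = {
    "providerId": "my-ollama",
    "name": "Local Ollama",
    "baseUrl": "https://my-ollama.internal",
    "supportedApiSurfaces": [{"format": "openai", "surface": "chat-completions"}],
    "models": [{"modelId": "llama3.2"}],
}

# "example-key" is a placeholder; in practice, read the key from AI_GATEWAY_API_KEY.
request = build_create_provider_request("example-key", provider)

# To send the request: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the payload before anything reaches the API.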

Allow the provider on an access key

Creating a custom provider doesn’t automatically allow every access key to call it. To route traffic to the provider:

Add a provider key if the upstream requires authentication.
Create or update an access key configuration.
Allow the custom provider ID.
Assign the configuration to an access key.

Call the model

Set the model field of the request to provider:model, where provider is the custom provider ID and model is one of its model IDs.

{
  "model": "my-ollama:llama3.2",
  "messages": [{"role": "user", "content": "Hello"}]
}
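
As a minimal sketch, the request body above could be assembled in Python like this. The helper names qualified_model and chat_request_body are hypothetical, and the sketch stops at producing the JSON body because this page doesn’t specify the gateway URL to send it to.

```python
import json


def qualified_model(provider_id: str, model_id: str) -> str:
    """Join a provider ID and a model ID into the provider:model form."""
    if not provider_id or not model_id:
        raise ValueError("provider_id and model_id must both be non-empty")
    return f"{provider_id}:{model_id}"


def chat_request_body(provider_id: str, model_id: str, prompt: str) -> str:
    """Serialize a Chat Completions style body addressed to a custom provider."""
    body = {
        "model": qualified_model(provider_id, model_id),
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(body)


body = chat_request_body("my-ollama", "llama3.2", "Hello")
print(body)
```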

URL requirements

URL type	Scheme	Example
External	`https://` only	`https://api.example.com/v1`
ngrok internal	`http://` or `https://`	`https://my-ollama.internal`

HTTP is only allowed for ngrok .internal endpoints. External URLs must use HTTPS.
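
To catch a bad base URL before you submit it, the rules in the table above can be mirrored in a small client-side check. This is a sketch under the assumption that an ngrok internal endpoint is any host ending in .internal; the function name is_allowed_base_url is hypothetical, and the gateway’s own validation may be stricter.

```python
from urllib.parse import urlsplit


def is_allowed_base_url(base_url: str) -> bool:
    """Return True if base_url follows the scheme rules in the table above."""
    parts = urlsplit(base_url)
    host = (parts.hostname or "").lower()
    if not host:
        return False  # no host, so not a usable base URL
    if host.endswith(".internal"):
        # ngrok internal endpoints may use either scheme.
        return parts.scheme in {"http", "https"}
    # Everything else is treated as an external URL: HTTPS only.
    return parts.scheme == "https"


for url in (
    "https://api.example.com/v1",
    "http://api.example.com/v1",
    "http://my-ollama.internal",
    "https://my-ollama.internal",
):
    print(url, is_allowed_base_url(url))
```

Only the second URL is rejected, because it uses HTTP with a host that isn’t a .internal endpoint.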

Next steps

Ollama guide: Connect Ollama
vLLM guide: Connect vLLM
Provider keys: Store upstream credentials
Restrict providers and models: Allow the custom provider on a key
