Inference.net provides a distributed inference network for running AI models at scale. It requires you to bring your own key; ngrok-managed keys are not available.

Setup

1. Create an AI Gateway endpoint

If you don’t have one yet, follow the quickstart to create your AI Gateway endpoint.
2. Get an Inference.net API key

Sign up at inference.net and generate an API key.
3. Make a request

Pass your key directly—the gateway forwards it to Inference.net.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-ai-gateway.ngrok.app/v1",
    api_key="..."  # Your Inference.net key, forwarded by gateway
)

response = client.chat.completions.create(
    model="inference-net:meta-llama/llama-3.1-8b-instruct/fp-8",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
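Under the hood, the OpenAI client is making an ordinary HTTPS call and the gateway forwards the bearer token verbatim. A hedged sketch of the equivalent raw request, for debugging with other HTTP tooling (the gateway URL and key below are placeholders, and `build_chat_request` is a hypothetical helper, not part of any SDK):

```python
import json

GATEWAY = "https://your-ai-gateway.ngrok.app"  # your AI Gateway endpoint
INFERENCE_NET_KEY = "..."                      # your Inference.net key (placeholder)

def build_chat_request(model, user_message):
    """Return (url, headers, body) matching what the OpenAI client sends."""
    url = f"{GATEWAY}/v1/chat/completions"
    headers = {
        # The gateway forwards this bearer token to Inference.net as-is.
        "Authorization": f"Bearer {INFERENCE_NET_KEY}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    })
    return url, headers, body

url, headers, body = build_chat_request(
    "inference-net:meta-llama/llama-3.1-8b-instruct/fp-8", "Hello!"
)
```

The `inference-net:` prefix on the model name is what routes the request to the Inference.net provider; everything after it is the provider's own model identifier.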

Store key in the gateway

Instead of each client passing their own key, you can store it once in ngrok Secrets and have the gateway inject it automatically.
Storing your provider key in the gateway makes your endpoint publicly accessible. You must add authorization to prevent unauthorized use and unexpected charges. See Protecting BYOK Endpoints.
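In concrete terms, that means placing an authorization action before the ai-gateway action in your policy. The sketch below is illustrative only: the `basic-auth` action name, its config shape, and the credentials are assumptions here, so check ngrok's Traffic Policy documentation for the exact schema before using it.

```yaml
on_http_request:
  # Reject unauthenticated requests before the AI Gateway runs.
  - type: basic-auth
    config:
      credentials:
        - "client:a-strong-password"
  - type: ai-gateway
    config:
      providers:
        - id: "inference-net"
          api_keys:
            - value: ${secrets.get('inference-net', 'api-key')}
```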
1. Store your key in ngrok Secrets

ngrok api secrets create \
  --name inference-net \
  --secret-data '{"api-key": "..."}'
Or use the Vaults & Secrets dashboard.
2. Configure your traffic policy

on_http_request:
  - type: ai-gateway
    config:
      providers:
        - id: "inference-net"
          api_keys:
            - value: ${secrets.get('inference-net', 'api-key')}
3. Make a request

Clients no longer need an Inference.net key—pass any value for api_key.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-ai-gateway.ngrok.app/v1",
    api_key="unused"  # Gateway injects your Inference.net key
)

response = client.chat.completions.create(
    model="inference-net:meta-llama/llama-3.1-8b-instruct/fp-8",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
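With authorization in front of the gateway, clients can see failures from two layers: the gateway rejecting their credentials, or the provider call failing upstream. A small hedged helper for telling them apart when deciding whether to retry (the status-code mapping below follows general HTTP conventions and is an assumption, not gateway-documented behavior):

```python
def should_retry(status: int) -> bool:
    """Decide whether an HTTP error from the gateway is worth retrying."""
    if status in (401, 403):
        return False  # credentials problem: retrying won't help, fix the key
    if status == 429 or status >= 500:
        return True   # rate limited or transient upstream failure
    return False      # other client errors (bad model name, bad payload, ...)
```

Surfacing 401/403 to the caller immediately, rather than retrying, keeps a misconfigured client from hammering your endpoint and burning requests.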

Next steps