Azure OpenAI Service provides access to OpenAI models through Microsoft’s Azure cloud. This guide shows you how to connect Azure OpenAI to the ngrok AI Gateway.

Prerequisites

Before you begin, you need:
  1. An Azure OpenAI resource with at least one model deployment
  2. An ngrok account with an endpoint where you can attach a Traffic Policy

Overview

Azure OpenAI uses a different URL structure from the standard OpenAI API: each model deployment has its own URL under your resource endpoint. You configure that full deployment URL as the provider's base_url, and the gateway routes requests to it.
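For example, with an illustrative resource named my-resource and a deployment named gpt-4o-deployment, the deployment URL looks like:

https://my-resource.openai.azure.com/openai/deployments/gpt-4o-deployment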

Getting started

Step 1: Get Azure OpenAI details

From the Azure Portal, gather:
  1. Endpoint URL: https://your-resource.openai.azure.com
  2. API Key: From “Keys and Endpoint” section
  3. Deployment Name: The name you gave your model deployment
Step 2: Store your API key

Add your Azure OpenAI API key to ngrok secrets:
ngrok api secrets create \
  --name azure-openai \
  --secret-data '{"api-key": "your-azure-api-key"}'
You can also create secrets in the ngrok Dashboard.
Step 3: Configure the AI Gateway

Create a Traffic Policy with Azure OpenAI as a provider:
policy.yaml
on_http_request:
  - type: ai-gateway
    config:
      headers:
        api-version: "2024-02-15-preview"
      providers:
        - id: "azure-openai"
          base_url: "https://your-resource.openai.azure.com/openai/deployments/your-deployment"
          api_keys:
            - value: ${secrets.get('azure-openai', 'api-key')}
          models:
            - id: "gpt-4o"
Azure OpenAI requires the api-version header. The headers configuration above ensures this is added to all requests.
Step 4: Use with OpenAI SDK

Point any OpenAI-compatible SDK at your AI Gateway:
from openai import OpenAI

client = OpenAI(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="unused"  # Gateway handles auth
)

response = client.chat.completions.create(
    model="azure-openai:gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
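Streaming works through the same client. A minimal sketch, assuming the gateway passes Azure's streamed responses through:

stream = client.chat.completions.create(
    model="azure-openai:gpt-4o",
    messages=[{"role": "user", "content": "Tell me a short story."}],
    stream=True,
)

for chunk in stream:
    # Each chunk carries an incremental delta of the response text
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")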

Advanced configuration

Multiple deployments

Configure multiple Azure OpenAI deployments:
on_http_request:
  - type: ai-gateway
    config:
      headers:
        api-version: "2024-02-15-preview"
      providers:
        - id: "azure-gpt4"
          base_url: "https://your-resource.openai.azure.com/openai/deployments/gpt-4o-deployment"
          api_keys:
            - value: ${secrets.get('azure-openai', 'api-key')}
          models:
            - id: "gpt-4o"
        
        - id: "azure-gpt35"
          base_url: "https://your-resource.openai.azure.com/openai/deployments/gpt-35-deployment"
          api_keys:
            - value: ${secrets.get('azure-openai', 'api-key')}
          models:
            - id: "gpt-3.5-turbo"

Multiple regions

Configure multiple Azure regions for failover:
on_http_request:
  - type: ai-gateway
    config:
      headers:
        api-version: "2024-02-15-preview"
      providers:
        - id: "azure-eastus"
          base_url: "https://myapp-eastus.openai.azure.com/openai/deployments/gpt-4o"
          api_keys:
            - value: ${secrets.get('azure-openai-eastus', 'api-key')}
          models:
            - id: "gpt-4o"
          metadata:
            region: "eastus"
        
        - id: "azure-westus"
          base_url: "https://myapp-westus.openai.azure.com/openai/deployments/gpt-4o"
          api_keys:
            - value: ${secrets.get('azure-openai-westus', 'api-key')}
          models:
            - id: "gpt-4o"
          metadata:
            region: "westus"
Without a model selection strategy, requesting model: "gpt-4o" returns both regions as candidates (in config order), enabling failover. Requesting model: "azure-eastus:gpt-4o" pins to that region only. For explicit control over failover order, clients can use models: ["azure-eastus:gpt-4o", "azure-westus:gpt-4o"].
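For example, pinning a request to the East US deployment (reusing the client from step 4):

response = client.chat.completions.create(
    model="azure-eastus:gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)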

Failover to OpenAI

Use Azure as primary with OpenAI fallback:
on_http_request:
  - type: ai-gateway
    config:
      headers:
        api-version: "2024-02-15-preview"
      providers:
        - id: "azure-openai"
          base_url: "https://your-resource.openai.azure.com/openai/deployments/gpt-4o"
          api_keys:
            - value: ${secrets.get('azure-openai', 'api-key')}
          models:
            - id: "gpt-4o"
        
        - id: "openai"
          api_keys:
            - value: ${secrets.get('openai', 'api-key')}
      
      model_selection:
        strategy:
          - "ai.models.filter(m, m.provider_id == 'azure-openai')"
          - "ai.models.filter(m, m.provider_id == 'openai')"
The first strategy that returns models wins. If Azure has matching models, only those are tried. OpenAI is only used if no Azure models match. For cross-provider failover when requests fail, have clients specify multiple models: models: ["azure-openai:gpt-4o", "openai:gpt-4o"].
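A sketch of the client side, assuming the OpenAI Python SDK from step 4; extra_body is the SDK's mechanism for sending fields it doesn't define, such as the gateway's models list:

response = client.chat.completions.create(
    model="gpt-4o",  # required by the SDK; the models list below drives routing (assumption)
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={"models": ["azure-openai:gpt-4o", "openai:gpt-4o"]},
)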

Embeddings

Configure Azure OpenAI embeddings:
on_http_request:
  - type: ai-gateway
    config:
      headers:
        api-version: "2024-02-15-preview"
      providers:
        - id: "azure-embeddings"
          base_url: "https://your-resource.openai.azure.com/openai/deployments/text-embedding-ada-002"
          api_keys:
            - value: ${secrets.get('azure-openai', 'api-key')}
          models:
            - id: "text-embedding-ada-002"

Troubleshooting

401 unauthorized

Symptom: Requests fail with authentication errors.

Solutions:
  1. Verify the API key is correct in secrets
  2. Check the key hasn’t been regenerated in Azure Portal
  3. Ensure the secret name matches your config

404 deployment not found

Symptom: Requests fail with a “deployment not found” error.

Solutions:
  1. Verify the deployment name in your base_url
  2. Check the deployment exists in Azure Portal
  3. Ensure the deployment is in the correct region

API version errors

Symptom: Requests fail with API version errors.

Solutions:
  1. Update the api-version header to a supported version
  2. Check the Azure OpenAI API versions documentation for currently supported versions

Rate limiting

Symptom: Requests return 429 errors from Azure.

Solutions:
  1. Configure multiple deployments for failover
  2. Request quota increase in Azure Portal
  3. Add multiple API keys per deployment for automatic failover, as sketched below
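The multiple-key approach is a config-only change, since api_keys already accepts a list. A sketch, assuming a second key stored under a hypothetical api-key-2 field in the same secret (check the AI Gateway reference for exact key-failover semantics):

on_http_request:
  - type: ai-gateway
    config:
      headers:
        api-version: "2024-02-15-preview"
      providers:
        - id: "azure-openai"
          base_url: "https://your-resource.openai.azure.com/openai/deployments/your-deployment"
          api_keys:
            - value: ${secrets.get('azure-openai', 'api-key')}
            - value: ${secrets.get('azure-openai', 'api-key-2')}  # hypothetical second key
          models:
            - id: "gpt-4o"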
