When troubleshooting AI Gateway issues, you can access detailed information about what happened during request processing using action result variables. This page explains how to capture and interpret this data.

Action result variables

After the ai-gateway action runs, detailed results are available in ${actions.ngrok.ai_gateway}. This includes:
  • Model selection process and filtering steps
  • Every model and request attempted
  • API key selection details
  • Token counts
  • Latency measurements
  • Error details for failed attempts

Accessing action results

To access action results, configure on_error: "continue" so subsequent actions can inspect the data:
on_http_request:
  - type: ai-gateway
    config:
      on_error: continue
  - type: log
    config:
      metadata:
        ai_gateway_result: ${actions.ngrok.ai_gateway}
  - type: deny
Cloud Endpoints require a terminal action such as deny, custom-response, redirect, or forward-internal to complete the request. See Cloud Endpoints for more details.

Debugging patterns

Return results as response (development)

During development, return the full action result to the client for inspection:
on_http_request:
  - type: ai-gateway
    config:
      on_error: continue
  - type: custom-response
    config:
      status_code: 503
      headers:
        content-type: application/json
      body: ${actions.ngrok.ai_gateway}
Example response:
{
  "status": "error",
  "error": {
    "code": "ERR_NGROK_3807",
    "message": "All AI providers failed to respond successfully."
  },
  "client": {
    "method": "POST",
    "path": "/v1/chat/completions",
    "user_agent": "curl/8.17.0",
    "model": "gpt-4o",
    "models": ["gpt-4.1", "gpt-5"],
    "api_key_hash": "sk-proj...7890" // If the actual key was sk-proj-abcdefghijklmnopqrstuvwxyz1234567890
  },
  "input_ngrok_tokens": 150,
  "gateway_latency_ms": 45,
  "model_selection": [
    {
      "strategy": "ai.models.filter(x, x.id in [\"gpt-4o\", \"claude-3-5-sonnet-20241022\"])",
      "candidates_returned": ["gpt-4o", "claude-3-5-sonnet-20241022"],
      "candidates_after_allowed_filter": ["gpt-4o"],
      "candidates_after_client_filter": ["gpt-4o"],
      "models_to_try": ["gpt-4o"]
    }
  ],
  "models_tried": [
    {
      "model": "gpt-4o",
      "provider": "openai",
      "author": "openai",
      "api_key_selection": [
        {
          "strategy": "ai.keys",
          "keys_to_try": ["sk-proj...2315"]
        }
      ],
      "requests_made": [
        {
          "status": "error",
          "error": "rate limit exceeded",
          "upstream_input_tokens": 0,
          "upstream_output_tokens": 0,
          "request": {
            "url": "https://api.openai.com/v1/chat/completions",
            "api_key": "sk-proj...2315"
          },
          "response": {
            "status_code": 429,
            "headers": {"content-type": ["application/json"]},
            "body_on_error": "{\"error\":{\"message\":\"Rate limit exceeded\",\"type\":\"rate_limit_error\"}}"
          },
          "upstream_latency": {
            "time_to_first_byte_ms": 120,
            "total_ms": 125
          }
        }
      ]
    }
  ]
}
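
If you capture this JSON (for example via the custom-response pattern above), a short script can flatten it into one line per upstream attempt. This is a minimal sketch: `summarize_attempts` is a hypothetical helper, and the field names simply follow the example payload above.

```python
def summarize_attempts(result: dict) -> list[str]:
    """Flatten models_tried into one human-readable line per upstream request."""
    lines = []
    for model in result.get("models_tried", []):
        for req in model.get("requests_made", []):
            status = req.get("response", {}).get("status_code", 0)
            lines.append(
                f"{model.get('provider')}/{model.get('model')}: "
                f"status={req.get('status')} http={status} "
                f"error={req.get('error', '-')}"
            )
    return lines

# Example usage with a trimmed result payload:
result = {
    "models_tried": [
        {
            "model": "gpt-4o",
            "provider": "openai",
            "requests_made": [
                {"status": "error", "error": "rate limit exceeded",
                 "response": {"status_code": 429}}
            ],
        }
    ]
}
for line in summarize_attempts(result):
    print(line)  # prints: openai/gpt-4o: status=error http=429 error=rate limit exceeded
```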

Send to log exports (production)

In production, send action results to your logging infrastructure:
on_http_request:
  - type: ai-gateway
    config:
      on_error: continue
  - type: log
    config:
      metadata:
        ai_gateway_result: ${actions.ngrok.ai_gateway}
  - type: deny
This fires a log event that can be exported to your observability platform. See Log Exporting for setup.

Combined approach

Log the results and return a user-friendly error:
on_http_request:
  - type: ai-gateway
    config:
      on_error: continue
  - type: log
    config:
      metadata:
        ai_gateway_result: ${actions.ngrok.ai_gateway}
  - type: custom-response
    config:
      status_code: 503
      headers:
        content-type: application/json
      body: |
        {
          "error": "AI service temporarily unavailable",
          "code": "${actions.ngrok.ai_gateway.error.code}"
        }

Interpreting results

Identifying rate limits

Look for status_code: 429 in request responses:
{
  "models_tried": [
    {
      "model": "gpt-4o",
      "provider": "openai",
      "requests_made": [
        {
          "status": "error",
          "response": {
            "status_code": 429,
            "body_on_error": "{\"error\":{\"type\":\"rate_limit_error\"}}"
          },
          "request": {
            "api_key": "sk-proj...2315"
          }
        }
      ]
    }
  ]
}
Solution: Attach additional keys for the same provider. The most recently attached key is tried first, with older keys used as fallback on failure. See Multi-Key Failover.
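
When an export contains many attempts, scanning by eye gets tedious. A small sketch that collects the (masked) keys behind any HTTP 429 responses; `rate_limited_keys` is a hypothetical helper, assuming the result shape shown above:

```python
def rate_limited_keys(result: dict) -> set[str]:
    """Collect the masked API keys whose requests came back HTTP 429."""
    keys = set()
    for model in result.get("models_tried", []):
        for req in model.get("requests_made", []):
            if req.get("response", {}).get("status_code") == 429:
                keys.add(req.get("request", {}).get("api_key", "<unknown>"))
    return keys

result = {
    "models_tried": [{
        "model": "gpt-4o",
        "provider": "openai",
        "requests_made": [{
            "status": "error",
            "response": {"status_code": 429},
            "request": {"api_key": "sk-proj...2315"},
        }],
    }]
}
print(rate_limited_keys(result))  # the keys that need extra capacity or failover
```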

Identifying model filtering issues

When you get ERR_NGROK_3804 or want to understand what the gateway did, check model_selection to understand how models were filtered:
{
  "status": "error",
  "model_selection": [
    {
      "strategy": "round-robin",
      "candidates_returned": ["gpt-4o", "claude-3-5-sonnet-20241022"],
      "candidates_after_allowed_filter": ["gpt-4o"],
      "candidates_after_client_filter": [], // After filtering by client model/models we have no remaining models
      "models_to_try": []
    }
  ],
  "models_tried": [],
  "error": {
    "code": "ERR_NGROK_3804",
    "message": "Unable to route request - no models matched"
  }
}
Solution: The client requested models that did not survive filtering. Check client.model and client.models for typos or unrecognized model names, then adjust your configuration or the client request. For new models, or models not in the gateway config or model catalog, try adding the provider prefix.

Identifying timeout issues

Look for attempts with status_code: 0 (no response was received) or timeout errors such as context deadline exceeded:
{
  "models_tried": [
    {
      "model": "gpt-4o",
      "provider": "openai",
      "requests_made": [
        {
          "status": "error",
          "error": "context deadline exceeded",
          "response": {
            "status_code": 0
          }
        }
      ]
    }
  ]
}
Solution: Increase per_request_timeout or investigate provider latency using upstream_latency metrics.
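
A status_code of 0 is the signature of a request that never got a response. A minimal sketch that lists the affected models; `timed_out_attempts` is a hypothetical helper, assuming the result shape shown above:

```python
def timed_out_attempts(result: dict) -> list[str]:
    """List models whose requests got no HTTP response (status_code 0)."""
    out = []
    for model in result.get("models_tried", []):
        for req in model.get("requests_made", []):
            if req.get("response", {}).get("status_code", 0) == 0:
                out.append(model.get("model", "<unknown>"))
    return out

result = {
    "models_tried": [{
        "model": "gpt-4o",
        "provider": "openai",
        "requests_made": [{
            "status": "error",
            "error": "context deadline exceeded",
            "response": {"status_code": 0},
        }],
    }]
}
print(timed_out_attempts(result))  # prints: ['gpt-4o']
```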

Understanding latency

Use the latency measurements to identify bottlenecks:
{
  "gateway_latency_ms": 45,
  "models_tried": [
    {
      "requests_made": [
        {
          "upstream_latency": {
            "time_to_first_byte_ms": 2500,
            "total_ms": 8000
          }
        }
      ]
    }
  ]
}
Solution: High time_to_first_byte_ms or total_ms indicates a slow provider response; high gateway_latency_ms indicates gateway processing overhead. Increase per_request_timeout or total_timeout if requests are being cut off, or investigate latency with the provider directly.
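
These measurements can be turned into alerts. A minimal sketch that flags slow attempts against a threshold; `latency_report` is a hypothetical helper, the 2000 ms threshold is an arbitrary example, and the field names follow the payload above:

```python
def latency_report(result: dict, slow_ms: int = 2000) -> list[str]:
    """Flag gateway overhead and time-to-first-byte values above slow_ms."""
    report = []
    gw = result.get("gateway_latency_ms", 0)
    if gw > slow_ms:
        report.append(f"gateway overhead high: {gw}ms")
    for model in result.get("models_tried", []):
        for req in model.get("requests_made", []):
            ttfb = req.get("upstream_latency", {}).get("time_to_first_byte_ms", 0)
            if ttfb > slow_ms:
                report.append(f"slow provider first byte: {ttfb}ms")
    return report

# Using the numbers from the example above:
result = {
    "gateway_latency_ms": 45,
    "models_tried": [{
        "requests_made": [{
            "upstream_latency": {"time_to_first_byte_ms": 2500, "total_ms": 8000},
        }],
    }]
}
print(latency_report(result))  # prints: ['slow provider first byte: 2500ms']
```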

Next steps

  • Troubleshooting: Error codes and solutions
  • Log Exporting: Export logs to your observability platform