With Traffic Policy, you can block unwanted requests to your endpoints. This page demonstrates a few example rules that do so.
See the Traffic Policy docs for the `deny`, `custom-response`, `forward-internal`, and `rate-limit` actions used below for more information.
How to deny traffic from Tor
This rule uses the connection variables available in IP Intelligence to block Tor exit node IPs.
```yaml
on_http_request:
  - expressions:
      - "('proxy.anonymous.tor' in conn.client_ip.categories)"
    actions:
      - type: deny
        config:
          status_code: 403
```
How to deny traffic from bots and crawlers with a robots.txt
This rule returns a custom response with a robots.txt file to deny search engine or AI crawlers on all paths.
```yaml
on_http_request:
  - expressions:
      - "!req.url.path.contains('/robots.txt')"
    actions:
      - type: forward-internal
        config:
          url: <Internal endpoint URL Here>
  - expressions:
      - "req.url.path.contains('/robots.txt')"
    actions:
      - type: custom-response
        config:
          body: "User-agent: *\r\nDisallow: /"
          headers:
            content-type: text/plain
          status_code: 200
```
You can extend this example to create specific rules for crawlers based on their user agent strings, like `ChatGPT-User` and `GPTBot`.
```yaml
on_http_request:
  - name: Add `robots.txt` to deny specific bots and crawlers
    expressions:
      - req.url.contains('/robots.txt')
    actions:
      - type: custom-response
        config:
          status_code: 200
          body: "User-agent: ChatGPT-User\r\nDisallow: /"
          headers:
            content-type: text/plain
```
How to block traffic from bots and crawlers by user agent
You can also take action on incoming requests that contain specific strings in the `req.user_agent` request variable.
```yaml
on_http_request:
  - name: Block specific bots by user agent
    expressions:
      - req.user_agent.name in ['ChatGPT-User', 'GPTBot', 'OAI-SearchBot']
    actions:
      - type: deny
        config:
          status_code: 404
```
You can expand the expression to include additional user agents by adding them to the list: `['ChatGPT-User', 'GPTBot', 'anthropic', 'claude']`.
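As a sketch, the rule above expanded with those additional entries might look like the following (the `name` field is illustrative):

```yaml
on_http_request:
  - name: Block additional bots by user agent
    expressions:
      - req.user_agent.name in ['ChatGPT-User', 'GPTBot', 'OAI-SearchBot', 'anthropic', 'claude']
    actions:
      - type: deny
        config:
          status_code: 404
```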
How to block traffic from bots and crawlers by IP address
You can also use IP Intelligence variables to block AI bots by IP address.
```json
{
  "on_http_request": [
    {
      "name": "Block specific AI Bots with IP Intelligence",
      "expressions": [
        "('com.anthropic' in conn.client_ip.categories) || ('com.openai' in conn.client_ip.categories) || ('com.perplexity' in conn.client_ip.categories)"
      ],
      "actions": [
        {
          "type": "deny",
          "config": {
            "status_code": 404
          }
        }
      ]
    }
  ]
}
```
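The other examples on this page are written in YAML; the same rule translates directly:

```yaml
on_http_request:
  - name: Block specific AI Bots with IP Intelligence
    expressions:
      - "('com.anthropic' in conn.client_ip.categories) || ('com.openai' in conn.client_ip.categories) || ('com.perplexity' in conn.client_ip.categories)"
    actions:
      - type: deny
        config:
          status_code: 404
```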
Deny non-GET requests
This rule denies all inbound traffic that is not a GET request.
```yaml
on_http_request:
  - expressions:
      - req.method != 'GET'
    actions:
      - type: deny
```
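If you want the response status to signal the reason for the denial, you can set an explicit status code on the `deny` action, as the earlier examples do; a sketch using `405` (Method Not Allowed):

```yaml
on_http_request:
  - name: Deny non-GET requests with a 405
    expressions:
      - req.method != 'GET'
    actions:
      - type: deny
        config:
          status_code: 405
```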
Custom response for unauthorized requests
This rule sends a custom response with status code `401` and body `Unauthorized` for requests without an `Authorization` header.
```yaml
on_http_request:
  - expressions:
      - "!('authorization' in req.headers)"
    actions:
      - type: custom-response
        config:
          status_code: 401
          body: Unauthorized
```
How to block traffic from specific countries
Sometimes you may need to block requests originating from one or more countries to remain compliant with data regulations or sanctions. This rule blocks requests based on their origin country using ISO country codes, with the following steps:
- Check if the request comes from one of a list of countries you define.
- If so, return a `401` status code with an error message.
```yaml
on_http_request:
  - expressions:
      - conn.geo.country_code in ['<COUNTRY-01>', '<COUNTRY-02>']
    name: Block traffic from unwanted countries
    actions:
      - type: custom-response
        config:
          status_code: 401
          body: "Unauthorized request due to country of origin."
```
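You can also invert the check to allow traffic only from approved countries and block everything else; a sketch using the same variables and placeholder country codes:

```yaml
on_http_request:
  - name: Allow traffic only from approved countries
    expressions:
      - "!(conn.geo.country_code in ['<COUNTRY-01>', '<COUNTRY-02>'])"
    actions:
      - type: custom-response
        config:
          status_code: 401
          body: "Unauthorized request due to country of origin."
```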
Limit request sizes
This rule demonstrates how to prevent excessively large user uploads, like text or images, that might cause performance or availability issues for your upstream service, with the following steps:
- Check if the request method is `POST` or `PUT`.
- Check if the request's content length is 1MB or larger.
- If both conditions are met, return a `400` status code with an error message.
```yaml
on_http_request:
  - name: Block large POST/PUT requests.
    expressions:
      - req.method == 'POST' || req.method == 'PUT'
      - req.content_length >= 1000000
    actions:
      - type: custom-response
        config:
          status_code: 400
          body: "Error: You can't upload content larger than 1MB."
```
Exempt specific traffic from rate limits
In this example, the Algolia web crawler is exempted from any rate limiting configured on your site. See the IP Intelligence docs for other bots and crawlers that are available.
```yaml
on_http_request:
  - expressions:
      - "!('com.algolia.crawler' in conn.client_ip.categories)"
    actions:
      - type: rate-limit
        config:
          name: Only allow 30 requests per minute
          algorithm: sliding_window
          capacity: 30
          rate: 60s
          bucket_key:
            - conn.client_ip
```
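To exempt more than one crawler, you can extend the expression with additional categories; a sketch combining the Algolia crawler with the `com.openai` category used earlier on this page (check the IP Intelligence docs for the exact identifiers you need):

```yaml
on_http_request:
  - expressions:
      - "!('com.algolia.crawler' in conn.client_ip.categories) && !('com.openai' in conn.client_ip.categories)"
    actions:
      - type: rate-limit
        config:
          name: Only allow 30 requests per minute
          algorithm: sliding_window
          capacity: 30
          rate: 60s
          bucket_key:
            - conn.client_ip
```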