> ## Documentation Index
> Fetch the complete documentation index at: https://ngrok.com/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Protecting Services with Rate Limiting

> Configure rate limiting in Kubernetes to protect your services from excessive traffic and ensure fair usage across clients.

Rate limiting is a critical feature that prevents your services from being overwhelmed by excessive network traffic. By enforcing rate limits, you can control how many requests are allowed over a given period, helping to:

⚡ Maintain API performance by preventing excessive load. <br />
🔐 Protect against abuse and malicious traffic. <br />
🌍 Ensure fair usage across clients and users. <br />

## 🔍 What are the benefits of rate limiting?

Without rate limiting, a single misconfigured client or a spike in traffic could easily overload your backend.
Implementing rate limiting at the network edge helps prevent:

* API overuse and abuse (for example, DDoS attacks, aggressive polling).
* Performance degradation for other users.
* Unexpected infrastructure costs due to excessive traffic.

Key Benefits:

* **Prevents Service Overload:** Ensures API stability even under high traffic.
* **Protects Against Automated Abuse:** Stops bots and scrapers from overwhelming services.
* **Ensures Fair Usage:** Distributes API resources evenly among users.
* **Reduces Infrastructure Costs:** Avoids unnecessary scaling due to excessive requests.
* **Improves Security & Compliance:** Helps meet API quotas and prevent brute-force attacks.

## Rate limiting examples

The following examples demonstrate how to rate limit all incoming requests by the `Host` header.

Check out the [rate limit Traffic Policy action](/traffic-policy/actions/rate-limit/) page for more details about how it functions and the parameters it accepts.

<Tabs>
  <Tab title="AgentEndpoint">
    ```yaml theme={null}
    apiVersion: ngrok.k8s.ngrok.com/v1alpha1
    kind: AgentEndpoint
    metadata:
      name: example-agent-endpoint
    spec:
      url: https://example-hostname.ngrok.io
      upstream:
        url: http://my-service.my-namespace:8080
      trafficPolicy:
        inline:
          on_http_request:
            - actions:
                - type: rate-limit
                  config:
                    name: Only allow 30 requests per minute
                    algorithm: sliding_window
                    capacity: 30
                    rate: 60s
                    bucket_key:
                      - req.headers['host']
    ```
  </Tab>

  <Tab title="CloudEndpoint">
    ```yaml theme={null}
    apiVersion: ngrok.k8s.ngrok.com/v1alpha1
    kind: CloudEndpoint
    metadata:
      name: example-cloud-endpoint
    spec:
      url: https://example-hostname.ngrok.io
      trafficPolicy:
        policy:
          on_http_request:
            - actions:
                - type: rate-limit
                  config:
                    name: Only allow 30 requests per minute
                    algorithm: sliding_window
                    capacity: 30
                    rate: 60s
                    bucket_key:
                      - req.headers['host']
    ```
  </Tab>

  <Tab title="Ingress">
    💡 `Ingress` resources do not natively support rate limiting, but they can be extended using a Traffic Policy.

    ### 1. Create an `NgrokTrafficPolicy`

    ```yaml theme={null}
    apiVersion: ngrok.k8s.ngrok.com/v1alpha1
    kind: NgrokTrafficPolicy
    metadata:
      name: example-tp
      namespace: default
    spec:
      policy:
        on_http_request:
          - actions:
              - type: rate-limit
                config:
                  name: Only allow 30 requests per minute
                  algorithm: sliding_window
                  capacity: 30
                  rate: 60s
                  bucket_key:
                    - req.headers['host']
    ```

    ### 2. Use the `NgrokTrafficPolicy` on an `Ingress`

    ```yaml theme={null}
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      annotations:
        k8s.ngrok.com/traffic-policy: example-tp
      name: example-ingress
      namespace: default
    spec:
      ingressClassName: ngrok
      rules:
        - host: example-hostname.ngrok.io
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: example-service
                    port:
                      number: 80
    ```
  </Tab>

  <Tab title="Gateway API">
    💡 Gateway API resources do not natively support rate limiting, but they can be extended using a Traffic Policy.

    ### 1. Create an `NgrokTrafficPolicy`

    ```yaml theme={null}
    apiVersion: ngrok.k8s.ngrok.com/v1alpha1
    kind: NgrokTrafficPolicy
    metadata:
      name: example-tp
      namespace: default
    spec:
      policy:
        on_http_request:
          - actions:
              - type: rate-limit
                config:
                  name: Only allow 30 requests per minute
                  algorithm: sliding_window
                  capacity: 30
                  rate: 60s
                  bucket_key:
                    - req.headers['host']
    ```

    ### 2. Use the `NgrokTrafficPolicy` on a `Gateway`

    The following example showcases supplying the `NgrokTrafficPolicy` on a `Gateway` resource. All requests to the `Gateway` will run the Traffic Policy.
    If you prefer, `NgrokTrafficPolicy` can also be used on the route level by using an `externalRef` filter on an `HTTPRoute`. See the [using Gateway API guide](/k8s/guides/using-gwapi) for examples.

    ```yaml theme={null}
    apiVersion: gateway.networking.k8s.io/v1
    kind: Gateway
    metadata:
      name: example-gateway
      namespace: default
      annotations:
        k8s.ngrok.com/traffic-policy: example-tp
    spec:
      gatewayClassName: ngrok
      listeners:
        - name: example-hostname
          hostname: "example-hostname.ngrok.io"
          port: 443
          protocol: HTTPS
    ```
  </Tab>
</Tabs>