Load Balancing Between Services Deployed in Kubernetes

A core benefit of Kubernetes is that it simplifies creating multiple replica pods for your apps or APIs and automatically load-balances traffic between them. That's helpful if you have a single cluster and need to scale horizontally, but Kubernetes can't help if you want to:

  • Share traffic between multiple clusters
  • Balance traffic across multiple cloud providers
  • Deploy canary versions or run A/B tests of your apps/APIs on a single cluster

Endpoint Pooling makes load balancing between Kubernetes services simple—you only need to create two endpoints with the same URL. Here's how that works in your clusters, whether you're using our AgentEndpoint custom resource, Ingress objects, or Gateway API resources.

1. Install the ngrok Kubernetes Operator

Check out our installation instructions for details on how to use Helm to deploy the open-source Operator to each cluster you'd like to load-balance between.
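As a sketch, a typical Helm install looks like the following. The release and namespace names are placeholders, and `$NGROK_API_KEY`/`$NGROK_AUTHTOKEN` are credentials from your ngrok dashboard; see the installation instructions for the authoritative chart values.

```shell
helm repo add ngrok https://charts.ngrok.com
helm repo update

# Run once per cluster you want in the pool.
helm install ngrok-operator ngrok/ngrok-operator \
  --namespace ngrok-operator \
  --create-namespace \
  --set credentials.apiKey=$NGROK_API_KEY \
  --set credentials.authtoken=$NGROK_AUTHTOKEN
```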

2. Create your first Agent Endpoint

Pooling is always enabled on AgentEndpoint resources, but with Ingress or Gateway API, you have to enable it with an annotation.
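For the Ingress case, that looks roughly like the snippet below. The Ingress name, service name, and port are placeholders you'd swap for your own, and the annotation key should be checked against your Operator version's documentation.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    # Opt this Ingress's endpoint into pooling.
    k8s.ngrok.com/pooling-enabled: "true"
spec:
  ingressClassName: ngrok
  rules:
    - host: $NGROK_DOMAIN
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-service
                port:
                  number: 80
```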

The YAML snippets below are just illustrations—you'll also need to change the details of your services, like their names, ports, and namespaces, to match what you've already implemented. Same goes for $NGROK_DOMAIN—you can reserve a domain in the dashboard if you don't have one already.

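A minimal AgentEndpoint might look like this. The resource name, upstream service address, and namespace are placeholders, and the `apiVersion` should be confirmed against your installed CRDs.

```yaml
apiVersion: ngrok.k8s.ngrok.com/v1alpha1
kind: AgentEndpoint
metadata:
  name: example-agent-endpoint
spec:
  # The public URL for this endpoint, using your reserved domain.
  url: https://$NGROK_DOMAIN
  upstream:
    # The in-cluster service traffic is forwarded to.
    url: http://example-service.default:80
```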

3. Create a second Agent Endpoint to enable pooling

On a second cluster, apply the same or similar ingress configuration—just make sure the url, host, or hostname value is the same for AgentEndpoint, Ingress, and Gateway API implementations, respectively.
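For example, an AgentEndpoint applied on the second cluster could look like the sketch below; the only requirement for pooling is that `url` matches the first endpoint. The upstream service address is again a placeholder.

```yaml
apiVersion: ngrok.k8s.ngrok.com/v1alpha1
kind: AgentEndpoint
metadata:
  name: example-agent-endpoint
spec:
  # Must match the first cluster's endpoint URL to join the pool.
  url: https://$NGROK_DOMAIN
  upstream:
    url: http://example-service.default:80
```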

Hit your endpoint a few times to get responses from multiple clusters.

What's next?

Now that you're load-balancing in Kubernetes, you can repeat the process in other clusters or clouds to add many other agent endpoints to the pool—there's no limit on how many services can share traffic on a single URL.

You might also consider creating a single Cloud Endpoint that serves as the "front door" to all your Kubernetes services, then create a pool of internal Agent Endpoints on a pooled URL like https://example.internal.
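That pattern could be sketched as follows: a CloudEndpoint on the public domain forwards to the internal pooled URL, and each cluster runs an AgentEndpoint bound to that internal URL. The field names here (especially the inline traffic policy shape) are assumptions to illustrate the idea; check the Operator's CRD reference before using them.

```yaml
# Hypothetical "front door" that forwards all requests to the internal pool.
apiVersion: ngrok.k8s.ngrok.com/v1alpha1
kind: CloudEndpoint
metadata:
  name: front-door
spec:
  url: https://$NGROK_DOMAIN
  trafficPolicy:
    policy:
      on_http_request:
        - actions:
            - type: forward-internal
              config:
                url: https://example.internal
---
# One member of the internal pool; repeat per cluster with the same url.
apiVersion: ngrok.k8s.ngrok.com/v1alpha1
kind: AgentEndpoint
metadata:
  name: internal-endpoint
spec:
  url: https://example.internal
  upstream:
    url: http://example-service.default:80
```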