Skip to main content

Load Balancing Between Multiple Clouds

When you load-balance within a single cloud, you can often get away with using that cloud provider's built-in solutions, like AWS Elastic Load Balancing for just one example. But once you start deploying replicas of your services to multiple clouds, you need a load balancer that's multicloud-ready.

Endpoint Pooling makes multicloud load balancing as simple as creating two endpoints with the same URL. ngrok then automatically distributes traffic to replicas in multiple clouds, which means you don't need other load balancers in your stack. Plus, endpoint pooling:

  • Works with the agent CLI, SDKs and our Kubernetes Operator. You can even use all these tools in a single pool.
  • Lets you migrate from one deployment strategy to another without hard cut-overs, like moving from using the agent for ingress to embedding ngrok directly in your app/API with an SDK.
  • Works with Traffic Policy, enabling you to manage traffic identically across all your replicas from a single "front door", or enforce different policies for each replica.

This guide will show you how to set up endpoint pooling with a few different methods.

Using the CLI

Start with a single agent endpoint on an ngrok URL (or a custom domain!), and the --pooling-enabled flag on.

Loading…

Repeat the same steps on another cloud where you've deployed a replica of your service to start a second internal agent endpoint on the same URL.

Loading…

You're now pooling! Send a few requests with curl https://$YOUR_DOMAIN.ngrok.app to see responses balanced between those two endpoints, even though they're running in multiple clouds.

You can verify pooling works in a few ways:

  1. Check each agent's UI in the terminal that launched the process. You should see your curl requests in the number of recent requests.
  2. In Traffic Inspector—click on a request, then the Response tab, and look at Ngrok-Agent-Ips to see the IP for the responding agent changes from request to request with a similar frequency.

Using SDKs and other tools

Pooling isn't limited to how you start a given service or its replicas. If you've already started up one agent endpoint with the CLI, you can start up another with one of our SDKs or the Kubernetes Operator to load-balance between them.

Here's an example of a Go app with the ngrok Go SDK. Just as with the agent CLI examples in the previous step, this Go app starts up an agent endpoint at $YOUR_DOMAIN.ngrok.app with pooling enabled.

To try this example, save the following file as main.go.

Loading…

Then, initialize the dependencies.

Loading…

Finally, start your app to fire up the ngrok Go SDK and add a new agent endpoint to your pool.

Loading…

If you already have a pool of endpoints created by agent CLIs, make a few more requests to see responses from your Go app despite these services operating in entirely different clouds and deployment environments.

Using Endpoint POoling with Cloud Endpoints for custom traffic management

Pooling also works with internal Agent Endpoints, which can only receive traffic from the forward-internal Traffic Policy action attached to another endpoint created by your ngrok account.

When you pair a pool of internal agent endpoints with a Cloud Endpoint, you have a single "front door" for your multicloud load-balanced services. That also lets you add traffic management fundamentals, like authentication, URL rewrites/redirects, or header manipulation, across all your replicas from one place.

If you want to create your cloud endpoint on the same URL ($YOUR_DOMAIN.ngrok.app), then you'll need to first stop any agents you've already created on that URL in the first two steps and recreate them on an internal URL, which ends with .internal, like https://pooling.internal.

Then jump over to the Endpoint section of your ngrok dashboard. Click New, then Cloud Endpoint. Leave the binding as Public, enter $YOUR_DOMAIN.ngrok.app for the domain, and click Create Cloud Endpoint.

You then need to apply a Traffic Policy and the forward-internal action—the policy below does one important thing: Route all requests to your pool of internal agent endpoints.

Loading…

Paste the above YAML into the dashboard and click Save.

Now when you request https://$YOUR_DOMAIN.ngrok.app, your traffic is first routed through the cloud endpoint before being shared among your pooled internal agent endpoints.

Coming soon!

Custom load balancing strategies are not yet generally available, but you can request early access to the developer preview in your ngrok dashboard.

With custom load balancing strategies, you'll be able to decide exactly what happens to load-balanced traffic in certain scenarios, like:

  • Balance randomly among endpoints in a single cloud provider, then fall back to a secondary cloud if they all become unavailable.
  • Prioritize endpoints with the most memory regardless of which cloud they were deployed from.
  • Route traffic to specific cloud providers depending on where the request originated from.

What's next?

If you opted for a Cloud Endpoint that routes traffic to a pool of internal Agent Endpoints, you can now filter, manage, and orchestrate traffic from that single endpoint using Traffic Policy. Here's a few of our most popular actions:

We didn't cover Kubernetes deployments in this guide, but we have a similar quickstart guide for load-balancing K8s services. If you're interested in a more production-ready deployment, take a look at our multicloud API gateway setup tutorial.