Load Balancing Between Multiple Clouds
ngrok's Endpoint Pooling allows you to automatically load balance between replicas of your service across multiple clouds by creating two endpoints with the same URL.
This guide shows you how to set up a pool using a few different methods.
This guide will reference reserved domains frequently. You can reserve a free domain in the dashboard or use a custom domain you own.
Why use Endpoint Pooling?
ngrok's Endpoint Pooling:
- Works with the agent CLI, SDKs, and our Kubernetes Operator. You can even use all these tools in a single pool.
- Lets you migrate from one deployment strategy to another without hard cut-overs, like moving from using the agent to embedding ngrok directly in your app/API with an SDK.
- Works with Traffic Policy, enabling you to manage traffic identically across all your replicas from a single "front door," or enforce different policies for each replica.
Using the CLI
Start with a single agent endpoint on an ngrok URL or a reserved domain, with the --pooling-enabled flag on.
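For example, a minimal sketch that assumes your service listens on port 8080 and that you've reserved your-reserved-domain.ngrok.app (adjust both to match your setup):

```bash
# Start a public agent endpoint that allows other endpoints to share its URL
ngrok http 8080 \
  --url https://your-reserved-domain.ngrok.app \
  --pooling-enabled
```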
Repeat the same steps on another cloud where you've deployed a replica of your service. This starts a second agent endpoint on the same URL.
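The command on the second cloud is identical; using the same URL with --pooling-enabled is what places both endpoints in one pool:

```bash
# Run on the second cloud: same URL, so this endpoint joins the existing pool
ngrok http 8080 \
  --url https://your-reserved-domain.ngrok.app \
  --pooling-enabled
```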
Test your pool
Send a few requests to your domain to see responses balanced between those two endpoints, even though they're running in multiple clouds.
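A quick loop with curl works well for this; the sketch below assumes the same placeholder domain as above:

```bash
# Send ten requests; with pooling enabled, they are distributed across both agents
for i in $(seq 1 10); do
  curl -s -o /dev/null -w "request $i: HTTP %{http_code}\n" https://your-reserved-domain.ngrok.app
done
```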
You can verify pooling works in a few ways:
- Check each agent's UI in the terminal that launched the process. You should see an increase in the number of recent requests.
- In Traffic Inspector, select a request, then select the Response tab. Under Ngrok-Agent-Ips, you should see the IP of the responding agent change from request to request at roughly equal frequency.
Using SDKs and other tools
Pooling isn't limited to how you start a given service or its replicas. If you've already started up one agent endpoint with the CLI, you can start up another with one of our SDKs or the Kubernetes Operator to load-balance between them.
The following is a simple example of enabling pooling with one of our SDKs.
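Here's a minimal sketch using the Python SDK (ngrok-python). It assumes NGROK_AUTHTOKEN is set in your environment and that your service listens on port 8080; the pooling_enabled option is assumed to mirror the CLI's --pooling-enabled flag, so verify the exact option name against the SDK reference for your language:

```python
import time

import ngrok

# Forward traffic from the shared URL to a local service on port 8080.
# pooling_enabled lets this endpoint share its URL with the existing pool
# (option name assumed to mirror the CLI flag; check your SDK's reference).
listener = ngrok.forward(
    8080,
    authtoken_from_env=True,
    domain="your-reserved-domain.ngrok.app",
    pooling_enabled=True,
)
print(f"Endpoint online and pooled at {listener.url()}")

# Keep the process alive so the endpoint stays in the pool.
while True:
    time.sleep(60)
```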
Run your app. If you already have a pool of endpoints created with the agent CLI, make a few requests; some responses should now come from your SDK-based app.
Using Cloud Endpoints
You can use Endpoint Pooling with Cloud Endpoints for custom traffic management. Pooling works with internal Agent Endpoints, which can only receive traffic from other endpoints associated with your ngrok account.
A Cloud Endpoint is a persistent public endpoint where you can manage a Traffic Policy for any upstream endpoints it forwards traffic to. Forwarding traffic from a cloud endpoint to a pool of internal endpoints gives you a single "front door" for your multicloud load-balanced services.
1. Set up your internal Agent Endpoints
First, set up your internal Agent Endpoints. To pool traffic with a cloud endpoint, you must:
- Stop any agents you've already created associated with the URL you want to use.
- Restart those agents as internal endpoints by giving them any URL with a TLD of .internal.
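For example, restarting each replica's agent with a shared .internal URL (your-service.internal is a placeholder you choose):

```bash
# Run on every cloud: a .internal URL creates an internal endpoint, and
# --pooling-enabled lets the replicas share that same internal URL
ngrok http 8080 \
  --url https://your-service.internal \
  --pooling-enabled
```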
2. Set up your Cloud Endpoint
Next, set up your cloud endpoint:
- Navigate to the Endpoint section of your ngrok dashboard and select New, then Cloud Endpoint.
- Leave the binding as Public and enter your-reserved-domain for the domain.
- Select Create Cloud Endpoint.
3. Add a Traffic Policy
While viewing your Cloud Endpoint in the dashboard, add the following traffic policy. It uses the forward-internal action to route all requests to your pool of internal agent endpoints.
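A minimal policy, assuming your pooled agents all use the URL https://your-service.internal from the earlier step:

```yaml
on_http_request:
  - actions:
      - type: forward-internal
        config:
          url: https://your-service.internal
```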
Be sure to select the Save button to apply the policy.
Now when you make a request to your URL, your traffic is first routed through the cloud endpoint before being shared among your pooled internal agent endpoints.
Custom load balancing strategies are not yet generally available, but you can request early access to the developer preview in your ngrok dashboard.
With custom load balancing strategies, you'll be able to decide exactly what happens to load-balanced traffic. For example, you could:
- Balance randomly among endpoints in a single cloud provider, then fall back to a secondary cloud if they all become unavailable.
- Prioritize endpoints with the most available memory, regardless of which cloud they were deployed in.
- Route traffic to specific cloud providers depending on where the request originated.
What's next?
If you opted for a Cloud Endpoint that routes traffic to a pool of internal Agent Endpoints, you can now filter, manage, and orchestrate traffic from that single endpoint using Traffic Policy actions.
We didn't cover Kubernetes deployments in this guide, but we have a similar quickstart guide for load-balancing K8s services. If you're interested in a more production-ready deployment, take a look at our multicloud API gateway setup tutorial.