Architectural reference

What you’ll need
- An ngrok account. If you don’t have one, sign up.
- An MCP server process running on a local server/VM/container
- The ngrok agent installed directly on the machine (VM, server, or container) running your local MCP server. See the downloads page for instructions on how to install the ngrok agent.
1. Install the ngrok agent and configure internal Agent Endpoints in ngrok.yml
You’re going to configure the agent to declare an internal Agent Endpoint that points to the port running your MCP server process. This connects the server to your ngrok account, but nothing will be able to connect to it until you complete the subsequent steps. Internal Endpoints are private endpoints that only receive traffic when it is forwarded to them through the forward-internal Traffic Policy action. This allows you to route traffic to an application through ngrok without making it publicly addressable. Internal Endpoint URL hostnames must end with .internal.
After installing the ngrok agent, define an internal endpoint inside the ngrok configuration file for the MCP server you want to make accessible from your AI tools. You can place the configuration file at /path/to/ngrok/ngrok.yml and the executable at /path/to/ngrok/ngrok.
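As a minimal sketch, an ngrok.yml that declares one internal Agent Endpoint might look like the following. The endpoint name mcp-server, the hostname mcp.internal, and port 3000 are illustrative assumptions; substitute your own authtoken and the port your MCP server actually listens on.

```yaml
# ngrok.yml (sketch; name, hostname, and port are assumptions for illustration)
version: 3
agent:
  authtoken: "<YOUR_NGROK_AUTHTOKEN>"   # placeholder, copy yours from the dashboard
endpoints:
  - name: mcp-server                    # hypothetical endpoint name
    url: https://mcp.internal           # internal endpoint hostnames must end with .internal
    upstream:
      url: 3000                         # assumed local port of the MCP server process
```

With a file like this in place, running ngrok start --all --config /path/to/ngrok/ngrok.yml should bring the endpoint online. It stays unreachable from the public internet until a Traffic Policy forwards traffic to it.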
2. Create a public Cloud Endpoint
Cloud Endpoints are persistent, always-on endpoints whose creation, deletion, and configuration are managed centrally via the Dashboard or API. They exist until they are explicitly deleted. Cloud Endpoints do not forward their traffic to an agent by default; instead, they use only their attached Traffic Policy to handle connections. Create a Cloud Endpoint for the MCP server you need to route traffic to: go to the endpoints section of your ngrok dashboard and click New.
3. Attach Traffic Policy to your Cloud Endpoint
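Navigate to the https://mcp.example.com Cloud Endpoint and replace the default Traffic Policy with one that uses the forward-internal action to route incoming requests to your internal Agent Endpoint.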
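A minimal forwarding policy could be sketched as follows, assuming the internal Agent Endpoint was declared with the hypothetical URL https://mcp.internal. Use whatever .internal URL you defined in ngrok.yml.

```yaml
# Traffic Policy for the Cloud Endpoint (sketch)
on_http_request:
  - actions:
      - type: forward-internal
        config:
          url: https://mcp.internal   # assumed URL of the internal Agent Endpoint
```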
Now, use ngrok’s Traffic Policy to handle routing rules (the forward-internal action above), rate limiting, and authentication via IP Intel. The Traffic Policy documentation lists all available actions. Actions can be used individually or layered on top of each other, in which case they execute sequentially from top to bottom. For example, a Traffic Policy for your Cloud Endpoint might restrict incoming traffic to Claude’s MCP host, require an API key, and rate limit based on that API key.
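One way such a layered policy could be written is sketched below. Treat the specifics as assumptions rather than the guide's exact policy: the IP Intel category name com.anthropic, the placeholder bearer token, the rate limit numbers, the bucket key expression, and the https://mcp.internal URL are all illustrative. Check each against the Traffic Policy reference before relying on it.

```yaml
# Layered Traffic Policy (sketch; rules are evaluated from top to bottom)
on_http_request:
  # 1. Block requests whose source IP is not attributed to Anthropic.
  - expressions:
      - "!('com.anthropic' in conn.client_ip.categories)"   # category name is an assumption
    actions:
      - type: deny
        config:
          status_code: 403

  # 2. Require the expected bearer token (placeholder value).
  - expressions:
      - "!('Bearer <YOUR_API_KEY>' in getReqHeader('Authorization'))"
    actions:
      - type: custom-response
        config:
          status_code: 401
          body: Authorization required

  # 3. Rate limit per API key, then forward to the internal endpoint.
  - actions:
      - type: rate-limit
        config:
          name: per-api-key                    # hypothetical limiter name
          algorithm: sliding_window
          capacity: 30                         # assumed: 30 requests...
          rate: 60s                            # ...per 60 seconds
          bucket_key:
            - "req.headers['authorization']"   # assumed key expression
      - type: forward-internal
        config:
          url: https://mcp.internal            # assumed internal endpoint URL
```

The ordering is deliberate: the deny and custom-response actions end request handling, so only requests that pass both checks reach the rate limiter and the forward-internal action.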
4. Test your MCP gateway
Within the Claude configuration file where you’ve defined your MCP server, replace the url field with your ngrok Cloud Endpoint URL and add an authorization header with your bearer token. Also, ensure your ngrok agent and the internal Agent Endpoint are active. Now, any prompt you send from Claude’s MCP host will be routed through your ngrok MCP gateway endpoint. There are a few tests you can run to make sure it’s functioning as you need it to:
- Remove the auth token from the Claude desktop config. The request should now be blocked and return a 401 status code and a message saying "Authorization required."
- Curl the URL or POST to it directly. The request should be blocked and return a 403 status code since it didn’t originate from an Anthropic source IP.
- Send a prompt through Claude with the auth token present in the Claude config. The request should succeed and return the expected response.