AI meets privacy: ngrok's secure bridge to customer data
Previously on AI-meets-ngrok: You built a proof of concept deployment that uses ngrok as a secure tunnel to external GPU compute, which you're using to experiment with AI development with a self-hosted, open source LLM.
Your story doesn't end there. After validating what you could do with AI, external GPU, and some example data, your CTO has asked you to take this whole AI experiment one (big) step further.
Now, you're going to deploy a containerized service to your customer's virtual private cloud (VPC), which runs AI workloads against their customer data and makes results available via an API. You'll use ngrok to securely tunnel those API endpoints to your SaaS platform, which your customer accesses to view results and insights.
Your first reaction is right: The fundamentals of what you learned with your external GPU experiment will be helpful here, but this is a major jump in complexity. Luckily, with that knowledge and ngrok's secure tunneling still at your side, you're already closer to a successful deployment than you might think.
Why run API workloads on external customer data
The first question you might ask is why your customer can't just upload their data to your service and run AI workloads there. Wouldn't that make the topology of this infrastructure a lot simpler? The most obvious answer is that your customer will refuse, citing their obligation to protect customer data and the integrity of their cloud infrastructure.
Egressing data outside their infrastructure is almost always a no-go; they'll have obligations to customers and regulators. Beyond issues of privacy and governance, you're also dealing with the sheer volume of data your customers want to analyze with your API service. The transfer costs associated with moving that data between clouds would quickly outweigh the value your service could provide.
The opposite solution, where the customer runs and manages your entire service on-premises, is often a hard sell as well. They'll balk at broad changes to their existing virtual private cloud (VPC) infrastructure, and often don't want the additional IT/DevOps burden.
These challenges are driving the rapid rise of Bring Your Own Cloud (BYOC) architecture. With a BYOC architecture, you deploy a service into a customer's data plane to process and analyze their data using APIs. Your SaaS platform is the control plane, operating as the central hub that tells your data plane services what to analyze and when, and giving your customers a centralized place to view results.
Databricks is a great example of an AI platform that operates with a BYOC architecture. Once a Databricks customer sets up their workspace (a secure integration between the Databricks platform and their cloud provider), Databricks deploys new clusters and resources in the customer's cloud to process and store data.
In your case, you get some knock-on benefits, too:
- Most of the compute, particularly for AI workloads, happens on your customers' infrastructure, lessening your burden to build and maintain it yourself.
- Because your customer's data never leaves their network, your path to a successful deployment requires much less technical complexity and scrutiny around data security.
- You maintain strong control of your service running in customer networks.
Architect your integration with your customers' network
Normally, accessing data and running workloads in external networks using a BYOC architecture comes with numerous complexities:
- Bureaucratic approvals from NetOps, DevOps, and SecOps, which often drag on for months.
- Configuration hiccups around network/application firewalls, load balancers, VPC routers, and more.
- Security and compliance issues around encryption, secure data transfer, and strict access control.
- Complexity in the deployment process, which leads to delays and crushes the time-to-value of your AI service.
- Falling behind on changes in the AI landscape (like the release of a new fine-tuned LLM that would serve your customers better) because of the complexity around the integration.
With ngrok operating as secure tunneling middleware between your control plane and your customer's data plane, you'll bypass all that headache.
In reality, you'll find that every customer's infrastructure is different, which is why talented solutions architects are so valuable. In this example, however, your customer is using a public cloud provider and already runs Kubernetes clusters.
The process of setting up this BYOC architecture is roughly:
- Deploy a new Kubernetes cluster in the customer's virtual private cloud (VPC).
- On said cluster, install the ngrok Kubernetes Ingress Controller with Helm.
- Create the services/resources to run your AI service.
- Configure access to the customerâs data volumes from the pods/services in the AI deployment.
- Set up routes for the services your control plane needs to access.
- Double-check that ngrok has automatically set up HTTPS edges for those routes, which publish an API.
- Layer in security as needed.
- Start consuming the results of your AI service via an API!
In the end, your customers will access your SaaS platform at <code>console.YOUR-AMAZING-AI-PLATFORM.com</code>.
Speaking of which, we haven't yet talked about what service you're running on the customer's data plane. Because you already have some experience with Ollama, you'll continue along that path.
First, let's clarify: this is an example illustrating what's possible, not a real-world solution. You're never going to deploy Ollama to a customer's cloud as a production AI workload, as it's not designed to access, process, and analyze data in the ways a real AI service would require. The example might be far-fetched, but the process covered below, particularly around setting up a secure ngrok tunnel for your customer, is one you can adapt to your actual deployment process.
Create your containerized, Kubernetes-ready AI service
We won't go into all the nuances of containerizing an application or creating a Kubernetes manifest for it; Docker has published documentation covering the fundamentals of creating a Dockerfile and using <code>docker build</code>. The Kubernetes docs also contain extensive resources on managing resources and deploying containers using <code>kubectl</code>.
Fortunately, the folks behind Ollama have already containerized the service and created a Kubernetes manifest.
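As a rough sketch of what that manifest contains (the image tag and resource names here are illustrative; check Ollama's repository for the current manifest), the core is a Deployment running the Ollama container plus a Service exposing it on port 80:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
        - name: ollama
          image: ollama/ollama:latest
          ports:
            # Ollama's API listens on 11434 inside the container
            - containerPort: 11434
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
spec:
  selector:
    app: ollama
  ports:
    # Expose the API on port 80 inside the cluster network
    - port: 80
      targetPort: 11434
```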
The Ollama Kubernetes Service tells you how to set up your ingress: ngrok should direct incoming traffic at the appropriate domain name, which you'll set up alongside your customer, through a Kubernetes Ingress and to the <code>ollama</code> service running on port <code>80</code> in your cluster's internal network.
If you were building a real AI service, you would also need additional deployments, using Kubernetes secrets, for accessing databases to leverage the customer data that required this BYOC infrastructure in the first place.
Deploy your AI service on your customer's data plane
Now, we'll focus on installing the services you'll need to process data using AI and handle ingress. Based on the relationship with your customer, you may run these commands yourself or provide a subset of them to your customer in a document to help with onboarding.
Install your AI service
Using Ollama as an example, you can copy-paste the Kubernetes manifest above or grab it from GitHub, then apply the deployment to your customer's cluster.
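Assuming the manifest is saved as <code>ollama.yaml</code> (the filename is illustrative) and your kubeconfig points at the customer's cluster, applying it is a single command:

```shell
# Apply the Ollama Deployment and Service to the customer's cluster
kubectl apply -f ollama.yaml

# Confirm the pods and the service came up
kubectl get pods,svc -l app=ollama
```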
Remove ngrok branding from the Agent Ingress
To create a secure tunnel to the ngrok Cloud Edge, every ngrok Agent first authenticates using your account credentials at the default Ingress Address, which is <code>connect.ngrok-agent.com:443</code>. This Ingress Address is then used to tunnel traffic to and from your application service.
For the clearest and easiest-to-manage configuration, you can configure all ngrok Agents to connect to a specific domain, <code>tunnel.CUSTOMER_DOMAIN.com</code>, instead of the default hostname, <code>connect.ngrok-agent.com</code>. You can also extend the brand-free Agent Ingress experience with more features, like dedicated IPs for your account, for more reliability and security assurances for your customers.
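Creating a branded Agent Ingress is a single call to the ngrok API's <code>agent_ingresses</code> endpoint; a sketch, assuming the domain and description below are placeholders you'd replace with your customer's values:

```shell
# Create a custom Agent Ingress on the customer's domain
curl -X POST https://api.ngrok.com/agent_ingresses \
  -H "Authorization: Bearer $NGROK_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Ngrok-Version: 2" \
  -d '{"description": "Customer 0001 agent ingress", "domain": "tunnel.customer-domain.com"}'

# Export the new ingress address for later use when installing the Ingress Controller
export TUNNEL=tunnel.customer-domain.com:443
```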
Not every ngrok account can create Agent Ingresses; if you see <code>ERR_NGROK_6707</code>, you can reach out to ngrok support to learn more about activating the feature.
Add a custom wildcard domain
You'll start by creating an ngrok API key for this customer, which gives them privileges to configure their tunnels, domains, and Edges. It's helpful to export the API key for easy access later.
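For example, you can mint a new key with the ngrok API's <code>api_keys</code> endpoint (you can also create one in the ngrok dashboard; the description is illustrative, and <code>$NGROK_ADMIN_API_KEY</code> is assumed to hold an existing key with permission to create others):

```shell
# Create a new API key scoped to this customer's automation
curl -X POST https://api.ngrok.com/api_keys \
  -H "Authorization: Bearer $NGROK_ADMIN_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Ngrok-Version: 2" \
  -d '{"description": "Customer 0001 API key"}'

# Export the "key" field from the response for the commands that follow
export NGROK_API_KEY=<key-from-response>
```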
Next, ask the ngrok API to create a wildcard domain at <code>*.CUSTOMER0001-DOMAIN.COM</code> using automated TLS certificates.
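Reserving the wildcard domain is one POST to the <code>reserved_domains</code> endpoint (the domain itself is a placeholder; ngrok provisions TLS automatically when no certificate is specified):

```shell
# Reserve the customer's wildcard domain with automated TLS
curl -X POST https://api.ngrok.com/reserved_domains \
  -H "Authorization: Bearer $NGROK_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Ngrok-Version: 2" \
  -d '{"domain": "*.customer0001-domain.com"}'
```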
You would use a custom TLS certificate (<code>certificate_id</code>) in real-world production environments. For the best experience for your customer, you'll want to manage these domains by setting up the appropriate DNS records, which the ngrok dashboard will help you with.
Configure the customer's ngrok Authtoken
Your customer will use their Authtoken credential to start new tunnel sessions, but by specifying a bind ACL, you restrict them from creating new Edges or Tunnels on anything but the wildcard domain. This step ensures that your service, running on your customer's data plane, never creates tunnels on domains you haven't protected against eavesdropping or intrusion.
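A sketch of that call against the ngrok API's <code>credentials</code> endpoint, with an ACL rule that only allows binds on the wildcard domain reserved earlier:

```shell
# Create a customer-scoped Authtoken restricted to the wildcard domain
curl -X POST https://api.ngrok.com/credentials \
  -H "Authorization: Bearer $NGROK_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Ngrok-Version: 2" \
  -d '{"description": "Customer 0001 authtoken", "acl": ["bind:*.customer0001-domain.com"]}'
```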
A successful response includes the new Authtoken in the "token" field; it's also helpful to export this customer-specific Authtoken for future use.
Install and configure the ngrok Kubernetes Ingress Controller
The ngrok Kubernetes Ingress Controller automatically creates secure Edges on the ngrok Network for your services, cutting away all the networking and security complexity around typical BYOC integrations. As a bonus, you'll be able to quickly add observability, authentication, and other security measures, like IP restrictions, to prevent unauthorized access to your ngrok Edges.
First, add the Helm repository for the ngrok Kubernetes Ingress Controller.
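Assuming Helm 3 is installed locally (the repository URL below comes from the controller's project page; verify it against the current docs):

```shell
# Add and refresh the ngrok Helm repository
helm repo add ngrok https://ngrok.github.io/kubernetes-ingress-controller
helm repo update
```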
Next, install the ngrok Kubernetes Ingress Controller using the new customer-specific Authtoken.
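A sketch of the installation, passing the customer-specific credentials as chart values (the release name and namespace are choices, not requirements; <code>$NGROK_AUTHTOKEN</code> is assumed to hold the customer Authtoken created earlier):

```shell
# Install the controller into its own namespace using the customer's credentials
helm install ngrok-ingress-controller ngrok/kubernetes-ingress-controller \
  --namespace ngrok-ingress-controller \
  --create-namespace \
  --set credentials.apiKey=$NGROK_API_KEY \
  --set credentials.authtoken=$NGROK_AUTHTOKEN
```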
If you created a new ngrok Agent Ingress for your customer for an unbranded experience, you'll also need to specify <code>--set serverAddr=$TUNNEL</code> during installation of the Ingress Controller.
Next, create a Kubernetes Ingress resource for your AI service. This is where your Ollama example comes in handy again: set <code>host</code> to a subdomain of the wildcard domain set up earlier, then point the <code>name</code> and <code>port.number</code> fields at the Ollama service for ngrok to securely tunnel through the Cloud Edge.
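A minimal Ingress for the Ollama service might look like the following, where the host is a placeholder subdomain on the customer's wildcard domain:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ollama-ingress
spec:
  # Hand this Ingress to the ngrok controller
  ingressClassName: ngrok
  rules:
    - host: ai.customer0001-domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                # The Ollama Service and port from the manifest applied earlier
                name: ollama
                port:
                  number: 80
```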
This configuration produces a single HTTPS Edge at <code>ai.CUSTOMER0001-DOMAIN.com</code> with a single route, pointing <code>/</code> toward <code>ollama:80</code>, which is already running on their Kubernetes cluster. Your SaaS platform, running in your cloud, can now call the Ollama API made available at <code>ai.CUSTOMER0001-DOMAIN.com</code> to run AI workloads and pull results back to your cloud SaaS.
Access your AI service via API
With Ollama running on the customer's data plane, and the secure tunnel connecting it to your cloud SaaS via the ngrok Cloud Edge running smoothly, you can start controlling your AI service via an API.
Start by pulling a model.
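Using Ollama's REST API through the tunnel (the model name is just an example):

```shell
# Pull a model onto the customer-side Ollama deployment
curl https://ai.customer0001-domain.com/api/pull \
  -d '{"name": "llama2"}'
```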
Test-run a basic response.
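For example, a quick generation request against the model you just pulled:

```shell
# Ask the model a question and stream back the response
curl https://ai.customer0001-domain.com/api/generate \
  -d '{"model": "llama2", "prompt": "Why is the sky blue?"}'
```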
You're off to the races! Ollama comes with an extensive API for manipulating the AI service running on your customer's data plane, and your real-world AI service would operate in much the same way.
The technical prowess of ngrok's Cloud Edge
With ngrok taking care of tunneling and ingress to your AI service running on your customer's BYOC infrastructure, you can now add additional security and routing features without another round of approvals from your customer's NetOps/DevSecOps teams, or even more complexity for your solutions architects.
IP restrictions
One of the easiest ways to demonstrate how easily you can extend your usage of the ngrok Agent for better security practices is IP restrictions. Your customers will appreciate knowing that only your SaaS platform, which uses dedicated IPs, can access the routes exposed by the ngrok Kubernetes Ingress Controller. Create an <code>ip-policy.yaml</code> file with the following and create the resource with <code>kubectl apply -f ip-policy.yaml</code>.
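One way to express this with the controller's CRDs is an <code>NgrokModuleSet</code> carrying an IP restriction; the CIDR below is a placeholder for your platform's dedicated egress IP, and field names can vary between controller versions, so check the CRD reference for yours:

```yaml
apiVersion: ingress.k8s.ngrok.com/v1alpha1
kind: NgrokModuleSet
metadata:
  name: ip-policy
modules:
  ipRestriction:
    allowCidrs:
      # Your SaaS platform's dedicated egress IP (placeholder)
      - "203.0.113.10/32"
```

Attach the module set to your Ingress with a <code>k8s.ngrok.com/modules: ip-policy</code> annotation so the restriction applies to its routes.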
You can also extend your security practices to authentication via an ngrok- or user-managed OAuth provider or OpenID Connect provider.
Multiple routes (fanout)
Based on how your AI service runs, and the routes/ports it exposes, you can extend the ngrok Kubernetes Ingress Controller to support multiple routes with a fanout. With the following Ingress configuration, you'll have a single HTTPS Edge with two routes, the second of which points <code>/bar</code> to a different Kubernetes Service.
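A sketch of such a fanout, where <code>bar-service</code> is a hypothetical second Service in the same cluster:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ollama-fanout
spec:
  ingressClassName: ngrok
  rules:
    - host: ai.customer0001-domain.com
      http:
        paths:
          # Route 1: the Ollama service at the root path
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ollama
                port:
                  number: 80
          # Route 2: a second, hypothetical service under /bar
          - path: /bar
            pathType: Prefix
            backend:
              service:
                name: bar-service
                port:
                  number: 80
```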
What's next?
With this BYOC deployment workflow, you can bring your AI service straight to the customer data you need without the complexity typically involved in accessing external networks. Not only do you lessen the technical burden on your organization, but you deliver value to your customers far faster.
With the AI and ML space moving faster than ever before, that speed can make all the difference. Sign up today to start building a service that delivers value where your customers need it most: their data.
Are you building an AI service of your own? Curious about how BYOC is changing and growing in the cloud native era? We'd love to hear how you're simplifying your networking stack with ngrok by pinging us on X (aka Twitter) @ngrokhq, LinkedIn, or by joining our community on Slack.