Prerequisite: You need an AI Gateway endpoint before continuing. Create one using the dashboard quickstart or follow the manual setup guide.
LangChain is a framework for building applications with LLMs. The ngrok AI Gateway works with LangChain’s OpenAI-compatible integrations, giving you automatic failover and key rotation.

Installation

pip install langchain langchain-openai
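
The RAG and agent examples further down also use optional packages: FAISS for the vector store and the LangChain hub for the agent's ReAct prompt. If you plan to run those examples, install them as well:

pip install langchain-community faiss-cpu langchainhub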

Basic usage

Configure LangChain to use your AI Gateway:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="your-api-key",
    model="gpt-4o"
)

response = llm.invoke("What is the capital of France?")
print(response.content)
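
ChatOpenAI is a standard LangChain Runnable, so batching works out of the box. A minimal sketch, reusing the llm configured above:

# Send several prompts in one call; one AIMessage comes back per prompt
responses = llm.batch([
    "What is the capital of France?",
    "What is the capital of Japan?",
])
for r in responses:
    print(r.content)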

Streaming

Stream responses for real-time output:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="your-api-key",
    model="gpt-4o"
)

for chunk in llm.stream("Write a poem about AI"):
    print(chunk.content, end="", flush=True)
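
In async applications, the astream counterpart yields the same chunks without blocking the event loop. A minimal sketch, reusing the llm configured above:

import asyncio

async def main():
    # astream is the async counterpart of stream
    async for chunk in llm.astream("Write a poem about AI"):
        print(chunk.content, end="", flush=True)

asyncio.run(main())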

Using different providers

Route to different providers with model prefixes:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="unused",  # Gateway handles auth
    model="anthropic:claude-3-5-sonnet-latest"  # Use Anthropic through gateway
)

response = llm.invoke("Hello!")
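
The gateway already fails over between upstream keys, but you can also layer client-side fallbacks across providers with LangChain's with_fallbacks. A sketch, assuming the same model-prefix convention shown above:

from langchain_openai import ChatOpenAI

openai_llm = ChatOpenAI(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="unused",  # Gateway handles auth
    model="gpt-4o"
)

anthropic_llm = ChatOpenAI(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="unused",
    model="anthropic:claude-3-5-sonnet-latest"
)

# If the OpenAI call raises, LangChain retries the same input against Anthropic
llm_with_fallback = openai_llm.with_fallbacks([anthropic_llm])
response = llm_with_fallback.invoke("Hello!")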

Chains

Build chains that route through the gateway:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="your-api-key",
    model="gpt-4o"
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that translates {input_language} to {output_language}."),
    ("human", "{input}")
])

chain = prompt | llm

response = chain.invoke({
    "input_language": "English",
    "output_language": "French",
    "input": "Hello, how are you?"
})

print(response.content)
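
Chains compose with output parsers too; appending StrOutputParser returns a plain string instead of an AIMessage, which is handy when piping into further steps:

from langchain_core.output_parsers import StrOutputParser

# prompt | llm | parser yields a str rather than an AIMessage
chain = prompt | llm | StrOutputParser()

text = chain.invoke({
    "input_language": "English",
    "output_language": "French",
    "input": "Hello, how are you?"
})
print(text)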

Embeddings

Generate embeddings through the gateway:
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="your-api-key",
    model="text-embedding-3-small"
)

vector = embeddings.embed_query("Hello world")
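
To embed several texts at once, embed_documents returns one vector per input:

# One embedding vector per input text
vectors = embeddings.embed_documents([
    "First document",
    "Second document",
])
print(len(vectors), len(vectors[0]))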

RAG applications

Build RAG applications with automatic failover:
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="your-api-key",
    model="gpt-4o"
)

embeddings = OpenAIEmbeddings(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="your-api-key",
    model="text-embedding-3-small"
)

# Create vector store
texts = ["Document 1 content", "Document 2 content"]
vectorstore = FAISS.from_texts(texts, embeddings)

# Query with RAG
retriever = vectorstore.as_retriever()
docs = retriever.invoke("query")

prompt = ChatPromptTemplate.from_template(
    "Answer based on context: {context}\n\nQuestion: {question}"
)

chain = prompt | llm
response = chain.invoke({"context": docs, "question": "Your question"})

Agents

Build agents that use the gateway:
from langchain_openai import ChatOpenAI
from langchain.agents import create_react_agent, AgentExecutor
from langchain_core.tools import tool

llm = ChatOpenAI(
    base_url="https://your-ai-subdomain.ngrok.app/v1",
    api_key="your-api-key",
    model="gpt-4o"
)

@tool
def search(query: str) -> str:
    """Search for information."""
    return f"Results for: {query}"

# Pull the standard ReAct prompt from the LangChain hub
# (requires the langchainhub package), then assemble the agent
from langchain import hub

react_prompt = hub.pull("hwchase17/react")
agent = create_react_agent(llm, [search], react_prompt)
executor = AgentExecutor(agent=agent, tools=[search], verbose=True)

result = executor.invoke({"input": "Search for the capital of France"})
print(result["output"])

Next steps