MCP Gateway & Tools¶
DeltaLLM can act as an MCP gateway and as an MCP-to-chat bridge.
That gives you two ways to use the same approved tool catalog:
- direct MCP access through `POST /mcp`
- OpenAI-compatible chat or responses requests that include `tools: [{ "type": "mcp", ... }]`
Quick Path¶
For the first successful rollout:
- register one MCP server
- refresh capabilities and run a health check
- add one binding for the organization, team, or API key that should see it
- add one enabled tool policy with `require_approval: never`
- verify the tool through `/mcp` before letting a model auto-call it
If you want the shortest path, start with MCP Quick Start.
What DeltaLLM Adds on Top of MCP¶
Raw MCP servers expose tools. DeltaLLM adds the control plane around them:
- server registry with health and capability refresh
- scoped visibility through bindings
- per-tool policy controls
- optional manual approvals
- per-tool rate limiting, concurrency limits, and result caching
- audit logging and metrics
- namespaced tool exposure into chat and responses APIs
How It Works¶
    App or model
      -> DeltaLLM
      -> resolve visible MCP servers for the caller
      -> filter tools by binding allowlist
      -> apply the effective per-tool policy
      -> call upstream MCP server over streamable HTTP
      -> return tool result through /mcp or back into chat execution
DeltaLLM namespaces each tool as the server key, a dot, and the tool name. So a `search` tool on a server with key `docs` is exposed as `docs.search`.
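The naming rule above can be sketched in a few lines. This is a minimal illustration, not DeltaLLM's implementation: the function names are hypothetical, and the split helper assumes that server keys themselves contain no dots.

```python
def namespaced_tool_name(server_key: str, tool_name: str) -> str:
    """Join a server key and a tool name with a dot."""
    return f"{server_key}.{tool_name}"


def split_tool_name(namespaced: str) -> tuple[str, str]:
    """Split on the first dot (assumes the server key has no dots)."""
    server_key, sep, tool_name = namespaced.partition(".")
    if not sep or not server_key or not tool_name:
        raise ValueError(f"not a namespaced tool name: {namespaced!r}")
    return server_key, tool_name
```

Splitting on the first dot keeps tool names that contain dots intact, which is why the sketch does not use a plain `split(".")`.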
Visibility and Scope Resolution¶
Normal API keys do not automatically see all MCP tools. DeltaLLM resolves visibility from bindings.
Supported binding scopes:
- `organization`
- `team`
- `api_key`
Resolution precedence follows scope specificity: the most specific binding wins. If that winning binding has a `tool_allowlist`, only those tools are visible.
Important Behaviors¶
- a request-level `allowed_tools` list only narrows already visible tools
- disabled tool policies hide the tool from normal callers
- the master key bypasses normal binding and policy visibility, which is useful for testing
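A rough sketch of the resolution rules described in this section follows. It is illustrative only: the `Binding` class, the `visible_tools` function, and the exact scope order (`api_key`, then `team`, then `organization`) are assumptions, since the text only states that the most specific binding wins. Disabled policies and the master-key bypass are left out to keep it short.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed order, most specific first; not confirmed by the source.
SCOPE_ORDER = ("api_key", "team", "organization")


@dataclass
class Binding:
    scope: str                                  # one of SCOPE_ORDER
    tool_allowlist: Optional[list[str]] = None  # None means no allowlist


def visible_tools(server_tools, bindings, allowed_tools=None):
    """Return the tool names a caller can see on one server."""
    matching = [b for b in bindings if b.scope in SCOPE_ORDER]
    if not matching:
        return []  # no binding means no visibility
    # The most specific binding wins.
    winner = min(matching, key=lambda b: SCOPE_ORDER.index(b.scope))
    tools = list(server_tools)
    if winner.tool_allowlist is not None:
        tools = [t for t in tools if t in winner.tool_allowlist]
    if allowed_tools is not None:
        # A request-level list can only narrow, never widen.
        tools = [t for t in tools if t in allowed_tools]
    return tools
```

Note that a request-level `allowed_tools` entry naming a tool outside the binding allowlist has no effect, which matches the "only narrows" behavior above.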
Tool Policies¶
Policies are scoped the same way as bindings and use the same precedence.
Each policy can control:
- whether the tool is enabled
- whether it requires approval
- requests per minute
- max concurrency
- result cache TTL
- max total execution time
Use manual approval when a tool can take irreversible or sensitive actions. Use `never` for low-friction retrieval and read-only integrations.
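To make the policy controls concrete, here is a hedged sketch of a policy record and the decision it drives. Every field name, and the `"manual"` value for approval, is an assumption chosen for illustration; only `require_approval: never` appears in this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolPolicy:
    # Illustrative field names, not DeltaLLM's actual schema.
    enabled: bool = True
    require_approval: str = "never"            # "never" or (assumed) "manual"
    requests_per_minute: Optional[int] = None
    max_concurrency: Optional[int] = None
    result_cache_ttl_seconds: Optional[int] = None
    max_total_execution_seconds: Optional[float] = None


def call_decision(policy: ToolPolicy) -> str:
    """Classify a call as 'hidden', 'needs_approval', or 'execute'."""
    if not policy.enabled:
        return "hidden"          # disabled policies hide the tool
    if policy.require_approval != "never":
        return "needs_approval"  # a human must approve first
    return "execute"
```

A read-only search tool would typically map to `ToolPolicy()` with a rate limit, while a tool that writes data would set the approval field.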
Direct MCP Gateway¶
Use POST /mcp when you want deterministic tool access without relying on model tool selection.
Supported JSON-RPC methods:
- `initialize`
- `ping`
- `tools/list`
- `tools/call`
This is the best first verification path because it isolates:
- DeltaLLM-to-MCP connectivity
- auth and header forwarding
- binding visibility
- policy enforcement
See API Reference: MCP Gateway & Tooling for examples.
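As a quick orientation before reading the API reference, the sketch below builds JSON-RPC 2.0 requests for `tools/list` and `tools/call` without sending them. The gateway URL, the API key, the bearer `Authorization` header, and the `query` argument name are all placeholders assumed for illustration; the `name` and `arguments` parameters follow the MCP `tools/call` shape.

```python
import json
import urllib.request

# Hypothetical values: replace with your own deployment URL and key.
GATEWAY_URL = "http://localhost:8000/mcp"
API_KEY = "sk-example"


def build_mcp_request(method, params=None, request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 POST to the gateway."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
        },
        method="POST",
    )


list_req = build_mcp_request("tools/list")
call_req = build_mcp_request(
    "tools/call",
    {"name": "docs.search", "arguments": {"query": "DeltaLLM"}},
    request_id=2,
)
```

Passing either request to `urllib.request.urlopen` would send it; checking `tools/list` first confirms that the binding exposes the tool before a call is attempted.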
Chat and Responses Bridge¶
DeltaLLM also lets OpenAI-compatible chat and responses requests reference MCP servers directly.
Example request shape:
    {
      "model": "gpt-4o-mini",
      "messages": [
        {"role": "user", "content": "Search the docs for DeltaLLM."}
      ],
      "tools": [
        {
          "type": "mcp",
          "server": "docs",
          "allowed_tools": ["search"],
          "require_approval": "never"
        }
      ],
      "tool_choice": "required"
    }
DeltaLLM translates the visible MCP tools into OpenAI-style function tools, executes any resulting tool calls, and feeds the tool result back into the model.
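The translation step can be pictured roughly as follows. This is a sketch under assumptions, not DeltaLLM's code: it maps an MCP tool descriptor (`name`, `description`, `inputSchema`) to an OpenAI-style function tool, and it joins the server key and tool name with `__` because some providers restrict function names to letters, digits, underscores, and hyphens. How DeltaLLM actually encodes the name for the provider is not stated here.

```python
def to_function_tool(server_key: str, mcp_tool: dict) -> dict:
    """Map one MCP tool descriptor to an OpenAI-style function tool."""
    return {
        "type": "function",
        "function": {
            # Assumption: "__" stands in for the dot in the provider-facing name.
            "name": f"{server_key}__{mcp_tool['name']}",
            "description": mcp_tool.get("description", ""),
            # Fall back to an empty object schema when none is given.
            "parameters": mcp_tool.get(
                "inputSchema", {"type": "object", "properties": {}}
            ),
        },
    }
```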
Current Limits¶
- MCP tools are not supported on streaming chat requests yet
- MCP tools are not supported on streaming responses requests yet
- tools that require manual approval are not auto-executed in chat or responses flows
- the upstream model still needs reliable tool-calling support
For early production rollouts, verify the provider and model combination with a real tool call path before enabling it broadly.
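One way to catch the streaming limit early is a client-side guard that runs before the request is sent. The helper below is hypothetical and assumes the OpenAI-style `stream` flag on the request body.

```python
def check_mcp_request(payload: dict) -> None:
    """Raise if a request combines MCP tools with streaming."""
    tools = payload.get("tools") or []
    uses_mcp = any(t.get("type") == "mcp" for t in tools)
    if uses_mcp and payload.get("stream"):
        raise ValueError("MCP tools are not supported on streaming requests yet")
```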
Upstream MCP Server Requirements¶
DeltaLLM currently supports:
- transport: `streamable_http`
- request format: JSON-RPC `2.0`
- response content type: JSON or `text/event-stream` for Streamable HTTP POST responses
- stateful HTTP sessions through upstream `MCP-Session-Id` headers
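Because an upstream server may answer a POST with either plain JSON or an event stream, a client of that server has to handle both. The sketch below shows one simplified way to do so; the function name is hypothetical, it expects a single JSON-RPC message in the JSON case, and it reads only the `data:` fields of the event stream.

```python
import json


def parse_mcp_response(content_type: str, body: str) -> list:
    """Return the JSON-RPC messages carried by a response body."""
    if content_type.startswith("application/json"):
        return [json.loads(body)]
    if content_type.startswith("text/event-stream"):
        messages, data_lines = [], []
        for line in body.splitlines():
            if line.startswith("data:"):
                # Drop the field name and one optional leading space.
                data_lines.append(line[5:].removeprefix(" "))
            elif line == "" and data_lines:
                # A blank line ends the current event.
                messages.append(json.loads("\n".join(data_lines)))
                data_lines = []
        if data_lines:
            messages.append(json.loads("\n".join(data_lines)))
        return messages
    raise ValueError(f"unsupported content type: {content_type!r}")
```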
For compatibility with common client config names, the admin API also accepts transport: "http" and stores it as streamable_http.
Supported upstream auth modes:
- `none`
- `bearer`
- `basic`
- `header_map`
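The four auth modes correspond to familiar HTTP header patterns. The sketch below shows how each might translate into upstream request headers; the function and the config keys (`token`, `username`, `password`, `headers`) are assumptions for illustration, not DeltaLLM's configuration schema.

```python
import base64


def upstream_auth_headers(mode: str, config: dict) -> dict:
    """Build upstream request headers for one auth mode."""
    if mode == "none":
        return {}
    if mode == "bearer":
        return {"Authorization": f"Bearer {config['token']}"}
    if mode == "basic":
        # HTTP Basic: base64 of "username:password".
        raw = f"{config['username']}:{config['password']}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    if mode == "header_map":
        # Arbitrary static headers, copied as given.
        return dict(config["headers"])
    raise ValueError(f"unsupported auth mode: {mode!r}")
```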
If you need to pass selected end-user headers through DeltaLLM, use forwarded_headers_allowlist. Callers then send headers in this form:
Example:
Operations and Observability¶
DeltaLLM records MCP activity in both audits and metrics.
Operator workflows include:
- refreshing tool capabilities after an upstream server changes
- running on-demand health checks
- inspecting recent call counts, failures, and approval volume per server
- reviewing and deciding approval requests
For the day-to-day UI workflow, see Admin UI: MCP Servers.
Troubleshooting¶
| Symptom | Likely cause |
|---|---|
| `No visible MCP tools are available for server 'docs'` | The caller has no matching binding, or the binding and policy combination hides every tool |
| `Unknown MCP tool 'docs.search'` | The tool is not visible to that caller, or the upstream capabilities changed and were not refreshed |
| Health check reports unhealthy | Base URL, upstream auth, or network reachability is wrong |
| `/mcp` works but chat fails | The provider/model may not support tool calling well enough, or the request is streaming |
| Chat returns a manual approval error | Use `/mcp` `tools/call`, approve the request, and retry there |