MCP Servers¶

The MCP Servers page is where operators register upstream MCP servers, decide who can use them, and govern how each tool executes.

What This Page Is For¶

Use this area when you need to:

register a new MCP server
refresh the tool catalog after the upstream server changes
run a health check before exposing the server to applications
bind a server to an organization, team, or API key
bind a server to an organization, team, API key, or user
set policy for a specific tool
review and decide manual approval requests

Quick Success Workflow¶

Open AI Gateway -> MCP Servers
Create one server with a real base_url
Refresh capabilities and confirm the expected tool list appears
Run a health check and confirm the server is healthy
Add a binding for the scope that should see the server
Add a tool policy for the first tool you want to allow
Verify the tool through /mcp or a chat request

Server Fields¶

When you create or edit a server, the key fields are:

Server Key: stable identifier used in tool names such as docs.search
Name: human-readable label in the UI
Description: optional operator context
Transport: currently streamable_http
Base URL: upstream MCP endpoint
Enabled: hides or exposes the server at runtime
Auth Mode: none, bearer, basic, or header_map
Forwarded Headers Allowlist: which caller-supplied headers DeltaLLM may forward
Request Timeout: per-request timeout to the MCP server

Platform admins can create:

global servers
organization-owned servers

Scoped admins can create only organization-owned servers for organizations they manage.

Refresh and Health Check¶

Two actions should happen before you bind a server to real traffic:

Refresh Capabilities¶

This fetches the upstream tools/list response and stores the current tool catalog in DeltaLLM.

Use it when:

you create the server for the first time
the upstream MCP server adds, removes, or renames tools
you need to confirm the current input schemas seen by DeltaLLM

Health Check¶

This runs an initialize plus tools/list sequence against the upstream server and stores the current status, latency, and error.

Use it when:

a new server was just configured
an operator changed auth or networking
you want a fast answer before debugging app-side behavior

Bindings¶

Bindings control who can see a server at all.

Supported binding scopes:

organization
team
api_key
user

If you leave the binding allowlist empty, every tool from that server is eligible to be visible at that scope. If you set a tool_allowlist, only those tools are eligible.

Resolution Model¶

MCP now follows the same scoped-governance model used by callable targets:

organization bindings are the ceiling
team, API key, and user policies can narrow access with restrict
tool allowlists intersect as scope becomes more specific

If an organization has not been migrated yet, DeltaLLM still supports a compatibility fallback. Use the MCP migration report and backfill endpoints to move orgs to the explicit org-ceiling model.

Tool Policies¶

Policies apply after the binding step and govern each tool individually.

Available controls:

Enabled: hard on/off switch
Approval: never or manual
Max RPM: per-tool rate limit
Max Concurrency: max simultaneous executions
Result Cache TTL: cache identical tool results
Max Total Execution Time: fail long-running tool calls

Use policies to keep the server broadly visible but still control the riskiest tools narrowly.

Approval Requests¶

If a policy uses manual approval, DeltaLLM creates approval requests for /mcp tools/call attempts.

From the UI, operators can:

inspect the requested tool and arguments
approve or reject the request
add a decision comment

Important limitation:

manual approval is currently part of the direct /mcp flow
chat and responses auto-execution do not support manual-approval tools yet

Operations View¶

Each server also exposes an operations summary that helps answer:

how often the server is being called
which tools are used most
how many recent failures happened
how many approval requests were created, approved, or rejected

This is the fastest place to look before opening raw audit logs.

Header Forwarding¶

If the upstream server needs caller-specific headers, add the header names to the allowlist on the server.

Clients then send them with this prefix:

x-deltallm-mcp-<server_key>-<header-name>

Example:

x-deltallm-mcp-github-authorization: Bearer ...

DeltaLLM strips the prefix and forwards only the allowlisted header names.

Common Operator Pitfalls¶

Problem	What usually fixes it
Refresh works but the app sees no tools	Add a binding for the caller scope and confirm the tool is not filtered out
Health check fails in Docker	Use `host.docker.internal` instead of `localhost` for a host-run MCP server
A tool is visible to the master key but not to app keys	Missing binding or overly narrow allowlist/policy
Chat requests fail but `/mcp tools/call` works	The chosen model is not handling tool calling reliably enough