Features

Use this section when DeltaLLM is already running and you want to turn on a capability, understand how it behaves, or decide which feature to use.

Start With the Outcome You Want

Goal | Read this
Protect access to the gateway | Authentication & SSO
Connect external MCP tools and expose them safely | MCP Gateway & Tools
Spread traffic across multiple deployments | Routing & Failover
Lower latency and cost for repeated requests | Caching
Block or sanitize unsafe content | Guardrails
Control request volume at each scope | Rate Limiting
Track or cap spend | Budgets & Spend
Process large embedding or chat workloads asynchronously | Batch API & Production Setup
Export evidence for compliance or investigations | Audit Log
Monitor health, latency, and request volume | Observability

Quick Success Pattern

Most feature pages in this section follow the same order:

  1. Turn the feature on with the smallest working configuration
  2. Verify it with one request, API call, or UI action
  3. Read the advanced options only if you need them

If you are still getting DeltaLLM running for the first time, start with Getting Started.

Where Other Capabilities Live

Some DeltaLLM capabilities are documented outside the Features section because they are primarily control-plane workflows:

  • Model Deployments explains how runtime models are defined
  • MCP Servers covers the operator workflow for server registration, bindings, policies, and approvals
  • Admin UI covers operator workflows such as Models, Route Groups, Prompt Registry, Batch Jobs, and Settings
  • API Reference documents the public proxy API and admin API endpoints