Deployment
Use this section when you are moving from local evaluation to a repeatable environment.
Choose a Deployment Path
| Path | Best for | Start here |
|---|---|---|
| Docker Compose | Single instance, demos, small teams, simple self-hosting | Docker |
| Kubernetes | Multi-instance production, autoscaling, managed infrastructure | Kubernetes |
| Batch production setup | Async embedding/chat workloads with dedicated workers and shared storage | Batch API & Production Setup |
| Upstream HTTP tuning | Production provider concurrency, streaming, and egress capacity planning | Upstream HTTP Tuning |
Quick Path to Success
- Choose Docker if you want the fastest production-style setup
- Choose Kubernetes if you need replicas, ingress, and cluster-native operations
- Generate a valid `DELTALLM_MASTER_KEY` and `DELTALLM_SALT_KEY`
- Keep secrets in environment variables, not in `config.yaml`
- Verify `/health/liveliness` and `/health/readiness` after startup
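One way to produce the two secrets is Python's standard `secrets` module. This is a minimal sketch, not the project's documented procedure: the section does not say what makes a key valid, so the 32-byte length and URL-safe encoding are assumptions, and `generate_secret` is a hypothetical helper name.

```python
import secrets


def generate_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of entropy.

    The default length is an assumption; check the key format your
    deployment expects before using the result.
    """
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print NAME=value lines suitable for an env file or secret manager.
    print(f"DELTALLM_MASTER_KEY={generate_secret()}")
    print(f"DELTALLM_SALT_KEY={generate_secret()}")
```

Generate each value once and store it in your secret manager or environment. The salt key is used for API key hashing, so it should stay stable after the first keys are issued.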
Shared Requirements
All deployment methods rely on the same core services:
- PostgreSQL for persistent runtime data such as keys, accounts, spend logs, and model records
- Redis for distributed coordination, rate limiting, cache sharing, and runtime state
- Explicit upstream HTTP connection limits for predictable provider concurrency
- A master key for admin access
- A salt key for API key hashing
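A small preflight check can confirm that these requirements are configured before the application starts. The sketch below assumes the settings arrive through the environment variables named in this section's configuration example. `missing_vars` is a hypothetical helper, not part of the application.

```python
import os
from typing import Mapping

# Variable names as used in this section's configuration example.
REQUIRED_VARS = (
    "DELTALLM_MASTER_KEY",
    "DELTALLM_SALT_KEY",
    "DATABASE_URL",
    "REDIS_URL",
)


def missing_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_vars()` in a container entrypoint or CI step and failing when the list is non-empty surfaces a missing secret before the first request does.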
Shared Best Practices
Store Secrets in Environment Variables
general_settings:
  master_key: os.environ/DELTALLM_MASTER_KEY
  salt_key: os.environ/DELTALLM_SALT_KEY
  database_url: os.environ/DATABASE_URL
  redis_url: os.environ/REDIS_URL
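To make the `os.environ/NAME` convention concrete, the sketch below shows how such references could be replaced with environment values once the YAML has been parsed into a dictionary. It illustrates the idea only and is not the application's actual loader. `resolve_env_refs` is a hypothetical name, and raising an error on a missing variable is an assumed behavior.

```python
import os
from typing import Any, Mapping

ENV_PREFIX = "os.environ/"


def resolve_env_refs(node: Any, env: Mapping[str, str] = os.environ) -> Any:
    """Recursively replace 'os.environ/NAME' strings with the value of NAME."""
    if isinstance(node, dict):
        return {key: resolve_env_refs(value, env) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_env_refs(item, env) for item in node]
    if isinstance(node, str) and node.startswith(ENV_PREFIX):
        name = node[len(ENV_PREFIX):]
        if name not in env:
            # Assumed behavior: fail loudly rather than start without a secret.
            raise KeyError(f"Environment variable {name} is not set")
        return env[name]
    # Non-reference values (numbers, plain strings, None) pass through unchanged.
    return node
```

Because the file holds only references, `config.yaml` can be committed or shared without exposing the secret values themselves.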
Use the Built-In Health Endpoints
- `GET /health/liveliness` for process liveness
- `GET /health/readiness` for dependency readiness
- `GET /metrics` for Prometheus scraping
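A post-deploy check can poll the two health endpoints until both respond. The following is a minimal sketch using only the standard library. It assumes a healthy endpoint answers with HTTP 200, which this section does not confirm. `http_ok` and `wait_until_ready` are hypothetical helper names, and the retry count and delay are arbitrary defaults.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to url answers with HTTP 200 (assumed healthy)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeouts, and non-2xx responses all count as not ready.
        return False


def wait_until_ready(
    base_url: str,
    probe: Callable[[str], bool] = http_ok,
    attempts: int = 30,
    delay: float = 2.0,
) -> bool:
    """Poll liveliness, then readiness, until both pass or attempts run out."""
    for _ in range(attempts):
        if probe(f"{base_url}/health/liveliness") and probe(f"{base_url}/health/readiness"):
            return True
        time.sleep(delay)
    return False
```

For example, `wait_until_ready("http://localhost:4000")` could gate a smoke test, where the host and port are placeholders for wherever your instance listens. The `probe` parameter exists so the polling logic can be exercised without a running server.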
Expect Schema Setup on Startup
The application runs Prisma schema setup automatically during container startup. You do not need a separate manual migration step for the default deployment paths documented here.