Kubernetes Deployment¶

Use this guide to deploy DeltaLLM on Kubernetes with Helm.

There are two install paths:

Install from a released chart if you want the simplest production or evaluation setup without cloning the repository.
Install from the repo if you are developing the chart itself or testing local chart changes before release.

The chart supports three deployment shapes:

evaluation with bundled PostgreSQL and Redis
standard production with external PostgreSQL and Redis
high-availability production with multiple replicas, HPA, PDB, topology spread, ingress, and monitoring

Production batch workloads can additionally split batch workers into a dedicated Deployment so UI/API/gateway pods do not execute batch work.

Prerequisites¶

Kubernetes 1.24+
Helm 3.10+
kubectl access to the target cluster

Option 1: Install From a Released Chart¶

Published releases are available from the public Helm repository at https://deltawi.github.io/deltallm.

Each release publishes three matching values files:

values-eval-<chart-version>.yaml: self-contained quick-start with bundled PostgreSQL and Redis
values-production-<chart-version>.yaml: HA-oriented production baseline for external PostgreSQL and Redis
values-<chart-version>.yaml: raw base chart values

The bare chart does not provision PostgreSQL or Redis by default. For a first install, use the eval values file.

helm repo add deltallm https://deltawi.github.io/deltallm
helm repo update

Generate the required secrets first:

export DELTALLM_MASTER_KEY="$(python3 -c 'import secrets; print("sk-" + secrets.token_hex(20) + "A1")')"
export DELTALLM_SALT_KEY="$(openssl rand -hex 32)"

Quick-start evaluation install:

helm install deltallm deltallm/deltallm \
  --version <chart-version> \
  --namespace deltallm \
  --create-namespace \
  -f https://deltawi.github.io/deltallm/values-eval-<chart-version>.yaml \
  --set secret.values.masterKey="$DELTALLM_MASTER_KEY" \
  --set secret.values.saltKey="$DELTALLM_SALT_KEY" \
  --set-string 'env[0].name=PLATFORM_BOOTSTRAP_ADMIN_EMAIL' \
  --set-string 'env[0].value=admin@example.com' \
  --set-string 'env[1].name=PLATFORM_BOOTSTRAP_ADMIN_PASSWORD' \
  --set-string 'env[1].value=ChangeMe123!'

To use the Presidio-enabled image variant from the same release:

helm install deltallm deltallm/deltallm \
  --version <chart-version> \
  --namespace deltallm \
  --create-namespace \
  -f https://deltawi.github.io/deltallm/values-eval-<chart-version>.yaml \
  --set secret.values.masterKey="$DELTALLM_MASTER_KEY" \
  --set secret.values.saltKey="$DELTALLM_SALT_KEY" \
  --set-string 'env[0].name=PLATFORM_BOOTSTRAP_ADMIN_EMAIL' \
  --set-string 'env[0].value=admin@example.com' \
  --set-string 'env[1].name=PLATFORM_BOOTSTRAP_ADMIN_PASSWORD' \
  --set-string 'env[1].value=ChangeMe123!' \
  --set image.tag=v<chart-version>-presidio

Use the latest GitHub Release version for <chart-version>. The exact pinned install commands for each release live in the release notes.
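
The released file names follow a fixed pattern, so the overlay URL and the Presidio image tag can be derived from the chart version. The following is a small sketch of that pattern only: values_url and presidio_tag are hypothetical helper names, 1.2.3 is a placeholder rather than a real release, and the base-values URL is assumed to sit next to the two overlay URLs shown in this guide.

```shell
# Hypothetical helpers; they only assemble strings and contact nothing.
BASE_URL="https://deltawi.github.io/deltallm"

# values_url PROFILE VERSION, where PROFILE is eval, production, or base.
values_url() {
  case "$1" in
    eval|production) echo "${BASE_URL}/values-$1-$2.yaml" ;;
    # Assumption: the raw base values file is published under the same path.
    base)            echo "${BASE_URL}/values-$2.yaml" ;;
    *)               echo "unknown profile: $1" >&2; return 1 ;;
  esac
}

# Image tag of the Presidio-enabled variant for a given chart version.
presidio_tag() {
  echo "v$1-presidio"
}

values_url eval 1.2.3   # placeholder version, not a real release
presidio_tag 1.2.3
```

After helm repo add and helm repo update, helm search repo deltallm/deltallm --versions lists the chart versions that the repository publishes.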

After install:

kubectl get pods -n deltallm should show DeltaLLM plus bundled PostgreSQL and Redis pods
use admin@example.com and the bootstrap password to sign in to the Admin UI
use DELTALLM_MASTER_KEY for gateway and API requests

For production, do not use the eval overlay. Start from the released production overlay instead:

curl -fsSLo values-production.yaml \
  https://deltawi.github.io/deltallm/values-production-<chart-version>.yaml

Edit values-production.yaml to point at your external PostgreSQL and Redis secrets:

set secret.existingSecret to the secret that contains master-key and salt-key
set runtime.database.existingSecret.name and runtime.database.existingSecret.urlKey
set runtime.redis.existingSecret.name and runtime.redis.existingSecret.urlKey
add any provider keys or platform credentials under envFrom or env

Then install:

helm install deltallm deltallm/deltallm \
  --version <chart-version> \
  --namespace deltallm \
  --create-namespace \
  -f values-production.yaml \
  --set secret.existingSecret=deltallm-app-secrets

Use the eval overlay for the simplest first working install. Use the production overlay once you have external stateful services and secret-backed runtime configuration.

Option 2: Install From the Repo¶

Use this path when you want to:

inspect the chart locally
test changes before opening a PR
install directly from deploy/kubernetes/helm

Clone the repository first:

git clone https://github.com/deltawi/deltallm.git
cd deltallm

Fetch chart dependencies¶

The chart uses Bitnami PostgreSQL and Redis as optional subcharts.

helm dependency build deploy/kubernetes/helm

Chart profiles¶

The chart ships with three values layers:

deploy/kubernetes/helm/values.yaml: safe baseline
deploy/kubernetes/helm/values-eval.yaml: quick-start with bundled PostgreSQL and Redis
deploy/kubernetes/helm/values-production.yaml: HA-oriented production defaults

By default, the app pod uses an init container to wait until the configured PostgreSQL and Redis endpoints accept TCP connections before DeltaLLM starts. This avoids the initial crash loop that can happen while bundled stateful dependencies are still coming up.

Quick start from the repo¶

This path uses bundled PostgreSQL and Redis, plus control-plane secrets that you generate yourself.

Generate the master key and salt key before you install

DeltaLLM will not start with placeholder values such as change-me. Generate both values first, then pass them into Helm.

Copy and run:

export DELTALLM_MASTER_KEY="$(python3 -c 'import secrets; print("sk-" + secrets.token_hex(20) + "A1")')"
export DELTALLM_SALT_KEY="$(openssl rand -hex 32)"

DELTALLM_MASTER_KEY must be at least 32 characters long and include both letters and numbers. DELTALLM_SALT_KEY must be a real secret value and must not be change-me.
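
As a pre-flight step, the rules above can be checked in the shell before Helm is invoked. The sketch below is illustrative only: check_master_key and check_salt_key are hypothetical helper names, and they mirror the constraints stated in this guide rather than DeltaLLM's own validation code.

```shell
# Hypothetical pre-flight helpers (not part of DeltaLLM or the chart).
# They mirror the rules stated in this guide, not the server's validator.

check_master_key() {
  key="$1"
  # Rule: at least 32 characters long.
  if [ "${#key}" -lt 32 ]; then
    echo "master key: shorter than 32 characters"
    return 1
  fi
  # Rule: must contain at least one letter ...
  case "$key" in
    *[A-Za-z]*) ;;
    *) echo "master key: no letters"; return 1 ;;
  esac
  # ... and at least one digit.
  case "$key" in
    *[0-9]*) ;;
    *) echo "master key: no digits"; return 1 ;;
  esac
  echo "master key: ok"
}

check_salt_key() {
  # Rule: must be set and must not be the placeholder value.
  case "$1" in
    ""|change-me) echo "salt key: placeholder or empty"; return 1 ;;
    *) echo "salt key: ok" ;;
  esac
}
```

A typical call is check_master_key "$DELTALLM_MASTER_KEY" && check_salt_key "$DELTALLM_SALT_KEY", run right before the Helm command so a bad value is caught before any pods are created.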

helm upgrade --install deltallm deploy/kubernetes/helm \
  --namespace deltallm \
  --create-namespace \
  -f deploy/kubernetes/helm/values-eval.yaml \
  --set secret.values.masterKey="$DELTALLM_MASTER_KEY" \
  --set secret.values.saltKey="$DELTALLM_SALT_KEY"

Access the service with port-forwarding:

kubectl port-forward -n deltallm svc/deltallm 4000:4000
curl http://localhost:4000/health/liveliness

Open the admin UI at http://localhost:4000.

Secret layout¶

For production, keep secrets out of Helm values.

Generate the secrets first if you have not already:

export DELTALLM_MASTER_KEY="$(python3 -c 'import secrets; print("sk-" + secrets.token_hex(20) + "A1")')"
export DELTALLM_SALT_KEY="$(openssl rand -hex 32)"

Create one secret for master-key and salt-key:

kubectl create secret generic deltallm-app-secrets \
  --namespace deltallm \
  --from-literal=master-key="$DELTALLM_MASTER_KEY" \
  --from-literal=salt-key="$DELTALLM_SALT_KEY"

Create one secret for runtime environment variables:

kubectl create secret generic deltallm-runtime-secrets \
  --namespace deltallm \
  --from-literal=DATABASE_URL='postgresql://user:pass@postgres:5432/deltallm' \
  --from-literal=REDIS_URL='redis://redis:6379/0' \
  --from-literal=OPENAI_API_KEY='sk-...'

Then reference them from the chart:

secret:
  existingSecret: deltallm-app-secrets

runtime:
  database:
    existingSecret:
      name: deltallm-runtime-secrets
      urlKey: DATABASE_URL
  redis:
    existingSecret:
      name: deltallm-runtime-secrets
      urlKey: REDIS_URL

envFrom:
  - secretRef:
      name: deltallm-runtime-secrets

The chart will not emit empty database or Redis env vars, so envFrom works cleanly for provider keys and platform integrations.

Configuration patterns¶

1. Bundled PostgreSQL and Redis¶

Use the eval profile or enable both subcharts explicitly:

postgresql:
  enabled: true
  image:
    tag: latest
  auth:
    username: deltallm
    password: change-this
    database: deltallm

redis:
  enabled: true
  image:
    tag: latest
  auth:
    enabled: true
    password: strong-redis-password

If bundled Redis auth is enabled, the chart will generate the correct authenticated URL for DeltaLLM.
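
For reference, an authenticated Redis URL conventionally carries the password in the userinfo part with an empty username. The sketch below only illustrates that shape: redis_url is a hypothetical helper, the host name deltallm-redis-master is an assumed example, and the chart's actual template may build the URL differently.

```shell
# Hypothetical helper: prints a Redis URL in the common
# redis://:<password>@<host>:<port>/<db> form.
# Assumes the password contains no characters that need URL-encoding.
redis_url() {
  host="$1"; port="$2"; db="$3"; password="$4"
  if [ -n "$password" ]; then
    echo "redis://:${password}@${host}:${port}/${db}"
  else
    # No password: plain unauthenticated URL.
    echo "redis://${host}:${port}/${db}"
  fi
}

# Example with an assumed service name and the password from the values above.
redis_url deltallm-redis-master 6379 0 strong-redis-password
```

This is useful mainly when comparing what the chart renders against a URL you would otherwise write by hand for an external Redis.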

If you need to tune or disable the startup wait behavior:

dependencyWait:
  enabled: true
  timeoutSeconds: 180
  periodSeconds: 2
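
The two timing values can be read as a bounded retry loop: probe every periodSeconds until the dependency answers or timeoutSeconds has elapsed. The following is a sketch of that idea, not the chart's actual init-container script; wait_for is a hypothetical helper, and the nc probe in the closing comment assumes netcat is installed.

```shell
# wait_for TIMEOUT_SECONDS PERIOD_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or the time budget is spent.
# Only sleep time is counted, so slow probes extend the real duration.
wait_for() {
  timeout="$1"; period="$2"; shift 2
  elapsed=0
  while ! "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1   # budget spent, dependency still not reachable
    fi
    sleep "$period"
    elapsed=$((elapsed + period))
  done
  return 0
}

# Example (assumed host and port; requires nc):
#   wait_for 180 2 nc -z postgres 5432
```

With the defaults shown above, such a loop would give a dependency up to roughly three minutes to accept connections before the pod is considered failed.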

2. External PostgreSQL and Redis¶

Disable the bundled subcharts and reference external connection strings:

postgresql:
  enabled: false

redis:
  enabled: false

secret:
  existingSecret: deltallm-app-secrets

runtime:
  database:
    existingSecret:
      name: deltallm-runtime-secrets
      urlKey: DATABASE_URL
  redis:
    existingSecret:
      name: deltallm-runtime-secrets
      urlKey: REDIS_URL

envFrom:
  - secretRef:
      name: deltallm-runtime-secrets

3. Split batch workers from API/UI pods¶

For production batch workloads, keep the API/UI/gateway Deployment latency-focused and run batch execution in a dedicated worker Deployment. The chart uses the same image for both roles, but renders separate ConfigMaps so each role can run different general_settings.

config:
  general_settings:
    embeddings_batch_enabled: true
    embeddings_batch_storage_backend: s3
    embeddings_batch_s3_bucket: deltallm-batch-artifacts
    embeddings_batch_s3_region: us-east-1

batchWorker:
  enabled: true
  replicaCount: 2
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 2000m
      memory: 2Gi
  config:
    general_settings:
      embeddings_batch_worker_concurrency: 2
      embeddings_batch_item_claim_limit: 10

When batchWorker.enabled=true, the chart automatically disables batch executor, completion outbox, and cleanup loops in the API ConfigMap and enables them in the worker ConfigMap. config remains the shared base config, api.config overrides only API/UI/gateway pods, and batchWorker.config overrides only worker pods.

The Service keeps the legacy API selector for upgrade safety. Worker pods use a distinct app.kubernetes.io/name, so they are not routed by the public Service. When prometheus.serviceMonitor.enabled=true, split mode also renders a worker-only metrics Service and ServiceMonitor so API and worker metrics can be scraped separately.

Shared mode is acceptable for evaluation and small single-replica deployments. For platform workloads that serve UI navigation, synchronous gateway traffic, and batches at the same time, split mode gives each workload its own scaling envelope:

shared mode upper bound: api replicas * embeddings_batch_worker_concurrency
split mode upper bound: batchWorker replicas * embeddings_batch_worker_concurrency
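
As a worked example of these two bounds, assume three API replicas (the production profile's replicaCount), two worker replicas as in the values above, and a worker concurrency of 2 in both modes. The shared-mode concurrency value is an assumption made for illustration, and batch_upper_bound is a hypothetical helper.

```shell
# Upper bound on batch concurrency, per the two formulas above.
batch_upper_bound() {
  replicas="$1"; concurrency="$2"
  echo $((replicas * concurrency))
}

echo "shared: $(batch_upper_bound 3 2)"  # 3 API replicas * concurrency 2
echo "split:  $(batch_upper_bound 2 2)"  # 2 worker replicas * concurrency 2
```

In split mode the bound depends only on the worker Deployment, so scaling API replicas for gateway traffic no longer changes how much batch work can run at once.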

Use S3 batch artifact storage for split mode. The chart rejects enabled batching with local artifact storage in split mode because API pods and worker pods cannot safely share local files. For single-node development only, set batchWorker.allowUnsafeLocalStorage=true to bypass the guard.

4. Provider credentials and platform settings¶

Use env and envFrom directly:

envFrom:
  - secretRef:
      name: deltallm-runtime-secrets

env:
  - name: PLATFORM_BOOTSTRAP_ADMIN_EMAIL
    value: admin@example.com

This covers provider API keys, bootstrap admin credentials, SSO client credentials, JWT settings, and any other runtime env.

5. Model deployment lifecycle¶

DeltaLLM stores model deployments in the database at runtime. On first install you can seed them from config.model_list using the bootstrap mechanism.

Initial seed with config bootstrap¶

Set model_deployment_source: hybrid and model_deployment_bootstrap_from_config: true (the base chart default):

config:
  model_list:
    - model_name: gpt-4o
      deltallm_params:
        provider: openai
        model: openai/gpt-4o
        api_key: os.environ/OPENAI_API_KEY
        api_base: https://api.openai.com/v1
        timeout: 300
      model_info:
        mode: chat
  general_settings:
    model_deployment_source: hybrid
    model_deployment_bootstrap_from_config: true

On startup, DeltaLLM checks whether the deltallm_modeldeployment table is empty. If it is, the entries from model_list are inserted as a one-time seed. If the table already has rows, the bootstrap is skipped, so the setting is safe to leave enabled.

In hybrid mode, DeltaLLM reads deployments from the database first. If the database is empty or unreachable, it falls back to model_list from the config file.

Transition to database-only¶

Once your models are in the database (seeded by bootstrap or created through the Admin UI), switch to the recommended steady-state:

config:
  general_settings:
    model_deployment_source: db_only
    model_deployment_bootstrap_from_config: false

In db_only mode, DeltaLLM reads deployments exclusively from the database and ignores model_list in the config. If the database is empty and no bootstrap happened, the instance starts with zero models.

What the production profile sets¶

The production values file (values-production.yaml) ships with production-appropriate defaults:

config:
  general_settings:
    cache_backend: redis
    model_deployment_source: db_only
    model_deployment_bootstrap_from_config: false

If you use values-production.yaml, model management is database-only from the start. To seed models on the first install, either:

temporarily override during the initial install:

helm upgrade --install deltallm deploy/kubernetes/helm \
  -f deploy/kubernetes/helm/values-production.yaml \
  --set config.general_settings.model_deployment_source=hybrid \
  --set config.general_settings.model_deployment_bootstrap_from_config=true

then remove the overrides on the next upgrade

or create deployments through the Admin UI or Admin API after the first install

Summary of modes¶

Setting combination	Behavior	When to use
`hybrid` + `bootstrap: true`	Seeds empty DB from config, then reads from DB with config fallback	First install, initial seeding
`hybrid` + `bootstrap: false`	Reads from DB with config fallback, no seeding	Transitional if you want config as a safety net
`db_only` + `bootstrap: false`	Database only, config ignored	Recommended production steady-state
`db_only` + `bootstrap: true`	Seeds empty DB from config, then reads from DB only	One-time seed then database-only

See Model Deployments and General Settings for the full reference.
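
The summary table can also be read as a small decision procedure. The sketch below is illustrative only: model_source is a hypothetical function that encodes the behavior described in this section, it is not DeltaLLM code, and it assumes the database is reachable and that model_list is non-empty.

```shell
# model_source SOURCE BOOTSTRAP DB_ROWS
#   SOURCE     hybrid | db_only
#   BOOTSTRAP  true | false
#   DB_ROWS    number of rows already in the deployments table
# Prints where deployments are served from after startup.
model_source() {
  source_mode="$1"; bootstrap="$2"; db_rows="$3"
  # One-time seed: only when bootstrap is on and the table is empty.
  if [ "$bootstrap" = "true" ] && [ "$db_rows" -eq 0 ]; then
    echo "database (seeded from config)"
    return 0
  fi
  if [ "$db_rows" -gt 0 ]; then
    echo "database"
  elif [ "$source_mode" = "hybrid" ]; then
    # Empty table, no seeding: hybrid falls back to model_list.
    echo "config fallback"
  else
    # Empty table, no seeding, db_only: nothing to serve.
    echo "none (zero models)"
  fi
}

model_source hybrid true 0
model_source db_only false 0
```

The last call corresponds to installing with values-production.yaml unchanged on an empty database, which is why the guide suggests a temporary override or creating deployments through the Admin UI.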

Service and ingress¶

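Ingress is disabled by default. The example below sets the Service type to LoadBalancer (annotated for an AWS NLB) and enables an NGINX ingress with TLS: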

service:
  type: LoadBalancer
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb

ingress:
  enabled: true
  className: nginx
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
  hosts:
    - host: llm-gateway.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: llm-gateway-tls
      hosts:
        - llm-gateway.example.com

High availability¶

Use the production profile as the base:

helm upgrade --install deltallm deploy/kubernetes/helm \
  --namespace deltallm \
  --create-namespace \
  -f deploy/kubernetes/helm/values-production.yaml \
  -f values-custom.yaml

values-production.yaml gives you:

replicaCount: 3
HPA enabled
PDB enabled
topology spread constraints
soft anti-affinity
bundled PostgreSQL and Redis disabled
cache_backend: redis (shared cache across replicas)
model_deployment_source: db_only (database-managed models)
model_deployment_bootstrap_from_config: false (no auto-seeding)

A typical HA overlay looks like this:

secret:
  existingSecret: deltallm-app-secrets

runtime:
  database:
    existingSecret:
      name: deltallm-runtime-secrets
      urlKey: DATABASE_URL
  redis:
    existingSecret:
      name: deltallm-runtime-secrets
      urlKey: REDIS_URL

envFrom:
  - secretRef:
      name: deltallm-runtime-secrets

ingress:
  enabled: true
  className: nginx
  hosts:
    - host: llm-gateway.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: llm-gateway-tls
      hosts:
        - llm-gateway.example.com

prometheus:
  serviceMonitor:
    enabled: true

Optional migration job¶

The current container image still bootstraps Prisma on startup by default.

The chart also exposes an optional migrationJob for teams that want a separate Kubernetes job for explicit migration control:

migrationJob:
  enabled: true
  hook:
    enabled: true

Enable the job only if your rollout process is intentionally built around a separate migration step. If you want the application pods to stop using the image's default bootstrap path, set command and args explicitly for the app container.

S3 request logging¶

Authenticate to S3 with workload identity or an existing credentials secret.

Workload identity¶

serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/deltallm-s3-role
  automountServiceAccountToken: true

s3:
  enabled: true
  bucket: company-deltallm-logs
  region: us-east-1
  compression: gzip

Existing AWS credentials secret¶

s3:
  enabled: true
  bucket: company-deltallm-logs
  region: us-east-1
  existingSecret:
    name: deltallm-aws-creds
    accessKeyIdKey: aws-access-key-id
    secretAccessKeyKey: aws-secret-access-key

Optional hardening features¶

The chart includes:

podDisruptionBudget
topologySpreadConstraints
affinity
networkPolicy
serviceAccount.automountServiceAccountToken
startupProbe, readinessProbe, and livenessProbe
config and generated-secret checksum rollouts

If you enable networkPolicy, define ingress and egress rules that match your cluster and ingress-controller topology.

Validation¶

Lint the chart before deploying:

helm lint deploy/kubernetes/helm -f deploy/kubernetes/helm/values-eval.yaml \
  --set secret.values.masterKey=StrongMasterKey2026SecureValue99 \
  --set secret.values.saltKey=unique-salt-2026

helm lint deploy/kubernetes/helm -f deploy/kubernetes/helm/values-production.yaml \
  --set secret.existingSecret=deltallm-app-secrets \
  --set runtime.database.existingSecret.name=deltallm-runtime-secrets \
  --set runtime.redis.existingSecret.name=deltallm-runtime-secrets \
  --set ingress.enabled=true \
  --set 'ingress.hosts[0].host=llm-gateway.example.com' \
  --set 'ingress.hosts[0].paths[0].path=/' \
  --set 'ingress.hosts[0].paths[0].pathType=Prefix'

If subchart dependencies are not present locally yet, run:

helm dependency build deploy/kubernetes/helm

Troubleshooting¶

Symptom	Likely cause	Fix
Pod exits during startup	Missing `master-key` or `salt-key`	Set `secret.values.*` or `secret.existingSecret`
App cannot connect to PostgreSQL	Wrong external DB secret or bundled PostgreSQL disabled	Check `runtime.database.*` and subchart settings
App cannot connect to Redis	Wrong Redis URL or missing Redis auth password	Check `runtime.redis.*` or bundled `redis.auth.password`
Provider calls fail immediately	Missing provider env vars	Add them via `envFrom` / `env`
Config change did not roll pods	External secret changed outside Helm	Restart the deployment or rotate through your secret operator
Migration job fails	DB not reachable or migration command not appropriate	Inspect `kubectl logs job/<release>-migrate` and adjust `migrationJob.args`