Kubernetes Deployment

Use this guide to deploy DeltaLLM on Kubernetes with Helm.

There are two install paths:

  • Install from a released chart if you want the simplest production or evaluation setup without cloning the repository.
  • Install from the repo if you are developing the chart itself or testing local chart changes before release.

The chart supports three deployment shapes:

  • evaluation with bundled PostgreSQL and Redis
  • standard production with external PostgreSQL and Redis
  • high-availability production with multiple replicas, HPA, PDB, topology spread, ingress, and monitoring

Production batch workloads can additionally split batch workers into a dedicated Deployment so UI/API/gateway pods do not execute batch work.

Prerequisites

  • Kubernetes 1.24+
  • Helm 3.10+
  • kubectl access to the target cluster

Option 1: Install From a Released Chart

Published releases are available from the public Helm repository at https://deltawi.github.io/deltallm.

Each release publishes three matching values files:

  • values-eval-<chart-version>.yaml: self-contained quick-start with bundled PostgreSQL and Redis
  • values-production-<chart-version>.yaml: HA-oriented production baseline for external PostgreSQL and Redis
  • values-<chart-version>.yaml: raw base chart values

The bare chart does not provision PostgreSQL or Redis by default. For a first install, use the eval values file.

helm repo add deltallm https://deltawi.github.io/deltallm
helm repo update

Generate the required secrets first:

export DELTALLM_MASTER_KEY="$(python3 -c 'import secrets; print("sk-" + secrets.token_hex(20) + "A1")')"
export DELTALLM_SALT_KEY="$(openssl rand -hex 32)"

Quick-start evaluation install:

helm install deltallm deltallm/deltallm \
  --version <chart-version> \
  --namespace deltallm \
  --create-namespace \
  -f https://deltawi.github.io/deltallm/values-eval-<chart-version>.yaml \
  --set secret.values.masterKey="$DELTALLM_MASTER_KEY" \
  --set secret.values.saltKey="$DELTALLM_SALT_KEY" \
  --set-string 'env[0].name=PLATFORM_BOOTSTRAP_ADMIN_EMAIL' \
  --set-string 'env[0].value=admin@example.com' \
  --set-string 'env[1].name=PLATFORM_BOOTSTRAP_ADMIN_PASSWORD' \
  --set-string 'env[1].value=ChangeMe123!'

To use the Presidio-enabled image variant from the same release:

helm install deltallm deltallm/deltallm \
  --version <chart-version> \
  --namespace deltallm \
  --create-namespace \
  -f https://deltawi.github.io/deltallm/values-eval-<chart-version>.yaml \
  --set secret.values.masterKey="$DELTALLM_MASTER_KEY" \
  --set secret.values.saltKey="$DELTALLM_SALT_KEY" \
  --set-string 'env[0].name=PLATFORM_BOOTSTRAP_ADMIN_EMAIL' \
  --set-string 'env[0].value=admin@example.com' \
  --set-string 'env[1].name=PLATFORM_BOOTSTRAP_ADMIN_PASSWORD' \
  --set-string 'env[1].value=ChangeMe123!' \
  --set image.tag=v<chart-version>-presidio

Use the latest GitHub Release version for <chart-version>. The exact pinned install commands for each release live in the release notes.
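
Because every published file name embeds the chart version, it can help to derive the values URL and the Presidio image tag from one place. The following is a minimal sketch, assuming the naming pattern shown in the commands above holds for the release you pick. The helper names and the 0.0.0 placeholder are hypothetical and are not part of the chart or its tooling.

```shell
# Hypothetical helpers; they only reproduce the naming pattern documented above.
DELTALLM_REPO_URL="https://deltawi.github.io/deltallm"

# values_url <profile> <chart-version>
# profile is "eval" or "production".
values_url() {
  profile="$1"; version="$2"
  case "$profile" in
    eval|production)
      echo "${DELTALLM_REPO_URL}/values-${profile}-${version}.yaml" ;;
    *)
      echo "unknown profile: $profile" >&2
      return 1 ;;
  esac
}

# presidio_tag <chart-version>: image tag of the Presidio-enabled variant.
presidio_tag() {
  echo "v$1-presidio"
}

# Example (0.0.0 is a placeholder, not a real release):
#   helm install deltallm deltallm/deltallm --version 0.0.0 \
#     -f "$(values_url eval 0.0.0)" ...
```

After helm repo update, helm search repo deltallm/deltallm --versions lists the chart versions that the repository currently publishes.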

After install:

  • kubectl get pods -n deltallm should show DeltaLLM plus bundled PostgreSQL and Redis pods
  • use admin@example.com and the bootstrap password to sign in to the Admin UI
  • use DELTALLM_MASTER_KEY for gateway and API requests

For production, do not use the eval overlay. Start from the released production overlay instead:

curl -fsSLo values-production.yaml \
  https://deltawi.github.io/deltallm/values-production-<chart-version>.yaml

Edit values-production.yaml to point at your external PostgreSQL and Redis secrets:

  • set secret.existingSecret to the secret that contains master-key and salt-key
  • set runtime.database.existingSecret.name and runtime.database.existingSecret.urlKey
  • set runtime.redis.existingSecret.name and runtime.redis.existingSecret.urlKey
  • add any provider keys or platform credentials under envFrom or env

Then install:

helm install deltallm deltallm/deltallm \
  --version <chart-version> \
  --namespace deltallm \
  --create-namespace \
  -f values-production.yaml \
  --set secret.existingSecret=deltallm-app-secrets

Use the eval overlay for the simplest first working install. Use the production overlay once you have external stateful services and secret-backed runtime configuration.

Option 2: Install From the Repo

Use this path when you want to:

  • inspect the chart locally
  • test changes before opening a PR
  • install directly from ./helm

Clone the repository first:

git clone https://github.com/deltawi/deltallm.git
cd deltallm

Fetch chart dependencies

The chart uses Bitnami PostgreSQL and Redis as optional subcharts.

helm dependency build ./helm

Chart profiles

The chart now ships with three value layers:

  • helm/values.yaml: safe baseline
  • helm/values-eval.yaml: quick-start with bundled PostgreSQL and Redis
  • helm/values-production.yaml: HA-oriented production defaults

By default, the app pod uses an init container to wait until the configured PostgreSQL and Redis endpoints accept TCP connections before DeltaLLM starts. This avoids the initial crash loop that can happen while bundled stateful dependencies are still coming up.

Quick start from the repo

This path uses bundled PostgreSQL and Redis and generated control-plane secrets.

Generate the master key and salt key before you install

DeltaLLM will not start with placeholder values such as change-me. Generate both values first, then pass them into Helm.

Copy and run:

export DELTALLM_MASTER_KEY="$(python3 -c 'import secrets; print("sk-" + secrets.token_hex(20) + "A1")')"
export DELTALLM_SALT_KEY="$(openssl rand -hex 32)"

DELTALLM_MASTER_KEY must be at least 32 characters long and include both letters and numbers. DELTALLM_SALT_KEY must be a real secret value and must not be change-me.
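
If you want to catch a bad value before Helm renders anything, you can check both keys locally. This is a minimal sketch that mirrors only the rules stated above. The function names are hypothetical, and DeltaLLM's own validation may be stricter.

```shell
# Hypothetical pre-flight checks; they mirror the documented rules only.

# check_master_key <key>: at least 32 characters, with a letter and a digit.
check_master_key() {
  key="$1"
  # Rule 1: minimum length of 32 characters.
  [ "${#key}" -ge 32 ] || return 1
  # Rule 2: at least one letter.
  case "$key" in *[A-Za-z]*) ;; *) return 1 ;; esac
  # Rule 3: at least one digit.
  case "$key" in *[0-9]*) ;; *) return 1 ;; esac
  return 0
}

# check_salt_key <key>: non-empty and not the change-me placeholder.
check_salt_key() {
  [ -n "$1" ] && [ "$1" != "change-me" ]
}

# Example:
#   check_master_key "$DELTALLM_MASTER_KEY" || echo "master key rejected" >&2
#   check_salt_key "$DELTALLM_SALT_KEY" || echo "salt key rejected" >&2
```

A key generated with the commands above is 45 characters long (sk-, 40 hex characters, and the A1 suffix), so it passes these checks.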

helm upgrade --install deltallm ./helm \
  --namespace deltallm \
  --create-namespace \
  -f helm/values-eval.yaml \
  --set secret.values.masterKey="$DELTALLM_MASTER_KEY" \
  --set secret.values.saltKey="$DELTALLM_SALT_KEY"

Access the service with port-forwarding:

kubectl port-forward -n deltallm svc/deltallm 4000:4000
curl http://localhost:4000/health/liveliness

Open the admin UI at http://localhost:4000.

Secret layout

For production, keep secrets out of Helm values.

Generate the secrets first if you have not already:

export DELTALLM_MASTER_KEY="$(python3 -c 'import secrets; print("sk-" + secrets.token_hex(20) + "A1")')"
export DELTALLM_SALT_KEY="$(openssl rand -hex 32)"

Create one secret for master-key and salt-key:

kubectl create secret generic deltallm-app-secrets \
  --namespace deltallm \
  --from-literal=master-key="$DELTALLM_MASTER_KEY" \
  --from-literal=salt-key="$DELTALLM_SALT_KEY"

Create one secret for runtime environment variables:

kubectl create secret generic deltallm-runtime-secrets \
  --namespace deltallm \
  --from-literal=DATABASE_URL='postgresql://user:pass@postgres:5432/deltallm' \
  --from-literal=REDIS_URL='redis://redis:6379/0' \
  --from-literal=OPENAI_API_KEY='sk-...'

Then reference them from the chart:

secret:
  existingSecret: deltallm-app-secrets

runtime:
  database:
    existingSecret:
      name: deltallm-runtime-secrets
      urlKey: DATABASE_URL
  redis:
    existingSecret:
      name: deltallm-runtime-secrets
      urlKey: REDIS_URL

envFrom:
  - secretRef:
      name: deltallm-runtime-secrets

The chart will not emit empty database or Redis env vars, so envFrom works cleanly for provider keys and platform integrations.

Configuration patterns

1. Bundled PostgreSQL and Redis

Use the eval profile or enable both subcharts explicitly:

postgresql:
  enabled: true
  image:
    tag: latest
  auth:
    username: deltallm
    password: change-this
    database: deltallm

redis:
  enabled: true
  image:
    tag: latest
  auth:
    enabled: true
    password: strong-redis-password

If bundled Redis auth is enabled, the chart will generate the correct authenticated URL for DeltaLLM.

If you need to tune or disable the startup wait behavior:

dependencyWait:
  enabled: true
  timeoutSeconds: 180
  periodSeconds: 2
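
To make the two timing values concrete, here is a minimal sketch of a bounded wait loop in plain shell. It illustrates the general technique (retry a probe every period until a timeout), and it is not the chart's actual init container. The function name and the host names in the example are assumptions.

```shell
# Hypothetical helper: retry a probe command until it succeeds or time runs out.
# wait_until <timeout-seconds> <period-seconds> <command> [args...]
wait_until() {
  timeout="$1"; period="$2"
  shift 2
  elapsed=0
  # Run the probe; on failure, give up at the timeout or sleep one period.
  while ! "$@" >/dev/null 2>&1; do
    elapsed=$((elapsed + period))
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$period"
  done
  return 0
}

# Example with the values shown above (180 s timeout, 2 s period).
# "postgres" and "redis" are placeholder host names:
#   wait_until 180 2 nc -z postgres 5432
#   wait_until 180 2 nc -z redis 6379
```

With timeoutSeconds: 180 and periodSeconds: 2, a loop like this probes roughly 90 times before giving up.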

2. External PostgreSQL and Redis

Disable the bundled subcharts and reference external connection strings:

postgresql:
  enabled: false

redis:
  enabled: false

secret:
  existingSecret: deltallm-app-secrets

runtime:
  database:
    existingSecret:
      name: deltallm-runtime-secrets
      urlKey: DATABASE_URL
  redis:
    existingSecret:
      name: deltallm-runtime-secrets
      urlKey: REDIS_URL

envFrom:
  - secretRef:
      name: deltallm-runtime-secrets

3. Split batch workers from API/UI pods

For production batch workloads, keep the API/UI/gateway Deployment latency-focused and run batch execution in a dedicated worker Deployment. The chart uses the same image for both roles, but renders separate ConfigMaps so each role can run different general_settings.

config:
  general_settings:
    embeddings_batch_enabled: true
    embeddings_batch_storage_backend: s3
    embeddings_batch_s3_bucket: deltallm-batch-artifacts
    embeddings_batch_s3_region: us-east-1

batchWorker:
  enabled: true
  replicaCount: 2
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 2000m
      memory: 2Gi
  config:
    general_settings:
      embeddings_batch_worker_concurrency: 2
      embeddings_batch_item_claim_limit: 10

When batchWorker.enabled=true, the chart automatically disables batch executor, completion outbox, and cleanup loops in the API ConfigMap and enables them in the worker ConfigMap. config remains the shared base config, api.config overrides only API/UI/gateway pods, and batchWorker.config overrides only worker pods.

The Service keeps the legacy API selector for upgrade safety. Worker pods use a distinct app.kubernetes.io/name label, so the public Service does not route traffic to them. When prometheus.serviceMonitor.enabled=true, split mode also renders a worker-only metrics Service and ServiceMonitor so API and worker metrics can be scraped separately.

Shared mode is acceptable for evaluation and small single-replica deployments. For platform workloads that serve UI navigation, synchronous gateway traffic, and batches at the same time, split mode gives each workload its own scaling envelope:

  • shared mode upper bound on batch concurrency: API replicas × embeddings_batch_worker_concurrency
  • split mode upper bound on batch concurrency: batchWorker replicas × embeddings_batch_worker_concurrency
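
The bound is a simple product, so you can work it out before you size a Deployment. A minimal sketch with a hypothetical helper name:

```shell
# Hypothetical helper: upper bound on concurrent batch workers.
# batch_upper_bound <replicas> <embeddings_batch_worker_concurrency>
batch_upper_bound() {
  echo $(( $1 * $2 ))
}

# Split-mode example from the values above:
# batchWorker.replicaCount=2 and embeddings_batch_worker_concurrency=2
batch_upper_bound 2 2   # prints 4
```

In split mode, scaling batchWorker.replicaCount raises this bound without adding API pods.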

Use S3 batch artifact storage for split mode. The chart rejects enabled batching with local artifact storage in split mode because API pods and worker pods cannot safely share local files. For single-node development only, set batchWorker.allowUnsafeLocalStorage=true to bypass the guard.
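
The guard can be read as a small decision rule. The sketch below restates it in shell under stated assumptions: it is not the chart's template code, the function and argument names are hypothetical, and "local" is an assumed name for the non-S3 backend.

```shell
# Sketch of the guard described above, not the chart's own template logic.
# split_mode_storage_ok <batch_enabled> <split_mode> <backend> <allow_unsafe>
#   batch_enabled  true|false  embeddings_batch_enabled
#   split_mode     true|false  batchWorker.enabled
#   backend        s3|local    embeddings_batch_storage_backend ("local" is assumed)
#   allow_unsafe   true|false  batchWorker.allowUnsafeLocalStorage
split_mode_storage_ok() {
  batch_enabled="$1"; split_mode="$2"; backend="$3"; allow_unsafe="$4"
  # The guard only applies when batching is on and workers are split out.
  if [ "$batch_enabled" != "true" ] || [ "$split_mode" != "true" ]; then
    return 0
  fi
  # S3 is reachable from both API and worker pods, so it always passes.
  if [ "$backend" = "s3" ]; then
    return 0
  fi
  # Any other backend passes only with the explicit single-node override.
  [ "$allow_unsafe" = "true" ]
}

# Example:
#   split_mode_storage_ok true true local false || echo "rejected" >&2
```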

4. Provider credentials and platform settings

Use env and envFrom directly:

envFrom:
  - secretRef:
      name: deltallm-runtime-secrets

env:
  - name: PLATFORM_BOOTSTRAP_ADMIN_EMAIL
    value: admin@example.com

This covers provider API keys, bootstrap admin credentials, SSO client credentials, JWT settings, and any other runtime env.

5. Model deployment lifecycle

DeltaLLM stores model deployments in the database at runtime. On first install you can seed them from config.model_list using the bootstrap mechanism.

Initial seed with config bootstrap

Set model_deployment_source: hybrid and model_deployment_bootstrap_from_config: true (the base chart default):

config:
  model_list:
    - model_name: gpt-4o
      deltallm_params:
        provider: openai
        model: openai/gpt-4o
        api_key: os.environ/OPENAI_API_KEY
        api_base: https://api.openai.com/v1
        timeout: 300
      model_info:
        mode: chat
  general_settings:
    model_deployment_source: hybrid
    model_deployment_bootstrap_from_config: true

On startup, DeltaLLM checks whether the deltallm_modeldeployment table is empty. If it is, the entries from model_list are inserted as a one-time seed. If the table already has rows, the bootstrap is skipped, so it is safe to leave enabled.

In hybrid mode, DeltaLLM reads deployments from the database first. If the database is empty or unreachable, it falls back to model_list from the config file.

Transition to database-only

Once your models are in the database (seeded by bootstrap or created through the Admin UI), switch to the recommended steady-state:

config:
  general_settings:
    model_deployment_source: db_only
    model_deployment_bootstrap_from_config: false

In db_only mode, DeltaLLM reads deployments exclusively from the database and ignores model_list in the config. If the database is empty and no bootstrap happened, the instance starts with zero models.

What the production profile sets

The production values file (values-production.yaml) ships with production-appropriate defaults:

config:
  general_settings:
    cache_backend: redis
    model_deployment_source: db_only
    model_deployment_bootstrap_from_config: false

If you use values-production.yaml, model management is database-only from the start. To seed models on the first install, either:

  • temporarily override both settings during the initial install, then remove the overrides on the next upgrade:

helm upgrade --install deltallm ./helm \
  --namespace deltallm \
  -f helm/values-production.yaml \
  --set config.general_settings.model_deployment_source=hybrid \
  --set config.general_settings.model_deployment_bootstrap_from_config=true

  • or create deployments through the Admin UI or Admin API after the first install

Summary of modes

| Setting combination        | Behavior                                                             | When to use                                    |
|----------------------------|----------------------------------------------------------------------|------------------------------------------------|
| hybrid + bootstrap: true   | Seeds empty DB from config, then reads from DB with config fallback  | First install, initial seeding                 |
| hybrid + bootstrap: false  | Reads from DB with config fallback, no seeding                       | Transitional if you want config as a safety net |
| db_only + bootstrap: true  | Seeds empty DB from config, then reads from DB only                  | One-time seed then database-only               |
| db_only + bootstrap: false | Database only, config ignored                                        | Recommended production steady-state            |
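
The four combinations reduce to two independent decisions: whether to seed, and where to read from. The sketch below restates the documented behavior as a shell function. The function name, argument order, and output labels are hypothetical, and it does not reflect DeltaLLM's internal code.

```shell
# Sketch of the documented startup behavior only.
# startup_plan <model_deployment_source> <bootstrap_from_config> <existing-db-rows>
startup_plan() {
  mode="$1"; bootstrap="$2"; rows="$3"
  # Seeding happens only when bootstrap is on and the table is empty.
  if [ "$bootstrap" = "true" ] && [ "$rows" -eq 0 ]; then
    seed="seed-from-config"
  else
    seed="no-seed"
  fi
  # The read path depends only on the source mode.
  case "$mode" in
    hybrid)  reads="db-with-config-fallback" ;;
    db_only) reads="db-only" ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1 ;;
  esac
  echo "$seed $reads"
}

# Example: first install with the base chart defaults and an empty table.
#   startup_plan hybrid true 0
```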

See Model Deployments and General Settings for the full reference.

Service and ingress

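Ingress is disabled by default. The following example sets a LoadBalancer Service with an AWS NLB annotation and enables an NGINX ingress with TLS and 600-second proxy timeouts: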

service:
  type: LoadBalancer
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb

ingress:
  enabled: true
  className: nginx
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
  hosts:
    - host: llm-gateway.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: llm-gateway-tls
      hosts:
        - llm-gateway.example.com

High availability

Use the production profile as the base:

helm upgrade --install deltallm ./helm \
  --namespace deltallm \
  --create-namespace \
  -f helm/values-production.yaml \
  -f values-custom.yaml

values-production.yaml gives you:

  • replicaCount: 3
  • HPA enabled
  • PDB enabled
  • topology spread constraints
  • soft anti-affinity
  • bundled PostgreSQL and Redis disabled
  • cache_backend: redis (shared cache across replicas)
  • model_deployment_source: db_only (database-managed models)
  • model_deployment_bootstrap_from_config: false (no auto-seeding)

A typical HA overlay looks like this:

secret:
  existingSecret: deltallm-app-secrets

runtime:
  database:
    existingSecret:
      name: deltallm-runtime-secrets
      urlKey: DATABASE_URL
  redis:
    existingSecret:
      name: deltallm-runtime-secrets
      urlKey: REDIS_URL

envFrom:
  - secretRef:
      name: deltallm-runtime-secrets

ingress:
  enabled: true
  className: nginx
  hosts:
    - host: llm-gateway.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: llm-gateway-tls
      hosts:
        - llm-gateway.example.com

prometheus:
  serviceMonitor:
    enabled: true

Optional migration job

The current container image still bootstraps Prisma on startup by default.

The chart also exposes an optional migrationJob for teams that want a separate Kubernetes job for explicit migration control:

migrationJob:
  enabled: true
  hook:
    enabled: true

Use that only if your rollout process is intentionally built around a separate migration step. If you want the application pods to stop using the image default bootstrap path, set command and args explicitly for the app container.

S3 request logging

Use workload identity or an existing secret.

Workload identity

serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/deltallm-s3-role
  automountServiceAccountToken: true

s3:
  enabled: true
  bucket: company-deltallm-logs
  region: us-east-1
  compression: gzip

Existing AWS credentials secret

s3:
  enabled: true
  bucket: company-deltallm-logs
  region: us-east-1
  existingSecret:
    name: deltallm-aws-creds
    accessKeyIdKey: aws-access-key-id
    secretAccessKeyKey: aws-secret-access-key

Optional hardening features

The chart includes:

  • podDisruptionBudget
  • topologySpreadConstraints
  • affinity
  • networkPolicy
  • serviceAccount.automountServiceAccountToken
  • startupProbe, readinessProbe, and livenessProbe
  • config and generated-secret checksum rollouts

If you enable networkPolicy, define ingress and egress rules that match your cluster and ingress-controller topology.

Validation

Lint the chart before deploying:

helm lint ./helm -f helm/values-eval.yaml \
  --set secret.values.masterKey=StrongMasterKey2026SecureValue99 \
  --set secret.values.saltKey=unique-salt-2026

helm lint ./helm -f helm/values-production.yaml \
  --set secret.existingSecret=deltallm-app-secrets \
  --set runtime.database.existingSecret.name=deltallm-runtime-secrets \
  --set runtime.redis.existingSecret.name=deltallm-runtime-secrets \
  --set ingress.enabled=true \
  --set 'ingress.hosts[0].host=llm-gateway.example.com' \
  --set 'ingress.hosts[0].paths[0].path=/' \
  --set 'ingress.hosts[0].paths[0].pathType=Prefix'

If subchart dependencies are not present locally yet, run:

helm dependency build ./helm

Troubleshooting

| Symptom                          | Likely cause                                            | Fix                                                                      |
|----------------------------------|---------------------------------------------------------|--------------------------------------------------------------------------|
| Pod exits during startup         | Missing master-key or salt-key                          | Set secret.values.* or secret.existingSecret                             |
| App cannot connect to PostgreSQL | Wrong external DB secret or bundled PostgreSQL disabled | Check runtime.database.* and subchart settings                           |
| App cannot connect to Redis      | Wrong Redis URL or missing Redis auth password          | Check runtime.redis.* or bundled redis.auth.password                     |
| Provider calls fail immediately  | Missing provider env vars                               | Add them via envFrom / env                                               |
| Config change did not roll pods  | External secret changed outside Helm                    | Restart the deployment or rotate through your secret operator            |
| Migration job fails              | DB not reachable or migration command not appropriate   | Inspect kubectl logs job/<release>-migrate and adjust migrationJob.args  |