Sizing

How the deployer maps a registered-user count (or t-shirt alias) to per-component CPU, memory, and replica settings, and how to change the chosen size after a cluster is live. This section answers: "What size cluster do I need for N users?", "How are resources allocated across the ~30 services?", "How do I resize without redeploying from scratch?", and "Which registered-user counts do the xs/s/m/l/xl aliases correspond to?"

Audience: operators planning a fresh deploy, or right-sizing an existing one. Read this before you fill in platformSizing in instance.yaml — and again before you change it on a live deployment.

What lives here

  • The conceptual model: how a single platformSizing field fans out to ~57 Helm charts via sizing.yaml.gotmpl (step-03-generate-helmfile-values).
  • The decisions: which preset is the right default at a given user count, and why.
  • The runbooks: how to change sizing without losing user data.

What does NOT live here:

  • Node pools, machine types, Terraform outputs — see infrastructure.
  • The instance.yaml schema (where platformSizing is just one field among many) — see config.
  • The actual rendering of sizing.yaml.gotmpl — see deployment (step-03-generate-helmfile-values).
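
The fan-out named under "What lives here" (one platformSizing value in, per-component CPU/memory/replicas out) can be pictured with a small sketch. It is illustrative only: the alias thresholds, the component rules, and the names resolve_users and fan_out are all hypothetical placeholders, not values or functions taken from the deployer or from sizing.yaml.gotmpl. The real numbers are documented in concept-sizing.

```python
from __future__ import annotations

import math
from dataclasses import dataclass

# HYPOTHETICAL alias table. The real xs/s/m/l/xl thresholds are
# documented in concept-sizing; these numbers are placeholders.
ALIAS_TO_USERS = {"xs": 50, "s": 250, "m": 1000, "l": 5000, "xl": 20000}


@dataclass(frozen=True)
class Resources:
    cpu_millicores: int
    memory_mib: int
    replicas: int


# HYPOTHETICAL per-component rules:
# (CPU per replica, memory per replica, registered users one replica serves).
COMPONENT_RULES = {
    "synapse": (500, 1024, 1000),
    "nextcloud": (500, 512, 500),
}


def resolve_users(platform_sizing: int | str) -> int:
    """Accept either a t-shirt alias or a positive registered-user count."""
    key = str(platform_sizing).strip().lower()
    if key in ALIAS_TO_USERS:
        return ALIAS_TO_USERS[key]
    if key.isdigit() and int(key) > 0:
        return int(key)
    raise ValueError(f"unknown platformSizing: {platform_sizing!r}")


def fan_out(platform_sizing: int | str) -> dict[str, Resources]:
    """Turn one sizing input into a resource record per component."""
    users = resolve_users(platform_sizing)
    result = {}
    for name, (cpu, mem, users_per_replica) in COMPONENT_RULES.items():
        # At least one replica, then one more per full block of users.
        replicas = max(1, math.ceil(users / users_per_replica))
        result[name] = Resources(cpu, mem, replicas)
    return result
```

Under these assumed rules, fan_out("m") and fan_out(1000) return the same records, which is the point of the alias: it is only a shorthand for a registered-user count.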

Pages

This section is small but covers all three sub-themes: concept, decision, and runbook.

Concepts

  • concept-sizing — registered-user tier keys, t-shirt aliases (xs/s/m/l/xl), the shared/sizing parametric model + GET /api/sizing/model, tier-driven Terraform infra inheritance (strict gate / lenient destroy), in-cluster substrate sizing (Galera/redis-proxy/HAProxy/Prometheus + OPENDESK_INCLUSTER_TIER_SIZING + OPENDESK_REPLICA_CAP), sizing.json override schema incl. the inCluster: block
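
As a rough mental model for the override and cap mechanisms listed in concept-sizing, the sketch below layers an override document on top of tier defaults and then clamps replica counts. Everything in it is an assumption made for illustration: the key names (replicas, memoryMiB, galera), the deep-merge semantics, and the reading of OPENDESK_REPLICA_CAP as a plain integer upper bound are not confirmed here. The actual sizing.json schema and the behavior of the environment variable are defined in concept-sizing.

```python
from __future__ import annotations

import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def read_replica_cap(env: dict[str, str]) -> int | None:
    """ASSUMPTION: the cap is an integer; unset or empty means no cap."""
    raw = env.get("OPENDESK_REPLICA_CAP", "").strip()
    return int(raw) if raw else None


def cap_replicas(sizing: dict, cap: int | None) -> dict:
    """Clamp every integer 'replicas' value to cap, at any nesting depth."""
    if cap is None:
        return copy.deepcopy(sizing)
    capped = {}
    for key, value in sizing.items():
        if isinstance(value, dict):
            capped[key] = cap_replicas(value, cap)
        elif key == "replicas" and isinstance(value, int):
            capped[key] = min(value, cap)
        else:
            capped[key] = value
    return capped


# HYPOTHETICAL tier defaults and override, including an inCluster block.
tier_defaults = {
    "synapse": {"replicas": 4, "memoryMiB": 2048},
    "inCluster": {"galera": {"replicas": 3}},
}
override = {
    "synapse": {"memoryMiB": 4096},
    "inCluster": {"galera": {"replicas": 5}},
}

cap = read_replica_cap({"OPENDESK_REPLICA_CAP": "3"})
effective = cap_replicas(deep_merge(tier_defaults, override), cap)
```

In this sketch the override wins over the tier default and the cap wins over both, so effective ends up with three Synapse replicas at 4096 MiB and three Galera replicas. Whether the real pipeline applies the cap before or after overrides is not stated on this page.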

Decisions

  • decision-tier-presets-for-50-users — rationale for which preset is right at the low end (≈50 users)

Runbooks

  • runbook-resize-cluster — step-by-step procedure to change platformSizing on a live deployment without losing data

Related sections

  • infrastructure — node-pool count, machine types, and Terraform outputs that bound what sizing can ask for
  • deployment — the GenerateHelmfileValues step is where platformSizing is resolved into sizing.yaml.gotmpl
  • config — platformSizing is one of the input fields in instance.yaml
  • apps — per-app resource expectations (e.g., Synapse worker counts, Nextcloud cache sizing) shape what each tier can run

When to add a page here

  • A new sizing tier or preset is introduced (concept-* or decision-*)
  • A capacity / scale incident has a sizing-related root cause (incident-*)
  • A resize procedure for a specific component is documented — e.g., scaling Synapse workers, OX MariaDB connections (runbook-*)
  • A decision on default sizing for a new deployment shape — e.g., test, staging, demo (decision-*)
  • Empirical capacity numbers are recorded — e.g., observed memory headroom for a given tier at N users (concept-* or incident-*)

Anything covering provisioning of the underlying STACKIT infrastructure (node pools, machine types, persistent-volume sizes) belongs in infrastructure instead. Anything about the instance.yaml schema for sizing — validation, defaults, sanitization — belongs in config. Anything about the rendering pipeline that actually consumes platformSizing belongs in deployment.