Sizing
How the deployer maps a registered-user count (or t-shirt alias) to per-component CPU/memory/replicas, and how to change that decision after a cluster is live. Answers: "What size cluster do I need for N users?", "How are resources allocated across the ~30 services?", "How do I resize without redeploying from scratch?", "What does the xs/s/m/l/xl alias actually correspond to in registered-user counts?".
Audience: operators planning a fresh deploy, or right-sizing an existing one. Read this before you fill in platformSizing in instance.yaml, and again before you change it on a live deployment.
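As a minimal sketch of the field this page keeps referring to, the fragment below shows the two input forms mentioned above: a t-shirt alias or a registered-user count. The top-level placement of the key and the literal value syntax are assumptions for illustration only; the config topic remains the authority on the real schema.

```yaml
# instance.yaml (fragment). Hypothetical layout: only the field name
# platformSizing comes from this page; its position and value syntax are assumed.
platformSizing: m        # t-shirt alias, one of xs/s/m/l/xl

# Alternative form (assumed syntax): a registered-user count instead of an alias.
# platformSizing: 500
```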
What lives here
- The conceptual model: how a single platformSizing field fans out to ~57 Helm charts via sizing.yaml.gotmpl (step-03-generate-helmfile-values).
- The decisions: which preset is the right default at a given user count, and why.
- The runbooks: how to change sizing without losing user data.
What does NOT live here:
- Node pools, machine types, Terraform outputs: see infrastructure.
- The instance.yaml schema (where platformSizing is just one field among many): see config.
- The actual rendering of sizing.yaml.gotmpl: see deployment (step-03-generate-helmfile-values).
Pages
This topic is small but covers all three sub-themes: concept, decision, and runbook.
Concepts
- concept-sizing: registered-user tier keys, t-shirt aliases (xs/s/m/l/xl), the shared/sizing parametric model + GET /api/sizing/model, tier-driven Terraform infra inheritance (strict gate / lenient destroy), in-cluster substrate sizing (Galera/redis-proxy/HAProxy/Prometheus + OPENDESK_INCLUSTER_TIER_SIZING + OPENDESK_REPLICA_CAP), sizing.json override schema incl. the inCluster: block
Decisions
- decision-tier-presets-for-50-users: rationale for which preset is right at the low end (≈50 users)
Runbooks
- runbook-resize-cluster: step-by-step procedure to change platformSizing on a live deployment without losing data
Related topics
- infrastructure: node-pool count, machine types, and Terraform outputs that bound what sizing can ask for
- deployment: the GenerateHelmfileValues step is where platformSizing is resolved into sizing.yaml.gotmpl
- config: platformSizing is one of the input fields in instance.yaml
- apps: per-app resource expectations (e.g., Synapse worker counts, Nextcloud cache sizing) shape what each tier can run
When to add a page here
- A new sizing tier or preset is introduced (concept-* or decision-*)
- A capacity / scale incident has a sizing-related root cause (incident-*)
- A resize procedure for a specific component is documented, e.g., scaling Synapse workers, OX MariaDB connections (runbook-*)
- A decision is made on default sizing for a new deployment shape, e.g., test, staging, demo (decision-*)
- Empirical capacity numbers are recorded, e.g., observed memory headroom for a given tier at N users (concept-* or incident-*)
Anything covering provisioning of the underlying STACKIT infrastructure (node pools, machine types, persistent-volume sizes) belongs in infrastructure instead. Anything about the instance.yaml schema for sizing — validation, defaults, sanitization — belongs in config. Anything about the rendering pipeline that actually consumes platformSizing belongs in deployment.