
Key Decisions

Every major technology choice has a reason. This page documents the thinking behind each one — useful for developers joining the project and for stakeholders evaluating the platform.


Why Go?

Go (also called Golang) is the programming language the platform is built in. It's used by companies like Uber, Cloudflare, Docker, and Google for backend systems that need to be fast and reliable.

| Concern | Old system (Strapi / Node.js) | New system (Go) |
| --- | --- | --- |
| Performance | Single-threaded, slows under load | Native concurrency — handles many requests simultaneously |
| Memory | ~500MB per server instance | ~20MB per server instance |
| Security | Thousands of npm packages (supply chain risk) | Minimal dependencies — smaller attack surface |
| Type safety | Runtime errors discovered in production | Errors caught at compile time, before deployment |
| Deployment | Node runtime + node_modules (~500MB) | Single static binary (~20MB) |
| HIPAA/GDPR | Hard to audit — hidden framework behaviors | Full auditability — no hidden code paths |
| Talent | Large Node.js pool | Large Go pool — used by major infrastructure companies |

Bottom line: Go is faster, cheaper to run, easier to audit, and more reliable for a healthcare platform at this scale.


Why PostgreSQL?

PostgreSQL is the primary database. It was chosen for one critical capability: Row-Level Security (RLS).

RLS is a PostgreSQL feature that enforces data isolation at the database level — not the application level. This means even if there were a bug in application code, the database itself would refuse to return one clinic's data to another clinic's user.

For a multi-tenant healthcare platform, this is not a nice-to-have — it's a requirement. RLS makes the compliance story much stronger because data isolation can be verified at the database level by auditors, not just taken on faith from application code.
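As an illustration, here is a minimal sketch of what RLS-scoped access can look like from Go, assuming pgx and an illustrative session variable named app.current_org_id; the platform's actual helper functions and session-variable names are described in the principals ADR further down this page.

```go
package example

import (
	"context"

	"github.com/jackc/pgx/v5/pgxpool"
)

// fetchAppointments reads appointments for one clinic. The table and column
// names are illustrative. Isolation is enforced by an RLS policy on the table,
// e.g. USING (org_id = current_setting('app.current_org_id')::uuid); the
// application only declares which tenant is asking.
func fetchAppointments(ctx context.Context, pool *pgxpool.Pool, orgID string) ([]string, error) {
	tx, err := pool.Begin(ctx)
	if err != nil {
		return nil, err
	}
	defer tx.Rollback(ctx) // no-op after a successful commit

	// set_config(..., true) scopes the variable to this transaction only.
	if _, err := tx.Exec(ctx, `SELECT set_config('app.current_org_id', $1, true)`, orgID); err != nil {
		return nil, err
	}

	rows, err := tx.Query(ctx, `SELECT id::text FROM appointments ORDER BY starts_at`)
	if err != nil {
		return nil, err
	}
	var ids []string
	for rows.Next() {
		var id string
		if err := rows.Scan(&id); err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	rows.Close()
	if err := rows.Err(); err != nil {
		return nil, err
	}
	return ids, tx.Commit(ctx)
}
```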

The database runs on AWS RDS — a dedicated PostgreSQL instance in a private network (VPC), not exposed to the internet. The platform originally used Neon (a serverless PostgreSQL provider) during early development but migrated to RDS for production because: RDS includes a HIPAA BAA at no cost (Neon requires a $500+/month Enterprise plan), provides dedicated resources instead of shared multi-tenant compute, and avoids a provider migration later since the scaling plan already required RDS at Phase 2. See AWS Infrastructure for the full setup.


Why Clerk for authentication?

Clerk is a third-party authentication service. Authentication (login, MFA, password reset, session management) is a commodity problem — but getting it wrong has severe consequences for a healthcare platform.

Clerk was chosen because:

  • SOC 2 Type II certified
  • HIPAA Business Associate Agreement (BAA) available
  • Handles MFA, social login, magic links, passkeys out of the box
  • Reduces the attack surface — auth code is the most security-sensitive code in any system

The split: Clerk handles authentication (is this person who they say they are?). The platform's own database handles authorization (what are they allowed to do in which clinic?). These are separate concerns handled by the right tool for each.


Why one service (not microservices)?

The previous system had two separate services — a main backend and a separate scheduling microservice. That caused:

  • Duplicate authentication code
  • Data syncing between two databases (and sync bugs)
  • Two deployments to coordinate
  • More things to break

The new system merges all clinical operations into one service. The only separate service is Telemetry API (Layer 2), which is separate for a legitimate reason: it handles high-volume time-series ingest (pose-frame batches at potentially 10k+ req/sec at peak, video lifecycle events) that operates at a completely different scale and rhythm from clinical CRUD operations. See Why telemetry is PG + S3, not ClickHouse for the storage stack and the reasoning behind keeping it as one service rather than splitting further.

Rule: Separate services only when the operational profile genuinely differs. Don't split for the sake of "microservices architecture."


Why Redis?

Redis is an in-memory data store used for four specific things:

  1. Booking holds — When a patient selects a time slot, it's held in Redis for a few minutes while they complete the booking. This prevents two patients from booking the same slot simultaneously without locking the database (see the sketch after this list).
  2. Rate limiting — Limits how many requests a user or IP can make per minute.
  3. Webhook idempotency — Tracks processed event IDs to prevent duplicate processing when Clerk or other services retry webhook deliveries.
  4. Query cache (P45) — Repository read paths wrap their DB queries with cache.Aside from internal/core/cache when the response is shared across multiple concurrent callers at the same scope (per-org, per-user, or platform-wide). Shared across all Core API instances — the first request from any instance populates the cache for the entire fleet. Pairs with the per-Next.js-process unstable_cache (P42) to compose a two-layer cache: 0 / 1 / 1+ Postgres queries (Next.js hit / Redis hit / Redis miss). 5-minute TTL on every wrapped read. See patterns.md → P45.
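A minimal sketch of the booking-hold item above, assuming the go-redis client; the key name and the 5-minute TTL are illustrative, not the platform's actual values.

```go
package booking

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

// HoldSlot tries to place a short-lived hold on a slot so two patients cannot
// book it at the same time. SET NX either creates the key (hold acquired) or
// leaves the existing one untouched (someone else already holds it).
func HoldSlot(ctx context.Context, rdb *redis.Client, slotID, patientID string) (bool, error) {
	key := fmt.Sprintf("booking:hold:%s", slotID)
	ok, err := rdb.SetNX(ctx, key, patientID, 5*time.Minute).Result()
	if err != nil {
		return false, err
	}
	return ok, nil // false means the slot is currently held by someone else
}

// ReleaseSlot drops the hold early, e.g. when the patient abandons the flow.
// Otherwise the TTL releases it automatically.
func ReleaseSlot(ctx context.Context, rdb *redis.Client, slotID string) error {
	return rdb.Del(ctx, fmt.Sprintf("booking:hold:%s", slotID)).Err()
}
```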

Why Daily.co for video?

Daily.co provides HIPAA-compliant video rooms via API. A video room is created automatically when an appointment is booked and expires after the appointment. The platform never stores video content — Daily.co handles recording if needed.

Alternatives were evaluated:

  • Twilio Video — more expensive, more complex
  • WebRTC DIY — requires STUN/TURN server infrastructure, significant ongoing maintenance
  • Whereby — less developer control over room lifecycle

Why AWS for deployment?

The platform originally launched on Railway (a Heroku-like PaaS) for its simplicity during early development. We migrated to AWS for production because:

  • Reliability: Railway had no published SLA and experienced frequent production issues. AWS publishes service-level SLAs across the stack we use (RDS Multi-AZ, ALB, Fargate, ElastiCache, S3, KMS), each at 99.95–99.99% — and the Multi-AZ posture in aws-infrastructure.md is what makes those SLAs meaningful in practice.
  • HIPAA compliance: Railway does not offer a Business Associate Agreement (BAA). AWS provides a BAA at no additional cost, covering every service in the stack — required for the eventual US-clinic path and good-hygiene baseline today.
  • We were already on AWS: S3 for file uploads, S3 for backups. Running compute and database on AWS too means one provider, one bill, private VPC networking between services.
  • No migration later: The scaling plan stays on AWS RDS through vertical scaling and read replicas (see scaling.md). Starting on AWS means zero provider migrations.

EU-native alternatives (Scaleway, Hetzner, OVH) were considered for the GDPR sovereignty narrative. Rejected because (a) the platform's mandatory sub-processors — Clerk, Daily.co, Anthropic — are US-based regardless of host, so the "100% EU stack" pitch is partial-only and the marketing benefit is marginal; (b) the team's operational experience and the AI-assisted-authoring footprint for AWS tooling are dramatically larger than for EU-native clouds, which matters for a small team; (c) AWS BAA + region presence + service maturity collectively cover both the GDPR Day-1 requirement and the future HIPAA path without re-platforming.

The full topology — ECS Fargate everywhere, RDS Multi-AZ for production / Aurora Serverless v2 for staging, Cloudflare at the edge — is in aws-infrastructure.md. The compute, database, and edge sub-decisions each have their own entries below.


Why a single AWS account for staging and production?

The new platform lives in the same AWS account as the legacy product, with new resources separated by name prefix (restartix-{env}-*) and IAM-tag conditions on aws:ResourceTag/Environment. The alternative — spinning up sibling member accounts (restartix-staging, restartix-production) under the existing Organization — was considered and rejected for now.

Why single-account:

  • The Org wrapper is already in place (single-account Org since 2023-11-24). It can grow into multi-account later if a trigger fires; no benefit to anticipating that today.
  • SES production sending status is account-scoped. The account currently has a 211,500/day quota and sandbox-exited reputation history. New member accounts would either start in sandbox (200/day, 24–48h support-ticket-driven exit per account, fresh reputation history) or require a cross-account SES routing pattern that complicates the email-channel architecture.
  • Legacy coexistence. The legacy product runs in the same account. New platform resources use restartix-{env}-* prefixes that don't collide with legacy bucket or role names. Tag-condition IAM (aws:ResourceTag/Environment on deploy-role policies) keeps the new platform's deploy roles out of legacy buckets without forcing an account boundary.
  • BAA accepted at Organization level. Auto-extends to any future member accounts if we later split.
  • Single billing surface and single console — smaller operational surface for a small team.

Multi-account triggers (when we'd reconsider):

  • A SOC 2 Type II Common Criteria CC6.1, ISO 27001, or HDS audit finding flags tag-based isolation as insufficient.
  • A specific clinic or insurer contract mandates "production data isolated to a separate AWS account."
  • Multiple teams need different blast-radius access patterns (e.g. a data-science team that should never touch production).
  • A US-clinic HIPAA-active rollout pushes account separation alongside the customer-managed-KMS migration.

What we accept: tag-condition IAM is a soft boundary. A misconfigured deploy policy could in principle let a staging-scoped role touch production resources. Per-env Terraform state, per-env IAM roles, and explicit ARN-prefix scoping in the policies make this a misconfiguration risk rather than a default risk. The migration cost to multi-account later is bounded (~1–2 weeks of focused work: snapshot/restore for stateful services, re-Terraform for stateless). The Terraform module layout in iac-layout.md applies identically in either model — we're not painting ourselves into a corner.


Why the .pro TLD for the platform domain?

restartix.pro is the platform's primary domain. Patient and staff subdomains (*.clinic.restartix.pro, *.portal.restartix.pro, console.restartix.pro) all live under it; clinic-side custom domains route through Cloudflare for SaaS as documented in the Cloudflare for SaaS section below.

Why .pro:

  • The .pro TLD was originally a sponsored TLD for credentialed professionals (doctors, lawyers, accountants). It opened to general registration in 2008 but retains the professional-audience signal in branding context.
  • RestartiX is a clinical platform serving healthcare professionals; .pro reads on-brand for the audience and is consistent with the platform's positioning as a tool for clinics, not direct-to-patient.
  • Already registered and on Cloudflare alongside the other zones — no separate registrar to operate.

Alternatives considered:

  • .app — HSTS-preloaded by browsers, modern-SaaS positioning. Cloudflare HSTS achieves the same TLS-only effect at the edge, so the browser-preload benefit is marginal. Would have required a new registration.
  • .eu — Already owned, strong EU positioning that complements the GDPR-first launch posture. Less specific to the credentialed-professional audience.
  • .com / .ro — Generic. restartix.com was unavailable; restartix.ro hosts the legacy product, so it isn't free.

What we accept: .pro is less recognizable to consumers than .com. For a B2B platform sold to clinics, the recognizability cost is small — clinic admins reach the platform via direct links from onboarding emails, not by typing the domain. Pre-production switching cost is bounded (~half a day during the 1E.3 + F-tier window for search-and-replace, ACM cert re-issue, DKIM re-verify); post-production switching cost is weeks of small breakages (custom-domain clinics, bookmarks, search results, integration webhook URLs).


Why ECS Fargate over App Runner?

Compute could go on either. ECS Fargate is the choice because three of the platform's runtime needs don't fit App Runner cleanly:

  • Scheduled tasks. cmd/audit-partition-roll, cmd/usage-quota-reset, cmd/usage-summary-rollup, and cmd/check-providers run on cron schedules. App Runner doesn't host scheduled jobs — the standard workaround is EventBridge Scheduler invoking ECS RunTask, which means operating ECS anyway.
  • TCP services. pgbouncer speaks the Postgres wire protocol on port 6432 (TCP), not HTTP. App Runner is HTTP-only. pgbouncer must run on Fargate regardless (see aws-infrastructure.md → Connection pooling).
  • Migration runners. Database migrations run as a one-shot task using DATABASE_DIRECT_URL to bypass pgbouncer (per P44). App Runner can't host one-shot tasks; ECS RunTask can.

Mixing App Runner for some services and Fargate for the rest means operating two compute platforms, two task-definition shapes, two scaling models, two log-group conventions. Consolidating on Fargate keeps the Terraform layout single-shaped and the operational footprint smaller.

What we accept: Fargate is more explicit than App Runner — task definitions, target groups, ALB listener rules, auto-scaling policies are all named in Terraform. App Runner hides those abstractions behind a "push a container" UX. We pay that complexity once in IaC; after that the model is "edit a value, terraform apply." App Runner's simplicity is real for the first deploy but turns into friction afterward — its abstractions don't compose with everything else we run.


Why Aurora Serverless v2 for staging, not production?

Staging is mostly idle. Production isn't. The two environments have opposite cost shapes, and Aurora Serverless v2's scale-to-zero capability solves the staging shape — not the production shape.

  • Staging idle = $0/hr compute. Aurora Serverless v2 with scale-to-zero (released late 2024) drops to 0 ACU when no traffic arrives for ~5 minutes. It wakes in 5–15 seconds when developers resume. RDS Multi-AZ would burn ~$110/mo on a database nobody's hitting most of the day; ASv2 with scale-to-zero hits the 1E.3 staging-cost target (<$100/mo idle) without compromising the wire protocol or extension surface.
  • Production wants predictable cost and predictable latency. RDS at a fixed instance class is straightforward to capacity-plan. ASv2's per-ACU billing is a sliding cost that can exceed the equivalent fixed instance under sustained load — the breakeven point depends on traffic shape, and the production shape is "always-on with diurnal peaks," not "mostly idle."
  • Same operational plane. Both are RDS-family services managed through the same AWS console, same Terraform provider, same Secrets Manager pattern, same golang-migrate story. Switching staging to RDS later or production to ASv2 is a parameter change, not a re-architecture.

What we accept: the 5–15s cold start when staging has been idle for more than 5 minutes. For a developer environment that pause is fine — a developer hitting an endpoint after lunch waits a few seconds the first time and never again. For a production environment serving live clinics, that pause would be a user-visible defect.


Why Cloudflare for SaaS over rolling our own ACM-on-ALB cert flow?

Per-tenant custom domains (a clinic registering physio-bucharest.ro) need automated TLS certificate provisioning + renewal for arbitrary tenant-controlled hostnames. The two paths:

  1. Build it on AWS. Call ACM RequestCertificate per domain → return validation CNAME to the clinic → poll for issuance → ModifyListener to attach the cert to the ALB. Handle ACM rate limits (per-region per-day issuance ceiling), ALB listener cert limits (default 25, raisable on request, hard ceiling around 25,000), renewal monitoring, and the failure-recovery playbook when a clinic's DNS misconfigures or the validation record disappears.
  2. Use Cloudflare for SaaS. The clinic adds a CNAME to a hostname we publish; Cloudflare provisions a Let's Encrypt cert, terminates TLS at the edge, and forwards to our ALB origin. We call Cloudflare's Custom Hostnames API to register hostnames; Cloudflare handles cert lifecycle, edge cache, and DDoS for that hostname.

Path 1 is real engineering work — 1–2 weeks of initial build, plus ongoing cert-state-machine maintenance and the on-call surface that comes with it. Path 2 is a configuration line plus ~200 lines of Go for the registration API. The cost difference at any reasonable scale is rounding error: Cloudflare for SaaS is $7/mo + ~$0.10 per active hostname.
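For a sense of what those ~200 lines of Go reduce to, here is a hedged sketch of the registration call, assuming Cloudflare's Custom Hostnames endpoint (POST /zones/{zone_id}/custom_hostnames). The request shape should be checked against Cloudflare's current API reference rather than taken from here.

```go
package customdomain

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
)

// RegisterHostname asks Cloudflare for SaaS to start serving TLS for a
// clinic-owned hostname. The clinic separately adds a CNAME pointing at the
// platform's published target; Cloudflare then validates and issues the cert.
// Endpoint and body shape are assumptions, not copied from the platform's code.
func RegisterHostname(ctx context.Context, apiToken, zoneID, hostname string) error {
	body, err := json.Marshal(map[string]any{
		"hostname": hostname,
		"ssl":      map[string]string{"method": "http", "type": "dv"},
	})
	if err != nil {
		return err
	}
	url := fmt.Sprintf("https://api.cloudflare.com/client/v4/zones/%s/custom_hostnames", zoneID)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Bearer "+apiToken)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("custom hostname registration failed: %s", resp.Status)
	}
	return nil
}
```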

Cloudflare is already in the architecture for DNS, CDN, and WAF — Cloudflare for SaaS is one more product on the same vendor, same dashboard, same DPA. It does not introduce "another scattered piece" by the criterion that ruled out splitting providers (see aws-infrastructure.md → Edge: Cloudflare).

What we don't get: if Cloudflare's SaaS layer is down, custom-domain traffic stops. Platform subdomains (*.clinic.restartix.pro, *.portal.restartix.pro) keep working through Cloudflare's main DNS + ALB path, so the platform isn't fully offline. Cloudflare's incident history at the SaaS-product layer is good but not perfect; we accept that single point of failure as the same one we already accept for DNS and WAF.


Why Terraform for infrastructure as code?

The IaC choice was one of three: Terraform (HCL), AWS CDK (TypeScript or Python), or Pulumi (TypeScript / Python / Go). Terraform won for one specific reason and one general one.

  • Specific: AI-assisted authoring quality. Terraform has the largest training-data footprint and the most mature AWS provider; LLM-generated Terraform is correct first-time at a noticeably higher rate than CDK or Pulumi. For a small team that will lean on AI for IaC iteration, this is the largest single factor.
  • General: HCL is simple, doesn't pull a Node or Python toolchain into infra changes, and stays out of the way. CDK's TypeScript matches the workspace language but the affinity benefit is marginal — infra code rarely shares anything with app code, and we don't want a TypeScript compiler error blocking a terraform apply.

What we rejected:

  • AWS CDK. Couples infrastructure to a single cloud at the abstraction layer (CDK is AWS-only by design). We have no plan to leave AWS, but Terraform's portability is a free option, not a cost. CDK also synthesizes through CloudFormation, which is slower to apply changes than Terraform's direct API calls and has a smaller blast-radius for state corruption.
  • Pulumi. Closest to CDK's programmatic model with multi-cloud support. Smaller community, fewer training-data-derived examples, smaller ecosystem of pre-built modules. The marginal benefits over Terraform don't pay for the community-size delta at this team size.

State backend. S3 bucket inside the same AWS account as the resources it manages, with native conditional-write locking (use_lockfile = true) — no separate DynamoDB table. Each env's backend.tf sets use_lockfile = true; Terraform writes a {state-key}.tflock object via If-None-Match: * to coordinate concurrent runs. No external state vendor (Terraform Cloud, Spacelift, Scalr) at this scale — the operational gain doesn't justify another DPA and another bill.

The Terraform module layout — environment separation, what's in modules/ versus envs/, how secrets and shared resources are wired — lives in iac-layout.md.


Why OpenAPI spec-first with oapi-codegen + openapi-typescript?

The wire format between the Core API and the three Next.js frontends needs to stay in sync, and the worst time to discover drift is in a 4xx loop on a customer demo. We chose spec-first generation early — at Layer 1.7 — so every endpoint added from Layer 2 onward inherits the contract instead of inventing its own.

  • The spec lives at apps/docs/openapi.yaml and is the source of truth for the wire format.
  • make openapi regenerates Go types into services/api/internal/core/server/openapi/spec.gen.go via oapi-codegen.
  • pnpm openapi regenerates TypeScript types into packages/api-client/src/generated.ts via openapi-typescript.
  • A drift test at internal/core/server/openapi/spec_test.go enforces three-way sync between routes.go, the OpenAPI spec, and the test's expectedRoutes table — adding a route in any one place without the others fails CI.
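A minimal sketch of the spec-side half of such a drift test, assuming the spec's paths are parsed with gopkg.in/yaml.v3; the real spec_test.go also walks routes.go, which is omitted here, and its path list and file locations differ.

```go
package openapi_test

import (
	"os"
	"testing"

	"gopkg.in/yaml.v3"
)

// specPath and expectedRoutes are illustrative; the real test lives at
// internal/core/server/openapi/spec_test.go and maintains its own table.
const specPath = "openapi.yaml" // adjust relative to the test's location

var expectedRoutes = map[string]bool{
	"/v1/me":            true,
	"/v1/organizations": true,
}

func TestSpecMatchesExpectedRoutes(t *testing.T) {
	raw, err := os.ReadFile(specPath)
	if err != nil {
		t.Fatalf("read spec: %v", err)
	}
	var spec struct {
		Paths map[string]any `yaml:"paths"`
	}
	if err := yaml.Unmarshal(raw, &spec); err != nil {
		t.Fatalf("parse spec: %v", err)
	}
	for p := range spec.Paths {
		if !expectedRoutes[p] {
			t.Errorf("path %s is in the spec but not in expectedRoutes", p)
		}
	}
	for p := range expectedRoutes {
		if _, ok := spec.Paths[p]; !ok {
			t.Errorf("path %s is expected but missing from the spec", p)
		}
	}
}
```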

The unresolved sub-question — whether to also generate request validators from the spec — is parked until the first endpoint with a non-trivial request body lands (Layer 2). Today's handlers do parse-then-typed-error; switching to spec-driven validation is a localized change behind httputil.NewValidationError.

Tradeoff. Types-only generation (today) means the Go server still hand-rolls request decoding and response shaping. Full spec-driven server-stub generation was rejected because (a) it forces a code-gen step into every handler change and (b) oapi-codegen server stubs add a kin-openapi runtime dependency we don't otherwise want in go.mod. Re-evaluate when Layer 12 picks a request-validation library.


Why telemetry is PG + S3, not ClickHouse

Telemetry's job is two things: ingest patient exercise-engagement events + pose-detection landmark frames from the Patient Portal and make those readable later by the same clinic's specialists, the patient themselves, and clinic-admin cohort views. An earlier version of this design called for a separate ClickHouse cluster and a separate compliance Postgres. Both are out. The actual workload fits a single Postgres instance plus S3 for replay blobs, and the ADR below explains why.

What's actually being stored

  • Per-rep pose aggregates (form_score, ROM, rep count, exercise phase breakdown): ~100 rows/session, server-computed at session_end from the landmark stream. Clinical record — lives in Postgres alongside the rest of patient_exercise_logs with full RLS, audit, classification.
  • Per-session video aggregates (watch %, buffer count, avg bitrate): 1 row/session in Postgres.
  • Replay blobs: full landmark stream for one session, ~3 MB binary float32 + gzip per 30-min session at 10fps. Fetched on demand by Clinic-app replay viewer.

Earlier specs assumed raw landmark frames had to be queryable by analytics aggregations across many sessions and clinics. They don't — the specialist's view is one session at a time for their patient, and dashboards are per-clinic cohort aggregates. That's a Postgres-shaped read pattern, not a ClickHouse-shaped one. Replay is fetch one blob by session_id, which is S3, not OLAP.

The four reasons ClickHouse is out

  1. No cross-tenant analytical workload exists today. Readers are specialist-in-clinical-context, patient-own-history, and clinic-admin cohort. All three are clinic-scoped. ClickHouse pays off when you're scanning billions of rows across many tenants for an aggregate query — a workload we don't have specced and likely won't build for years.

  2. PG carries to ~50k peak concurrent users with the right design. RLS-scoped queries (one org reads at most ~1M rep_metrics rows per quarter), monthly partitioning on event-shaped tables (P41), materialized views per dashboard, and a read replica are the lever stack. The legacy product has 20k+ users today; reaching the scale that genuinely needs ClickHouse is years out.

  3. ClickHouse has no native RLS. Working around that means an app-layer guard pattern (the CH equivalent of P47) plus a CI check, plus mandatory org_id = predicates on every query. That's a meaningful tax for a workload PG already handles.

  4. Operational footprint. Managed ClickHouse Cloud (or Aiven CH) is ~$3–6k/year baseline before any traffic. Self-hosted is real ops weight (backups, upgrades, monitoring, replication). Either way it's a second database with a different SQL dialect, different migration tooling, different backup story, different on-call knowledge. That's a permanent operational tax for a benefit we don't yet need.

What replaces ClickHouse

| Earlier design | Replaced by |
| --- | --- |
| ClickHouse pose_tracking_frames (every frame queryable) | S3 replay blobs + Postgres pose_rep_metrics (per-rep aggregates queryable, full frame stream replay-only via S3 fetch) |
| ClickHouse media_sessions + media_buffering_events | Postgres media_session_metrics + media_buffering_events (monthly partitioned per P41) |
| ClickHouse analytics_events (generic event firehose) | Removed entirely. App-internal analytics (automation execution counts, etc.) live in their own domain tables; cross-tenant aggregates aren't a current need. |
| Separate compliance Postgres for audit.audit_logs, security.security_events, privacy.* | Core API's existing audit_log (already monthly partitioned, RLS-scoped, retention-tiered). The duplication was an artifact of the pre-RLS-foundation design. |
| Pseudonymized actor IDs in CH | Plain principal_id + org_id in PG aggregates and S3 paths. Pseudonymization existed to make cross-tenant aggregates safe; we don't have those readers. |
| 0–3 consent ladder | Per-purpose consent flags (analytics, biometric) using the existing foundation per-purpose consent ledger (1B.9). |
| Generic POST /v1/analytics/track + POST /v1/errors/report | Removed. Three typed ingest endpoints only (/v1/pose/frames, /v1/media/events, /v1/sessions/{id}/end). Errors → off-the-shelf (Sentry-equivalent) when needed. |

Why a separate Telemetry service then, instead of a route group inside Core API?

Hard ingest isolation. Pose-frame ingest can hit ~10k req/sec sustained at peak; sharing a Go process and DB pool with Core API's transactional traffic risks starving the appointment/booking/forms workload. A separate service with its own Fargate task, its own scaling policy, and its own connection pool removes that coupling. The cost — duplicate auth plumbing — is paid via the signed session token pattern: Core API mints a short-lived token at exercise-session start, Telemetry API verifies signature only on the hot path. No Clerk JWT verification per pose-frame batch.

Separate service without separate databases is a deliberate combination: physical isolation where it matters (compute), shared substrate where it doesn't (storage).
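A hedged sketch of the signed session token pattern described above: Core API mints an HMAC-signed, short-lived token at exercise-session start, and Telemetry verifies it per batch with a single HMAC comparison, no external call. The claim names and encoding are illustrative assumptions, not the platform's actual format.

```go
package sessiontoken

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// Claims is an illustrative payload: enough to scope ingest to one exercise
// session in one org, with a short expiry.
type Claims struct {
	SessionID string    `json:"sid"`
	OrgID     string    `json:"org"`
	ExpiresAt time.Time `json:"exp"`
}

// Mint would be called by Core API at exercise-session start.
func Mint(secret []byte, c Claims) (string, error) {
	payload, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(payload)
	return base64.RawURLEncoding.EncodeToString(payload) + "." +
		base64.RawURLEncoding.EncodeToString(mac.Sum(nil)), nil
}

// Verify would be called by Telemetry API on every pose-frame batch.
func Verify(secret []byte, token string) (Claims, error) {
	var c Claims
	parts := strings.SplitN(token, ".", 2)
	if len(parts) != 2 {
		return c, errors.New("malformed token")
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[0])
	if err != nil {
		return c, err
	}
	sig, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return c, err
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(payload)
	if !hmac.Equal(mac.Sum(nil), sig) {
		return c, errors.New("bad signature")
	}
	if err := json.Unmarshal(payload, &c); err != nil {
		return c, err
	}
	if time.Now().After(c.ExpiresAt) {
		return c, fmt.Errorf("token expired at %s", c.ExpiresAt)
	}
	return c, nil
}
```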

The ClickHouse escape hatch

The redesign keeps ClickHouse reachable via swap-point interfaces (AggregateStore, AggregateQuery, SessionBuffer, LandmarkCodec, SignedSessionToken). At Tier 3 (~50k+ peak concurrent users) — if/when a genuinely cross-tenant analytical workload appears (Console-side platform-wide insights, research queries spanning many clinics) — we can dual-write a specific aggregate to CH and route a specific dashboard's reads to it without rewriting the ingest pipeline. The decision today is "don't operate ClickHouse for a workload that doesn't justify it." Not "never operate ClickHouse."

The full design (architecture, endpoints, scaling roadmap, swap-point interfaces) lives in /telemetry/index.md and /telemetry/api.md. The operational corollaries (which Postgres, which S3 bucket, sub-stack costs) live in aws-infrastructure.md → Telemetry sub-stack.


Why a monorepo?

Both services (Core API and Telemetry API) live in the same Git repository. This means:

  • Shared types are defined once and imported by both
  • A single go.mod — no dependency drift between services
  • Changes that span both services are a single commit
  • One CI/CD pipeline to maintain

The tradeoff is that a monorepo requires discipline — each service's code is in its own directory (internal/core/, internal/telemetry/) and they don't import from each other's internal packages.


Operational conventions

Smaller decisions that aren't worth their own section but should not get lost.

restartix_app bootstrap password rotation

The init migration creates the restricted DB role with the placeholder password 'changeme_in_production'. This is intentional for first-boot ergonomics; it must be rotated before any non-local environment connects. The runbook step on first deploy to a new environment is:

```sql
ALTER ROLE restartix_app PASSWORD '<value-from-secrets-manager>';
```

…and the application's DATABASE_APP_URL env var must point at the same value. The migration intentionally does not read from a secret because migrations run as part of the deploy pipeline and shouldn't depend on the secret store; rotation is a one-time, deploy-adjacent operation.

Client IP in request logs

Request logs include remote_addr. Under GDPR, IP addresses are pseudonymous PII. We log them deliberately because:

  1. audit_log.ip_address already records IPs as a feature (CLAUDE.md → Audit Logging), so the request log carrying the same value adds no new exposure.
  2. They are essential for debugging cross-tenant or rate-limit incidents — without them, investigating a 5xx spike or a credential-stuffing burst becomes guesswork.

The slog ReplaceAttr handler still redacts any field whose key matches the documented sensitive patterns (password, secret, token, apikey, authorization, cookie, session); IPs are out of that set on purpose.
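A minimal sketch of what such a ReplaceAttr redactor can look like with log/slog; the key list mirrors the patterns above, and the handler wiring is illustrative rather than the platform's actual logging setup.

```go
package logging

import (
	"log/slog"
	"os"
	"strings"
)

// sensitiveKeys mirrors the documented patterns; remote_addr is deliberately
// not in this set.
var sensitiveKeys = []string{"password", "secret", "token", "apikey", "authorization", "cookie", "session"}

// redact replaces the value of any attribute whose key matches a sensitive pattern.
func redact(groups []string, a slog.Attr) slog.Attr {
	key := strings.ToLower(a.Key)
	for _, s := range sensitiveKeys {
		if strings.Contains(key, s) {
			a.Value = slog.StringValue("[REDACTED]")
			return a
		}
	}
	return a
}

func NewLogger() *slog.Logger {
	return slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
		ReplaceAttr: redact,
	}))
}
```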

One audit row per logical event

A handler that completes a logical action emits exactly one audit_log row, even when the action touches multiple SQL statements. Examples:

  • POST /v1/organizations writes the org, clones system roles, and clones role permissions in one transaction. Only the org CREATE is audited; the clone INSERTs are system seeding downstream of the audited event.
  • The auth middleware's first-portal sign-up inserts a principal_organizations row and bumps humans.current_organization_id. Only the membership CREATE is audited; the current_organization_id write is bookkeeping for the same event.
  • An organizations.manage_members upsert that resolves to the same role as before bumps updated_at but emits no audit row — there is no semantic change worth recording.

The principle is "one event, one row." If a regulator later requires SQL-statement-level enumeration, the policy can tighten — but the default is what an auditor reading the row would call the action, not what the database engine did underneath.
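A hedged sketch of the convention in handler-transaction form, using illustrative table names rather than the platform's schema: the org INSERT, the system-role clones, and exactly one audit row share a transaction.

```go
package organization

import (
	"context"

	"github.com/jackc/pgx/v5/pgxpool"
)

// createOrg sketches "one event, one row": several SQL statements, one
// logical action, one audit_log row. Table and column names are illustrative.
func createOrg(ctx context.Context, pool *pgxpool.Pool, name, actorID string) error {
	tx, err := pool.Begin(ctx)
	if err != nil {
		return err
	}
	defer tx.Rollback(ctx) // no-op after a successful commit

	var orgID string
	if err := tx.QueryRow(ctx,
		`INSERT INTO organizations (name) VALUES ($1) RETURNING id::text`, name,
	).Scan(&orgID); err != nil {
		return err
	}

	// System seeding downstream of the audited event: no audit rows here.
	if _, err := tx.Exec(ctx,
		`INSERT INTO roles (organization_id, name, is_system)
		 SELECT $1::uuid, name, true FROM system_role_templates`, orgID); err != nil {
		return err
	}

	// Exactly one audit row for the logical action.
	if _, err := tx.Exec(ctx,
		`INSERT INTO audit_log (actor_id, action, resource, target_id)
		 VALUES ($1::uuid, 'CREATE', 'organization', $2::uuid)`, actorID, orgID); err != nil {
		return err
	}
	return tx.Commit(ctx)
}
```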


Encryption: cached data keys, not per-record envelope encryption

The standard AWS pattern for field encryption is per-record envelope encryption: every row gets its own data key (DEK), the DEK is encrypted by KMS, and both the encrypted DEK and the ciphertext are stored. Decrypting a row requires a kms:Decrypt API call to unwrap the DEK.

We chose a different shape: a small set of data keys (one per active key version), unwrapped from KMS once at startup, then cached in process memory. New encryptions seal under the active version's DEK; decryption picks the right DEK by reading the version byte from the ciphertext.

Why:

  • Phone numbers and API keys are read frequently (every appointment lookup, every integration call). Per-record envelope adds a KMS round-trip to every read — slow and expensive.
  • Cached DEKs still satisfy the threats KMS is meant to address: the master key never leaves AWS, every data-key fetch is logged in CloudTrail, IAM controls who can unwrap, rotation invalidates cached keys at the next deploy.
  • Per-record envelope is worth the cost when records are large (whole files), encrypted under per-tenant DEKs, or read rarely. None of those apply to phone numbers and API tokens.

Tradeoff: A compromised running process holds the unwrapped DEKs in memory. Per-record envelope leaves only one wrapped DEK in memory at a time. We accept that tradeoff because the threat model that matters here is stolen DB / leaked backup, not memory inspection of the running app — if an attacker has memory access to the API process, they have an open authenticated session and KMS isn't going to save us.

The helper that implements this is at services/api/internal/core/crypto/. Phase 1 (current, all envs including production) uses InMemoryKeyring loaded from ENCRYPTION_KEYS — in production that env value lives in the restartix/{env}/encryption Secrets Manager secret enveloped under a customer-managed KMS CMK, so the keyring is KMS-rooted via the SM fetch at boot. Phase 2 (deferred) switches to the kmsKeyring (direct per-data-key KMS calls + per-tenant key custody); the stub returns ErrNotImplemented until then. See aws-infrastructure.md → Direct-KMS keyring + BYOK (Phase 2+).


Encryption: version byte stamped inside the ciphertext blob

When ciphertext needs a key version (so rotation can roll forward without re-encrypting all old rows immediately), there are two ways to record it: a sibling _key_version column for every encrypted field, or a single byte prefixed to the ciphertext blob itself.

We chose the prefix-byte approach. The wire format is [1-byte version][12-byte nonce][ciphertext + GCM tag] in a single BYTEA column.

Why:

  • One column per encrypted field, not two. The data model already has many _encrypted columns (phone, emergency phone, API keys); doubling that with sibling version columns is bookkeeping with no upside.
  • The version travels with the data — there's no way to accidentally desync ciphertext and version by writing one without the other.
  • Migrating the wire format later is a localized change inside internal/core/crypto/envelope.go; nothing in higher layers cares about the layout.
  • 1 byte of overhead per field is invisible at our data sizes.

The single drawback — the version byte is unauthenticated (it's outside the GCM tag) — doesn't matter in practice: an attacker who flips the version byte makes Decrypt return ErrUnknownKeyVersion or auth-tag failure, never silent corruption.
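A minimal sketch of the wire format and the cached-DEK keyring together, under the assumption of a plain map of unwrapped 32-byte AES keys; the platform's actual helper in services/api/internal/core/crypto/ differs in detail (key loading, error handling, rotation).

```go
package envelope

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// Keyring sketches the cached-DEK model: a small map of unwrapped data keys
// by version, loaded once at startup, with one active version for new
// encryptions. Key custody (KMS, Secrets Manager) is out of scope here.
type Keyring struct {
	active byte
	keys   map[byte][]byte // version -> 32-byte AES key
}

var ErrUnknownKeyVersion = errors.New("unknown key version")

// Seal produces [1-byte version][12-byte nonce][ciphertext + GCM tag].
func (k *Keyring) Seal(plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(k.keys[k.active])
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	out := append([]byte{k.active}, nonce...)
	return gcm.Seal(out, nonce, plaintext, nil), nil
}

// Open reads the version byte to pick the DEK, then decrypts.
func (k *Keyring) Open(blob []byte) ([]byte, error) {
	if len(blob) < 1+12+16 {
		return nil, fmt.Errorf("ciphertext too short")
	}
	key, ok := k.keys[blob[0]]
	if !ok {
		return nil, ErrUnknownKeyVersion
	}
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce, ct := blob[1:1+gcm.NonceSize()], blob[1+gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}
```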


Encryption: no AAD (Additional Authenticated Data) in v1

GCM supports AAD — extra context (e.g., the row's org_id or the column name) that's bound into the authentication tag without being encrypted. AAD prevents a class of attacks where someone with DB write access cuts a ciphertext from one row and pastes it into another to read it under a different access path.

We left AAD out of the v1 helper. Why:

  • Adding AAD requires every call site to pass the same context on encrypt and decrypt, and getting it wrong silently corrupts all reads (auth-tag failure on every row). That's a real cost paid by every domain that adds an encrypted field.
  • The threat model AAD addresses (attacker with arbitrary DB write but not key access, who knows our schema layout, and gains by moving ciphertext between rows) is narrow given the protections we already have: RLS, audit logging, IAM-restricted DB access.
  • It's straightforward to add later — the helper signature can grow an optional context argument without changing the wire format.

Documented as a future hardening step. Revisit when a concrete threat model demands it (e.g., a feature stores per-tenant secrets and we want to bind ciphertext to org_id).


Why no AI-first architecture?

AI is treated as a feature-layer concern, not a foundation-level architectural choice. The platform's substrate (Postgres + RLS, modular Go monolith, repo/service/handler split, audit log, RBAC) is already a near-optimal base for AI features. There is no DB to swap, no service shape to change, no schema philosophy to invert.

The "AI-first" framing is misleading by analogy to mobile-first. Mobile-first changed real architectural choices (responsive layout, performance budgets, touch input) because mobile constrained the platform. AI does not constrain — it sits on top. What it requires from the foundation is a small set of provenance and consent hooks, plus a design culture applied per feature.

What we rejected:

  • A vector database (Pinecone, Weaviate, etc.) — pgvector is enabled at the foundation layer (1A.16); embedding columns become a per-feature column add, with no provider migration. See ai-agents-runtime.md → Why pgvector, not Qdrant for the cost analysis.
  • Embedding columns on entity tables now — added per-feature when a use case exists. Pre-adding them is dead schema weight.
  • Event sourcing as a paradigm — the audit log already captures the journey of every mutation. Switching to event sourcing would tax every write path for a benefit AI can already get from audit_log.
  • A streaming-first architecture (WebSocket/SSE everywhere) — request/response is the right default. Streaming infra is added per feature (live transcription, live form coaching) when one needs it.
  • A separate AI service or microservice — the modular monolith is better for AI tool-calling: one auth context, no cross-network hops, RLS enforced at the DB on every call. AI runs as another caller of the existing services.
  • Document-oriented schemas — normalized relational + JSONB where appropriate is fine. AI does well with joins; pre-denormalizing for AI buys nothing.

What the foundation does need (the AI hooks):

These are the architectural concerns AI raises that are expensive to retrofit later. They belong in Layer 1, not deferred to a feature. Note that not all of them are AI-specific — some of these would be in the foundation regardless; AI just makes their absence more painful.

  • Actor provenance in audit (Layer 1.24) — model_version, inputs_hash, confidence columns on the sibling audit_ai_provenance table (split off from audit_log so AI-feature schema churn doesn't pollute the core audit table's compliance contract). Without these, "doctor decided" and "AI suggested and doctor accepted" are indistinguishable in the record. EU MDR, GDPR Art. 22, and any future malpractice question all hinge on that distinction. These columns are AI-specific.
  • Consent has an "AI processing" purpose (Layer 4.5) — separable per clinic, separable per use case (triage vs. clinical decision support vs. transcription). Bolting this on later means re-consenting the migrated user base. AI-specific. (Originally scoped at Layer 1.15; moved to 4.5 once forms — 4.3 — were confirmed as the canonical content source for clinical-grade consent.)
  • Column-level data classification (implemented in Layer 1.25; data-classification.md, patterns.md P39) — a registry answering "is this column allowed to leave the tenant boundary, and for which inference targets." Useful for any egress (webhooks, exports, marketing email) — AI is one consumer among several. Cheap to define now, brutal to backfill across hundreds of columns later.
  • Principals as the root identity (Layer 1.24) — every actor (human, AI agent, service-account integration, system job) is a row in principals, with profile data in a sibling table (humans, agents, service_accounts). Audit, RLS, RBAC, and every domain reference principals.id. The platform direction toward AI agents as first-class actors is the heaviest of three converging reasons; system-actor gap and service-account extensibility are the other two. See the principals ADR below and product/ai-agents.md.
  • SOUP list has an AI/ML model category (Layer 1.16) — model_provider, model_version, validation_status. Required for medical device readiness regardless of AI-first framing; one row per model when it ships, not a schema change.

The shift that does happen — at the feature layer, not the foundation:

Every feature designed from now on assumes AI assistance is a possibility:

  • Capture drafts and revisions, not just the final saved record. Clinical notes, treatment plans, exercise prescriptions: "AI-drafted → human-edited → final" is a first-class flow, not a retrofit.
  • Capture rejection/correction signal. When a human overrides an AI suggestion, that override is the most valuable training data the platform produces — record it as structured data, not as a free-text comment.
  • Prefer structured fields over free text where reasonable. Structure is easier for AI to reason over; it is also easier for humans, search, audit, and reporting.
  • Design every API to be tool-callable — clear inputs, clear outputs, idempotent where possible. This is good API design that happens to be agent-ready; it does not require a separate "agent API."

Tradeoff: if a future AI-native paradigm (agent-resident state, learned schemas, something we cannot see today) genuinely requires a foundation rewrite, this position is wrong. We accept that risk because (a) no such paradigm exists today, (b) the cost of a hypothetical rewrite is bounded by the cost of writing a new system, and (c) the cost of a wrong AI-shaped foundation is paid every day until then. Revisit if a concrete clinical AI feature surfaces a foundation gap that the hooks above do not cover.


Why principals as the root identity?

The platform is built around AI agents as first-class actors — see product/ai-agents.md for the stated direction. The principal model is the foundation that supports it. The same model also closes a current gap (the system actor in audit_log) and pre-empts the service-account integration shape whenever it ships. All three reasons converge on the same shape; the AI-agents direction is the heaviest of the three.

The concrete gap today: the system actor

audit_log.user_id is FK to users(id). Every mutation needs a value. Today the schema has no clean answer for:

  • Trigger fan-out. When create_organization_companion_rows() fires on org INSERT and creates organization_settings / organization_billing / organization_entitlements rows, what user_id goes on the audit rows? The orchestrating handler's user, technically — but that conflates "the user who created the org" with "the system that fanned out the companions" and the audit story for the fan-out is muddied.
  • Scheduled jobs. GDPR retention sweeps, key rotation events, future cron-style cleanups — there is no user_id because there is no human in the loop. The current options are NULL (loses referential integrity), a fake "system user" row in users (carries Clerk fields, blocked status, and a fake email — wrong shape), or "skip the audit row" (violates P10).
  • External webhook handlers. Stripe sends a subscription.updated event; Daily.co sends a room.expired event; Twilio sends a message.delivered event. The platform mutates state in response. There's no human user — the event came from an external system. Same three bad options.
  • Public unauthenticated paths. A patient hits the resolve endpoint or starts a public booking flow; a row in audit_log records the request. Same problem.

These are not hypothetical. They are paths that exist or are scheduled to exist (1.16 rate-limit failures land on auth endpoints; 1.20 subscription webhooks; 4.5 consent withdrawal can fire from public paths). Each one currently leans on one of the three workarounds above.

A singleton 'system'-type principal — a real row with a well-known UUID — gives every one of these paths a real actor_id. That alone is the load-bearing reason.

The extensibility benefit: any future actor type

Once principals exists as a registry that audit/RLS/role-grants reference (instead of users directly), adding a new actor type later is a sibling table and an enum value — zero changes to audit_log, RLS helpers, or the role-grant pivot. AI agents land as a sibling agents table per the product direction. Service-account integrations (clinic-installed Zapier connectors with rotating API keys, scope-limited permissions, expiry) land as a sibling service_accounts table when the first integration ships. The shape is the same; only the per-actor-type profile columns differ.

Considered alternatives

| Approach | What it does | Why we rejected (or chose) it |
| --- | --- | --- |
| Just add audit_log.user_id NULL for system rows + an actor_type enum | Allow NULL when actor_type='system', CHECK enforces shape | Rejected. Solves the audit case for now but doesn't solve role grants (platform_roles.user_id, principal_organizations.principal_id equivalent) or domain created_by_* columns. When service accounts arrive, every actor column on every Layer 2+ domain gets the same NULL+CHECK retrofit, and the cross-domain refactor is the cost we were trying to avoid. The narrow fix becomes the wide refactor. |
| Add a type column to users | One table holds humans, service accounts, agents, system | Rejected. users is shaped for Clerk-authenticated humans (clerk_user_id, email, confirmed, blocked). Stuffing service accounts and agents in here means most columns are NULL for non-humans, every Clerk-aware code path has to filter WHERE type = 'human', and the meaning of "user" gets muddled. |
| Discriminated (actor_type, actor_id) columns | audit_log carries a type tag plus nullable type-specific FKs; each new actor type adds its own role-grant pivot table | Rejected. Every new actor type means a new nullable column on audit_log, a new pivot table for role grants, and a new branch in every "who has permission?" check. Cost compounds with each type, integrity is convention-only. |
| Mirror-table polymorphism (principals as a pointer alongside users) | A principals registry sits next to users; every user gets a companion principals row via trigger fan-out; audit_log and role grants reference principals.id, while users keeps its own primary key | Rejected. Two parallel identity tables for humans (a users row and a mirror principals row maintained by trigger), two valid ways to reference an actor in code (current_app_user_id() vs current_app_principal_id()), and per-PR judgment on which to use in every new RLS policy and domain table. The mirror is a workaround for "we already had a users table" — and we don't, because we're still pre-production with editable migrations. |
| Principals as the root identity | principals is THE actor identity table. users is renamed to humans and its primary key becomes principal_id (FK to principals.id). Future actor types are sibling tables: service_accounts, then agents if/when. No mirror, no trigger fan-out for identity, one column type for actors everywhere | Chosen. Solves the system-actor gap concretely (singleton system principal exists from seeding); makes the obvious next actor type (service accounts) a sibling table with no audit/RLS/role-grant changes; doesn't privilege humans in the schema. |

The shape (Layer 1)

What ships in Layer 1.24 — nothing more. Future actor types ship per-feature.

  • principals — root identity registry. Columns: id UUID PK, principal_type ('human' | 'agent' | 'service_account' | 'system'), parent_principal_id UUID NULL (delegation chain), created_at, deleted_at. There is no organization_id column on principals — tenant binding for non-human actors lives on the actor-type child tables (agents.organization_id, service_accounts.organization_id, both NOT NULL). Putting the column on the children keeps the schema's column shape aligned with the actor's nature: humans never accidentally get an org binding, agents and service accounts always do — no trigger needed to enforce the asymmetry. RLS: SELECT for self only; cross-tenant principal visibility runs through the actor-type child tables. Mutations only via AdminPool / trigger fan-out (no AppPool write policy — same pattern that protects roles / permissions).
  • humans — externally-authenticated human profile. Replaces today's users table. Primary key is principal_id (FK to principals.id, ON DELETE CASCADE). Columns: provider_subject_id (provider-agnostic external identity — Clerk JWT sub today, any future provider's equivalent), email, confirmed, blocked, last_activity, preferred_language. Provider-specific code paths live in the auth/<provider> verifier subpackage; the rest of the platform sees only provider_subject_id.
  • principal_organizations — membership/role grants (renamed from user_organizations), with principal_id (FK to principals.id). Same columns otherwise (role_id, last_used_at, last_activity_at, invited_at, invited_by (also FK to principals.id), accepted_at). One role per principal per org by convention — splitting "membership" from "role grant" into two tables buys nothing in expressiveness, costs an extra join on every permission check, and last_used_at/invited_at are activity properties of the membership-with-role, not of either piece in isolation.
  • audit_log.user_id becomes audit_log.actor_id (FK to principals.id, NOT NULL — every audit row has a real actor, no NULL/CHECK gymnastics), plus a denormalized actor_type column (same enum values as principal_type) for at-a-glance reads without a join. Audit also gains AI-provenance columns at the same time per the AI-first ADR above: model_version TEXT NULL, inputs_hash BYTEA NULL, confidence NUMERIC(4,3) NULL (range-checked 0..1). The provenance columns are independent of the principal model — they would land regardless — but bundling them in the same migration avoids a second audit_log migration the same week.
  • platform_roles (superadmin) — user_id becomes principal_id (FK to principals.id) with a CHECK constraint that the principal is type='human'. granted_by becomes granted_by_principal_id with the same CHECK. Superadmin stays a human-only concept by constraint, not by table structure — service accounts and agents do not get superadmin grants.
  • Existing actor columns on non-actor tables — subscription_overrides.granted_by_user_id / revoked_by_user_id, plan_versions.changed_by — all become *_principal_id (FK to principals.id). Service accounts can grant overrides too; the schema doesn't fight that.
  • RLS helpers: current_app_user_id() is renamed to current_app_principal_id(). New companion current_app_principal_type(). current_app_has_permission(resource, action) continues to work — internally it joins through principal_organizations. There is no current_app_user_id() after the refactor. One way to do things, no per-PR judgment, no silent-bug class from picking the wrong helper.
  • Auth chain: split into three composable layers in internal/core/auth/ (token verification — provider-agnostic, with auth/clerk as the implementation today) and internal/core/principal/ (per-actor-type Resolver + the request-scoped Subject + SubjectLoader). The middleware that composes them is middleware.Authenticate(verifier, resolver, loader). Provisioning a new human on first sight is one transaction (principals + humans), atomically. RLS session vars are set as before — app.current_principal_id, app.current_actor_type = 'human'. The legacy app.current_user_id is removed.
  • Trigger functions are renamed: clear_user_organizations_on_superadmin_grant → clear_principal_organizations_on_superadmin_grant; clear_current_org_on_membership_delete keeps its semantics but updates humans.current_organization_id (still humans-only — non-humans don't have a current-org concept).
  • Pseudonymization (internal/shared/pseudonym/) hashes principal IDs for any cross-tenant analytics surfaces that may need it. Same SHA-256 shape, different input column. Note: Telemetry (Layer 2) does NOT use pseudonymization for its own storage — readers are clinic-scoped (see Why telemetry is PG + S3, not ClickHouse). The helper remains for any future genuinely cross-tenant aggregate egress.
  • A singleton 'system'-type principal is seeded with a well-known UUID. Trigger fan-out paths, scheduled jobs, and external webhook handlers (Stripe, Daily.co, Twilio) attribute their audit rows to this principal. This is the concrete fix for the system-actor gap.
  • Future sibling tables — service_accounts (when the first integration with rotating API keys ships) and agents (if and when autonomous AI agents become a concrete product feature) — are described in data-model.md as future-shape notes only. They do not ship in Layer 1.

Why not the narrow fix instead

The smallest fix that solves the system-actor gap alone is "allow audit_log.user_id NULL when actor_type='system' + add the actor_type enum." About one migration edit and a few Go field additions. It works for the audit case in isolation. The reason it's not enough is that the same gap exists across every actor column on every other table: platform_roles.user_id when an org-creation trigger needs to record who granted; subscription_overrides.granted_by_user_id when a billing automation grants an override; every Layer 2+ domain's created_by_user_id the first time a non-human action creates a row; AI agent writes once those features ship.

The narrow fix solves one column. The wide fix (principals as root) solves the column class. Pre-prod is the only window where the wide fix costs the same as the narrow one (mechanical refactor in a bounded set of files). Once Layer 2-12 ships and FKs to users(id) are spread across 50+ domain tables, the wide fix becomes a cross-cutting migration on a live schema.

Why root identity, not mirror

The mirror-table version (the rejected fourth option above) preserves the existing users table and adds principals as a separate pointer, with a trigger fan-out maintaining one principals row per users row. It looks like the safer choice because nothing existing changes. It isn't safer — it's the workaround, and the workaround compounds every time a Layer 2+ feature ships:

  • Two valid identity columns on every domain table (created_by_user_id or created_by_principal_id — which?). Wrong choice is a silent bug that surfaces only when the first non-human actor exists.
  • Two valid RLS helpers (current_app_user_id() vs current_app_principal_id()). Same silent-bug class on every new policy.
  • Two writes per identity creation (one to users, one to principals via trigger). Two things to keep in sync; one more failure mode in the auth path.
  • A users row and a parallel principals row for every human — the database explicitly says "humans are special" while the application model is trying to say "humans are one profile shape sitting under a principal." The two views disagree.

The root-identity approach removes the choice. There is one identity column (principal_id or its sibling-specific variant human_id), one RLS helper (current_app_principal_id()), and one identity row per actor.

Cost frame

The mechanical scope is ~84 files: 10 migration files, 27 Go backend files, 18 frontend files, 20 docs, 9 tooling files. Most of it is sed (rename user_id → principal_id, users → humans, current_app_user_id → current_app_principal_id) plus tests passing. None of it is design work — the design is in this ADR; the spec is in 1.24.

The honest framing is not "84 files is the cost we're paying." It's "84 files is the cost of doing this rename in a pre-prod window where every consumer is in this monorepo, vs. the same rename later as a coordinated migration across a live production schema with FKs in 50+ Layer 2-12 domain tables." The work is the same; the only question is when it's paid. Pre-prod, the answer is mechanical. Post-prod, it's a cross-cutting project.

Tradeoffs (kept honest)

Every read path through audit_log that needs the actor's underlying details (e.g., the human's name) does one extra join (audit_log → principals → humans) compared to the pre-refactor world's direct audit_log → users. Mitigated by denormalizing actor_type onto audit_log so most queries answering "what kind of actor was this?" don't need the join. Existing code paths that today read users directly become reads on humans — same query shape, renamed table. New code paths that reference an actor (audit writes, role grants, RLS helpers) reference principals directly without going through humans — symmetric across actor types.

The renaming churn (legacy UserContext → request-scoped principal.Subject, user_id → principal_id across the API surface) breaks any consumers documented in OpenAPI today. There are none in production yet — that's why this lands now and not later. The follow-up cleanup that completed the Go-side shape (verifier ↔ resolver ↔ subject split, slim human.Repository, single Subject per request) landed shortly after 1.24 in the same pre-prod window — see "Auth chain shape" below.

Scope kept tight

The principal model is the abstraction. Specific sibling tables (agents, service_accounts) ship per-feature when those features are concretely defined — not in this layer. No parent_principal_id for delegation; that column lands when a real delegation feature ships per P36. AI provenance columns on audit_log (model_version, inputs_hash, confidence) are bundled in the same migration only to avoid a second audit_log change in the same week — they'd land regardless.

Implementation lands in Layer 1.24. At the end of Layer 1 the principals table contains rows of type 'human' (one per externally-authenticated account) and exactly one row of type 'system' (the singleton).

Auth chain shape (Go-side completion)

Layer 1.24 introduced the principal model at the schema layer; the Go side initially kept the legacy users → humans aggregate shape (a fat human.Human carrying memberships, platform roles, patient orgs, has-patient-profile, plus a parallel auth.PrincipalContext with the same data). Two parallel actor representations per request — measurable cost on /v1/me, soft maintenance hazard everywhere else.

The follow-up cleanup completes the migration on the Go side and locks in three concerns by package:

  • internal/core/auth/ — token verification only. auth.Verifier interface + provider implementations. auth/clerk/ is the Clerk JWT verifier today; future providers (signed agent JWTs, OIDC, service-account bearer tokens) ship as sibling subpackages. The platform never imports a provider directly.
  • internal/core/principal/ — the actor model from the runtime perspective. Principal (the row), per-actor-type Resolver (the boundary that maps verified subject → principal row, with auto-provision or admin-issued semantics depending on actor type), Subject (the canonical request-scoped actor view, replaces PrincipalContext), SubjectLoader (one-shot cross-domain reads — memberships, patient orgs, has-patient-profile, platform roles, per-(principal,org) permissions). Permission catalog constants live here too — they're principal-state concerns.
  • internal/core/domain/{human,agent,serviceaccount}/ — per-actor-type row CRUD. human.Human is now a bare humans-row (provider_subject_id, email, confirmed, blocked, last_activity, preferred_language, timestamps) — no membership / platform role / patient-org fields. The Resolver implementation for that actor type lives alongside the row repo (human.Resolver implements principal.Resolver).

The middleware that composes them — middleware.Authenticate(verifier auth.Verifier, resolver principal.Resolver, loader *principal.SubjectLoader) — is provider- and actor-type-agnostic. Different routes can pair different (verifier, resolver) tuples: human-staff routes use the Clerk verifier + human resolver; future agent routes will use a signed-JWT verifier + agent resolver; the loader is shared. The DB layer (principal_is_active in 000002) already dispatches on principal type for activation gating, so adding a new actor type is a Go-side concern only — no migration.

The invariant: per request, the actor is loaded exactly once. Handlers and services read from Subject (via principal.SubjectFromContext) and never re-fetch the data it carries. Reads that need humans-row presentation fields not on Subject (e.g. last_activity for /v1/me) make one targeted call against human.Repository for the bare row — no aggregate. The duplicate-load anti-pattern that the old shape had on /v1/me (and to varying degrees everywhere else) is gone by construction.
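To make the split concrete, here is a minimal sketch of the three concerns and the composing middleware, assuming simplified signatures. Only the names Verifier, Resolver, Subject, SubjectLoader and the verify → resolve → load order come from the design above; the field names, the interface shape of SubjectLoader, and the error handling are illustrative, not the actual internal/core code.

```go
// Sketch only — names follow the package split above, shapes are assumptions.
package sketch

import (
	"context"
	"net/http"
)

// auth.Verifier: token verification only (provider-specific implementations).
type Verifier interface {
	Verify(ctx context.Context, token string) (subjectID string, err error)
}

// principal.Resolver: maps a verified subject to a principal row, with
// per-actor-type provisioning semantics.
type Resolver interface {
	Resolve(ctx context.Context, subjectID string) (principalID string, err error)
}

// principal.Subject: the canonical request-scoped actor view.
type Subject struct {
	PrincipalID string
	OrgIDs      []string
	Permissions map[string][]string // per-org permission codes (illustrative)
}

// principal.SubjectLoader: one-shot cross-domain load of the Subject.
// (Shown as an interface here for brevity; the text describes a concrete type.)
type SubjectLoader interface {
	Load(ctx context.Context, principalID string) (*Subject, error)
}

type ctxKey struct{}

// Authenticate composes verifier → resolver → loader. It is provider- and
// actor-type-agnostic, so routes can pair different (verifier, resolver) tuples.
func Authenticate(v Verifier, r Resolver, l SubjectLoader, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		token := req.Header.Get("Authorization") // token extraction simplified
		subjectID, err := v.Verify(req.Context(), token)
		if err != nil {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		principalID, err := r.Resolve(req.Context(), subjectID)
		if err != nil {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		subj, err := l.Load(req.Context(), principalID)
		if err != nil {
			http.Error(w, "internal error", http.StatusInternalServerError)
			return
		}
		// Loaded exactly once per request; handlers read it from context.
		next.ServeHTTP(w, req.WithContext(context.WithValue(req.Context(), ctxKey{}, subj)))
	})
}
```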

First-sight audit failure is best-effort, not 500 — documented exception

CLAUDE.md's audit invariant says "audit failures are 500s (the platform must not commit a mutation it cannot prove happened)." The first-sight provisioning audit row (recordPrincipalCreate in middleware/auth.go) deliberately deviates: if the audit insert fails, the request continues; the new principal stays provisioned; the failure is logged at error level.

The reasoning: the alternative is locking new users out of the platform during any audit_log unavailability window (Postgres degraded, partition rotation race, etc.). For a user-facing authentication path, "you can't sign in because our compliance log is down" is worse than "your sign-in audit row is missing — recoverable from the provider's user-creation event in the worst case." Every other audited mutation (handler-emitted, RLS-protected, transactionally bound) keeps the strict-fail contract — the exception applies only to the auth chain, where the principal is itself the actor and locking them out has no recovery path short of operator intervention.

When provider-side webhooks ship (the apps/docs/features/auth/clerk-integration.md planned section), the audit-row recovery becomes automatic: a missed first-sight CREATE can be reconstructed from the corresponding user.created provider event. Until then, the recovery path is "read the slog.Error and reconcile manually" — acceptable because the failure mode is rare and the data still exists in the principals + humans rows.
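A minimal sketch of the exception's shape, assuming hypothetical PrincipalStore and AuditLog interfaces; the real logic lives in recordPrincipalCreate in middleware/auth.go and differs in detail.

```go
// Sketch: first-sight provisioning audit is best-effort, not a 500.
package sketch

import (
	"context"
	"log/slog"
)

// Illustrative interfaces — not the real middleware types.
type PrincipalStore interface {
	CreateHumanPrincipal(ctx context.Context, subjectID string) (principalID string, err error)
}

type AuditLog interface {
	RecordPrincipalCreate(ctx context.Context, principalID string) error
}

// provisionFirstSight creates the principal, then writes the first-sight audit
// row best-effort: an audit failure is logged at error level, not returned.
func provisionFirstSight(ctx context.Context, store PrincipalStore, audit AuditLog, subjectID string) (string, error) {
	principalID, err := store.CreateHumanPrincipal(ctx, subjectID)
	if err != nil {
		return "", err // provisioning itself still fails hard
	}
	if err := audit.RecordPrincipalCreate(ctx, principalID); err != nil {
		// Deliberate deviation from the strict-fail audit contract (see above):
		// the sign-in proceeds; the missing row is recoverable later.
		slog.ErrorContext(ctx, "first-sight audit insert failed; principal stays provisioned",
			"principal_id", principalID, "error", err)
	}
	return principalID, nil
}
```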


Why a column-level data classification?

Every Layer 2+ feature that pushes data to an external service — AI inference, analytics, third-party integrations, GDPR data exports, marketing campaigns, webhooks — has to answer "what data is allowed to leave the tenant?" Without a registry, each feature makes ad-hoc decisions and each new compliance question becomes a code audit across hundreds of columns. Worse, the answer drifts: a column added for one feature gets quietly included in another's payload because the developer didn't know it was sensitive.

The registry is two things: a class for each column (what kind of data it is — drives retention, encryption, RLS expectations) and a list of egress targets for which it is explicitly allowed (where it can flow externally). Default is "block." A column missing from the registry, or with an empty target list, cannot leave the tenant.

We considered four shapes:

| Approach | What it is | Why we rejected (or chose) it |
| --- | --- | --- |
| Markdown doc only | A doc lists every column and class | Rejected. Drifts from reality fast, no enforcement, easy to forget on schema changes. |
| COMMENT ON COLUMN | Postgres comments attached to columns | Rejected. Comments survive migrations but are hard to query, easy to skip, and the class taxonomy isn't enforced at PR time. |
| Registry table | A data_classifications table holding (table, column, class, targets) rows | Rejected. Adds another table to maintain; drift is still possible if a column changes without updating the registry. Same drift risk as markdown but with extra schema overhead. |
| Markdown doc + CI check + runtime helper | Markdown registry as source of truth + CI script comparing schema vs. registry on every PR + Go runtime helper that egress paths consult before sending data outside the tenant | Chosen. Readable + enforced. CI rejects PRs that add or rename columns without a classification entry. Runtime helper parses the same doc at startup so egress paths can never use an unclassified column. |

Class taxonomy (initial):

  • public — no protection (org name, slug)
  • org_internal — settings, configuration
  • pii_basic — names, emails, phones, addresses, contact info (plaintext + layered defense)
  • pii_regulated — national IDs, SSNs, CUI (column-encrypted)
  • clinical — diagnoses, treatments, notes
  • clinical_sensitive — mental health, sexual health, HIV (GDPR Art. 9 special category)
  • auth_secret — Clerk IDs, tokens, API keys (encrypted or hashed)
  • audit_only — IPs, user agents (pseudonymous PII per GDPR)
  • system_metadata — timestamps, foreign keys, internal IDs

The encryption posture per class is enforced mechanically by cmd/check-classification — see Why most PII is plaintext (and what isn't).

Each class implies retention, encryption, RLS, and audit expectations — defined alongside the registry in data-classification.md. Adding a class is a deliberate change, not casual.

Egress target taxonomy (initial):

bulk_export (GDPR Art. 20 patient data portability), analytics_internal (cross-tenant aggregate egress placeholder; not active today since Telemetry's actual readers are clinic-scoped — see Why telemetry is PG + S3, not ClickHouse), webhook_egress (Cat C outbound webhooks), marketing_email (campaigns), support_export (break-glass exports), ai_clinical_drafting and ai_admin_summarization (placeholders — light up when the first AI feature ships). Targets extend per-feature.

How callers use it.

A Go runtime helper parses the registry at startup. Every egress path calls into it:

  • classification.AllowedFor("appointments", "ai_clinical_drafting") returns the column names allowed for that target.
  • classification.Filter(record, "ai_clinical_drafting") returns the record with only allowed columns.
  • The default — column not in registry, or no matching egress target — is "block." There is no way for an unclassified column to leak.
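A rough sketch of what the runtime helper could look like. The method names AllowedFor and Filter come from the list above; the Registry type, the explicit table argument on Filter, and the internal map shape are assumptions made to keep the example self-contained (the doc's calls are package-level, here they are methods on a Registry value for brevity).

```go
// Sketch of the classification helper; internals are illustrative.
package classification

type key struct{ table, column string }

// Registry maps (table, column) to the egress targets explicitly allowed.
type Registry struct {
	allowed map[key]map[string]bool
}

// AllowedFor returns the column names of a table allowed for an egress target.
func (r *Registry) AllowedFor(table, target string) []string {
	var cols []string
	for k, targets := range r.allowed {
		if k.table == table && targets[target] {
			cols = append(cols, k.column)
		}
	}
	return cols
}

// Filter returns only the fields of record allowed for the egress target.
// Default is block: a column missing from the registry never passes through.
func (r *Registry) Filter(table string, record map[string]any, target string) map[string]any {
	out := make(map[string]any)
	for col, val := range record {
		if targets, ok := r.allowed[key{table, col}]; ok && targets[target] {
			out[col] = val
		}
	}
	return out
}
```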

Why now, not when first egress lands.

Every Layer 2+ feature adds tables. Without the registry in place from Layer 1, those tables ship unclassified. Backfilling the classification when the first egress feature arrives means classifying ~200 columns in one PR — high coordination cost, easy to get wrong, no audit trail of who decided what. With the registry in place from Layer 1, every Layer 2+ migration includes the classification entry as a one-line addition, decided by the author who knows the column best, reviewed at the same PR. CLAUDE.md's foundation-discipline rule applies: pay the cost once at the layer, not every time a feature builds on top.

Tradeoff. Each new column adds a registry entry as a one-line task. CI rejects PRs without it. Cost is a few seconds of thought per column at PR time. Failure mode: the registry parses at startup — if it is malformed, the API refuses to start. Mitigated by CI catching the malformation before merge.
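For illustration, a sketch of the drift check's core loop, assuming the registry has already been parsed into a set of table.column keys; the real cmd/check-classification may be structured quite differently.

```go
// Sketch of the PR-time drift check: every column in the live schema must
// have a classification entry, or CI fails the PR.
package main

import (
	"context"
	"database/sql"
	"fmt"
	"os"

	_ "github.com/jackc/pgx/v5/stdlib"
)

func main() {
	db, err := sql.Open("pgx", os.Getenv("DATABASE_URL"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer db.Close()

	classified := loadRegistry() // hypothetical: parses data-classification.md into table.column keys

	rows, err := db.QueryContext(context.Background(), `
		SELECT table_name, column_name
		FROM information_schema.columns
		WHERE table_schema = 'public'`)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer rows.Close()

	missing := 0
	for rows.Next() {
		var table, column string
		if err := rows.Scan(&table, &column); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		if !classified[table+"."+column] {
			fmt.Printf("unclassified column: %s.%s\n", table, column)
			missing++
		}
	}
	if missing > 0 {
		os.Exit(1) // fail the PR until the registry entry is added
	}
}

func loadRegistry() map[string]bool { return map[string]bool{} } // placeholder for the markdown parser
```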

Implemented in Layer 1.25. Registry, CI check, and runtime helper all in place; default is block. First runtime consumers are Layer 8 (webhooks, marketing email) and the first AI feature in a later layer.


Why no session invalidation on role change?

The frontend may hold a stale view of a user's permissions for a few seconds after an admin changes their role. We treat that gap as acceptable rather than wiring Clerk session-revocation into the role-change handler.

What's actually happening on the backend. Auth-provider JWTs carry identity (provider_subject_id), not permissions. (Today the verifier is auth/clerk; the column is provider-agnostic.) The RequireOrganizationScope middleware loads role_permissions from the database on every authenticated request and sets the current_app_principal_id() / current_app_org_id() RLS session variables for the per-request transaction — see internal/core/middleware/organization.go (ResolveOrganizationContext → RequireOrganizationScope → RequirePrincipalRLS, all in the same file post-1.24). A role change therefore takes effect on the user's next API call. The backend is always authoritative; there is no in-memory permission cache that would survive across requests.
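A minimal sketch of the per-request scoping, assuming pgx and set_config-backed session variables. The GUC names (app.principal_id, app.org_id) are assumptions; only the helper functions current_app_principal_id() / current_app_org_id() are named above, and the role_permissions load is omitted to keep the sketch short.

```go
// Sketch: per-request transaction that binds the RLS session variables.
package sketch

import (
	"context"

	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgxpool"
)

func withOrgScope(ctx context.Context, db *pgxpool.Pool, principalID, orgID string,
	fn func(pgx.Tx) error) error {

	return pgx.BeginFunc(ctx, db, func(tx pgx.Tx) error {
		// set_config(..., true) scopes the value to this transaction only, so
		// RLS policies see the caller for exactly one request and nothing more.
		if _, err := tx.Exec(ctx,
			`SELECT set_config('app.principal_id', $1, true),
			        set_config('app.org_id', $2, true)`,
			principalID, orgID); err != nil {
			return err
		}
		return fn(tx)
	})
}
```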

Where the staleness lives. In the browser. The frontend caches /v1/me (current org + permission set) and uses it to decide which UI affordances to render. After a role change:

  • The Clinic / Portal app keeps showing old affordances ("Edit", "Members", etc.) until /v1/me is re-fetched on next page navigation (typically seconds).
  • A user clicking an affordance they no longer hold gets a 403 from the API. The frontend handles 403 by re-fetching /v1/me and re-rendering. The action did not succeed; data is not at risk.

This is a UX glitch, not a security breach.

Why not call Clerk's session-revoke API on every role change?

  • Costs: extra API call per role change (latency + error handling); user is forced to sign in again for what is often a small permission tweak; conflicts with multi-org sessions where the principal still holds memberships in other orgs that should keep working.
  • Benefit: closes a UX gap that the next page navigation closes anyway.

Not worth it for routine role changes.

The right hammer for "remove this user immediately." Set humans.blocked = TRUE. The auth middleware rejects every subsequent request with 403 regardless of permission state — verified in the Layer 1.2 RLS test harness. Optionally also call Clerk's session-revoke for thoroughness, but the database flag is the source of truth. This is the path for terminations, suspected compromises, and any other "this account must stop now" scenario — distinct from routine role changes.

Frontend SLA.

  • Routine role change: effective on the user's next API request. UI affordances update on next /v1/me fetch (page nav).
  • Block / termination: once humans.blocked = TRUE, every subsequent request is rejected within one round-trip, regardless of UI state.

Schema impact. None. No session-tracking table, no Clerk-revoke audit log, no migration. The behavior is already in place.


Why asymmetric propagation for system role template edits?

When the platform operator edits a Layer-2 system role template via the Console editor, grants propagate to every existing Layer-3 clone; revocations do not. Renames and deletes require a migration, not the UI.

This is the rule the editor enforces. It is not symmetric — and the asymmetry is the point.

The four-layer model in one sentence. Layer 1 is the permission catalog (migration-only). Layer 2 is the four system role templates (patient, specialist, customer_support, admin) — what new orgs get. Layer 3 is each org's editable clone of those four. Layer 4 is custom roles each org can create. See rbac-permissions.md → The Four-Layer Authorization Model.

The two propagation paths, separated.

  • Migration-time propagation (already automatic). When a feature migration adds a permission to a system role template, the standard recipe in rbac-permissions.md grants it to every Layer-3 clone in the same statement (WHERE r.is_system = TRUE matches both rows). This is how feature rollouts work today; not in scope for this ADR.
  • UI-time propagation (the question). When the Console template editor changes a permission grant, do existing clones inherit?

Why grants propagate.

Adding a permission to a system role template is asserting "every clinic's specialist should have this." That intent applies equally to the new orgs that haven't been provisioned yet and to the existing orgs that already have a clone. Auto-propagating grants matches the migration-time behavior, so a permission added via the UI lands in the same place as one added via a feature migration. No drift between "what new orgs get" and "what existing orgs hold."

Why revocations do not propagate.

Layer-3 clones exist to be customized by clinic admins — and the only customization that makes sense at Layer 3 is revocation ("our specialists shouldn't have appointments.delete"). Clinics that need to grant a permission outside the template create a Layer-4 custom role. So a clinic's Layer-3 grant set is, by design, a subset of the template's grant set, possibly with permissions revoked.

If the platform operator removes a permission from the template, two things may be happening at the clinic:

  1. The clinic still has the permission on its clone (no revocation) — propagation would silently take a capability away, breaking workflows that rely on it.
  2. The clinic already revoked it — propagation is a no-op, but the editor cannot tell which case is which without inspecting every clone.

The safer default is to leave clones alone. The editor warns the operator that template revocations only affect new orgs; if they need to actually pull a capability from existing orgs, that's a migration, not a UI action — and the migration recipe already covers it.

Why renames and deletes go through migrations.

Renaming a role code or deleting a system role is a breaking change for every clone, every audit log entry referencing the role, every per-org custom UI that mentions the role name. The Console editor disallows it; a migration is the right tool because it can update audit references, regenerate clones, and run inside a single transaction.

Audit trail.

Every UI grant emits one audit row per Layer-3 clone updated, all under the same audit_log.action_context = 'template_propagate' (action_context is TEXT, no enum constraint, so this is a labeling decision not a schema change). The originating Layer-2 template edit is its own audit row with action_context = 'normal'.
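A sketch of what the propagation handler's core statements could look like, run inside the template-edit transaction. Only is_system = TRUE and action_context = 'template_propagate' come from the text; the role_permissions / audit_log column names and the NULL-organization_id convention for template rows are assumptions.

```go
// Sketch of UI-time grant propagation: grant on the Layer-2 template and
// every Layer-3 clone, then one audit row per clone touched.
package sketch

import (
	"context"

	"github.com/jackc/pgx/v5"
)

func propagateTemplateGrant(ctx context.Context, tx pgx.Tx, roleCode, permissionCode, actorID string) error {
	// is_system = TRUE matches both the template row and every org's clone —
	// the same recipe feature migrations use.
	if _, err := tx.Exec(ctx, `
		INSERT INTO role_permissions (role_id, permission_id)
		SELECT r.id, p.id
		FROM roles r, permissions p
		WHERE r.code = $1 AND r.is_system = TRUE AND p.code = $2
		ON CONFLICT DO NOTHING`, roleCode, permissionCode); err != nil {
		return err
	}
	// One audit row per Layer-3 clone; the originating Layer-2 edit is audited
	// separately with action_context = 'normal'.
	_, err := tx.Exec(ctx, `
		INSERT INTO audit_log (actor_principal_id, action, action_context, resource_id)
		SELECT $1, 'role_permission.grant', 'template_propagate', r.id
		FROM roles r
		WHERE r.code = $2 AND r.is_system = TRUE AND r.organization_id IS NOT NULL`,
		actorID, roleCode)
	return err
}
```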

Schema impact. None. No role_template_version column, no propagation-tracking table, no migration. The rule is enforced in the Console editor handler.


Why owner uses provisioning, staff and patients use invitation?

Two onboarding primitives coexist by design — not as drift, as shape-fits-problem. The owner of an org is provisioned synchronously through auth.PrincipalProvisioner (Clerk's createUser API) at org-create time. Staff and patient onboarding go through auth.InvitationProvider (Clerk's Invitations API) — the recipient's identity is created at the provider when they accept, our humans row materialises on first authenticated request via the auth-middleware bind hook. Same provider (Clerk), two different APIs, two different lifetime semantics.

The setup

Three roles need to enter the system, all keyed on email:

  • Owner — required for the org to legally exist. The org's DPA, plan-tier change, ownership-transfer, and subscription-cancel all gate on the owner. There is no such thing as a clinic without an owner.
  • Staff — additional members the owner or an admin enrols into an existing org over time.
  • Patient — recipients invited into an existing org's care relationship, who must walk through a two-step onboarding flow (/v1/me/patient-profile then /v1/portal/onboard) with platform-level + per-clinic consent gates before they become a patient of that org.

Both Clerk APIs are available. The question is which one each role should use.

The decision

Owner uses provisioning. When a superadmin onboards a new clinic, the org row's INSERT transaction needs an owner_principal_id atomically. There is no valid path where the org exists without an owner — that's a load-bearing foundation invariant, not a UX preference. So at org-create time the platform calls Clerk's createUser directly, gets a provider_subject_id back, INSERTs the humans row + the principals row + the organizations row + the owner's organization_memberships row in one transaction, then sends a magic-link welcome email through our own notify pipeline (Category OwnerWelcome, en + ro templates). The owner is a member of the org from minute zero; the magic link only sets their password.
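A sketch of the atomic shape described above, assuming a hypothetical ProviderClient wrapper around Clerk's createUser and simplified table columns; the real handler differs in detail, but the one-transaction invariant is the point.

```go
// Sketch: owner provisioning at org-create time — one provider call, then
// one transaction so the org never exists without an owner_principal_id.
package sketch

import (
	"context"

	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgxpool"
)

// ProviderClient is an illustrative wrapper around the auth provider.
type ProviderClient interface {
	CreateUser(ctx context.Context, email string) (providerSubjectID string, err error)
}

func provisionOwnerAndOrg(ctx context.Context, db *pgxpool.Pool, provider ProviderClient, email, orgName string) error {
	subjectID, err := provider.CreateUser(ctx, email) // consumes a provider seat immediately
	if err != nil {
		return err
	}
	return pgx.BeginFunc(ctx, db, func(tx pgx.Tx) error {
		var principalID string
		if err := tx.QueryRow(ctx,
			`INSERT INTO principals (type) VALUES ('human') RETURNING id`).Scan(&principalID); err != nil {
			return err
		}
		if _, err := tx.Exec(ctx,
			`INSERT INTO humans (principal_id, provider_subject_id, email) VALUES ($1, $2, $3)`,
			principalID, subjectID, email); err != nil {
			return err
		}
		var orgID string
		if err := tx.QueryRow(ctx,
			`INSERT INTO organizations (name, owner_principal_id) VALUES ($1, $2) RETURNING id`,
			orgName, principalID).Scan(&orgID); err != nil {
			return err
		}
		_, err := tx.Exec(ctx,
			`INSERT INTO organization_memberships (organization_id, principal_id, is_owner) VALUES ($1, $2, TRUE)`,
			orgID, principalID)
		return err
	})
	// The magic-link welcome email goes through the notify pipeline after commit.
}
```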

Staff and patients use invitation. No structural invariant forces their humans row to exist before they accept. The org operates fine with zero pending invites. So we issue a Clerk invitation; Clerk sends its email; the recipient signs up at Clerk; our auth-middleware bind hook (1B.12) finds the open invite by email on their first authenticated request and creates the membership row. Pending invites are first-class — organization_invites has list / revoke / resend / expiry semantics already wired.
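For the staff side, a sketch of the bind hook's shape on first authenticated request, assuming simplified organization_invites and organization_memberships columns; the patient-invite variant additionally defers consumed_at until the onboarding chain commits (described further below).

```go
// Sketch: on first authenticated request, match open staff invites by email
// and create the membership rows. Column names are assumptions except
// accepted_at, which the docs name explicitly.
package sketch

import (
	"context"

	"github.com/jackc/pgx/v5"
)

func bindOpenStaffInvites(ctx context.Context, tx pgx.Tx, principalID, email string) error {
	rows, err := tx.Query(ctx, `
		SELECT id, organization_id, role_id
		FROM organization_invites
		WHERE email = $1 AND accepted_at IS NULL AND expires_at > now()`, email)
	if err != nil {
		return err
	}
	type invite struct{ id, orgID, roleID string }
	var invites []invite
	for rows.Next() {
		var inv invite
		if err := rows.Scan(&inv.id, &inv.orgID, &inv.roleID); err != nil {
			rows.Close()
			return err
		}
		invites = append(invites, inv)
	}
	rows.Close()
	if err := rows.Err(); err != nil {
		return err
	}
	for _, inv := range invites {
		if _, err := tx.Exec(ctx,
			`INSERT INTO organization_memberships (organization_id, principal_id, role_id) VALUES ($1, $2, $3)`,
			inv.orgID, principalID, inv.roleID); err != nil {
			return err
		}
		if _, err := tx.Exec(ctx,
			`UPDATE organization_invites SET accepted_at = now() WHERE id = $1`, inv.id); err != nil {
			return err
		}
	}
	return nil
}
```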

Why the asymmetry is right, not avoidable

The provisioning path looks attractively uniform — "use one primitive everywhere" — but four costs land on staff/patient that don't land on owner:

  1. Clerk seat consumption. createUser consumes a billable Clerk seat the moment it's called. A clinic that invites 30 staff over six months and 10 never accept = 10 burned seats with provisioning. With invitation, zero seats consumed until the recipient actually accepts. Owner provisioning is bounded (one per org); staff/patient is not.
  2. Email-validity errors are irreversible. Owner provisioning runs once per org with a superadmin's deliberate input. Staff/patient invites happen frequently with typos, abandoned addresses, forwarding domains. Provisioning a typo'd email creates a Clerk user + humans row + audit trail you have to manually clean up; an invitation just expires.
  3. Patient consent flow integrates with the bind hook, not with provisioning. The two-step patient onboarding requires the recipient to walk through platform-level consents at /me/patient-profile and per-clinic consents at /portal/onboard. The bind hook marks an accepted patient invite as accepted_at but leaves consumed_at NULL until the onboarding chain commits — that's how the consents get gated. Migrating patients to provisioning would either bypass the consent flow (illegal under GDPR) or require re-implementing it on top of provisioning.
  4. Pending state is operationally useful. "Invited 3 days ago, no accept yet" tells the inviter to follow up. The invitation primitive surfaces this directly via organization_invites.accepted_at IS NULL. Provisioning collapses the state space — derivable from humans.last_sign_in_at IS NULL but lossy and indirect.

The owner doesn't pay these costs: there is one owner per org, deliberately provisioned by a superadmin, and the owner sits outside the consent flow.

Considered alternatives

Migrate staff/patient to provisioning too. Initially anticipated in the e930d83 commit message ("later admin-pass migrates staff/patient invites to the same primitive"). Reconsidered when the four costs above surfaced. The "one primitive" appeal is uniformity, not user need; the deferred-invite shape genuinely fits staff/patient better.

Migrate owner to invitation. Would mean creating an org without an owner principal_id, accepting "pending owner" as a valid org state. Cascades into every owner-gated operation (DPA signing, plan changes, ownership transfer, subscription) needing a "no current owner" branch. The structural invariant is load-bearing; relaxing it is a much bigger change than keeping two primitives.

One unified "invite" abstraction over both Clerk APIs. Possible in principle, but the lifetime semantics differ at the model level: provisioning produces a fully-formed principal, invitation produces a pending row. A facade over both would either leak the difference (callers branch anyway) or collapse it (and lose pending-state visibility). The existing two-primitive shape is simpler than the abstraction.

Tradeoffs (kept honest)

  • Two mental models. New developers must learn that owner-create and staff-invite take different paths. The shape is documented here and enforced by the package boundaries (onboarding.ProvisionForEmail vs invites.Service.CreateStaff); the difference is also visible in the route shape (POST /v1/organizations provisions atomically, POST /v1/organizations/{id}/staff-invitations is deferred).
  • Staff invitation emails come from Clerk's template, not ours. Clerk's default template is acceptable as an interim. Migrating to a branded en/ro template would require either configuring Clerk's template per app (still Clerk-rendered, limited control) or suppressing Clerk's email and sending our own through notify alongside the invitation. Deferred until post-prod.
  • AddMember endpoint (POST /v1/organizations/{id}/members) keeps narrower semantics: enrolling a principal who already has a humans row, or changing the role of an existing member. New staff onboarding always goes through staff-invitations.

Revisit triggers

  • Clerk seat economics change such that pre-provisioning unaccepted invites becomes free or cheap.
  • A staff onboarding requirement appears that needs the principal_id to exist before accept (none today).
  • Branded staff-invite emails become a hard requirement and the cleanest path requires moving off Clerk's invitation template entirely.

Scope kept tight

This decision governs the three onboarding paths above. It does not change AddMember (still upserts membership for existing humans), does not change the patient-tier or is_owner semantics, and does not affect the four-layer authorization model. The three existing endpoints (POST /v1/organizations for owner, POST /v1/organizations/{id}/staff-invitations for staff, POST /v1/organizations/{id}/patient-invitations for patients) are the canonical surfaces.


Why patients are not memberships, and patient tiers are not roles?

Two structural decisions taken together. They are bundled because they share the same root cause: the original schema made patients reach the platform through staff-side machinery (principal_organizations for org membership; roles for tier-based perks) — and that conflation breaks down the moment any of the four signals appears: dynamic per-org role names, custom org roles, multi-tenant scale, or the staff/patient permission boundary.

The concrete gaps today

Gap 1 — staff vs. patient is role-string-only. Both staff and patients are humans with a row in principal_organizations. The only thing distinguishing them is roles.code — a string. The seeded patient system role grants app.access_portal; the seeded admin/specialist/customer_support templates grant app.access_clinic. Custom org roles (Layer 1.13) let a clinic create a role named pacient or junior_patient; the schema cannot tell whether such a row is staff or consumer. Listing "staff at org X" requires either filtering by permission grant (role grants app.access_clinic) or excluding role codes — neither is structurally meaningful at the database level.

Gap 2 — tier perks tunnel through RBAC. patient_tiers.role_id points at a per-org role; the patient's effective entitlements are read out of role_permissions against that role. When a patient subscribes to "Premium", the system flips principal_organizations.role_id and the new role's permissions take effect. This makes a billing event mutate an authorization table. Tier entitlements (e.g. monthly_appointments_limit, priority_support) ride on the same permissions catalog as staff permissions (e.g. forms.create, audit_log.view_org) — but the two have fundamentally different shapes: staff permissions are binary "are you allowed", tier perks are "what does your plan include with what limit." The platform already has a complete plan-entitlement-limit-subscription engine for B2B (plans + plan_versions + entitlements + limit_definitions + organization_subscriptions + organization_subscription_entitlements + organization_subscription_limits); the patient side reinvents it through the role machinery instead of using the same pattern.

Gap 3 — principal_organizations mixes two scales. Staff per org is small (dozens to low hundreds). Patients per org will reach 10k–100k each (legacy product migrates 20k+ users on day one — see CLAUDE.md → Production Scale). Putting them in one membership table means every "staff list at org X" query scans 100k patient rows to filter to a few hundred staff rows.

These are not Layer 2 problems pushed forward — they are Layer 1 shape problems already baked into the migrations.

The two decisions

Decision A — Patients are not memberships. Patients access an org through patient_profiles + patients, never through principal_organizations. organization_memberships (renamed from principal_organizations) becomes genuinely staff-only by elimination. Portal access is implicit from the existence of a patients row at that org; no app.access_portal permission grant, no patient system role.

Decision B — Patient tiers are not roles. patient_tiers no longer references roles.id. Tier entitlements ride on the same plan-entitlement-limit pattern the platform already uses for org subscriptions. The atomic catalogs (entitlements, limit_definitions) are shared between B2B and B2C; the higher-level grouping tables are parallel and separate (plans vs patient_tiers, organization_subscriptions vs patient_subscriptions, etc.).

The two decisions are inseparable. Decision A removes the carrier (principal_organizations.role_id) that Decision B's old tier-perks mechanism rode on. Decision B removes the only remaining reason patients needed a row in principal_organizations.

Considered alternatives

| Approach | What it does | Why we rejected (or chose) it |
| --- | --- | --- |
| Filter staff lists by app.access_clinic permission and ship as-is | Keeps current schema; lists become "members whose role grants app.access_clinic" | Rejected. Solves the staff-list query but leaves three problems untouched: (1) dynamic role names mean a custom org role called pacient is structurally indistinguishable from a staff role; (2) principal_organizations still scales with patients (100k+ rows per org); (3) tier-as-role conflation persists, with billing state mutating RBAC tables. The query workaround is a symptom, not a fix. |
| Discriminator column on principal_organizations | Add member_type ENUM('staff','patient') to the membership row | Rejected. Cheap to add, but it's still one mega-table mixing 100 staff with 100k patients per org. RLS branches on the column. Listing staff still does a full scan filtered by the discriminator. The discriminator is a band-aid over the scale and conceptual problem; it doesn't separate the lifecycles, the permission models, or the billing carriers. |
| Separate staff and patient membership tables; keep tier-as-role | Drops patients out of the membership table, but patient_tiers.role_id still flips a role on subscription | Rejected. Solves Gap 1 and Gap 3, leaves Gap 2 alone. Once patients aren't in principal_organizations, there is no membership row whose role_id to flip — the old mechanism breaks. The "fix" of giving patients a synthetic membership row just to carry tier-perks via roles re-introduces Decision A's problem. |
| Unified billing table covering both org plans and patient tiers | One plans table with nullable organization_id; one polymorphic subscriptions table | Rejected. The catalog is platform-defined for org plans, clinic-defined for patient tiers — different scopes, different management permissions, different RLS. Polymorphic FKs lose referential integrity on the subscription side (subscriber is organizations for one kind, patients for the other). The atomic concepts that genuinely overlap (a feature code, a limit definition) are the leaves; the higher-level groupings differ at every meaningful axis. |
| Patient identity tier + parallel billing engines + shared atomic catalogs | Patients live in patient_profiles + patients + patient_caregivers, organized by org but never in the staff membership table; patient billing runs through patient_tiers + patient_tier_versions + patient_tier_entitlements + patient_tier_limits + patient_subscriptions + patient_subscription_entitlements + patient_subscription_limits + patient_subscription_overrides — exact mirror of the org-side shape. entitlements and limit_definitions are shared between both surfaces. | Chosen. Solves all three gaps. Each table answers exactly one question. The entitlement-check function is generic over both billing surfaces because the catalog is shared. RLS on each table is simple and scoped to its actor. Custom org role names cannot collide with patient identity because patients are not in the role machinery at all. |

The shape

What changes (taken together with the rename pass):

  • Membership table renames — principal_organizations → organization_memberships; platform_roles → platform_memberships. Staff-only by definition. Same columns; the rename makes the symmetry between platform-tier and org-tier memberships explicit and reflects that this table no longer contains patient rows.

  • Patient identity cluster — patient_persons → patient_profiles (portable identity, no organization_id); patient_person_managers → patient_caregivers; patients (per-org link) keeps its name. The RLS helper current_human_patient_person_ids() is renamed current_human_patient_profile_ids() to match. patients gains a last_used_at column mirroring organization_memberships.last_used_at so the "default org on first sign-in" derivation works symmetrically across both membership kinds.

  • Patient role and app.access_portal removed. The seeded patient system role is dropped. The app.access_portal permission grant is dropped. Portal access is granted by the existence of a patients row at the org — checked in middleware via current_human_patient_profile_ids() joined to patients.organization_id (see the sketch after this list).

  • Drop default_signup_role_id. organization_settings.default_signup_role_id was a denormalized cache of the org's default tier's role. With no role to point at, the column is dead. Sign-up reads patient_tiers WHERE organization_id = ? AND is_default = TRUE directly. The partial unique index on is_default already guarantees at most one default per org; the cached column added nothing the index doesn't already provide.

  • Drop humans.current_organization_id. Active-org tracking is derived from MAX(last_used_at) across both organization_memberships and patients for the principal. The clear-on-membership-delete trigger goes with it.

  • Patient tiers restructured. patient_tiers.role_id is dropped. The patient billing engine adds: patient_tier_versions (snapshot history, mirror of plan_versions), patient_tier_entitlements and patient_tier_limits (mirrors of plan_entitlements / plan_limits), patient_subscriptions (the per-patient subscription, FK to patients(id)), patient_subscription_entitlements / patient_subscription_limits / patient_subscription_overrides (mirrors of the org-side snapshot tables). The entitlements and limit_definitions catalogs are unchanged — they are the shared leaves used by both billing surfaces.

  • Org-side prefix added. Once patient subscriptions exist, subscription_entitlements / subscription_limits / subscription_overrides become organization_subscription_entitlements / _limits / _overrides. The unprefixed names only worked when there was one subscription concept; the prefix becomes load-bearing once both surfaces ship.

  • audit_log AI provenance split. audit_log.model_version, inputs_hash, confidence move to a sibling audit_ai_provenance(audit_log_id PK FK, model_version, inputs_hash, confidence, ...) table. Audit's compliance contract stays stable; AI-features schema churn is isolated. Bundled into the same migration wave because the cost asymmetry is the same as everything else here — pre-prod cheap, post-prod cross-cutting.
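The portal-access check referenced in the list above ("Patient role and app.access_portal removed"), sketched under assumptions: the column linking patients to the profile (patient_profile_id here) is a guess; only the helper current_human_patient_profile_ids() and the join to patients.organization_id come from the text.

```go
// Sketch: portal access is implicit from a patients row at the org,
// not from a permission grant.
package sketch

import (
	"context"

	"github.com/jackc/pgx/v5"
)

func hasPortalAccess(ctx context.Context, tx pgx.Tx, orgID string) (bool, error) {
	var ok bool
	// The RLS helper resolves the caller's patient profiles from the
	// per-request session variables, so no principal ID is passed explicitly.
	err := tx.QueryRow(ctx, `
		SELECT EXISTS (
			SELECT 1
			FROM patients p
			WHERE p.organization_id = $1
			  AND p.patient_profile_id = ANY (current_human_patient_profile_ids())
		)`, orgID).Scan(&ok)
	return ok, err
}
```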

Why now

Layer 1 is in progress; Layer 2 (patient_profiles, patients, patient_caregivers) hasn't shipped. There is no patient identity in the database yet — no rows to migrate, no FKs in domain tables to chase. The migration scope is editing existing migrations in place per the pre-prod editable-migrations rule, not stacking ALTER TABLE deltas. The cost is mechanical (rename + structural change in a bounded set of files); after Layer 2 ships, the same change becomes a cross-cutting refactor against patient identity rows already linked into appointments, forms, treatment plans, consents, and consent ledgers.

The same cost-asymmetry argument that justified the principal-model rename in Layer 1.24 applies here: the work is the same; the only question is when it is paid. Pre-prod, mechanical. Post-prod, cross-cutting.

Tradeoffs (kept honest)

  • Two membership concepts in the codebase, not one. Middleware has two paths (staff session resolves via organization_memberships; patient session resolves via patients). RLS has two helpers (current_app_principal_id() for staff scopes; current_human_patient_profile_ids() for patient scopes). The "switch active org" derivation queries both tables. This is the surface area enterprise B2B2C systems pay for — the alternative (one mega-table mixing scales and lifecycles) is the thing that scales badly and confuses authorization.

  • More billing tables, not fewer. Roughly +5 tables on the patient side that mirror the org side. The shared entitlements / limit_definitions catalogs prevent the actual duplication that matters — definition drift between B2B and B2C entitlement codes. Higher-level tables stay separate because their scopes (platform-defined vs clinic-defined), management permissions (superadmin vs clinic admin), and pricing models (platform Stripe vs clinic-external) genuinely differ. A unified table forces those branches into one place; separate tables let each domain own its rules.

  • No structural enforcement that a human is exclusively staff or exclusively patient at the same org. A human can be staff at Clinic A and a patient at Clinic B (a doctor whose father is a patient elsewhere) — that's a feature, not a bug, and the new shape supports it cleanly. A human being staff at Clinic A and ALSO a patient at Clinic A is rare but allowed. The schema does not prevent it; product rules can if needed.

Scope kept tight

This ADR covers the membership separation and the tier/role decoupling. It does not cover: the role-cloning pattern (kept as-is — system templates clone per-org for clinic sovereignty); organization_entitlements (kept as-is — intentional regulated trust boundary); custom roles (Layer 1.13 — clinic admin manages Layer-3 cloned and Layer-4 custom). All of those were probed and confirmed correct during the design discussion.

The first patient identity migration (Layer 2.1) lands the renamed table names directly. The pre-prod migration edit pass updates principal_organizations → organization_memberships, platform_roles → platform_memberships, and the roles/role_permissions seed (drops the patient system role, drops app.access_portal from the catalog and grants). Patient billing tables ship at Layer 2.5 with the new shape from day one — there is no transitional patient_tiers.role_id to remove because the rename happens in the same wave that introduces the new patient schema.


Why clinic is controller, platform is processor?

The most consequential architectural decision in the whole platform: the clinic is the GDPR data controller for patient data; RestartiX is a data processor. This decision shapes the consent model, the privacy notice flow, the Console UI, the DSAR routing, and every cross-tenant feature design.

The setup

In a B2B2C health platform — clinics use RestartiX, patients use clinics — there are three actors and three relationship pairs:

  1. Patient → Platform: light. The patient holds an account on the platform (login credentials, security state). That's it.
  2. Patient → Clinic: substantive. Their health data, their treatments, their appointments, their consents.
  3. Clinic → Platform: contractual. The clinic uses the platform to deliver care; this is governed by an MSA + DPA (Art. 28).

Under GDPR, every party that touches personal data is either a controller (decides why and how), a processor (acts on the controller's instructions), or joint controllers (two parties that together decide why and how — Art. 26). The architecture has to put each actor in the right bucket.

The decision

  • Clinic = controller for all patient data: profile, clinical records, appointments, marketing prefs, medical consents.
  • Platform = processor for that data, governed by a DPA annexed to the MSA. Sub-processors (Clerk, Daily.co, AWS) listed and approved.
  • Platform = controller for the thin slice of account-level data: login credentials, security telemetry on the account itself, fraud prevention. (Stripe makes the same distinction in their privacy policy — merchants are controllers for transaction data; Stripe is controller only for the merchant's own account.)
  • No joint controllership for patient data. Cross-tenant features that would otherwise create joint controllership operate on anonymised data only.

What the patient actually sees at sign-up on clinicname.portal.restartix.pro:

| Surface | Legal basis | Withdrawable | Captured at | Stored as |
| --- | --- | --- | --- | --- |
| Platform ToS | contract (Art. 6(1)(b)) | No (revoke = delete account) | Sign-up | consents row, scope='platform', purpose='platform_terms' |
| Platform privacy notice (informational acknowledgement) | legitimate interest (Art. 6(1)(f)) | No | Sign-up | consents row, scope='platform', purpose='platform_privacy_notice' |
| Clinic ToS (if the clinic publishes their own) | contract | No (revoke = leave clinic) | Portal onboarding at the clinic | consents row, scope='org', purpose='org_terms' |
| Clinic privacy notice (Art. 13/14 disclosure) | legal obligation (Art. 6(1)(c)) + Art. 9(2)(h) for medical | No | Portal onboarding at the clinic | consents row, scope='org', purpose='org_privacy_notice' |
| Marketing email / SMS / analytics / AI-processing | consent (Art. 6(1)(a)) | Yes | Settings page or sign-up toggles | consents row, scope='org', source='self_toggle' or 'signup_checkbox' |
| Telemedicine / video recording / biometric / treatment-specific | explicit consent (Art. 9(2)(a) for special-category) | Yes | At booking via signed form | consents row, scope='org', source='form' |

All rows live in the same consents table. The scope discriminator says whose terms they're against; the legal_basis discriminator says which Art. 6 basis applies (and Art. 9 for special-category processing); withdrawable is derived (TRUE only when legal_basis='consent').
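A tiny sketch of how the derivation reads in code, with an illustrative row shape; only the discriminators named above (scope, purpose, legal_basis, source) are taken from the design.

```go
// Sketch of a consents-row view; the struct shape is illustrative.
package sketch

type ConsentRow struct {
	Scope      string // 'platform' or 'org'
	Purpose    string // e.g. 'org_privacy_notice', 'marketing_email'
	LegalBasis string // 'contract', 'legitimate_interest', 'legal_obligation', 'consent', ...
	Source     string // 'signup_checkbox', 'self_toggle', 'form'
}

// Withdrawable is derived, not stored independently: only the Art. 6(1)(a)
// consent basis can be withdrawn; contract / legal-obligation bases cannot.
func (c ConsentRow) Withdrawable() bool {
	return c.LegalBasis == "consent"
}
```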

The org's relationship with the platform (concrete shape)

The clinic doesn't "consent" to the platform. They sign:

  • MSA — commercial agreement at org onboarding (pricing, SLA, term, liability)
  • DPA — Art. 28(3) processor agreement annexed to the MSA, lists sub-processors, security measures, audit rights, transfer mechanisms
  • Operational settings — toggles in organization_settings for things like platform_communications_enabled (legitimate-interest basis for B2B comms; Recital 47 well-established for clinic-to-corporate-address comms; one-click unsubscribe in every email)

No consents ledger entries for the org→platform relationship. The MSA + DPA + settings audit trail are the legal artefacts.

Privacy notice template, not a fixed notice

Each clinic owns their privacy notice. RestartiX provides versioned templates (privacy_notice_templates) with placeholders + toggleable sections. Clinic fills in the placeholder values, selects which optional sections apply (video recording? biometrics? cross-border transfer?), publishes the assembled markdown as their org_privacy_notice v_n. When the platform updates the template, the clinic gets a "review template update" prompt; until they re-publish, their existing notice keeps serving.

Why this shape and not a fixed platform notice:

  • Legally correct — the clinic is the controller, so they must own the final text. A platform-fixed notice mis-attributes controllership.
  • Operationally realistic — most clinics don't have legal staff who can write a notice from scratch; a template gets them to a passable notice in minutes.
  • Maintainable — when ANSPDCP guidance changes or the GDPR is amended, RestartiX updates the template and pushes a notification. Clinics aren't on their own.

Break-glass for Console: the controlled exception path

The Console must never surface identifiable cross-tenant patient data outside an explicit elevation flow. Default Console access is aggregate / processor-scope (org list, patient counters, audit metadata). Identifiable cross-tenant access — patient list per org, patient detail, audit-with-diffs — requires elevation.

Elevation pattern:

  • Per-org scope (cross-org elevation is a separate, higher-friction path)
  • Time-bound (default 1 hour, max 4)
  • Reason required (free-text + categorised: support ticket, security incident, DSAR routing, fraud investigation)
  • Always-on email to the clinic admin when a session opens against their org
  • Every read in the elevated session writes an audit_log row with action_context='break_glass' + break_glass_id linking the session
  • Visible to the clinic admin in their per-org audit log — transparency is what keeps this regulator-defensible

Clinic-side notification (always-on, at first) is the trade-off that makes this defensible to ANSPDCP, large-clinic procurement reviewers, and patients themselves.

Cross-tenant feature rule

Any feature that needs to compute over multiple clinics' patient data — benchmarks, platform analytics, cross-tenant search, AI training corpus — must operate on anonymised data only. Once data is irreversibly stripped of identifiers, it stops being personal data and the controllership question dissolves.

If a feature genuinely cannot work without identifiable cross-tenant data, that's a deliberate decision requiring its own ADR, with the rationale that justifies the joint-controllership cost. The data-classification work (1A.14) enforces this at the egress layer — a pii_basic column can't egress to a platform_analytics target without an explicit override.

DSAR routing

Patients exercise their GDPR rights against the controller — the clinic. The platform's role:

  • Auto-respond to misdirected emails ([email protected]): "Your medical data is held by the clinic(s) you've signed up with. Here's how to reach them." Lists clinics from /v1/me.patient_org_ids.
  • Self-service in the portal: "Your clinics" page with the per-clinic data-controller contact info.
  • Forward if the patient confirms which clinic — platform forwards the request to the clinic's DPO/billing contact, patient cc'd. No platform-side lookup of patient data.
  • Break-glass for orphaned cases — ex-patient with no active account, where the platform must look up which clinics they were a patient at to route the request. reason_category='dsar_routing'.

Crucially, the platform does not act as primary controller for patient DSARs. It assists the clinic.

Why now (cost frame)

Pre-prod cost: one rule + a consents table + privacy-notice-template machinery + a break_glass_sessions table + an elevation modal in Console. Maybe two weeks of work spread across foundation 1B and 1C.

Post-prod cost of getting it wrong: removing a "browse all patients" Console surface is a multi-quarter procurement-review nightmare. Fixing accidental joint controllership is a regulator-action-level event. The Romanian DPA (ANSPDCP) has issued public fines for this exact pattern at other healthtech platforms in the EU.

The asymmetry is severe. We pay the foundation cost.

Tradeoffs (kept honest)

  • More schema upfront — consents, consent_purposes, consent_purpose_versions, privacy_notice_templates, organization_privacy_notices, break_glass_sessions. Roughly 6 tables that might feel like "speculative" infrastructure to a greenfield team. The cost is real; the alternative cost (post-prod retrofit under regulator pressure) is unbounded.
  • Slower Console feature velocity — every Console page has to be classified as aggregate or break-glass-gated. Internal tooling is forced through more friction than it otherwise would have. Acceptable trade.
  • Clinic onboarding has a "complete your privacy notice" task — small UX cost, real legal value. Clinics that ignore it can't onboard patients (the org_privacy_notice consent must succeed, which means a published version must exist). Acceptable trade.
  • DSAR self-service requires patient to know the right clinic — most do (they signed up there). The fallback is the auto-responder + portal "your clinics" page. The 1% case (orphaned ex-patient) goes through break-glass.

Scope kept tight

This ADR does not specify:

  • Whether to register a Romanian-specific DPO at platform level (likely yes; deferred to Romanian compliance pass)
  • The exact contents of the seeded privacy notice template (drafted by a data-protection lawyer pre-launch)
  • The break-glass UI (Console implementation detail; lands in 1C.1)
  • Joint controllership for any specific cross-tenant feature (ADR'd per-feature when it surfaces)

It establishes the controller/processor split, the consent ledger shape, the privacy-notice template pattern, the break-glass exception path, and the cross-tenant anonymisation guardrail. Everything downstream rests on these.


Why terms and privacy notice are templated legal documents, not forms?

Two clinic-authored legal artefacts gate every patient relationship: the clinic's terms of service (org_terms) and the clinic's privacy notice (org_privacy_notice). The platform also ships a general-purpose forms feature (F3) that can collect signed acknowledgments and treatment-specific consents. Why are terms + privacy notice their own machinery (1B.10's legal_document_templates + organization_legal_documents) and not just two more form types?

The setup

  • Clinic is data controller. Platform is processor. The clinic must own the legal text patients accept — that's what makes the controllership real.
  • Most clinics don't have legal staff. A clinic-authored-from-scratch privacy notice gets GDPR/ANSPDCP compliance wrong 90% of the time (missing lawful basis under Art. 9(2)(h), missing DPO contact, missing sub-processor list, missing patient rights enumeration).
  • Forms are clinic-authored-from-scratch by design — the F3 builder ships a blank canvas and lets the clinic put anything on it. That's correct for clinical workflows. It's catastrophically wrong for a regulated GDPR notice.

The decision

Two separate machineries, two different shapes.

Templated legal documents (1B.10):

  • Platform owns the structure (placeholders + section catalog + required keys). The template enforces "you must name your DPO, you must list your processing purposes, you must mention cross-border transfers if applicable."
  • Clinic owns the specifics (the placeholder values and which optional sections apply).
  • Result: an assembled markdown document that's clinic-authored (their letterhead, their name, their toggles) but structurally compliant by construction.

Forms (F3):

  • Platform ships the form-builder primitive (fields, validation, signing, PDF rendering, audit).
  • Clinic authors all content from scratch.
  • Result: any clinical artefact the clinic needs (intake, ROM measurement, pain scale, treatment-specific consent).

Both end up writing to consents, but through different paths:

| Consent purpose | Delivery mechanism | Source |
| --- | --- | --- |
| platform_terms, platform_privacy_notice | Sign-up checkbox at step 1 of two-step onboarding | signup_checkbox |
| org_terms, org_privacy_notice | Sign-up checkbox at step 2 — body comes from 1B.10's published consent_purpose_versions for that org | signup_checkbox |
| marketing_email, marketing_sms, analytics, ai_processing, profile_sharing | Patient settings toggle | self_toggle |
| telemedicine, video_recording, biometric_capture, treatment_specific_* (F3.5) | F3 form submission | form (with source_form_id) |

The consent ledger is the single source of truth for "is this patient legally cleared to do X." The delivery mechanism is just the UI that captured the click.

Why the lifecycle differs

The two machineries have different change-propagation semantics, and trying to unify them creates real bugs:

|  | Legal documents | Forms |
| --- | --- | --- |
| Scope | Per-clinic-relationship (one current version) | Per-clinical-event (many per patient) |
| Update propagation | Republish triggers re-consent for every existing patient via 1B.9's 412 gate | New form definition only affects future fills |
| Cadence | Annual-ish (regulator updates, DPO change) | Frequent (clinic refines workflows) |
| Versioning | Mandatory pin to consent_purpose_versions.version so trail view can render the exact body the patient accepted | Form revision history, but no "every existing patient must re-accept" semantic |
A forms-based "privacy notice acknowledgment" form has no mechanism to require all existing patients to re-accept when the clinic updates it — the closest you can do is invalidate prior submissions, which doesn't map to the consent ledger correctly. Templated legal documents bake the re-consent gate into the version pin.

Why platform owns the template, not the clinic

Three reasons:

  • Compliance scales centrally. When ANSPDCP issues guidance or GDPR is amended, RestartiX updates the template once and 1,000 clinics get the "review template update" prompt. Clinic-authored-from-scratch means 1,000 separate updates, half of which never happen.
  • Procurement defensibility. Large clinics' procurement reviewers ask "how does this platform keep our privacy notice compliant?" Pointing at a versioned, lawyer-reviewed template they fill in is a cleaner answer than "you write your own."
  • Foundation discipline. The platform's value-add is structural correctness. A clinic that wants to author from scratch can — they override the assembled body in their published consent_purpose_versions row. The template is the default, not a cage.

Considered alternatives

  • One form per legal document, no separate machinery. Rejected: forms can't express "all existing patients must re-accept on republish," and putting GDPR compliance constraints into a generic form builder pollutes both abstractions.
  • Platform-fixed privacy notice, clinic just accepts it. Rejected: mis-attributes controllership. The clinic must own the artefact patients accept; otherwise the platform is the de facto controller and the whole processor-boundary architecture collapses.
  • Free-form markdown editor per clinic, no template. Rejected: 90% non-compliance rate. The platform's value evaporates and clinics churn the first time ANSPDCP audits one of them.

Tradeoffs (kept honest)

  • Two machineries to maintain. Acceptable: they have genuinely different lifecycles and constraints. Conflating them is the more expensive failure mode.
  • Editor in the clinic admin UI is more constrained than a free markdown editor. Acceptable: structural compliance > cosmetic flexibility. A clinic with bespoke needs can publish anything as their consent_purpose_versions.body_translations — the platform doesn't gate it after publish.
  • Section bodies live per-locale on the template (no en-authoritative shared body). Doubles the surface when adding a section. Acceptable: cross-row references would create their own divergence bugs; symmetric per-locale rows are easier to reason about.

Scope kept tight

The 1B.10 machinery covers org_terms and org_privacy_notice only. Treatment-specific consents (telemedicine, video recording, biometric capture, treatment-specific processing) live in the consent ledger but get delivered via F3 forms — the rationale above applies because their text is clinic-authored from scratch and their lifecycle is per-clinical-event, not per-relationship.

The ADR does not specify:

  • The exact lawyer-reviewed text for the seeded v1 templates (deferred to Romanian compliance pass).
  • The Console superadmin UI for publishing new platform template versions (lands in 1C.1).
  • The clinic-admin editor UX (lands in 1C.2).
  • Whether F3.5 treatment-specific consents share any template machinery (unlikely; their structure is genuinely per-clinic).

Why platform contracts are intentionally invisible to patients?

platform_terms and platform_privacy_notice exist as platform-scope consent_purposes rows; patients accept them at sign-up; bumping a version triggers re-consent across every patient on every clinic. So they're real legal artefacts in the data model. And yet the patient-facing UI never names RestartiX as a counterparty. Re-consent help text routes to the clinic. Sign-up captures these consents implicitly alongside the clinic's. The patient's only frontline relationship is with their clinic.

The setup

The platform-as-processor / clinic-as-controller story works cleanly while a patient is at a clinic — the clinic owns the relationship, the platform processes data on the clinic's documented instructions. But two cases force platform-level contracts to exist:

  1. Account-level identity. The humans row + patient_profiles row exist before any clinic relationship and persist across them. Someone has to be the controller of those rows; "the first clinic the patient signed up at" doesn't generalise (joint controllership for a multi-clinic patient is exactly the failure mode the foundation rule prevents). RestartiX-as-platform fills that role for account-level data.
  2. Orphan custodianship. When a patient leaves all clinics, the humans + patient_profiles rows survive (the patient might re-onboard at a new clinic later; the portable profile is reused). During orphan state the clinic isn't the controller — there is no clinic. The platform is the custodian. Access to orphan data is governed by 1B.11's break-glass pattern (per-org elevation; cross-org for orphan rescue is its own narrow scope).

So the platform DOES have a legal relationship with the patient — small, procedural, and load-bearing for the orphan case. platform_terms covers "you have a RestartiX account, here are the procedural rules." platform_privacy_notice covers "we're a processor for your clinics; here's how the platform layer holds your account-level data."

The decision

The platform contracts exist in the data model. The platform is never surfaced as a counterparty in patient-facing UI. Concretely:

  • Re-consent modal contact routing always lands on the clinic, never on [email protected] — even when the missing list is platform-scope (e.g. platform_terms was bumped). The clinic escalates to RestartiX as a vendor when it can't answer.
  • Sign-up consent block presents platform_terms + platform_privacy_notice as required acceptances, but the clinic's own terms + privacy notice are the substantive surfaces. Platform-side text is procedurally bounded.
  • DSAR routing flows through the clinic, not the platform (existing rule, restated here).
  • Account closure is initiated through the clinic too — the clinic walks the patient through data export per Art. 20 + escalates to RestartiX for the account-level erasure.

Why this shape

  • Mirrors the EHR-vendor-to-clinic-to-patient pattern. Patients at clinics that use Epic don't have a contract with Epic; patients of clinics that use Gmail-for-Workspace don't have a contract with Google. RestartiX is the same — we're the clinic's vendor, not the patient's. The clinic is the patient's only legal interface.
  • Matches the controllership story. The clinic is data controller for everything that meaningfully affects the patient (medical records, treatment plans, consents tied to clinical activities). The platform-level contracts cover only the account-host plumbing and orphan custodianship.
  • Avoids dual-counterparty confusion. A patient who sees both "your clinic" and "RestartiX support" as contacts during a re-consent flow has to decide which one to call for what — at exactly the moment they're already confused about why they're being asked to re-accept. The clinic-only path is cognitively simple: one number, one email, one human relationship.

Where the platform contracts still matter

  • Sign-up. Patient accepts both at first onboarding (one screen, four checkboxes). The platform-side text is short and procedural; it doesn't compete with the clinic's privacy notice for attention.
  • Re-consent on bump (rare). platform_terms v2 publishes (Console editor, 1B.10's platform-side counterpart) → every patient re-accepts on next portal visit. The blocking modal renders the new text. Help text still routes to clinic.
  • Orphan-state access. When a patient has left all clinics, RestartiX still holds their patient_profiles row. Console-side cross-org access is gated by 1B.11 break-glass. The patient's contract for that state is platform_terms.
  • GDPR DSAR fallback. A patient who has left all clinics and emails [email protected] with a DSAR triggers the auto-responder + orphan-rescue flow described in the controllership ADR. That email exists; it's just not surfaced inside the portal UI.

Considered alternatives

  • Eliminate platform_terms / platform_privacy_notice entirely. Possible only if patient_profiles is also eliminated (so there's no platform-held data needing a controller). That's a meaningful schema change touching 1B.6/1B.8/1B.9/1B.10 — out of scope for this ADR but flagged as an Open Decision below. If we ever do it, this ADR is obsolete.
  • Show RestartiX support email when the bump is platform-scope. Tested and rolled back during the 1C.3 re-consent modal slice. Two front-line counterparties confuses the patient at the worst possible moment (consent decision); the cognitive cost outweighed the marginal "RestartiX can answer faster" benefit.
  • Have the clinic's privacy notice subsume the platform's. Considered, rejected: the clinic isn't the controller for orphan-state data and can't legally subsume the platform layer for cases where there is no clinic relationship.

Tradeoffs (kept honest)

  • Clinics carry the patient-support load even for platform-level questions. Acceptable: the volume is low (platform_terms rarely bumps, and never with substantive changes), and the clinic can escalate to RestartiX on the back channel.
  • A future "RestartiX as a clinical service provider" pivot would invalidate this ADR. If RestartiX itself starts offering clinical care (telerehab hub, direct-to-patient services), the controllership story flips and the platform becomes a real counterparty. We'd revisit then. For the current product (B2B clinic SaaS), this ADR is correct.

Revisit triggers

  • The schema-shape decision around patient_profiles (kill it / shrink it / keep it) — see Open Decisions.
  • A pivot to direct-to-patient services from RestartiX itself.
  • A regulator decision (Romanian DPA) that requires platform-level patient transparency stronger than the current model. Today there is none we know of.

Scope kept tight

This ADR settles the patient-facing surface. It does NOT:

  • Eliminate the platform-scope consent purposes (still required for the account/orphan case).
  • Change the clinic's controllership over clinical data (already settled).
  • Remove [email protected] as a fallback for orphan DSAR routing — that path stays, it just isn't surfaced inside the portal UI for active patients.

Why dedicated is a premium tenancy mode with a different controllership story?

The platform supports two distinct tenancy modes for paying clinics, with different controllership stories, different identity-isolation guarantees, and different price tiers. The default mode is shared: clinics run on pooled platform infrastructure (one Clerk org, one S3 bucket, one platform CMK) with logical isolation via RLS, prefix scoping, and app-layer entitlement checks; RestartiX is named as the technology provider in patient UI (small attribution like "Powered by RestartiX") and in the tenant's privacy notice as a sub-processor. The premium mode is dedicated: a per-tenant Clerk org partitions patient identity by construction, with optional own-S3-bucket and own-CMK addons available as the underlying operational mechanisms ship; the tenant becomes sole controller for all patient data including the identity slice. Both modes can be fully visually white-labeled — per-tenant logo, colours, custom domain, and custom mail-from are universal customizations available on either mode, not premium-gated. Both modes target SMB clinics; hospital networks and dedicated-infrastructure tiers (separate RDS / Redis / KMS per tenant) are permanently out of scope (see CLAUDE.md → Project Overview). See features/platform/tenant-isolation.md for the full spec.

The setup

The visibility decision faced three commercial paths, each with downstream legal and architectural consequences:

  1. Fully invisible platform (Option A). RestartiX invisible in patient UI, named only in legal text. Salesforce / ServiceNow model. Maximally invisible, but undersells the platform's substantive contribution (validated exercise library, MDR-tracked clinical content) and creates IP-attribution awkwardness — clinics get implicit credit for content they didn't author.
  2. Named technology (Option B, "Dycare/ReHub model"). Tenant brand foregrounded; RestartiX named as a credentialed tool. Clinic owns 100% of patient relationship; platform accrues brand and certification value. Reversible to A or C. Validated commercially in EU healthtech.
  3. Hub of clinics (Option C, "Doctolib model"). RestartiX as destination, clinics listed within. Joint-controllership risk on discovery and booking. Romanian DPA (ANSPDCP) has fined healthtech platforms for exactly this pattern. Hard to reverse.

The exercise library is the load-bearing factor. RestartiX provides a clinically-validated library that clinics can prescribe from (plus a clinic-uploaded layer for the clinic's own IP). The library is RestartiX-controlled IP, an MDR-class regulated artefact, and a real defensible moat. Option A undersells it; Option C trades it away for hub-aggregator dynamics that come with regulatory risk.

The decision

Default mode = shared (Option B's identity story). Patient sees tenant brand foregrounded, with a small factual attribution that RestartiX is the underlying telerehab technology. Shared patient_profiles across the platform's network of shared-mode tenants is a feature, not a tax — it removes onboarding friction for patients moving between clinics with consent. Visual white-label customizations (logo, colours, custom domain, custom mail-from) are available on top of this mode as paid addons.

Premium mode = dedicated. A per-org tenancy_mode = 'dedicated' setting that turns the org into a self-contained identity namespace via a dedicated auth-provider organisation (a Clerk org today; the abstraction generalises to any verifier). Same email at a dedicated tenant and any other tenant is two distinct platform identities. Database stays shared — RLS + dedicated provider tenant + DPA exclusion from cross-tenant analytics + data-export-and-purge endpoints deliver what clinics typically mean by "data sovereignty". Own-S3-bucket and own-CMK addons compose on top when their operational mechanisms ship (decisions.md → Why tenancy_mode is a single enum, not multi-axis). Industry comparable: Salesforce calls this "dedicated tenant" and charges 5–10x. Reserved for SMB clinics that demand identity isolation and can fund the operational setup; hospital networks are not the customer.

Dedicated-infrastructure tiers are permanently out of scope. No separate RDS / Redis / KMS per tenant, no "physical isolation" packaging. Clinics that demand byte-level infrastructure separation are not the platform's customer. See CLAUDE.md → Project Overview.

Option C (hub) is explicitly rejected. The B2B-clinic-SaaS positioning is incompatible with hub aggregator dynamics; a future "RestartiX clinical service provider" pivot would be a separate product, not a deployment-mode toggle.

Why this shape

  • Matches the foundation rule that cross-tenant features operate on anonymised data only. Shared-mode profile reuse is patient-consented identity reuse, not cross-tenant data flow — the patient sees and approves the reuse at the second clinic's onboarding step. Dedicated mode disables even that, by partitioning identity at the auth layer.
  • Captures the commercial reality. Most clinics on the platform will be SMB rehab clinics that benefit from the network effect of shared-profile identity. A subset of clinics will demand identity sovereignty and pay for it. One default + one premium tier covers both without forking the architecture.
  • Preserves the legal artefacts where they apply. Shared-mode tenants keep platform_terms / platform_privacy_notice (orphan custodianship is real). Dedicated-mode tenants don't need them (no orphan custodianship — exit cleanly via export-and-purge). Each mode's controllership story is internally coherent.
  • The exercise library scales across both. Validated platform-curated exercises plus clinic-uploaded content works identically in either mode. No fork in the clinical content layer.

Why dedicated mode is opt-in and not default

Pre-emptively isolating every tenant would forfeit the network-effect feature for the 95% of customers who benefit from it (smooth re-onboarding across clinics, portable caregiver links, the "your rehab profile follows you" pitch). Dedicated mode is the right answer for customers who explicitly require it; it would be the wrong default.

Considered alternatives

  • Default to invisible-platform, premium for "Powered by RestartiX" attribution. The inverse of what was chosen. Rejected: undersells the platform's substantive contribution to most customers and creates IP-attribution dishonesty (clinics implicitly take credit for the validated library).
  • Hub model as the platform's primary commercial face. Rejected, see above. Joint-controllership risk plus a fundamentally different B2C-adjacent product.
  • Per-tenant flag for "name visibility" only, without the deeper identity-isolation guarantee. Rejected: the customers who demand invisibility are the same customers who demand sovereignty. Decoupling them produces a tier that satisfies neither cohort.
  • Make dedicated mode a separate platform deployment. Rejected: forks the codebase, fragments the operational story, and prevents code reuse. A flag on organizations is the right granularity.

Tradeoffs (kept honest)

  • Two DPA templates. The shared-mode DPA (platform-as-controller-for-account-slice + clinic-as-controller-for-clinical) and the dedicated-mode DPA (tenant-as-controller-for-everything + platform-as-pure-processor + export-and-purge) are different artefacts. Legal counsel maintains both.
  • Build cost is paid lazily. The runtime logic for tenancy_mode = 'dedicated' is deferred until the first paying dedicated-mode contract closes. Risk: when the first contract arrives, build pressure is high. Mitigation: the foundation-cheap schema reservations (composite uniqueness on humans (email, provider_org_id) NULLS NOT DISTINCT, the tenancy_mode enum column itself, activated_at lifecycle column) land now, so the future build is "wire up the Clerk org provisioner + ops templating" not "rework foundational schema under contract pressure." See features/platform/tenant-isolation.md → Deferred design surface.
  • Sales motion is harder than pure invisible-platform. Some prospects will demand the platform be fully unnamed as a precondition; the shared-mode default loses those deals if pricing for dedicated is too aggressive. The pricing model needs a credible dedicated tier with a real number, not just "contact us."
  • Some customers want a middle ground (shared mode but with their own custom domain, their own SES sender, etc.). The foundation already supports custom domains for shared-mode tenants; per-tenant SES / SMS / Daily.co are available on either tenancy mode as visual-branding customizations, not bundled into the dedicated-mode premium. This is a sales packaging question, not an architectural one.

Revisit triggers

  • A pivot to direct-to-patient services from RestartiX itself (would invalidate Option B's clinic-as-sole-controller story).
  • ANSPDCP / EU regulator guidance requiring patient-visible platform transparency stronger than today's "Powered by RestartiX" attribution (would either tighten Option B's UX requirements or push more tenants toward dedicated mode as the safe path).
  • A single anchor dedicated-mode clinic contract at >5x ARR of an SMB tenant — at that point the cost of building dedicated-mode runtime is fully funded by the contract, and the revisit question is just timing.

Scope kept tight

This ADR commits the platform to two tenancy modes (shared default, dedicated premium) and explicitly rejects the hub model. It does NOT:

  • Define the exact pricing for either tier (sales decision).
  • Build the dedicated-mode runtime (deferred to first paying dedicated-mode clinic contract).
  • Change the existing patient_profiles shape (the prior Open Decision about shrinking or eliminating that table is resolved by this ADR — keep current shape, it's the right primitive for shared mode and is naturally bypassed in dedicated mode).

Why tenant-isolation has its own controllership story?

Shared-mode tenants live under Why clinic is controller, platform is processor — clinic is controller for clinical data, platform is controller for the thin account-level slice (humans + patient_profiles) plus orphan custodianship. Dedicated-mode tenants do NOT live under that ADR. They have a different, narrower controllership story.

The setup

The shared-mode ADR is built around two facts:

  1. Patient identity (humans + patient_profiles) is platform-held, persists across clinics, and survives orphan state. Someone has to be controller; "first clinic that onboarded the patient" doesn't generalise.
  2. Cross-tenant features (anonymised analytics, shared profile reuse with consent) require a platform-level controller for the cross-cutting layer.

Dedicated mode breaks both premises by design:

  1. There is no cross-tenant identity. Each dedicated tenant's patients live in a dedicated auth-provider organisation, with humans rows partitioned by provider_org_id. The "platform-wide identity" object does not exist for dedicated-mode tenants.
  2. There are no cross-tenant features for dedicated tenants. Even anonymised cross-tenant analytics are explicitly excluded from dedicated tenants' data (DPA clause). The cross-cutting layer doesn't apply.

The decision

For dedicated-mode tenants:

  • Tenant = sole controller. All patient data — identity, profile, clinical, account-level, everything — is under the tenant's controllership. No carve-out for the platform layer.
  • Platform = pure processor under Art. 28(3), governed by the dedicated-mode DPA (separate template from shared-mode DPA).
  • No platform-level patient contracts. Patients never accept platform_terms or platform_privacy_notice. The tenant's privacy notice is the only legal artefact patients see.
  • Data-export-and-purge on termination. Platform commits to delivering a structured export within a contractual SLA, then purging tenant data within an additional SLA after export receipt. No orphan custodianship.
  • Sub-processor disclosure flows through the tenant. Tenant's privacy notice names the platform's sub-processors (Clerk, AWS, Daily.co, Twilio). Sub-processor changes are notified to the tenant directly per DPA, not to patients via the platform.

Why this shape

  • Matches what the customer is paying for. Dedicated-mode pricing covers identity sovereignty. The legal model has to deliver that — anything less (e.g., platform retains a thin orphan-custody slice "just in case") would undermine the value proposition.
  • Avoids legal contortion. Trying to apply the shared-mode ADR's clinic-vs-platform split to dedicated tenants produces awkward edge cases (what happens at termination? does the platform-controlled account slice persist? where?). A clean tenant-as-sole-controller model is structurally simpler and matches actual industry practice for dedicated-tenant SaaS.
  • Sub-processor disclosure works naturally. Patients of dedicated tenants see "Tenant X uses RestartiX (and via RestartiX, Clerk, AWS, Daily.co, Twilio) as data processors." Same chain, named under the tenant's controllership.

Why both ADRs coexist

Shared and dedicated are different tenancy modes, not stages of an evolution. A platform that runs both will always have both ADRs active in parallel. They apply to different organizations rows distinguished by the tenancy_mode column (shared vs dedicated). This is intentional, and the tenant-isolation feature spec is the bridge that explains which ADR governs which org.

Considered alternatives

  • Apply the shared-mode ADR uniformly to dedicated tenants too, with a thin "platform retains orphan custody for X days post-termination" carve-out. Rejected: contradicts the data-sovereignty promise. If the platform retains anything post-termination, it isn't full isolation.
  • Eliminate platform_terms for shared-mode tenants too, to use one ADR everywhere. Rejected: the orphan custodianship case for shared-mode tenants is real (decisions.md → Why platform contracts are intentionally invisible covers it), and removing it forces the controller question to fall on "first clinic to onboard" which is the joint-controllership failure mode.

Tradeoffs (kept honest)

  • Two DPA templates to maintain. Acceptable; standard practice for tiered B2B SaaS with materially different processing models.
  • No cross-tenant analytics for dedicated tenants. Shared-tenant data continues to feed anonymised platform analytics; dedicated tenants are excluded by DPA. Marginal cost: a slightly thinner anonymised dataset for cross-tenant insights. Worth it.
  • Audit log retention is the trickiest carve-out. Per CLAUDE.md → Data Retention, the platform must retain audit data for 6 years. The dedicated-mode DPA explicitly carves this out: tenant-scoped audit_log rows older than the deletion window are exported but retained on the platform's audit infrastructure per legal-hold rules. Tenants accept this at signing or do not become dedicated-mode customers.

Revisit triggers

  • Regulator decision (Romanian DPA / EU Data Act successor) that bans the legal-hold carve-out for audit data — would force the audit-retention question to be resolved differently, possibly by per-tenant audit infrastructure.
  • A litigation or insolvency scenario at a dedicated tenant that exposes ambiguity in the export-and-purge SLA — would force tightening of the DPA template.

Scope kept tight

This ADR governs dedicated-mode tenants only. It does not change anything about shared-mode tenants, the clinic-is-controller-platform-is-processor split, the orphan-custodianship pattern, or the existing platform_terms / platform_privacy_notice design. Both modes coexist; both ADRs apply.


Why tenancy_mode is a single enum, not multi-axis?

organizations.tenancy_mode TEXT NOT NULL DEFAULT 'shared' CHECK (tenancy_mode IN ('shared', 'dedicated')) is the single schema column that distinguishes shared from dedicated tenants. There is no dedicated_storage BOOLEAN, no dedicated_encryption BOOLEAN, and no table-level CHECK constraint coupling structural axes together. Earlier rounds of this design carried three columns plus a CHECK; that shape has been retired.

The setup

The earlier framing was three columns — dedicated_identity (per-tenant Clerk org), dedicated_storage (per-tenant S3 bucket), dedicated_encryption (per-tenant CMK) — bundled by a CHECK that forced all three TRUE on dedicated mode, with the latter two also sellable independently on shared.

When we audited what each axis actually requires to ship, the picture changed:

  1. Identity isolation is a real product feature. It needs a per-tenant Clerk organisation provisioned, humans.provider_org_id populated, the auth-provider verifier plumbed to that org, and a platform_service_providers row pointing at it. It changes patient identity flow by construction. This is the axis that's actually structurally different.
  2. Per-tenant S3 bucket without an exit / portability tool is a column on the org row, not a feature. Customers buy dedicated storage because they want clean GDPR export and (eventually) data portability when they leave. Without that tool implemented, a dedicated_storage = TRUE flag is decoration. Nothing in the runtime path changes — files still go through the same writers; we just choose a different bucket. The customer-visible delta is zero.
  3. Per-tenant CMK without a crypto-shred runbook is the same shape. The cryptographic story is that crypto-shred makes erasure tractable. Without a documented operational process to actually retire the key on contract termination, BYOK is a config row that produces identical encryption behaviour to the platform-shared CMK. Same: column, no feature.

The two non-identity axes were flag-only theatre — schema columns with no operational mechanism behind them. Selling them as standalone addons before the mechanisms exist would write commitments we can't honour.

The decision

Single enum column:

```sql
ALTER TABLE organizations
    ADD COLUMN tenancy_mode TEXT NOT NULL DEFAULT 'shared'
    CHECK (tenancy_mode IN ('shared', 'dedicated'));
```
  • shared (default): pooled platform infrastructure with logical isolation (RLS, prefix scoping, app-layer entitlement checks).
  • dedicated (reserved): per-tenant Clerk org. Today not provisionable via API — every creation path lands shared. Reservation columns (humans.provider_org_id + its NULLS NOT DISTINCT composite unique index) stay in place so the eventual provisioner doesn't need a foundation-schema migration.

When per-tenant storage and per-tenant encryption ship as products, they enter as entitlements in the entitlements catalog (and corresponding per-org platform_service_providers override rows for the bucket name and CMK alias), not as columns on organizations. They're addon billing posture, just like custom domain, custom mail-from, and custom branding — which already live as entitlements, not columns.
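A minimal sketch of what that looks like from the calling code's side. The Entitlements interface, the own_s3_bucket key string, and the override-lookup function are illustrative assumptions, not the shipped API:

```go
// Hypothetical sketch: addon posture is resolved through the entitlements
// layer, while the organizations row only answers the structural tenancy question.
package tenancy

import "context"

// Entitlements is an assumed interface over the entitlements catalog
// (tier entitlements plus per-org subscription overrides).
type Entitlements interface {
	Has(ctx context.Context, orgID, key string) (bool, error)
}

// StorageBucketFor decides which S3 bucket an upload writer should target.
func StorageBucketFor(
	ctx context.Context,
	ent Entitlements,
	orgID, platformBucket string,
	overrideBucket func(ctx context.Context, orgID string) (string, error),
) (string, error) {
	ok, err := ent.Has(ctx, orgID, "own_s3_bucket")
	if err != nil {
		return "", err
	}
	if !ok {
		// No addon entitlement: pooled platform bucket, same as every other org.
		return platformBucket, nil
	}
	// Addon present: consult the per-org override row for the bucket name.
	// No boolean column on organizations is read at any point.
	return overrideBucket(ctx, orgID)
}
```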

Why entitlements, not columns

Three reasons the entitlements catalog is the right home for future addons:

  1. The mechanism follows the feature. An entitlement only exists once there's runtime behaviour to gate. Adding entitlement = own_s3_bucket is the same PR as implementing the writer that routes uploads to the per-org bucket and the exit tool that exports from it. No flag without function.
  2. Sales packaging stays in one system. Custom domain, custom mail-from, branding, and (future) own-bucket / own-CMK are all the same shape — paid addons available on either tenancy mode. Putting them all on the entitlements engine keeps tier-and-addon resolution uniform and removes the special case "and also read these three columns on the orgs row."
  3. Schema reservations stay honest. The org row carries what's structurally different about a tenant (identity topology, lifecycle state); everything else is a billing decision. A column that has no operational consequence today is technical debt — someone will read it and infer a feature exists.

What operational mechanisms each future addon needs

Before either addon ships, the underlying mechanism has to exist:

  • tenancy_mode = 'dedicated' provisionable requires a Clerk org provisioner (programmatic API call to create the org, bind app credentials, write the resulting provider_org_id to the humans rows of the tenant's principals and to the platform_service_providers override row).
  • Own-S3-bucket entitlement requires an exit / portability tool that uses the per-tenant bucket — packaged GDPR export + media sync that produces a complete archive a clinic can hand to another vendor or take with them.
  • Own-CMK entitlement requires a documented crypto-shred runbook for GDPR erasure: when a clinic terminates, the per-tenant CMK is retired in KMS, rendering the column-encrypted data unrecoverable. The legal commitment we make ("we can erase your data even from backups") has to actually be true, which means the runbook is the feature.

Each of these is a real chunk of work that will ship in one PR alongside the addon-billing entry. The trigger is a paying customer who funds the work, not a feature flag.

Considered alternatives

  • Keep the three BOOLEANs with the CHECK. Rejected — the storage and encryption columns weren't sellable without the operational mechanisms; carrying them as reservations created an implicit promise the schema couldn't honour and a temptation to wire UI affordances around flags that did nothing.
  • Drop tenancy_mode entirely and derive everything from the entitlements catalog. Rejected — identity topology is structurally different from "is this addon enabled." A dedicated tenant has different identity semantics from day one (same email at two clinics = two distinct platform identities by construction), and that has to be visible at the schema/RLS layer, not buried in entitlement resolution. The column is load-bearing for the one real structural distinction; everything else folds into entitlements.
  • Defer the column too, model everything via entitlements. Rejected for the same reason — when dedicated provisioning ships, the public-resolve and Clerk-binding paths need a cheap synchronous read of "is this a dedicated tenant" without joining through subscriptions and entitlements. The single column is the right primitive for that question.

Tradeoffs (kept honest)

  • Schema today carries a reservation for a feature that isn't shipped. tenancy_mode only takes the value shared for now. The column exists so that when the Clerk org provisioner ships, the rest of the codebase doesn't need a coordinated migration.
  • Future addons don't go in the same place. Storage and encryption addons will live in entitlements / tier_entitlements / organization_subscription_entitlements, not as new columns on organizations. Code that wants to know "does this org have its own bucket?" reads the entitlement, not a schema flag.
  • No CHECK constraint enforces the bundle. Earlier design used a CHECK to make dedicated mode all-or-nothing across three axes. With one column, the only schema constraint left is the CHECK on the enum's values; which addons get bundled together is now a sales/packaging decision, not a schema invariant.

Scope kept tight

This ADR commits to the single-enum schema shape. It does not:

  • Build the Clerk org provisioner or finalize-provisioning endpoint (deferred to first paying dedicated contract — see tenant-isolation.md → Deferred design surface).
  • Define the entitlements catalog entries for own_s3_bucket or own_cmk (added when the operational mechanisms behind each ship).
  • Decide the marketing label for either future addon (sales packaging).

Why activated_at as the org draft-state mechanism?

organizations.activated_at TIMESTAMPTZ NULL gates public visibility of an org row. NULL = draft (org row exists but the slug/domain returns 404 from public-resolve); non-NULL = active. The column applies to both tenancy modes, but only one mode actually exercises the draft window today.

The setup

Provisioning a dedicated-mode org will eventually have a multi-step workflow: insert the row, provision the per-tenant Clerk org programmatically, write platform_service_providers overrides, queue the welcome email. There's a real time gap between "row exists" and "org can serve patients." Without a gate, an attacker (or a bug) could resolve the slug, hit endpoints, get partial responses, or trigger a welcome email before the underlying identity provider exists.

The shared-mode path has no provisioning gap — every shared org is immediately serviceable. But carrying a uniform lifecycle column for both modes keeps the code paths symmetrical and decouples "create the row" from "send the welcome email," which is operationally useful even when the gap is zero.

The decision

A single nullable timestamp column. Three behaviours gate on it:

  1. GET /v1/public/organizations/resolve?slug=… returns 404 when activated_at IS NULL.
  2. Welcome email is queued by the activation transition, not by raw INSERT.
  3. Owner first-login bind-on-first-auth checks activated_at before binding the auth-provider user to the org row.

Console superadmin always sees all orgs regardless of state.
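A minimal sketch of gate 1 (the public-resolve 404), assuming pgx and a sqlc-style row type with a nullable ActivatedAt; the handler, repository, and type names here are illustrative, not the shipped code:

```go
package publicapi

import (
	"context"
	"database/sql"
	"encoding/json"
	"errors"
	"net/http"

	"github.com/jackc/pgx/v5"
)

type orgRow struct {
	Slug        string       `json:"slug"`
	Name        string       `json:"name"`
	ActivatedAt sql.NullTime `json:"-"`
}

type orgRepo interface {
	GetOrganizationBySlug(ctx context.Context, slug string) (orgRow, error)
}

type PublicHandler struct{ repo orgRepo }

func (h *PublicHandler) ResolveOrganization(w http.ResponseWriter, r *http.Request) {
	org, err := h.repo.GetOrganizationBySlug(r.Context(), r.URL.Query().Get("slug"))
	if errors.Is(err, pgx.ErrNoRows) {
		http.NotFound(w, r) // unknown slug
		return
	}
	if err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}
	// Gate 1: a draft org (activated_at IS NULL) is indistinguishable from an
	// unknown slug: same 404, no existence oracle for unprovisioned tenants.
	if !org.ActivatedAt.Valid {
		http.NotFound(w, r)
		return
	}
	_ = json.NewEncoder(w).Encode(org)
}
```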

Today. Every creation path sets activated_at = NOW() in the same transaction as the INSERT. Dedicated mode is not yet provisionable via API — there's no Clerk org provisioner, so there's no provisioning step to wait for. Shared orgs auto-activate; that's all the orgs we create.

Future, when dedicated-mode provisioning ships. The dedicated creation path will INSERT with activated_at = NULL, run the Clerk org provisioner asynchronously, and a re-introduced finalize endpoint (with preconditions checking the Clerk org exists, platform_service_providers overrides are written, etc.) will flip activated_at = NOW(). The column is the reservation that lets that flow slot in without a migration.

Why not a state-machine column

A provisioning_state TEXT CHECK (...) enum (draft, provisioning, infra_ready, active) was considered. Rejected — over-engineered for what the public-facing code path actually needs to know. From the public side, the only question is "should I respond to this slug?" The answer is a single boolean derived from a single column. Internal Console UI can derive richer state from other signals (existence of platform_service_providers rows, Clerk org provisioning status) without that derivation needing to be a schema field.

Considered alternatives

  • is_active BOOLEAN instead of timestamp. Loses the activation-time information. Timestamp gives us both the boolean (activated_at IS NOT NULL) and the activation point, for free.
  • Separate published / pending columns. More state, more places for inconsistency, no behavioural benefit.
  • Use existing organization_billing.status to gate. Conflates billing posture with operational lifecycle; an unpaid org isn't the same as a draft org.
  • Drop the column entirely until dedicated provisioning ships. Considered — but then the eventual dedicated-mode work would need a schema migration that touches the same hot-path code (public-resolve, welcome-email queue, owner-bind) as the foundation already wires. Keeping the column means dedicated provisioning ships as a service-layer change, not a schema change.

Tradeoffs (kept honest)

  • The column is mostly a reservation today. Every row's activated_at is non-NULL in practice. The complexity of the gate exists for a future mode that isn't shipped. Accepted — the alternative is a schema migration coordinated with the dedicated-mode provisioner, which is the more expensive path.
  • No paused_at / suspended_at for nonpayment yet. The schema only carries the activation transition. When billing-driven suspension lands, it's a separate column. Keeping activated_at narrow to "did this org ever go live" is intentional.

Scope kept tight

This ADR adds one column and three gate conditions. It does not:

  • Build a full lifecycle state machine (deferred — added when there's a real second non-active state).
  • Define billing-driven suspension (separate concern; separate column when needed).
  • Re-introduce the finalize-provisioning endpoint (deferred until dedicated provisioning ships).

Why per-clinic re-onboarding creates a fresh patients row?

A patient withdraws from clinic A (patients.deleted_at = NOW(), subscription canceled, org-scope consents withdrawn). Months later, they sign up at clinic A again. What happens?

Decision: a brand-new patients row + brand-new patient_subscriptions row + reused patient_profiles row. The previous soft-deleted patients row and canceled subscription stay in place as historical record; never resurrected.

Schema mechanic: a partial unique index UNIQUE (patient_profile_id, organization_id) WHERE deleted_at IS NULL on patients. Multiple soft-deleted historical rows can coexist for the same (profile, org) pair, but at most one active row at any point in time.
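A hypothetical sketch of the re-onboarding insert path shows how that index does the work; the query, error value, and function name are illustrative, not the shipped repository:

```go
package patients

import (
	"context"
	"errors"

	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgconn"
)

var ErrAlreadyActive = errors.New("patient already has an active chapter at this clinic")

// Reonboard always INSERTs a fresh patients row; it never clears deleted_at on
// a historical one. The partial unique index is what turns "a second active row
// for this (profile, org) pair" into a unique violation.
func Reonboard(ctx context.Context, tx pgx.Tx, profileID, orgID string) (string, error) {
	var id string
	err := tx.QueryRow(ctx, `
		INSERT INTO patients (patient_profile_id, organization_id)
		VALUES ($1, $2)
		RETURNING id`, profileID, orgID).Scan(&id)

	var pgErr *pgconn.PgError
	if errors.As(err, &pgErr) && pgErr.Code == "23505" {
		// UNIQUE (patient_profile_id, organization_id) WHERE deleted_at IS NULL fired.
		return "", ErrAlreadyActive
	}
	return id, err
}
```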

Why not "restore" the old row (clear deleted_at, reuse): two reasons. (1) GDPR: each onboarding is a fresh processing relationship under a fresh consent — the patient agreed to the org's privacy notice version as of the new onboarding, which may differ from the version they previously withdrew from. Restoring the old row blurs that boundary. (2) Data integrity: subscription state (canceled), consent state (withdrawn), and various last-X timestamps all reflect the previous chapter. Restoring just the patients row leaves the chain inconsistent; restoring the chain wholesale fights the historical-record purpose of soft delete.

Why not fail with "you withdrew, contact support": the platform is multi-tenant SaaS for telemedicine; the friction would block legitimate returning patients without commensurate safety value (no re-identification step beyond Clerk's existing email-based auth gate matters here — it's the same human, already verified, returning to a clinic that already knows them).

The fresh-row approach gives clean per-chapter audit reconstruction: each patients.id is a stable handle on one processing chapter, each entity_id in audit_log traces back to exactly one chapter, and clinic admins / operators can answer "when was this patient at clinic A?" with a definitive timeline rather than a "between these two date ranges" hedge.

The portable patient_profiles row is reused because that's the patient's identity, not their relationship-with-this-clinic. Same human → same profile, regardless of how many onboarding chapters they accumulate.

Out of scope here: platform-wide account deletion (GDPR Art. 17 erasure) is a different concept — that anonymizes patient_profiles + humans, cascades soft-delete to all patients rows. After erasure, the same person returning to the platform is treated as a stranger (new Clerk → new principal → new identity chain). That flow lands in F11.1, not 1B.9.


Why most PII is plaintext (and what isn't)?

This platform stores a lot of personal and clinical data — names, dates of birth, addresses, occupations, insurance details, allergies, diagnoses, treatment plans, contact numbers, emails. The instinct to "encrypt all PII at the column level" sounds like the conservative choice for a healthcare SaaS. It is the opposite — adopting it would weaken the platform's posture, not strengthen it. This ADR captures the rule the foundation actually follows, the threat model behind it, and the decisions for the four columns that triggered the discussion.

The rule

Column-level encryption is reserved for two narrow categories:

  1. Credential material (auth_secret class) — values whose plaintext IS access. API keys, integration tokens, webhook signing secrets, OAuth refresh tokens. Stored encrypted (when the platform must read them back) or hashed (when one-way is sufficient — api_key_hash BYTEA is SHA-256).
  2. Regulated identifiers (pii_regulated class) — national IDs whose unauthorized disclosure carries disproportionate legal consequences under national law beyond GDPR Art. 6: SSN, CUI (RO), passport numbers, IBAN-like account identifiers when they appear. Stored as *_encrypted BYTEA under AES-256-GCM (P12).

Everything else is plaintext. Names, emails, phones, DOB, addresses, occupations, allergies, diagnoses, prescriptions, treatment notes, insurance entries, residences. They rely on the layered controls below.
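A minimal Go sketch of the two storage shapes, assuming a 32-byte data key already unwrapped from KMS; the helper names are illustrative, not the platform's crypto package:

```go
package colcrypto

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"io"
)

// HashAPIKey is the one-way shape (auth_secret, *_hash BYTEA): the platform only
// ever needs equality checks against the stored hash, never the plaintext back.
func HashAPIKey(plaintext string) []byte {
	sum := sha256.Sum256([]byte(plaintext))
	return sum[:]
}

// EncryptRegulatedID is the reversible shape (pii_regulated, *_encrypted BYTEA):
// AES-256-GCM with a random nonce prepended to the ciphertext. The random nonce
// is also why these columns can't be searched by equality: the same plaintext
// encrypts to a different value every time.
func EncryptRegulatedID(key []byte, plaintext string) ([]byte, error) {
	block, err := aes.NewCipher(key) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, []byte(plaintext), nil), nil
}
```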

This rule is enforced mechanically by services/api/cmd/check-classification, which fails the build on three invariants:

  • A column classified pii_regulated MUST be BYTEA AND named *_encrypted.
  • A column named *_encrypted MUST be BYTEA AND classified pii_regulated or auth_secret.
  • A column named *_hash whose class is auth_secret MUST be BYTEA.

So drift in either direction — forgetting to encrypt a regulated ID, or encrypting something that doesn't earn it — fails make check. The check runs in the pre-commit hook, so the rule is structural, not aspirational.
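For illustration, the invariant check is roughly this shape; the column struct and class strings are assumptions about the registry format, not the actual check-classification source:

```go
package classification

import (
	"fmt"
	"strings"
)

type column struct {
	Table, Name, SQLType, Class string
}

// checkColumn applies the three invariants from the list above to one registry entry.
func checkColumn(c column) error {
	isBytea := strings.EqualFold(c.SQLType, "BYTEA")
	encrypted := strings.HasSuffix(c.Name, "_encrypted")
	hashed := strings.HasSuffix(c.Name, "_hash")

	switch {
	case c.Class == "pii_regulated" && !(isBytea && encrypted):
		return fmt.Errorf("%s.%s: pii_regulated must be BYTEA and named *_encrypted", c.Table, c.Name)
	case encrypted && !(isBytea && (c.Class == "pii_regulated" || c.Class == "auth_secret")):
		return fmt.Errorf("%s.%s: *_encrypted must be BYTEA and classified pii_regulated or auth_secret", c.Table, c.Name)
	case hashed && c.Class == "auth_secret" && !isBytea:
		return fmt.Errorf("%s.%s: auth_secret *_hash must be BYTEA", c.Table, c.Name)
	}
	return nil
}
```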

Why "encrypt everything" makes the platform less secure

Column-level encryption is real defense against a narrow, specific threat: a logical-backup or pg_dump leak that bypasses volume-at-rest encryption. Outside that window, it does almost nothing — and adopting it broadly causes harm:

  • It does not protect against the realistic attacker. Any attacker who rides the application — SQLi that pivots into a handler, RCE in the API service, a stolen webhook signing key, account takeover — gets plaintext, because the application has KMS access by definition. Encryption-everywhere does not raise the bar for the attackers who matter.
  • It breaks the relational database. Encrypted columns lose equality search, range queries, joins, foreign-key checks, sortable indexes. Workarounds (HMAC siblings, deterministic encryption, blind indexes) reintroduce side channels that are harder to reason about than the plaintext column they replaced. The patient-phone case is the canonical example: random-nonce AES-GCM made phone search impossible — a feature requirement that EHR clinics treat as table stakes.
  • It pays a hot-path cost on every read. A clinic dashboard listing 50 patients = 50 phone decryptions, 50 email decryptions, 50 emergency-contact decryptions. Cumulative milliseconds across millions of requests, for no defensive gain.
  • It expands the key-management surface. More keys = more rotation events = more chances to corrupt data with a botched rotation. The "we encrypted everything" platform is more likely to lose patient data to a key-handling incident than the layered-defense one is to leak it.
  • It produces false confidence. Teams that "encrypt everything" stop investing in the controls that actually matter, because the encryption checkbox is ticked.

What the layered defense actually looks like

The threats that matter for a rehab platform are addressed by the controls below. Column-level encryption appears once, on the bottom row, scoped narrowly:

| Control | Threat it addresses | Status |
| --- | --- | --- |
| TLS everywhere (browser↔app, app↔DB, app↔external) | Network sniffing, MITM | In place |
| Volume-at-rest encryption (RDS KMS) | Stolen physical media, snapshot theft | 1D |
| Encrypted automated backups (RDS managed, KMS-wrapped) | Backup tape / S3 export theft | 1D |
| Row-Level Security at the database (P5) | Cross-tenant access via SQLi or app bug | In place |
| Per-org RBAC permissions (P6) | Within-tenant unauthorized access | In place |
| Audit log on every mutation; row-level read audit on clinical_sensitive (P10) | Insider abuse, forensic trail | In place |
| Break-glass for any cross-tenant identifiable access | Support overreach, joint controllership | 1B.11 |
| PII masking in logs (P11) | Accidental PII in log aggregation | In place |
| Pseudonymized identifiers helper (internal/shared/pseudonym) | Available for any cross-tenant aggregate egress surface that needs it (Telemetry itself does not — readers are clinic-scoped) | In place |
| Secrets in KMS, never in env vars or source | Source-control leak, ops accident | In place |
| No public DB endpoint, VPC + security groups | Network-level enumeration | 1D |
| Restricted DB access (only the API has runtime creds; ops via IAM-controlled break-glass with audit) | DBA insider, stolen ops credential | 1D |
| Column-level encryption for auth_secret and pii_regulated only | Logical-backup leak of credential material or national IDs | In place for CUI; future API keys / signing secrets |

A leaked logical backup of this platform exposes plaintext name, DOB, residence, allergies, diagnoses — and that is the same surface every major EHR (Epic, Cerner, Athena, Meditech) exposes by the same design. The legal expectation under GDPR Art. 32 is "appropriate technical and organizational measures," interpreted by every published EU-DPA EHR guidance (CNIL, Garante, ANSPDCP) as layered defense, not column-encrypt-everything. The notification surface under Art. 33 / 34 is determined by the data subject's identifiability, not by which fields happened to be encrypted at the column level.

The four columns this ADR resolves

| Column | Decision | Class | Why |
| --- | --- | --- | --- |
| patient_profiles.phone_encrypted | Revert to plaintext phone | pii_basic | Phone search is a required clinic CSAT feature (caller-ID lookup, partial / last-N-digit lookup). Random-nonce AES-GCM made it impossible. The industry default in EHRs is plaintext phone. The threat is no different from the patient's plaintext name and DOB on the same row. |
| patient_profiles.emergency_contact_phone_encrypted | Revert to plaintext emergency_contact_phone | pii_basic | Never searched, but kept consistent with the patient-phone column — encrypting one and not the other is bookkeeping with no defensive gain. |
| humans.email | Plaintext, documented | pii_basic | Clerk is the canonical identity store, covered by a HIPAA BAA and SOC 2 Type II. The local copy is a denormalization for /v1/me, member lists, audit-log enrichment, and webhook reconciliation (WHERE email = $1). Encrypting forces an HMAC sibling for every email-touching call site; the cost is real and the threat reduction is marginal because the same backup leaks plaintext name, DOB, residence on the same row. |
| organizations.phone | Plaintext | org_internal | The clinic's own published contact line — appears in business directories, on the clinic's portal landing page, in invoices. Not private PII. |

The two phone columns in patient_profiles come out of pii_sensitive. After this revert, pii_sensitive is empty across the registry, and the class is removed from the taxonomy entirely — see data-classification.md. The cleaner taxonomy is pii_basic (most contact PII, plaintext + layered defense) and pii_regulated (national IDs, encrypted), with no muddy middle tier that invites future authors to dump things in for ill-defined reasons.

organization_billing.tax_id_encrypted (CUI) stays as it is — it is a regulated national identifier and the canonical example of pii_regulated.

Considered alternatives

| Approach | What it is | Why we rejected (or chose) it |
| --- | --- | --- |
| Encrypt all PII columns at rest | Every name, email, phone, DOB, address, allergies, diagnosis is BYTEA + AES-256-GCM | Rejected. Breaks search/joins/indexes, costs ms per read across the platform, expands the key-management surface, and does not protect against the application-layer attackers who actually matter. Misaligned with EHR industry practice and EU-DPA guidance. |
| Encrypt patient phone with HMAC sibling for lookup | phone_encrypted BYTEA + phone_lookup BYTEA with HMAC-SHA-256 | Rejected. HMAC supports exact-match on a normalized form but cannot support partial-number / last-N-digit search, which is the actual CSAT requirement. Half a solution at full cost. |
| Drop humans.email and call Clerk on every read | No local copy; Clerk is the only source | Rejected. Every /v1/me, every member list, every audit-log enrichment, every invite-pending render becomes a Clerk round-trip. Performance + cost regression with no proportional security gain (Clerk's own data still leaks if they breach). |
| Column-encrypt only credentials and regulated IDs; plaintext for the rest, with layered defense | The rule above, mechanically enforced by check-classification | Chosen. Aligned with EHR industry practice and EU-DPA guidance. The column-level lever is reserved for the cases where it actually changes the threat picture; everything else relies on the controls that do address the realistic attacker. |

How the rule scales to future columns

Every Layer 2+ migration adds columns. The author classifies each one in data-classification.md in the same PR; the encryption invariants check picks the shape automatically:

  • A new appointments.contact_email TEXT classified pii_basic — passes. Plaintext + RLS + audit + at-rest disk encryption is the protection envelope.
  • A new patient_profiles.national_id TEXT classified pii_regulated — fails the build. The author either reclassifies or renames to national_id_encrypted BYTEA and adds the encryption at the repository boundary.
  • A new webhook_subscriptions.signing_secret_encrypted BYTEA classified auth_secret — passes.
  • A new member_invites.invite_token_hash BYTEA classified auth_secret — passes.
  • A new outbox.payload_encrypted BYTEA classified system_metadata — fails the build. Either drop the _encrypted suffix (if it's not actually encrypted) or upgrade the class to auth_secret with documented reasoning.

There is no per-PR judgment about "should we encrypt this." The class drives the answer.

Tradeoffs (kept honest)

  • A logical-backup leak of this platform exposes plaintext patient name, DOB, residence, occupation, allergies, diagnoses, treatment plans. That's the same exposure every EHR has at the column level; the notification surface under GDPR Art. 33 / 34 is unchanged whether email is plaintext or column-encrypted, because the same backup leaks the rest of the row.
  • The DBA / ops-role threat is real and is mitigated operationally — IAM-restricted DB access, no shared accounts, every administrative connection audited, the API service holds the only runtime DB credentials. We accept that an attacker with both the restartix role and KMS access can decrypt anything; defense at that point is "the auditor sees them do it," not column encryption.
  • If a future feature stores something genuinely heightened-sensitivity that doesn't fit pii_regulated — e.g., a confidential whistleblower hotline number — the right move is to add a new class with explicit ADR rationale and update the encryption invariants check to cover it. The taxonomy stays disciplined; we don't reinflate pii_sensitive.

Revisit triggers

  • A retained DPO or DPIA reviewer formally objects to plaintext email or phone on backup-leak grounds → consider Option B (HMAC-indexed lookup) for the specific column. The migration shape is documented in the considered-alternatives table; the existing crypto helper covers the encryption side already.
  • A national DPA publishes binding guidance that EHR contact PII must be column-encrypted → reclassify and migrate; the check picks up the change automatically.
  • A new direct-to-public surface ships that exposes one of these columns to a target the registry doesn't currently allow → that's an egress decision, not an encryption decision; the registry already covers it.
  • Any future column that semantically belongs to a class not represented today (e.g., genuine biometric template data, financial account numbers) → add the class, decide its encryption posture in a sibling ADR, extend the check.

Scope kept tight

This ADR settles the four named columns and freezes the rule. It does not retrofit the layered controls — those land per-layer (RLS already shipped; KMS-backed RDS in 1D; break-glass in 1B.11). It does not preemptively classify Layer 4+ columns that don't exist yet — every migration carries its own classification entry per P39, which is when the encryption invariants will check it.

Implementation: the migration that reverts phone_encrypted and emergency_contact_phone_encrypted to plaintext lands as a single edit to migrations/core/000006_patient_identity.up.sql per the editable-migrations rule (CLAUDE.md). The encryption invariants check ships in services/api/cmd/check-classification alongside the registry update.


Open decisions

Decisions still pending live in the implementation plan, not here, so they stay attached to the layer that has to resolve them. See implementation-plan.md → Open Decisions. Items currently parked there:

  • IaC tool — Terraform vs Pulumi (1D, before AWS staging is provisioned)
  • PDF rendering engine — ChromeDP vs wkhtmltopdf vs Gotenberg (F6.2)
  • Email / SMS / push transport providers (F7)
  • Bunny Stream vs S3 for exercise videos (F9.1)
  • Cross-purpose withdrawal cascade for Tier B medical consents — independent vs cascading (F3.5)
  • Member invite mechanism — Clerk-driven email invites vs custom token vs share-link (1B.12)
  • Break-glass support_engineer platform role vs superadmin-only access (1B.11)

Once a decision is made, write the why here and remove it from the implementation-plan list.

Recently resolved:


Why unstable_cache, not Next.js's built-in fetch tags?

Pattern P42 caches server-side reads via unstable_cache from next/cache, NOT via the documented cache: "force-cache" + next: { tags } mechanism on fetch(). We tried the documented mechanism first; it silently doesn't work for any auth-walled API.

The discovery

We wired the obvious thing: init.cache = "force-cache"; init.next = { tags: [...] } on every tagged GET. Production-built Console + a clean Redis. Hit the page twice. Watched the Core API request log: every refresh fired a fresh GET. Five identical cache entries on disk for one URL — so writes were happening, hits were not.

The cause is in next/dist/server/lib/incremental-cache/index.js:generateCacheKey. The cache key is a hash over the request payload, including the entire headers object. Only traceparent and tracestate are excluded:

```js
const cacheString = JSON.stringify([
    MAIN_KEY_PREFIX,
    this.fetchCacheKeyPrefix || '',
    url,
    init.method,
    headers,        // ← full headers including Authorization
    init.mode,
    ...
]);
```

Clerk's getToken() returns a freshly-signed JWT on each call (the IAT timestamp shifts), so the Authorization: Bearer <jwt> header is byte-different on every request from the same user. Different cache key → cache write but never a cache hit. The mechanism is not designed for rotating-token workloads.

unstable_cache works because the cache key is ours

```ts
unstable_cache(fetchFn, [tag], { tags: [tag], revalidate: false })()
```

The cache key is the explicit string we pass — the data-scope identifier (platform:legal-templates, org:abc:summary, me:xyz:consents) — not derived from request properties. Two different users with two different Clerk JWTs hit the same cache entry because the key only encodes the data scope.

Considered alternatives

  • Strip the Authorization header from cache-tracked fetches. Would require either (a) routing through a Next.js Route Handler proxy that re-attaches auth on the way out, or (b) using a long-lived service token instead of the user's JWT. Both add a layer; option (b) trades user attribution for caching.
  • Use next.revalidate: 60 instead of tag-based invalidation. Same Authorization-header problem, plus loses precise invalidation on write.
  • Disable caching at the api-client and cache at the page level. Doable but fragments the convention — every page would need its own cache wrapper. Centralizing in the api-client keeps one control point.

Cost frame

unstable_cache is marked "unstable" in the Next.js API. Cost of switching to a future stable equivalent: a one-line change in packages/api-client/src/client.ts. That's worth more than fighting the fetch-tag mechanism.


Why URL ≡ scope guard at the route layer?

middleware.RequireURLOrgMatchesScope("id") is mounted on every per-org route group. It rejects any non-superadmin request whose URL {id} doesn't match the principal's CurrentOrganizationID (resolved from X-Organization-ID). Surfaced by the cache work; closes a latent gap that existed long before caching.

What the gap was

RequireOrganizationScope validates the principal has access to the org named in the header. The handler reads URL {id} and queries that org. If header=A and URL=B, the handler reaches the service. RLS at the DB layer protects today: query for org B with RLS context "user is member of A" returns nil → 404. Functionally safe — but only because RLS hides the response.

Why caching turns this into a real leak

Once we cache by URL {id} (P42 + P45):

  1. User Y (member of org B) requests /v1/organizations/B — cache populates org:B:summary with B's data.
  2. User X (member of org A only, header=A) requests /v1/organizations/B — cache hit, serves Y's data to X.

RLS only ran for the first request. Cache hits skip it.

Why fix it at the route layer, not the cache layer

We considered three approaches:

  1. Cache key derived from CurrentOrganizationID (header), not URL. Service-layer change. Works, but couples caching to auth context — a future endpoint that legitimately wants to cache by URL ID would need a different mechanism. Also clutters every cached service method with the same precondition check.
  2. Per-endpoint URL == header guard inside the service. Same problem at smaller scope.
  3. Route-layer middleware that 403s mismatches ← we picked this.

Option 3 is the universal answer:

  • Applies once at the route group; every nested route inherits.
  • Independent of caching — closes the latent security gap whether or not the endpoint caches today.
  • Matches HTTP semantics: URL is the resource path; if you can't access the resource, you can't query it.
  • Cost is a UUID parse per request.

Superadmin bypass

Superadmins (platform_roles, human-only by CHECK constraint) operate cross-tenant by design — Console reads any org without ever setting X-Organization-ID. Their CurrentOrganizationID is uuid.Nil; without the bypass they'd 403 every Console request. Bypass is safe because superadmin status is row-level state granted only by another superadmin, not derivable from per-org permissions.
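A minimal sketch of the guard, assuming chi for URL params and a principal placed in context by earlier auth middleware; the context key and principal fields are assumptions, not the shipped types:

```go
package orgscope

import (
	"net/http"

	"github.com/go-chi/chi/v5"
	"github.com/google/uuid"
)

type principalKey struct{}

// Principal stands in for whatever the real auth middleware stores in context.
type Principal struct {
	CurrentOrganizationID uuid.UUID // uuid.Nil for Console superadmins (no X-Organization-ID)
	IsSuperadmin          bool
}

func principalFrom(r *http.Request) (*Principal, bool) {
	p, ok := r.Context().Value(principalKey{}).(*Principal)
	return p, ok
}

// RequireURLOrgMatchesScope rejects any non-superadmin request whose URL {id}
// doesn't match the org scope resolved from X-Organization-ID.
func RequireURLOrgMatchesScope(param string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			urlOrg, err := uuid.Parse(chi.URLParam(r, param))
			if err != nil {
				http.Error(w, "invalid organization id", http.StatusBadRequest)
				return
			}
			p, ok := principalFrom(r)
			if !ok {
				http.Error(w, "unauthenticated", http.StatusUnauthorized)
				return
			}
			// Superadmins operate cross-tenant by design; everyone else may only
			// query the org they are scoped to.
			if !p.IsSuperadmin && p.CurrentOrganizationID != urlOrg {
				http.Error(w, "forbidden", http.StatusForbidden)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}
```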

Discipline

Every new r.Route("/{id}", ...) block under a per-org URL must mount the middleware, even if the endpoint doesn't cache. The reason is that adding caching later is a one-line change in the api-client; if the route guard is missing, adding caching turns into a security review. The /new-domain template lists this as a mandatory step.


Why self-hosted pgbouncer on ECS Fargate, not RDS Proxy?

Pattern P44. RDS Proxy is the AWS-native managed alternative; we use self-hosted pgbouncer 1.25 on ECS Fargate. The trade-off is operational simplicity vs throughput: RDS Proxy is push-button, but it pins client connections when prepared statements are in use, which kills the multiplexing we're paying for.

The pinning problem

pgx (our Postgres driver) uses QueryExecModeCacheStatement by default — every query becomes a named server-side prepared statement. RDS Proxy detects the PARSE step and pins the client to a specific backend connection for the rest of the session. Pinning means each app connection holds one backend connection exclusively — that's identical behavior to no proxy at all, except now we're paying RDS Proxy's per-vCPU cost on top.

Two ways to make RDS Proxy work, both worse

  1. Switch pgx to QueryExecModeCacheDescribe or QueryExecModeExec. Drops named prepared statements. Costs ~10–30% on query throughput for repeated queries because Postgres re-parses + re-plans every call. Workload is mostly request-scoped queries on rotating connections, so the loss is real but bounded — yet it's a real loss for a real benefit (managed-ness).
  2. Wait for RDS Proxy to add protocol-level prepared statement support. pgbouncer added this in 1.21 (Oct 2023); RDS Proxy roadmap is unclear at time of writing.

Why self-hosted is fine

pgbouncer 1.25 with max_prepared_statements=200 handles named prepared statements transparently in transaction pool mode. pgx's default exec mode works without modification. Full multiplexing benefit retained.
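For illustration, the app-side pool setup is roughly this shape; the env var name and package layout are assumptions, and the point is that pgx's default exec mode is left alone:

```go
package db

import (
	"context"
	"os"

	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgxpool"
)

// NewPool connects through pgbouncer (transaction pool mode) using the pooled DSN.
func NewPool(ctx context.Context) (*pgxpool.Pool, error) {
	cfg, err := pgxpool.ParseConfig(os.Getenv("DATABASE_URL")) // points at pgbouncer, not Postgres
	if err != nil {
		return nil, err
	}
	// Explicit for documentation's sake; this is already pgx v5's default.
	// Behind RDS Proxy this mode would pin every session. Behind pgbouncer 1.25
	// with max_prepared_statements=200, the named statements multiplex as intended.
	cfg.ConnConfig.DefaultQueryExecMode = pgx.QueryExecModeCacheStatement
	return pgxpool.NewWithConfig(ctx, cfg)
}
```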

The operational cost of self-hosted pgbouncer on ECS:

  • Single static binary, single config file. Shipped at scale for 15+ years; the project is stable enough that "operate it yourself" is closer to "set it and forget it" than to "build a service team around it."
  • Two replicas in ECS Fargate (one per AZ) behind the same ALB as the application services, with security-group-restricted access. ~50 lines of Terraform. The Core API and Telemetry API tasks reach pgbouncer over the private subnet via the internal hostname.
  • Same pgbouncer.ini ships to ECS as runs in local docker-compose — true dev/staging/prod parity. Auth method differs (local: plain text userlist; prod: scram-sha-256 + auth_query against a SECURITY DEFINER Postgres function with credentials from Secrets Manager).

Why ECS Fargate everywhere, not App Runner

The full rationale lives in Why ECS Fargate over App Runner?. Short version: pgbouncer must be on Fargate (App Runner is HTTP-only), scheduled tasks must be on Fargate (App Runner doesn't host them), and migration runners must be on Fargate (App Runner can't run one-shot tasks). Mixing two compute platforms means operating two of everything; consolidating on Fargate keeps the IaC single-shaped.

Migrations bypass the pool

golang-migrate uses session-scoped pg_advisory_lock to serialize migration runs across deploying instances. Advisory locks are session features — incompatible with transaction pool mode. The Makefile's migrate-up / migrate-down targets use DATABASE_DIRECT_URL (port 5432, direct to Postgres / RDS), not the pooled DSN. Same convention applies on AWS: the deploy pipeline runs migrations as a one-shot ECS task with a security group allowing direct port 5432 to RDS for that task only.
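A sketch of what the one-shot runner might look like using golang-migrate's Go API and the direct DSN; the migrations path and env var follow the conventions named above, everything else is illustrative:

```go
package main

import (
	"errors"
	"log"
	"os"

	"github.com/golang-migrate/migrate/v4"
	_ "github.com/golang-migrate/migrate/v4/database/postgres" // registers the postgres driver
	_ "github.com/golang-migrate/migrate/v4/source/file"       // registers the file:// source
)

func main() {
	// Direct to Postgres on 5432: pg_advisory_lock is session-scoped and would
	// not survive pgbouncer's transaction pooling.
	m, err := migrate.New("file://migrations/core", os.Getenv("DATABASE_DIRECT_URL"))
	if err != nil {
		log.Fatal(err)
	}
	if err := m.Up(); err != nil && !errors.Is(err, migrate.ErrNoChange) {
		log.Fatal(err)
	}
}
```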

Why one Console-side break-glass primitive (not per-feature elevation)

Foundation 1B.11.x. The break-glass primitive (1B.11) shipped backend-complete: a session table, scoped permissions, a route middleware that binds an audit GUC so every row written downstream links back to the session, and a notification fan-out to clinic admins. The Console UI side was stubbed — ?bg=active URL state, no real session opens, no real banners. As soon as the second Console surface needed elevation (staff invitations alongside the existing patient surfaces), the question surfaced: ship a one-off wrap per feature, or build a single primitive every Console write goes through?

Decision

Build the primitive once. Five pieces, all reusable:

  1. RequirePerOrgPermissionOrBreakGlass(permission, scope, svc, paramName) — chi middleware that admits both callers on the same endpoint: tenant principals via per-org permission, platform principals via an active break-glass session at the named scope (sketched after this list). The split by principal type is structural (clinic = controller, platform = processor), and the middleware encodes it in 50 lines. Same route, two acceptable auth paths.

  2. org_management scope — broad scope covering staff/roles/members/settings writes. A single elevation session covers a run of related tasks without re-prompting per click; the audit row records the specific action so the session's blast radius is reconstructable. Held by support_engineer for routine clinic-recovery work; superadmin holds it unconditionally.

  3. <BreakGlassSessionProvider> + useBreakGlassSession(scope) — Console layout fetches the calling principal's active sessions for the surrounding org once, threads them through a context, and exposes per-scope state to descendant client components. P48-compliant via useServerSyncedState.

  4. <RequireBreakGlass scope=…> — client wrapper that conditionally renders children when an active session exists and surfaces the elevation prompt otherwise. Defense-in-depth alongside the backend gate, plus the UX win of "don't show the action button when the action will 403."

  5. withBreakGlass(fn) action wrapper — server-side helper that surfaces backend break_glass_required / break_glass_expired errors as a typed sentinel instead of an exception, so server actions can return them as part of useActionState for the client UI to react to.
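
To make piece 1 concrete, a hedged sketch of the middleware's shape follows. Everything in it besides the function signature and the two admission paths (AuthzService, Principal, PrincipalFromContext, BindAuditSession) is an illustrative stand-in, not the platform's real code:

```go
// Sketch of the dual-path route guard, assuming chi. All helper types below are
// illustrative stand-ins for the real auth plumbing.
package middleware

import (
	"context"
	"net/http"

	"github.com/go-chi/chi/v5"
)

type PrincipalType int

const (
	PrincipalTenant PrincipalType = iota
	PrincipalPlatform
)

type Principal struct {
	ID   string
	Type PrincipalType
}

type BreakGlassSession struct{ ID string }

// AuthzService is a hypothetical facade over the real permission / session lookups.
type AuthzService interface {
	HasPerOrgPermission(ctx context.Context, principalID, orgID, permission string) bool
	ActiveBreakGlassSession(ctx context.Context, principalID, orgID, scope string) (BreakGlassSession, bool)
}

// PrincipalFromContext stands in for whatever the earlier auth middleware resolved.
func PrincipalFromContext(ctx context.Context) Principal { return Principal{} }

// BindAuditSession stands in for setting the audit GUC that links downstream writes
// to the break-glass session.
func BindAuditSession(ctx context.Context, sessionID string) context.Context { return ctx }

func RequirePerOrgPermissionOrBreakGlass(permission, scope string, svc AuthzService, paramName string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			ctx := r.Context()
			orgID := chi.URLParam(r, paramName)
			p := PrincipalFromContext(ctx)

			switch p.Type {
			case PrincipalTenant:
				// Clinic staff (controller path): ordinary per-org permission check.
				if svc.HasPerOrgPermission(ctx, p.ID, orgID, permission) {
					next.ServeHTTP(w, r)
					return
				}
			case PrincipalPlatform:
				// Console staff (processor path): only admitted with an active
				// break-glass session at the named scope; bind the session so
				// downstream audit rows carry break_glass_id.
				if sess, ok := svc.ActiveBreakGlassSession(ctx, p.ID, orgID, scope); ok {
					next.ServeHTTP(w, r.WithContext(BindAuditSession(ctx, sess.ID)))
					return
				}
			}
			http.Error(w, "forbidden", http.StatusForbidden) // fail closed
		})
	}
}
```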

Why not ship one-off wrappers per feature

The four wraps would have been: invite-staff, role-change, member-remove, org-settings. Each ~30 lines of "open modal, call action, retry on 403" UI logic. Mechanically copyable, which is exactly the failure mode — once one feature ships with theatre instead of a real gate, the next feature copies the theatre, and break-glass becomes a sticker rather than a gate. The CLAUDE.md rule "no shortcuts in foundation" applies here: the primitive cost is paid once, the workaround cost is paid every time.

The question "is per-feature wrap good enough?" was the right one to ask, and the answer is no. Specifically:

  • Audit linkage requires the backend gate. A Console feature that opens the modal but doesn't call a route gated by RequirePerOrgPermissionOrBreakGlass produces audit rows with break_glass_id IS NULL. The backend middleware is the load-bearing piece, and the Console primitive's main job is to make sure features can't accidentally bypass it.
  • Tenant + platform on the same endpoint avoids route duplication. Without the smarter middleware, the alternatives were (a) duplicate the staff-invitations route as /v1/admin/staff-invitations for Console (cost: doubled routes; drift between them) or (b) gate Console writes at a separate Console-only API (cost: Console can't reuse the Clinic-app routes). Both are worse than the 50-line middleware.
  • The taxonomy holds across future surfaces. Org settings, billing, integrations, designations, webhooks — every Console-side write against tenant data eventually needs the same primitive. Building it once means each future surface mounts the middleware on its route + uses the hook in its component, rather than re-deriving the elevation flow from scratch.

Considered alternatives

Per-resource scopes (staff_manage, member_remove, role_assign, org_settings). Finer-grain audit, but the elevation modal pops every two clicks while a superadmin is doing related ops. The audit row already records the specific action; the scope only needs to be coarse enough to bound the session's blast radius. org_management is the right grain.

Handler-layer elevation enforcement (vs middleware). Would let the handler decide elevation requirement based on the operation. Same outcome, more places to forget. Middleware is the canonical foundation gate for cross-tenant enforcement (P47, RequireBreakGlass, RequirePermission); adding one more that composes with them is the consistent shape.

Read-gate org_management too (require elevation just to SEE the staff list). Reads stay always-on. Staff data isn't PHI, listing members is operationally useful for support, and the writes are where the controller-vs-processor boundary actually matters. Mirrors the patient surfaces — aggregate view always on, identifiable lists / writes elevation-gated.

Tradeoffs (kept honest)

  • The Console-side primitive is ~6 files (provider, hook, modal, banner, gate wrapper, action wrapper) plus layout wiring. That's more code than four per-feature wraps. The cost amortizes starting at the second consumer.
  • The smarter middleware adds one new path through authorization. Tested in per_org_permission_or_break_glass_test.go with three scenarios (tenant pass, platform 403, platform 200 + audit linked) and one fail-closed scenario (tenant without permission). The branch is structural (principal type) so the test surface stays small.
  • inviteStaffAction is the first end-to-end integration consumer: the dialog wraps its body in <RequireBreakGlass scope="org_management"> so the content switches between elevation prompt and invite form based on session state, and the action wraps its API call in withBreakGlass(...) so a session that expires between dialog-open and submit surfaces a typed needsElevation field in the result, flipping the body back to the elevation prompt. This validates the full UX flow: tenant principals would consume the same route from the Clinic app via per-org permission; Console superadmins consume it via elevation.

Scope kept tight

The first PR mounts the smarter middleware on four routes only: PATCH /{id}, POST /members, DELETE /members/{id}, POST /staff-invitations. Org settings, domains, designations, webhooks, and integrations are sibling targets; follow-up commits land them once the primitive is proven on these. Foundation discipline says fix the load-bearing piece (the primitive); no-speculation says don't pre-emptively gate every conceivable future write.

Revisit triggers

  • A Console surface needs reads-with-elevation (e.g. revealing PII inside the staff list, not just the writes). Add a finer scope at that point; today's broad org_management doesn't cover that case and shouldn't be widened to cover it.
  • The bind-on-first-auth flow's audit row for staff invites needs to inherit break_glass_id. Today the bind happens on the recipient's first request, not the elevating principal's session — so the audit row's break_glass_id is the session of the principal who issued the invite, captured at issue time and stored on the row. Fine for the current model; revisit if regulators ask "who confirmed the bind."
  • A future platform principal type (e.g. partner-tier support staff) holds org_management with a narrower row-level scope (specific orgs, not all). That's a per-tenant-allowlist column on platform_memberships, not a new scope.

Why a separate Go service for the composer?

The exercise-video composition pipeline (P56) lives in services/exercise-composer/ as a sibling to the Core API, not as an internal package inside services/api/. Three reasons.

ffmpeg is heavyweight and infrastructure-shaped. Each render is ~5-15 seconds of CPU-bound ffmpeg work (download, decode, re-encode, mux, concat, upload). Doing that inside the Core API process steals request workers, balloons the API image (~80 MB of ffmpeg + libavfilter dependencies), and entangles the API's release cadence with whatever ffmpeg version Bunny expects on the way in. Separating it preserves the API's character as a fast request/response service.

Independent scaling by queue depth. When 100 treatment plans get created in a burst, that's 100 prescription renders queued. The API doesn't need to scale — it just enqueues. The composer scales by queue depth, on its own task definition, with whatever vCPU/memory ffmpeg actually wants. Same pattern the telemetry service set in motion: media-shaped workloads live outside the API.

Stateless worker pattern matches the work. The composer has no per-request state, no DB ownership, no domain logic beyond "translate one recipe into one MP4." That makes it a natural Fargate worker — same shape as a Lambda but on long-running compute (no 15-minute cap). It also means the worker can be replaced (Go, Python, anything that wraps ffmpeg) without touching the API.
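
To show how little the worker owns, here is a sketch of the concat step, assuming ffmpeg's concat demuxer and primitives already downloaded into a work dir; the function shape and segment names are illustrative, only the general download/compose/upload flow comes from the description above:

```go
// Sketch of "translate one recipe into one MP4": write a concat list over the
// chosen primitives and run ffmpeg. Surrounding S3 download and Bunny upload
// steps are omitted.
package composer

import (
	"context"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// renderConcat muxes an ordered list of local segment files (intro, baked rep
// blocks, pauza clips, outro) into a single output MP4.
func renderConcat(ctx context.Context, workDir string, segments []string, out string) error {
	list := filepath.Join(workDir, "segments.txt")
	f, err := os.Create(list)
	if err != nil {
		return err
	}
	for _, s := range segments {
		fmt.Fprintf(f, "file '%s'\n", s)
	}
	f.Close()

	// -c copy keeps this a cheap remux when all segments share a codec profile;
	// a real render re-encodes when mixing the VO track in.
	cmd := exec.CommandContext(ctx, "ffmpeg",
		"-y", "-f", "concat", "-safe", "0",
		"-i", list,
		"-c", "copy",
		out,
	)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	return cmd.Run()
}
```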

Why not Lambda

ffmpeg layers are awkward (binary too large for the standard layer cap, need a custom container image). The 15-minute timeout is a hard cap that becomes the tail-latency ceiling. Cold starts add seconds to first renders. Fargate is already the platform's compute primitive — adding Lambda just to host one function isn't worth the runtime split.

Why not inside the API

The CLAUDE.md "no microservices" rule applies to domain services (no separate auth service, no separate billing service — those are packages inside the API). The composer isn't a domain service; it's an infrastructure-shaped media pipeline, equivalent to a transcoder or a thumbnail generator. The telemetry service set the precedent that media-heavy workloads with their own resource profile graduate to a sibling service.

Tradeoffs (kept honest)

  • Cross-service auth. The Core API calls the composer over HTTP. Today that's an unauthenticated VPC-internal call; production needs a shared bearer secret (or IAM-signed SigV4 if we want zero shared secrets). Not a big lift, but a real piece of work.
  • Two deployment pipelines instead of one. Composer needs its own ECR image, its own ECS service definition, its own deployment trigger. The platform's IaC layout already supports adding services; cost is real but bounded.
  • Cat A providers resolver lives only in the API service today. The composer accesses Bunny credentials via env vars in v1, with a tracked migration to the shared resolver before production launch (see "Why caching exercise renders by (exercise, prescription) only" below for the corresponding cache discussion).

Why Bunny Stream + AWS S3 split (not all-S3 or all-Bunny)?

Two different storage products in one pipeline: raw filming primitives live in our own AWS S3 bucket; rendered MP4s live in Bunny Stream. Why both.

Source-of-truth vs. delivery is a real distinction. The raw assets (intro/pauza/outro silent videos, 5-rep blocks per side, per-side coached VO mp3s) need our control: versioning, encryption, IAM, lifecycle, audit. They never serve patients directly — the patient never fetches intro-video-2.mp4; they fetch a composed MP4 that was built from intro-video-2.mp4 and a dozen other primitives. Raw primitives are inputs, not outputs.

Bunny Stream is for delivery, not for source-of-truth storage. What Bunny does well: take an MP4, transcode to HLS adaptive bitrates (240p/480p/720p/1080p), serve via a CDN with their player. What Bunny doesn't do: act as a primary store that we re-derive renders from. If a render needs to be re-baked (asset version changes, variant pool rotates, codec needs upgrade), the source assets must be where we can read them deterministically — not where they share lifecycle with patient-facing playback.

Different cost shapes, different access patterns.

  • S3 storage: $0.023/GB-month; rare reads (one per render); high durability; cheap to keep raw forever.
  • Bunny storage: $0.03/GB-month + CDN delivery costs; high reads (every patient session pulls the HLS playlist); transcoded into multiple bitrates (storage is ~3× the input). Bunny is priced for delivery, not archival.

Storing raw primitives in Bunny would mean paying transcoded-bitrate-ladder storage costs for content that never plays back. Storing rendered MP4s in S3 means paying CloudFront/CDN bills to deliver them, or running our own HLS transcoding pipeline (which is exactly what Bunny does for $0.005/GB).

Why not all-S3 + CloudFront for delivery too

Possible. But CloudFront alone doesn't transcode — we'd need MediaConvert or our own ffmpeg pipeline for the HLS ladder, then S3 lifecycle policies for each bitrate variant, then signed-URL handling, then the HLS playlist generator. That's a lot of plumbing to replicate Bunny's product. The price comparison ($0.005/GB Bunny CDN vs CloudFront's tiered pricing + MediaConvert + S3) does not favour DIY at our scale.

Why not all-Bunny (raw assets too)

Bunny Storage exists, but its access controls are coarser (no per-object IAM, no SSE-KMS, no cross-region replication primitives that match AWS's). The raw primitives are the highest-value content the platform owns — losing them or having them stolen would be worse than losing rendered MP4s (those re-bake). Keeping them in AWS with the platform's full security posture (IAM, KMS, audit, lifecycle) is the safer position.

Tradeoffs (kept honest)

  • Two providers to manage. AWS for S3 + ECS, Bunny for video delivery. Each has its own auth, its own dashboard, its own outage modes. Mitigated by Bunny being narrowly scoped to "rendered video delivery" — every other concern stays inside AWS.
  • One extra hop in the pipeline. Composer downloads from S3, uploads to Bunny. Network egress cost (AWS → Bunny via the public internet) is real but bounded: bundles are ~280 MB per render, which works out to roughly $25/month even when the entire library re-renders at a heavy refresh rate.
  • Bunny isn't Cat A yet. The Bunny credentials live in env vars in v1, with a tracked migration to the platform's curated providers resolver before production launch (see P56).

Why caching exercise renders by (exercise, prescription) only?

The composer's exercise_renders cache table (landing with F9.1) keys by (exercise_id, prescription_hash, language) → bunny_video_id. Not per-patient, not per-session, not per-day. Three reasons.

Per-patient caching explodes storage. A treatment plan with 8 exercises × 100 patients = 800 renders, each ~10-50 MB depending on dose. At 5,000 patients × 10 active prescriptions = 50,000 renders × 30 MB = 1.5 TB of patient-scoped Bunny storage, growing linearly with patient count. Per-(exercise, prescription) caching collapses that to ~10 GB total across the entire platform — independent of patient count.

Variants provide the variety we actually want. Each exercise has 3 variants per slot (intro, pauza, outro, rep-left video/vo, rep-right video/vo). Different prescriptions of the same exercise pick different variant combinations at random, so two treatment plans that prescribe the same exercise at the same dose still see different rendered MP4s. That gives the anti-repetitiveness benefit (across the patient's treatment plan, exercise X at dose Y looks visually distinct from exercise X at dose Y') without paying per-patient storage cost.

"Same patient sees same render across treatment plan days" is acceptable. A patient doing lumbar detensioning at 2×10 for 30 days sees the same composed video every time. That's a worse experience than fresh-variant-every-day, but the cost gap is enormous and the patient gets visual variety across the treatment plan's other exercises (which use different variant combinations). If repetition becomes a complaint, the upgrade path is a small pool of K=3-5 renders per prescription, rotated by session index — still bounded, not exploded.

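A minimal sketch of the lookup that key shape implies, assuming pgx and the exercise_renders table above; the struct, SQL, and function names are illustrative:

```go
// Sketch: cache lookup keyed by (exercise_id, prescription_hash, language).
// Only the key shape comes from the design; the rest is illustrative.
package renders

import (
	"context"
	"errors"

	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgxpool"
)

type Key struct {
	ExerciseID       string
	PrescriptionHash string // hash of dose + chosen variant combination
	Language         string
}

// Lookup returns the cached Bunny video ID, or ok=false when the composer
// needs to bake (or re-bake, after an asset_version bump) a render.
func Lookup(ctx context.Context, db *pgxpool.Pool, k Key) (bunnyVideoID string, ok bool, err error) {
	err = db.QueryRow(ctx,
		`SELECT bunny_video_id FROM exercise_renders
		 WHERE exercise_id = $1 AND prescription_hash = $2 AND language = $3`,
		k.ExerciseID, k.PrescriptionHash, k.Language,
	).Scan(&bunnyVideoID)
	if errors.Is(err, pgx.ErrNoRows) {
		return "", false, nil
	}
	if err != nil {
		return "", false, err
	}
	return bunnyVideoID, true, nil
}
```
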
Considered alternatives

Per-patient pool with hash(patient_id, session_date) as the seed. Gives every patient fresh combinations every session. Cost: full storage explosion (5,000 patients × 10 prescriptions × N sessions). Rejected.

Per-session-date seed (hash(prescription_id, session_index) mod K). Patient sees one of K cached renders for their prescription, rotated. Provides freshness across sessions without per-patient storage. Reasonable upgrade if needed later — but not the v1 default. The point of v1 is to keep the cache lean.
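
If that upgrade ever lands, the rotation itself is a few lines. A hypothetical sketch, not built today:

```go
// Hypothetical K-pool rotation: pick one of K cached renders per prescription
// by hashing (prescription_id, session_index). Names are illustrative.
package renders

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// poolSlot returns which of the K cached renders a given session plays.
func poolSlot(prescriptionID string, sessionIndex, k int) int {
	h := sha256.Sum256([]byte(fmt.Sprintf("%s:%d", prescriptionID, sessionIndex)))
	return int(binary.BigEndian.Uint64(h[:8]) % uint64(k))
}
```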

No cache (re-render every patient session). Simplest, but each render is 5-15 seconds of composer CPU. At 5,000 patients × daily sessions = enormous compute waste rendering identical inputs. Rejected.

Tradeoffs (kept honest)

  • Variant pool is finite. With 3 variants per slot × ~9 slots, the combinatorial space is ~20k unique renders per exercise. We bake one per (exercise, prescription, language) — most of the space is unused. That's fine; the goal isn't to exhaust variant combinations, it's to make different prescriptions look different.
  • Asset version bumps invalidate the cache. When the filming team uploads a new variant, exercises.asset_version bumps; existing renders become stale and re-bake on next request. Bounded refresh cost (~$0.025/min × ~3 min average × N prescriptions).
  • Cache key includes language. Adding a 2nd language doubles the cache size for that exercise (one render per language per prescription). Acceptable — language is a small dimension and adding one is intentional.

Why fresh-per-set counting (not cumulative)?

Each set in a prescription hears the count "1, 2, …, N" fresh, regardless of where in the session it falls. Set 1 of a 3×5 prescription hears "1-5"; set 2 hears "1-5" again; set 3 hears "1-5" again. Cumulative counting ("1-5" then "6-10" then "11-15") was the obvious-but-wrong first instinct.

The compositional reason. Each set is baked once and re-used. A "3 sets × 5 reps" prescription bakes one 5-rep-set MP4 and concats it in three times. The baked set contains the first 5 counts of the side's 20-count VO master. Counts reset between sets by construction — there is no other way for the same baked clip to slot into multiple positions.

The clinical reason. Physio convention is per-set counting. The patient hears "one, two, three, four, five" for each set and intuits "five reps complete, take a breath, here we go again." Cumulative counting across sets ("…fifteen!") implies a single 15-rep continuous effort, which is the wrong mental model for a 3×5 set/pause/set pattern.

The production reason. Cumulative counting would require multiple audio masters per exercise per language — one for 1×20 reps, one for 2×10, one for 3×5, one for 4×5, etc., because the count word that lands on each rep differs. Fresh-per-set means the audio team records one 20-count VO master per side per language that serves every multi-set prescription up to 20 total reps. 4× recording reduction, before language multiplier.
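
A small sketch of why one master is enough, assuming the manifest exposes per-count cue offsets; every name here is illustrative:

```go
// Sketch: one 20-count VO master serves every prescription because each baked
// set slices the first repsPerSet cues, and the session repeats that one set.
package compose

import "fmt"

// VOMaster would come from the audio manifest: start offsets of each spoken
// count ("one" ... "twenty") inside the side's master, per language.
type VOMaster struct {
	CountCueOffsets []float64 // len == counts_per_audio_master, e.g. 20
}

// SetCounts returns the cue offsets a single baked set uses. Every set starts
// at cue 0, so the same baked clip slots into set 1, set 2, and set 3.
func SetCounts(m VOMaster, repsPerSet int) ([]float64, error) {
	// Hard constraint of the recorded masters, not a UX choice.
	switch repsPerSet {
	case 5, 10, 15, 20:
	default:
		return nil, fmt.Errorf("repsPerSet %d outside the 5-rep / 20-count window", repsPerSet)
	}
	if repsPerSet > len(m.CountCueOffsets) {
		return nil, fmt.Errorf("master has only %d counts", len(m.CountCueOffsets))
	}
	return m.CountCueOffsets[:repsPerSet], nil
}

// SessionPlan is the concat order for an S-set prescription: the same baked
// set each time, with pauza clips between, so counts reset by construction.
func SessionPlan(sets int) []string {
	plan := []string{"intro"}
	for i := 0; i < sets; i++ {
		plan = append(plan, "baked-set")
		if i < sets-1 {
			plan = append(plan, "pauza")
		}
	}
	return append(plan, "outro")
}
```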

Considered alternatives

Cumulative counting across the entire session. Patient hears "1, 2, 3, …, N_total" continuously. Wrong clinical model, requires multiple audio masters per exercise. Rejected.

Counting that resets within a set but builds across the session in a separate "set N of M" announcement. ("Set 1 of 3 — one, two, three, four, five. Take a moment. Set 2 of 3 — one, two…") Reasonable; would need an additional "set N of M" audio track per language. Not built today; could be added without changing the composer architecture — just splice a "set 2 of 3" cue between the baked set and the inter-set pauza. Deferred until clinical feedback demands it.

No counting (just form cues and breathing). Simplest, but loses the rep-pacing benefit. Counting is doing pacing work — telling the patient how fast to rep — that breath cues alone can't replicate.

Tradeoffs (kept honest)

  • Repetitive within a long session. A patient doing 4 sets of 5 reps hears "1-5" four times. Variant rotation on the VO (3 variants of the side's VO master per language) provides some variety, but it's still the same count sequence. Worth-it trade vs. per-set audio masters.
  • The composer needs to know reps fit the 5-rep / 20-count window. Validation rejects prescriptions outside {5, 10, 15, 20}. That's a hard constraint of the audio masters, not a UX limitation — physio prescriptions naturally land in those multiples.
  • Adding 25-rep sets needs a longer audio master. Today's 20-count cap is the right call (longer sets are rare in physio); if needed, the audio team records a 30-count or 40-count master per side per language and counts_per_audio_master bumps in the manifest.