Error Envelope
Every error response from the Core API uses the same JSON envelope. Documented once, here, so handler authors copy a known shape and frontends/SDK generators have a stable contract.
The envelope

    {
      "error": {
        "code": "<machine_readable_snake_case>",
        "message": "<short human sentence>"
      }
    }

The shape is the same for every status code in the 4xx and 5xx range. Successful responses use { "data": ... } (see API contract conventions; TODO 1.7).
| Field | Type | Always present? | What it carries |
|---|---|---|---|
| error.code | string | yes | Stable identifier, snake_case. Frontends switch on this. Examples: unauthorized, forbidden, org_not_found, slug_taken, validation_error, internal_error. |
| error.message | string | yes | Short, human-readable, safe to render. Never carries database details, panic text, stack traces, or constraint names. |
| error.fields | object<string,string> | only on 422 | Per-field validation reasons. Field names are the request's input names, not column names. |
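As a rough illustration of the table above, the envelope can be modelled with two small Go structs. This is a minimal sketch: the names errorBody, errorEnvelope, and renderEnvelope, and the sample message text, are assumptions made for this example and are not taken from the httputil package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// errorBody mirrors the fields in the table above. The type and function
// names in this sketch are illustrative, not the ones used in httputil.
type errorBody struct {
	Code    string            `json:"code"`
	Message string            `json:"message"`
	Fields  map[string]string `json:"fields,omitempty"` // omitted unless populated (422)
}

// errorEnvelope is the top-level {"error": {...}} wrapper.
type errorEnvelope struct {
	Error errorBody `json:"error"`
}

// renderEnvelope serializes one error into the envelope shape.
func renderEnvelope(code, message string, fields map[string]string) (string, error) {
	b, err := json.Marshal(errorEnvelope{
		Error: errorBody{Code: code, Message: message, Fields: fields},
	})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := renderEnvelope("org_not_found", "organization not found", nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The omitempty tag is what keeps fields out of every non-422 response in this sketch: a nil or empty map is dropped during marshaling.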
Status codes
| Status | When | Example code |
|---|---|---|
| 400 | Request was malformed (bad JSON, missing path param) | invalid_body, invalid_id |
| 401 | No authentication / expired token / unknown caller | unauthorized |
| 402 | Org's tier blocks this request — frontend renders an "Upgrade" CTA | tier_entitlement_unavailable, limit_exceeded |
| 403 | Authenticated but not allowed (RBAC, blocked user, wrong tenant, regulated org-entitlement disabled) | forbidden, superadmin_required, org_entitlement_disabled |
| 404 | Resource not visible to caller (real not-found OR RLS-hidden — see note) | <resource>_not_found |
| 409 | Conflict with current state (unique violation, immutability) | slug_taken, form_already_signed |
| 422 | Request is well-formed JSON but failed semantic validation | validation_error (with fields) |
| 429 | Caller is rate-limited (per-IP today; per-principal supported) | rate_limited |
| 500 | Unknown error path (panic, unwrapped DB error, missing config) | internal_error |
| 502 | Upstream third party returned a problem we surface as such | clerk_unavailable, daily_failed |
402 vs 403: the architecture splits "your tier won't allow this — pay more" (402, paywall) from "you can't do this regardless of tier" (403, RBAC + regulated). The frontend's "Upgrade" CTA only fires for 402; 403 means the user needs role grants or platform review, not money.
404 vs 403: when RLS hides a row, we return 404 — telling the caller "you don't have permission to read this org's stuff" by way of a 403 leaks the existence of the row. Stick to 404 unless the caller's permission gap is a UX surface they need to act on (e.g. clinic admin sees "you need data.view_deleted" — there 403 is right because they can request the permission).
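The 404-vs-403 rule can be sketched as a small decision function. Everything here is hypothetical (the lookupOutcome type, its fields, and statusFor are invented for illustration); the real handlers may structure this decision differently.

```go
package main

import (
	"fmt"
	"net/http"
)

// lookupOutcome is a hypothetical summary of what a handler learned about
// the requested row.
type lookupOutcome struct {
	// Visible is true when the row exists and RLS let the caller read it.
	Visible bool
	// RequestablePermission is non-empty only when the gap is a permission
	// the caller can act on (for example "data.view_deleted").
	RequestablePermission string
}

// statusFor applies the rule: hidden rows are 404 unless the permission gap
// is something the caller is meant to see and request.
func statusFor(o lookupOutcome, resource string) (status int, code string) {
	switch {
	case o.Visible:
		return http.StatusOK, ""
	case o.RequestablePermission != "":
		return http.StatusForbidden, "forbidden"
	default:
		return http.StatusNotFound, resource + "_not_found"
	}
}

func main() {
	status, code := statusFor(lookupOutcome{}, "org")
	fmt.Println(status, code)

	status, code = statusFor(lookupOutcome{RequestablePermission: "data.view_deleted"}, "patient")
	fmt.Println(status, code)
}
```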
Validation errors (422)
Build with httputil.NewValidationError(code, message, fields):

    return httputil.NewValidationError(
        "validation_error",
        "request failed validation",
        map[string]string{
            "email":           "must be a valid email",
            "organization_id": "required",
        },
    )

Renders to:

    {
      "error": {
        "code": "validation_error",
        "message": "request failed validation",
        "fields": {
          "email": "must be a valid email",
          "organization_id": "required"
        }
      }
    }

Rules for fields:

- Keys are request input names (email, organization_id, tagline), not table columns.
- Values are short, human-readable reasons ("required", "must be a valid email", "must be 1-200 characters"). No database driver text, no constraint names, no SQL state codes.
- Do not include data the user submitted in the value (avoid echoing "got 'admin@@example.com'"); this keeps logs and UI consistent across input variants.
Tier-gate errors (402 / 403)
The four-gate model — RequirePermission, RequireTierEntitlement, RequireOrgEntitlement, EnforceLimit — produces typed errors that carry per-gate context fields. Frontends key off error.code (the discriminator), then read the typed fields they know belong to that class.
tier_entitlement_unavailable (402)
The org's active subscriptions don't include the entitlement. Built by httputil.NewTierEntitlementUnavailableError(missingEntitlement, currentTierCode, upgradeURL).

    {
      "error": {
        "code": "tier_entitlement_unavailable",
        "message": "this entitlement is not included in the organization's current tier",
        "missing_entitlement": "telerehab",
        "current_tier_code": "free",
        "upgrade_url": "https://billing.example.com/upgrade?tier=pro"
      }
    }

current_tier_code and upgrade_url are nullable: the resolver doesn't always know them. Frontends handle missing fields gracefully.
org_entitlement_disabled (403)
The regulated organization_entitlements flag is FALSE for this org. No tier upgrade alone unlocks a regulated entitlement; superadmin review or tier-engine projection is the path. Built by httputil.NewOrgEntitlementDisabledError(missingEntitlement).

    {
      "error": {
        "code": "org_entitlement_disabled",
        "message": "this entitlement is disabled for the organization",
        "missing_entitlement": "telerehab"
      }
    }

Frontend treats this differently from tier_entitlement_unavailable: the CTA is "Contact support" or "Submit for review," not "Upgrade."
limit_exceeded (402)
The request would breach a hard-block usage cap. Built by httputil.NewLimitExceededError(limitCode, current, cap, upgradeURL). Note: superadmins do not bypass usage limits.

    {
      "error": {
        "code": "limit_exceeded",
        "message": "request would exceed the organization's tier limit",
        "limit_code": "max_patients",
        "current": 50,
        "cap": 50,
        "upgrade_url": "https://billing.example.com/upgrade?tier=pro"
      }
    }

current and cap are integers in the limit's native unit (count, bytes, seconds, minutes; see limit_definitions.unit). Frontends format them appropriately.
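On the consuming side, a client can decode all three gate errors into one struct and switch on code. The sketch below is an assumption about how a Go client might do this (gateError and ctaFor are invented names); pointer fields model values that may be null or absent.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gateError holds the union of the documented gate context fields.
// Pointer fields stay nil when the JSON value is null or missing.
type gateError struct {
	Code               string  `json:"code"`
	Message            string  `json:"message"`
	MissingEntitlement string  `json:"missing_entitlement"`
	CurrentTierCode    *string `json:"current_tier_code"`
	UpgradeURL         *string `json:"upgrade_url"`
	LimitCode          string  `json:"limit_code"`
	Current            *int64  `json:"current"`
	Cap                *int64  `json:"cap"`
}

// ctaFor picks a call-to-action label from the discriminator, then reads
// only the fields that belong to that error class.
func ctaFor(body []byte) (string, error) {
	var env struct {
		Error gateError `json:"error"`
	}
	if err := json.Unmarshal(body, &env); err != nil {
		return "", err
	}
	e := env.Error
	switch e.Code {
	case "tier_entitlement_unavailable", "limit_exceeded":
		if e.UpgradeURL != nil {
			return "Upgrade: " + *e.UpgradeURL, nil
		}
		return "Upgrade", nil // upgrade_url unknown: render the CTA without a link
	case "org_entitlement_disabled":
		return "Contact support", nil
	default:
		return "", nil // not a gate error
	}
}

func main() {
	body := []byte(`{"error":{"code":"limit_exceeded","message":"request would exceed the organization's tier limit","limit_code":"max_patients","current":50,"cap":50,"upgrade_url":null}}`)
	cta, err := ctaFor(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(cta)
}
```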
Rate-limited errors (429)
The caller has exceeded a rate-limit policy. Built by httputil.NewRateLimitedError(policyCode, retryAfterSeconds); the middleware that wraps it (internal/core/ratelimit) sets the canonical headers.

    {
      "error": {
        "code": "rate_limited",
        "message": "too many requests",
        "policy": "public_resolve",
        "retry_after": 27
      }
    }

Companion response headers (set on every response from a rate-limited route, allowed or denied; they help legitimate clients self-throttle before they hit 429):
| Header | Meaning |
|---|---|
| X-RateLimit-Limit | Policy ceiling (requests per window). |
| X-RateLimit-Remaining | Requests left in the current window. Zero on a 429. |
| X-RateLimit-Reset | Unix-seconds at which the current window rolls over. |
| Retry-After | Seconds the caller should wait before retrying. Set on 429 only; clamped to ≥ 1. |
policy is one of the configured policy codes. Today: public_resolve (slug enumeration cap on /v1/public/organizations/resolve) or auth_verify (entry to /v1, capping JWT verification). Frontend can use the discriminator to render a per-surface message ("the org-lookup endpoint is rate-limited" vs "you've made too many API calls").
429s are audit-logged with action RATE_LIMITED, attributed to the singleton system principal on unauthenticated paths and to the authenticated principal otherwise. See implementation-plan.md § 1.16.
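A client can combine the headers and the body field to decide how long to back off. The helpers below are a sketch under assumptions: waitSeconds and exhausted are invented names, and the code assumes Retry-After is always sent as an integer number of seconds, as described above (HTTP also allows a date form, which this sketch does not parse).

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
)

// waitSeconds returns how many seconds to wait after a 429. It prefers the
// Retry-After header and falls back to the body's retry_after value. The
// result is never below 1, mirroring the documented clamp.
func waitSeconds(h http.Header, bodyRetryAfter int) int {
	secs, err := strconv.Atoi(h.Get("Retry-After"))
	if err != nil {
		secs = bodyRetryAfter // header missing or not an integer
	}
	if secs < 1 {
		secs = 1
	}
	return secs
}

// exhausted reports whether X-RateLimit-Remaining says the window is used
// up, so a client can pause before it ever receives a 429.
func exhausted(h http.Header) bool {
	n, err := strconv.Atoi(h.Get("X-RateLimit-Remaining"))
	return err == nil && n <= 0
}

func main() {
	h := http.Header{}
	h.Set("Retry-After", "27")
	h.Set("X-RateLimit-Remaining", "0")
	fmt.Println(waitSeconds(h, 0), exhausted(h))
}
```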
Internal errors (500)
Anything that isn't an *AppError (wrapped DB failures, panics, missing config, surprise nil pointers) is mapped to a single shape:

    { "error": { "code": "internal_error", "message": "An unexpected error occurred" } }

The recovery middleware catches panics and returns this same envelope, never the panic value. Stack traces and the underlying error text are written to the structured log only.
Correlation: every response (success or error) carries X-Request-ID set by requestctx.RequestID middleware. The same value is on the structured log line. Customers reporting an issue should be asked for that header value; it pinpoints the request in logs.
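The recovery behaviour described above can be approximated with a standard net/http middleware. This is a simplified sketch, not the code in recovery.go: recoverer, panicking, and probe are illustrative names, the log call stands in for the structured logger, and it assumes the request-ID middleware has already set X-Request-ID on the response. It also does not handle a handler that panics after it has started writing the response.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"runtime/debug"
)

const internalErrorBody = `{"error":{"code":"internal_error","message":"An unexpected error occurred"}}`

// recoverer converts a panic in next into the generic 500 envelope.
// The panic value and stack go to the log only, never to the client.
func recoverer(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if v := recover(); v != nil {
				log.Printf("panic recovered request_id=%q value=%v\n%s",
					w.Header().Get("X-Request-ID"), v, debug.Stack())
				w.Header().Set("Content-Type", "application/json")
				w.WriteHeader(http.StatusInternalServerError)
				fmt.Fprint(w, internalErrorBody)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// panicking is a handler that always panics, used to exercise recoverer.
var panicking = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	panic("sensitive detail that must not reach the client")
})

// probe sends one GET through h and returns the status and body.
func probe(h http.Handler) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	status, body := probe(recoverer(panicking))
	fmt.Println(status, body)
}
```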
Don'ts
- Don't write httputil.Error(w, status, code, err.Error()). The whole point of the envelope is to keep err.Error() out of responses.
- Don't add ad-hoc fields to the envelope. code and message are always present; fields is reserved for 422 validation; gate-typed errors carry their documented context (missing_entitlement, current_tier_code, limit_code, current, cap, upgrade_url); rate-limited errors carry policy and retry_after. Anything else is unstructured drift; make a new typed *AppError instead.
- Don't reuse validation_error for non-field errors (e.g. "request body is empty"). Use invalid_body for parsing failures and reserve validation_error for 422 + populated fields.
- Don't include 5xx text from third parties. If Clerk returns "DB-host-A.cluster.amazonaws.com unreachable", surface {"code": "clerk_unavailable", "message": "authentication is temporarily unavailable"}. Log the upstream detail; don't render it.
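The last rule can be sketched as a mapping function that logs the upstream detail and returns only fixed, safe text. The safeError type, fromClerkFailure, and the injected logf callback are hypothetical and exist only for this example; in the real code the result would be a typed *AppError.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// safeError is a hypothetical stand-in for a typed *AppError.
type safeError struct {
	Status  int
	Code    string
	Message string
}

// errUpstream simulates the raw failure text a third party might return.
var errUpstream = errors.New("DB-host-A.cluster.amazonaws.com unreachable")

// fromClerkFailure logs the upstream detail and returns a response-safe
// error whose message is a constant, independent of the upstream text.
func fromClerkFailure(upstream error, logf func(detail string)) safeError {
	logf("clerk upstream failure: " + upstream.Error())
	return safeError{
		Status:  http.StatusBadGateway,
		Code:    "clerk_unavailable",
		Message: "authentication is temporarily unavailable",
	}
}

func main() {
	e := fromClerkFailure(errUpstream, func(detail string) {
		fmt.Println("LOG:", detail) // stands in for the structured logger
	})
	fmt.Printf("%d %s %q\n", e.Status, e.Code, e.Message)
}
```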
Code locations
- Envelope renderer: services/api/internal/shared/httputil/response.go (Error + internal errorWithFields).
- Typed errors and constructors: services/api/internal/shared/httputil/errors.go (AppError, NewNotFoundError, …, NewValidationError, HandleError).
- Recovery middleware: services/api/internal/core/middleware/recovery.go.
- Tests: services/api/internal/shared/httputil/errors_test.go, services/api/internal/core/middleware/recovery_test.go.