
Internal Event Bus

How features signal "something happened" without coupling to consumers. Implements P28.

What it is

internal/core/events is an in-process pub/sub spine. Every feature that does something noteworthy publishes an Event when its work has committed. Other code subscribes to event types it cares about. The bus does not couple publishers to consumers — a feature shipping today emits events into the void, and a consumer added in Layer 8 (automations, webhooks) starts receiving them without any change to the publishing handler.

It is: in-process, fan-out, fire-and-forget for the publisher, bounded per subscriber.

It is not: durable across restart, cross-process, or a retry mechanism. Consumers that need durability (webhooks → webhook_events table) own their own DB-backed queue downstream of the bus.

Conventions (READ BEFORE PUBLISHING OR SUBSCRIBING)

Publish from day one — no retroactive instrumentation

Every feature publishes its own events at creation time, even before consumers exist. If a feature does not publish at creation, it cannot be silently picked up later — every callsite would have to be revisited. This is why the bus ships at Layer 1.9: features built in Layers 2–7 emit events into a working bus, and Layer 8 wires consumers without retrofitting upstream.

Publish AFTER the DB transaction commits

```go
// inside a service method
if err := repo.CreateAppointment(ctx, ...); err != nil {
    return err
}
// transaction is committed here, by the time CreateAppointment returns

events.Publish(ctx, events.Event{
    Type:         "appointment.booked",
    ID:           uuid.New(),
    OrgID:        appt.OrgID,
    OccurredAt:   time.Now().UTC(),
    ResourceType: "appointment",
    ResourceID:   appt.ID.String(),
    Data:         map[string]any{"patient_id": appt.PatientID, "specialist_id": appt.SpecialistID},
})
```

Never publish inside a transaction that might roll back. A consumer acting on a state the DB does not hold is the failure mode this rule prevents. (When durable cross-process semantics matter — webhooks — Layer 8 introduces an outbox table inside the transaction; the bus stays after-commit.)
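One way to make the after-commit rule hard to violate is to queue events while the transaction is open and flush them only once the commit has succeeded. The sketch below is illustrative, not the bus's actual API: `Event` is reduced to two fields, and `pendingEvents`, `runInTx`, and `errCommitFailed` are hypothetical names introduced for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a pared-down stand-in for events.Event (hypothetical shape).
type Event struct {
	Type       string
	ResourceID string
}

// errCommitFailed simulates a failed commit in the example below.
var errCommitFailed = errors.New("commit failed")

// pendingEvents buffers events raised while a transaction is open.
type pendingEvents struct {
	buf []Event
}

// Add queues an event; nothing is published yet.
func (p *pendingEvents) Add(e Event) { p.buf = append(p.buf, e) }

// runInTx runs work, then commit. Queued events reach publish only when
// both succeed; on any error they are discarded, so no consumer sees
// state the database does not hold.
func runInTx(work func(p *pendingEvents) error, commit func() error, publish func(Event)) error {
	p := &pendingEvents{}
	if err := work(p); err != nil {
		return err // work failed: queued events are dropped
	}
	if err := commit(); err != nil {
		return err // commit failed: queued events are dropped
	}
	for _, e := range p.buf {
		publish(e)
	}
	return nil
}

func main() {
	publish := func(e Event) { fmt.Println("published", e.Type, e.ResourceID) }
	book := func(p *pendingEvents) error {
		p.Add(Event{Type: "appointment.booked", ResourceID: "appt-1"})
		return nil
	}

	// Commit succeeds: the event is published.
	_ = runInTx(book, func() error { return nil }, publish)

	// Commit fails: nothing is published and the error is returned.
	err := runInTx(book, func() error { return errCommitFailed }, publish)
	fmt.Println("second run error:", err)
}
```

The helper takes `commit` and `publish` as functions so the ordering can be tested without a database. In a real service they would wrap the transaction's commit call and the bus's publish call.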

Data carries IDs, not PHI

Event payloads should be small: IDs, status changes, the bare minimum a consumer needs to fetch the rest from the API. Webhooks deliver Data verbatim to clinics' external systems — anything in Data is visible outside the platform.
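A cheap guard against accidental PHI in payloads is a key-name check run in tests or at publish time. The sketch below is a heuristic only: the denylist is illustrative and incomplete, the names `phiLikeKeys` and `suspectKeys` are hypothetical, and a key-name check cannot prove that a value is free of PHI.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// phiLikeKeys is an illustrative, incomplete denylist (assumption).
var phiLikeKeys = map[string]bool{
	"name": true, "first_name": true, "last_name": true,
	"email": true, "phone": true, "date_of_birth": true,
	"diagnosis": true, "notes": true,
}

// suspectKeys returns the payload keys that look like PHI, sorted.
// Matching is case-insensitive; the original key spelling is returned.
func suspectKeys(data map[string]any) []string {
	var bad []string
	for k := range data {
		if phiLikeKeys[strings.ToLower(k)] {
			bad = append(bad, k)
		}
	}
	sort.Strings(bad)
	return bad
}

func main() {
	ok := map[string]any{"patient_id": "p-123", "specialist_id": "s-456"}
	leaky := map[string]any{"patient_id": "p-123", "email": "a@example.com", "notes": "..."}

	fmt.Println(suspectKeys(ok))    // prints []
	fmt.Println(suspectKeys(leaky)) // prints [email notes]
}
```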

Type strings are stable identifiers

Renaming a Type is a breaking change. It flows into the audit log, automation rules, and webhook subscriptions. Add new types when the feature ships by registering them in the domain's events.go (the catalog below is generated from the registry); don't rename existing ones without migrating subscribers.

CI enforces registry parity (P51)

make check runs services/api/cmd/check-events-registry, which validates two invariants:

  1. Every events.Type constant declared in services/api/ has a matching events.Register(events.EventDef{...}) call in the same domain's events.go init().
  2. The committed _generated/events-catalog.md is in sync with the registry — re-run make events-docs after registering, renaming, or deprecating an event.

Direction is one-way (registered → catalog → docs): registered events without a publisher are EXPECTED while a domain wires up publishing; an events.Type constant with no Register call is DRIFT and fails the build. The fix is always the same: add the registry entry in the domain's events.go.
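The first invariant reduces to a set difference: any declared type that is missing from the registry is drift. The sketch below shows only that comparison. It assumes the declared and registered type names have already been collected into slices; how check-events-registry gathers them is not shown here, and `findDrift` is a hypothetical name.

```go
package main

import (
	"fmt"
	"sort"
)

// findDrift returns the declared event types that have no registry
// entry, sorted. Registered types with no declared counterpart are
// deliberately not reported: the check is one-way.
func findDrift(declared, registered []string) []string {
	reg := make(map[string]bool, len(registered))
	for _, t := range registered {
		reg[t] = true
	}
	var drift []string
	for _, t := range declared {
		if !reg[t] {
			drift = append(drift, t)
		}
	}
	sort.Strings(drift)
	return drift
}

func main() {
	declared := []string{"organization.created", "patient.onboarded", "appointment.booked"}
	registered := []string{"organization.created", "patient.onboarded", "organization.updated"}

	// organization.updated is registered but not in the declared list: not drift.
	// appointment.booked is declared but not registered: drift.
	fmt.Println(findDrift(declared, registered)) // prints [appointment.booked]
}
```

A CI check built on this would exit non-zero whenever the returned slice is non-empty.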

Event types (catalog)

The catalog grows feature-by-feature. Each events.go init registers its events with a Layer label that drives the section header below. The table is auto-generated from the in-process registry by make events-docs — do not edit by hand.

Foundation

| Type | Resource | Description |
| --- | --- | --- |
| organization.created | organization | Superadmin creates a new tenant |
| organization.member_added | organization_membership | Principal joins an org for the first time |
| organization.member_removed | organization_membership | Principal removed from an org |
| organization.member_role_changed | organization_membership | Existing membership's role changes |
| organization.updated | organization | Org metadata fields change (name, slug, branding, etc.) |
| organization_domain.added | organization_domain | Custom domain added (pre-verification) |
| organization_domain.removed | organization_domain | Custom domain removed |
| organization_domain.verified | organization_domain | Domain verification status transition |
| patient.onboarded | patient | Portal onboarding creates a patients row at an org |

Schedule events

schedule.daily / schedule.weekly / schedule.date_reached are reserved for the cron driver that lands in Layer 8. The events.Schedule registration interface ships at Layer 1.9 (stub — does not fire) so features can call RegisterSchedule from day one without retroactive instrumentation. They appear in the registry once the first feature registers a schedule.

Consumers

| Consumer | Subscribes to | Owned by | Status |
| --- | --- | --- | --- |
| Automation engine | configurable per rule | Layer 8 | Not yet implemented |
| Webhook dispatcher | configurable per subscription | Layer 8 | Not yet implemented |

Layer 8 will register both at startup. Until then, Publish has no subscribers to deliver to, so every call is effectively a no-op and the dropped-count counter stays at 0.

Backpressure

Each subscriber has a buffered channel sized at defaultBuffer = 256. If a subscriber drains more slowly than events arrive and its buffer fills, the overflow events are dropped for that subscriber, counted (events.DroppedCount()), and a warning is logged.

This is the correct default for a bus that lives on the request path: a slow webhook delivery cannot be allowed to back-pressure an API response. When Layer 8 wires real consumers, the webhook dispatcher subscribes once and immediately writes each event to the durable webhook_events queue, so the bus sees a subscriber that drains with near-zero latency. A non-zero drop counter is therefore the alert signal that something is misconfigured.
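The drop-on-overflow behavior can be sketched with a buffered channel and a non-blocking send. This is a minimal illustration under stated assumptions, not the bus's implementation: `subscriber`, `offer`, and `publish` are hypothetical names, the counter here is per subscriber, and the warning log is omitted.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Event is a pared-down stand-in for events.Event (hypothetical shape).
type Event struct{ Type string }

// subscriber owns a bounded queue; the publisher never blocks on it.
type subscriber struct {
	ch      chan Event
	dropped atomic.Int64
}

func newSubscriber(buffer int) *subscriber {
	return &subscriber{ch: make(chan Event, buffer)}
}

// offer tries to enqueue without blocking. If the buffer is full the
// event is dropped for this subscriber only, and the drop is counted.
func (s *subscriber) offer(e Event) bool {
	select {
	case s.ch <- e:
		return true
	default:
		s.dropped.Add(1)
		return false
	}
}

// publish fans the event out to every subscriber independently, so one
// full queue does not affect delivery to the others.
func publish(subs []*subscriber, e Event) {
	for _, s := range subs {
		s.offer(e)
	}
}

func main() {
	roomy := newSubscriber(256) // buffer large enough for this run
	tight := newSubscriber(2)   // tiny buffer that is never drained
	subs := []*subscriber{roomy, tight}

	for i := 0; i < 5; i++ {
		publish(subs, Event{Type: "appointment.booked"})
	}

	fmt.Println(len(roomy.ch), roomy.dropped.Load()) // prints 5 0
	fmt.Println(len(tight.ch), tight.dropped.Load()) // prints 2 3
}
```

The `select` with a `default` branch is what keeps the publisher fire-and-forget: a send that would block falls through to the drop path instead.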

Time-based events (stub at Layer 1.9)

```go
events.RegisterSchedule(events.Schedule{
    Kind:        events.ScheduleDaily,
    EventType:   "schedule.daily",
    Description: "daily appointment reminder sweep",
})
```

At Layer 1.9 this records the registration and logs it. The schedule does not fire. Layer 8 introduces a cron driver behind the same RegisterSchedule interface — features can call it from day one without retroactive instrumentation.
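A stub with this behavior can be as small as a mutex-guarded slice plus a log line. The sketch below is an assumption about shape, not the package's source: the `Schedule` fields mirror the call above, while the string value of `ScheduleDaily`, the package-level storage, and `registeredSchedules` are hypothetical.

```go
package main

import (
	"fmt"
	"log"
	"sync"
)

// ScheduleKind names a recurrence; the string value is an assumption.
type ScheduleKind string

const ScheduleDaily ScheduleKind = "daily"

// Schedule mirrors the fields used in the registration call above.
type Schedule struct {
	Kind        ScheduleKind
	EventType   string
	Description string
}

var (
	mu        sync.Mutex
	schedules []Schedule
)

// RegisterSchedule records the schedule and logs it. Nothing fires:
// no timer or goroutine is started by the stub.
func RegisterSchedule(s Schedule) {
	mu.Lock()
	defer mu.Unlock()
	schedules = append(schedules, s)
	log.Printf("schedule registered (stub, will not fire): kind=%s event=%s", s.Kind, s.EventType)
}

// registeredSchedules returns a copy of what has been recorded, which
// is what a later driver could iterate over.
func registeredSchedules() []Schedule {
	mu.Lock()
	defer mu.Unlock()
	return append([]Schedule(nil), schedules...)
}

func main() {
	RegisterSchedule(Schedule{
		Kind:        ScheduleDaily,
		EventType:   "schedule.daily",
		Description: "daily appointment reminder sweep",
	})
	fmt.Println(len(registeredSchedules())) // prints 1
}
```

Because callers only depend on `RegisterSchedule`, a driver that actually fires can later be added behind it without touching the call sites.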

Testing

  • make test covers publish/subscribe, fan-out, cancel semantics, queue overflow drops, panic recovery, shutdown drain, and Init replacing a prior bus. All race-detector clean.
  • Tests do not need a DB or any external dependency — the bus is hermetic.