AI Models Registry
Foundation 1C.8 deliverable. Engineer-facing reference for the AI model registry — ai_models + ai_model_pricing_history — and the wiring contract every AI provider impl follows.
The registry, the metering reservation flow, the audit + provenance same-tx write, and the cost snapshot together implement P53 AI Capability Provenance and Variable-Cost Metering. Read that pattern first; this doc is the operational how-to.
When to add a row
The AI feature consumer that wires a real provider impl (Anthropic, OpenAI, Voyage, Deepgram, Google Vision, …) seeds the corresponding ai_models row + initial ai_model_pricing_history row in the same PR. Every UUID returned by aimodels.Repository.LookupActivePricing must point at a registered model — the cmd/check-ai-models CI guard enforces the import-graph invariant; the live audit_ai_provenance.model_id FK enforces the runtime invariant.
Foundation seeds zero rows. The five AI capability skeleton packages (internal/core/ai/{llm,embeddings,transcription,vision,classification}) ship interface declarations + Fake test doubles only — no provider impls.
Registering a model
Two writes in one transaction: the ai_models row + the initial ai_model_pricing_history row. The Console superadmin endpoint POST /v1/admin/ai-models does both in one tx; programmatic registration goes through aimodels.Service.Create.
Required fields:
model_provider— e.g."anthropic","openai","voyage". Free-form (the CI guard validates against the registration path, not an enum).model_name— e.g."claude-opus","text-embedding-3-large".model_version— e.g."4.7","v3". Combined with provider + name to form the UNIQUE key.capability— one oftext_generation,embedding,transcription,vision,classification. CHECK constraint enforces.unit_type— whatusage_records.unit_typewill be set to when this model is invoked. Conventionally"tokens"for LLM/embeddings/classification,"audio_seconds"for transcription,"images"(or"tokens"if multimodal) for vision.validation_status—experimentalfor newly registered models;validatedonce the model passes the per-(model, task) validation regimen the medical-device-readiness rule from CLAUDE.md requires for clinical features.cost_per_input_unit_cents— NUMERIC string (foundation accepts fractional cents like"0.0003"for Anthropic-style pricing).cost_per_output_unit_cents— same shape; set equal to input rate or"0"for capabilities with no output direction (embeddings, transcription).
The pricing row is created with effective_from = NOW() and effective_to = NULL (current). Future rows (price changes) close it.
Changing pricing
POST /v1/admin/ai-models/{id}/price-change is the only path. The endpoint runs in one transaction:
- Closes the model's current pricing row (sets
effective_to = effective_fromof the new row). - Inserts the new pricing row with
effective_to = NULL.
The partial unique index ai_model_pricing_history_current (ON model_id WHERE effective_to IS NULL) guarantees at most one current row per model. A future price change (caller-supplied effective_from > NOW()) is a permitted shape — the close happens at that future timestamp, and LookupActivePricing(modelID, NOW()) continues to return the old row until the boundary passes.
changed_by_principal_id is the superadmin who issued the change; recorded for the audit trail. notes is free-form context ("Anthropic raised input rate 2025-08-15 per their announcement").
Looking up active pricing at call time
The AI provider impl reads the row in effect at call time and snapshots into usage_records.cost_cents so closed-period billing reconstruction is accurate forever:
pricing, err := aiModels.LookupActivePricing(ctx, modelID, time.Now().UTC())
if err != nil { return ... }
inputCost := computeCost(usage.InputTokens, pricing.CostPerInputUnitCents)
outputCost := computeCost(usage.OutputTokens, pricing.CostPerOutputUnitCents)
reservation.Settle(ctx, capabilities.SettleResult{
Entries: []capabilities.SettleEntry{
{UnitType: "input_tokens", Units: usage.InputTokens, CostCents: &inputCost},
{UnitType: "output_tokens", Units: usage.OutputTokens, CostCents: &outputCost},
},
PrincipalID: principalID,
})The two-row Settle is the canonical shape for capabilities with split-direction pricing. Single-direction capabilities (embeddings only have an input direction) emit one entry.
Recording AI provenance with the audit row
When the call involves an AI model, the AuditFunc closure of the wrap helper calls audit.RecordWithProvenance instead of audit.Record. Both rows commit together — same-tx FK + lifecycle:
prov := audit.AIProvenance{
ModelID: modelID,
InputsHash: sha256OfCanonicalisedPrompt(req),
Confidence: usage.ConfidencePtr(),
}
auditID, err := audit.RecordWithProvenance(ctx, audit.Event{...}, prov)InputsHash is the platform's "is this the prompt we ran?" evidence — compliance auditors need it without storing the raw prompt content (clinical features may include patient PII). The hash is computed against a canonicalised representation of the prompt the impl chooses; the choice is per-capability and documented in the impl's package comment.
Confidence is provider-supplied where available (some Anthropic Claude responses surface it; many providers don't). NULL is valid.
Validation status and clinical features
experimental models cannot be invoked for clinical features. The capability impl checks the validation status from ai_models before resolving the provider; an experimental model returns capabilities.ErrProviderUnavailable with a typed reason. Validation status flips to validated only after the per-(model, task) validation regimen passes — exact requirements live in the medical-device-readiness rule in CLAUDE.md and per-feature SaMD docs.
deprecated models still callable but warned (telemetry surfaces a deprecation flag); first AI feature consumer wires the warning. retired models are refused at the capability seam — the validation_status CHECK constraint + the retired_at consistency CHECK ensure the row's lifecycle is internally coherent.
Cross-references
- P53 AI Capability Provenance and Variable-Cost Metering — the locked architectural pattern.
- P50 Capability Convention —
WrapMeteredAIis the AI-shaped sibling ofWrapMeteredProvider. - Foundation 1C.7 Metering & Quotas — the underlying
usage_records/usage_quotas/usage_summariesschema. - SOUP inventory — every AI/ML model is also a SOUP entry with
model_provider/model_version/validation_statusfields. - Data classification —
ai_modelsisorg_internal(registry feeds patient transparency UIs);ai_model_pricing_historyisaudit_onlywith NO support_export egress (pricing is platform-confidential).