# Telemetry API Endpoints
Layer 2 feature, not yet implemented. The Telemetry API is a separate Go service (`services/telemetry/`). All endpoints live there, not in Core API. This doc describes the locked design — see index.md for the full architecture and rationale.
The Telemetry API serves three concerns:
- Patient-facing ingest — pose batches, media events, session finalization. Authenticated via a short-lived signed session token issued by Core API at exercise-session start.
- Internal callbacks — Telemetry API calls Core API as a Cat F service-account principal to publish aggregation events via `events.Bus`. No public endpoint involved.
- Reads — none. All telemetry reads (specialist dashboards, patient progress, cohort views, replay blob fetch) flow through Core API. Telemetry API itself exposes no read endpoints to clients.
## Authentication
### Signed session token (hot path)
Pose-frame ingest hits 10k+ requests/sec at peak. Verifying a Clerk JWT per batch is wasteful. Instead:
- Patient starts an exercise session in Portal.
- Portal calls Core API: `POST /v1/exercise-sessions` (Clerk JWT auth as usual).
- Core API creates the session row and returns a short-lived signed token:

```json
{
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "telemetry_token": "v1.eyJwcmluY2lwYWxfaWQiOiIuLi4iLCJvcmdfaWQiOiIuLi4iLCJleGVyY2lzZV9zZXNzaW9uX2lkIjoiLi4uIiwiZXhwIjoxNzU0OTk5OTk5fQ.signature",
  "telemetry_token_expires_at": "2026-05-07T14:00:00Z"
}
```

- Portal sends every pose/media batch to Telemetry API with `Authorization: Bearer <telemetry_token>`.
- Telemetry API verifies the signature only (no Clerk dependency on the hot path) and extracts `principal_id`, `org_id`, and `exercise_session_id` from the claims.
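For illustration, a minimal sketch of how Core API could mint this token in Go, assuming (per the Signing note below) HMAC-SHA256 over a `v1.<base64url claims>` prefix; the package, function name, and claim layout are hypothetical, not the locked implementation:

```go
package telemetryauth

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"time"
)

// Mint issues a v1.<base64url claims>.<base64url signature> token like the
// example above. HMAC-SHA256 over the "v1.<claims>" prefix is assumed from
// the HS256 note in the Signing section; this is a sketch, not the spec.
func Mint(principalID, orgID, sessionID string, secret []byte, ttl time.Duration) (string, error) {
	now := time.Now()
	claims, err := json.Marshal(map[string]any{
		"principal_id":        principalID,
		"org_id":              orgID,
		"exercise_session_id": sessionID,
		"iat":                 now.Unix(),
		"exp":                 now.Add(ttl).Unix(), // e.g. session_start + 2h
	})
	if err != nil {
		return "", err
	}
	payload := "v1." + base64.RawURLEncoding.EncodeToString(claims)
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(payload))
	return payload + "." + base64.RawURLEncoding.EncodeToString(mac.Sum(nil)), nil
}
```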
Token claims:
| Claim | Type | Purpose |
|---|---|---|
| `principal_id` | UUID | Whose session this is (patient principal) |
| `org_id` | UUID | Tenant scope (must match what Core API recorded) |
| `exercise_session_id` | UUID | The session the token authorizes ingest for |
| `iat` | int (unix) | Issue time |
| `exp` | int (unix) | Expiry — typically session_start + 2 hours, with a renewal endpoint if needed |
Signing: HS256 today (see index.md → Swap-point interfaces for upgrade path to Ed25519 if needed). Secret rotated via Secrets Manager; Telemetry API holds current + previous, accepts both during rotation window.
Validation: Telemetry API rejects ingest with 401 if signature fails, token expired, or exercise_session_id mismatch. Rejects with 403 if the matching per-purpose consent flag (analytics for media, biometric for pose) is not active for principal_id in org_id — checked via cached read from Core API.
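A minimal sketch of that hot-path verification, under the same assumed `v1.<claims>.<signature>` format; names are illustrative, and the cached consent check (the 403 path) is left to the caller:

```go
package telemetryauth

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"strings"
	"time"
)

// Claims mirrors the token-claims table above.
type Claims struct {
	PrincipalID       string `json:"principal_id"`
	OrgID             string `json:"org_id"`
	ExerciseSessionID string `json:"exercise_session_id"`
	Iat               int64  `json:"iat"`
	Exp               int64  `json:"exp"`
}

// Verify checks the HMAC-SHA256 signature against the current secret and,
// during the rotation window, the previous one, then enforces expiry.
// The caller still compares ExerciseSessionID to the session being
// ingested (mismatch -> 401) and runs the consent check (-> 403).
func Verify(token string, current, previous []byte, now time.Time) (*Claims, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 || parts[0] != "v1" {
		return nil, errors.New("malformed token") // -> 401
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return nil, errors.New("malformed claims") // -> 401
	}
	sig, err := base64.RawURLEncoding.DecodeString(parts[2])
	if err != nil {
		return nil, errors.New("malformed signature") // -> 401
	}
	signed := []byte(parts[0] + "." + parts[1])
	if !validMAC(signed, sig, current) && !validMAC(signed, sig, previous) {
		return nil, errors.New("bad signature") // -> 401
	}
	var c Claims
	if err := json.Unmarshal(payload, &c); err != nil {
		return nil, errors.New("unparseable claims") // -> 401
	}
	if now.Unix() >= c.Exp {
		return nil, errors.New("token expired") // -> 401
	}
	return &c, nil
}

// validMAC does a constant-time comparison of the expected HMAC.
func validMAC(msg, sig, key []byte) bool {
	mac := hmac.New(sha256.New, key)
	mac.Write(msg)
	return hmac.Equal(mac.Sum(nil), sig)
}
```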
### Internal callbacks (Telemetry → Core API)
Telemetry API holds a Cat F service-account principal credential and calls Core API to publish aggregation events:
```
POST <core-api>/v1/internal/telemetry/session-aggregated
Authorization: Bearer <service-account-jwt>

{
  "session_id": "...",
  "principal_id": "...",
  "org_id": "...",
  "session_metrics": { ... },
  "rep_metrics": [ ... ],
  "media_metrics": { ... }
}
```

Core API verifies the service-account credential, validates the payload against the `events.Bus` event schema, and the subscriber writes to PG. RLS scope is set from `org_id` in the payload; the service-account principal has read-only RLS plus a narrow internal-write permission.
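For concreteness, a sketch of what the callback client could look like in Go; the struct field types and accepted status codes are assumptions, since the locked `events.Bus` schema lives elsewhere:

```go
package corecallback

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// SessionAggregated mirrors the callback payload above; the metric shapes
// are left as raw JSON placeholders, not the locked events.Bus schema.
type SessionAggregated struct {
	SessionID      string          `json:"session_id"`
	PrincipalID    string          `json:"principal_id"`
	OrgID          string          `json:"org_id"`
	SessionMetrics json.RawMessage `json:"session_metrics"`
	RepMetrics     json.RawMessage `json:"rep_metrics"`
	MediaMetrics   json.RawMessage `json:"media_metrics"`
}

// PublishAggregates POSTs the aggregation event to Core API using the
// Cat F service-account JWT.
func PublishAggregates(ctx context.Context, coreAPI, svcJWT string, ev SessionAggregated) error {
	body, err := json.Marshal(ev)
	if err != nil {
		return err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		coreAPI+"/v1/internal/telemetry/session-aggregated", bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Bearer "+svcJWT)
	req.Header.Set("Content-Type", "application/json")
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK && resp.StatusCode != http.StatusAccepted {
		return fmt.Errorf("core api rejected aggregates: %s", resp.Status)
	}
	return nil
}
```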
## Endpoints
### POST /v1/pose/frames
Batched pose ingest. Client buffers ~1 second of frames and posts.
Request body (binary, `Content-Type: application/octet-stream`):

```
[1-byte version] [4-byte frame_count] [4-byte fps_hint]
[N × 33 × 4 × float32 landmarks]  ← x, y, z, visibility per landmark per frame
[N × 4-byte timestamp_ms]         ← elapsed ms since session_start, per frame
[gzip wrapper around the whole binary blob]
```

Plus an optional JSON sidecar with batch metadata in a `?meta=...` query param or `X-Batch-Meta` header (`pose_confidence` summary, `camera_resolution`, `processing_time_ms` — small, not per-frame).
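A sketch of a client-side encoder for this layout (endianness is not specified in this doc, so little-endian is assumed; package and type names are illustrative):

```go
package posebatch

import (
	"bytes"
	"compress/gzip"
	"encoding/binary"
	"fmt"
)

const landmarksPerFrame = 33 // x, y, z, visibility per landmark

// Frame is one pose sample: 33 landmarks × (x, y, z, visibility),
// plus elapsed ms since session_start.
type Frame struct {
	Landmarks   [landmarksPerFrame * 4]float32
	TimestampMS uint32
}

// Encode packs frames into the wire layout above and gzips the result.
func Encode(frames []Frame, fpsHint uint32) ([]byte, error) {
	var raw bytes.Buffer
	raw.WriteByte(1) // 1-byte version
	if err := binary.Write(&raw, binary.LittleEndian, uint32(len(frames))); err != nil {
		return nil, err
	}
	if err := binary.Write(&raw, binary.LittleEndian, fpsHint); err != nil {
		return nil, err
	}
	for _, f := range frames { // landmark block: N × 33 × 4 × float32
		if err := binary.Write(&raw, binary.LittleEndian, f.Landmarks); err != nil {
			return nil, err
		}
	}
	for _, f := range frames { // timestamp block: N × uint32
		if err := binary.Write(&raw, binary.LittleEndian, f.TimestampMS); err != nil {
			return nil, err
		}
	}
	var out bytes.Buffer
	zw := gzip.NewWriter(&out)
	if _, err := zw.Write(raw.Bytes()); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	if out.Len() > 64<<10 { // server would reply 413 past the ~64 KB cap
		return nil, fmt.Errorf("batch too large after gzip: %d bytes", out.Len())
	}
	return out.Bytes(), nil
}
```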
Response: 202 Accepted

```json
{
  "frames_accepted": 10,
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "buffer_position_bytes": 5280
}
```

Failure modes:
- `400` — malformed binary, unsupported version
- `401` — token signature invalid / expired
- `403` — biometric consent not active
- `413` — batch too large (cap at ~64 KB after gzip)
- `429` — backpressure: drop policy says shed this batch (pose frames are droppable; aggregates are not)
Backpressure behavior: at high load, Telemetry API may drop pose-frame batches. Frames are fungible — losing 5% still gives 95% of an exercise's signal. The aggregator's `session_end` summary remains accurate within tolerance. Aggregate writes (`POST /v1/sessions/{id}/end`) MUST NOT drop; they have priority quota.
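One way to implement that split, sketched with two capacity pools: pose admission is non-blocking and sheds with 429, while finalize admission blocks on its own reserved pool. Pool sizes and names are illustrative, not the locked design:

```go
package backpressure

import "net/http"

// Limiter holds separate capacity pools: pose batches are shed when the
// pool is full (429), finalize requests queue on a reserved pool and
// never shed.
type Limiter struct {
	pose     chan struct{} // shed on overflow
	finalize chan struct{} // block, never shed
}

func New(poseSlots, finalizeSlots int) *Limiter {
	return &Limiter{
		pose:     make(chan struct{}, poseSlots),
		finalize: make(chan struct{}, finalizeSlots),
	}
}

// AdmitPose returns false when the pose pool is saturated; the handler
// replies 429 and the batch is dropped (frames are fungible).
func (l *Limiter) AdmitPose() bool {
	select {
	case l.pose <- struct{}{}:
		return true
	default:
		return false
	}
}

func (l *Limiter) DonePose() { <-l.pose }

// AdmitFinalize blocks until a slot frees: aggregate writes must not drop.
func (l *Limiter) AdmitFinalize() { l.finalize <- struct{}{} }
func (l *Limiter) DoneFinalize() { <-l.finalize }

// poseHandler shows the shed path wired into an HTTP handler.
func poseHandler(l *Limiter, next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if !l.AdmitPose() {
			http.Error(w, "shed", http.StatusTooManyRequests)
			return
		}
		defer l.DonePose()
		next(w, r)
	}
}
```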
### POST /v1/media/events
Video lifecycle ingest. Volume is low (~110 events/sec at 1000 concurrent), so JSON is fine here.
Request body (`Content-Type: application/json`):

```json
{
  "event": "session_start",
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "media_id": "exercise-uuid",
  "media_type": "video",
  "timestamp": "2026-05-07T10:00:00.000Z",
  "data": {
    "total_duration_seconds": 120.5,
    "ttfb_ms": 340,
    "video_load_time_ms": 1200,
    "cdn_response_time_ms": 280,
    "connection_type": "wifi",
    "effective_bandwidth": 12.5,
    "rtt_ms": 45,
    "initial_bitrate": 2500000,
    "initial_resolution": "720p"
  }
}
```

The full event taxonomy (`session_start`, `play`, `pause`, `seek`, `heartbeat`, `buffering_start`, `buffering_end`, `quality_change`, `milestone`, `error`, `session_end`) and per-event field schemas live in media-events.md.
Response: 202 Accepted

```json
{
  "status": "accepted",
  "session_id": "550e8400-e29b-41d4-a716-446655440000"
}
```

Failure modes:
- `400` — unknown event type, schema violation
- `401` — token invalid / expired
- `403` — `analytics` consent not active
- `429` — rate limit (per-session)
### POST /v1/sessions/{id}/end
Session finalizer. Triggers server-side aggregation from accumulated landmarks, writes the S3 replay blob, publishes the `events.Bus` event. Must not drop — separate priority quota from pose-frame ingest.
Request body (`Content-Type: application/json`):

```json
{
  "ended_at": "2026-05-07T10:30:00.000Z",
  "client_status": "completed",
  "exercise_id": "exercise-uuid",
  "total_frames_attempted": 18000
}
```

The server cross-checks `total_frames_attempted` against actual frames received to flag suspicious gaps.
Response: 200 OK

```json
{
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "frames_received": 17850,
  "frames_dropped": 150,
  "replay_blob": "s3://restartix-telemetry/{org_id}/{session_id}.bin.gz",
  "aggregates_published_at": "2026-05-07T10:30:01.234Z"
}
```

The `replay_blob` URL is internal — clients fetch replay via Core API (`GET /v1/exercise-sessions/{id}/replay`), which mints a short-lived signed S3 URL.
Failure modes:
- `400` — bad payload
- `401` — token invalid
- `404` — session not found
- `409` — session already finalized (idempotent: same `client_status` returns 200; conflicting status returns 409)
- `500` — aggregation or S3 write failed; client retries with the same payload (idempotent)
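A small sketch of the finalize decision implied by those failure modes; the names here are hypothetical:

```go
package finalize

// Finalization is the stored finalize state; nil means the session has
// not been finalized yet.
type Finalization struct {
	ClientStatus string
}

// Outcome maps a finalize request onto the 200/409 behavior above.
type Outcome int

const (
	RunAggregation Outcome = iota // first call: aggregate, write blob, publish
	ReturnExisting                // 200: same client_status, idempotent replay
	Conflict                      // 409: conflicting client_status
)

// Decide picks the outcome from stored state and the incoming status.
func Decide(existing *Finalization, requestedStatus string) Outcome {
	switch {
	case existing == nil:
		return RunAggregation
	case existing.ClientStatus == requestedStatus:
		return ReturnExisting
	default:
		return Conflict
	}
}
```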
### GET /v1/healthz
Internal health check for the ALB. Returns `200 OK` with `{"status": "ok", "version": "..."}`. Not authenticated; not exposed publicly.
## Reads (none on Telemetry API)
There are no read endpoints on Telemetry API for clients. All reads flow through Core API:
| Reader → endpoint | Source data |
|---|---|
| Patient: `GET /v1/me/exercise-sessions` | `pose_session_metrics` + `media_session_metrics` (PG, RLS by principal) |
| Specialist: `GET /v1/patients/{id}/exercise-sessions` | Same tables, RLS by org + permission |
| Specialist: `GET /v1/exercise-sessions/{id}/replay` | Signed S3 URL to replay blob |
| Clinic admin: `GET /v1/analytics/cohort/exercise-adherence` | Materialized view over `pose_session_metrics` |
| Console: `GET /v1/admin/platform/exercise-aggregate-counts` | Anonymised aggregates only (no `principal_id` in response) |
This is by design: Core API is the single source of authentication, RLS, audit, classification, and per-org permission enforcement. Telemetry API is ingest-only.
## Session lifecycle (end-to-end)
```
Portal               Core API             Telemetry API         S3          events.Bus     Core API subscriber    PG
  │                     │                      │                 │               │                 │              │
  ├─ POST /v1/exercise-sessions ──►            │                 │               │                 │              │
  │                     │ create row,          │                 │               │                 │              │
  │                     │ mint signed token    │                 │               │                 │              │
  │ ◄─ 200 (session_id, telemetry_token) ──    │                 │               │                 │              │
  │                     │                      │                 │               │                 │              │
  ├─ POST /v1/pose/frames (token) ───────────► │ verify, append  │               │                 │              │
  │                     │                      │ to S3 multipart ──►             │                 │              │
  │ ◄─ 202 ──────────────────────────────────  │                 │               │                 │              │
  │                     │    ... repeat for 30 min ...           │               │                 │              │
  │                     │                      │                 │               │                 │              │
  ├─ POST /v1/sessions/{id}/end ─────────────► │ aggregate from  │               │                 │              │
  │                     │                      │ landmarks,      │               │                 │              │
  │                     │                      │ finalize S3 ──► │               │                 │              │
  │                     │                      │ publish event ──────────────────►                 │              │
  │                     │                      │                 │               ├─ deliver event ─► INSERT pose_*
  │                     │                      │                 │               │                 │ + media_*
  │                     │                      │                 │               │                 │ + UPDATE
  │                     │                      │                 │               │                 │ patient_exercise_logs
  │ ◄─ 200 (replay_blob URL) ──────────────── │                 │               │                 │              │
  │                     │                      │                 │               │                 │              │
  │    ... later, specialist views ...         │                 │               │                 │              │
  │                  ◄─ GET /v1/exercise-sessions/{id} (Clinic app)              │                 │              │
  │                     │ read PG aggregates ─────────────────────────────────────────────────────────────────── ►
  │                     │ mint signed S3 URL   │                 │               │                 │              │
  │                  ── 200 ──►                │                 │               │                 │              │
  │                  ◄─ GET /v1/exercise-sessions/{id}/replay    │               │                 │              │
  │                     │ S3 GET ───────────────────────────►  (browser)         │                 │              │
```

## Idempotency
- Pose batches are not idempotent — duplicate batches are appended (small cost). Clients should not retry pose batches; on failure, the next batch overlaps anyway.
- Media events are idempotent by `(session_id, event_type, timestamp)` — duplicates are dropped at ingest (see the dedup sketch below).
- `POST /v1/sessions/{id}/end` is idempotent — finalizing a finalized session returns the existing finalization state.
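A sketch of the media-event dedup implied by that tuple; whether the real store is in-memory, Redis, or a DB constraint is not specified here, so an in-memory stand-in is used and all names are illustrative:

```go
package dedup

import "sync"

// Key is the idempotency tuple from the bullet above; the timestamp is
// kept as the wire string to avoid clock-normalization subtleties.
type Key struct {
	SessionID string
	EventType string
	Timestamp string
}

// Seen is an in-memory stand-in for the real dedup store.
type Seen struct {
	mu   sync.Mutex
	keys map[Key]struct{}
}

func NewSeen() *Seen { return &Seen{keys: make(map[Key]struct{})} }

// FirstTime records the key and reports whether it was new; duplicates
// are silently dropped at ingest.
func (s *Seen) FirstTime(k Key) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, dup := s.keys[k]; dup {
		return false
	}
	s.keys[k] = struct{}{}
	return true
}
```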
## Rate limiting
Per-session quota: ~1 batch/sec per endpoint (matches client batch cadence). Burst allowance ~5 batches/sec for reconnect-and-flush cases. Per-org global quota: based on entitlements + 1C.7 metering — exceeding the org's quota returns 429 across all sessions.
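A sketch of the per-session quota using `golang.org/x/time/rate` (sustained 1/sec, burst 5, matching the numbers above); the per-org entitlement and metering layer is omitted:

```go
package ratelimit

import (
	"sync"

	"golang.org/x/time/rate"
)

// PerSession keeps one limiter per session: ~1 batch/sec sustained,
// burst of 5 for reconnect-and-flush cases.
type PerSession struct {
	mu       sync.Mutex
	limiters map[string]*rate.Limiter
}

func New() *PerSession {
	return &PerSession{limiters: make(map[string]*rate.Limiter)}
}

// Allow reports whether this session may send another batch now;
// a false return maps to 429.
func (p *PerSession) Allow(sessionID string) bool {
	p.mu.Lock()
	lim, ok := p.limiters[sessionID]
	if !ok {
		lim = rate.NewLimiter(rate.Limit(1), 5) // 1/sec, burst 5
		p.limiters[sessionID] = lim
	}
	p.mu.Unlock()
	return lim.Allow()
}
```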
## Observability
Telemetry API itself emits OTel traces / metrics / logs to the platform's observability stack (Datadog at scale, CloudWatch at staging). These are operational signals about the service — separate concern from the product telemetry it ingests.
## Migration / older spec
Earlier versions of this doc described `POST /v1/analytics/track`, `POST /v1/errors/report`, `POST /v1/audit/ingest`, and an admin/dashboard surface. All four are out — see index.md → Scope for what each was replaced by (or rejected). When Layer 2 builds the service, this doc is the canonical reference; older references in feature specs will be cleaned up at that time.