Telemetry — Telemetry & Compliance Service
Overview
Telemetry is the high-frequency data platform for RestartiX. It handles all data that is either too voluminous for the main PostgreSQL database or requires specialized storage patterns (time-series, append-only, immutable audit trails).
Telemetry shares the same API layer as the Core API backend but uses separate databases optimized for different workloads.
Architecture
┌─────────────────────────────────────────────────────────┐
│ Core API (Go) │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Business │ │ Telemetry │ │ Telemetry │ │
│ │ Logic │ │ Analytics │ │ Audit │ │
│ │ Handlers │ │ Handlers │ │ Handlers │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
└─────────┼──────────────────┼──────────────────┼──────────┘
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Main │ │ ClickHouse │ │ Telemetry │
│ PostgreSQL │ │ │ │ PostgreSQL │
│ │ │ Analytics │ │ │
│ Business │ │ Media │ │ Audit Logs │
│ Data │ │ Errors │ │ Security │
│ (RLS) │ │ API Metrics │ │ Privacy │
└──────────────┘  └──────────────┘  └──────────────┘

Telemetry PostgreSQL lives within the AWS VPC in a private subnet, not accessible from the internet. ClickHouse is hosted on ClickHouse Cloud, accessed from the Telemetry API over HTTPS (with authentication). No PHI is stored in ClickHouse — all identifiers are hashed and IPs truncated before insertion. See AWS Infrastructure for the full networking setup.
What Goes Where
ClickHouse — High-Frequency Analytics
Time-series data that arrives at high volume and is queried with aggregations (counts, averages, percentiles over time windows).
| Table | Purpose | Volume | Retention |
|---|---|---|---|
| analytics_events | Product analytics (page views, clicks, feature usage) | ~10K/day | 2 years |
| media_sessions | Video/exercise session tracking + performance metrics | ~5K/day | 2 years |
| media_buffering_events | Per-buffering-event detail for video troubleshooting | ~20K/day | 1 year |
| error_events | App/video/network errors with device context | ~1K/day | 1 year |
| api_metrics | API response times and error rates | ~50K/day | 1 year |
| pose_tracking_frames | MediaPipe body tracking during exercises | ~1M/day | 6 months |
| staff_metrics | Staff performance analytics | ~2K/day | 1 year |
Why ClickHouse: Columnar storage compresses time-series data 10-20x. Aggregation queries (avg TTFB by country, p95 buffering by device) run in milliseconds over billions of rows. PostgreSQL would struggle with this volume.
Telemetry PostgreSQL — Compliance & Security
Append-only, immutable data with strict retention requirements and legal obligations.
| Table | Purpose | Retention | Legal Basis |
|---|---|---|---|
| audit.audit_logs | HIPAA-compliant audit trail of all platform actions | 6-7 years | HIPAA §164.312(b) |
| security.security_events | Attack detection, threat analysis | 3 years | Legitimate interest (security) |
| privacy.privacy_exclusions | CCPA "Do Not Sell" network blocks (MaxMind) | Active | Legal obligation (CCPA) |
| privacy.privacy_exclusion_sync_log | Sync audit trail for exclusion updates | 2 years | Legal obligation |
Why separate PostgreSQL: Immutability triggers prevent UPDATE/DELETE on audit logs. TimescaleDB hypertables partition by month for efficient time-range queries. Main PostgreSQL should never hold compliance data — separation of concerns prevents accidental data loss during business migrations.
Main PostgreSQL — Business Data (NOT Telemetry)
These stay in the main database:
| Data | Why NOT Telemetry |
|---|---|
| Patient records, appointments, treatment plans | Relational, transactional, needs FK integrity |
| Staff CRUD actions | Already tracked via RLS and application logic |
| Service plans, products, orders | Business entities with complex relationships |
| Form submissions, custom fields | Tied to patient records via FKs |
Data Flow
Patient Watches Exercise Video
Patient opens exercise → Frontend fires media event
│
POST /v1/media/events
{event: "session_start", media_id: "...",
ttfb_ms: 340, connection_type: "4g"}
│
┌──────┴──────┐
│ Middleware │
│ - Enrich │ (device, geo, consent)
│ - Hash PII │ (actor_id → SHA-256)
│ - Check │ (CCPA exclusion)
└──────┬──────┘
│
┌───────────┼───────────┐
▼ ▼ ▼
ClickHouse ClickHouse Telemetry PG
media_ error_ audit.
sessions events audit_logs
 (if error)  (access log)

Heartbeat During Playback (every 10s)
Frontend sends heartbeat with current metrics:
{
event: "heartbeat",
position_seconds: 45.2,
buffering_count: 1,
buffering_duration_ms: 800,
current_bitrate: 2500000,
resolution: "720p",
dropped_frames: 2
}
→ ClickHouse INSERT into media_sessions (ReplacingMergeTree collapses to the latest row per session on merge)

Buffering Event (each occurrence)
Video stalls → Frontend fires buffering event:
{
event: "buffering_start",
position_seconds: 23.1,
bitrate_before: 2500000,
cdn_response_time_ms: 1200
}
→ ClickHouse INSERT media_buffering_events

Privacy & Compliance
Data Minimization
All data entering Telemetry is stripped of PII before storage:
| Original | Stored As | Method |
|---|---|---|
| User ID | actor_hash | One-way SHA-256 with rotating salt |
| IP Address | ip_truncated | Last octet zeroed (192.168.1.45 → 192.168.1.0) |
| Location | country_code + region | Coarse geo only (no city/coords) |
| Exercise ID | media_id | Hashed (no exercise names in analytics) |
Consent Levels
| Level | Name | What's Tracked | Legal Basis |
|---|---|---|---|
| 0 | Excluded | Nothing (CCPA opt-out) | Legal obligation |
| 1 | Essential | Security events, error tracking | Legitimate interest |
| 2 | Analytics | + Product analytics, media sessions | User consent |
| 3 | Full | + Marketing, behavioral analysis | Explicit consent |
Consent level is determined per-request from the frontend consent state. Events are gated: a level-2 analytics event is dropped if the user has consent level 1.
CCPA Integration
Weekly sync with MaxMind API fetches "Do Not Sell" IP ranges. The middleware checks every incoming request against these CIDR blocks. Excluded IPs get consent_level = 0 — only security events are logged.
Worker Services
| Worker | Purpose | Queue |
|---|---|---|
| audit-worker | Async audit log processing (enrichment, geo lookup, write to PG) | Redis |
| analytics-worker | Async analytics event processing (enrichment, write to ClickHouse) | Redis |
Workers run as background goroutines within the Telemetry API process. They consume from Redis queues, enrich events with geo/device data, and write to the appropriate database.
Key Docs
| Doc | Purpose |
|---|---|
| clickhouse-schema.sql | All ClickHouse tables and materialized views |
| postgres-schema.sql | Audit, security, and privacy PostgreSQL tables |
| api.md | All Telemetry API endpoints |
| media-events.md | Video event specification and performance metrics |
Design Principles
- Privacy by default — No PII in analytics. All identifiers are hashed, IPs truncated, locations coarsened.
- Right database for the job — ClickHouse for volume + aggregation, PostgreSQL for compliance + immutability.
- Consent gating — Events are dropped if user consent level is insufficient. No silent tracking.
- Append-only audit — Audit logs cannot be modified or deleted. Immutability enforced at database level.
- Async by default — Analytics and audit processing happens via workers to avoid blocking the API.
- Same network, separate storage — Telemetry databases share the AWS VPC with the Core API but are isolated databases.
- Troubleshooting first — Video performance metrics exist so support can answer "why is the patient's video not loading" with data, not guesswork.