Exercise Content Pipeline
Operational reference for how raw exercise primitives land in S3, how the composer turns them into rendered videos on Bunny Stream, and how to add a new exercise to the platform.
For the compositional model itself (asset bundle shape, recipe contract, variant model), see features/exercise-library/composition.md. For the architectural pattern, see P56.
The pipeline at a glance
```
Filming / audio team
        │
        │  prepares manifest.json + names files per convention
        ▼
aws s3 sync
        │
        ▼
s3://restartix-exercise-assets-{env}/{exercise}/    ← raw primitives, our control
        │
        │  composer downloads bundle when called
        ▼
exercise-composer service                           ← stateless, Fargate task
        │
        │  ffmpeg bake (intro → sets → outro), upload to Bunny
        ▼
Bunny Stream library                                ← transcoded HLS, CDN delivery
        │
        │  bunny_video_id stored in Core API DB
        ▼
Patient app                                         ← plays HLS playlist via Bunny's player or HLS.js
```

The composer is read-only on S3 and write-only on Bunny. The patient app reads only from Bunny.
AWS S3: source-of-truth buckets
| Env | Bucket | Region | Provisioning | Composer access | Versioning |
|---|---|---|---|---|---|
| dev | restartix-exercise-assets-dev | eu-central-1 | Manual (Console) | local-dev IAM user, read+write | off |
| staging | restartix-exercise-assets-staging | eu-central-1 | Terraform (storage-s3 module) | Fargate task role, read-only | on |
| production | restartix-exercise-assets-production | eu-central-1 | Terraform (storage-s3 module) | Fargate task role, read-only | on |
All three: SSE-S3 encryption, public access blocked (all 4 toggles), no lifecycle rules, no Object Lock. Composer never writes back to S3 by design — outputs go to Bunny, not back into the assets bucket.
Why composer is read-only
Renders are outputs, not derivations stored back into the source bucket. Bunny owns the render lifecycle. Constraining composer to read-only on S3 makes the data-flow direction explicit and prevents accidental corruption of the source-of-truth primitives.
Object layout under s3://{bucket}/
```
{exercise-slug}/
├── manifest.json
├── intro-video-{1,2,3}.mp4
├── pauza-video-{1,2,3}.mp4
├── outro-video-{1,2,3}.mp4
├── rep-left-video-{1,2,3}.mp4
├── rep-right-video-{1,2,3}.mp4
└── audio/{lang}/
    ├── intro-vo-{1,2,3}.mp3
    ├── pauza-vo-{1,2,3}.mp3
    ├── outro-vo-{1,2,3}.mp3
    ├── rep-left-vo-{1,2,3}.mp3
    └── rep-right-vo-{1,2,3}.mp3
```

The composer auto-discovers variant counts from filenames matching `{slot}-{n}.{ext}`. Missing files don't crash the composer — they just narrow the variant pool for that slot. (A render request that needs a slot with zero variants fails validation.)
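The auto-discovery rule can be illustrated with a small local sketch (a stand-in for the composer's behaviour, not its actual code):

```shell
# Stand-in bundle: full intro pool, one rep-left variant, no outro files at all.
demo=$(mktemp -d); cd "$demo"
touch intro-video-1.mp4 intro-video-2.mp4 intro-video-3.mp4 rep-left-video-1.mp4

# Variant count per slot = number of files matching {slot}-{n}.mp4
count_variants() { ls "$1"-*.mp4 2>/dev/null | wc -l; }

echo "intro-video:    $(count_variants intro-video) variants"    # 3 → full pool
echo "rep-left-video: $(count_variants rep-left-video) variants" # 1 → narrowed pool
echo "outro-video:    $(count_variants outro-video) variants"    # 0 → a recipe needing this slot fails validation
```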
Composer's IAM policy (locked shape)
When the composer ships on Fargate (post-1E close), its task role attaches this policy per env:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ExerciseAssetsRead",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::restartix-exercise-assets-{env}",
        "arn:aws:s3:::restartix-exercise-assets-{env}/*"
      ]
    }
  ]
}
```

No `s3:PutObject`, no `s3:DeleteObject`, no `s3:PutObjectTagging`. By design.
Local dev credentials
The restartix-platform-local-dev IAM user has read+write on the dev bucket only. Its keys live in services/api/.env.local (and are reused verbatim by services/exercise-composer/.env.local). Reach them via the user's password manager under "RestartiX AWS local dev". No LocalStack — local dev hits the real dev bucket.
Bunny Stream: rendered video delivery
| Env | Library | Region | Provisioning | Credentials |
|---|---|---|---|---|
| dev | manually provisioned (e.g. restartix-exercise-renders-dev) | Frankfurt | Bunny dashboard | local .env.local |
| staging | manually provisioned | Frankfurt | Bunny dashboard | AWS Secrets Manager restartix/staging/bunny-bootstrap |
| production | manually provisioned | Frankfurt | Bunny dashboard | AWS Secrets Manager restartix/production/bunny-bootstrap |
No Terraform Bunny provider. Bunny libraries are provisioned manually in Bunny's dashboard, one-time bootstrap per env. Each library has its own API key — separate libraries per env keep production reputation and quota separate from staging tests.
What the composer needs from each library
| Field | Where in Bunny dashboard | Composer env var |
|---|---|---|
| Library ID (numeric) | Library overview / URL | BUNNY_STREAM_LIBRARY_ID |
| API Key (full access) | Library → API tab → "API Key" (not "Read-only API Key") | BUNNY_STREAM_API_KEY |
| CDN hostname | Library → API tab → "CDN hostname" (e.g. vz-{token}.b-cdn.net) | BUNNY_STREAM_CDN_HOSTNAME |
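A minimal `.env.local` shape wiring the three fields above into the composer (all values here are placeholders, not real credentials):

```shell
# services/exercise-composer/.env.local — placeholder values only
BUNNY_STREAM_LIBRARY_ID=123456
BUNNY_STREAM_API_KEY=00000000-0000-0000-0000-000000000000
BUNNY_STREAM_CDN_HOSTNAME=vz-example.b-cdn.net
```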
Replication
For dev/staging libraries: only Frankfurt is enabled as a region. The Singapore / LA / NY replicas Bunny pre-selects by default add cost ($0.005/GB per replica per delivery) without serving real users at those stages. Production library replication should be decided based on patient geography at launch — likely Frankfurt + London for EU coverage.
Replicas cannot be removed after the library is created. Start lean.
Cost shape
For a 200-exercise library at 3 variants per slot, ~10 common prescriptions per exercise, 1 language:
- Encoding (one-time per render): ~$0.025/min × ~3 min × 2,000 renders = ~$150 one-time
- Storage: ~10 GB (transcoded HLS ladder) × $0.03/GB-month = ~$0.30/month
- CDN delivery: ~$0.005/GB; a patient session pulls ~50 MB = $0.00025/session
Per-language overhead: only the audio side (~6 GB across 200 exercises). Adding a 2nd language ≈ +$0.18/month storage.
Trivial at platform scale.
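The one-time encoding figure above can be sanity-checked with shell arithmetic (all inputs are the estimates from the list; milli-dollars keep the math in integers):

```shell
renders=$((200 * 10))              # 200 exercises × ~10 common prescriptions
encode_md=$((25 * 3 * renders))    # $0.025/min = 25 milli-dollars/min, ~3 min per render
echo "renders: $renders"                          # renders: 2000
echo "encoding: \$$((encode_md / 1000)) one-time" # encoding: $150 one-time
```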
Adding a new exercise
Two flows, depending on the exercise's kind:
- `reps_based` — primitives in S3 + composer renders per recipe. The flow below covers this case end-to-end.
- `duration_based` — a single pre-baked MP4. See Duration-based import at the end of this section.
For both, no admin UI exists today — both flows are manual.
1. Receive content from filming/audio team (reps_based)
The filming team delivers a folder of files for the new exercise. Typical raw delivery:
```
Detensionari Lombare/
├── 001 Detensionari Lombare - Intro.mp4
├── 001 Detensionari Lombare - Outro.mp4
├── 001 Detensionari Lombare - Pauza.mp4
├── 001 Detensionari Lombare 5 Stanga.mp4
├── 001 Detensionari Lombare 5 Dreapta.mp4
├── 001 Detensionari Lombare 1-20 Stanga.mp3
└── 001 Detensionari Lombare 1-20 Dreapta.mp3
```

(Filenames may vary; the team will deliver whatever naming convention they use internally.)
2. Rename files to the composer's convention
Reorganise into the layout the composer expects:
```
lumbar-detensioning/
├── manifest.json
├── intro-video-1.mp4        ← was "001 ... - Intro.mp4"
├── pauza-video-1.mp4        ← was "001 ... - Pauza.mp4"
├── outro-video-1.mp4        ← was "001 ... - Outro.mp4"
├── rep-left-video-1.mp4     ← was "001 ... 5 Stanga.mp4"
├── rep-right-video-1.mp4    ← was "001 ... 5 Dreapta.mp4"
└── audio/ro/
    ├── rep-left-vo-1.mp3    ← was "001 ... 1-20 Stanga.mp3"
    └── rep-right-vo-1.mp3   ← was "001 ... 1-20 Dreapta.mp3"
```

The framing VO (intro/pauza/outro) is currently baked into the framing videos' audio tracks. Extract it via ffmpeg into separate mp3s so it can be language-swapped later:
```shell
ffmpeg -hide_banner -y -i intro-video-1.mp4 -vn -c:a libmp3lame -b:a 192k audio/ro/intro-vo-1.mp3
ffmpeg -hide_banner -y -i pauza-video-1.mp4 -vn -c:a libmp3lame -b:a 192k audio/ro/pauza-vo-1.mp3
ffmpeg -hide_banner -y -i outro-video-1.mp4 -vn -c:a libmp3lame -b:a 192k audio/ro/outro-vo-1.mp3
```

This is an interim step until the filming team delivers intro/pauza/outro as silent video + separate VO per language (the locked future state per the composition spec). For exercises where the framing video has dialogue baked in, the lip-sync mismatch with future-language VO will surface; those exercises will need re-shooting.
3. Add 2nd and 3rd variants (when available)
If the filming team has delivered 3 variants per slot, name them -1, -2, -3. If only one variant exists today, create placeholders by duplicating it:

```shell
cp intro-video-1.mp4 intro-video-2.mp4
cp intro-video-1.mp4 intro-video-3.mp4
# … same for every slot
```

The composer happily picks among "3 variants" that are byte-identical; the render works, just without the anti-repetitiveness benefit, until the real 2nd and 3rd variants arrive.
4. Write the manifest
```json
{
  "exercise": "lumbar-detensioning",
  "reps_per_video_block": 5,
  "counts_per_audio_master": 20,
  "sides": ["left", "right"],
  "languages": ["ro"]
}
```

For bilateral exercises that don't switch sides (e.g. forward fold, plank), set `"sides": ["bilateral"]` instead of `["left", "right"]` — the composer's variant model extends naturally, but the recipe shape would need adjustment. (Today only `["left", "right"]` is implemented.)
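A quick pre-upload sanity check with `jq` can catch a malformed manifest before syncing (a hypothetical convenience, assuming `jq` is installed; the manifest content is the example above):

```shell
cat > /tmp/manifest.json <<'EOF'
{
  "exercise": "lumbar-detensioning",
  "reps_per_video_block": 5,
  "counts_per_audio_master": 20,
  "sides": ["left", "right"],
  "languages": ["ro"]
}
EOF

# jq -e exits non-zero if any required key is missing or any list is empty:
jq -e 'has("exercise") and has("reps_per_video_block")
       and (.sides | length > 0) and (.languages | length > 0)' /tmp/manifest.json \
  && echo "manifest ok"
```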
5. Sync to the dev bucket
```shell
aws s3 sync lumbar-detensioning/ \
  s3://restartix-exercise-assets-dev/lumbar-detensioning/ \
  --exclude '.DS_Store' \
  --exclude '*.tmp'
```

aws s3 sync does delta uploads (only changed files) and doesn't delete remote files by default. Run it again after updating any file.
Verify:
```shell
aws s3 ls s3://restartix-exercise-assets-dev/lumbar-detensioning/ --recursive
```

Should show all expected files. Counts:
- 1 manifest.json
- 9 framing videos (3 per intro/pauza/outro)
- 6 rep videos (3 per side × 2 sides)
- 9 framing VO mp3s per language
- 6 rep VO mp3s per language
Total: 31 files with a single language (the manifest and 15 videos are language-independent; each language adds its 15 VO mp3s).
6. Test a render
Composer running locally on port 9400:
```shell
curl -s -X POST http://localhost:9400/v1/compose \
  -H 'Content-Type: application/json' \
  -d '{
    "exercise": "lumbar-detensioning",
    "language": "ro",
    "sets": [
      {"side":"left","reps":5},
      {"side":"right","reps":5}
    ],
    "seed": 42
  }' | jq .
```

Returns a bunny_video_id after ~5-10s. Open the returned playback_hls_url in Safari (or VLC, or the Bunny dashboard's library view) — Bunny needs ~30-60s to transcode after upload, so the first hit may show "not ready".
If the render fails, check:
- Composer logs (slog output) for the specific error
- That `aws s3 ls` shows the expected files (the composer fails fast if the manifest is missing or required variants are absent)
- That the manifest's `languages` array includes the requested language
- That every set's reps is a multiple of 5 in [5, 20]
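The last check's rule (reps a multiple of 5, within [5, 20]) as a standalone shell sketch — this mirrors the documented constraint, not the composer's actual validation code:

```shell
valid_reps() {
  [ $(( $1 % 5 )) -eq 0 ] && [ "$1" -ge 5 ] && [ "$1" -le 20 ]
}

for reps in 5 10 20 7 25; do
  if valid_reps "$reps"; then echo "$reps: ok"; else echo "$reps: rejected"; fi
done
# prints: 5: ok / 10: ok / 20: ok / 7: rejected / 25: rejected
```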
Duration-based import
duration_based exercises are single pre-baked MP4s — no primitives, no S3 asset bundle, no composer involvement. They arrive as a single video file (currently: imported from the old platform's legacy library; future: any new exercise authored as a fixed-content video instead of composed primitives).
The import workflow:
- Upload the MP4 to Bunny Stream directly (Bunny dashboard or API), into the per-slug collection (`{exercise-slug}/`). Note the resulting `bunny_video_id` + `duration_seconds`.
- Insert the `exercises` row with `kind='duration_based'`, `status='draft'`, no `asset_version` requirement (set to 1 by default).
- Insert one `exercise_renders` row with `recipe_hash='_imported'`, `language='ro'` (or whichever language the video carries), `recipe=NULL`, `bunny_video_id`, `status='ready'`, `asset_version=1`. The row is the catalog preview by default.
- Update `exercises.catalog_render_id` to point at the render row.
No aws s3 sync — there are no S3 primitives for duration_based exercises. The Bunny video IS the entire content.
When more languages are added later, upload another MP4 to the same Bunny collection and INSERT another exercise_renders row with the new language.
If the underlying video needs to be replaced (a re-encode, a correction): re-upload to Bunny, get the new bunny_video_id, UPDATE the existing render row's bunny_video_id + bump asset_version. The cache stays consistent because there's only one row.
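The single-row replacement can be sketched with sqlite3 as a stand-in for the real database (a simplified schema and hypothetical GUIDs; the production table lives in the Core API's Postgres):

```shell
db=$(mktemp -u).sqlite

sqlite3 "$db" <<'SQL'
CREATE TABLE exercise_renders (bunny_video_id TEXT, asset_version INT);
INSERT INTO exercise_renders VALUES ('old-guid', 1);

-- Re-upload happened: point the one render row at the new video and bump the version.
UPDATE exercise_renders
SET bunny_video_id = 'new-guid',
    asset_version  = asset_version + 1;

SELECT bunny_video_id || ' @ v' || asset_version FROM exercise_renders;
SQL
# prints: new-guid @ v2
```

Because the exercise has exactly one render row per language, the update is atomic from the cache's point of view: there is never a moment where two rows disagree.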
A migration script for the legacy import wave (hundreds of pre-existing videos) lives in services/api/migrations/ once the F9.1 import pass happens. For now, manual.
Bunny dashboard pointers
For debugging or browsing renders:
- Library overview: `https://dash.bunny.net/stream/{library-id}` — list of all videos, status, sizes, encoded variants
- Per-video page: shows embed code, HLS URL, MP4 download links, encoding logs, view counts
- API tab: where credentials live; "Webhook URL" for transcode-complete callbacks (not wired today)
Bunny videos can be deleted from the dashboard if you want to start clean during testing.
Asset versioning
The exercise_renders cache table shipped in commit 3d95e38 (F9.1 Phase 1). Each exercise carries an asset_version INT column; each render row records the asset_version it was baked against. The workflow:
- Filming team updates a variant (e.g. re-shoots `rep-left-video-2.mp4` with better lighting).
- You `aws s3 sync` the updated file to the bucket.
- You bump `exercises.asset_version` for that exercise.
- Cache rows with the old `asset_version` become stale; the next prescription request that's a cache miss (or a stale-version match — implementation detail in `Service.EnsureRender`) triggers a re-render. Old `bunny_video_id`s linger in Bunny for a grace period before cleanup.
Today (Phase 1 shipped):
- ✅ Schema: `asset_version` column on `exercises`, recorded on every `exercise_renders` row
- ✅ Cache lookup: `Service.EnsureRender` matches on `recipe_hash + language + asset_version`; a mismatched version is treated as a miss
- ❌ Bump endpoint: no Console action yet — manual `UPDATE exercises SET asset_version = asset_version + 1 WHERE slug = '...'` for now (this is a Phase 2 task; see composition.md → Phase 2 backlog)
- ❌ Eager catalog preview re-render on bump: same Phase 2 task
For Phase 1 testing without the bump endpoint, the manual SQL is fine. Patient-facing flows aren't live yet, so there's no risk of mid-session staleness.
What's NOT in this pipeline
- Patient recordings — the platform may eventually record sessions (telemetry, supervised reviews). Those are a separate concern, separate storage, separate provider — not in this pipeline.
- Live video calls — Daily.co handles those, not Bunny. Different concern.
- Marketing / brand video — not in this bucket. If we ever ship those, they need their own bucket (or live on whatever marketing CDN is chosen).
- AI-generated content — out of scope for v1.
Cross-references
- P56 Exercise Video Composition Pipeline — the architectural pattern
- features/exercise-library/composition.md — the compositional model spec
- services/exercise-composer/README.md — service implementation details (build, env vars, code structure)
- experiments/exercise-composer/README.md — sandbox for iterating on editorial decisions
- reference/external-providers.md — Bunny Stream provider entry
- reference/file-storage.md — the platform's other S3 bucket (for user-uploaded files; distinct from this exercise-assets bucket)