Exercise Content Pipeline

Operational reference for how raw exercise primitives land in S3, how the composer turns them into rendered videos on Bunny Stream, and how to add a new exercise to the platform.

For the compositional model itself (asset bundle shape, recipe contract, variant model), see features/exercise-library/composition.md. For the architectural pattern, see P56.

The pipeline at a glance

  Filming / audio team

        │  prepares manifest.json + names files per convention

  aws s3 sync


  s3://restartix-exercise-assets-{env}/{exercise}/   ← raw primitives, our control

        │  composer downloads bundle when called

  exercise-composer service                          ← stateless, Fargate task

        │  ffmpeg bake (intro → sets → outro), upload to Bunny

  Bunny Stream library                                ← transcoded HLS, CDN delivery

        │  bunny_video_id stored in Core API DB

  Patient app                                         ← plays HLS playlist via Bunny's player or HLS.js

The composer is read-only on S3 and write-only on Bunny. The patient app reads only from Bunny.

AWS S3: source-of-truth buckets

| Env | Bucket | Region | Provisioning | Composer access | Versioning |
|---|---|---|---|---|---|
| dev | restartix-exercise-assets-dev | eu-central-1 | Manual (Console) | local-dev IAM user, read+write | off |
| staging | restartix-exercise-assets-staging | eu-central-1 | Terraform (storage-s3 module) | Fargate task role, read-only | on |
| production | restartix-exercise-assets-production | eu-central-1 | Terraform (storage-s3 module) | Fargate task role, read-only | on |

All three: SSE-S3 encryption, public access blocked (all 4 toggles), no lifecycle rules, no Object Lock. Composer never writes back to S3 by design — outputs go to Bunny, not back into the assets bucket.
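
To spot-check that a bucket matches this shape, the standard s3api read calls are enough (a quick sketch; swap in the env's bucket name):

bash
aws s3api get-bucket-encryption   --bucket restartix-exercise-assets-dev      # expect SSE-S3 (AES256)
aws s3api get-public-access-block --bucket restartix-exercise-assets-dev      # expect all four toggles true
aws s3api get-bucket-versioning   --bucket restartix-exercise-assets-staging  # expect "Status": "Enabled" (staging/prod)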

Why composer is read-only

Renders are outputs, not derivations stored back into the source bucket. Bunny owns the render lifecycle. Constraining composer to read-only on S3 makes the data-flow direction explicit and prevents accidental corruption of the source-of-truth primitives.

Object layout under s3://{bucket}/

{exercise-slug}/
├── manifest.json
├── intro-video-{1,2,3}.mp4
├── pauza-video-{1,2,3}.mp4
├── outro-video-{1,2,3}.mp4
├── rep-left-video-{1,2,3}.mp4
├── rep-right-video-{1,2,3}.mp4
└── audio/{lang}/
    ├── intro-vo-{1,2,3}.mp3
    ├── pauza-vo-{1,2,3}.mp3
    ├── outro-vo-{1,2,3}.mp3
    ├── rep-left-vo-{1,2,3}.mp3
    └── rep-right-vo-{1,2,3}.mp3

The composer auto-discovers variant counts from filenames matching {slot}-{n}.{ext}. Missing files don't crash the composer — they just narrow the variant pool for that slot. (A render request that needs a slot with zero variants fails validation.)
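
A rough CLI approximation of that discovery, for eyeballing the variant pool the composer will see (a sketch, not the composer's actual code; it merges all languages into one count):

bash
# slot → variant count for one exercise bundle
aws s3 ls s3://restartix-exercise-assets-dev/lumbar-detensioning/ --recursive \
  | awk '{print $NF}' \
  | grep -E -- '-[0-9]+\.(mp4|mp3)$' \
  | sed -E 's|.*/||; s/-[0-9]+\.(mp4|mp3)$//' \
  | sort | uniq -c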

Composer's IAM policy (locked shape)

When the composer ships on Fargate (post-1E close), its task role attaches this policy per env:

json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ExerciseAssetsRead",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::restartix-exercise-assets-{env}",
        "arn:aws:s3:::restartix-exercise-assets-{env}/*"
      ]
    }
  ]
}

No s3:PutObject, no s3:DeleteObject, no s3:PutObjectTagging. By design.
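
A quick behavioural check from a shell that has assumed the task role (sketch; the probe key name is made up):

bash
aws s3 cp s3://restartix-exercise-assets-staging/lumbar-detensioning/manifest.json /tmp/manifest.json
# → succeeds

aws s3 cp /tmp/manifest.json s3://restartix-exercise-assets-staging/_write-probe.json
# → fails with AccessDenied, confirming the read-only shape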

Local dev credentials

The restartix-platform-local-dev IAM user has read+write on the dev bucket only. Its keys live in services/api/.env.local (and are reused verbatim by services/exercise-composer/.env.local); retrieve them from the password manager entry "RestartiX AWS local dev". No LocalStack: local dev hits the real dev bucket.
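
For orientation, the relevant .env.local entries look roughly like this; the AWS_* names are the standard SDK variables, but EXERCISE_ASSETS_BUCKET is an assumed name, so check the actual file:

bash
# services/exercise-composer/.env.local (illustrative shape)
AWS_ACCESS_KEY_ID=AKIA...            # restartix-platform-local-dev user
AWS_SECRET_ACCESS_KEY=...            # from the password manager entry; never commit
AWS_REGION=eu-central-1
EXERCISE_ASSETS_BUCKET=restartix-exercise-assets-dev   # hypothetical var name; confirm in the service config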

Bunny Stream: rendered video delivery

| Env | Library | Region | Provisioning | Credentials |
|---|---|---|---|---|
| dev | manually provisioned (e.g. restartix-exercise-renders-dev) | Frankfurt | Bunny dashboard | local .env.local |
| staging | manually provisioned | Frankfurt | Bunny dashboard | AWS Secrets Manager restartix/staging/bunny-bootstrap |
| production | manually provisioned | Frankfurt | Bunny dashboard | AWS Secrets Manager restartix/production/bunny-bootstrap |

No Terraform Bunny provider. Bunny libraries are provisioned manually in Bunny's dashboard, one-time bootstrap per env. Each library has its own API key — separate libraries per env keep production reputation and quota separate from staging tests.

What the composer needs from each library

| Field | Where in Bunny dashboard | Composer env var |
|---|---|---|
| Library ID (numeric) | Library overview / URL | BUNNY_STREAM_LIBRARY_ID |
| API Key (full access) | Library → API tab → "API Key" (not "Read-only API Key") | BUNNY_STREAM_API_KEY |
| CDN hostname | Library → API tab → "CDN hostname" (e.g. vz-{token}.b-cdn.net) | BUNNY_STREAM_CDN_HOSTNAME |
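
Together with the AWS settings above, the Bunny side of the composer's env file ends up as (values illustrative):

bash
BUNNY_STREAM_LIBRARY_ID=123456                   # numeric ID from the library overview
BUNNY_STREAM_API_KEY=...                         # full-access key, not the read-only one
BUNNY_STREAM_CDN_HOSTNAME=vz-abc123.b-cdn.net    # from the library's API tab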

Replication

For dev/staging libraries: only Frankfurt is enabled as a region. The Singapore / LA / NY replicas Bunny pre-selects by default add cost ($0.005/GB per replica per delivery) without serving real users at those stages. Production library replication should be decided based on patient geography at launch — likely Frankfurt + London for EU coverage.

Replicas cannot be removed after the library is created. Start lean.

Cost shape

For a 200-exercise library at 3 variants per slot, ~10 common prescriptions per exercise, 1 language:

  • Encoding (one-time per render): ~$0.025/min × ~3 min × 2,000 renders = ~$150 one-time
  • Storage: ~10 GB (transcoded HLS ladder) × $0.03/GB-month = ~$0.30/month
  • CDN delivery: ~$0.005/GB; a patient session pulls ~50 MB = $0.00025/session

Per-language overhead: only the audio side (~6 GB across 200 exercises). Adding a 2nd language ≈ +$0.18/month storage.

Trivial at platform scale.

Adding a new exercise

Two flows, depending on the exercise's kind:

  • reps_based — primitives in S3 + composer renders per recipe. The flow below covers this case end-to-end.
  • duration_based — a single pre-baked MP4. See Duration-based import at the end of this section.

No admin UI exists for either flow today; both are manual.

1. Receive content from filming/audio team (reps_based)

The filming team delivers a folder of files for the new exercise. Typical raw delivery:

Detensionari Lombare/
├── 001 Detensionari Lombare - Intro.mp4
├── 001 Detensionari Lombare - Outro.mp4
├── 001 Detensionari Lombare - Pauza.mp4
├── 001 Detensionari Lombare 5 Stanga.mp4
├── 001 Detensionari Lombare 5 Dreapta.mp4
├── 001 Detensionari Lombare 1-20 Stanga.mp3
└── 001 Detensionari Lombare 1-20 Dreapta.mp3

(Filenames may vary; the team will deliver whatever naming convention they use internally.)

2. Rename files to the composer's convention

Reorganise into the layout the composer expects:

lumbar-detensioning/
├── manifest.json
├── intro-video-1.mp4          ← was "001 ... - Intro.mp4"
├── pauza-video-1.mp4          ← was "001 ... - Pauza.mp4"
├── outro-video-1.mp4          ← was "001 ... - Outro.mp4"
├── rep-left-video-1.mp4       ← was "001 ... 5 Stanga.mp4"
├── rep-right-video-1.mp4      ← was "001 ... 5 Dreapta.mp4"
└── audio/ro/
    ├── rep-left-vo-1.mp3      ← was "001 ... 1-20 Stanga.mp3"
    └── rep-right-vo-1.mp3     ← was "001 ... 1-20 Dreapta.mp3"
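
A throwaway rename script keeps this step mechanical (a sketch; adjust the source filenames to whatever the team actually delivered):

bash
src="Detensionari Lombare"
dst="lumbar-detensioning"
mkdir -p "$dst/audio/ro"
cp "$src/001 Detensionari Lombare - Intro.mp4"       "$dst/intro-video-1.mp4"
cp "$src/001 Detensionari Lombare - Pauza.mp4"       "$dst/pauza-video-1.mp4"
cp "$src/001 Detensionari Lombare - Outro.mp4"       "$dst/outro-video-1.mp4"
cp "$src/001 Detensionari Lombare 5 Stanga.mp4"      "$dst/rep-left-video-1.mp4"
cp "$src/001 Detensionari Lombare 5 Dreapta.mp4"     "$dst/rep-right-video-1.mp4"
cp "$src/001 Detensionari Lombare 1-20 Stanga.mp3"   "$dst/audio/ro/rep-left-vo-1.mp3"
cp "$src/001 Detensionari Lombare 1-20 Dreapta.mp3"  "$dst/audio/ro/rep-right-vo-1.mp3"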

For the framing VO (intro/pauza/outro), which is currently baked into the framing videos' audio tracks, extract it via ffmpeg into separate mp3s so it can be language-swapped later:

bash
ffmpeg -hide_banner -y -i intro-video-1.mp4 -vn -c:a libmp3lame -b:a 192k audio/ro/intro-vo-1.mp3
ffmpeg -hide_banner -y -i pauza-video-1.mp4 -vn -c:a libmp3lame -b:a 192k audio/ro/pauza-vo-1.mp3
ffmpeg -hide_banner -y -i outro-video-1.mp4 -vn -c:a libmp3lame -b:a 192k audio/ro/outro-vo-1.mp3

This is an interim step until the filming team delivers intro/pauza/outro as silent video plus separate VO per language (the locked future state per the composition spec). For exercises where the framing video has dialogue baked in, the lip-sync mismatch with future-language VO will surface; those exercises will need re-shooting.

3. Add 2nd and 3rd variants (when available)

If the filming team has delivered 3 variants per slot, name them -1, -2, -3. If only one variant exists today, create placeholders by duplicating:

bash
cp intro-video-1.mp4 intro-video-2.mp4
cp intro-video-1.mp4 intro-video-3.mp4
# … same for every slot
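
Or, as a loop over every slot at once (a sketch; run from inside the exercise folder):

bash
for f in *-video-1.mp4 audio/*/*-vo-1.mp3; do
  base="${f%-1.*}"; ext="${f##*.}"
  cp "$f" "${base}-2.${ext}"   # variant 2 placeholder
  cp "$f" "${base}-3.${ext}"   # variant 3 placeholder
done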

The composer happily picks among three byte-identical "variants"; the render works, just without the anti-repetitiveness benefit until real 2nd and 3rd variants arrive.

4. Write the manifest

json
{
  "exercise": "lumbar-detensioning",
  "reps_per_video_block": 5,
  "counts_per_audio_master": 20,
  "sides": ["left", "right"],
  "languages": ["ro"]
}

For bilateral exercises that don't switch sides (e.g. forward fold, plank), the manifest would set "sides": ["bilateral"] instead of ["left", "right"]. Today only ["left", "right"] is implemented; the composer's variant model extends naturally, but the recipe shape would need adjustment first.
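
Before syncing, a one-line jq check catches the common manifest mistakes (a sketch; it only validates that the fields are present, not their semantics):

bash
jq -e '.exercise and .reps_per_video_block and .counts_per_audio_master
       and (.sides | length > 0) and (.languages | length > 0)' manifest.json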

5. Sync to the dev bucket

bash
aws s3 sync lumbar-detensioning/ \
  s3://restartix-exercise-assets-dev/lumbar-detensioning/ \
  --exclude '.DS_Store' \
  --exclude '*.tmp'

aws s3 sync does delta uploads (only changed files) and doesn't delete remote files by default, so it's safe to run again after updating any file.
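
When in doubt, preview the delta first; --dryrun prints what would transfer without touching the bucket:

bash
aws s3 sync lumbar-detensioning/ \
  s3://restartix-exercise-assets-dev/lumbar-detensioning/ \
  --dryrun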

Verify:

bash
aws s3 ls s3://restartix-exercise-assets-dev/lumbar-detensioning/ --recursive

Should show all expected files. Counts:

  • 1 manifest.json
  • 9 framing videos (3 per intro/pauza/outro)
  • 6 rep videos (3 per side × 2 sides)
  • 9 framing VO mp3s per language
  • 6 rep VO mp3s per language

Total: 31 files with one language (manifest + 15 videos + 15 VO mp3s); each additional language adds another 15 mp3s.

6. Test a render

Composer running locally on port 9400:

bash
curl -s -X POST http://localhost:9400/v1/compose \
  -H 'Content-Type: application/json' \
  -d '{
    "exercise": "lumbar-detensioning",
    "language": "ro",
    "sets": [
      {"side":"left","reps":5},
      {"side":"right","reps":5}
    ],
    "seed": 42
  }' | jq .

Returns a bunny_video_id after ~5-10s. Open the returned playback_hls_url in Safari (or VLC, or the Bunny dashboard's library view) — Bunny needs ~30-60s to transcode after upload, so the first hit may show "not ready".
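
If you're scripting the check, polling the HLS URL until it returns 200 is enough (a sketch; it assumes the compose response carries the playback_hls_url field referenced above):

bash
url=$(curl -s -X POST http://localhost:9400/v1/compose \
  -H 'Content-Type: application/json' \
  -d '{"exercise":"lumbar-detensioning","language":"ro","sets":[{"side":"left","reps":5},{"side":"right","reps":5}],"seed":42}' \
  | jq -r .playback_hls_url)
until curl -sf -o /dev/null "$url"; do echo "transcoding..."; sleep 10; done
echo "ready: $url"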

If the render fails, check:

  • Composer logs (slog output) for the specific error
  • That aws s3 ls shows the expected files (composer fails fast if manifest is missing or required variants are absent)
  • That the manifest's languages array includes the requested language
  • That every set's reps is a multiple of 5 in [5, 20]

Duration-based import

duration_based exercises are single pre-baked MP4s — no primitives, no S3 asset bundle, no composer involvement. They arrive as a single video file (currently: imported from the old platform's legacy library; future: any new exercise authored as a fixed-content video instead of composed primitives).

The import workflow:

  1. Upload the MP4 to Bunny Stream directly (Bunny dashboard or API), into the per-slug collection ({exercise-slug}/). Note the resulting bunny_video_id + duration_seconds.
  2. Insert the exercises row with kind='duration_based', status='draft', no asset_version requirement (set to 1 by default).
  3. Insert one exercise_renders row with recipe_hash='_imported', language='ro' (or whichever language the video carries), recipe=NULL, bunny_video_id, status='ready', asset_version=1. The row is the catalog preview by default.
  4. Update exercises.catalog_render_id to point at the render row.
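
Steps 2-4 as SQL, sketched under assumed column names (exercise_id as the FK on exercise_renders is an assumption; the slug and video ID are placeholders):

bash
psql "$DATABASE_URL" <<'SQL'
INSERT INTO exercises (slug, kind, status, asset_version)
VALUES ('legacy-forward-fold', 'duration_based', 'draft', 1);

INSERT INTO exercise_renders
  (exercise_id, recipe_hash, language, recipe, bunny_video_id, status, asset_version)
SELECT id, '_imported', 'ro', NULL, '<bunny-video-id>', 'ready', 1
FROM exercises WHERE slug = 'legacy-forward-fold';

UPDATE exercises
SET catalog_render_id = (SELECT r.id FROM exercise_renders r
                         WHERE r.exercise_id = exercises.id
                           AND r.recipe_hash = '_imported')
WHERE slug = 'legacy-forward-fold';
SQL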

No aws s3 sync — there are no S3 primitives for duration_based exercises. The Bunny video IS the entire content.

When more languages are added later, upload another MP4 to the same Bunny collection and INSERT another exercise_renders row with the new language.

If the underlying video needs to be replaced (a re-encode, a correction): re-upload to Bunny, get the new bunny_video_id, UPDATE the existing render row's bunny_video_id + bump asset_version. The cache stays consistent because there's only one row.

A migration script for the legacy import wave (hundreds of pre-existing videos) will land in services/api/migrations/ when the F9.1 import pass happens. For now, this is manual.

Bunny dashboard pointers

For debugging or browsing renders:

  • Library overview: https://dash.bunny.net/stream/{library-id} — list of all videos, status, sizes, encoded variants
  • Per-video page: shows embed code, HLS URL, MP4 download links, encoding logs, view counts
  • API tab: where credentials live; "Webhook URL" for transcode-complete callbacks (not wired today)

Bunny videos can be deleted from the dashboard if you want to start clean during testing.

Asset versioning

The exercise_renders cache table shipped in commit 3d95e38 (F9.1 Phase 1). Each exercise carries an asset_version INT column; each render row records the asset_version it was baked against. The workflow:

  1. Filming team updates a variant (e.g. re-shoots rep-left-video-2.mp4 with better lighting).
  2. You aws s3 sync the updated file to the bucket.
  3. You bump exercises.asset_version for that exercise.
  4. Cache rows with the old asset_version become stale; the next prescription request that's a cache miss (or a stale-version match — implementation detail in Service.EnsureRender) triggers a re-render. Old bunny_video_ids linger in Bunny for a grace period before cleanup.

Today (Phase 1 shipped):

  • ✅ Schema: asset_version column on exercises, recorded on every exercise_renders row
  • ✅ Cache lookup: Service.EnsureRender matches on recipe_hash + language + asset_version; mismatched version is treated as a miss
  • ❌ Bump endpoint: no Console action yet — manual UPDATE exercises SET asset_version = asset_version + 1 WHERE slug = '...' for now (this is a Phase 2 task; see composition.md → Phase 2 backlog)
  • ❌ Eager catalog preview re-render on bump: same Phase 2 task

For Phase 1 testing without the bump endpoint, the manual SQL is fine. Patient-facing flows aren't live yet, so there's no risk of mid-session staleness.
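
End to end, a re-shoot of one variant is a two-command affair today (a sketch; the slug is illustrative and the SQL is the manual bump named above, run against the env's DB):

bash
# 1. Push the updated primitive
aws s3 sync lumbar-detensioning/ s3://restartix-exercise-assets-dev/lumbar-detensioning/

# 2. Bump the version so cached renders go stale
psql "$DATABASE_URL" -c \
  "UPDATE exercises SET asset_version = asset_version + 1 WHERE slug = 'lumbar-detensioning';"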

What's NOT in this pipeline

  • Patient recordings — the platform may eventually record sessions (telemetry, supervised reviews). Those are a separate concern, separate storage, separate provider — not in this pipeline.
  • Live video calls — Daily.co handles those, not Bunny. Different concern.
  • Marketing / brand video — not in this bucket. If we ever ship those, they need their own bucket (or live on whatever marketing CDN is chosen).
  • AI-generated content — out of scope for v1.

Cross-references