docs(plan): add Phase 18 Opus low-data streaming; resolve Phase 21 OQ5 (no MSE)

daniel-c-harvey
2026-06-23 04:58:21 -04:00
parent a84a99c309
commit 1bdaeaa164
3 changed files with 610 additions and 29 deletions
+105 -9
@@ -443,6 +443,90 @@ not the same work; this phase does not satisfy or depend on that one.
---
## Phase 18 — Opus Low-Data Streaming (dual-format lossless + Opus delivery)
The concrete realization of the long-deferred **"Non-WAV formats"** intent (`CONTEXT.md §5`). Daniel's
direction (2026-06-23): **two delivery formats per track — the existing lossless WAV path, and a new
low-data Ogg Opus (fullband, 320 kbps) path — so the listener gets a choice, with Opus the
bandwidth-friendly default-candidate.** Lossless streaming becomes *optional*, not the only path. The
bespoke Web Audio decode→schedule graph is **retained by deliberate choice** — Opus feeds the same
`IFormatDecoder` seam, not an HTML `<media>` element or MSE (the decision shared with Phase 21 OQ5).
**Sequenced BEFORE Phase 21** — windowing must work across both formats. Surfaces: ingest/preprocessing
in `DeepDrftContent` (`AudioProcessor`/router/`WaveformProfileService`) + `DeepDrftAPI`
(`UnifiedTrackService.UploadAsync`, replace-audio); delivery/decode in `DeepDrftAPI` (stream endpoint +
`Range`) + `DeepDrftPublic` proxy + `DeepDrftPublic.Client` player stack + `DeepDrftPublic/Interop/audio`
TS decoders. Full design, the three directions with SOLID/road-not-taken rationale, the storage and
delivery options, the Opus decoder + seek math, acceptance criteria, open questions, and wave
decomposition: `product-notes/phase-18-opus-low-data-streaming.md`.
**Much further along than the backlog line implies (verified 2026-06-23).** The multi-format *substrate*
already exists on both sides: the producer-side `AudioProcessorRouter` routes `.wav`/`.mp3`/`.flac` and
`TrackContentService.AddTrackAsync` is format-agnostic (it **stores originals**, no transcode); the
decoder-side `AudioPlayer.createFormatDecoder` is a **wired** strategy registry dispatching on
`Content-Type` (WAV/MP3/FLAC decoders all present — correcting the Phase 21 spec's stale
"implemented-not-wired" note). **The actual gap is Daniel's specific ask:** (1) a **transcode-at-ingest**
step that *derives* an Opus 320 artifact per track (nothing derives Opus today), and (2) a **per-format
delivery selection** so one track serves as either WAV or Opus on request.
**Architectural spine — a derived artifact + a delivery param + one new decoder; three new leaf
implementations, zero changes to existing format code (the strong OCP signal).** Transcode is a new
processor sibling in `DeepDrftContent`, invoked post-store alongside `WaveformProfileService`,
**failure-tolerant and off the hot path** (background/queued — a 1 GB WAV transcode must not block the
upload response) — mirroring the landed waveform-datum pattern (derive at ingest, regenerate via a CMS
bulk action + ApiKey endpoint). The Opus bytes are a **derived artifact** stored like the high-res
waveform datum (recommend a dedicated `track-opus` vault, the `track-waveforms` precedent; final call
staff-engineer's). Delivery adds a **`?format=opus|lossless` param** (mirroring the existing `offset`
param threading through `TrackProxyController`) resolved server-side to the right artifact + content-type,
with a **lossless fallback** when no Opus artifact exists (additive, never 404/silence). The player gains
one `OpusFormatDecoder` (`IFormatDecoder`): Ogg-page-aligned segmenting (`OggS` scan — the FLAC
frame-sync analogue), `OpusHead` setup-bytes carry (the FLAC `streamInfoBytes` analogue), and an
**approximate** page-interpolation `calculateByteOffset` (Opus is VBR/paged — this is exactly the Phase
21 C5 case). **Browser constraint flagged:** Ogg-Opus `decodeAudioData` is Safari-18.4+ only (Chrome/FF
long-standing), so the Opus default must be **capability-gated** — fall back to the universal lossless
path on browsers that can't decode it.
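The Ogg-page-aligned segmenting described above can be sketched as follows. This is a minimal illustration, not the project's `OpusFormatDecoder`: the names `splitOnOggPages` and `OggSplit` are hypothetical, and the only facts assumed are the Ogg page layout itself (a 4-byte `OggS` capture pattern, a 27-byte fixed header whose last byte is the segment count, then a lacing table whose values sum to the payload length). It walks page headers rather than scanning blindly, so an `OggS` byte sequence inside a payload is not mistaken for a page start.

```typescript
// Hypothetical helper names; the page layout follows the Ogg container format.
const OGG_CAPTURE = [0x4f, 0x67, 0x67, 0x53]; // "OggS"
const OGG_FIXED_HEADER = 27; // bytes before the segment (lacing) table

interface OggSplit {
  complete: Uint8Array;  // whole pages only, safe to hand to the decoder
  remainder: Uint8Array; // trailing partial page, carried into the next chunk
  pageOffsets: number[]; // start of each whole page, relative to `complete`
}

function isCaptureAt(bytes: Uint8Array, i: number): boolean {
  return (
    i + 4 <= bytes.length &&
    bytes[i] === OGG_CAPTURE[0] &&
    bytes[i + 1] === OGG_CAPTURE[1] &&
    bytes[i + 2] === OGG_CAPTURE[2] &&
    bytes[i + 3] === OGG_CAPTURE[3]
  );
}

function splitOnOggPages(bytes: Uint8Array): OggSplit {
  // After a Range seek the chunk may begin mid-page: skip to the first capture.
  let start = 0;
  while (start + 4 <= bytes.length && !isCaptureAt(bytes, start)) start++;

  if (!isCaptureAt(bytes, start)) {
    // No page start here. Keep up to 3 trailing bytes in case the capture
    // pattern straddles the boundary with the next chunk.
    const keep = Math.min(3, bytes.length);
    return {
      complete: bytes.subarray(0, 0),
      remainder: bytes.subarray(bytes.length - keep),
      pageOffsets: [],
    };
  }

  const pageOffsets: number[] = [];
  let end = start;
  while (isCaptureAt(bytes, end)) {
    if (end + OGG_FIXED_HEADER > bytes.length) break; // header incomplete
    const segmentCount = bytes[end + 26];
    const tableEnd = end + OGG_FIXED_HEADER + segmentCount;
    if (tableEnd > bytes.length) break; // lacing table incomplete
    let payloadLength = 0;
    for (let i = end + OGG_FIXED_HEADER; i < tableEnd; i++) payloadLength += bytes[i];
    const pageEnd = tableEnd + payloadLength;
    if (pageEnd > bytes.length) break; // payload incomplete
    pageOffsets.push(end - start);
    end = pageEnd;
  }

  return {
    complete: bytes.subarray(start, end),
    remainder: bytes.subarray(end),
    pageOffsets,
  };
}
```

A caller would prepend the previous `remainder` to each incoming chunk before splitting again. The sketch does not verify page CRCs and does not handle the `OpusHead` carry.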
**Constraints/invariants:** keep the bespoke graph (no MSE); preprocessing is **additive** (WAV path
untouched, byte-for-byte; a track with no Opus artifact still plays losslessly); reuse the landed
`Range`/offset seek path; no format branches leak outside the new decoder + one selection arm + the
transcode/delivery seam; transcode failure must not block ingest; format selection is a delivery-time
decision resolving one `EntryKey` to one of two artifacts (one source, two views — **not** a second
`TrackEntity` row, which would fracture share/queue/play-count/release identity).
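The "one `EntryKey`, two artifacts, lossless fallback" rule can be illustrated with a small resolver. The real lookup lives server-side in `DeepDrftAPI`; this TypeScript sketch only models the decision. The `track-opus` vault name is the one recommended above, while the lossless vault name `tracks`, the `ArtifactLookup` interface, and the choice to treat a missing or unrecognized `format` value as `lossless` are assumptions made for illustration.

```typescript
type DeliveryFormat = "opus" | "lossless";

// Hypothetical stand-in for "does this vault hold an entry for this key?"
interface ArtifactLookup {
  has(vault: string, entryKey: string): boolean;
}

interface ResolvedArtifact {
  vault: string;
  entryKey: string;
  contentType: string;
  served: DeliveryFormat; // what is actually delivered, after any fallback
}

const OPUS_VAULT = "track-opus"; // recommended in the plan, not yet final
const LOSSLESS_VAULT = "tracks"; // assumed name, for illustration only

// Assumption: anything other than "opus" keeps the existing lossless behavior.
function parseFormat(raw: string | null | undefined): DeliveryFormat {
  return raw === "opus" ? "opus" : "lossless";
}

function resolveArtifact(
  entryKey: string,
  requested: DeliveryFormat,
  lookup: ArtifactLookup,
  losslessContentType: string = "audio/wav",
): ResolvedArtifact {
  if (requested === "opus" && lookup.has(OPUS_VAULT, entryKey)) {
    return { vault: OPUS_VAULT, entryKey, contentType: "audio/ogg", served: "opus" };
  }
  // Additive fallback: a track without an Opus artifact still plays losslessly.
  return {
    vault: LOSSLESS_VAULT,
    entryKey,
    contentType: losslessContentType,
    served: "lossless",
  };
}
```

Because the fallback is decided in one place, neither the proxy nor the player needs its own "no Opus yet" branch; both can read the response `Content-Type` to learn what was served.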
Sequenced as five waves. `18.1 → 18.2 → {18.3, 18.4} → 18.5`. **18.1 (ingest transcode + derived
artifact) is the cold-start prerequisite** — nothing downstream has bytes to serve or decode until an
Opus artifact exists.
- **18.1 — Ingest transcode: derive + store the Opus artifact (cold-start; load-bearing).** New
`OpusTranscodeService`/processor in `DeepDrftContent`, invoked post-store from
`UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`; produces Ogg Opus fullband 320;
stores it as a derived artifact (recommend a `track-opus` vault). Failure-tolerant; off the hot path
(background/queued). **Independent of the delivery/decoder waves — can begin immediately.**
- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention + server-side "given
`EntryKey` + format, return the right `AudioBinary` + content-type," including the lossless fallback.
**Depends on 18.1.**
- **18.3 — Delivery: `?format=opus|lossless` param + proxy threading.** On the `DeepDrftAPI` stream
endpoint (resolves via 18.2), forwarded through `TrackProxyController` (mirror `offset`), `Range`
serving the chosen artifact; player sends it via `TrackMediaClient`. **Depends on 18.2; parallel-ok
with 18.4.**
- **18.4 — `OpusFormatDecoder` in the player stack.** New `IFormatDecoder` (Ogg-page segmenting,
`OpusHead` carry, approximate page-interpolation `calculateByteOffset` with an `OpusSeekData`
accelerator) + one arm in `createFormatDecoder` on `audio/ogg`/`audio/opus`; capability detection for
the lossless fallback. **Depends on 18.2; parallel-ok with 18.3.**
- **18.5 — Backfill + selection UX + end-to-end validation.** "Backfill Opus" CMS bulk action (third
sibling to Generate-Profiles / Backfill-High-res) + replace-audio Opus regeneration; the listener
selection control (recommend a global persisted quality toggle); the AC1–AC8 acceptance pass including
the Phase-21 handshake (Opus is windowable by the same machinery). **Depends on 18.1–18.4.**
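On the client side, waves 18.3 to 18.5 reduce to two small decisions: which format to ask for, and how to thread it onto the stream request next to `offset`. The sketch below assumes the recommended (still open) policy of Opus by default, gated on decode capability, with a persisted listener preference taking priority. `chooseFormat`, `buildStreamUrl`, and the example route are hypothetical; only the `format` and `offset` parameter names come from the plan.

```typescript
type DeliveryFormat = "opus" | "lossless";

// preference: the persisted listener choice, or null if none has been made.
// canDecodeOpus: result of a capability probe, however that ends up implemented.
function chooseFormat(
  preference: DeliveryFormat | null,
  canDecodeOpus: boolean,
): DeliveryFormat {
  if (!canDecodeOpus) return "lossless"; // capability gate wins over everything
  return preference ?? "opus";           // assumed default: Opus
}

// basePath is whatever stream route the player already calls (hypothetical here).
function buildStreamUrl(
  basePath: string,
  format: DeliveryFormat,
  offset?: number,
): string {
  const params: string[] = ["format=" + encodeURIComponent(format)];
  if (offset !== undefined && offset > 0) {
    params.push("offset=" + Math.floor(offset));
  }
  return basePath + "?" + params.join("&");
}
```

For example, `buildStreamUrl("/api/track/abc/stream", chooseFormat(null, true), 4096)` yields `/api/track/abc/stream?format=opus&offset=4096`. The capability probe itself is left out because it depends on how the player tests Ogg Opus support in `decodeAudioData`.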
**Dependency shape:** `18.1 → 18.2 → {18.3 ∥ 18.4} → 18.5`; 18.1 is the only cold-start wave.
**Phase-level: 18 precedes Phase 21.** **Open questions for Daniel (spec §6):** selection UX (recommend a
single global quality toggle); default policy (recommend Opus-by-default, capability-gated; defer
network-awareness); whether the choice is remembered + scope (recommend persisted cookie/`localStorage`,
the dark-mode precedent); per-upload Opus opt-out vs. always-on (recommend always-on); Ogg-vs-CAF/WebM
container (recommend Ogg Opus as directed); transcode execution model (background/queued — a track is
lossless-only briefly until its Opus finishes; confirm acceptable). None block 18.1.
---
## Phase 21 — Windowed Streaming Buffer (bounded client memory for long streams)
Bound the **client memory** a playing track consumes to a small, configurable forward window —
@@ -451,6 +535,14 @@ plays without the whole decoded PCM accumulating in the browser. **Public listen
(`DeepDrftPublic.Client` player stack + `DeepDrftPublic` TypeScript audio interop); no CMS, no API
endpoint, no schema change.
**Sequenced AFTER Phase 18 (Opus Low-Data Streaming) — Daniel, 2026-06-23.** Format support (the
derived Ogg Opus 320 low-data path, Phase 18) is a prerequisite that comes first; windowing must work
across **both** delivery formats. Phase 21's C5 invariant already anticipated this ("must not foreclose
MP3/FLAC"); **Opus is now the concrete VBR/paged driver** — windowing an Opus stream uses the decoder's
*approximate* byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, Ogg-page interpolation), not
the exact CBR-WAV `byteRate` math. Build the window machinery format-agnostically so it inherits Opus
for free.
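To make the contrast with exact CBR-WAV math concrete, here is a minimal sketch of an approximate time-to-byte mapping by interpolating between known Ogg pages. It is not the actual `OpusFormatDecoder.calculateByteOffset`; the `SeekPoint` shape is a hypothetical stand-in for whatever `OpusSeekData` ends up holding. The one format fact relied on is that Ogg Opus granule positions count samples at 48 kHz and include the pre-skip declared in `OpusHead`.

```typescript
// Hypothetical seek index entry: a page start and the playback time it maps to.
interface SeekPoint {
  seconds: number;
  byteOffset: number;
}

const OPUS_GRANULE_RATE = 48000; // Ogg Opus granule positions are in 48 kHz samples

function granuleToSeconds(granule: number, preSkip: number): number {
  return Math.max(0, granule - preSkip) / OPUS_GRANULE_RATE;
}

// points must be sorted by ascending seconds.
function approximateByteOffset(targetSeconds: number, points: SeekPoint[]): number {
  if (points.length === 0) return 0;
  const first = points[0];
  const last = points[points.length - 1];
  if (targetSeconds <= first.seconds) return first.byteOffset;
  if (targetSeconds >= last.seconds) return last.byteOffset;

  // Binary search for the pair of points that brackets the target time.
  let lo = 0;
  let hi = points.length - 1;
  while (hi - lo > 1) {
    const mid = (lo + hi) >> 1;
    if (points[mid].seconds <= targetSeconds) lo = mid;
    else hi = mid;
  }

  const a = points[lo];
  const b = points[hi];
  const span = b.seconds - a.seconds;
  const t = span > 0 ? (targetSeconds - a.seconds) / span : 0;
  // Linear interpolation: exact only if the bitrate between a and b is constant.
  return Math.floor(a.byteOffset + t * (b.byteOffset - a.byteOffset));
}
```

With only two points (stream start and end) this degrades to an average-bitrate estimate; more points tighten it. In either case the returned offset will usually land mid-page, so the decoder still has to resynchronize on the next page boundary after the Range fetch.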
The network path already streams in adaptive 16–64 KB chunks. The accumulation is on the **decode
side**: `PlaybackScheduler` holds an `AudioBuffer[]` it **never evicts** ("Supports pause/resume/seek by
retaining all buffers" — its own doc comment). Decoded PCM is larger than the source (Web Audio is
@@ -467,9 +559,11 @@ just triggered manually and one-shot. The only genuinely new mechanisms are **pa
scheduler and **back-pressure** on the forward read loop (stop calling `ReadAsync` above a high-water
mark, resume below low-water). Recommended **Direction A** (sliding window on the existing single forward
stream); **Direction B** (discrete Range-fetched segments — the HLS/DASH/MSE-eviction analogue) held as
the documented fallback; **Direction C** (adopt MSE and let the browser manage the buffer) flagged as the
real long-term answer but out of scope — it is a playback-substrate rewrite entangled with non-WAV
formats (Phase 1.2), surfaced as OQ5.
the documented fallback; **Direction C** (adopt MSE and let the browser manage the buffer) **rejected
(OQ5 = NO, Daniel 2026-06-23)** — the bespoke Web Audio graph is a deliberate long-term commitment, and
the compressed-delivery move that would have justified MSE is met instead by **Phase 18 (Opus) feeding
the same bespoke graph** through the `IFormatDecoder` seam. Direction A is therefore the permanent
destination, not a stopgap MSE would retire.
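The back-pressure rule (stop reading above a high-water mark, resume below a low-water mark) is a two-threshold hysteresis. A minimal sketch, assuming a hypothetical `ReadGate` that the forward read loop consults before each `ReadAsync`; the unit of `level` (seconds or bytes buffered ahead of the playhead) is deliberately left open, since the window-size policy axis is still an open question.

```typescript
// Hypothetical gate; thresholds and units are illustrative, not from the spec.
class ReadGate {
  private readonly lowWater: number;
  private readonly highWater: number;
  private paused = false;

  constructor(lowWater: number, highWater: number) {
    if (!(lowWater < highWater)) {
      throw new Error("lowWater must be below highWater");
    }
    this.lowWater = lowWater;
    this.highWater = highWater;
  }

  // level: how much decoded audio is currently buffered ahead of the playhead.
  // Returns true if the read loop should issue another read.
  shouldRead(level: number): boolean {
    if (this.paused) {
      // Stay paused until the buffer drains to the low-water mark.
      if (level <= this.lowWater) this.paused = false;
    } else if (level >= this.highWater) {
      this.paused = true;
    }
    return !this.paused;
  }
}
```

The gap between the two marks is what prevents the loop from flapping: between them, the gate keeps doing whatever it was last doing, so reads resume in bursts rather than toggling on every chunk.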
**Invariants that must hold (the §3.5 seam contract).** Reuse the Range path, don't fork it; playback-
start latency at parity; the `IFormatDecoder` abstraction untouched (windowing is format-agnostic, so
@@ -513,12 +607,14 @@ parameters fed in later).
running, rapid-seek concurrency. Largely measurement; breaks are tuning fixes in 21.1's anchor math or
21.2's water-marks. **Depends on 21.1–21.3.**
**Dependency shape:** `21.1 → 21.2 → 21.3 → 21.4`; 21.1 is the only cold-start wave. **Open questions for
Daniel (spec §6):** window-size policy axis (time-based window + memory guard — recommended); seek-back-
past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard total in-flight memory cap
as a guard rail (recommend yes); window everything vs. only long tracks (recommend everything — one path,
short tracks never hit a refill); and whether MSE is the real destination (steer informing scope, not a
blocker). None block 21.1.
**Dependency shape:** `21.1 → 21.2 → 21.3 → 21.4`; 21.1 is the only cold-start wave. **Phase-level
prerequisite: Phase 18 (Opus) lands first** so windowing is built against both formats. **Open questions
for Daniel (spec §6):** window-size policy axis (time-based window + memory guard — recommended); seek-
back-past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard total in-flight
memory cap as a guard rail (recommend yes); window everything vs. only long tracks (recommend everything
— one path, short tracks never hit a refill). **OQ5 (adopt MSE) — RESOLVED NO (Daniel 2026-06-23): the
bespoke graph stays by deliberate choice; recorded considered-and-declined, kept visible per file
convention.** None block 21.1.
---