diff --git a/PLAN.md b/PLAN.md index b2cbb60..ef5bbf3 100644 --- a/PLAN.md +++ b/PLAN.md @@ -457,44 +457,52 @@ plays without the whole decoded PCM accumulating in the browser. **Public listen (`DeepDrftPublic.Client` player stack + `DeepDrftPublic` TypeScript audio interop); no CMS, no API endpoint, no schema change. -**Phase 18 (Opus Low-Data Streaming) has landed — Phase 21 is the next pickup.** The derived Ogg Opus 320 low-data path (Phase 18, `COMPLETED.md`) is the prerequisite; windowing must work across **both** delivery formats. Phase 21's C5 invariant already anticipated this ("must not foreclose -MP3/FLAC"); **Opus is now the concrete VBR/paged driver** — windowing an Opus stream uses the decoder's -**accurate index-based** byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, a binary search in the -Phase 18 precomputed seek index — *not* the exact CBR-WAV `byteRate` math, and *not* approximate page -interpolation: VBR-safe and exact, per the Phase 18 seek-model resolution 2026-06-23). The windowed refill -controller calls the **same** index resolver an explicit seek does, and a window opening away from byte 0 -still decodes via the Phase 18 sidecar setup header. Build the window machinery format-agnostically so it -inherits Opus for free. +**Phase 18 (Opus Low-Data Streaming) has landed — Phase 21 is the next pickup, reconciled to the as-built +two-decode-path reality.** Phase 18 left **two** decode paths feeding the one `PlaybackScheduler`: +WAV/MP3/FLAC via `StreamDecoder`/`IFormatDecoder`, and Opus via a **WebCodecs `AudioDecoder`** pipeline +(`OggDemuxer` → `OpusStreamDecoder`, the `IStreamingDecoder` seam — *not* `IFormatDecoder`; per-segment +`decodeAudioData` was tried and replaced). Windowing must bound **both**. 
The accurate index-driven Opus +seek the original spec assumed Phase 21 would build is **already live** (`resolveOpusByteOffset` over the +precomputed seek index → Range fetch → `reinitializeForRangeContinuation` with frame-accurate lead-trim); +Phase 21 **reuses** it for window-miss refills rather than building it. Opus seek is VBR-safe and +**accurate**, not approximate (the earlier "approximate page interpolation" framing is corrected). The network path already streams in adaptive 16–64 KB chunks. The accumulation is on the **decode -side**: `PlaybackScheduler` holds an `AudioBuffer[]` it **never evicts** ("Supports pause/resume/seek by -retaining all buffers" — its own doc comment). Decoded PCM is larger than the source (Web Audio is -32-bit float per sample/channel — a 16-bit stereo WAV roughly doubles once decoded), so a 1 GB WAV -becomes ~2 GB of retained float data. That is the OOM. The fix: hold only a sliding forward window plus a -small back-retain, discard already-played buffers, and refill on demand. +side**, and now has two faces. The shared one: `PlaybackScheduler` holds an `AudioBuffer[]` it **never +evicts** — both paths `addBuffer` into it, nothing is removed. Decoded PCM is larger than the source +(Web Audio is 32-bit float per sample/channel — a 16-bit stereo WAV roughly doubles once decoded), so a +1 GB WAV becomes ~2 GB of retained float; **a low-data Opus mix decodes to the same ~2 GB once played**, +so its small transfer does not spare it. The Opus-only second face: the WebCodecs decode queue + +`decodedQueue` accumulate upstream of the scheduler too. The fix: hold only a sliding forward window plus +a small back-retain, discard already-played buffers, and refill on demand — with back-pressure on the C# +read loop (both paths) **and** on the Opus demux/decode feed (Opus only). 
**Architectural spine — a sliding window keyed on playback position, built as a generalization of the -landed seek-beyond-buffer path.** The Phase 4 HTTP `Range: bytes=X-` → 206 primitive already does every -plumbing primitive the window needs (discard-buffers-keep-offset via `clearForSeek`/`setPlaybackOffset`; -fetch-from-offset via `TrackMediaClient`; decode-header-less-body via -`StreamDecoder.reinitializeForRangeContinuation`; time→byte via `IFormatDecoder.calculateByteOffset`), -just triggered manually and one-shot. The only genuinely new mechanisms are **partial eviction** on the -scheduler and **back-pressure** on the forward read loop (stop calling `ReadAsync` above a high-water -mark, resume below low-water). Recommended **Direction A** (sliding window on the existing single forward -stream); **Direction B** (discrete Range-fetched segments — the HLS/DASH/MSE-eviction analogue) held as -the documented fallback; **Direction C** (adopt MSE and let the browser manage the buffer) **rejected -(OQ5 = NO, Daniel 2026-06-23)** — the bespoke Web Audio graph is a deliberate long-term commitment, and -the compressed-delivery move that would have justified MSE is met instead by **Phase 18 (Opus) feeding -the same bespoke graph** through the `IFormatDecoder` seam. Direction A is therefore the permanent -destination, not a stopgap MSE would retire. 
+landed seek-beyond-buffer paths.** The Phase 4 HTTP `Range: bytes=X-` → 206 primitive already does every +plumbing primitive the window needs, with a WAV branch and an Opus branch, both live: discard-buffers-keep- +offset via `clearForSeek`/`setPlaybackOffset` (shared); fetch-from-offset via `TrackMediaClient` (shared, +now with `?format=`); decode-header-less-body via `StreamDecoder.reinitializeForRangeContinuation` (WAV) / +`OpusStreamDecoder.reinitializeForRangeContinuation` (Opus); time→byte via `IFormatDecoder.calculateByteOffset` +(WAV) / `resolveOpusByteOffset` over the seek index (Opus) — just triggered manually and one-shot. The +genuinely new mechanisms: **partial eviction** on the shared scheduler (one implementation, both paths), +and **back-pressure** — on the C# read loop (both paths) **and additionally on the Opus WebCodecs +decode-ahead** (`decodeQueueSize` + `decodedQueue`, Opus only, since throttling the socket alone doesn't +bound the async decoder's queues). Recommended **Direction A** (sliding window on the existing single +forward stream); **Direction B** (discrete Range-fetched segments — the HLS/DASH/MSE-eviction analogue) +held as the documented fallback; **Direction C** (adopt MSE) **rejected (OQ5 = NO, Daniel 2026-06-23)** — +the bespoke Web Audio graph is a deliberate long-term commitment, and the compressed-delivery move that +would have justified MSE was met instead by **Phase 18 (Opus) feeding the same bespoke graph** through the +WebCodecs `IStreamingDecoder` seam. Direction A is the permanent destination, not a stopgap MSE would +retire. 
-**Invariants that must hold (the §3.5 seam contract).** Reuse the Range path, don't fork it; playback- -start latency at parity; the `IFormatDecoder` abstraction untouched (windowing is format-agnostic, so -wiring MP3/FLAC later inherits it free); read-only playback (no new control); the single-instance JS -decoder stays single-writer (every refill routes through the existing cancellation/drain discipline). The -**Mix visualizer is provably unaffected** — it renders from the preprocessed per-track high-res datum -(Phase 10/12), never from live decoded PCM, so evicting played buffers cannot starve it. The 1 GB mix is -both the canonical case *and* the proof the eviction is safe. +**Invariants that must hold (the §3.5 seam contract).** Reuse each path's Range/seek machinery, don't fork +it; playback-start latency at parity; neither decoder seam's contract forked — eviction is shared at the +scheduler (zero format branches), back-pressure is seam-aware (the one place the two paths diverge); +read-only playback (no new control); the single-writer decoder discipline holds for **both** decoders +(stricter for the stateful Opus `AudioDecoder` — a stale `push` racing a reset+reconfigure corrupts +inter-frame state). The **Mix visualizer is provably unaffected** — it renders from the preprocessed +per-track high-res datum (Phase 10/12), never from live decoded PCM, so evicting played buffers cannot +starve it. The 1 GB mix is both the canonical case *and* the proof the eviction is safe. **Interaction with deferred Phase 1 features (same seam):** windowing should land **before** preload (1.3) — it makes preload of long tracks memory-safe by construction (a staged next-track decoder inherits @@ -512,32 +520,45 @@ Sequenced as four waves. `21.1 → 21.2 → 21.3`, with `21.4` validating the wh prerequisite and the load-bearing change** — independent of the open questions (window *sizes* are parameters fed in later). 
-- **21.1 — Partial eviction in `PlaybackScheduler` (cold-start; load-bearing).** Drop already-played - buffers while keeping the position/index/time-anchor bookkeeping exact against a buffer array that no - longer begins at absolute time 0 (today `getCurrentPosition`/`playFromPosition`/the schedule loop all - assume `buffers[0]` is the track start). The hardest correctness work in the phase. No refill yet. - **Independent of the open questions — can begin immediately.** -- **21.2 — Back-pressure on the forward read loop.** Stop `ReadAsync` above the high-water mark, resume - below low-water; together with 21.1 this bounds *both* the played and unplayed regions (the AC1 - guarantee). Routes resume/pause through the existing single-loop cancellation discipline. **Depends on - 21.1.** -- **21.3 — Seek-back-past-window refill.** When a backward seek lands earlier than the retained tail, - refetch via the existing seek-beyond-buffer Range path pointed at the earlier offset; plus the minimal - clean refill-failure handling (the 1.6 adjacency). Mostly reuse of the landed seek path. **Depends on - 21.1 + 21.2.** -- **21.4 — Validation against the 1 GB target (acceptance).** Memory profiling (bounded under 1 GB is the - headline), latency parity, edge-to-edge playback, the seek matrix, induced refill failure, visualizer- - running, rapid-seek concurrency. Largely measurement; breaks are tuning fixes in 21.1's anchor math or - 21.2's water-marks. **Depends on 21.1–21.3.** +Decomposition is **by concern** (eviction → back-pressure → seek-back refill → validate), not by format — +eviction is genuinely shared, so a path-split would duplicate the hardest work; the one path-divergent +concern (back-pressure) carries a two-track split *inside* its wave. 
+ +- **21.1 — Partial eviction in `PlaybackScheduler` (cold-start; load-bearing; SHARED by both paths).** Drop + already-played buffers while keeping the position/index/time-anchor bookkeeping exact against a buffer + array that no longer begins at absolute time 0 (today `getCurrentPosition`/`playFromPosition`/the schedule + loop all assume `buffers[0]` is the track start). The hardest correctness work in the phase. Written once, + serves both decode paths (they `addBuffer` identically). No refill yet. **Independent of the open + questions — can begin immediately.** +- **21.2 — Back-pressure (two tracks, one fill signal).** Bound the unplayed region by throttling + production above a high-water mark and resuming below low-water, driven by the scheduler's + decoded-lookahead fill. **21.2a** — stop `ReadAsync` on the C# loop (serves both paths; *sufficient* for + WAV). **21.2b** — additionally stop the Opus demux/decode feed so the WebCodecs decode queue + + `decodedQueue` don't balloon behind a throttled socket (Opus only; no WAV analogue). Together with 21.1 + this bounds both played and unplayed sides on both formats (AC1 + AC1-Opus). Routes through the existing + single-loop cancellation discipline. **Depends on 21.1.** +- **21.3 — Seek-back-past-window refill (one concern, per-path resolver).** When a backward seek lands + earlier than the retained tail, refetch via the existing seek-beyond-buffer path pointed at the earlier + offset, using whichever resolver the active path already ships (`IFormatDecoder`/`StreamDecoder` for WAV; + the live `resolveOpusByteOffset` + `OpusStreamDecoder.reinitializeForRangeContinuation` for Opus); plus + the minimal clean refill-failure handling (the 1.6 adjacency). Mostly reuse of the landed seek paths. 
+ **Depends on 21.1 + 21.2.** +- **21.4 — Validation against the 1 GB target, BOTH formats (acceptance).** Memory profiling (bounded under + 1 GB as WAV *and* as Opus, plus the Opus upstream queues), latency parity, edge-to-edge playback, the + seek matrix, induced refill failure, visualizer-running, rapid-seek concurrency (incl. an Opus + seek-storm). Largely measurement; breaks are tuning fixes in 21.1's anchor math, 21.2's water-marks, or + 21.2b's Opus decode-ahead bound. **Depends on 21.1–21.3.** **Dependency shape:** `21.1 → 21.2 → 21.3 → 21.4`; 21.1 is the only cold-start wave. **Phase-level -prerequisite: Phase 18 (Opus) lands first** so windowing is built against both formats. **Open questions -for Daniel (spec §6):** window-size policy axis (time-based window + memory guard — recommended); seek- -back-past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard total in-flight -memory cap as a guard rail (recommend yes); window everything vs. only long tracks (recommend everything -— one path, short tracks never hit a refill). **OQ5 (adopt MSE) — RESOLVED NO (Daniel 2026-06-23): the -bespoke graph stays by deliberate choice; recorded considered-and-declined, kept visible per file -convention.** None block 21.1. +prerequisite: Phase 18 (Opus) has landed** (`COMPLETED.md`) — windowing is built against both formats. +**Open questions for Daniel (spec §6):** window-size policy axis (time-based window + memory guard — +recommended); seek-back-past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard +total in-flight memory cap as a guard rail (recommend yes); window everything vs. only long tracks +(recommend everything — one path, short tracks never hit a refill). **New staff-engineer architecture +calls (spec §6):** OQ6 — one window controller for both paths or two (recommend shared controller + two +back-pressure hooks); OQ7 — drive the Opus decode-ahead bound from the single scheduler-fill signal +(recommended). 
**OQ5 (adopt MSE) — RESOLVED NO (Daniel 2026-06-23):** the bespoke graph stays by deliberate +choice. None block 21.1. --- diff --git a/product-notes/phase-21-windowed-streaming-buffer.md b/product-notes/phase-21-windowed-streaming-buffer.md index dd5c3fe..d4604f7 100644 --- a/product-notes/phase-21-windowed-streaming-buffer.md +++ b/product-notes/phase-21-windowed-streaming-buffer.md @@ -1,27 +1,39 @@ # Phase 21 — Windowed Streaming Buffer (bounded client memory for long streams) -Product spec. Status: **design / framing — implementation-ready pending Daniel's open-question calls.** -Author: product-designer. Date: 2026-06-23. **No code has been written by this doc.** +Product spec. Status: **design / framing — reconciled to as-built Phase 18 (two decode paths); +implementation-ready pending Daniel's open-question calls (OQ1–OQ4 product; OQ6–OQ7 staff-engineer +architecture).** Author: product-designer. Date: 2026-06-23 (reconciliation pass after Phase 18 landed). +**No code has been written by this doc.** Surface: **public listener site only** (`DeepDrftPublic.Client` player stack + `DeepDrftPublic` TypeScript audio interop). No CMS (`DeepDrftManager`) change. No data-model or schema change. The one server touch is **reuse, not new surface**: the existing `DeepDrftAPI` HTTP `Range: bytes=X-` partial-content primitive (Phase 4, landed) is the load-bearing dependency; this phase adds no new API endpoint. -> **Sequencing dependency (Daniel, 2026-06-23): Phase 18 (Opus Low-Data Streaming) comes BEFORE this -> phase.** Format support — specifically the derived **Ogg Opus fullband 320** low-data delivery path -> (`product-notes/phase-18-opus-low-data-streaming.md`) — is a prerequisite that sequences ahead of -> windowing. Phase 21's windowing must work across **both** delivery formats (lossless WAV and Opus). 
-> Its C5 invariant below already anticipated this ("must not foreclose MP3/FLAC"); **Opus is now the -> concrete VBR/containerized driver of C5.** Windowing an Opus stream uses the decoder's **accurate -> index-based** byte↔time mapping (`OpusFormatDecoder.calculateByteOffset` — a binary search in the Phase 18 -> precomputed seek index), exactly the C5 case — *not* the exact CBR-WAV `byteRate` math, and *not* -> approximate Ogg-page interpolation. **Correction (Daniel, 2026-06-23):** an earlier draft described the -> Opus mapping as "approximate page interpolation"; the Phase 18 seek-model resolution rejected that — Opus -> seeking is **accurate**, backed by a precomputed seek index built at transcode time, so refill resolves to -> the *exact* page offset. The windowed refill controller calls the **same** index resolver an explicit seek -> does (Phase 18 §3.4a D); a window opening away from byte 0 still decodes via the Phase 18 sidecar setup -> header. Build the window machinery format-agnostically (§2 C3/C5) so it inherits Opus for free. +> **Phase 18 (Opus Low-Data Streaming) has LANDED (2026-06-23, `COMPLETED.md`). This spec is reconciled +> to the as-built reality.** Phase 18 changed the landscape in two ways that reshape this phase: +> +> 1. **There are now TWO decode paths feeding the one `PlaybackScheduler`, not one.** (a) The original +> **WAV/MP3/FLAC** path — `StreamDecoder` → `IFormatDecoder` (wrap-each-segment + `decodeAudioData`). +> (b) A new **Opus** path — `OggDemuxer` → `OpusStreamDecoder` (the `IStreamingDecoder` seam, a stateful +> **WebCodecs `AudioDecoder`** pipeline). The §3.1 unbounded-memory root cause (the scheduler's +> push-only `AudioBuffer[]`) applies to **both** — but the Opus path adds a *second* accumulation locus +> upstream of the scheduler (the WebCodecs decode queue + `decodedQueue: AudioData[]`), so windowing it +> is not the same mechanism as windowing WAV. See §3.1. +> 2. 
**The accurate index-driven Opus seek the original spec assumed Phase 21 would build is ALREADY +> LIVE.** Phase 18 ships `resolveOpusByteOffset` (binary-search the precomputed seek index in +> `OpusSeekData`) → Range fetch → `OpusStreamDecoder.reinitializeForRangeContinuation(landingTime, +> target)` with frame-accurate lead-trim. Opus seek is **accurate, not approximate** — and **already +> shipping**. Phase 21 does **not** build Opus seek; it **reuses** that live seek for window-miss +> refills. +> +> **Correction of stale spec language.** The original draft described Opus as a future-wired +> `OpusFormatDecoder.calculateByteOffset` joining the `IFormatDecoder` registry, with seek as "approximate +> vs accurate." All of that is now wrong against the landed code: Opus does **not** use `IFormatDecoder` +> (it diverged to the `IStreamingDecoder`/WebCodecs seam precisely because per-segment `decodeAudioData` is +> architecturally wrong for Opus — see `IStreamingDecoder.ts`), and its seek is accurate and shipping. The +> body below is rewritten to the two-path reality. **The headline is unchanged:** bound client memory to a +> sliding window regardless of stream length, for the canonical 1 GB mix, across both delivery formats. --- @@ -31,18 +43,34 @@ Bound the **client memory** a playing track consumes to a small, configurable fo **independent of total stream length** — so a 1 GB+ DJ MIX (Phase 9 `Mix` medium: a single long track) plays without the whole decoded PCM accumulating in the browser. -**The defect, stated precisely.** The network path already streams in adaptive 16–64 KB chunks -(`StreamingAudioPlayerService.StreamAudioWithEarlyPlayback`) — that part is fine. The accumulation is on -the **decode side**: `PlaybackScheduler` holds `private buffers: AudioBuffer[]` and **never evicts** -("Supports pause/resume/seek by **retaining all buffers**" — its own doc comment). 
Every 64 KB segment -the `StreamDecoder` decodes is pushed via `addBuffer()` and kept for the life of the track. Decoded PCM -is **larger than the compressed-or-raw source** in memory (Web Audio `AudioBuffer` is 32-bit float per -sample per channel — a 16-bit stereo WAV roughly **doubles** in size once decoded), so a 1 GB WAV becomes -~2 GB of retained `AudioBuffer` float data. That is the OOM. +**The defect, stated precisely — and it now has two faces, one shared.** The network path already +streams in adaptive 16–64 KB chunks (`StreamingAudioPlayerService.StreamAudioWithEarlyPlayback`) — that +part is fine. The accumulation is on the **decode side**, and Phase 18 split the decode side into two +pipelines that both terminate at the same sink: -**One-line framing:** today the player decodes the whole track into memory and keeps it; Phase 21 makes -it keep only a sliding forward window and discard what has already played, refilling on demand from the -Range primitive it already uses for seek. +- **The shared sink (both paths) — the unbounded scheduler.** `PlaybackScheduler` holds + `private buffers: AudioBuffer[]` and **never evicts** ("Supports pause/resume/seek by **retaining all + buffers**" — its own doc comment). Both decode paths call `scheduler.addBuffer()` (via + `AudioPlayer.processFormatChunk` for WAV/MP3/FLAC and `processOpusChunk` for Opus); nothing is ever + removed. Decoded PCM is **larger than the source** in memory (Web Audio `AudioBuffer` is 32-bit float + per sample per channel — a 16-bit stereo WAV roughly **doubles** once decoded; Opus decodes to the same + 48 kHz float PCM regardless of how few bytes the *compressed* stream was). So a 1 GB WAV becomes ~2 GB + of retained float, **and a low-data Opus mix becomes the same ~2 GB of decoded float once played** — + the compressed transfer is small, but the *decoded* footprint is identical. The scheduler is the OOM for + both. 
**This is the §3.1 root cause, unchanged from the original spec — it just now afflicts two + producers.** +- **The Opus-only second locus — upstream decode-ahead.** The Opus path accumulates *before* the + scheduler too: the WebCodecs `AudioDecoder` work queue (`decodeQueueSize`), the `decodedQueue: + AudioData[]` awaiting conversion, and the `OggDemuxer`'s partial-page state. Bounding the scheduler + alone does not bound these — they fill from the same C# `ReadAsync` loop, so they need their own + back-pressure (on the *demuxer/decoder feed*), not only the read loop's. WAV has no equivalent + upstream queue (its `StreamDecoder` decodes synchronously into the scheduler), so this is genuinely + Opus-specific. + +**One-line framing:** today the player decodes the whole track into memory and keeps it — true for both +formats; Phase 21 makes it keep only a sliding forward window and discard what has already played, +refilling on demand from the Range primitive both paths already use for seek (WAV via `IFormatDecoder`, +Opus via the live index-driven `resolveOpusByteOffset`). --- @@ -60,36 +88,51 @@ docs. This phase **modifies that seam** — so the contract it must preserve is - **C2 — Playback start latency unchanged.** Today playback starts as soon as a configurable minimum buffer count is queued (header-derived duration, not full-file). The window model must keep first-audio latency at parity — bounding memory must not reintroduce a fetch-then-play stall. -- **C3 — The format-decoder abstraction is untouched.** `IFormatDecoder` owns all format-specific - byte math; `AudioPlayer.createFormatDecoder` already dispatches on `Content-Type` (WAV/MP3/FLAC - decoders all wired today — verified 2026-06-23; an `OpusFormatDecoder` joins them in Phase 18). - Windowing lives in the - **format-agnostic** layer (`PlaybackScheduler` eviction + `StreamDecoder`/player refill - orchestration); it must add **no** format-specific branches. 
A future wired MP3/FLAC decoder inherits - windowing for free. +- **C3 — Neither decoder seam's contract is forked; windowing lives in the shared layer plus a thin + per-seam hook.** There are two decoder seams as of Phase 18: `IFormatDecoder` (WAV/MP3/FLAC, owns + format byte math; `AudioPlayer.createFormatDecoder` dispatches on `Content-Type`) and `IStreamingDecoder` + (Opus, the WebCodecs pipeline; selected in `initializeStreaming` when the content type is + `audio/ogg`/`audio/opus` and a sidecar is present). **The eviction half of windowing is fully shared** — + it lives in `PlaybackScheduler`, which both seams feed identically via `addBuffer`, so eviction adds + **zero** format branches. **The back-pressure / decode-ahead half is necessarily seam-aware** — the WAV + path back-pressures the C# `ReadAsync` loop; the Opus path must additionally bound the WebCodecs + decode-ahead and the `decodedQueue` (§3.1). Express that as a **small uniform signal** ("the scheduler is + full, stop producing") that each decode path honors in its own way, rather than a windowing controller + that reaches into either decoder's internals. The goal the original C3 stated still holds — no + format-specific logic leaking into the *scheduler* — but the spec now acknowledges the producer side has + two shapes, not one. - **C4 — Read-only playback only.** This is a memory-management change, not a UX change. No new user-visible control, no change to seek/transport semantics beyond what the listener already experiences. Seek must still feel identical. -- **C5 — Must window both delivery formats (WAV lossless AND Opus low-data).** Byte↔time mapping for - refill is exact and cheap for WAV (CBR: `byteRate` from the header). 
**Phase 18 (Opus) is sequenced - before this phase and is the concrete VBR driver here** — and its mapping is **also exact**, but by a - different mechanism: an Ogg Opus 320 stream has no linear time↔byte relationship, so - `OpusFormatDecoder.calculateByteOffset` resolves via a **precomputed seek index** (granule→byte, built at - transcode; Phase 18 §3.4a), a binary search that returns the exact page offset — **not** an approximate - page interpolation. (An earlier draft of this invariant said "approximate"; the Phase 18 seek-model - resolution, Daniel 2026-06-23, made Opus seeking accurate. Corrected here.) The window machinery must - express refill purely in terms of the decoder's existing `calculateByteOffset`, so the same code windows - WAV (via `byteRate`) and Opus (via the index) — **no WAV-special-cased offset math in the window layer**, - and no approximation for either. A window that opens away from byte 0 must also prepend the decoder's - retained/sidecar setup header (Phase 18 §3.4a B) — the format-agnostic refill path already routes - continuations through the decoder's header-carry, so this comes for free. (MP3/FLAC decoders are already - wired in the registry too — the registry dispatches on content-type today; an `OpusFormatDecoder` joins - them in Phase 18.) -- **C6 — No regression to the single-instance JS decoder concurrency guarantees.** The current code is - careful that only one streaming loop touches the single JS `StreamDecoder` at a time - (`DrainActiveStreamingTaskAsync`, the `_streamingCancellation` identity dance). Windowed refill - introduces *more* mid-stream fetches; it must route through the **same** drain/cancellation discipline, - not around it. 
+- **C5 — Window both decode paths without forking the scheduler/seam, reusing the live index-driven + seek for refill.** Both delivery formats must be windowed, and the byte↔time mapping each refill needs is + **already accurate and already shipping** for both: + - **WAV/MP3/FLAC** — `IFormatDecoder.calculateByteOffset` (CBR `byteRate` for WAV; the MP3/FLAC seek + accelerators for those), reached through `StreamDecoder.calculateByteOffset` / `AudioPlayer.seekBeyondBuffer`. + - **Opus** — `resolveOpusByteOffset(activeOpusSidecar, t)` (binary search the precomputed granule→byte + seek index in `OpusSeekData`), returning an exact page-start offset **and** a `landingTimeSeconds` for + the decoder's frame-accurate lead-trim. This is **accurate, not approximate, and landed in Phase 18.** + Phase 21 does **not** build either mapping. The window's refill trigger calls *whichever resolver the + active path already uses* — for Opus, the **same** `resolveOpusByteOffset` an explicit listener seek + calls (the live path in `AudioPlayer.seekBeyondBuffer`), so windowed refill is literally "a seek the + listener didn't initiate." A window opening away from byte 0 decodes correctly on the Opus path because + the setup header (`OpusHead`/`OpusTags`) is already cached from the sidecar and re-applied by + `reinitializeForRangeContinuation` (Phase 18 §3.4a B); the WAV path re-applies its retained header the + same way. 
**No new offset math, no approximation, no header re-fetch — all reused.** The invariant is + therefore *not* "make refill format-agnostic" (the two paths legitimately resolve offsets through + different code); it is **"reuse the live seek of each path verbatim; add only the eviction and the + refill *trigger*, never a second seek mechanism."** +- **C6 — No regression to the single-writer decoder concurrency guarantee — now covering both decoders.** + The C# loop is careful that only one streaming task feeds the active JS decoder at a time + (`DrainActiveStreamingTaskAsync`, the `_streamingCancellation` identity dance in + `StreamingAudioPlayerService`). This matters *more* for Opus: the WebCodecs `AudioDecoder` is stateful + and async — a `reset()`+`configure()` on a range-continuation (`reinitializeForRangeContinuation`) racing + a still-draining `push()` from a stale loop would corrupt inter-frame state, not merely deliver a wrong + buffer. Windowed refill introduces *more* mid-stream fetches against whichever decoder is active; every + one must route through the **same** drain/cancellation discipline, not around it. The discipline is + already decoder-agnostic at the C# layer (it cancels the loop, not the decoder), so this is a "keep using + it" invariant — but it is the rule most likely to be violated by a naive Opus refill, and is the hardest + failure to diagnose, so it is called out as a hard invariant for both paths. - **C7 — The Mix visualizer's data source is independent and must stay that way.** The Phase 10/12 WebGL2 lava visualizer renders from a **preprocessed high-res waveform datum** fetched per-track (`GET api/track/{entryKey}/waveform/high-res`), **not** from live decoded PCM. Confirmed: evicting @@ -103,8 +146,10 @@ docs. This phase **modifies that seam** — so the contract it must preserve is ### 3.0 The mental model A track's audio is a byte range `[0, fileLength)` on disk. 
At any moment the listener is at playback -position `P` (seconds → byte offset via the format decoder). The player should hold decoded -`AudioBuffer`s only for a bounded window roughly `[P - back, P + ahead]`: +position `P` (seconds → byte offset via the active path's resolver — `IFormatDecoder.calculateByteOffset` +for WAV/MP3/FLAC, `resolveOpusByteOffset` over the seek index for Opus). The player should hold decoded +`AudioBuffer`s only for a bounded window roughly `[P - back, P + ahead]` — and, on the Opus path, keep the +upstream WebCodecs decode queue near-empty too (§3.1): - **forward fill (`ahead`)** — enough decoded lookahead that playback never starves (covers the existing 500 ms scheduler lookahead plus network jitter headroom); @@ -119,42 +164,64 @@ position `P` (seconds → byte offset via the format decoder). The player should This is a **ring/sliding-window buffer keyed on playback position**, driven by high/low-water marks — the standard bounded-producer/bounded-consumer pattern, transplanted onto the decode→schedule seam. -### 3.1 Why this is a generalization of seek-beyond-buffer, not a new mechanism +### 3.1 Why refill is a generalization of seek-beyond-buffer, not a new mechanism — for both paths -The seek-beyond-buffer path already does **every primitive** the window needs, just triggered manually -and one-shot: +The seek-beyond-buffer path already does **every refill primitive** the window needs, just triggered +manually and one-shot. 
As of Phase 18 each primitive has a WAV branch and an Opus branch, both live: -| Window operation | Existing seek-beyond-buffer machinery it reuses | -|-------------------------------|-----------------------------------------------------------------------------------| -| Discard buffers, keep offset | `PlaybackScheduler.clearForSeek()` + `setPlaybackOffset()` (clears buffers, retains the absolute-time anchor) | -| Fetch from a byte offset | `TrackMediaClient.GetTrackMedia(key, byteOffset)` → `Range: bytes=X-` → 206 | -| Decode a header-less body | `StreamDecoder.reinitializeForRangeContinuation(remainingByteLength)` | -| Map time → byte offset | `StreamDecoder.calculateByteOffset()` → `IFormatDecoder.calculateByteOffset()` | -| Single-loop safety on refetch | `_streamingCancellation` swap + `DrainActiveStreamingTaskAsync()` | +| Window operation | WAV/MP3/FLAC machinery reused | Opus machinery reused (Phase 18, landed) | +|-------------------------------|--------------------------------------------------------------------|--------------------------------------------------------------------------------------| +| Discard buffers, keep offset | `PlaybackScheduler.clearForSeek()` + `setPlaybackOffset()` | *same* — the scheduler is shared | +| Fetch from a byte offset | `TrackMediaClient` → `Range: bytes=X-` → 206 | *same* (with `?format=opus`) — the Range path is shared | +| Map time → byte offset | `StreamDecoder.calculateByteOffset()` → `IFormatDecoder` | `resolveOpusByteOffset(activeOpusSidecar, t)` (index binary search → exact page) | +| Decode a header-less body | `StreamDecoder.reinitializeForRangeContinuation(len)` | `OpusStreamDecoder.reinitializeForRangeContinuation(landingTime, target)` (demux/codec reset + lead-trim) | +| Single-loop safety on refetch | `_streamingCancellation` swap + `DrainActiveStreamingTaskAsync()` | *same* — the C# discipline is decoder-agnostic | -The difference is **eviction does not exist yet** (the scheduler only ever `clear()`s 
wholesale) and -**refill is one-shot** (a seek, not a continuous low-water-triggered loop). So the new work is two -seams: a *partial-evict* on the scheduler, and a *position-driven refill controller* on the player. The -fetch/decode/offset plumbing is reused verbatim. +The genuinely-new work, by path: + +- **Shared (both paths):** *partial eviction* on `PlaybackScheduler` (today it only ever `clear()`s + wholesale), and a *position-driven refill trigger* (a continuous low-water loop, not a one-shot seek). +- **WAV path:** *back-pressure on the C# `ReadAsync` loop* — stop reading the socket above the high-water + mark, resume below low-water. WAV's `StreamDecoder` decodes synchronously into the scheduler, so the + read loop is the *only* producer to throttle; pausing `ReadAsync` bounds it fully. +- **Opus path:** *the same C# back-pressure, plus bounding the WebCodecs decode-ahead.* Throttling + `ReadAsync` alone is **not sufficient** for Opus, because `OpusStreamDecoder.push()` is async and the + WebCodecs `AudioDecoder` keeps its own internal work queue (`decodeQueueSize`) plus a `decodedQueue: + AudioData[]` of decoded-but-not-yet-converted frames. The Opus producer must also stop *feeding the + decoder* (stop demuxing/decoding new packets) when the scheduler is full, and resume below low-water — + back-pressure on the **demuxer/decoder feed**, not only on the socket read. This is the one place the + two paths' windowing genuinely diverges. + +Everything else — the fetch, the offset resolution, the header-carry continuation, the single-loop +cancellation safety — is **reused verbatim** on both paths. Phase 21 builds eviction + the refill trigger ++ the (per-path) back-pressure; it builds **no** new fetch, offset, or seek mechanism. ### 3.2 The three candidate directions Per file convention the alternatives are recorded; the recommendation follows. 
**Direction A — Sliding window on the existing single forward stream (recommended).** -Keep the current model where the C# loop reads one forward HTTP stream and pumps chunks into the JS -decoder. Add two things: (1) `PlaybackScheduler` gains *partial eviction* — drop buffers whose +Keep the current model where the C# loop reads one forward HTTP stream and pumps chunks into the active +JS decoder. Add three things: (1) `PlaybackScheduler` gains *partial eviction* — drop buffers whose absolute-time end is older than `P - back`, adjusting its index bookkeeping so `getCurrentPosition()` -and scheduling stay correct against a buffer array that no longer starts at index 0; (2) a -*back-pressure* signal — when forward decoded lookahead exceeds the high-water mark, the C# loop -**pauses reading** the HTTP stream (stops calling `ReadAsync`) until playback drains it below low-water, -then resumes. Memory is bounded by high-water + back-retain. Seek-back beyond the retained window falls -through to the **existing** seek-beyond-buffer path unchanged. +and scheduling stay correct against a buffer array that no longer starts at index 0 (**shared by both +paths** — the scheduler is the common sink); (2) *back-pressure on the C# read loop* — when forward +decoded lookahead exceeds the high-water mark, the C# loop **pauses reading** the HTTP stream (stops +calling `ReadAsync`) until playback drains it below low-water, then resumes; (3) **for the Opus path +only, back-pressure on the WebCodecs decode-ahead** — the producer also stops demuxing/decoding new +packets when the scheduler is full, so the `AudioDecoder` work queue and `decodedQueue` do not balloon +behind a throttled socket. Memory is bounded by high-water + back-retain on both paths. Seek-back beyond +the retained window falls through to the **existing** seek-beyond-buffer path (the right one per format) +unchanged. 
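
The high/low-water back-pressure described in (2) amounts to a hysteresis gate on the scheduler's decoded lookahead. The following is a minimal TypeScript sketch under stated assumptions, not the project's API: `FillGate`, `WaterMarks`, and the 10 s / 30 s thresholds are hypothetical placeholders (the spec leaves the real signatures to implementation and the window sizes to OQ1/OQ3). It only shows why two thresholds are used instead of one: production pauses at high-water and does not resume until lookahead has drained to low-water, so the producer does not flap around a single limit.

```typescript
// Hypothetical names throughout; a sketch of the hysteresis idea only.
interface WaterMarks {
  lowSeconds: number;  // resume producing once lookahead drains to this level
  highSeconds: number; // pause producing once lookahead reaches this level
}

class FillGate {
  private readonly marks: WaterMarks;
  private paused = false;

  constructor(marks: WaterMarks) {
    if (!(marks.lowSeconds >= 0 && marks.lowSeconds < marks.highSeconds)) {
      throw new RangeError("require 0 <= lowSeconds < highSeconds");
    }
    this.marks = marks;
  }

  /**
   * Feed the current decoded lookahead (seconds buffered ahead of the
   * playback position P). Returns true while producers should stay paused.
   */
  update(lookaheadSeconds: number): boolean {
    if (!this.paused && lookaheadSeconds >= this.marks.highSeconds) {
      this.paused = true; // crossed high-water: stop reading / stop feeding
    } else if (this.paused && lookaheadSeconds <= this.marks.lowSeconds) {
      this.paused = false; // drained to low-water: resume
    }
    return this.paused;
  }

  get isPaused(): boolean {
    return this.paused;
  }
}

// Usage with placeholder thresholds: between the two marks the gate keeps
// whatever state it was last in.
const exampleGate = new FillGate({ lowSeconds: 10, highSeconds: 30 });
exampleGate.update(5);  // below high-water, still producing
exampleGate.update(30); // reached high-water, now paused
exampleGate.update(20); // between marks, stays paused
exampleGate.update(10); // drained to low-water, producing again
```

Under this shape both producers could poll the same gate: the C# read loop would skip `ReadAsync` while it reports paused, and the Opus producer would additionally stop demuxing new packets. Whether that single shared signal is the right design is the open question the spec raises as OQ6.
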
*Why recommended:* smallest change to the load-bearing seam; reuses the live forward stream (no extra -connections in the common case); eviction and back-pressure are the only genuinely new mechanisms, and -both are local (one to the scheduler, one to the read loop). Back-pressure via "stop reading the socket" -is exactly how TCP flow control already wants to behave — pausing `ReadAsync` lets the kernel window -close; we are not fighting the transport. +connections in the common case); eviction and back-pressure are the only genuinely new mechanisms, all +local (the scheduler; the read loop; for Opus, the demux/decode feed). Back-pressure via "stop reading +the socket" is exactly how TCP flow control already wants to behave — pausing `ReadAsync` lets the kernel +window close; we are not fighting the transport. The Opus decode-ahead bound is the one addition Phase 18 +forces, and it is local to the Opus producer. +*Open question it raises (OQ6, new):* whether the two paths' back-pressure is driven by **one shared +window controller** that exposes a "scheduler full / drained" signal both producers poll, or by **two +parallel implementations** sharing only the eviction code. Recommend the **shared signal** — see §6 OQ6. **Direction B — Discrete window segments, each its own Range fetch.** Treat the file as fixed-size byte segments (e.g. 4 MB). Hold N decoded segments around `P`; fetch the @@ -177,32 +244,41 @@ problem. wants containerized formats (fragmented MP4/WebM, or MP3/AAC elementary streams). The entire bespoke visualizer/spectrum graph is wired to the Web Audio `AudioContext`, not a `<audio>` element. Adopting MSE is a **rewrite of the playback substrate**, not a windowing change.
It *looked* like the real -long-term answer once compressed delivery arrived — but Daniel has decided compressed delivery -(**Phase 18 Opus**) will feed the **same bespoke graph** via the `IFormatDecoder` seam, so the -compressed-delivery move that would have justified MSE happens *without* surrendering the graph. **The -bespoke graph is a deliberate long-term commitment; MSE is rejected.** Direction A is therefore the -permanent destination, not a stopgap that MSE will retire. Recorded as considered-and-declined. +long-term answer once compressed delivery arrived — but compressed delivery (**Phase 18 Opus, now +landed**) feeds the **same bespoke graph** via the WebCodecs `IStreamingDecoder` seam (parallel to the +WAV `IFormatDecoder` seam, both terminating at the shared `PlaybackScheduler`), so the compressed-delivery +move that would have justified MSE happened *without* surrendering the graph. Notably, Phase 18 chose a +**WebCodecs `AudioDecoder`** for Opus rather than `decodeAudioData` — which is itself the "use the platform +codec, keep the bespoke graph" move, but at the *decoder* granularity, not the *media-element* granularity +MSE would impose. **The bespoke graph is a deliberate long-term commitment; MSE is rejected.** Direction A +is therefore the permanent destination, not a stopgap that MSE will retire. Recorded as +considered-and-declined. ### 3.3 Recommended direction: A, with B held as the documented fallback Direction A is the smallest coherent change that hits the headline (bounded memory under a 1 GB stream) -while honoring C1–C7. It keeps the live forward stream, reuses the seek-beyond-buffer path for the only -genuinely random-access case (seek-back past the retained tail), and isolates the two new mechanisms. -**The final architecture and the exact eviction/back-pressure API are staff-engineer's call at -implementation** (per file convention); this spec fixes the *shape* and the invariants, not the method -signatures. +while honoring C1–C7. 
It keeps the live forward stream, reuses each path's seek-beyond-buffer machinery +for the only genuinely random-access case (seek-back past the retained tail), and isolates the new +mechanisms (eviction shared; back-pressure per path). **The final architecture and the exact +eviction/back-pressure API are staff-engineer's call at implementation** (per file convention); this spec +fixes the *shape* and the invariants, not the method signatures. ### 3.4 SOLID / road-not-taken rationale -- **SRP, preserved.** Eviction is a `PlaybackScheduler` concern (it already owns buffer storage); refill - orchestration is a player-service/`StreamDecoder` concern (they already own the fetch loop); byte↔time - math stays in `IFormatDecoder`. No responsibility crosses a boundary it does not already own. -- **OCP, via C3/C5.** Windowing added in the format-agnostic layer means wiring MP3/FLAC later changes - zero window code. The window expresses refill through `calculateByteOffset` — the one seam the - decoders already implement. -- **The seam stays single-writer (C6).** Every new refetch routes through the existing - cancellation/drain discipline, so "only one loop touches the JS decoder" remains true. This is the - rule most likely to be violated by a naive implementation and is called out as a hard invariant. +- **SRP, preserved.** Eviction is a `PlaybackScheduler` concern (it already owns buffer storage, and is + the single shared sink both decode paths feed); refill orchestration is a player-service concern (it + already owns the C# fetch loop and the seek dispatch); byte↔time math stays where each path already keeps + it — `IFormatDecoder.calculateByteOffset` for WAV/MP3/FLAC, `resolveOpusByteOffset` (over `OpusSeekData`) + for Opus. No responsibility crosses a boundary it does not already own. +- **OCP, via the shared sink + the live per-path seek.** Eviction added at the scheduler changes zero + decoder code on either path. 
Refill reuses each path's *already-implemented* offset resolver — Phase 21 + adds no offset math to either seam. The one place windowing is not purely additive is the Opus + decode-ahead bound (§3.1), which lives inside the Opus producer, not in the shared layer. +- **The seam stays single-writer (C6) — for both decoders.** Every new refetch routes through the existing + C# cancellation/drain discipline, so "only one loop feeds the active decoder" remains true for the WAV + `StreamDecoder` and the stateful Opus `AudioDecoder` alike. This is the rule most likely to be violated + by a naive Opus refill (a stale `push()` racing a `reset()`+`configure()`), and is called out as a hard + invariant. - **Road not taken — eager full decode with a memory cap that just stops decoding.** Tempting (decode until you hit a byte budget, then stop) but it breaks playback of long tracks past the cap entirely — it bounds memory by *refusing to play the rest*, not by sliding. Rejected: it is a degradation, not a @@ -213,8 +289,15 @@ signatures. ## 4. Use cases - **UC1 — Play a 1 GB+ DJ MIX start to finish (the headline).** Memory stays bounded throughout; the - listener experiences continuous playback identical to a short track. -- **UC2 — Seek forward within a long track.** Already handled by seek-beyond-buffer; under windowing the + listener experiences continuous playback identical to a short track. **Holds in both formats** — the + lossless WAV mix (~2 GB decoded if unbounded) and the low-data Opus mix (small transfer, but the *same* + ~2 GB decoded float once played, so it needs windowing just as much; see §1). +- **UC1-Opus — The same mix streamed as Opus, windowed.** The low-data win (Phase 18) shrinks the + *transfer*; Phase 21 shrinks the *decoded footprint*. The two compound: a metered-connection listener on + Opus gets both the small download and the bounded memory. 
Windowing the Opus path additionally bounds the + WebCodecs decode-ahead and `decodedQueue`, not only the scheduler (§3.1). +- **UC2 — Seek forward within a long track.** Already handled by seek-beyond-buffer (the right resolver per + format — `IFormatDecoder` for WAV, the live `resolveOpusByteOffset` for Opus); under windowing the forward seek clears the window and refills at the target — no behavior change, now with eviction so the pre-seek region does not linger. - **UC3 — Seek back a few seconds.** Served from the back-retain window with **no** network refetch @@ -248,8 +331,11 @@ This phase touches the **same decoder/scheduler seam** as the deferred Phase 1.3 - **1.5 Gapless (deferred).** Sample-accurate hand-off of the next track's first buffer at the current track's last buffer. Windowing changes *which* buffers are retained but not the hand-off mechanism; the only care point is that the current track's **final** window must not be evicted before the gapless - boundary is scheduled. A minor invariant for whoever builds 1.5, not a blocker. Note 1.5's existing - WAV-only caveat stands. + boundary is scheduled. A minor invariant for whoever builds 1.5, not a blocker. **Phase 18 note:** the + former "1.5 is WAV-only" caveat is superseded — Opus is live, and it has its own encoder pre-skip/priming + (handled once by the WebCodecs decoder, see `OpusStreamDecoder.ts`), so a gapless Opus hand-off must + respect the end-trim against the sidecar's authoritative total length. That is 1.5's problem to absorb, + not Phase 21's; flagged so 1.5 inherits it. - **1.6 Track-skip on error (deferred).** *Windowing enlarges the error surface — call this out.* Today a fetch failure happens at load (one fetch) or at a user seek (one fetch). 
Windowed refill issues **mid-stream** fetches the listener did not initiate; one of those can fail at byte 700 M of a 1 GB @@ -260,9 +346,12 @@ This phase touches the **same decoder/scheduler seam** as the deferred Phase 1.3 failure handling into Phase 21's acceptance criteria** (AC6) rather than leaving it entirely to 1.6 — it is created by this phase. - **1.7 Safari compatibility (deferred).** Windowing adds no new Safari-specific surface beyond what the - streaming path already has. The one adjacency: more frequent `AudioContext` activity during refill - should be checked against the older-Safari `webkitAudioContext` quirks when 1.7 is addressed — note it, - do not block on it. + streaming path already has. Two adjacencies, both Phase-18-introduced: (a) more frequent `AudioContext` + activity during refill should be checked against older-Safari `webkitAudioContext` quirks; (b) the Opus + path depends on **WebCodecs `AudioDecoder`**, whose Safari availability is narrower than `decodeAudioData` + Ogg-Opus support — Phase 18's capability gate already falls a non-WebCodecs browser back to the lossless + WAV path, so a Safari that can't run the Opus pipeline windows the *WAV* path (which has no decode-ahead + locus, only the scheduler), i.e. the simpler windowing case. Note it; do not block on it. --- @@ -294,22 +383,50 @@ These are policy calls with user-visible or resource trade-offs — flagged rath adopt MSE. The bespoke Web Audio decode→schedule graph stays — it is bespoke by deliberate choice, a long-term commitment, not a stopgap.** Daniel's rationale: the player is intentionally a custom graph, not an HTML `<audio>` element; the compressed-delivery move that *would* have made MSE - tempting is being met instead by **Phase 18 (Opus low-data path)** feeding the **same bespoke graph** - through the `IFormatDecoder` seam — so compressed delivery arrives *without* surrendering the graph.
- Consequence for this phase: Direction A (the hand-rolled sliding window) is the destination, not a - placeholder; invest in it as permanent machinery. It will window both the WAV and the Opus path - (the sequencing note at the top). Direction C is recorded as **considered and declined** per file - convention; kept visible so a future reader sees the road not taken and why. - `[RESOLVED — bespoke graph retained; MSE rejected]` + tempting was met instead by **Phase 18 (Opus low-data path, now landed)** feeding the **same bespoke + graph** through the WebCodecs `IStreamingDecoder` seam (parallel to the WAV `IFormatDecoder` seam) — so + compressed delivery arrived *without* surrendering the graph. Consequence for this phase: Direction A + (the hand-rolled sliding window) is the destination, not a placeholder; invest in it as permanent + machinery. It windows both the WAV and the Opus path (the header note). Direction C is recorded as + **considered and declined** per file convention; kept visible so a future reader sees the road not taken + and why. `[RESOLVED — bespoke graph retained; MSE rejected]` +- **OQ6 — One window controller for both decode paths, or two? (NEW — raised by the Phase 18 two-path + reality.)** Eviction is unambiguously shared (the scheduler is the one sink). Back-pressure is not: the + WAV path throttles the C# `ReadAsync` loop; the Opus path must *also* throttle the WebCodecs + decode-ahead (§3.1). Should there be **one window controller** exposing a uniform "scheduler full / + drained" signal that both producers honor in their own way (recommended — keeps the *policy* — window + sizes, water-marks, OQ1/OQ3 — in one place, with two thin per-path back-pressure hooks), or **two + parallel windowing implementations** sharing only the eviction code (simpler per-path, but duplicates the + water-mark logic and risks the two drifting)? Recommend the **shared controller + per-path hook**. 
This + is more an architecture call than a product call — flagged for staff-engineer at implementation, with the + recommendation as the default. `[staff-engineer call; recommendation: shared controller]` +- **OQ7 — How does the Opus WebCodecs decode-ahead bound interact with scheduler eviction? (NEW; technical, + for staff-engineer.)** The Opus producer has two queues to bound (the `AudioDecoder` work queue and + `decodedQueue: AudioData[]`) *plus* the shared scheduler. The clean rule is "stop feeding the decoder when + decoded-lookahead-in-the-scheduler exceeds high-water" — i.e. the **scheduler's** fill level is the + single back-pressure signal, and the upstream Opus queues are kept near-empty by simply not demuxing + ahead. The alternative (let the decoder run ahead into `decodedQueue` and bound *that* separately) adds a + second budget to tune and a second eviction point. Recommend the former: **one fill signal (scheduler + decoded-lookahead), drive both the read-loop pause and the demux/decode pause from it.** Confirm at + implementation that the WebCodecs decoder tolerates being starved of input mid-stream and resumes cleanly + (it should — it is fed packet-by-packet via `decode()`), and that `decodedQueue` is drained promptly so + it never holds more than one `push()` worth. `[staff-engineer call; recommendation: single + scheduler-fill signal]` --- ## 7. Acceptance criteria -- **AC1 (headline) — Bounded memory under a 1 GB stream.** Playing a 1 GB+ WAV mix start to finish, the - browser tab's retained decoded-audio memory stays bounded to the configured window (not growing toward - ~2 GB). Verifiable via browser memory tooling: peak decoded-audio footprint is independent of track - length and tracks the window-size policy, not the file size. 
+- **AC1 (headline) — Bounded memory under a 1 GB stream, in BOTH formats.** Playing a 1 GB+ mix start to + finish — **as lossless WAV and as low-data Opus** — the browser tab's retained decoded-audio memory + stays bounded to the configured window (not growing toward ~2 GB). Verifiable via browser memory tooling: + peak decoded-audio footprint is independent of track length and tracks the window-size policy, not the + file size. The Opus case must be verified explicitly — its small *transfer* does not imply a small + *decoded* footprint (§1), so "Opus already streams small" is **not** sufficient. +- **AC1-Opus — The Opus upstream decode-ahead is bounded too (§3.1 / OQ7).** Under a long Opus stream, the + WebCodecs decode queue and `decodedQueue` do not grow unboundedly behind the scheduler — back-pressure + reaches the demux/decode feed, not only the scheduler. Verifiable: the upstream queues stay near-empty + (one `push()` worth) regardless of stream length. - **AC2 — Playback-start latency at parity (C2).** First-audio latency for a track is unchanged from pre-windowing (within noise). Windowing does not introduce a fetch-then-play stall. - **AC3 — Continuous playback, no starvation.** A long mix plays edge to edge with no audible gaps, @@ -324,41 +441,70 @@ These are policy calls with user-visible or resource trade-offs — flagged rath "playing" with a starved scheduler). It must not silently hang. - **AC7 — The Mix visualizer is unaffected (C7).** With the lava visualizer running on a long mix, the visualizer renders identically (it reads the preprocessed datum, never the evicted buffers). -- **AC8 — Single-decoder concurrency invariant holds (C6).** Under rapid seek + refill activity, no - interleaved `ProcessStreamingChunk` calls corrupt the single JS decoder (the existing drain/cancel - discipline still governs every fetch). 
+- **AC8 — Single-writer decoder concurrency invariant holds (C6) — both decoders.** Under rapid seek + + refill activity, no interleaved `ProcessStreamingChunk` / `push` calls corrupt the active decoder — the + existing drain/cancel discipline still governs every fetch. **For Opus this is stricter:** no stale + `push()` may land against the WebCodecs `AudioDecoder` across a `reinitializeForRangeContinuation` + reset+reconfigure (which would corrupt inter-frame state, not just a buffer). Verify under a rapid + seek-storm on an Opus mix specifically. --- ## 8. Wave decomposition +**Decomposition choice: split by *concern* (eviction → back-pressure → seek-back refill → validate), not +by *path* (WAV-track vs Opus-track).** Rationale: the eviction concern (21.1) is genuinely shared — the +scheduler is the one sink both paths feed — so a path-split would duplicate the hardest correctness work or +arbitrarily assign it to one track. The concern spine keeps that shared work as a single cold-start wave +and lets the *one* genuinely path-divergent concern (back-pressure, 21.2) carry an explicit two-track +split *inside* the wave rather than fracturing the whole phase. This also matches how the seek-back refill +(21.3) reuses each path's already-live seek — it is one concern (window-miss → refetch) with a per-path +resolver underneath, not two features. The spine is unchanged from the original spec; the mechanisms +inside 21.2 and 21.3 are made correct for both paths. + Dependency shape: `21.1 → 21.2 → 21.3`, with `21.4` validating the whole. 21.1 is the cold-start prerequisite and the load-bearing change; the rest layer on it. 
-- **21.1 — Partial eviction in `PlaybackScheduler` (cold-start; the load-bearing change).** Give the - scheduler the ability to drop already-played buffers and keep its position/index bookkeeping correct - against a buffer array that no longer begins at absolute time 0 (today `getCurrentPosition`, - `playFromPosition`, and the scheduling loop all assume `buffers[0]` is the track start). This is the - hardest correctness work in the phase — the time-anchor math must stay exact through eviction. No - refill yet; with eviction alone and the forward read loop unchanged, this is provably memory-bounded - for the *played* region. **Independent of the §6 open questions** — it can begin immediately; the - window *sizes* (OQ1/OQ3) are parameters fed in later. Settled and cold-start. -- **21.2 — Back-pressure on the forward read loop (the bound on the *unplayed* region).** Make the C# - `StreamAudioWithEarlyPlayback` loop stop calling `ReadAsync` when forward decoded lookahead exceeds the - high-water mark, and resume below low-water. Together with 21.1, this bounds *both* the played and - unplayed sides — the full memory guarantee (AC1). Must route resume/pause through the existing - cancellation-safe single-loop discipline (C6). **Depends on 21.1** (eviction must exist so the drained - region is reclaimed, not merely un-read). -- **21.3 — Seek-back-past-window refill (close the random-access case).** Wire UC4 — when a backward - seek lands earlier than the retained tail, refetch via the existing seek-beyond-buffer Range path - pointed at the earlier offset, and the minimal AC6 refill-failure handling. Mostly **reuse** of the - landed seek path; the new work is the trigger (window-miss detection) and the clean-failure path. - **Depends on 21.1 + 21.2** (needs the window boundaries they define). 
-- **21.4 — Validation pass against the 1 GB target (acceptance).** Exercise AC1–AC8 against a real 1 GB+ - mix: memory profiling (AC1), latency parity (AC2), edge-to-edge playback (AC3), the seek matrix - (AC4/AC5), induced refill failure (AC6), visualizer-running (AC7), and rapid-seek concurrency (AC8). - Largely test/measurement; any break is likely a tuning fix in the 21.1 anchor math or the 21.2 - water-marks. **Depends on 21.1–21.3.** +- **21.1 — Partial eviction in `PlaybackScheduler` (cold-start; the load-bearing change; SHARED by both + paths).** Give the scheduler the ability to drop already-played buffers and keep its position/index + bookkeeping correct against a buffer array that no longer begins at absolute time 0 (today + `getCurrentPosition`, `playFromPosition`, and the scheduling loop all assume `buffers[0]` is the track + start). This is the hardest correctness work in the phase — the time-anchor math must stay exact through + eviction. Because both decode paths feed the scheduler identically via `addBuffer`, **eviction is written + once and serves both** — no per-path branch. No refill yet; with eviction alone and the forward producers + unchanged, this is provably memory-bounded for the *played* region on both paths. **Independent of the §6 + open questions** — it can begin immediately; the window *sizes* (OQ1/OQ3) are parameters fed in later. + Settled and cold-start. +- **21.2 — Back-pressure (the bound on the *unplayed* region) — two tracks, one signal.** Bound the + not-yet-played decoded audio by stopping production above a high-water mark and resuming below low-water, + driven by the scheduler's decoded-lookahead fill (OQ7). The fill *signal* is shared; the *throttle* has + two sites because Phase 18 gave the two paths different producers: + - **21.2a — C# read-loop back-pressure (serves both paths).** Make `StreamAudioWithEarlyPlayback` stop + calling `ReadAsync` above high-water and resume below low-water. 
Routes resume/pause through the + existing cancellation-safe single-loop discipline (C6). For the WAV path this is *sufficient* (its + `StreamDecoder` decodes synchronously into the scheduler). + - **21.2b — Opus decode-ahead back-pressure (Opus path only).** Additionally stop demuxing/decoding new + packets when the same fill signal is over high-water, so the WebCodecs decode queue and `decodedQueue` + do not balloon behind a throttled socket (§3.1, OQ7). This is the one mechanism with no WAV analogue. + Confirm the WebCodecs decoder resumes cleanly after being starved of input mid-stream. + Together with 21.1 this bounds *both* the played and unplayed sides on *both* formats — the full memory + guarantee (AC1 + AC1-Opus). **Depends on 21.1** (eviction must exist so the drained region is reclaimed, + not merely un-read). Per OQ6, 21.2a and 21.2b ideally share one window controller exposing the fill + signal; the recommendation is the shared controller + two thin hooks. +- **21.3 — Seek-back-past-window refill (close the random-access case; one concern, per-path resolver).** + Wire UC4 — when a backward seek lands earlier than the retained tail, refetch via the existing + seek-beyond-buffer path pointed at the earlier offset, **using whichever resolver the active path already + ships** (`IFormatDecoder`/`StreamDecoder.calculateByteOffset` for WAV; the live + `resolveOpusByteOffset` + `OpusStreamDecoder.reinitializeForRangeContinuation` for Opus) — plus the + minimal AC6 refill-failure handling. Mostly **reuse** of the landed seek paths; the new work is the + trigger (window-miss detection) and the clean-failure path, both format-agnostic. **Depends on 21.1 + + 21.2** (needs the window boundaries they define). 
+- **21.4 — Validation pass against the 1 GB target, BOTH formats (acceptance).** Exercise AC1–AC8 against a + real 1 GB+ mix **streamed as WAV and as Opus**: memory profiling (AC1 both formats + AC1-Opus upstream + queues), latency parity (AC2), edge-to-edge playback (AC3), the seek matrix (AC4/AC5), induced refill + failure (AC6), visualizer-running (AC7), and rapid-seek concurrency (AC8 — including the Opus + seek-storm). Largely test/measurement; any break is likely a tuning fix in the 21.1 anchor math, the + 21.2 water-marks, or the 21.2b Opus decode-ahead bound. **Depends on 21.1–21.3.** --- @@ -366,18 +512,35 @@ prerequisite and the load-bearing change; the rest layer on it. - Root `CLAUDE.md` "Streaming-first audio playback" / `CONTEXT.md §3.5` — the seam this phase modifies; the §2 invariants here restate its contract. Both flag it as the most load-bearing path. -- `PLAN.md` Phase 4 (landed) / `COMPLETED.md` — the HTTP Range `bytes=X-` primitive this generalizes. +- **`COMPLETED.md` Phase 18 — Opus Low-Data Streaming (landed 2026-06-23) — read this first.** The + "as-built divergence" note records why Opus uses a **WebCodecs `AudioDecoder`** streaming pipeline + (`IStreamingDecoder`) rather than the spec'd-and-replaced per-segment `decodeAudioData`/`IFormatDecoder` + model. This is the two-path reality this phase reconciles to. `product-notes/phase-18-opus-low-data-streaming.md` + is the design memo (note: its §3.4 `OpusFormatDecoder` framing predates the WebCodecs divergence — the + *seek-index/sidecar* design in §3.4a is accurate and landed; the *decoder-shape* discussion was superseded + by `IStreamingDecoder`). +- `PLAN.md` Phase 4 (landed) / `COMPLETED.md` — the HTTP Range `bytes=X-` primitive this generalizes + (now serving both `?format=lossless` and `?format=opus`). - `PLAN.md` Phase 1.3 / 1.4 / 1.5 / 1.6 / 1.7 — the deferred decoder/scheduler-seam features; §5 above - reconciles each. + reconciles each (1.5 and 1.7 updated for the Opus path). 
- `PLAN.md` Phase 9 — defines the `Mix` medium (single long track), the canonical 1 GB case. - `PLAN.md` Phase 10 / `product-notes/phase-10-mix-visualizer-lava-reframe.md` / `product-notes/phase-12-waveform-visualizer-generalization.md` — establishes the preprocessed per-track high-res waveform datum; the basis for C7 (visualizer does not read live PCM). -- `DeepDrftPublic/Interop/audio/PlaybackScheduler.ts` — owns the unbounded `buffers: AudioBuffer[]`; - 21.1 lives here. -- `DeepDrftPublic/Interop/audio/StreamDecoder.ts` — `reinitializeForRangeContinuation`, - `calculateByteOffset`; the refill substrate. +- `DeepDrftPublic/Interop/audio/PlaybackScheduler.ts` — owns the unbounded `buffers: AudioBuffer[]`, the + **shared sink for both decode paths**; 21.1 (eviction) lives here. +- `DeepDrftPublic/Interop/audio/AudioPlayer.ts` — the dispatch: `processFormatChunk` (WAV/MP3/FLAC) vs + `processOpusChunk` (Opus), both calling `scheduler.addBuffer`; `seekBeyondBuffer`/`reinitializeFromOffset` + branch per path; the place the refill trigger (21.3) and the fill-signal wiring (21.2) hook. +- `DeepDrftPublic/Interop/audio/StreamDecoder.ts` + `IFormatDecoder.ts` — the WAV/MP3/FLAC refill substrate + (`reinitializeForRangeContinuation`, `calculateByteOffset`). +- `DeepDrftPublic/Interop/audio/IStreamingDecoder.ts` + `OpusStreamDecoder.ts` + `OggDemuxer.ts` + + `OpusSidecar.ts` — the **Opus** path: the WebCodecs decode pipeline, the `decodeQueueSize`/`decodedQueue` + upstream accumulation 21.2b must bound, and the live `resolveOpusByteOffset` / + `reinitializeForRangeContinuation(landingTime, target)` seek 21.3 reuses. **`IStreamingDecoder.ts` is the + seam the Opus windowing hooks into** (push/complete/reinitialize lifecycle). - `DeepDrftPublic.Client/Services/StreamingAudioPlayerService.cs` — the C# forward read loop - (`StreamAudioWithEarlyPlayback`), the seek-beyond-buffer path (`SeekBeyondBuffer`), and the - cancellation/drain discipline (C6); 21.2/21.3 live here. 
-- `DeepDrftPublic.Client/Clients/TrackMediaClient.cs` — the Range-capable media fetch reused by refill. + (`StreamAudioWithEarlyPlayback`, feeding *both* decoders), the seek-beyond-buffer path (`SeekBeyondBuffer`), + and the cancellation/drain discipline (C6); 21.2a/21.3 live here. +- `DeepDrftPublic.Client/Clients/TrackMediaClient.cs` — the Range-capable media fetch (with the `?format=` + param) reused by refill on both paths.
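
As a closing illustration of the 21.1 bookkeeping (eviction against a buffer array that no longer begins at absolute time 0), here is a minimal TypeScript sketch. It is not `PlaybackScheduler`'s actual code: `WindowedBufferList`, `TimedBuffer`, and every method name are hypothetical, and a plain `{ duration }` object stands in for `AudioBuffer`. The one idea it shows is keeping an explicit absolute-time anchor for `buffers[0]` and advancing it on every eviction, so position lookups stay correct after the head of the array is dropped.

```typescript
// Hypothetical stand-in for the one AudioBuffer field the bookkeeping needs.
interface TimedBuffer {
  duration: number; // seconds
}

class WindowedBufferList<T extends TimedBuffer> {
  private buffers: T[] = [];
  // Absolute track time (seconds) at which buffers[0] starts.
  private baseTime = 0;

  /** Append a decoded buffer at the forward edge of the window. */
  add(buffer: T): void {
    this.buffers.push(buffer);
  }

  get windowStart(): number {
    return this.baseTime;
  }

  get windowEnd(): number {
    return this.buffers.reduce((t, b) => t + b.duration, this.baseTime);
  }

  /**
   * Drop every buffer that ends at or before (position - backRetainSeconds),
   * advancing the anchor by each dropped duration. Returns how many were evicted.
   */
  evictBefore(position: number, backRetainSeconds: number): number {
    const cutoff = position - backRetainSeconds;
    let evicted = 0;
    for (;;) {
      const head = this.buffers[0];
      if (head === undefined || this.baseTime + head.duration > cutoff) break;
      this.buffers.shift();
      this.baseTime += head.duration; // the anchor moves with the window
      evicted++;
    }
    return evicted;
  }

  /**
   * Map an absolute time to (buffer index, offset inside that buffer).
   * Returns null on a window miss, which is where a refill would be triggered.
   */
  locate(absoluteTime: number): { index: number; offset: number } | null {
    if (absoluteTime < this.baseTime) return null;
    let start = this.baseTime;
    for (let i = 0; i < this.buffers.length; i++) {
      const buffer = this.buffers[i];
      if (buffer === undefined) break;
      const end = start + buffer.duration;
      if (absoluteTime < end) return { index: i, offset: absoluteTime - start };
      start = end;
    }
    return null;
  }
}
```

In this sketch a `null` from `locate` for a time earlier than `windowStart` corresponds to the seek-back-past-window case that 21.3 routes to the existing seek-beyond-buffer path. A real implementation would also have to keep already-scheduled source nodes consistent with the shifted indices, which this sketch does not attempt.
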