docs: reconcile Phase 21 spec with as-built Phase 18 (two decode paths)

Window both the WAV StreamDecoder and Opus WebCodecs paths feeding one PlaybackScheduler — shared eviction, per-path back-pressure; reuse the now-live index-driven Opus seek for refill. Drops stale approximate-seek language; adds OQ6/OQ7.
daniel-c-harvey
2026-06-23 22:01:49 -04:00
parent bbcf8be677
commit ccf7d3dbe3
2 changed files with 395 additions and 211 deletions
+78 -57
@@ -457,44 +457,52 @@ plays without the whole decoded PCM accumulating in the browser. **Public listen
(`DeepDrftPublic.Client` player stack + `DeepDrftPublic` TypeScript audio interop); no CMS, no API
endpoint, no schema change.
**Phase 18 (Opus Low-Data Streaming) has landed — Phase 21 is the next pickup.** The derived Ogg Opus 320 low-data path (Phase 18, `COMPLETED.md`) is the prerequisite; windowing must work across **both** delivery formats. Phase 21's C5 invariant already anticipated this ("must not foreclose **Phase 18 (Opus Low-Data Streaming) has landed — Phase 21 is the next pickup, reconciled to the as-built
MP3/FLAC"); **Opus is now the concrete VBR/paged driver** — windowing an Opus stream uses the decoder's two-decode-path reality.** Phase 18 left **two** decode paths feeding the one `PlaybackScheduler`:
**accurate index-based** byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, a binary search in the WAV/MP3/FLAC via `StreamDecoder`/`IFormatDecoder`, and Opus via a **WebCodecs `AudioDecoder`** pipeline
Phase 18 precomputed seek index — *not* the exact CBR-WAV `byteRate` math, and *not* approximate page (`OggDemuxer` → `OpusStreamDecoder`, the `IStreamingDecoder` seam — *not* `IFormatDecoder`; per-segment
interpolation: VBR-safe and exact, per the Phase 18 seek-model resolution 2026-06-23). The windowed refill `decodeAudioData` was tried and replaced). Windowing must bound **both**. The accurate index-driven Opus
controller calls the **same** index resolver an explicit seek does, and a window opening away from byte 0 seek the original spec assumed Phase 21 would build is **already live** (`resolveOpusByteOffset` over the
still decodes via the Phase 18 sidecar setup header. Build the window machinery format-agnostically so it precomputed seek index → Range fetch → `reinitializeForRangeContinuation` with frame-accurate lead-trim);
inherits Opus for free. Phase 21 **reuses** it for window-miss refills rather than building it. Opus seek is VBR-safe and
**accurate**, not approximate (the earlier "approximate page interpolation" framing is corrected).
The network path already streams in adaptive 16–64 KB chunks. The accumulation is on the **decode
side**: `PlaybackScheduler` holds an `AudioBuffer[]` it **never evicts** ("Supports pause/resume/seek by side**, and now has two faces. The shared one: `PlaybackScheduler` holds an `AudioBuffer[]` it **never
retaining all buffers" — its own doc comment). Decoded PCM is larger than the source (Web Audio is evicts** — both paths `addBuffer` into it, nothing is removed. Decoded PCM is larger than the source
32-bit float per sample/channel — a 16-bit stereo WAV roughly doubles once decoded), so a 1 GB WAV (Web Audio is 32-bit float per sample/channel — a 16-bit stereo WAV roughly doubles once decoded), so a
becomes ~2 GB of retained float data. That is the OOM. The fix: hold only a sliding forward window plus a 1 GB WAV becomes ~2 GB of retained float; **a low-data Opus mix decodes to the same ~2 GB once played**,
small back-retain, discard already-played buffers, and refill on demand. so its small transfer does not spare it. The Opus-only second face: the WebCodecs decode queue +
`decodedQueue` accumulate upstream of the scheduler too. The fix: hold only a sliding forward window plus
a small back-retain, discard already-played buffers, and refill on demand — with back-pressure on the C#
read loop (both paths) **and** on the Opus demux/decode feed (Opus only).
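The size claim above can be checked with a few lines of arithmetic. The following is a minimal sketch, not code from the player: both helper names are hypothetical, and the 60-second window is an assumed figure chosen only for illustration (the actual window size is an open question in the spec).

```typescript
// Illustrative arithmetic only; neither function exists in the DeepDrft codebase.
// Web Audio stores decoded audio as one 32-bit float (4 bytes) per sample per channel.
const FLOAT32_BYTES = 4;

// Decoded size of raw PCM: every source sample becomes one 4-byte float.
function decodedBytesFromPcm(pcmDataBytes: number, bitsPerSample: number): number {
  const totalSamples = pcmDataBytes / (bitsPerSample / 8); // summed over all channels
  return totalSamples * FLOAT32_BYTES;
}

// Decoded size of any format, given only what the decoder outputs.
function decodedBytesFromDuration(seconds: number, sampleRate: number, channels: number): number {
  return seconds * sampleRate * channels * FLOAT32_BYTES;
}

const fullWav = decodedBytesFromPcm(1e9, 16);             // 2e9 bytes retained without eviction
const window60s = decodedBytesFromDuration(60, 48000, 2); // 23040000 bytes for an assumed 60 s window
console.log({ fullWav, window60s });
```

The second helper is the one that applies to Opus: the decoded footprint depends on duration, sample rate, and channel count, not on how small the compressed transfer was.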
**Architectural spine — a sliding window keyed on playback position, built as a generalization of the
landed seek-beyond-buffer path.** The Phase 4 HTTP `Range: bytes=X-` → 206 primitive already does every landed seek-beyond-buffer paths.** The Phase 4 HTTP `Range: bytes=X-` → 206 primitive already does every
plumbing primitive the window needs (discard-buffers-keep-offset via `clearForSeek`/`setPlaybackOffset`; plumbing primitive the window needs, with a WAV branch and an Opus branch, both live: discard-buffers-keep-
fetch-from-offset via `TrackMediaClient`; decode-header-less-body via offset via `clearForSeek`/`setPlaybackOffset` (shared); fetch-from-offset via `TrackMediaClient` (shared,
`StreamDecoder.reinitializeForRangeContinuation`; time→byte via `IFormatDecoder.calculateByteOffset`), now with `?format=`); decode-header-less-body via `StreamDecoder.reinitializeForRangeContinuation` (WAV) /
just triggered manually and one-shot. The only genuinely new mechanisms are **partial eviction** on the `OpusStreamDecoder.reinitializeForRangeContinuation` (Opus); time→byte via `IFormatDecoder.calculateByteOffset`
scheduler and **back-pressure** on the forward read loop (stop calling `ReadAsync` above a high-water (WAV) / `resolveOpusByteOffset` over the seek index (Opus) — just triggered manually and one-shot. The
mark, resume below low-water). Recommended **Direction A** (sliding window on the existing single forward genuinely new mechanisms: **partial eviction** on the shared scheduler (one implementation, both paths),
stream); **Direction B** (discrete Range-fetched segments — the HLS/DASH/MSE-eviction analogue) held as and **back-pressure** — on the C# read loop (both paths) **and additionally on the Opus WebCodecs
the documented fallback; **Direction C** (adopt MSE and let the browser manage the buffer) **rejected decode-ahead** (`decodeQueueSize` + `decodedQueue`, Opus only, since throttling the socket alone doesn't
(OQ5 = NO, Daniel 2026-06-23)** — the bespoke Web Audio graph is a deliberate long-term commitment, and bound the async decoder's queues). Recommended **Direction A** (sliding window on the existing single
the compressed-delivery move that would have justified MSE is met instead by **Phase 18 (Opus) feeding forward stream); **Direction B** (discrete Range-fetched segments — the HLS/DASH/MSE-eviction analogue)
the same bespoke graph** through the `IFormatDecoder` seam. Direction A is therefore the permanent held as the documented fallback; **Direction C** (adopt MSE) **rejected (OQ5 = NO, Daniel 2026-06-23)** —
destination, not a stopgap MSE would retire. the bespoke Web Audio graph is a deliberate long-term commitment, and the compressed-delivery move that
would have justified MSE was met instead by **Phase 18 (Opus) feeding the same bespoke graph** through the
WebCodecs `IStreamingDecoder` seam. Direction A is the permanent destination, not a stopgap MSE would
retire.
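The high-water/low-water back-pressure described above is a hysteresis gate. A minimal sketch follows, assuming the scheduler can report its decoded lookahead in seconds; `WaterMarkGate` and the 10 s / 30 s thresholds are hypothetical and appear nowhere in the spec or the codebase.

```typescript
// Hypothetical gate; the class name and thresholds are illustrative, not from the codebase.
class WaterMarkGate {
  private readonly lowWaterSeconds: number;
  private readonly highWaterSeconds: number;
  private paused = false;

  constructor(lowWaterSeconds: number, highWaterSeconds: number) {
    if (lowWaterSeconds >= highWaterSeconds) {
      throw new Error("low-water mark must be below high-water mark");
    }
    this.lowWaterSeconds = lowWaterSeconds;
    this.highWaterSeconds = highWaterSeconds;
  }

  // Feed the current decoded lookahead; returns true while the producer may keep reading.
  update(lookaheadSeconds: number): boolean {
    if (!this.paused && lookaheadSeconds >= this.highWaterSeconds) {
      this.paused = true; // buffer is full enough: stop producing
    } else if (this.paused && lookaheadSeconds <= this.lowWaterSeconds) {
      this.paused = false; // buffer has drained: resume producing
    }
    return !this.paused;
  }
}

const gate = new WaterMarkGate(10, 30);
const decisions = [5, 30, 20, 10, 20].map((s) => gate.update(s));
console.log(decisions); // [ true, false, false, true, true ]
```

The gap between the two marks is what prevents the read loop from flapping: at 20 s of lookahead the answer depends on whether the gate was last filled or last drained.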
**Invariants that must hold (the §3.5 seam contract).** Reuse the Range path, don't fork it; playback- **Invariants that must hold (the §3.5 seam contract).** Reuse each path's Range/seek machinery, don't fork
start latency at parity; the `IFormatDecoder` abstraction untouched (windowing is format-agnostic, so it; playback-start latency at parity; neither decoder seam's contract forked — eviction is shared at the
wiring MP3/FLAC later inherits it free); read-only playback (no new control); the single-instance JS scheduler (zero format branches), back-pressure is seam-aware (the one place the two paths diverge);
decoder stays single-writer (every refill routes through the existing cancellation/drain discipline). The read-only playback (no new control); the single-writer decoder discipline holds for **both** decoders
**Mix visualizer is provably unaffected** — it renders from the preprocessed per-track high-res datum (stricter for the stateful Opus `AudioDecoder` — a stale `push` racing a reset+reconfigure corrupts
(Phase 10/12), never from live decoded PCM, so evicting played buffers cannot starve it. The 1 GB mix is inter-frame state). The **Mix visualizer is provably unaffected** — it renders from the preprocessed
both the canonical case *and* the proof the eviction is safe. per-track high-res datum (Phase 10/12), never from live decoded PCM, so evicting played buffers cannot
starve it. The 1 GB mix is both the canonical case *and* the proof the eviction is safe.
**Interaction with deferred Phase 1 features (same seam):** windowing should land **before** preload
(1.3) — it makes preload of long tracks memory-safe by construction (a staged next-track decoder inherits
@@ -512,32 +520,45 @@ Sequenced as four waves. `21.1 → 21.2 → 21.3`, with `21.4` validating the wh
prerequisite and the load-bearing change** — independent of the open questions (window *sizes* are
parameters fed in later).
- **21.1 — Partial eviction in `PlaybackScheduler` (cold-start; load-bearing).** Drop already-played Decomposition is **by concern** (eviction → back-pressure → seek-back refill → validate), not by format —
buffers while keeping the position/index/time-anchor bookkeeping exact against a buffer array that no eviction is genuinely shared, so a path-split would duplicate the hardest work; the one path-divergent
longer begins at absolute time 0 (today `getCurrentPosition`/`playFromPosition`/the schedule loop all concern (back-pressure) carries a two-track split *inside* its wave.
assume `buffers[0]` is the track start). The hardest correctness work in the phase. No refill yet.
**Independent of the open questions — can begin immediately.** - **21.1 — Partial eviction in `PlaybackScheduler` (cold-start; load-bearing; SHARED by both paths).** Drop
- **21.2 — Back-pressure on the forward read loop.** Stop `ReadAsync` above the high-water mark, resume already-played buffers while keeping the position/index/time-anchor bookkeeping exact against a buffer
below low-water; together with 21.1 this bounds *both* the played and unplayed regions (the AC1 array that no longer begins at absolute time 0 (today `getCurrentPosition`/`playFromPosition`/the schedule
guarantee). Routes resume/pause through the existing single-loop cancellation discipline. **Depends on loop all assume `buffers[0]` is the track start). The hardest correctness work in the phase. Written once,
21.1.** serves both decode paths (they `addBuffer` identically). No refill yet. **Independent of the open
- **21.3 — Seek-back-past-window refill.** When a backward seek lands earlier than the retained tail, questions — can begin immediately.**
refetch via the existing seek-beyond-buffer Range path pointed at the earlier offset; plus the minimal - **21.2 — Back-pressure (two tracks, one fill signal).** Bound the unplayed region by throttling
clean refill-failure handling (the 1.6 adjacency). Mostly reuse of the landed seek path. **Depends on production above a high-water mark and resuming below low-water, driven by the scheduler's
21.1 + 21.2.** decoded-lookahead fill. **21.2a** — stop `ReadAsync` on the C# loop (serves both paths; *sufficient* for
- **21.4 — Validation against the 1 GB target (acceptance).** Memory profiling (bounded under 1 GB is the WAV). **21.2b** — additionally stop the Opus demux/decode feed so the WebCodecs decode queue +
headline), latency parity, edge-to-edge playback, the seek matrix, induced refill failure, visualizer- `decodedQueue` don't balloon behind a throttled socket (Opus only; no WAV analogue). Together with 21.1
running, rapid-seek concurrency. Largely measurement; breaks are tuning fixes in 21.1's anchor math or this bounds both played and unplayed sides on both formats (AC1 + AC1-Opus). Routes through the existing
21.2's water-marks. **Depends on 21.1–21.3.** single-loop cancellation discipline. **Depends on 21.1.**
- **21.3 — Seek-back-past-window refill (one concern, per-path resolver).** When a backward seek lands
earlier than the retained tail, refetch via the existing seek-beyond-buffer path pointed at the earlier
offset, using whichever resolver the active path already ships (`IFormatDecoder`/`StreamDecoder` for WAV;
the live `resolveOpusByteOffset` + `OpusStreamDecoder.reinitializeForRangeContinuation` for Opus); plus
the minimal clean refill-failure handling (the 1.6 adjacency). Mostly reuse of the landed seek paths.
**Depends on 21.1 + 21.2.**
- **21.4 — Validation against the 1 GB target, BOTH formats (acceptance).** Memory profiling (bounded under
1 GB as WAV *and* as Opus, plus the Opus upstream queues), latency parity, edge-to-edge playback, the
seek matrix, induced refill failure, visualizer-running, rapid-seek concurrency (incl. an Opus
seek-storm). Largely measurement; breaks are tuning fixes in 21.1's anchor math, 21.2's water-marks, or
21.2b's Opus decode-ahead bound. **Depends on 21.1–21.3.**
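The bookkeeping problem in 21.1 (a buffer array that no longer begins at absolute time 0) can be sketched with a single base-time anchor. This is an assumption-laden illustration, not the scheduler's implementation: `WindowedBufferList` and `TimedBuffer` are hypothetical, and a plain object with a `duration` field stands in for `AudioBuffer`.

```typescript
// Stand-in for AudioBuffer; only the duration matters for the bookkeeping.
interface TimedBuffer {
  duration: number; // seconds
}

// Hypothetical container: tracks the absolute track time of buffers[0] across evictions.
class WindowedBufferList {
  private buffers: TimedBuffer[] = [];
  private baseTime = 0; // absolute time (seconds) at which buffers[0] starts

  add(buffer: TimedBuffer): void {
    this.buffers.push(buffer);
  }

  get retainedStart(): number {
    return this.baseTime;
  }

  get retainedEnd(): number {
    return this.buffers.reduce((t, b) => t + b.duration, this.baseTime);
  }

  // Drop every buffer that ends at or before (position - backRetainSeconds).
  evictBefore(position: number, backRetainSeconds: number): number {
    const cutoff = position - backRetainSeconds;
    let evicted = 0;
    for (;;) {
      const head = this.buffers[0];
      if (head === undefined || this.baseTime + head.duration > cutoff) break;
      this.baseTime += head.duration; // advance the anchor before dropping the buffer
      this.buffers.shift();
      evicted++;
    }
    return evicted;
  }

  // Map an absolute time to (index, offset) in the retained array; null means a window miss.
  locate(position: number): { index: number; offset: number } | null {
    if (position < this.baseTime) return null;
    let start = this.baseTime;
    for (let i = 0; i < this.buffers.length; i++) {
      const b = this.buffers[i];
      if (b === undefined) continue;
      const end = start + b.duration;
      if (position < end) return { index: i, offset: position - start };
      start = end;
    }
    return null;
  }
}

const list = new WindowedBufferList();
for (let i = 0; i < 5; i++) list.add({ duration: 1 });
const evictedCount = list.evictBefore(3.5, 1); // cutoff 2.5: the first two buffers go
console.log(evictedCount, list.retainedStart, list.retainedEnd, list.locate(3.5), list.locate(1));
```

A `null` from `locate` is the window miss that 21.3 turns into a refill; every position query goes through the anchor, so nothing assumes `buffers[0]` is the track start.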
**Dependency shape:** `21.1 → 21.2 → 21.3 → 21.4`; 21.1 is the only cold-start wave. **Phase-level
prerequisite: Phase 18 (Opus) lands first** so windowing is built against both formats. **Open questions prerequisite: Phase 18 (Opus) has landed** (`COMPLETED.md`) — windowing is built against both formats.
for Daniel (spec §6):** window-size policy axis (time-based window + memory guard — recommended); seek- **Open questions for Daniel (spec §6):** window-size policy axis (time-based window + memory guard —
back-past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard total in-flight recommended); seek-back-past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard
memory cap as a guard rail (recommend yes); window everything vs. only long tracks (recommend everything total in-flight memory cap as a guard rail (recommend yes); window everything vs. only long tracks
— one path, short tracks never hit a refill). **OQ5 (adopt MSE) — RESOLVED NO (Daniel 2026-06-23): the (recommend everything — one path, short tracks never hit a refill). **New staff-engineer architecture
bespoke graph stays by deliberate choice; recorded considered-and-declined, kept visible per file calls (spec §6):** OQ6 — one window controller for both paths or two (recommend shared controller + two
convention.** None block 21.1. back-pressure hooks); OQ7 — drive the Opus decode-ahead bound from the single scheduler-fill signal
(recommended). **OQ5 (adopt MSE) — RESOLVED NO (Daniel 2026-06-23):** the bespoke graph stays by deliberate
choice. None block 21.1.
---
@@ -1,27 +1,39 @@
# Phase 21 — Windowed Streaming Buffer (bounded client memory for long streams)
Product spec. Status: **design / framing — implementation-ready pending Daniel's open-question calls.** Product spec. Status: **design / framing — reconciled to as-built Phase 18 (two decode paths);
Author: product-designer. Date: 2026-06-23. **No code has been written by this doc.** implementation-ready pending Daniel's open-question calls (OQ1–OQ4 product; OQ6–OQ7 staff-engineer
architecture).** Author: product-designer. Date: 2026-06-23 (reconciliation pass after Phase 18 landed).
**No code has been written by this doc.**
Surface: **public listener site only** (`DeepDrftPublic.Client` player stack + `DeepDrftPublic`
TypeScript audio interop). No CMS (`DeepDrftManager`) change. No data-model or schema change. The one
server touch is **reuse, not new surface**: the existing `DeepDrftAPI` HTTP `Range: bytes=X-`
partial-content primitive (Phase 4, landed) is the load-bearing dependency; this phase adds no new API
endpoint.
> **Sequencing dependency (Daniel, 2026-06-23): Phase 18 (Opus Low-Data Streaming) comes BEFORE this > **Phase 18 (Opus Low-Data Streaming) has LANDED (2026-06-23, `COMPLETED.md`). This spec is reconciled
> phase.** Format support — specifically the derived **Ogg Opus fullband 320** low-data delivery path > to the as-built reality.** Phase 18 changed the landscape in two ways that reshape this phase:
> (`product-notes/phase-18-opus-low-data-streaming.md`) — is a prerequisite that sequences ahead of >
> windowing. Phase 21's windowing must work across **both** delivery formats (lossless WAV and Opus). > 1. **There are now TWO decode paths feeding the one `PlaybackScheduler`, not one.** (a) The original
> Its C5 invariant below already anticipated this ("must not foreclose MP3/FLAC"); **Opus is now the > **WAV/MP3/FLAC** path — `StreamDecoder` → `IFormatDecoder` (wrap-each-segment + `decodeAudioData`).
> concrete VBR/containerized driver of C5.** Windowing an Opus stream uses the decoder's **accurate > (b) A new **Opus** path — `OggDemuxer` → `OpusStreamDecoder` (the `IStreamingDecoder` seam, a stateful
> index-based** byte↔time mapping (`OpusFormatDecoder.calculateByteOffset` — a binary search in the Phase 18 > **WebCodecs `AudioDecoder`** pipeline). The §3.1 unbounded-memory root cause (the scheduler's
> precomputed seek index), exactly the C5 case — *not* the exact CBR-WAV `byteRate` math, and *not* > push-only `AudioBuffer[]`) applies to **both** — but the Opus path adds a *second* accumulation locus
> approximate Ogg-page interpolation. **Correction (Daniel, 2026-06-23):** an earlier draft described the > upstream of the scheduler (the WebCodecs decode queue + `decodedQueue: AudioData[]`), so windowing it
> Opus mapping as "approximate page interpolation"; the Phase 18 seek-model resolution rejected that — Opus > is not the same mechanism as windowing WAV. See §3.1.
> seeking is **accurate**, backed by a precomputed seek index built at transcode time, so refill resolves to > 2. **The accurate index-driven Opus seek the original spec assumed Phase 21 would build is ALREADY
> the *exact* page offset. The windowed refill controller calls the **same** index resolver an explicit seek > LIVE.** Phase 18 ships `resolveOpusByteOffset` (binary-search the precomputed seek index in
> does (Phase 18 §3.4a D); a window opening away from byte 0 still decodes via the Phase 18 sidecar setup > `OpusSeekData`) → Range fetch → `OpusStreamDecoder.reinitializeForRangeContinuation(landingTime,
> header. Build the window machinery format-agnostically (§2 C3/C5) so it inherits Opus for free. > target)` with frame-accurate lead-trim. Opus seek is **accurate, not approximate** — and **already
> shipping**. Phase 21 does **not** build Opus seek; it **reuses** that live seek for window-miss
> refills.
>
> **Correction of stale spec language.** The original draft described Opus as a future-wired
> `OpusFormatDecoder.calculateByteOffset` joining the `IFormatDecoder` registry, with seek as "approximate
> vs accurate." All of that is now wrong against the landed code: Opus does **not** use `IFormatDecoder`
> (it diverged to the `IStreamingDecoder`/WebCodecs seam precisely because per-segment `decodeAudioData` is
> architecturally wrong for Opus — see `IStreamingDecoder.ts`), and its seek is accurate and shipping. The
> body below is rewritten to the two-path reality. **The headline is unchanged:** bound client memory to a
> sliding window regardless of stream length, for the canonical 1 GB mix, across both delivery formats.
---
@@ -31,18 +43,34 @@ Bound the **client memory** a playing track consumes to a small, configurable fo
**independent of total stream length** — so a 1 GB+ DJ MIX (Phase 9 `Mix` medium: a single long track)
plays without the whole decoded PCM accumulating in the browser.
**The defect, stated precisely.** The network path already streams in adaptive 16–64 KB chunks **The defect, stated precisely — and it now has two faces, one shared.** The network path already
(`StreamingAudioPlayerService.StreamAudioWithEarlyPlayback`) — that part is fine. The accumulation is on streams in adaptive 16–64 KB chunks (`StreamingAudioPlayerService.StreamAudioWithEarlyPlayback`) — that
the **decode side**: `PlaybackScheduler` holds `private buffers: AudioBuffer[]` and **never evicts** part is fine. The accumulation is on the **decode side**, and Phase 18 split the decode side into two
("Supports pause/resume/seek by **retaining all buffers**" — its own doc comment). Every 64 KB segment pipelines that both terminate at the same sink:
the `StreamDecoder` decodes is pushed via `addBuffer()` and kept for the life of the track. Decoded PCM
is **larger than the compressed-or-raw source** in memory (Web Audio `AudioBuffer` is 32-bit float per
sample per channel — a 16-bit stereo WAV roughly **doubles** in size once decoded), so a 1 GB WAV becomes
~2 GB of retained `AudioBuffer` float data. That is the OOM.
**One-line framing:** today the player decodes the whole track into memory and keeps it; Phase 21 makes - **The shared sink (both paths) — the unbounded scheduler.** `PlaybackScheduler` holds
it keep only a sliding forward window and discard what has already played, refilling on demand from the `private buffers: AudioBuffer[]` and **never evicts** ("Supports pause/resume/seek by **retaining all
Range primitive it already uses for seek. buffers**" — its own doc comment). Both decode paths call `scheduler.addBuffer()` (via
`AudioPlayer.processFormatChunk` for WAV/MP3/FLAC and `processOpusChunk` for Opus); nothing is ever
removed. Decoded PCM is **larger than the source** in memory (Web Audio `AudioBuffer` is 32-bit float
per sample per channel — a 16-bit stereo WAV roughly **doubles** once decoded; Opus decodes to the same
48 kHz float PCM regardless of how few bytes the *compressed* stream was). So a 1 GB WAV becomes ~2 GB
of retained float, **and a low-data Opus mix becomes the same ~2 GB of decoded float once played**
the compressed transfer is small, but the *decoded* footprint is identical. The scheduler is the OOM for
both. **This is the §3.1 root cause, unchanged from the original spec — it just now afflicts two
producers.**
- **The Opus-only second locus — upstream decode-ahead.** The Opus path accumulates *before* the
scheduler too: the WebCodecs `AudioDecoder` work queue (`decodeQueueSize`), the `decodedQueue:
AudioData[]` awaiting conversion, and the `OggDemuxer`'s partial-page state. Bounding the scheduler
alone does not bound these — they fill from the same C# `ReadAsync` loop, so they need their own
back-pressure (on the *demuxer/decoder feed*), not only the read loop's. WAV has no equivalent
upstream queue (its `StreamDecoder` decodes synchronously into the scheduler), so this is genuinely
Opus-specific.
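The Opus-only locus can be bounded by a check made before each chunk is handed to the demuxer/decoder feed. The sketch below is hypothetical: the function, both interfaces, and the limit values are assumptions for illustration. The only real API it mirrors is the WebCodecs `AudioDecoder.decodeQueueSize` property, which reports pending decode requests.

```typescript
// Hypothetical snapshot of the Opus pipeline's upstream backlog.
interface OpusBacklog {
  decodeQueueSize: number;    // mirrors WebCodecs AudioDecoder.decodeQueueSize
  decodedQueueLength: number; // decoded AudioData items still awaiting conversion
}

// Hypothetical limits; real values would come from the window-size policy.
interface DecodeAheadLimits {
  maxPendingDecodes: number;
  maxDecodedItems: number;
}

// Returns true only when the scheduler has room AND neither upstream queue is over its bound.
function mayFeedOpusDecoder(
  schedulerHasRoom: boolean,
  backlog: OpusBacklog,
  limits: DecodeAheadLimits,
): boolean {
  if (!schedulerHasRoom) return false; // the shared fill signal always wins
  return (
    backlog.decodeQueueSize < limits.maxPendingDecodes &&
    backlog.decodedQueueLength < limits.maxDecodedItems
  );
}

const limits: DecodeAheadLimits = { maxPendingDecodes: 8, maxDecodedItems: 16 };
console.log(mayFeedOpusDecoder(true, { decodeQueueSize: 2, decodedQueueLength: 3 }, limits));
console.log(mayFeedOpusDecoder(true, { decodeQueueSize: 8, decodedQueueLength: 3 }, limits));
console.log(mayFeedOpusDecoder(false, { decodeQueueSize: 0, decodedQueueLength: 0 }, limits));
```

The WAV path would consult only the scheduler signal; the two queue checks are the part with no WAV analogue.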
**One-line framing:** today the player decodes the whole track into memory and keeps it — true for both
formats; Phase 21 makes it keep only a sliding forward window and discard what has already played,
refilling on demand from the Range primitive both paths already use for seek (WAV via `IFormatDecoder`,
Opus via the live index-driven `resolveOpusByteOffset`).
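Both time-to-byte mappings named above are small enough to sketch. These are illustrations under stated assumptions, not the shipped `IFormatDecoder.calculateByteOffset` or `resolveOpusByteOffset`: the function names, the `SeekIndexEntry` shape, and the sample index values are all hypothetical.

```typescript
// Assumed index shape: entries sorted by time, each pointing at a page-start byte offset.
interface SeekIndexEntry {
  timeSeconds: number;
  byteOffset: number;
}

// Index-driven mapping (VBR-safe): binary search for the last entry at or before the target.
function resolveFromIndexSketch(
  index: SeekIndexEntry[],
  targetSeconds: number,
): { byteOffset: number; landingTimeSeconds: number } {
  let lo = 0;
  let hi = index.length - 1;
  let best = 0; // falls back to the first entry when the target precedes the index
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const entry = index[mid];
    if (entry !== undefined && entry.timeSeconds <= targetSeconds) {
      best = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  const landing = index[best];
  if (landing === undefined) throw new Error("empty seek index");
  // The gap between landingTimeSeconds and targetSeconds is what a decoder lead-trim would discard.
  return { byteOffset: landing.byteOffset, landingTimeSeconds: landing.timeSeconds };
}

// CBR mapping (WAV): linear in time, aligned down to a whole sample frame.
function resolveCbrOffsetSketch(
  headerBytes: number,
  byteRate: number,
  blockAlign: number,
  targetSeconds: number,
): number {
  const raw = Math.floor(targetSeconds * byteRate);
  return headerBytes + (raw - (raw % blockAlign));
}

const index: SeekIndexEntry[] = [
  { timeSeconds: 0, byteOffset: 0 },
  { timeSeconds: 1, byteOffset: 4000 },
  { timeSeconds: 2, byteOffset: 9000 },
  { timeSeconds: 3, byteOffset: 15000 },
];
console.log(resolveFromIndexSketch(index, 2.4));          // { byteOffset: 9000, landingTimeSeconds: 2 }
console.log(resolveCbrOffsetSketch(44, 176400, 4, 1.5));  // 264644
```

Either result feeds the same `Range: bytes=X-` request; the window layer only needs to know which resolver the active path uses.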
---
@@ -60,36 +88,51 @@ docs. This phase **modifies that seam** — so the contract it must preserve is
- **C2 — Playback start latency unchanged.** Today playback starts as soon as a configurable minimum
buffer count is queued (header-derived duration, not full-file). The window model must keep first-audio
latency at parity — bounding memory must not reintroduce a fetch-then-play stall.
- **C3 — The format-decoder abstraction is untouched.** `IFormatDecoder` owns all format-specific - **C3 — Neither decoder seam's contract is forked; windowing lives in the shared layer plus a thin
byte math; `AudioPlayer.createFormatDecoder` already dispatches on `Content-Type` (WAV/MP3/FLAC per-seam hook.** There are two decoder seams as of Phase 18: `IFormatDecoder` (WAV/MP3/FLAC, owns
decoders all wired today — verified 2026-06-23; an `OpusFormatDecoder` joins them in Phase 18). format byte math; `AudioPlayer.createFormatDecoder` dispatches on `Content-Type`) and `IStreamingDecoder`
Windowing lives in the (Opus, the WebCodecs pipeline; selected in `initializeStreaming` when the content type is
**format-agnostic** layer (`PlaybackScheduler` eviction + `StreamDecoder`/player refill `audio/ogg`/`audio/opus` and a sidecar is present). **The eviction half of windowing is fully shared**
orchestration); it must add **no** format-specific branches. A future wired MP3/FLAC decoder inherits it lives in `PlaybackScheduler`, which both seams feed identically via `addBuffer`, so eviction adds
windowing for free. **zero** format branches. **The back-pressure / decode-ahead half is necessarily seam-aware** — the WAV
path back-pressures the C# `ReadAsync` loop; the Opus path must additionally bound the WebCodecs
decode-ahead and the `decodedQueue` (§3.1). Express that as a **small uniform signal** ("the scheduler is
full, stop producing") that each decode path honors in its own way, rather than a windowing controller
that reaches into either decoder's internals. The goal the original C3 stated still holds — no
format-specific logic leaking into the *scheduler* — but the spec now acknowledges the producer side has
two shapes, not one.
- **C4 — Read-only playback only.** This is a memory-management change, not a UX change. No new
user-visible control, no change to seek/transport semantics beyond what the listener already
experiences. Seek must still feel identical.
- **C5 — Must window both delivery formats (WAV lossless AND Opus low-data).** Byte↔time mapping for - **C5 — Window both decode paths without forking the scheduler/seam, reusing the live index-driven
refill is exact and cheap for WAV (CBR: `byteRate` from the header). **Phase 18 (Opus) is sequenced seek for refill.** Both delivery formats must be windowed, and the byte↔time mapping each refill needs is
before this phase and is the concrete VBR driver here** — and its mapping is **also exact**, but by a **already accurate and already shipping** for both:
different mechanism: an Ogg Opus 320 stream has no linear time↔byte relationship, so - **WAV/MP3/FLAC** — `IFormatDecoder.calculateByteOffset` (CBR `byteRate` for WAV; the MP3/FLAC seek
`OpusFormatDecoder.calculateByteOffset` resolves via a **precomputed seek index** (granule→byte, built at accelerators for those), reached through `StreamDecoder.calculateByteOffset` / `AudioPlayer.seekBeyondBuffer`.
transcode; Phase 18 §3.4a), a binary search that returns the exact page offset — **not** an approximate - **Opus** — `resolveOpusByteOffset(activeOpusSidecar, t)` (binary search the precomputed granule→byte
page interpolation. (An earlier draft of this invariant said "approximate"; the Phase 18 seek-model seek index in `OpusSeekData`), returning an exact page-start offset **and** a `landingTimeSeconds` for
resolution, Daniel 2026-06-23, made Opus seeking accurate. Corrected here.) The window machinery must the decoder's frame-accurate lead-trim. This is **accurate, not approximate, and landed in Phase 18.**
express refill purely in terms of the decoder's existing `calculateByteOffset`, so the same code windows Phase 21 does **not** build either mapping. The window's refill trigger calls *whichever resolver the
WAV (via `byteRate`) and Opus (via the index) — **no WAV-special-cased offset math in the window layer**, active path already uses* — for Opus, the **same** `resolveOpusByteOffset` an explicit listener seek
and no approximation for either. A window that opens away from byte 0 must also prepend the decoder's calls (the live path in `AudioPlayer.seekBeyondBuffer`), so windowed refill is literally "a seek the
retained/sidecar setup header (Phase 18 §3.4a B) — the format-agnostic refill path already routes listener didn't initiate." A window opening away from byte 0 decodes correctly on the Opus path because
continuations through the decoder's header-carry, so this comes for free. (MP3/FLAC decoders are already the setup header (`OpusHead`/`OpusTags`) is already cached from the sidecar and re-applied by
wired in the registry too — the registry dispatches on content-type today; an `OpusFormatDecoder` joins `reinitializeForRangeContinuation` (Phase 18 §3.4a B); the WAV path re-applies its retained header the
them in Phase 18.) same way. **No new offset math, no approximation, no header re-fetch — all reused.** The invariant is
- **C6 — No regression to the single-instance JS decoder concurrency guarantees.** The current code is therefore *not* "make refill format-agnostic" (the two paths legitimately resolve offsets through
careful that only one streaming loop touches the single JS `StreamDecoder` at a time different code); it is **"reuse the live seek of each path verbatim; add only the eviction and the
(`DrainActiveStreamingTaskAsync`, the `_streamingCancellation` identity dance). Windowed refill refill *trigger*, never a second seek mechanism."**
introduces *more* mid-stream fetches; it must route through the **same** drain/cancellation discipline, - **C6 — No regression to the single-writer decoder concurrency guarantee — now covering both decoders.**
not around it. The C# loop is careful that only one streaming task feeds the active JS decoder at a time
(`DrainActiveStreamingTaskAsync`, the `_streamingCancellation` identity dance in
`StreamingAudioPlayerService`). This matters *more* for Opus: the WebCodecs `AudioDecoder` is stateful
and async — a `reset()`+`configure()` on a range-continuation (`reinitializeForRangeContinuation`) racing
a still-draining `push()` from a stale loop would corrupt inter-frame state, not merely deliver a wrong
buffer. Windowed refill introduces *more* mid-stream fetches against whichever decoder is active; every
one must route through the **same** drain/cancellation discipline, not around it. The discipline is
already decoder-agnostic at the C# layer (it cancels the loop, not the decoder), so this is a "keep using
it" invariant — but it is the rule most likely to be violated by a naive Opus refill, and is the hardest
failure to diagnose, so it is called out as a hard invariant for both paths.
- **C7 — The Mix visualizer's data source is independent and must stay that way.** The Phase 10/12
WebGL2 lava visualizer renders from a **preprocessed high-res waveform datum** fetched per-track
(`GET api/track/{entryKey}/waveform/high-res`), **not** from live decoded PCM. Confirmed: evicting
@@ -103,8 +146,10 @@ docs. This phase **modifies that seam** — so the contract it must preserve is
### 3.0 The mental model ### 3.0 The mental model
A track's audio is a byte range `[0, fileLength)` on disk. At any moment the listener is at playback A track's audio is a byte range `[0, fileLength)` on disk. At any moment the listener is at playback
position `P` (seconds → byte offset via the format decoder). The player should hold decoded position `P` (seconds → byte offset via the active path's resolver — `IFormatDecoder.calculateByteOffset`
`AudioBuffer`s only for a bounded window roughly `[P - back, P + ahead]`: for WAV/MP3/FLAC, `resolveOpusByteOffset` over the seek index for Opus). The player should hold decoded
`AudioBuffer`s only for a bounded window roughly `[P - back, P + ahead]` — and, on the Opus path, keep the
upstream WebCodecs decode queue near-empty too (§3.1):
- **forward fill (`ahead`)** — enough decoded lookahead that playback never starves (covers the existing - **forward fill (`ahead`)** — enough decoded lookahead that playback never starves (covers the existing
500 ms scheduler lookahead plus network jitter headroom); 500 ms scheduler lookahead plus network jitter headroom);
@@ -119,42 +164,64 @@ position `P` (seconds → byte offset via the format decoder). The player should
This is a **ring/sliding-window buffer keyed on playback position**, driven by high/low-water marks — This is a **ring/sliding-window buffer keyed on playback position**, driven by high/low-water marks —
the standard bounded-producer/bounded-consumer pattern, transplanted onto the decode→schedule seam. the standard bounded-producer/bounded-consumer pattern, transplanted onto the decode→schedule seam.
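The high/low-water behaviour described above can be sketched as a small hysteresis gate. This is a minimal illustration under stated assumptions, not the player's implementation: `WaterMarkGate`, `update`, and the threshold values are hypothetical names chosen for the example, and the real window sizes are still open policy questions (OQ1/OQ3).

```typescript
// Hypothetical sketch: class name, method names, and thresholds are illustrative
// assumptions, not identifiers from the player codebase.
type GateState = "filling" | "paused";

class WaterMarkGate {
  private state: GateState = "filling";
  private readonly lowWaterSec: number;
  private readonly highWaterSec: number;

  constructor(lowWaterSec: number, highWaterSec: number) {
    if (!(lowWaterSec >= 0 && lowWaterSec < highWaterSec)) {
      throw new RangeError("expected 0 <= lowWaterSec < highWaterSec");
    }
    this.lowWaterSec = lowWaterSec;
    this.highWaterSec = highWaterSec;
  }

  /** Feed the current decoded lookahead (seconds ahead of P); returns the new state. */
  update(lookaheadSec: number): GateState {
    if (this.state === "filling" && lookaheadSec >= this.highWaterSec) {
      this.state = "paused"; // window full: the producer stops reading/decoding
    } else if (this.state === "paused" && lookaheadSec <= this.lowWaterSec) {
      this.state = "filling"; // playback drained the window: the producer resumes
    }
    return this.state;
  }
}
```

The gap between the two marks is the point of the design: a single threshold would flip the producer on and off at every buffer boundary, whereas two marks let it run in longer bursts.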
### 3.1 Why this is a generalization of seek-beyond-buffer, not a new mechanism ### 3.1 Why refill is a generalization of seek-beyond-buffer, not a new mechanism — for both paths
The seek-beyond-buffer path already does **every primitive** the window needs, just triggered manually The seek-beyond-buffer path already does **every refill primitive** the window needs, just triggered
and one-shot: manually and one-shot. As of Phase 18 each primitive has a WAV branch and an Opus branch, both live:
| Window operation | Existing seek-beyond-buffer machinery it reuses | | Window operation | WAV/MP3/FLAC machinery reused | Opus machinery reused (Phase 18, landed) |
|-------------------------------|-----------------------------------------------------------------------------------| |-------------------------------|--------------------------------------------------------------------|--------------------------------------------------------------------------------------|
| Discard buffers, keep offset | `PlaybackScheduler.clearForSeek()` + `setPlaybackOffset()` (clears buffers, retains the absolute-time anchor) | | Discard buffers, keep offset | `PlaybackScheduler.clearForSeek()` + `setPlaybackOffset()` | *same* — the scheduler is shared |
| Fetch from a byte offset | `TrackMediaClient.GetTrackMedia(key, byteOffset)` → `Range: bytes=X-` → 206 | | Fetch from a byte offset | `TrackMediaClient` → `Range: bytes=X-` → 206 | *same* (with `?format=opus`) — the Range path is shared |
| Decode a header-less body | `StreamDecoder.reinitializeForRangeContinuation(remainingByteLength)` | | Map time → byte offset | `StreamDecoder.calculateByteOffset()` → `IFormatDecoder` | `resolveOpusByteOffset(activeOpusSidecar, t)` (index binary search → exact page) |
| Map time → byte offset | `StreamDecoder.calculateByteOffset()` → `IFormatDecoder.calculateByteOffset()` | | Decode a header-less body | `StreamDecoder.reinitializeForRangeContinuation(len)` | `OpusStreamDecoder.reinitializeForRangeContinuation(landingTime, target)` (demux/codec reset + lead-trim) |
| Single-loop safety on refetch | `_streamingCancellation` swap + `DrainActiveStreamingTaskAsync()` | | Single-loop safety on refetch | `_streamingCancellation` swap + `DrainActiveStreamingTaskAsync()` | *same* — the C# discipline is decoder-agnostic |
The difference is **eviction does not exist yet** (the scheduler only ever `clear()`s wholesale) and The genuinely-new work, by path:
**refill is one-shot** (a seek, not a continuous low-water-triggered loop). So the new work is two
seams: a *partial-evict* on the scheduler, and a *position-driven refill controller* on the player. The - **Shared (both paths):** *partial eviction* on `PlaybackScheduler` (today it only ever `clear()`s
fetch/decode/offset plumbing is reused verbatim. wholesale), and a *position-driven refill trigger* (a continuous low-water loop, not a one-shot seek).
- **WAV path:** *back-pressure on the C# `ReadAsync` loop* — stop reading the socket above the high-water
mark, resume below low-water. WAV's `StreamDecoder` decodes synchronously into the scheduler, so the
read loop is the *only* producer to throttle; pausing `ReadAsync` bounds it fully.
- **Opus path:** *the same C# back-pressure, plus bounding the WebCodecs decode-ahead.* Throttling
`ReadAsync` alone is **not sufficient** for Opus, because `OpusStreamDecoder.push()` is async and the
WebCodecs `AudioDecoder` keeps its own internal work queue (`decodeQueueSize`) plus a `decodedQueue:
AudioData[]` of decoded-but-not-yet-converted frames. The Opus producer must also stop *feeding the
decoder* (stop demuxing/decoding new packets) when the scheduler is full, and resume below low-water —
back-pressure on the **demuxer/decoder feed**, not only on the socket read. This is the one place the
two paths' windowing genuinely diverges.
Everything else — the fetch, the offset resolution, the header-carry continuation, the single-loop
cancellation safety — is **reused verbatim** on both paths. Phase 21 builds eviction + the refill trigger
+ the (per-path) back-pressure; it builds **no** new fetch, offset, or seek mechanism.
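To make the two "map time → byte offset" branches concrete, here is a rough sketch of both resolvers. Everything in it is an assumption made for illustration: the `SeekPoint` shape, the function names, and the parameters are hypothetical and do not reproduce `OpusSeekData`, `resolveOpusByteOffset`, or `IFormatDecoder.calculateByteOffset`. The indexed branch returns the landing entry, not just the offset, because the landing time is what a lead-trim to the requested target would need.

```typescript
// Hypothetical shapes and names, for illustration only.
interface SeekPoint {
  timeSec: number;    // presentation time at which this indexed page starts
  byteOffset: number; // byte offset of that page in the file
}

/** Indexed (VBR-safe) branch: last index entry at or before the target time. */
function resolveIndexedOffset(index: readonly SeekPoint[], targetSec: number): SeekPoint {
  if (index.length === 0) throw new Error("empty seek index");
  let lo = 0;
  let hi = index.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >>> 1; // upper midpoint, so the loop always shrinks
    if (index[mid].timeSec <= targetSec) lo = mid;
    else hi = mid - 1;
  }
  return index[lo]; // clamps to the first entry when targetSec precedes it
}

/** CBR branch: plain byte-rate arithmetic, snapped down to a whole sample frame. */
function resolveCbrOffset(
  headerBytes: number,
  byteRate: number,
  blockAlign: number,
  targetSec: number,
): number {
  const raw = Math.floor(Math.max(0, targetSec) * byteRate);
  return headerBytes + (raw - (raw % blockAlign));
}
```

Either result feeds the same `Range: bytes=X-` request, which is why refill needs no offset math of its own.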
### 3.2 The three candidate directions ### 3.2 The three candidate directions
Per file convention the alternatives are recorded; the recommendation follows. Per file convention the alternatives are recorded; the recommendation follows.
**Direction A — Sliding window on the existing single forward stream (recommended).** **Direction A — Sliding window on the existing single forward stream (recommended).**
Keep the current model where the C# loop reads one forward HTTP stream and pumps chunks into the JS Keep the current model where the C# loop reads one forward HTTP stream and pumps chunks into the active
decoder. Add two things: (1) `PlaybackScheduler` gains *partial eviction* — drop buffers whose JS decoder. Add three things: (1) `PlaybackScheduler` gains *partial eviction* — drop buffers whose
absolute-time end is older than `P - back`, adjusting its index bookkeeping so `getCurrentPosition()` absolute-time end is older than `P - back`, adjusting its index bookkeeping so `getCurrentPosition()`
and scheduling stay correct against a buffer array that no longer starts at index 0; (2) a and scheduling stay correct against a buffer array that no longer starts at index 0 (**shared by both
*back-pressure* signal — when forward decoded lookahead exceeds the high-water mark, the C# loop paths** — the scheduler is the common sink); (2) *back-pressure on the C# read loop* — when forward
**pauses reading** the HTTP stream (stops calling `ReadAsync`) until playback drains it below low-water, decoded lookahead exceeds the high-water mark, the C# loop **pauses reading** the HTTP stream (stops
then resumes. Memory is bounded by high-water + back-retain. Seek-back beyond the retained window falls calling `ReadAsync`) until playback drains it below low-water, then resumes; (3) **for the Opus path
through to the **existing** seek-beyond-buffer path unchanged. only, back-pressure on the WebCodecs decode-ahead** — the producer also stops demuxing/decoding new
packets when the scheduler is full, so the `AudioDecoder` work queue and `decodedQueue` do not balloon
behind a throttled socket. Memory is bounded by high-water + back-retain on both paths. Seek-back beyond
the retained window falls through to the **existing** seek-beyond-buffer path (the right one per format)
unchanged.
*Why recommended:* smallest change to the load-bearing seam; reuses the live forward stream (no extra *Why recommended:* smallest change to the load-bearing seam; reuses the live forward stream (no extra
connections in the common case); eviction and back-pressure are the only genuinely new mechanisms, and connections in the common case); eviction and back-pressure are the only genuinely new mechanisms, all
both are local (one to the scheduler, one to the read loop). Back-pressure via "stop reading the socket" local (the scheduler; the read loop; for Opus, the demux/decode feed). Back-pressure via "stop reading
is exactly how TCP flow control already wants to behave — pausing `ReadAsync` lets the kernel window the socket" is exactly how TCP flow control already wants to behave — pausing `ReadAsync` lets the kernel
close; we are not fighting the transport. window close; we are not fighting the transport. The Opus decode-ahead bound is the one addition Phase 18
forces, and it is local to the Opus producer.
*Open question it raises (OQ6, new):* whether the two paths' back-pressure is driven by **one shared
window controller** that exposes a "scheduler full / drained" signal both producers poll, or by **two
parallel implementations** sharing only the eviction code. Recommend the **shared signal** — see §6 OQ6.
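The index-bookkeeping concern in Direction A (a buffer array that no longer starts at index 0) can be illustrated with a small store that tracks how many buffers have been evicted. This is a sketch under assumptions, not the `PlaybackScheduler` design: `SlidingBufferStore`, `TimedBuffer`, and every method name are hypothetical, and real `AudioBuffer` payloads are replaced by durations to keep the example self-contained.

```typescript
// Hypothetical sketch: models buffers by duration only; not the real scheduler.
interface TimedBuffer {
  startSec: number;    // absolute start time within the track
  durationSec: number;
}

class SlidingBufferStore {
  private buffers: TimedBuffer[] = [];
  private baseIndex = 0;        // absolute index of buffers[0] after evictions
  private nextStartSec: number; // absolute time at which the next buffer starts

  constructor(startOffsetSec = 0) {
    this.nextStartSec = startOffsetSec;
  }

  /** Append a decoded buffer; returns its absolute (eviction-stable) index. */
  add(durationSec: number): number {
    this.buffers.push({ startSec: this.nextStartSec, durationSec });
    this.nextStartSec += durationSec;
    return this.baseIndex + this.buffers.length - 1;
  }

  /** Drop every buffer that ends at or before cutoffSec (that is, P - back). */
  evictBefore(cutoffSec: number): number {
    let n = 0;
    while (
      n < this.buffers.length &&
      this.buffers[n].startSec + this.buffers[n].durationSec <= cutoffSec
    ) {
      n++;
    }
    this.buffers.splice(0, n);
    this.baseIndex += n; // absolute indices held by callers stay valid
    return n;
  }

  /** Look up by absolute index; undefined means evicted or not yet added. */
  get(absoluteIndex: number): TimedBuffer | undefined {
    return this.buffers[absoluteIndex - this.baseIndex];
  }

  get retainedStartSec(): number {
    return this.buffers.length > 0 ? this.buffers[0].startSec : this.nextStartSec;
  }

  get retainedCount(): number {
    return this.buffers.length;
  }
}
```

Because each buffer carries its own absolute start time, position queries never depend on summing durations from index 0, so eviction cannot shift the time anchor. A seek target earlier than `retainedStartSec` is the window miss that falls through to the existing seek-beyond-buffer path.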
**Direction B — Discrete window segments, each its own Range fetch.** **Direction B — Discrete window segments, each its own Range fetch.**
Treat the file as fixed-size byte segments (e.g. 4 MB). Hold N decoded segments around `P`; fetch the Treat the file as fixed-size byte segments (e.g. 4 MB). Hold N decoded segments around `P`; fetch the
@@ -177,32 +244,41 @@ problem.
wants containerized formats (fragmented MP4/WebM, or MP3/AAC elementary streams). The entire bespoke wants containerized formats (fragmented MP4/WebM, or MP3/AAC elementary streams). The entire bespoke
visualizer/spectrum graph is wired to the Web Audio `AudioContext`, not a `<media>` element. Adopting visualizer/spectrum graph is wired to the Web Audio `AudioContext`, not a `<media>` element. Adopting
MSE is a **rewrite of the playback substrate**, not a windowing change. It *looked* like the real MSE is a **rewrite of the playback substrate**, not a windowing change. It *looked* like the real
long-term answer once compressed delivery arrived — but Daniel has decided compressed delivery long-term answer once compressed delivery arrived — but compressed delivery (**Phase 18 Opus, now
(**Phase 18 Opus**) will feed the **same bespoke graph** via the `IFormatDecoder` seam, so the landed**) feeds the **same bespoke graph** via the WebCodecs `IStreamingDecoder` seam (parallel to the
compressed-delivery move that would have justified MSE happens *without* surrendering the graph. **The WAV `IFormatDecoder` seam, both terminating at the shared `PlaybackScheduler`), so the compressed-delivery
bespoke graph is a deliberate long-term commitment; MSE is rejected.** Direction A is therefore the move that would have justified MSE happened *without* surrendering the graph. Notably, Phase 18 chose a
permanent destination, not a stopgap that MSE will retire. Recorded as considered-and-declined. **WebCodecs `AudioDecoder`** for Opus rather than `decodeAudioData` — which is itself the "use the platform
codec, keep the bespoke graph" move, but at the *decoder* granularity, not the *media-element* granularity
MSE would impose. **The bespoke graph is a deliberate long-term commitment; MSE is rejected.** Direction A
is therefore the permanent destination, not a stopgap that MSE will retire. Recorded as
considered-and-declined.
### 3.3 Recommended direction: A, with B held as the documented fallback ### 3.3 Recommended direction: A, with B held as the documented fallback
Direction A is the smallest coherent change that hits the headline (bounded memory under a 1 GB stream) Direction A is the smallest coherent change that hits the headline (bounded memory under a 1 GB stream)
while honoring C1–C7. It keeps the live forward stream, reuses the seek-beyond-buffer path for the only while honoring C1–C7. It keeps the live forward stream, reuses each path's seek-beyond-buffer machinery
genuinely random-access case (seek-back past the retained tail), and isolates the two new mechanisms. for the only genuinely random-access case (seek-back past the retained tail), and isolates the new
**The final architecture and the exact eviction/back-pressure API are staff-engineer's call at mechanisms (eviction shared; back-pressure per path). **The final architecture and the exact
implementation** (per file convention); this spec fixes the *shape* and the invariants, not the method eviction/back-pressure API are staff-engineer's call at implementation** (per file convention); this spec
signatures. fixes the *shape* and the invariants, not the method signatures.
### 3.4 SOLID / road-not-taken rationale ### 3.4 SOLID / road-not-taken rationale
- **SRP, preserved.** Eviction is a `PlaybackScheduler` concern (it already owns buffer storage); refill - **SRP, preserved.** Eviction is a `PlaybackScheduler` concern (it already owns buffer storage, and is
orchestration is a player-service/`StreamDecoder` concern (they already own the fetch loop); byte↔time the single shared sink both decode paths feed); refill orchestration is a player-service concern (it
math stays in `IFormatDecoder`. No responsibility crosses a boundary it does not already own. already owns the C# fetch loop and the seek dispatch); byte↔time math stays where each path already keeps
- **OCP, via C3/C5.** Windowing added in the format-agnostic layer means wiring MP3/FLAC later changes it — `IFormatDecoder.calculateByteOffset` for WAV/MP3/FLAC, `resolveOpusByteOffset` (over `OpusSeekData`)
zero window code. The window expresses refill through `calculateByteOffset` — the one seam the for Opus. No responsibility crosses a boundary it does not already own.
decoders already implement. - **OCP, via the shared sink + the live per-path seek.** Eviction added at the scheduler changes zero
- **The seam stays single-writer (C6).** Every new refetch routes through the existing decoder code on either path. Refill reuses each path's *already-implemented* offset resolver — Phase 21
cancellation/drain discipline, so "only one loop touches the JS decoder" remains true. This is the adds no offset math to either seam. The one place windowing is not purely additive is the Opus
rule most likely to be violated by a naive implementation and is called out as a hard invariant. decode-ahead bound (§3.1), which lives inside the Opus producer, not in the shared layer.
- **The seam stays single-writer (C6) — for both decoders.** Every new refetch routes through the existing
C# cancellation/drain discipline, so "only one loop feeds the active decoder" remains true for the WAV
`StreamDecoder` and the stateful Opus `AudioDecoder` alike. This is the rule most likely to be violated
by a naive Opus refill (a stale `push()` racing a `reset()`+`configure()`), and is called out as a hard
invariant.
- **Road not taken — eager full decode with a memory cap that just stops decoding.** Tempting (decode - **Road not taken — eager full decode with a memory cap that just stops decoding.** Tempting (decode
until you hit a byte budget, then stop) but it breaks playback of long tracks past the cap entirely — until you hit a byte budget, then stop) but it breaks playback of long tracks past the cap entirely —
it bounds memory by *refusing to play the rest*, not by sliding. Rejected: it is a degradation, not a it bounds memory by *refusing to play the rest*, not by sliding. Rejected: it is a degradation, not a
@@ -213,8 +289,15 @@ signatures.
## 4. Use cases ## 4. Use cases
- **UC1 — Play a 1 GB+ DJ MIX start to finish (the headline).** Memory stays bounded throughout; the - **UC1 — Play a 1 GB+ DJ MIX start to finish (the headline).** Memory stays bounded throughout; the
listener experiences continuous playback identical to a short track. listener experiences continuous playback identical to a short track. **Holds in both formats** — the
- **UC2 — Seek forward within a long track.** Already handled by seek-beyond-buffer; under windowing the lossless WAV mix (~2 GB decoded if unbounded) and the low-data Opus mix (small transfer, but the *same*
~2 GB decoded float once played, so it needs windowing just as much; see §1).
- **UC1-Opus — The same mix streamed as Opus, windowed.** The low-data win (Phase 18) shrinks the
*transfer*; Phase 21 shrinks the *decoded footprint*. The two compound: a metered-connection listener on
Opus gets both the small download and the bounded memory. Windowing the Opus path additionally bounds the
WebCodecs decode-ahead and `decodedQueue`, not only the scheduler (§3.1).
- **UC2 — Seek forward within a long track.** Already handled by seek-beyond-buffer (the right resolver per
format — `IFormatDecoder` for WAV, the live `resolveOpusByteOffset` for Opus); under windowing the
forward seek clears the window and refills at the target — no behavior change, now with eviction so the forward seek clears the window and refills at the target — no behavior change, now with eviction so the
pre-seek region does not linger. pre-seek region does not linger.
- **UC3 — Seek back a few seconds.** Served from the back-retain window with **no** network refetch - **UC3 — Seek back a few seconds.** Served from the back-retain window with **no** network refetch
@@ -248,8 +331,11 @@ This phase touches the **same decoder/scheduler seam** as the deferred Phase 1.3
- **1.5 Gapless (deferred).** Sample-accurate hand-off of the next track's first buffer at the current - **1.5 Gapless (deferred).** Sample-accurate hand-off of the next track's first buffer at the current
track's last buffer. Windowing changes *which* buffers are retained but not the hand-off mechanism; track's last buffer. Windowing changes *which* buffers are retained but not the hand-off mechanism;
the only care point is that the current track's **final** window must not be evicted before the gapless the only care point is that the current track's **final** window must not be evicted before the gapless
boundary is scheduled. A minor invariant for whoever builds 1.5, not a blocker. Note 1.5's existing boundary is scheduled. A minor invariant for whoever builds 1.5, not a blocker. **Phase 18 note:** the
WAV-only caveat stands. former "1.5 is WAV-only" caveat is superseded — Opus is live, and it has its own encoder pre-skip/priming
(handled once by the WebCodecs decoder, see `OpusStreamDecoder.ts`), so a gapless Opus hand-off must
respect the end-trim against the sidecar's authoritative total length. That is 1.5's problem to absorb,
not Phase 21's; flagged so 1.5 inherits it.
- **1.6 Track-skip on error (deferred).** *Windowing enlarges the error surface — call this out.* Today - **1.6 Track-skip on error (deferred).** *Windowing enlarges the error surface — call this out.* Today
a fetch failure happens at load (one fetch) or at a user seek (one fetch). Windowed refill issues a fetch failure happens at load (one fetch) or at a user seek (one fetch). Windowed refill issues
**mid-stream** fetches the listener did not initiate; one of those can fail at byte 700 M of a 1 GB **mid-stream** fetches the listener did not initiate; one of those can fail at byte 700 M of a 1 GB
@@ -260,9 +346,12 @@ This phase touches the **same decoder/scheduler seam** as the deferred Phase 1.3
failure handling into Phase 21's acceptance criteria** (AC6) rather than leaving it entirely to 1.6 — failure handling into Phase 21's acceptance criteria** (AC6) rather than leaving it entirely to 1.6 —
it is created by this phase. it is created by this phase.
- **1.7 Safari compatibility (deferred).** Windowing adds no new Safari-specific surface beyond what the - **1.7 Safari compatibility (deferred).** Windowing adds no new Safari-specific surface beyond what the
streaming path already has. The one adjacency: more frequent `AudioContext` activity during refill streaming path already has. Two adjacencies, both Phase-18-introduced: (a) more frequent `AudioContext`
should be checked against the older-Safari `webkitAudioContext` quirks when 1.7 is addressed — note it, activity during refill should be checked against older-Safari `webkitAudioContext` quirks; (b) the Opus
do not block on it. path depends on **WebCodecs `AudioDecoder`**, whose Safari availability is narrower than `decodeAudioData`
Ogg-Opus support — Phase 18's capability gate already falls a non-WebCodecs browser back to the lossless
WAV path, so a Safari that can't run the Opus pipeline windows the *WAV* path (which has no decode-ahead
locus, only the scheduler), i.e. the simpler windowing case. Note it; do not block on it.
--- ---
@@ -294,22 +383,50 @@ These are policy calls with user-visible or resource trade-offs — flagged rath
adopt MSE. The bespoke Web Audio decode→schedule graph stays — it is bespoke by deliberate choice, a adopt MSE. The bespoke Web Audio decode→schedule graph stays — it is bespoke by deliberate choice, a
long-term commitment, not a stopgap.** Daniel's rationale: the player is intentionally a custom long-term commitment, not a stopgap.** Daniel's rationale: the player is intentionally a custom
graph, not an HTML `<media>` element; the compressed-delivery move that *would* have made MSE graph, not an HTML `<media>` element; the compressed-delivery move that *would* have made MSE
tempting is being met instead by **Phase 18 (Opus low-data path)** feeding the **same bespoke graph** tempting was met instead by **Phase 18 (Opus low-data path, now landed)** feeding the **same bespoke
through the `IFormatDecoder` seam — so compressed delivery arrives *without* surrendering the graph. graph** through the WebCodecs `IStreamingDecoder` seam (parallel to the WAV `IFormatDecoder` seam) — so
Consequence for this phase: Direction A (the hand-rolled sliding window) is the destination, not a compressed delivery arrived *without* surrendering the graph. Consequence for this phase: Direction A
placeholder; invest in it as permanent machinery. It will window both the WAV and the Opus path (the hand-rolled sliding window) is the destination, not a placeholder; invest in it as permanent
(the sequencing note at the top). Direction C is recorded as **considered and declined** per file machinery. It windows both the WAV and the Opus path (the header note). Direction C is recorded as
convention; kept visible so a future reader sees the road not taken and why. **considered and declined** per file convention; kept visible so a future reader sees the road not taken
`[RESOLVED — bespoke graph retained; MSE rejected]` and why. `[RESOLVED — bespoke graph retained; MSE rejected]`
- **OQ6 — One window controller for both decode paths, or two? (NEW — raised by the Phase 18 two-path
reality.)** Eviction is unambiguously shared (the scheduler is the one sink). Back-pressure is not: the
WAV path throttles the C# `ReadAsync` loop; the Opus path must *also* throttle the WebCodecs
decode-ahead (§3.1). Should there be **one window controller** exposing a uniform "scheduler full /
drained" signal that both producers honor in their own way (recommended — keeps the *policy* — window
sizes, water-marks, OQ1/OQ3 — in one place, with two thin per-path back-pressure hooks), or **two
parallel windowing implementations** sharing only the eviction code (simpler per-path, but duplicates the
water-mark logic and risks the two drifting)? Recommend the **shared controller + per-path hook**. This
is more an architecture call than a product call — flagged for staff-engineer at implementation, with the
recommendation as the default. `[staff-engineer call; recommendation: shared controller]`
- **OQ7 — How does the Opus WebCodecs decode-ahead bound interact with scheduler eviction? (NEW; technical,
for staff-engineer.)** The Opus producer has two queues to bound (the `AudioDecoder` work queue and
`decodedQueue: AudioData[]`) *plus* the shared scheduler. The clean rule is "stop feeding the decoder when
decoded-lookahead-in-the-scheduler exceeds high-water" — i.e. the **scheduler's** fill level is the
single back-pressure signal, and the upstream Opus queues are kept near-empty by simply not demuxing
ahead. The alternative (let the decoder run ahead into `decodedQueue` and bound *that* separately) adds a
second budget to tune and a second eviction point. Recommend the former: **one fill signal (scheduler
decoded-lookahead), drive both the read-loop pause and the demux/decode pause from it.** Confirm at
implementation that the WebCodecs decoder tolerates being starved of input mid-stream and resumes cleanly
(it should — it is fed packet-by-packet via `decode()`), and that `decodedQueue` is drained promptly so
it never holds more than one `push()` worth. `[staff-engineer call; recommendation: single
scheduler-fill signal]`
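The rule recommended in OQ7 (one fill signal, with the decoder kept near-empty by not feeding it) can be sketched as a gated packet feed. This is only an illustration under stated assumptions: `GatedPacketFeed` and its members are hypothetical, decoding is modelled as a synchronous callback that returns the seconds of audio a packet contributes, and the real WebCodecs `AudioDecoder` delivers output asynchronously, which the sketch does not model.

```typescript
// Hypothetical sketch: synchronous stand-in for an async demux/decode feed.
class GatedPacketFeed<P> {
  private readonly pending: P[] = []; // demuxed but not yet decoded (compressed, small)
  private readonly highWaterSec: number;
  private readonly decodeOne: (packet: P) => number; // returns decoded seconds added

  constructor(highWaterSec: number, decodeOne: (packet: P) => number) {
    this.highWaterSec = highWaterSec;
    this.decodeOne = decodeOne;
  }

  enqueue(packet: P): void {
    this.pending.push(packet);
  }

  /**
   * Feed packets only while the scheduler's decoded lookahead is below
   * high-water. Returns how many packets were handed to the decoder.
   */
  pump(schedulerLookaheadSec: number): number {
    let fed = 0;
    let projected = schedulerLookaheadSec;
    while (this.pending.length > 0 && projected < this.highWaterSec) {
      const packet = this.pending.shift() as P;
      projected += this.decodeOne(packet);
      fed++;
    }
    return fed;
  }

  get pendingCount(): number {
    return this.pending.length;
  }
}
```

In this model the only budget to tune is the scheduler's high-water mark. The `pending` queue stays bounded only if the read loop is paused by the same signal, which is why the two throttles are driven together.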
--- ---
## 7. Acceptance criteria ## 7. Acceptance criteria
- **AC1 (headline) — Bounded memory under a 1 GB stream.** Playing a 1 GB+ WAV mix start to finish, the - **AC1 (headline) — Bounded memory under a 1 GB stream, in BOTH formats.** Playing a 1 GB+ mix start to
browser tab's retained decoded-audio memory stays bounded to the configured window (not growing toward finish — **as lossless WAV and as low-data Opus** — the browser tab's retained decoded-audio memory
~2 GB). Verifiable via browser memory tooling: peak decoded-audio footprint is independent of track stays bounded to the configured window (not growing toward ~2 GB). Verifiable via browser memory tooling:
length and tracks the window-size policy, not the file size. peak decoded-audio footprint is independent of track length and tracks the window-size policy, not the
file size. The Opus case must be verified explicitly — its small *transfer* does not imply a small
*decoded* footprint (§1), so "Opus already streams small" is **not** sufficient.
- **AC1-Opus — The Opus upstream decode-ahead is bounded too (§3.1 / OQ7).** Under a long Opus stream, the
WebCodecs decode queue and `decodedQueue` do not grow unboundedly behind the scheduler — back-pressure
reaches the demux/decode feed, not only the scheduler. Verifiable: the upstream queues stay near-empty
(one `push()` worth) regardless of stream length.
- **AC2 — Playback-start latency at parity (C2).** First-audio latency for a track is unchanged from - **AC2 — Playback-start latency at parity (C2).** First-audio latency for a track is unchanged from
pre-windowing (within noise). Windowing does not introduce a fetch-then-play stall. pre-windowing (within noise). Windowing does not introduce a fetch-then-play stall.
- **AC3 — Continuous playback, no starvation.** A long mix plays edge to edge with no audible gaps, - **AC3 — Continuous playback, no starvation.** A long mix plays edge to edge with no audible gaps,
@@ -324,41 +441,70 @@ These are policy calls with user-visible or resource trade-offs — flagged rath
"playing" with a starved scheduler). It must not silently hang. "playing" with a starved scheduler). It must not silently hang.
- **AC7 — The Mix visualizer is unaffected (C7).** With the lava visualizer running on a long mix, the - **AC7 — The Mix visualizer is unaffected (C7).** With the lava visualizer running on a long mix, the
visualizer renders identically (it reads the preprocessed datum, never the evicted buffers). visualizer renders identically (it reads the preprocessed datum, never the evicted buffers).
- **AC8 — Single-decoder concurrency invariant holds (C6).** Under rapid seek + refill activity, no - **AC8 — Single-writer decoder concurrency invariant holds (C6) — both decoders.** Under rapid seek +
interleaved `ProcessStreamingChunk` calls corrupt the single JS decoder (the existing drain/cancel refill activity, no interleaved `ProcessStreamingChunk` / `push` calls corrupt the active decoder — the
discipline still governs every fetch). existing drain/cancel discipline still governs every fetch. **For Opus this is stricter:** no stale
`push()` may land against the WebCodecs `AudioDecoder` across a `reinitializeForRangeContinuation`
reset+reconfigure (which would corrupt inter-frame state, not just a buffer). Verify under a rapid
seek-storm on an Opus mix specifically.
--- ---
## 8. Wave decomposition ## 8. Wave decomposition
**Decomposition choice: split by *concern* (eviction → back-pressure → seek-back refill → validate), not
by *path* (WAV-track vs Opus-track).** Rationale: the eviction concern (21.1) is genuinely shared — the
scheduler is the one sink both paths feed — so a path-split would duplicate the hardest correctness work or
arbitrarily assign it to one track. The concern spine keeps that shared work as a single cold-start wave
and lets the *one* genuinely path-divergent concern (back-pressure, 21.2) carry an explicit two-track
split *inside* the wave rather than fracturing the whole phase. This also matches how the seek-back refill
(21.3) reuses each path's already-live seek — it is one concern (window-miss → refetch) with a per-path
resolver underneath, not two features. The spine is unchanged from the original spec; the mechanisms
inside 21.2 and 21.3 are made correct for both paths.
Dependency shape: `21.1 → 21.2 → 21.3`, with `21.4` validating the whole. 21.1 is the cold-start Dependency shape: `21.1 → 21.2 → 21.3`, with `21.4` validating the whole. 21.1 is the cold-start
prerequisite and the load-bearing change; the rest layer on it. prerequisite and the load-bearing change; the rest layer on it.
- **21.1 — Partial eviction in `PlaybackScheduler` (cold-start; the load-bearing change).** Give the - **21.1 — Partial eviction in `PlaybackScheduler` (cold-start; the load-bearing change; SHARED by both
scheduler the ability to drop already-played buffers and keep its position/index bookkeeping correct paths).** Give the scheduler the ability to drop already-played buffers and keep its position/index
against a buffer array that no longer begins at absolute time 0 (today `getCurrentPosition`, bookkeeping correct against a buffer array that no longer begins at absolute time 0 (today
`playFromPosition`, and the scheduling loop all assume `buffers[0]` is the track start). This is the `getCurrentPosition`, `playFromPosition`, and the scheduling loop all assume `buffers[0]` is the track
hardest correctness work in the phase — the time-anchor math must stay exact through eviction. No start). This is the hardest correctness work in the phase — the time-anchor math must stay exact through
refill yet; with eviction alone and the forward read loop unchanged, this is provably memory-bounded eviction. Because both decode paths feed the scheduler identically via `addBuffer`, **eviction is written
for the *played* region. **Independent of the §6 open questions** — it can begin immediately; the once and serves both** — no per-path branch. No refill yet; with eviction alone and the forward producers
window *sizes* (OQ1/OQ3) are parameters fed in later. Settled and cold-start. unchanged, this is provably memory-bounded for the *played* region on both paths. **Independent of the §6
- **21.2 — Back-pressure on the forward read loop (the bound on the *unplayed* region).** Make the C# open questions** — it can begin immediately; the window *sizes* (OQ1/OQ3) are parameters fed in later.
`StreamAudioWithEarlyPlayback` loop stop calling `ReadAsync` when forward decoded lookahead exceeds the Settled and cold-start.
high-water mark, and resume below low-water. Together with 21.1, this bounds *both* the played and - **21.2 — Back-pressure (the bound on the *unplayed* region) — two tracks, one signal.** Bound the
unplayed sides — the full memory guarantee (AC1). Must route resume/pause through the existing not-yet-played decoded audio by stopping production above a high-water mark and resuming below low-water,
cancellation-safe single-loop discipline (C6). **Depends on 21.1** (eviction must exist so the drained driven by the scheduler's decoded-lookahead fill (OQ7). The fill *signal* is shared; the *throttle* has
region is reclaimed, not merely un-read). two sites because Phase 18 gave the two paths different producers:
- **21.3 — Seek-back-past-window refill (close the random-access case).** Wire UC4 — when a backward - **21.2a — C# read-loop back-pressure (serves both paths).** Make `StreamAudioWithEarlyPlayback` stop
seek lands earlier than the retained tail, refetch via the existing seek-beyond-buffer Range path calling `ReadAsync` above high-water and resume below low-water. Routes resume/pause through the
pointed at the earlier offset, and the minimal AC6 refill-failure handling. Mostly **reuse** of the existing cancellation-safe single-loop discipline (C6). For the WAV path this is *sufficient* (its
landed seek path; the new work is the trigger (window-miss detection) and the clean-failure path. `StreamDecoder` decodes synchronously into the scheduler).
**Depends on 21.1 + 21.2** (needs the window boundaries they define). - **21.2b — Opus decode-ahead back-pressure (Opus path only).** Additionally stop demuxing/decoding new
- **21.4 — Validation pass against the 1 GB target (acceptance).** Exercise AC1–AC8 against a real 1 GB+ packets when the same fill signal is over high-water, so the WebCodecs decode queue and `decodedQueue`
mix: memory profiling (AC1), latency parity (AC2), edge-to-edge playback (AC3), the seek matrix do not balloon behind a throttled socket (§3.1, OQ7). This is the one mechanism with no WAV analogue.
(AC4/AC5), induced refill failure (AC6), visualizer-running (AC7), and rapid-seek concurrency (AC8). Confirm the WebCodecs decoder resumes cleanly after being starved of input mid-stream.
Largely test/measurement; any break is likely a tuning fix in the 21.1 anchor math or the 21.2 Together with 21.1 this bounds *both* the played and unplayed sides on *both* formats — the full memory
water-marks. **Depends on 21.1–21.3.** guarantee (AC1 + AC1-Opus). **Depends on 21.1** (eviction must exist so the drained region is reclaimed,
not merely un-read). Per OQ6, 21.2a and 21.2b ideally share one window controller exposing the fill
signal; the recommendation is the shared controller + two thin hooks.
- **21.3 — Seek-back-past-window refill (close the random-access case; one concern, per-path resolver).**
Wire UC4 — when a backward seek lands earlier than the retained tail, refetch via the existing
seek-beyond-buffer path pointed at the earlier offset, **using whichever resolver the active path already
ships** (`IFormatDecoder`/`StreamDecoder.calculateByteOffset` for WAV; the live
`resolveOpusByteOffset` + `OpusStreamDecoder.reinitializeForRangeContinuation` for Opus) — plus the
minimal AC6 refill-failure handling. Mostly **reuse** of the landed seek paths; the new work is the
trigger (window-miss detection) and the clean-failure path, both format-agnostic. **Depends on 21.1 +
21.2** (needs the window boundaries they define).
- **21.4 — Validation pass against the 1 GB target, BOTH formats (acceptance).** Exercise AC1–AC8 against a
real 1 GB+ mix **streamed as WAV and as Opus**: memory profiling (AC1 both formats + AC1-Opus upstream
queues), latency parity (AC2), edge-to-edge playback (AC3), the seek matrix (AC4/AC5), induced refill
failure (AC6), visualizer-running (AC7), and rapid-seek concurrency (AC8 — including the Opus
seek-storm). Largely test/measurement; any break is likely a tuning fix in the 21.1 anchor math, the
21.2 water-marks, or the 21.2b Opus decode-ahead bound. **Depends on 21.1–21.3.**
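
A minimal sketch of how the 21.1 to 21.3 pieces could fit together, offered as an illustration only and not as the as-built `PlaybackScheduler`. Every name here (`WindowedBuffers`, `Chunk`, `baseTime`, `retainBehind`, `evictBefore`, `shouldProduce`, `isWindowMiss`) is hypothetical, and `Chunk` stands in for `AudioBuffer` so the block runs outside a browser. The assumption it demonstrates is that a single absolute-time anchor for `buffers[0]` keeps position math exact through eviction, that one lookahead figure with high/low water-marks can act as the shared fill signal, and that a window miss is simply a seek target earlier than that anchor.

```typescript
// Illustrative only: hypothetical names, not the real PlaybackScheduler API.
// Chunk stands in for AudioBuffer; only its duration (seconds) matters here.
interface Chunk {
  duration: number;
}

class WindowedBuffers {
  private chunks: Chunk[] = [];
  private baseTime = 0; // absolute track time at which chunks[0] begins
  private producing = true; // hysteresis state for the fill signal
  private readonly retainBehind: number;
  private readonly lowWater: number;
  private readonly highWater: number;

  constructor(retainBehind: number, lowWater: number, highWater: number) {
    this.retainBehind = retainBehind; // seconds kept behind the playhead
    this.lowWater = lowWater; // resume production at or below this lookahead
    this.highWater = highWater; // pause production at or above this lookahead
  }

  // Both decode paths would feed the window through the same call.
  addBuffer(chunk: Chunk): void {
    this.chunks.push(chunk);
  }

  get windowStart(): number {
    return this.baseTime;
  }

  get windowEnd(): number {
    return this.baseTime + this.chunks.reduce((sum, c) => sum + c.duration, 0);
  }

  // 21.1: drop fully played chunks older than the retained tail, advancing
  // the anchor by exactly the evicted duration so absolute time stays exact.
  evictBefore(playhead: number): void {
    const cutoff = playhead - this.retainBehind;
    for (
      let head = this.chunks[0];
      head !== undefined && this.baseTime + head.duration <= cutoff;
      head = this.chunks[0]
    ) {
      this.baseTime += head.duration;
      this.chunks.shift();
    }
  }

  // 21.2: one fill signal with hysteresis; a read loop or a decode-ahead
  // hook could each poll this before producing more audio.
  shouldProduce(playhead: number): boolean {
    const lookahead = this.windowEnd - playhead;
    if (this.producing && lookahead >= this.highWater) {
      this.producing = false;
    } else if (!this.producing && lookahead <= this.lowWater) {
      this.producing = true;
    }
    return this.producing;
  }

  // 21.3: a backward seek earlier than the retained tail needs a refill.
  isWindowMiss(target: number): boolean {
    return target < this.windowStart;
  }
}
```

With ten one-second chunks, a two-second retained tail, and water-marks of 3 s and 6 s, calling `evictBefore(5)` moves `windowStart` to 3 while `windowEnd` stays at 10, so a seek to 2 s reports a window miss and a seek to 3 s does not. The gap between the two water-marks is what prevents the producer from toggling on every chunk.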
--- ---
@@ -366,18 +512,35 @@ prerequisite and the load-bearing change; the rest layer on it.
- Root `CLAUDE.md` "Streaming-first audio playback" / `CONTEXT.md §3.5` — the seam this phase modifies; - Root `CLAUDE.md` "Streaming-first audio playback" / `CONTEXT.md §3.5` — the seam this phase modifies;
the §2 invariants here restate its contract. Both flag it as the most load-bearing path. the §2 invariants here restate its contract. Both flag it as the most load-bearing path.
- `PLAN.md` Phase 4 (landed) / `COMPLETED.md` — the HTTP Range `bytes=X-` primitive this generalizes. - **`COMPLETED.md` Phase 18 — Opus Low-Data Streaming (landed 2026-06-23) — read this first.** The
"as-built divergence" note records why Opus uses a **WebCodecs `AudioDecoder`** streaming pipeline
(`IStreamingDecoder`) rather than the spec'd-and-replaced per-segment `decodeAudioData`/`IFormatDecoder`
model. This is the two-path reality this phase reconciles to. `product-notes/phase-18-opus-low-data-streaming.md`
is the design memo (note: its §3.4 `OpusFormatDecoder` framing predates the WebCodecs divergence — the
*seek-index/sidecar* design in §3.4a is accurate and landed; the *decoder-shape* discussion was superseded
by `IStreamingDecoder`).
- `PLAN.md` Phase 4 (landed) / `COMPLETED.md` — the HTTP Range `bytes=X-` primitive this generalizes
(now serving both `?format=lossless` and `?format=opus`).
- `PLAN.md` Phase 1.3 / 1.4 / 1.5 / 1.6 / 1.7 — the deferred decoder/scheduler-seam features; §5 above - `PLAN.md` Phase 1.3 / 1.4 / 1.5 / 1.6 / 1.7 — the deferred decoder/scheduler-seam features; §5 above
reconciles each. reconciles each (1.5 and 1.7 updated for the Opus path).
- `PLAN.md` Phase 9 — defines the `Mix` medium (single long track), the canonical 1 GB case. - `PLAN.md` Phase 9 — defines the `Mix` medium (single long track), the canonical 1 GB case.
- `PLAN.md` Phase 10 / `product-notes/phase-10-mix-visualizer-lava-reframe.md` / - `PLAN.md` Phase 10 / `product-notes/phase-10-mix-visualizer-lava-reframe.md` /
`product-notes/phase-12-waveform-visualizer-generalization.md` — establishes the preprocessed `product-notes/phase-12-waveform-visualizer-generalization.md` — establishes the preprocessed
per-track high-res waveform datum; the basis for C7 (visualizer does not read live PCM). per-track high-res waveform datum; the basis for C7 (visualizer does not read live PCM).
- `DeepDrftPublic/Interop/audio/PlaybackScheduler.ts` — owns the unbounded `buffers: AudioBuffer[]`; - `DeepDrftPublic/Interop/audio/PlaybackScheduler.ts` — owns the unbounded `buffers: AudioBuffer[]`, the
21.1 lives here. **shared sink for both decode paths**; 21.1 (eviction) lives here.
- `DeepDrftPublic/Interop/audio/StreamDecoder.ts``reinitializeForRangeContinuation`, - `DeepDrftPublic/Interop/audio/AudioPlayer.ts` — the dispatch: `processFormatChunk` (WAV/MP3/FLAC) vs
`calculateByteOffset`; the refill substrate. `processOpusChunk` (Opus), both calling `scheduler.addBuffer`; `seekBeyondBuffer`/`reinitializeFromOffset`
branch per path; the place the refill trigger (21.3) and the fill-signal wiring (21.2) hook.
- `DeepDrftPublic/Interop/audio/StreamDecoder.ts` + `IFormatDecoder.ts` — the WAV/MP3/FLAC refill substrate
(`reinitializeForRangeContinuation`, `calculateByteOffset`).
- `DeepDrftPublic/Interop/audio/IStreamingDecoder.ts` + `OpusStreamDecoder.ts` + `OggDemuxer.ts` +
`OpusSidecar.ts` — the **Opus** path: the WebCodecs decode pipeline, the `decodeQueueSize`/`decodedQueue`
upstream accumulation 21.2b must bound, and the live `resolveOpusByteOffset` /
`reinitializeForRangeContinuation(landingTime, target)` seek 21.3 reuses. **`IStreamingDecoder.ts` is the
seam the Opus windowing hooks into** (push/complete/reinitialize lifecycle).
- `DeepDrftPublic.Client/Services/StreamingAudioPlayerService.cs` — the C# forward read loop - `DeepDrftPublic.Client/Services/StreamingAudioPlayerService.cs` — the C# forward read loop
(`StreamAudioWithEarlyPlayback`), the seek-beyond-buffer path (`SeekBeyondBuffer`), and the (`StreamAudioWithEarlyPlayback`, feeding *both* decoders), the seek-beyond-buffer path (`SeekBeyondBuffer`),
cancellation/drain discipline (C6); 21.2/21.3 live here. and the cancellation/drain discipline (C6); 21.2a/21.3 live here.
- `DeepDrftPublic.Client/Clients/TrackMediaClient.cs` — the Range-capable media fetch reused by refill. - `DeepDrftPublic.Client/Clients/TrackMediaClient.cs` — the Range-capable media fetch (with the `?format=`
param) reused by refill on both paths.