docs: reconcile Phase 21 spec with as-built Phase 18 (two decode paths)
Window both the WAV StreamDecoder and Opus WebCodecs paths feeding one PlaybackScheduler — shared eviction, per-path back-pressure; reuse the now-live index-driven Opus seek for refill. Drops stale approximate-seek language; adds OQ6/OQ7.
@@ -457,44 +457,52 @@ plays without the whole decoded PCM accumulating in the browser. **Public listen
 (`DeepDrftPublic.Client` player stack + `DeepDrftPublic` TypeScript audio interop); no CMS, no API
 endpoint, no schema change.
 
-**Phase 18 (Opus Low-Data Streaming) has landed — Phase 21 is the next pickup.** The derived Ogg Opus 320 low-data path (Phase 18, `COMPLETED.md`) is the prerequisite; windowing must work across **both** delivery formats. Phase 21's C5 invariant already anticipated this ("must not foreclose
-MP3/FLAC"); **Opus is now the concrete VBR/paged driver** — windowing an Opus stream uses the decoder's
-**accurate index-based** byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, a binary search in the
-Phase 18 precomputed seek index — *not* the exact CBR-WAV `byteRate` math, and *not* approximate page
-interpolation: VBR-safe and exact, per the Phase 18 seek-model resolution 2026-06-23). The windowed refill
-controller calls the **same** index resolver an explicit seek does, and a window opening away from byte 0
-still decodes via the Phase 18 sidecar setup header. Build the window machinery format-agnostically so it
-inherits Opus for free.
+**Phase 18 (Opus Low-Data Streaming) has landed — Phase 21 is the next pickup, reconciled to the as-built
+two-decode-path reality.** Phase 18 left **two** decode paths feeding the one `PlaybackScheduler`:
+WAV/MP3/FLAC via `StreamDecoder`/`IFormatDecoder`, and Opus via a **WebCodecs `AudioDecoder`** pipeline
+(`OggDemuxer` → `OpusStreamDecoder`, the `IStreamingDecoder` seam — *not* `IFormatDecoder`; per-segment
+`decodeAudioData` was tried and replaced). Windowing must bound **both**. The accurate index-driven Opus
+seek the original spec assumed Phase 21 would build is **already live** (`resolveOpusByteOffset` over the
+precomputed seek index → Range fetch → `reinitializeForRangeContinuation` with frame-accurate lead-trim);
+Phase 21 **reuses** it for window-miss refills rather than building it. Opus seek is VBR-safe and
+**accurate**, not approximate (the earlier "approximate page interpolation" framing is corrected).
 
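For readers unfamiliar with index-driven seeking, the time→byte step described above can be sketched as a binary search over a sorted seek index. This is only an illustration under stated assumptions: the entry shape (`SeekIndexEntry`) and the function name (`resolveByteOffsetSketch`) are hypothetical, and the real `resolveOpusByteOffset` signature and index layout are not shown in this commit.

```typescript
// Hypothetical shape of one seek-index entry (assumption; the real
// Phase 18 index layout is not shown in this commit).
interface SeekIndexEntry {
  timeSec: number;    // presentation time at which the indexed page starts
  byteOffset: number; // byte offset of that page within the Opus file
}

// Returns the last entry whose timeSec <= targetSec.
// Assumes `index` is non-empty and sorted ascending by timeSec.
// Targets before the first entry clamp to the first entry.
function resolveByteOffsetSketch(
  index: SeekIndexEntry[],
  targetSec: number
): SeekIndexEntry {
  if (index.length === 0) {
    throw new Error("empty seek index");
  }
  let lo = 0;
  let hi = index.length - 1;
  while (lo < hi) {
    // Upper midpoint so the loop always makes progress when lo = mid.
    const mid = (lo + hi + 1) >> 1;
    if (index[mid].timeSec <= targetSec) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return index[lo];
}

// Decoded audio between the entry's start and the target would be
// discarded after decoding (the "lead-trim" mentioned above).
function leadTrimSec(entry: SeekIndexEntry, targetSec: number): number {
  return Math.max(0, targetSec - entry.timeSec);
}
```

Because the lookup returns an exact indexed page boundary rather than an interpolated guess, the same resolver can plausibly serve both an explicit seek and a window-miss refill, which is the reuse the paragraph above describes.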
 The network path already streams in adaptive 16–64 KB chunks. The accumulation is on the **decode
-side**: `PlaybackScheduler` holds an `AudioBuffer[]` it **never evicts** ("Supports pause/resume/seek by
-retaining all buffers" — its own doc comment). Decoded PCM is larger than the source (Web Audio is
-32-bit float per sample/channel — a 16-bit stereo WAV roughly doubles once decoded), so a 1 GB WAV
-becomes ~2 GB of retained float data. That is the OOM. The fix: hold only a sliding forward window plus a
-small back-retain, discard already-played buffers, and refill on demand.
+side**, and now has two faces. The shared one: `PlaybackScheduler` holds an `AudioBuffer[]` it **never
+evicts** — both paths `addBuffer` into it, nothing is removed. Decoded PCM is larger than the source
+(Web Audio is 32-bit float per sample/channel — a 16-bit stereo WAV roughly doubles once decoded), so a
+1 GB WAV becomes ~2 GB of retained float; **a low-data Opus mix decodes to the same ~2 GB once played**,
+so its small transfer does not spare it. The Opus-only second face: the WebCodecs decode queue +
+`decodedQueue` accumulate upstream of the scheduler too. The fix: hold only a sliding forward window plus
+a small back-retain, discard already-played buffers, and refill on demand — with back-pressure on the C#
+read loop (both paths) **and** on the Opus demux/decode feed (Opus only).
 
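The "roughly doubles" claim above follows from sample width alone and can be checked with a few lines of arithmetic. The sketch below is illustrative: the function names are hypothetical, and the 44.1 kHz sample rate and one-hour duration are assumptions picked for the example, not figures from the spec.

```typescript
// Bytes of 16-bit PCM payload for a given duration (WAV header ignored).
function pcm16Bytes(durationSec: number, sampleRate: number, channels: number): number {
  return durationSec * sampleRate * channels * 2; // 2 bytes per sample
}

// Bytes of decoded Web Audio data: 32-bit float per sample per channel.
function decodedFloat32Bytes(durationSec: number, sampleRate: number, channels: number): number {
  return durationSec * sampleRate * channels * 4; // 4 bytes per sample
}

// Assumed example: one hour of 44.1 kHz stereo.
const oneHourSec = 3600;
const sourceBytes = pcm16Bytes(oneHourSec, 44100, 2);           // 635,040,000
const decodedBytes = decodedFloat32Bytes(oneHourSec, 44100, 2); // 1,270,080,000
const ratio = decodedBytes / sourceBytes;                       // 2
```

The decoded size depends only on duration, sample rate, and channel count, not on the size of the compressed transfer, which is why the Opus rendition of the same mix ends up retaining about as much float data as the WAV.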
 **Architectural spine — a sliding window keyed on playback position, built as a generalization of the
-landed seek-beyond-buffer path.** The Phase 4 HTTP `Range: bytes=X-` → 206 primitive already does every
-plumbing primitive the window needs (discard-buffers-keep-offset via `clearForSeek`/`setPlaybackOffset`;
-fetch-from-offset via `TrackMediaClient`; decode-header-less-body via
-`StreamDecoder.reinitializeForRangeContinuation`; time→byte via `IFormatDecoder.calculateByteOffset`),
-just triggered manually and one-shot. The only genuinely new mechanisms are **partial eviction** on the
-scheduler and **back-pressure** on the forward read loop (stop calling `ReadAsync` above a high-water
-mark, resume below low-water). Recommended **Direction A** (sliding window on the existing single forward
-stream); **Direction B** (discrete Range-fetched segments — the HLS/DASH/MSE-eviction analogue) held as
-the documented fallback; **Direction C** (adopt MSE and let the browser manage the buffer) **rejected
-(OQ5 = NO, Daniel 2026-06-23)** — the bespoke Web Audio graph is a deliberate long-term commitment, and
-the compressed-delivery move that would have justified MSE is met instead by **Phase 18 (Opus) feeding
-the same bespoke graph** through the `IFormatDecoder` seam. Direction A is therefore the permanent
-destination, not a stopgap MSE would retire.
+landed seek-beyond-buffer paths.** The Phase 4 HTTP `Range: bytes=X-` → 206 primitive already does every
+plumbing primitive the window needs, with a WAV branch and an Opus branch, both live: discard-buffers-keep-
+offset via `clearForSeek`/`setPlaybackOffset` (shared); fetch-from-offset via `TrackMediaClient` (shared,
+now with `?format=`); decode-header-less-body via `StreamDecoder.reinitializeForRangeContinuation` (WAV) /
+`OpusStreamDecoder.reinitializeForRangeContinuation` (Opus); time→byte via `IFormatDecoder.calculateByteOffset`
+(WAV) / `resolveOpusByteOffset` over the seek index (Opus) — just triggered manually and one-shot. The
+genuinely new mechanisms: **partial eviction** on the shared scheduler (one implementation, both paths),
+and **back-pressure** — on the C# read loop (both paths) **and additionally on the Opus WebCodecs
+decode-ahead** (`decodeQueueSize` + `decodedQueue`, Opus only, since throttling the socket alone doesn't
+bound the async decoder's queues). Recommended **Direction A** (sliding window on the existing single
+forward stream); **Direction B** (discrete Range-fetched segments — the HLS/DASH/MSE-eviction analogue)
+held as the documented fallback; **Direction C** (adopt MSE) **rejected (OQ5 = NO, Daniel 2026-06-23)** —
+the bespoke Web Audio graph is a deliberate long-term commitment, and the compressed-delivery move that
+would have justified MSE was met instead by **Phase 18 (Opus) feeding the same bespoke graph** through the
+WebCodecs `IStreamingDecoder` seam. Direction A is the permanent destination, not a stopgap MSE would
+retire.
 
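The high-water/low-water back-pressure named above is a plain hysteresis gate. The sketch below shows one possible shape, assuming the fill signal is the scheduler's decoded lookahead in seconds; the class name `BackPressureGate`, its method, and the threshold values are hypothetical and do not come from the codebase.

```typescript
// Hysteresis gate: pause production at or above the high-water mark,
// resume only once the fill has drained to or below the low-water mark.
class BackPressureGate {
  private readonly lowWaterSec: number;
  private readonly highWaterSec: number;
  private paused = false;

  constructor(lowWaterSec: number, highWaterSec: number) {
    if (lowWaterSec >= highWaterSec) {
      throw new Error("low-water mark must be below high-water mark");
    }
    this.lowWaterSec = lowWaterSec;
    this.highWaterSec = highWaterSec;
  }

  // Feed the current decoded lookahead; returns true if the producer
  // (read loop or decode feed) should keep running.
  update(lookaheadSec: number): boolean {
    if (!this.paused && lookaheadSec >= this.highWaterSec) {
      this.paused = true;
    } else if (this.paused && lookaheadSec <= this.lowWaterSec) {
      this.paused = false;
    }
    return !this.paused;
  }
}
```

Between the two marks the gate keeps its previous state, so a producer hovering near one threshold does not flap between paused and running. Under the two-hook design described above, the same fill signal could drive one gate for the C# read loop and another for the Opus decode feed; that wiring is an assumption of this sketch, not something the commit specifies.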
-**Invariants that must hold (the §3.5 seam contract).** Reuse the Range path, don't fork it; playback-
-start latency at parity; the `IFormatDecoder` abstraction untouched (windowing is format-agnostic, so
-wiring MP3/FLAC later inherits it free); read-only playback (no new control); the single-instance JS
-decoder stays single-writer (every refill routes through the existing cancellation/drain discipline). The
-**Mix visualizer is provably unaffected** — it renders from the preprocessed per-track high-res datum
-(Phase 10/12), never from live decoded PCM, so evicting played buffers cannot starve it. The 1 GB mix is
-both the canonical case *and* the proof the eviction is safe.
+**Invariants that must hold (the §3.5 seam contract).** Reuse each path's Range/seek machinery, don't fork
+it; playback-start latency at parity; neither decoder seam's contract forked — eviction is shared at the
+scheduler (zero format branches), back-pressure is seam-aware (the one place the two paths diverge);
+read-only playback (no new control); the single-writer decoder discipline holds for **both** decoders
+(stricter for the stateful Opus `AudioDecoder` — a stale `push` racing a reset+reconfigure corrupts
+inter-frame state). The **Mix visualizer is provably unaffected** — it renders from the preprocessed
+per-track high-res datum (Phase 10/12), never from live decoded PCM, so evicting played buffers cannot
+starve it. The 1 GB mix is both the canonical case *and* the proof the eviction is safe.
 
 **Interaction with deferred Phase 1 features (same seam):** windowing should land **before** preload
 (1.3) — it makes preload of long tracks memory-safe by construction (a staged next-track decoder inherits
@@ -512,32 +520,45 @@ Sequenced as four waves. `21.1 → 21.2 → 21.3`, with `21.4` validating the wh
 prerequisite and the load-bearing change** — independent of the open questions (window *sizes* are
 parameters fed in later).
 
-- **21.1 — Partial eviction in `PlaybackScheduler` (cold-start; load-bearing).** Drop already-played
-buffers while keeping the position/index/time-anchor bookkeeping exact against a buffer array that no
-longer begins at absolute time 0 (today `getCurrentPosition`/`playFromPosition`/the schedule loop all
-assume `buffers[0]` is the track start). The hardest correctness work in the phase. No refill yet.
-**Independent of the open questions — can begin immediately.**
-- **21.2 — Back-pressure on the forward read loop.** Stop `ReadAsync` above the high-water mark, resume
-below low-water; together with 21.1 this bounds *both* the played and unplayed regions (the AC1
-guarantee). Routes resume/pause through the existing single-loop cancellation discipline. **Depends on
-21.1.**
-- **21.3 — Seek-back-past-window refill.** When a backward seek lands earlier than the retained tail,
-refetch via the existing seek-beyond-buffer Range path pointed at the earlier offset; plus the minimal
-clean refill-failure handling (the 1.6 adjacency). Mostly reuse of the landed seek path. **Depends on
-21.1 + 21.2.**
-- **21.4 — Validation against the 1 GB target (acceptance).** Memory profiling (bounded under 1 GB is the
-headline), latency parity, edge-to-edge playback, the seek matrix, induced refill failure, visualizer-
-running, rapid-seek concurrency. Largely measurement; breaks are tuning fixes in 21.1's anchor math or
-21.2's water-marks. **Depends on 21.1–21.3.**
+Decomposition is **by concern** (eviction → back-pressure → seek-back refill → validate), not by format —
+eviction is genuinely shared, so a path-split would duplicate the hardest work; the one path-divergent
+concern (back-pressure) carries a two-track split *inside* its wave.
+
+- **21.1 — Partial eviction in `PlaybackScheduler` (cold-start; load-bearing; SHARED by both paths).** Drop
+already-played buffers while keeping the position/index/time-anchor bookkeeping exact against a buffer
+array that no longer begins at absolute time 0 (today `getCurrentPosition`/`playFromPosition`/the schedule
+loop all assume `buffers[0]` is the track start). The hardest correctness work in the phase. Written once,
+serves both decode paths (they `addBuffer` identically). No refill yet. **Independent of the open
+questions — can begin immediately.**
+- **21.2 — Back-pressure (two tracks, one fill signal).** Bound the unplayed region by throttling
+production above a high-water mark and resuming below low-water, driven by the scheduler's
+decoded-lookahead fill. **21.2a** — stop `ReadAsync` on the C# loop (serves both paths; *sufficient* for
+WAV). **21.2b** — additionally stop the Opus demux/decode feed so the WebCodecs decode queue +
+`decodedQueue` don't balloon behind a throttled socket (Opus only; no WAV analogue). Together with 21.1
+this bounds both played and unplayed sides on both formats (AC1 + AC1-Opus). Routes through the existing
+single-loop cancellation discipline. **Depends on 21.1.**
+- **21.3 — Seek-back-past-window refill (one concern, per-path resolver).** When a backward seek lands
+earlier than the retained tail, refetch via the existing seek-beyond-buffer path pointed at the earlier
+offset, using whichever resolver the active path already ships (`IFormatDecoder`/`StreamDecoder` for WAV;
+the live `resolveOpusByteOffset` + `OpusStreamDecoder.reinitializeForRangeContinuation` for Opus); plus
+the minimal clean refill-failure handling (the 1.6 adjacency). Mostly reuse of the landed seek paths.
+**Depends on 21.1 + 21.2.**
+- **21.4 — Validation against the 1 GB target, BOTH formats (acceptance).** Memory profiling (bounded under
+1 GB as WAV *and* as Opus, plus the Opus upstream queues), latency parity, edge-to-edge playback, the
+seek matrix, induced refill failure, visualizer-running, rapid-seek concurrency (incl. an Opus
+seek-storm). Largely measurement; breaks are tuning fixes in 21.1's anchor math, 21.2's water-marks, or
+21.2b's Opus decode-ahead bound. **Depends on 21.1–21.3.**
 
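The anchor bookkeeping that 21.1 calls the hardest correctness work can be illustrated with a small model: once played buffers are dropped, the list must remember the absolute track time at which its first retained buffer starts. This is a sketch only. `DecodedChunk`, `WindowedBufferList`, and every method name here are hypothetical; the real `PlaybackScheduler` holds `AudioBuffer` objects and its internals are not shown in this commit.

```typescript
// Stand-in for a decoded buffer; only its duration matters for the anchor math.
interface DecodedChunk {
  durationSec: number;
}

class WindowedBufferList {
  private chunks: DecodedChunk[] = [];
  // Absolute track time at which chunks[0] starts. Stays 0 until the first eviction.
  private baseTimeSec = 0;

  add(chunk: DecodedChunk): void {
    this.chunks.push(chunk);
  }

  get windowStartSec(): number {
    return this.baseTimeSec;
  }

  // Drop chunks that end at or before (positionSec - backRetainSec),
  // advancing the base time by each dropped duration. Returns the count dropped.
  evictBefore(positionSec: number, backRetainSec: number): number {
    const cutoff = positionSec - backRetainSec;
    let evicted = 0;
    while (
      this.chunks.length > 0 &&
      this.baseTimeSec + this.chunks[0].durationSec <= cutoff
    ) {
      this.baseTimeSec += this.chunks[0].durationSec;
      this.chunks.shift();
      evicted++;
    }
    return evicted;
  }

  // Map an absolute position to (chunk index, offset inside that chunk).
  // Returns null on a window miss: before the retained tail or past the buffered end.
  locate(positionSec: number): { index: number; offsetSec: number } | null {
    if (positionSec < this.baseTimeSec) {
      return null;
    }
    let start = this.baseTimeSec;
    for (let i = 0; i < this.chunks.length; i++) {
      const end = start + this.chunks[i].durationSec;
      if (positionSec < end) {
        return { index: i, offsetSec: positionSec - start };
      }
      start = end;
    }
    return null;
  }
}
```

In this model a `null` from `locate` for a position earlier than `windowStartSec` is the condition that would trigger the 21.3 refill, while position queries inside the window stay exact because every eviction moves the base time by exactly the duration it removed.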
 **Dependency shape:** `21.1 → 21.2 → 21.3 → 21.4`; 21.1 is the only cold-start wave. **Phase-level
-prerequisite: Phase 18 (Opus) lands first** so windowing is built against both formats. **Open questions
-for Daniel (spec §6):** window-size policy axis (time-based window + memory guard — recommended); seek-
-back-past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard total in-flight
-memory cap as a guard rail (recommend yes); window everything vs. only long tracks (recommend everything
-— one path, short tracks never hit a refill). **OQ5 (adopt MSE) — RESOLVED NO (Daniel 2026-06-23): the
-bespoke graph stays by deliberate choice; recorded considered-and-declined, kept visible per file
-convention.** None block 21.1.
+prerequisite: Phase 18 (Opus) has landed** (`COMPLETED.md`) — windowing is built against both formats.
+**Open questions for Daniel (spec §6):** window-size policy axis (time-based window + memory guard —
+recommended); seek-back-past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard
+total in-flight memory cap as a guard rail (recommend yes); window everything vs. only long tracks
+(recommend everything — one path, short tracks never hit a refill). **New staff-engineer architecture
+calls (spec §6):** OQ6 — one window controller for both paths or two (recommend shared controller + two
+back-pressure hooks); OQ7 — drive the Opus decode-ahead bound from the single scheduler-fill signal
+(recommended). **OQ5 (adopt MSE) — RESOLVED NO (Daniel 2026-06-23):** the bespoke graph stays by deliberate
+choice. None block 21.1.
 
 ---
 