docs: spec Phase 21 — windowed streaming buffer for bounded client memory

daniel-c-harvey
2026-06-23 00:14:44 -04:00
parent 2c1571057a
commit a84a99c309
2 changed files with 428 additions and 0 deletions
@@ -443,6 +443,85 @@ not the same work; this phase does not satisfy or depend on that one.
---
## Phase 21 — Windowed Streaming Buffer (bounded client memory for long streams)
Bound the **client memory** a playing track consumes to a small, configurable forward window —
**independent of total stream length** — so a 1 GB+ DJ MIX (Phase 9 `Mix` medium: a single long track)
plays without the whole decoded PCM accumulating in the browser. **Public listener site only**
(`DeepDrftPublic.Client` player stack + `DeepDrftPublic` TypeScript audio interop); no CMS, no API
endpoint, no schema change.
The network path already streams in adaptive 16–64 KB chunks. The accumulation is on the **decode
side**: `PlaybackScheduler` holds an `AudioBuffer[]` it **never evicts** ("Supports pause/resume/seek by
retaining all buffers" — its own doc comment). Decoded PCM is larger than the source (Web Audio is
32-bit float per sample/channel — a 16-bit stereo WAV roughly doubles once decoded), so a 1 GB WAV
becomes ~2 GB of retained float data. That is the OOM. The fix: hold only a sliding forward window plus a
small back-retain, discard already-played buffers, and refill on demand.
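
The expansion estimate above can be checked with a few lines of arithmetic. This is a sketch only: the function name is illustrative and not part of the codebase, and it ignores the WAV header, which is negligible at this scale.

```typescript
// Illustrative arithmetic only; `decodedBytes` is a hypothetical helper, not project code.
const FLOAT32_BYTES = 4; // Web Audio holds decoded samples as 32-bit floats

/** Approximate size of a PCM source once decoded (header ignored). */
function decodedBytes(sourceBytes: number, sourceBitsPerSample: number): number {
  const sourceBytesPerSample = sourceBitsPerSample / 8;
  // Same number of samples, but each one widens to 4 bytes.
  return sourceBytes * (FLOAT32_BYTES / sourceBytesPerSample);
}

const ONE_GB = 1024 * 1024 * 1024;

// A 16-bit source doubles: 1 GB of WAV data is about 2 GB of retained float PCM.
const retainedWithoutEviction = decodedBytes(ONE_GB, 16);
console.log(retainedWithoutEviction / ONE_GB); // expansion factor for a 16-bit source
```

The factor depends only on the source bit depth (a 24-bit source grows by 4/3 rather than 2), which is why the retained size scales with total stream length until buffers are evicted.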
**Architectural spine — a sliding window keyed on playback position, built as a generalization of the
landed seek-beyond-buffer path.** The Phase 4 HTTP `Range: bytes=X-` → 206 path already provides every
plumbing primitive the window needs (discard-buffers-keep-offset via `clearForSeek`/`setPlaybackOffset`;
fetch-from-offset via `TrackMediaClient`; decode-header-less-body via
`StreamDecoder.reinitializeForRangeContinuation`; time→byte via `IFormatDecoder.calculateByteOffset`),
just triggered manually and one-shot. The only genuinely new mechanisms are **partial eviction** on the
scheduler and **back-pressure** on the forward read loop (stop calling `ReadAsync` above a high-water
mark, resume below low-water). Recommended **Direction A** (sliding window on the existing single forward
stream); **Direction B** (discrete Range-fetched segments — the HLS/DASH/MSE-eviction analogue) held as
the documented fallback; **Direction C** (adopt MSE and let the browser manage the buffer) flagged as the
real long-term answer but out of scope — it is a playback-substrate rewrite entangled with non-WAV
formats (Phase 1.2), surfaced as OQ5.
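
As a rough illustration of the time→byte step and the `Range: bytes=X-` request it feeds, here is a minimal sketch for uncompressed PCM. The `PcmLayout` shape and both function names are assumptions made for this example; the actual `IFormatDecoder.calculateByteOffset` signature is not shown in this spec and may differ.

```typescript
// Hypothetical shapes for illustration; not the real IFormatDecoder contract.
interface PcmLayout {
  dataStartByte: number; // offset of the first audio sample, i.e. the header length
  sampleRate: number;    // frames per second
  channels: number;
  bitsPerSample: number;
}

/** Map a playback time to a frame-aligned byte offset in the source file. */
function timeToByteOffset(seconds: number, layout: PcmLayout): number {
  const blockAlign = layout.channels * (layout.bitsPerSample / 8); // bytes per frame
  const frame = Math.floor(Math.max(0, seconds) * layout.sampleRate);
  // Whole frames only, so the decoder never starts mid-sample.
  return layout.dataStartByte + frame * blockAlign;
}

/** Build the open-ended Range header used for a fetch-from-offset request. */
function rangeHeader(offset: number): Record<string, string> {
  return { Range: `bytes=${offset}-` };
}

const cdQualityWav: PcmLayout = { dataStartByte: 44, sampleRate: 44100, channels: 2, bitsPerSample: 16 };
const offsetAtTenSeconds = timeToByteOffset(10, cdQualityWav);
console.log(rangeHeader(offsetAtTenSeconds));
```

A windowed refill would make the same calculation as a manual seek, only with the target time chosen by the window logic rather than by the listener.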
**Invariants that must hold (the §3.5 seam contract).** Reuse the Range path, don't fork it; playback-
start latency at parity; the `IFormatDecoder` abstraction untouched (windowing is format-agnostic, so
wiring MP3/FLAC later inherits it free); read-only playback (no new control); the single-instance JS
decoder stays single-writer (every refill routes through the existing cancellation/drain discipline). The
**Mix visualizer is provably unaffected** — it renders from the preprocessed per-track high-res datum
(Phase 10/12), never from live decoded PCM, so evicting played buffers cannot starve it. The 1 GB mix is
both the canonical case *and* the proof the eviction is safe.
**Interaction with deferred Phase 1 features (same seam):** windowing should land **before** preload
(1.3) — it makes preload of long tracks memory-safe by construction (a staged next-track decoder inherits
the bounded scheduler); it makes crossfade (1.4) between two long mixes affordable (the overlap doubles
the *window*, not the track); it adds a minor "don't evict the final window before the gapless boundary"
care point for 1.5. It **enlarges the error surface** (1.6): windowed refill issues mid-stream fetches
the listener didn't initiate, one of which can fail deep into a 1 GB mix — so the *cheap* half of 1.6
(clean refill-failure handling, no wedged player) is folded into this phase's acceptance criteria, not
left fully to 1.6.
Full design, the three directions with SOLID/road-not-taken rationale, use cases, acceptance criteria,
the open-question set, and the wave decomposition: `product-notes/phase-21-windowed-streaming-buffer.md`.
Sequenced as four waves: `21.1 → 21.2 → 21.3`, with `21.4` validating the whole. **21.1 is the cold-start
prerequisite and the load-bearing change** — independent of the open questions (window *sizes* are
parameters fed in later).
- **21.1 — Partial eviction in `PlaybackScheduler` (cold-start; load-bearing).** Drop already-played
buffers while keeping the position/index/time-anchor bookkeeping exact against a buffer array that no
longer begins at absolute time 0 (today `getCurrentPosition`/`playFromPosition`/the schedule loop all
assume `buffers[0]` is the track start). The hardest correctness work in the phase. No refill yet.
**Independent of the open questions — can begin immediately.**
- **21.2 — Back-pressure on the forward read loop.** Stop `ReadAsync` above the high-water mark, resume
below low-water; together with 21.1 this bounds *both* the played and unplayed regions (the AC1
guarantee). Routes resume/pause through the existing single-loop cancellation discipline. **Depends on
21.1.**
- **21.3 — Seek-back-past-window refill.** When a backward seek lands earlier than the retained tail,
refetch via the existing seek-beyond-buffer Range path pointed at the earlier offset; plus the minimal
clean refill-failure handling (the 1.6 adjacency). Mostly reuse of the landed seek path. **Depends on
21.1 + 21.2.**
- **21.4 — Validation against the 1 GB target (acceptance).** Memory profiling (bounded under 1 GB is the
headline), latency parity, edge-to-edge playback, the seek matrix, induced refill failure, visualizer-
running, rapid-seek concurrency. Largely measurement; breaks are tuning fixes in 21.1's anchor math or
21.2's water-marks. **Depends on 21.1–21.3.**
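
The first three waves can be sketched together in a simplified model. Everything here is hypothetical: `WindowedBuffers` and `ReadGate` are illustrative names rather than the real `PlaybackScheduler`, and chunks are modeled by duration alone instead of as `AudioBuffer` instances tied to audio-context time. The point is the bookkeeping: a base time that advances as buffers are evicted, and a hysteresis gate for the read loop.

```typescript
// Hypothetical model; not the real PlaybackScheduler or read loop.
interface Chunk { duration: number } // stands in for a decoded AudioBuffer

class WindowedBuffers {
  private chunks: Chunk[] = [];
  private baseTime = 0; // absolute track time at which chunks[0] begins

  /** Discard everything and re-anchor at a new absolute time (seek or refill). */
  resetAt(time: number): void { this.chunks = []; this.baseTime = time; }

  append(chunk: Chunk): void { this.chunks.push(chunk); }

  get startTime(): number { return this.baseTime; }

  get endTime(): number {
    return this.chunks.reduce((t, c) => t + c.duration, this.baseTime);
  }

  /** 21.1: drop chunks that end at or before (position - backRetain). */
  evictPlayed(position: number, backRetain: number): number {
    const keepFrom = position - backRetain;
    let evicted = 0;
    while (true) {
      const first = this.chunks[0];
      if (first === undefined || this.baseTime + first.duration > keepFrom) break;
      this.baseTime += first.duration; // the anchor moves with the evicted audio
      this.chunks.shift();
      evicted++;
    }
    return evicted;
  }

  /** Seconds of decoded audio ahead of the playhead. */
  bufferedAhead(position: number): number {
    return Math.max(0, this.endTime - position);
  }

  /** 21.3: a seek target outside the retained span needs a Range refill. */
  needsRefill(target: number): boolean {
    return target < this.startTime || target >= this.endTime;
  }
}

/** 21.2: pause reads at or above high water; resume only at or below low water. */
class ReadGate {
  private paused = false;
  private readonly lowWater: number;
  private readonly highWater: number;

  constructor(lowWater: number, highWater: number) {
    this.lowWater = lowWater;
    this.highWater = highWater;
  }

  shouldRead(bufferedAhead: number): boolean {
    if (this.paused && bufferedAhead <= this.lowWater) this.paused = false;
    else if (!this.paused && bufferedAhead >= this.highWater) this.paused = true;
    return !this.paused;
  }
}
```

In this model a forward read loop would consult `gate.shouldRead(buffers.bufferedAhead(position))` before each read, and a seek handler would call `needsRefill(target)` to decide between replaying retained buffers and calling `resetAt(target)` followed by a Range fetch. The gap between the two water marks keeps the loop from toggling on every chunk.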
**Dependency shape:** `21.1 → 21.2 → 21.3 → 21.4`; 21.1 is the only cold-start wave. **Open questions for
Daniel (spec §6):** window-size policy axis (time-based window + memory guard — recommended); seek-back-
past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard total in-flight memory cap
as a guard rail (recommend yes); window everything vs. only long tracks (recommend everything — one path,
short tracks never hit a refill); and whether MSE is the real destination (a steer that informs scope, not a
blocker). None block 21.1.
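
To make the first and third open questions concrete, one possible reading of "time-based window + memory guard" is a window requested in seconds and clamped by a byte cap. The sketch below assumes that reading; the function name, the parameters, and the numbers in the example are illustrative and not decided values.

```typescript
// Hypothetical policy sketch; names and numbers are not from the spec.
interface StreamFormat { sampleRate: number; channels: number }

/** Window length in seconds after applying a hard cap on decoded bytes. */
function effectiveWindowSeconds(
  requestedSeconds: number,
  memoryCapBytes: number,
  format: StreamFormat,
): number {
  const decodedBytesPerSecond = format.sampleRate * format.channels * 4; // 32-bit float PCM
  const capSeconds = memoryCapBytes / decodedBytesPerSecond;
  return Math.min(requestedSeconds, capSeconds);
}

const capBytes = 32 * 1024 * 1024; // example guard rail only
// At 44.1 kHz stereo a 60 s request fits under this cap, so the time-based value wins.
console.log(effectiveWindowSeconds(60, capBytes, { sampleRate: 44100, channels: 2 }));
// At 96 kHz stereo the same cap allows fewer seconds, so a 120 s request is clamped.
console.log(effectiveWindowSeconds(120, capBytes, { sampleRate: 96000, channels: 2 }));
```

Under this reading the time-based value governs ordinary material, while the cap only takes effect for high sample rates or channel counts.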
---
## Working with this file