docs(plan): add Phase 18 Opus low-data streaming; resolve Phase 21 OQ5 (no MSE)
@@ -8,6 +8,16 @@ server touch is **reuse, not new surface**: the existing `DeepDrftAPI` HTTP `Ran
 partial-content primitive (Phase 4, landed) is the load-bearing dependency; this phase adds no new API
 endpoint.
 
+> **Sequencing dependency (Daniel, 2026-06-23): Phase 18 (Opus Low-Data Streaming) comes BEFORE this
+> phase.** Format support — specifically the derived **Ogg Opus fullband 320** low-data delivery path
+> (`product-notes/phase-18-opus-low-data-streaming.md`) — is a prerequisite that sequences ahead of
+> windowing. Phase 21's windowing must work across **both** delivery formats (lossless WAV and Opus).
+> Its C5 invariant below already anticipated this ("must not foreclose MP3/FLAC"); **Opus is now the
+> concrete VBR/containerized driver of C5.** Windowing an Opus stream uses the decoder's *approximate*
+> byte↔time mapping (`OpusFormatDecoder.calculateByteOffset` — Ogg-page interpolation), exactly the C5
+> case — not the exact CBR-WAV `byteRate` math. Build the window machinery format-agnostically
+> (§2 C3/C5) so it inherits Opus for free.
+
 ---
 
 ## 1. Goal
@@ -45,19 +55,25 @@ docs. This phase **modifies that seam** — so the contract it must preserve is
 - **C2 — Playback start latency unchanged.** Today playback starts as soon as a configurable minimum
 buffer count is queued (header-derived duration, not full-file). The window model must keep first-audio
 latency at parity — bounding memory must not reintroduce a fetch-then-play stall.
-- **C3 — The format-decoder abstraction is untouched.** `IFormatDecoder` (WAV active; MP3/FLAC
-implemented, not yet wired) owns all format-specific byte math. Windowing lives in the
+- **C3 — The format-decoder abstraction is untouched.** `IFormatDecoder` owns all format-specific
+byte math; `AudioPlayer.createFormatDecoder` already dispatches on `Content-Type` (WAV/MP3/FLAC
+decoders all wired today — verified 2026-06-23; an `OpusFormatDecoder` joins them in Phase 18).
+Windowing lives in the
 **format-agnostic** layer (`PlaybackScheduler` eviction + `StreamDecoder`/player refill
 orchestration); it must add **no** format-specific branches. A future wired MP3/FLAC decoder inherits
 windowing for free.
 - **C4 — Read-only playback only.** This is a memory-management change, not a UX change. No new
 user-visible control, no change to seek/transport semantics beyond what the listener already
 experiences. Seek must still feel identical.
-- **C5 — WAV-only is the shipping target; the design must not foreclose MP3/FLAC.** Byte↔time mapping
-for refill is exact and cheap for WAV (CBR: `byteRate` from the header). For VBR formats the mapping is
-approximate (the decoders already carry TOC/SEEKTABLE seek math). The window machinery must express
-refill in terms of the decoder's existing `calculateByteOffset`, so the same code works when those
-formats are wired — **no WAV-special-cased offset math in the window layer.**
+- **C5 — Must window both delivery formats (WAV lossless AND Opus low-data).** Byte↔time mapping for
+refill is exact and cheap for WAV (CBR: `byteRate` from the header). For VBR/containerized formats it
+is approximate (the decoders carry TOC/SEEKTABLE/Ogg-page seek math). **Phase 18 (Opus) is sequenced
+before this phase and is the concrete driver here:** an Ogg Opus 320 stream is VBR and page-paged, so
+its `calculateByteOffset` is an *approximate* page-interpolation, not exact-offset. The window
+machinery must express refill purely in terms of the decoder's existing `calculateByteOffset`, so the
+same code windows WAV exactly and Opus approximately — **no WAV-special-cased offset math in the
+window layer.** (MP3/FLAC decoders are already wired in the registry too — the registry dispatches on
+content-type today; an `OpusFormatDecoder` joins them in Phase 18.)
 - **C6 — No regression to the single-instance JS decoder concurrency guarantees.** The current code is
 careful that only one streaming loop touches the single JS `StreamDecoder` at a time
 (`DrainActiveStreamingTaskAsync`, the `_streamingCancellation` identity dance). Windowed refill
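Review note on C3/C5: the split between exact and approximate byte↔time mapping can be sketched as below. This is a minimal illustration under stated assumptions, not the project's actual code. Only the method name `calculateByteOffset` comes from the plan; the interface shape, the class names (`WavOffsetSketch`, `PageInterpolatedOffsetSketch`), the `PageMark` table, and `refillRangeHeader` are hypothetical. A real Ogg seek would also need to land on a page boundary, which this sketch ignores.

```typescript
// Hypothetical shape of the decoder seam; only the method name is taken from the plan.
interface ByteOffsetSource {
  calculateByteOffset(timeSeconds: number): number;
}

// WAV (CBR): the mapping is exact, derived from header fields.
class WavOffsetSketch implements ByteOffsetSource {
  constructor(
    private readonly dataStart: number,  // byte offset where PCM data begins
    private readonly byteRate: number,   // bytes per second, from the header
    private readonly blockAlign: number, // bytes per sample frame
  ) {}

  calculateByteOffset(timeSeconds: number): number {
    const raw = Math.floor(Math.max(0, timeSeconds) * this.byteRate);
    // Snap down to a whole sample frame so a fetch never starts mid-frame.
    return this.dataStart + raw - (raw % this.blockAlign);
  }
}

// Assumed table of known (time, byte) positions, e.g. gathered from Ogg pages.
interface PageMark {
  timeSeconds: number;
  byteOffset: number;
}

// VBR/paged stream: the mapping is approximate, interpolated between known marks.
class PageInterpolatedOffsetSketch implements ByteOffsetSource {
  // Assumes `pages` is non-empty and sorted by strictly increasing time.
  constructor(private readonly pages: PageMark[]) {}

  calculateByteOffset(timeSeconds: number): number {
    const first = this.pages[0];
    const last = this.pages[this.pages.length - 1];
    if (timeSeconds <= first.timeSeconds) return first.byteOffset;
    if (timeSeconds >= last.timeSeconds) return last.byteOffset;
    let i = 1;
    while (this.pages[i].timeSeconds < timeSeconds) i++;
    const a = this.pages[i - 1];
    const b = this.pages[i];
    const t = (timeSeconds - a.timeSeconds) / (b.timeSeconds - a.timeSeconds);
    return Math.floor(a.byteOffset + t * (b.byteOffset - a.byteOffset));
  }
}

// Window layer: no format-specific branches, only calls into the decoder seam.
function refillRangeHeader(
  decoder: ByteOffsetSource,
  fromSeconds: number,
  toSeconds: number,
): string {
  const start = decoder.calculateByteOffset(fromSeconds);
  const end = decoder.calculateByteOffset(toSeconds) - 1; // HTTP Range end is inclusive
  if (end < start) throw new RangeError("refill span maps to an empty byte range");
  return `bytes=${start}-${end}`;
}
```

With an assumed 44-byte header, `byteRate` 176400 and `blockAlign` 4, a 10 s to 20 s refill would request `bytes=1764044-3528043`. The same `refillRangeHeader` call against the interpolating decoder yields an approximate range, which is the behavior C5 asks the window layer to tolerate.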
@@ -146,14 +162,15 @@ because the stack is a bespoke Web Audio graph, not `<media>` + MSE.
 Stop hand-rolling the decode→schedule graph for long tracks; feed the Range stream into a `SourceBuffer`
 and let the browser evict via its built-in quota + `remove()`. Memory management becomes the platform's
 problem.
-*Why not (now, but flag for Daniel):* MSE does not accept raw WAV/PCM — it wants containerized formats
-(fragmented MP4/WebM, or MP3/AAC elementary streams). The current producer is WAV-only, and the entire
-bespoke visualizer/spectrum graph is wired to the Web Audio `AudioContext`, not a `<media>` element.
-Adopting MSE is a **rewrite of the playback substrate**, not a windowing change — out of scope for this
-phase. But it is the *real* long-term answer and is entangled with Phase 1.2 (non-WAV formats): if
-DeepDrft moves to a compressed delivery format, MSE becomes viable and could retire the hand-rolled
-decoder, the seek-beyond-buffer path, *and* this phase's window machinery in one move. **Surfaced as
-open question OQ5** — not to decide now, but so this phase is built knowing it may be superseded.
+*Why not — RESOLVED, rejected (Daniel, 2026-06-23; see OQ5):* MSE does not accept raw WAV/PCM — it
+wants containerized formats (fragmented MP4/WebM, or MP3/AAC elementary streams). The entire bespoke
+visualizer/spectrum graph is wired to the Web Audio `AudioContext`, not a `<media>` element. Adopting
+MSE is a **rewrite of the playback substrate**, not a windowing change. It *looked* like the real
+long-term answer once compressed delivery arrived — but Daniel has decided compressed delivery
+(**Phase 18 Opus**) will feed the **same bespoke graph** via the `IFormatDecoder` seam, so the
+compressed-delivery move that would have justified MSE happens *without* surrendering the graph. **The
+bespoke graph is a deliberate long-term commitment; MSE is rejected.** Direction A is therefore the
+permanent destination, not a stopgap that MSE will retire. Recorded as considered-and-declined.
 
 ### 3.3 Recommended direction: A, with B held as the documented fallback
 
@@ -262,11 +279,17 @@ These are policy calls with user-visible or resource trade-offs — flagged rath
 tracks that never needed it. Recommend **window everything** (one path, C6-safe, and short tracks
 simply never hit a refill because they fit inside the forward window) — but Daniel may prefer a
 size threshold. `[Daniel decision]`
-- **OQ5 — Is MSE (Direction C) the real destination?** Not for this phase, but it bears on how much to
-invest here. If DeepDrft will move to compressed delivery (Phase 1.2) and MSE within ~a year, Phase 21
-should be the *minimal* Direction-A change (don't gold-plate machinery MSE would retire). If WAV +
-bespoke graph is the long-term commitment, a more thorough windowing investment is justified.
-`[Daniel steer — informs scope, not a blocker]`
+- **OQ5 — Is MSE (Direction C) the real destination? — RESOLVED: NO (Daniel, 2026-06-23).** **Do not
+adopt MSE. The bespoke Web Audio decode→schedule graph stays — it is bespoke by deliberate choice, a
+long-term commitment, not a stopgap.** Daniel's rationale: the player is intentionally a custom
+graph, not an HTML `<media>` element; the compressed-delivery move that *would* have made MSE
+tempting is being met instead by **Phase 18 (Opus low-data path)** feeding the **same bespoke graph**
+through the `IFormatDecoder` seam — so compressed delivery arrives *without* surrendering the graph.
+Consequence for this phase: Direction A (the hand-rolled sliding window) is the destination, not a
+placeholder; invest in it as permanent machinery. It will window both the WAV and the Opus path
+(the sequencing note at the top). Direction C is recorded as **considered and declined** per file
+convention; kept visible so a future reader sees the road not taken and why.
+`[RESOLVED — bespoke graph retained; MSE rejected]`
 
 ---
 
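Review note on the "window everything" recommendation: the claim that short tracks never hit a refill can be illustrated with a small planning function. This is a hedged sketch only. `WindowPolicy`, `WindowPlan`, `planWindow`, and the back/forward window sizes are hypothetical names and values introduced for illustration; the plan itself does not define them.

```typescript
// Hypothetical policy: how much audio to retain behind and ahead of the playhead.
interface WindowPolicy {
  backSeconds: number;
  forwardSeconds: number;
}

// Hypothetical result: what to evict, and whether a refill is needed.
interface WindowPlan {
  evictBeforeSeconds: number;     // buffers ending before this time may be dropped
  refillToSeconds: number | null; // null means the forward window is already covered
}

function planWindow(
  playheadSeconds: number,
  bufferedEndSeconds: number,
  trackDurationSeconds: number,
  policy: WindowPolicy,
): WindowPlan {
  // Never evict before the start of the track.
  const evictBeforeSeconds = Math.max(0, playheadSeconds - policy.backSeconds);
  // The forward target is clamped to the end of the track.
  const target = Math.min(
    trackDurationSeconds,
    playheadSeconds + policy.forwardSeconds,
  );
  const refillToSeconds = bufferedEndSeconds < target ? target : null;
  return { evictBeforeSeconds, refillToSeconds };
}
```

Under these assumptions, a 30 s track with a 60 s forward window is fully buffered by the initial stream, so `planWindow(5, 30, 30, { backSeconds: 10, forwardSeconds: 60 })` reports no refill. A 600 s track at playhead 100 s with 130 s buffered would evict before 90 s and refill to 160 s. Both cases take the same code path, which is the one-path property the recommendation relies on.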