From 1bdaeaa164c3c7d84cc10955ec7947a0d3a76594 Mon Sep 17 00:00:00 2001 From: daniel-c-harvey Date: Tue, 23 Jun 2026 04:58:21 -0400 Subject: [PATCH] docs(plan): add Phase 18 Opus low-data streaming; resolve Phase 21 OQ5 (no MSE) --- PLAN.md | 114 ++++- .../phase-18-opus-low-data-streaming.md | 462 ++++++++++++++++++ .../phase-21-windowed-streaming-buffer.md | 63 ++- 3 files changed, 610 insertions(+), 29 deletions(-) create mode 100644 product-notes/phase-18-opus-low-data-streaming.md diff --git a/PLAN.md b/PLAN.md index 5a46e12..2c6d9fe 100644 --- a/PLAN.md +++ b/PLAN.md @@ -443,6 +443,90 @@ not the same work; this phase does not satisfy or depend on that one. --- +## Phase 18 — Opus Low-Data Streaming (dual-format lossless + Opus delivery) + +The concrete realization of the long-deferred **"Non-WAV formats"** intent (`CONTEXT.md §5`). Daniel's +direction (2026-06-23): **two delivery formats per track — the existing lossless WAV path, and a new +low-data Ogg Opus (fullband, 320 kbps) path — so the listener gets a choice, with Opus the +bandwidth-friendly default-candidate.** Lossless streaming becomes *optional*, not the only path. The +bespoke Web Audio decode→schedule graph is **retained by deliberate choice** — Opus feeds the same +`IFormatDecoder` seam, not an HTML `<audio>` element or MSE (the decision shared with Phase 21 OQ5). +**Sequenced BEFORE Phase 21** — windowing must work across both formats. Surfaces: ingest/preprocessing +in `DeepDrftContent` (`AudioProcessor`/router/`WaveformProfileService`) + `DeepDrftAPI` +(`UnifiedTrackService.UploadAsync`, replace-audio); delivery/decode in `DeepDrftAPI` (stream endpoint + +`Range`) + `DeepDrftPublic` proxy + `DeepDrftPublic.Client` player stack + `DeepDrftPublic/Interop/audio` +TS decoders.
Full design, the three directions with SOLID/road-not-taken rationale, the storage and +delivery options, the Opus decoder + seek math, acceptance criteria, open questions, and wave +decomposition: `product-notes/phase-18-opus-low-data-streaming.md`. + +**Much further along than the backlog line implies (verified 2026-06-23).** The multi-format *substrate* +already exists on both sides: the producer-side `AudioProcessorRouter` routes `.wav`/`.mp3`/`.flac` and +`TrackContentService.AddTrackAsync` is format-agnostic (it **stores originals**, no transcode); the +decoder-side `AudioPlayer.createFormatDecoder` is a **wired** strategy registry dispatching on +`Content-Type` (WAV/MP3/FLAC decoders all present — correcting the Phase 21 spec's stale +"implemented-not-wired" note). **The actual gap is Daniel's specific ask:** (1) a **transcode-at-ingest** +step that *derives* an Opus 320 artifact per track (nothing derives Opus today), and (2) a **per-format +delivery selection** so one track serves as either WAV or Opus on request. + +**Architectural spine — a derived artifact + a delivery param + one new decoder; three new leaf +implementations, zero changes to existing format code (the strong OCP signal).** Transcode is a new +processor sibling in `DeepDrftContent`, invoked post-store alongside `WaveformProfileService`, +**failure-tolerant and off the hot path** (background/queued — a 1 GB WAV transcode must not block the +upload response) — mirroring the landed waveform-datum pattern (derive at ingest, regenerate via a CMS +bulk action + ApiKey endpoint). The Opus bytes are a **derived artifact** stored like the high-res +waveform datum (recommend a dedicated `track-opus` vault, the `track-waveforms` precedent; final call +staff-engineer's). 
Delivery adds a **`?format=opus|lossless` param** (mirroring the existing `offset` +param threading through `TrackProxyController`) resolved server-side to the right artifact + content-type, +with a **lossless fallback** when no Opus artifact exists (additive, never 404/silence). The player gains +one `OpusFormatDecoder` (`IFormatDecoder`): Ogg-page-aligned segmenting (`OggS` scan — the FLAC +frame-sync analogue), `OpusHead` setup-bytes carry (the FLAC `streamInfoBytes` analogue), and an +**approximate** page-interpolation `calculateByteOffset` (Opus is VBR/paged — this is exactly the Phase +21 C5 case). **Browser constraint flagged:** Ogg-Opus `decodeAudioData` is Safari-18.4+ only (Chrome/FF +long-standing), so the Opus default must be **capability-gated** — fall back to the universal lossless +path on browsers that can't decode it. + +**Constraints/invariants:** keep the bespoke graph (no MSE); preprocessing is **additive** (WAV path +untouched, byte-for-byte; a track with no Opus artifact still plays losslessly); reuse the landed +`Range`/offset seek path; no format branches leak outside the new decoder + one selection arm + the +transcode/delivery seam; transcode failure must not block ingest; format selection is a delivery-time +decision resolving one `EntryKey` to one of two artifacts (one source, two views — **not** a second +`TrackEntity` row, which would fracture share/queue/play-count/release identity). + +Sequenced as five waves. `18.1 → 18.2 → {18.3, 18.4} → 18.5`. **18.1 (ingest transcode + derived +artifact) is the cold-start prerequisite** — nothing downstream has bytes to serve or decode until an +Opus artifact exists. 
+ +- **18.1 — Ingest transcode: derive + store the Opus artifact (cold-start; load-bearing).** New + `OpusTranscodeService`/processor in `DeepDrftContent`, invoked post-store from + `UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`; produces Ogg Opus fullband 320; + stores it as a derived artifact (recommend a `track-opus` vault). Failure-tolerant; off the hot path + (background/queued). **Independent of the delivery/decoder waves — can begin immediately.** +- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention + server-side "given + `EntryKey` + format, return the right `AudioBinary` + content-type," including the lossless fallback. + **Depends on 18.1.** +- **18.3 — Delivery: `?format=opus|lossless` param + proxy threading.** On the `DeepDrftAPI` stream + endpoint (resolves via 18.2), forwarded through `TrackProxyController` (mirror `offset`), `Range` + serving the chosen artifact; player sends it via `TrackMediaClient`. **Depends on 18.2; parallel-ok + with 18.4.** +- **18.4 — `OpusFormatDecoder` in the player stack.** New `IFormatDecoder` (Ogg-page segmenting, + `OpusHead` carry, approximate page-interpolation `calculateByteOffset` with an `OpusSeekData` + accelerator) + one arm in `createFormatDecoder` on `audio/ogg`/`audio/opus`; capability detection for + the lossless fallback. **Depends on 18.2; parallel-ok with 18.3.** +- **18.5 — Backfill + selection UX + end-to-end validation.** "Backfill Opus" CMS bulk action (third + sibling to Generate-Profiles / Backfill-High-res) + replace-audio Opus regeneration; the listener + selection control (recommend a global persisted quality toggle); the AC1–AC8 acceptance pass including + the Phase-21 handshake (Opus is windowable by the same machinery). **Depends on 18.1–18.4.** + +**Dependency shape:** `18.1 → 18.2 → {18.3 ∥ 18.4} → 18.5`; 18.1 is the only cold-start wave. 
+**Phase-level: 18 precedes Phase 21.** **Open questions for Daniel (spec §6):** selection UX (recommend a +single global quality toggle); default policy (recommend Opus-by-default, capability-gated; defer +network-awareness); whether the choice is remembered + scope (recommend persisted cookie/`localStorage`, +the dark-mode precedent); per-upload Opus opt-out vs. always-on (recommend always-on); Ogg-vs-CAF/WebM +container (recommend Ogg Opus as directed); transcode execution model (background/queued — a track is +lossless-only briefly until its Opus finishes; confirm acceptable). None block 18.1. + +--- + ## Phase 21 — Windowed Streaming Buffer (bounded client memory for long streams) Bound the **client memory** a playing track consumes to a small, configurable forward window — @@ -451,6 +535,14 @@ plays without the whole decoded PCM accumulating in the browser. **Public listen (`DeepDrftPublic.Client` player stack + `DeepDrftPublic` TypeScript audio interop); no CMS, no API endpoint, no schema change. +**Sequenced AFTER Phase 18 (Opus Low-Data Streaming) — Daniel, 2026-06-23.** Format support (the +derived Ogg Opus 320 low-data path, Phase 18) is a prerequisite that comes first; windowing must work +across **both** delivery formats. Phase 21's C5 invariant already anticipated this ("must not foreclose +MP3/FLAC"); **Opus is now the concrete VBR/paged driver** — windowing an Opus stream uses the decoder's +*approximate* byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, Ogg-page interpolation), not +the exact CBR-WAV `byteRate` math. Build the window machinery format-agnostically so it inherits Opus +for free. + The network path already streams in adaptive 16–64 KB chunks. The accumulation is on the **decode side**: `PlaybackScheduler` holds an `AudioBuffer[]` it **never evicts** ("Supports pause/resume/seek by retaining all buffers" — its own doc comment). 
Decoded PCM is larger than the source (Web Audio is @@ -467,9 +559,11 @@ just triggered manually and one-shot. The only genuinely new mechanisms are **pa scheduler and **back-pressure** on the forward read loop (stop calling `ReadAsync` above a high-water mark, resume below low-water). Recommended **Direction A** (sliding window on the existing single forward stream); **Direction B** (discrete Range-fetched segments — the HLS/DASH/MSE-eviction analogue) held as -the documented fallback; **Direction C** (adopt MSE and let the browser manage the buffer) flagged as the -real long-term answer but out of scope — it is a playback-substrate rewrite entangled with non-WAV -formats (Phase 1.2), surfaced as OQ5. +the documented fallback; **Direction C** (adopt MSE and let the browser manage the buffer) **rejected +(OQ5 = NO, Daniel 2026-06-23)** — the bespoke Web Audio graph is a deliberate long-term commitment, and +the compressed-delivery move that would have justified MSE is met instead by **Phase 18 (Opus) feeding +the same bespoke graph** through the `IFormatDecoder` seam. Direction A is therefore the permanent +destination, not a stopgap MSE would retire. **Invariants that must hold (the §3.5 seam contract).** Reuse the Range path, don't fork it; playback- start latency at parity; the `IFormatDecoder` abstraction untouched (windowing is format-agnostic, so @@ -513,12 +607,14 @@ parameters fed in later). running, rapid-seek concurrency. Largely measurement; breaks are tuning fixes in 21.1's anchor math or 21.2's water-marks. **Depends on 21.1–21.3.** -**Dependency shape:** `21.1 → 21.2 → 21.3 → 21.4`; 21.1 is the only cold-start wave. **Open questions for -Daniel (spec §6):** window-size policy axis (time-based window + memory guard — recommended); seek-back- -past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard total in-flight memory cap -as a guard rail (recommend yes); window everything vs. 
only long tracks (recommend everything — one path, -short tracks never hit a refill); and whether MSE is the real destination (steer informing scope, not a -blocker). None block 21.1. +**Dependency shape:** `21.1 → 21.2 → 21.3 → 21.4`; 21.1 is the only cold-start wave. **Phase-level +prerequisite: Phase 18 (Opus) lands first** so windowing is built against both formats. **Open questions +for Daniel (spec §6):** window-size policy axis (time-based window + memory guard — recommended); seek- +back-past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard total in-flight +memory cap as a guard rail (recommend yes); window everything vs. only long tracks (recommend everything +— one path, short tracks never hit a refill). **OQ5 (adopt MSE) — RESOLVED NO (Daniel 2026-06-23): the +bespoke graph stays by deliberate choice; recorded considered-and-declined, kept visible per file +convention.** None block 21.1. --- diff --git a/product-notes/phase-18-opus-low-data-streaming.md b/product-notes/phase-18-opus-low-data-streaming.md new file mode 100644 index 0000000..b882904 --- /dev/null +++ b/product-notes/phase-18-opus-low-data-streaming.md @@ -0,0 +1,462 @@ +# Phase 18 — Opus Low-Data Streaming (dual-format lossless + Opus delivery) + +Product spec. Status: **design / framing — implementation-ready pending Daniel's open-question calls.** +Author: product-designer. Date: 2026-06-23. **No code has been written by this doc.** + +This phase is the concrete realization of the long-deferred **"Non-WAV formats"** intent +(`CONTEXT.md §5`, the "1.2" the streaming-feature items reference). 
It supersedes the abstract "a +processor per format + a decoder strategy" framing with a specific, Daniel-directed product: **two +delivery formats per track — the existing lossless WAV path and a new low-data Ogg Opus path — so the +listener gets a choice, with Opus the bandwidth-friendly default-candidate.** + +Surfaces (named precisely): + +- **Ingest / preprocessing:** `DeepDrftContent` (`AudioProcessor` / `AudioProcessorRouter` / + `TrackContentService` / `WaveformProfileService`) + `DeepDrftAPI` (upload/persist — + `UnifiedTrackService.UploadAsync`, replace-audio) + `DeepDrftManager` (CMS upload form, only if a + per-upload control is wanted — see OQ4). +- **Delivery / decode:** `DeepDrftAPI` (the track stream endpoint + `Range` handler) + + `DeepDrftPublic` proxy (`TrackProxyController`) + `DeepDrftPublic.Client` player stack + (`StreamingAudioPlayerService`, `TrackMediaClient`) + `DeepDrftPublic/Interop/audio` TS decoders + (`AudioPlayer.createFormatDecoder` registry, a new `OpusFormatDecoder`). + +**Sequencing headline: Phase 18 comes BEFORE Phase 21 (Windowed Streaming Buffer).** Phase 21's +windowing must work across both formats — its C5 invariant already anticipated this ("must not +foreclose MP3/FLAC"); Opus is now the concrete VBR/containerized driver of that invariant. See §6 and +the Phase 21 cross-reference. + +--- + +## 0. State of the world (what already exists — verified 2026-06-23) + +This phase is **much further along than the "Non-WAV formats" backlog line implies**, on both sides. +Two prior efforts already built most of the multi-format substrate; what is *missing* is specifically +the **derived-Opus-artifact** idea, not generic format support. + +**Producer side is already multi-format (router landed):** +- `AudioProcessorRouter.ProcessAudioFileAsync(filePath)` routes by extension — `.wav` → + `AudioProcessor`, `.mp3` → `Mp3AudioProcessor`, `.flac` → `FlacAudioProcessor` + (`DeepDrftContent/CLAUDE.md`). 
+- `TrackContentService.AddTrackAsync(filePath, mimeType)` is **format-agnostic**: it selects the + processor, generates an entry GUID, and **stores the original bytes** with correct extension/MIME + in the `tracks` vault. +- So today the system can *ingest and store* WAV/MP3/FLAC. It **does not transcode** — it keeps the + original. There is no derived artifact and no second format per track. + +**Decoder side is a wired strategy registry (not "implemented-not-wired" anymore):** +- `AudioPlayer.createFormatDecoder(contentType)` (`AudioPlayer.ts:117`) dispatches on `Content-Type`: + `audio/mpeg|audio/mp3` → `Mp3FormatDecoder`, `audio/flac|audio/x-flac` → `FlacFormatDecoder`, + default → `WavFormatDecoder`. All three decoders exist and implement `IFormatDecoder`. +- `IFormatDecoder` (`IFormatDecoder.ts`) is a clean per-format strategy: `tryParseHeader`, + `getAlignedSegmentSize`, `wrapSegment`, `calculateByteOffset`, plus a `FormatInfo` carrying + `byteRate`, `blockAlign`, `audioDataOffset`, and a `seekData` accelerator slot (already polymorphic: + `Mp3VbrSeekData | FlacSeekData`). **This is the seam an `OpusFormatDecoder` slots into.** +- **Correction to the Phase 21 spec's §2 C3 note** ("MP3/FLAC implemented, not yet wired"): the + registry *is* wired and dispatches on content-type today. Phase 21's invariant still holds; the + parenthetical is stale and is corrected by this phase's reconciliation. + +**What this means for the gap.** Daniel's direction is **not** "add format support" — that substrate +exists. It is "**derive a second, low-data artifact (Opus fullband 320) at ingest and let the listener +pick which to stream.**" That is two genuinely new things: (1) a **transcode-at-ingest** step that +produces a derived artifact per track (the router stores originals; nothing derives Opus), and (2) a +**per-format delivery selection** so the same track can be served as either WAV or Opus on request. + +--- + +## 1. 
Goal + +**Dual-format delivery.** Every track is streamable in two formats: + +- **Lossless** — the existing WAV path, unchanged. The archival / audiophile option. +- **Low-data** — a derived **Ogg Opus, fullband, 320 kbps** artifact. The bandwidth-friendly + default-candidate. + +The listener chooses; Opus is the recommended default. The bespoke Web Audio decode→schedule graph is +**retained by deliberate choice** (Daniel) — Opus is fed through the same `IFormatDecoder` strategy +seam, not through an HTML `<audio>` element or MSE. + +**Why Opus fullband 320.** Opus is the modern, royalty-free, best-in-class lossy codec; "fullband" +(48 kHz, full 20 kHz audio bandwidth) at 320 kbps is transparent-to-most-listeners quality at roughly +**1/4 to 1/5 the bytes of 16-bit/44.1 stereo WAV** (~1411 kbps). For a 1 GB DJ MIX (Phase 9 `Mix` +medium), that is the difference between a ~1 GB transfer and a ~220 MB transfer — the headline +low-data win, and directly relevant to the Phase 21 long-stream case. + +**Non-goals.** This phase does not retire WAV (it stays as the lossless option), does not change the +bespoke graph for MSE (explicitly rejected — see §2 / Phase 21 OQ5), and does not add new transport +mechanisms beyond the existing stream + `Range` primitive. + +--- + +## 2. Constraints / invariants (the contract that must hold) + +- **C1 — Keep the bespoke Web Audio graph. MSE is rejected (Daniel, deliberate).** The custom + decode→schedule graph is a long-term commitment, not a stopgap. Opus is fed through the existing + `IFormatDecoder` → `StreamDecoder` → `PlaybackScheduler` pipeline. (This is the same decision + recorded as **Phase 21 OQ5 = NO**; the two phases share it.) +- **C2 — Preprocessing is additive; the WAV path is untouched.** The Opus artifact is a **second + derived artifact per track**, not a replacement. The existing WAV in the `tracks` vault stays + byte-for-byte as it is today; the lossless stream path is unchanged.
A track with no Opus artifact + (legacy rows, or a transcode that hasn't run yet) must still play losslessly — Opus is strictly + additive. +- **C3 — Reuse the landed `Range`/offset seek path; do not fork it.** Phase 4's + `Range: bytes=X-` → `206` primitive (client `TrackMediaClient` → `DeepDrftPublic` proxy → + `DeepDrftAPI`) is the substrate for Opus seek too. Opus seek math differs from WAV (VBR / + container-paged, see §3.4) but it is expressed through the **same** `IFormatDecoder.calculateByteOffset` + seam the MP3/FLAC decoders already use — no second seek mechanism. +- **C4 — Opus slots into the `IFormatDecoder` registry; no format branches leak elsewhere.** The new + `OpusFormatDecoder` is selected by `AudioPlayer.createFormatDecoder` on `Content-Type: + audio/ogg`/`audio/opus`. The rest of the player stack stays format-agnostic. No `if (opus)` outside + the decoder and the one selection point. +- **C5 — Format selection is a delivery-time decision, resolved server-side from a listener + signal.** The same `TrackEntity` / `EntryKey` addresses both artifacts; the *format* is a parameter + on the stream request (query param or `Accept` negotiation — see §3.3), not a different track id and + not a different vault entry key. One track, two renderings (the standing "one source, multiple + views" preference applied to delivery). +- **C6 — Transcode failure must not block ingest.** If the Opus transcode fails or is slow, the + track still persists with its lossless artifact and is playable. Opus is generated best-effort and + can be (re)generated later — mirror the **waveform-datum** model (`WaveformProfileService`: compute + on upload, regenerate on demand via a CMS action), which is exactly the "derived artifact, generated + at ingest, regenerable" pattern this needs.
+- **C7 — The vault model holds: derived artifact is a new entry, not a mutation.** The Opus bytes + live in the FileDatabase under the track's `EntryKey` — either in the existing `tracks` vault under + a derived key, or in a new sibling vault (see §3.2 options). Either way it is `AudioBinary` with the + `.opus`/`.ogg` extension and correct MIME, registered like any other vault resource. + +--- + +## 3. Architectural shape + +### 3.0 The mental model + +A track has one **source artifact** (the uploaded WAV/MP3/FLAC, stored as-is today) and gains one +**derived low-data artifact** (Ogg Opus fullband 320, produced at ingest). The stream endpoint serves +*either*, selected per request. The player picks a decoder by the response `Content-Type` exactly as +it does today. Seeking uses the same `Range` primitive; the byte↔time math is the decoder's job. + +``` +INGEST (DeepDrftContent + DeepDrftAPI) + upload → AudioProcessorRouter (existing) → store SOURCE artifact in vault [unchanged] + → TRANSCODE to Opus 320 → store DERIVED artifact [NEW] + → WaveformProfileService (existing, unchanged) + +DELIVERY (DeepDrftAPI → DeepDrftPublic proxy → DeepDrftPublic.Client → Interop/audio) + GET api/track/{id}?format=opus|lossless → serve the chosen artifact's bytes (+ Range) [NEW param] + player: createFormatDecoder(Content-Type) → OpusFormatDecoder | Wav | Mp3 | Flac [+1 decoder] +``` + +### 3.1 Where the transcode lives (relative to existing processing) + +The transcode is a **new processor sibling** to the existing format processors, invoked **after** the +source is stored, in the same orchestration that already calls `WaveformProfileService`: + +- It belongs in `DeepDrftContent` (the binary-content domain library) as e.g. an + `OpusTranscodeService` / `OpusProcessor`, **not** in a host and **not** in a controller (per the + `*.Services`-owns-domain-logic convention). 
+- It is invoked from `UnifiedTrackService.UploadAsync` (the same place `WaveformProfileService` + computes the high-res datum on every new track) and from the **replace-audio** path (which already + regenerates both waveform datums — Opus is the third derived thing to regenerate there). +- Like the waveform datum, it gets a **regenerate trigger**: a CMS per-track / bulk action and an + ApiKey-gated endpoint, so existing tracks can be backfilled. This mirrors the landed + "Generate All Profiles / Backfill High-res" bulk actions on `Releases.razor` — **Backfill Opus** + is the natural third bulk action. + +**The transcode engine itself is staff-engineer's call** (FFmpeg/libopus via a process invocation, a +managed binding, or a libopus P/Invoke). The spec fixes the *artifact* (Ogg Opus, fullband, 320 kbps) +and the *seam* (a derived artifact produced post-store, regenerable, failure-tolerant), not the tool. +Note a real operational constraint to flag for implementation: transcoding a 1 GB WAV is **CPU- and +time-expensive** and must not block the upload response — it wants the same off-the-hot-path treatment +the upload body staging already gets (`Upload:StagingPath`), likely a background/queued step. This is +the single biggest implementation risk and is called out as such. + +### 3.2 Where the Opus artifact is stored (two options) + +**Option S1 — derived key in the existing `tracks` vault (recommended).** Store the Opus bytes under +a derived entry key alongside the source, e.g. `{entryKey}` for source and `{entryKey}.opus` (or a +parallel key convention) in the same `tracks` vault. *Pro:* no new vault type, co-located with the +source, simplest lookup. *Con:* mixes two artifacts per logical track in one vault's index. + +**Option S2 — a new sibling vault (e.g. `track-opus`).** Mirror the `track-waveforms` precedent +(Phase 12 added a dedicated vault for the derived high-res datum). Opus bytes keyed by the same +`EntryKey` in a `track-opus` vault. 
*Pro:* clean separation of source vs. derived, matches the +established "derived artifacts get their own vault" pattern (`track-waveforms`), easy to enumerate / +backfill / purge independently. *Con:* one more vault to register. + +**Recommendation: S2** — it is the pattern the codebase already chose for the *other* derived +per-track artifact (the high-res waveform datum), so it is the least surprising and keeps the source +`tracks` vault meaning exactly one thing. **Final call is staff-engineer's**; both are viable. + +### 3.3 How a listener's format choice reaches the bytes + +The stream endpoint gains a **format selector**. Two candidate mechanisms: + +- **D-a — explicit query param** `GET api/track/{id}?format=opus|lossless` (recommended). Mirrors the + existing `offset` query param the proxy already forwards (`TrackProxyController`). Explicit, + cache-friendly (distinct URLs), trivial to thread through the proxy, and the player already knows + which it asked for. Server resolves the param → the right artifact → sets the right `Content-Type`, + which the player's existing `createFormatDecoder` then dispatches on. **No new decoder-selection + mechanism** — the response content-type does the work it already does. +- **D-b — HTTP content negotiation** (`Accept: audio/ogg` vs `audio/wav`). More "correct" REST, but + the proxy + WASM client wiring is fussier and caches are content-type-varied. Not worth it here. + +**Recommended: D-a.** The selection *policy* (which format a given listener gets by default, and how +they switch) is a genuine **product call — see OQ1/OQ2**, deliberately not decided here. The +*mechanism* (a query param resolved server-side to an artifact + content-type) is settled. 
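The D-a mechanism can be sketched as a pure resolution function. This is a minimal illustration in TypeScript (chosen to match the interop decoders), not the endpoint itself: `resolveArtifact`, `ArtifactStore`, and the `Artifact` shape are hypothetical names, the `track-opus` vault assumes Option S2, and the real logic would live server-side in `DeepDrftAPI`. The fallback branch follows C2: Opus is additive, so a missing Opus artifact degrades to the lossless one.

```typescript
// Hypothetical shapes for illustration only; none of these names exist in the codebase.
type DeliveryFormat = "opus" | "lossless";

interface Artifact {
  vault: string;       // "tracks" (source) or the proposed "track-opus" (Option S2)
  contentType: string; // what the client's createFormatDecoder dispatches on
}

// Looks up an artifact by vault + entry key; undefined when nothing is stored.
type ArtifactStore = (vault: string, entryKey: string) => Artifact | undefined;

// Resolve one EntryKey + requested format to exactly one artifact.
function resolveArtifact(
  store: ArtifactStore,
  entryKey: string,
  format: DeliveryFormat,
): Artifact | undefined {
  if (format === "opus") {
    const opus = store("track-opus", entryKey);
    if (opus !== undefined) {
      return opus;
    }
    // No derived Opus artifact yet (legacy row or pending transcode):
    // fall through to the lossless source rather than failing the request.
  }
  return store("tracks", entryKey);
}
```

Because the resolved artifact carries its own content-type, the client needs no extra signal to pick a decoder; a request for `opus` that falls back simply arrives with the source's content-type.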
+ +Server-side fallback rule (C2): if `format=opus` is requested but no Opus artifact exists for that +track (not yet transcoded / backfilled), the endpoint **falls back to lossless** rather than 404ing — +Opus is additive, so its absence degrades to "you get the lossless one," never to "no audio." + +### 3.4 The Opus decoder + seek math (the genuinely new decode work) + +`OpusFormatDecoder implements IFormatDecoder` is the new code on the delivery side. Two things make it +harder than the WAV decoder and need to be flagged: + +- **Containerized, paged format — not raw-frame-sliceable.** WAV's `wrapSegment` prepends a 44-byte + PCM header to any PCM-aligned byte run; the current model assumes you can wrap an arbitrary aligned + raw-audio slice and hand it to `decodeAudioData`. **Ogg Opus is page-structured** (Ogg pages + carrying Opus packets, plus mandatory `OpusHead`/`OpusTags` setup pages at the start). A mid-stream + byte slice is not independently decodable without the setup header and without landing on Ogg page + boundaries. So `OpusFormatDecoder`'s `getAlignedSegmentSize` must align to **Ogg page boundaries** + (scan for the `OggS` capture pattern — analogous to FLAC's frame-sync scan, for which the + `IFormatDecoder` interface already passes `rawData` to `getAlignedSegmentSize`), and + `wrapSegment`/the continuation path must carry the `OpusHead` setup (analogous to FLAC's + `streamInfoBytes` in `FlacSeekData`). **The `IFormatDecoder` abstraction already has the shape for + this** — a format-specific `seekData` accelerator and a setup-bytes carry — because FLAC needed the + same kind of thing. A new `OpusSeekData` variant joins `Mp3VbrSeekData | FlacSeekData`. +- **VBR byte↔time mapping is approximate (the Phase 21 C5 case, concretely).** Opus at "320 kbps" is + effectively VBR; there is no exact `byteRate` for offset math the way CBR WAV has. 
Seek-by-offset + uses an **approximate** mapping (granule-position/Ogg-page interpolation, the Opus analogue of MP3's + Xing TOC or FLAC's SEEKTABLE). `calculateByteOffset` returns a best-effort page-aligned offset; the + decoder then re-syncs to the next Ogg page. This is exactly the "VBR formats: the mapping is + approximate" case Phase 21's C5 invariant anticipated — **Opus is the format that makes that + invariant load-bearing rather than hypothetical.** + +**Browser decode-support constraint (real, must be designed around).** The bespoke graph decodes +segments via `AudioContext.decodeAudioData`. Ogg-Opus support in `decodeAudioData` is long-standing in +Chrome and Firefox but arrived in **Safari only at 18.4 (macOS 15.4 / iOS 18.4, March 2025)**; older +Safari decodes Opus only in a CAF container, not Ogg. iOS Safari is a primary music-listening surface, +so this is not a corner case. Implications: (1) the **lossless WAV path is the universal fallback** for +listeners whose browser can't decode Ogg Opus — which C2's additive design already provides for free; +(2) format-default policy (OQ2) should consider capability detection — don't hand Ogg Opus to a Safari +that can't decode it. This intersects Phase 1.7 (Safari compatibility) and is flagged there too. +([Browser support: caniuse / WebKit 18.4 release notes — see Sources.]) + +### 3.5 The three candidate directions (shape-level) + +Per file convention the alternatives are recorded; the recommendation follows. + +**Direction A — Derived Opus artifact at ingest + format param on delivery (recommended).** What §3.1 +–3.4 describe: transcode to Opus 320 post-store, store as a derived artifact (S2 vault), serve via a +`?format=` param resolved server-side to bytes + content-type, decode via a new `OpusFormatDecoder` in +the existing registry. 
*Why recommended:* additive (C2), reuses every existing seam (the processor +orchestration, the waveform-datum derived-artifact pattern, the `Range` path, the decoder registry), +and the only genuinely new code is one transcode step + one decoder. Two derived artifacts per track, +both regenerable. + +**Direction B — On-the-fly transcode at delivery (no stored Opus artifact).** Transcode WAV→Opus per +request in the stream endpoint, streaming the Opus out as it encodes. *Why not (default):* moves +expensive CPU onto the **hot request path** (a 1 GB mix transcoded per play is untenable), breaks +`Range`/seek (you can't byte-offset into a stream you're encoding live), and defeats caching. It *is* +storage-cheaper (no second artifact on disk), so it is the fallback only if disk cost ever dominates — +but for a music site where the same tracks are played repeatedly, precompute-once wins decisively. +Rejected as the primary. + +**Direction C — Replace WAV ingest with Opus-only (transcode and discard the lossless source).** Make +Opus *the* stored format; drop WAV. *Why not:* violates Daniel's explicit "lossless streaming +*optional* — two delivery formats, listener gets a choice." Lossless is a kept option, not a thing to +transcode away. Also irreversibly lossy at ingest (you can never recover the WAV). Rejected outright; +recorded only because "just store Opus" is the tempting simplification and the spec should say why not. + +### 3.6 SOLID / road-not-taken rationale + +- **OCP, via the existing seams.** The transcode is a new processor sibling (the router pattern is + already open for extension); the decoder is a new `IFormatDecoder` (the registry is already open for + extension); the artifact is a new derived vault resource (the `track-waveforms` precedent is exactly + this). Phase 18 adds **three new leaf implementations** and **zero changes to existing format code** + — the strongest possible OCP signal that the seams were designed right. 
+- **SRP, preserved.** Transcoding is a content-domain processor concern (`DeepDrftContent`); delivery + selection is a thin endpoint concern (`DeepDrftAPI` resolves a param to an artifact); decode is the + `OpusFormatDecoder`'s concern; byte↔time math stays inside that decoder via `calculateByteOffset`. + No responsibility crosses a boundary it doesn't already own. +- **DIP / "one source, multiple views."** One `TrackEntity`/`EntryKey` is the single source; "lossless + WAV" and "low-data Opus" are two *views* (renderings) of it, diverging only at the delivery/decode + layer — the same discipline the dark-mode and track-browse surfaces follow. +- **Road not taken — a separate `TrackEntity` row (or a new track id) per format.** Tempting (one row + = one streamable file) but it fractures the track identity: shares, queues, play-counts (Phase 16), + release membership, and waveform data all key on one track, and doubling rows to carry a format + would force every one of those surfaces to dedupe. Format is a *delivery attribute of one track*, + not a *second track*. Rejected — keep one identity, two artifacts. + +--- + +## 4. Format selection — the product surface (deliberately under-specified; see OQ1/OQ2) + +Daniel has **not** specified the selection UX. What is settled by his direction: there are two formats, +Opus is the bandwidth-friendly **default-candidate**, lossless is the kept option. What is open: how a +listener expresses the choice, whether it is remembered, and whether the default is global or adapts. +These are genuine product calls — see §6. The *mechanism* (a `?format=` param the player sends; §3.3) +supports any of the policies, so the policy can be decided after the substrate lands. + +--- + +## 5. Use cases + +- **UC1 — Listener streams the low-data Opus of a long mix (the headline win).** A ~1 GB lossless mix + transfers as ~220 MB of Opus; playback through the bespoke graph is identical in feel, far cheaper + on bandwidth. 
(Compounds with Phase 21 windowing for the memory side.) +- **UC2 — Listener prefers lossless and switches to it.** The same track served as WAV via + `?format=lossless`; the bespoke graph decodes it exactly as today. +- **UC3 — Legacy / not-yet-transcoded track.** `?format=opus` requested, no Opus artifact yet → + server falls back to lossless (C2); the listener still hears the track. A later Backfill-Opus pass + produces the artifact. +- **UC4 — Admin backfills Opus for the existing catalogue.** A bulk "Backfill Opus" CMS action (the + third sibling to the existing Generate-Profiles / Backfill-High-res actions) transcodes every track + lacking an Opus artifact. +- **UC5 — Replace-audio regenerates Opus.** The existing replace-audio path (which already regenerates + both waveform datums and re-derives duration) also regenerates the Opus artifact from the new + source. +- **UC6 — Seek within an Opus stream.** Backward/forward seek resolves via the existing `Range` path; + the offset is the `OpusFormatDecoder`'s approximate page-aligned mapping (§3.4), re-syncing to the + next Ogg page — the VBR analogue of the WAV exact-offset seek. +- **UC7 — Safari that can't decode Ogg Opus.** Capability-gated to the lossless path (§3.4), so the + listener still plays audio. (Ties to OQ2 + Phase 1.7.) + +--- + +## 6. Open questions for Daniel (genuine product decisions, not implementation detail) + +- **OQ1 — Selection UX: how does a listener choose lossless vs. low-data?** Candidates: a global + toggle in the player bar / settings ("Stream quality: Low-data / Lossless"); a per-track control; an + automatic default with a manual override. Recommend a **single global quality toggle** (player bar + or a settings affordance) — it is the Spotify/Bandcamp/SoundCloud idiom (one account/session-level + "streaming quality" setting), low-friction, and matches a small-sharp-tool posture better than + per-track choosers. 
`[Daniel decision]` +- **OQ2 — Default policy: what does a listener get before they choose?** Opus is the + *default-candidate* per Daniel — confirm Opus-by-default. Sub-questions: should the default be + **capability-aware** (don't serve Ogg Opus to a browser that can't decode it — §3.4 Safari < 18.4)? + Should it be **network-aware** (Opus on cellular, lossless on wifi)? Recommend **Opus by default, + capability-gated** (fall back to lossless when the browser can't decode Ogg Opus), and **defer + network-awareness** as gold-plating for v1. `[Daniel decision]` +- **OQ3 — Is the choice remembered, and at what scope?** Per-session (resets each visit) vs. + persisted (cookie/`localStorage`, like the `darkMode` cookie) vs. (future) per-account once identity + exists. Recommend **persisted via a cookie/`localStorage` setting**, mirroring the dark-mode + precedent — one truth, seeded at prerender, carried to WASM. `[Daniel decision]` +- **OQ4 — Per-upload Opus control in the CMS, or always-on?** Should the CMS upload form let an admin + opt a track *out* of Opus generation (e.g. a track meant to be lossless-only), or is Opus always + generated for every track? Recommend **always-on** (simpler; Opus is additive and cheap to serve; + the listener's format choice already covers "I want lossless"). A per-track opt-out is a later + refinement if a real need appears. `[Daniel decision]` +- **OQ5 — Opus container/extension specifics.** Ogg Opus (`.opus` / `audio/ogg`) is the assumption + (broadest `decodeAudioData` support; Daniel said "Ogg Opus"). Confirm — vs. CAF-wrapped Opus (older + Safari) or WebM-Opus. Recommend **Ogg Opus** as Daniel directed; CAF-fallback for old Safari is not + worth it given the lossless fallback already covers those browsers (§3.4). 
`[Daniel steer — confirms + §3.4, not a blocker]` +- **OQ6 — Transcode execution model (flag, leans implementation).** Synchronous-at-upload is a + non-starter for 1 GB mixes (§3.1); the realistic options are a background/queued transcode after the + source is stored. This is largely staff-engineer's call, but it has a **product-visible + consequence**: a freshly uploaded track may be lossless-only for a short window until its Opus + artifact finishes. Confirm that "Opus appears shortly after upload, lossless available immediately" + is acceptable (it is the waveform-datum model already in place). `[Daniel steer]` + +--- + +## 7. Acceptance criteria + +- **AC1 (headline) — Dual-format delivery works.** A track can be streamed as either lossless WAV or + Ogg Opus 320 from the same `EntryKey`, selected per request; both play correctly through the bespoke + Web Audio graph. +- **AC2 — Opus is the low-data win.** The Opus artifact of a representative track is materially smaller + than its lossless source (target ~1/4–1/5 the bytes); a long mix's Opus transfer is correspondingly + smaller. +- **AC3 — Additive, non-breaking (C2).** The existing lossless WAV path is byte-for-byte unchanged; a + track with no Opus artifact still plays losslessly; `?format=opus` on such a track falls back to + lossless (no 404, no silence). +- **AC4 — Transcode at ingest, regenerable (C6).** A new upload produces an Opus artifact best-effort + after the source is stored; a transcode failure does not block the upload or break playback; a + Backfill-Opus action (re)generates artifacts for existing tracks; replace-audio regenerates the + Opus artifact from the new source. +- **AC5 — Opus seek via the existing `Range` path (C3).** Forward and backward seek in an Opus stream + resolve through the landed `Range: bytes=X-` primitive, with the offset coming from + `OpusFormatDecoder.calculateByteOffset`; no new seek mechanism is introduced. 
+- **AC6 — No format branches leak (C4).** The only Opus-specific code is `OpusFormatDecoder`, its + `OpusSeekData`, the one `createFormatDecoder` selection arm, and the transcode processor + delivery + param resolution. The format-agnostic player/scheduler code is unchanged. +- **AC7 — Capability-safe default (OQ2).** A browser that cannot decode Ogg Opus is served (or falls + back to) the lossless path and plays audio; no listener gets silence because of codec support. +- **AC8 — Windowing-ready (the Phase 21 handshake).** The `OpusFormatDecoder`'s approximate byte↔time + mapping is the one Phase 21's windowed refill will call; Opus playback must be windowable by the + same machinery (verified jointly when Phase 21 lands on top — see §8 / Phase 21 cross-ref). + +--- + +## 8. Wave decomposition + +Dependency shape: `18.1 → 18.2 → {18.3, 18.4}`, with `18.5` validating end-to-end. **18.1 (the +transcode/derived-artifact ingest) is the cold-start prerequisite** — until an Opus artifact exists, +nothing downstream has bytes to serve or decode. 18.3 (delivery param) and 18.4 (the decoder) are +largely parallel once 18.2 (storage/lookup) settles, but both need an artifact to test against. + +- **18.1 — Ingest transcode: derive + store the Opus artifact (cold-start; load-bearing).** New + `OpusTranscodeService`/processor in `DeepDrftContent`, invoked post-store from + `UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`; produces Ogg Opus fullband + 320; stores it as a derived artifact (S2 vault recommended). Failure-tolerant (C6) and off the hot + path (background/queued — OQ6). **Independent of the delivery/decoder waves; can begin immediately.** +- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention and the server-side + resolution "given `EntryKey` + format, return the right `AudioBinary` + content-type," including the + C2 fallback (no Opus → lossless). **Depends on 18.1** (an artifact must exist to resolve to). 
+- **18.3 — Delivery: format param + proxy threading.** `?format=opus|lossless` on the + `DeepDrftAPI` track stream endpoint (resolves via 18.2), forwarded through the `DeepDrftPublic` + `TrackProxyController` (mirror the existing `offset` param threading), and the `Range` handler + serving the chosen artifact's bytes. The player sends the param via `TrackMediaClient`. **Depends on + 18.2.** Parallel-ok with 18.4. +- **18.4 — `OpusFormatDecoder` in the player stack.** New `IFormatDecoder` implementation + (Ogg-page-aligned `getAlignedSegmentSize` via `OggS` scan, `OpusHead` setup carry in + `wrapSegment`/continuation, approximate page-interpolation `calculateByteOffset` with an + `OpusSeekData` accelerator); one new arm in `AudioPlayer.createFormatDecoder` on + `audio/ogg`/`audio/opus`. Capability detection for the lossless fallback (§3.4, OQ2). **Depends on + 18.2** (needs Opus bytes to decode). Parallel-ok with 18.3; they meet at 18.5. +- **18.5 — Backfill + selection UX + end-to-end validation.** The Backfill-Opus CMS bulk action (third + sibling to Generate-Profiles / Backfill-High-res) and replace-audio Opus regeneration; the listener + selection control per OQ1/OQ3 (global persisted quality toggle, recommended); and the AC1–AC8 + acceptance pass — including AC8's confirmation that Opus is windowable so Phase 21 can build on it. + **Depends on 18.1–18.4.** (Selection UX can be split out if Daniel wants the substrate proven before + the control lands — flag at planning time.) + +--- + +## 9. Cross-references (read before implementing) + +- `CONTEXT.md §5` "Non-WAV formats" — the deferred intent this phase realizes (now concrete: derived + Opus low-data path, not generic format support). 
+- `PLAN.md` Phase 21 / `product-notes/phase-21-windowed-streaming-buffer.md` — **sequenced AFTER this + phase.** Phase 21's C5 invariant ("WAV-only shipping target; must not foreclose MP3/FLAC") is now + driven by Opus's VBR/paged seek math; Phase 21 OQ5 (adopt MSE) is resolved **NO** — the bespoke + graph stays (the same C1 decision recorded here). Windowing a VBR/Opus stream uses + `OpusFormatDecoder.calculateByteOffset`'s approximate mapping — exactly the C5 case. +- `PLAN.md` Phase 4 (landed) / `COMPLETED.md` — the HTTP `Range: bytes=X-` primitive Opus seek reuses. +- `PLAN.md` Phase 1.5 (gapless) / 1.6 (track-skip on error) / 1.7 (Safari) — 1.5's "encoder + padding/priming" caveat applies to Opus (it has pre-skip samples in `OpusHead`); 1.6's + byte-scan-to-next-frame is the Ogg-page-sync analogue; 1.7's Safari floor intersects §3.4's Ogg-Opus + `decodeAudioData` support (Safari < 18.4). +- `PLAN.md` Phase 12 / `product-notes/phase-12-waveform-visualizer-generalization.md` — the + `WaveformProfileService` derived-artifact-at-ingest + regenerate pattern this transcode mirrors + (compute on upload, regenerate via CMS action / endpoint, its own `track-waveforms` vault → the S2 + precedent). +- `PLAN.md` Phase 9 — defines the `Mix` medium (single long track), the canonical low-data case. +- `PLAN.md` Phase 16 — play/share telemetry keys on one track identity; the §3.6 road-not-taken + (one-row-per-format) would have fractured this — kept to one identity, two artifacts. +- `DeepDrftContent/Processors/AudioProcessor.cs` + `AudioProcessorRouter` + `DeepDrftContent/CLAUDE.md` + — the existing format-router and the `WaveformProfileService` derived-artifact seam; 18.1 lives here. +- `DeepDrftPublic/Interop/audio/IFormatDecoder.ts` — the strategy interface `OpusFormatDecoder` + implements; `FlacFormatDecoder.ts` is the nearest prior art (setup-bytes carry + frame-sync scan). 
+- `DeepDrftPublic/Interop/audio/AudioPlayer.ts` (`createFormatDecoder`, lines 117–125) — the decoder + registry gaining the Opus arm. +- `DeepDrftPublic.Client/Clients/TrackMediaClient.cs` + `DeepDrftPublic/Controllers/TrackProxyController.cs` + — the media fetch + proxy that thread the new `?format=` param (mirroring `offset`). + +## Sources + +- Ogg Opus support in `decodeAudioData`: Chrome/Firefox long-standing; Safari added Ogg-Opus at 18.4 + (macOS 15.4 / iOS 18.4, March 2025) — prior Safari decoded Opus only in CAF. + https://chromestatus.com/feature/5649634416394240 ; + https://www.testmuai.com/learning-hub/opus-audio-codec-browser-support/ diff --git a/product-notes/phase-21-windowed-streaming-buffer.md b/product-notes/phase-21-windowed-streaming-buffer.md index b9cd494..e188ece 100644 --- a/product-notes/phase-21-windowed-streaming-buffer.md +++ b/product-notes/phase-21-windowed-streaming-buffer.md @@ -8,6 +8,16 @@ server touch is **reuse, not new surface**: the existing `DeepDrftAPI` HTTP `Ran partial-content primitive (Phase 4, landed) is the load-bearing dependency; this phase adds no new API endpoint. +> **Sequencing dependency (Daniel, 2026-06-23): Phase 18 (Opus Low-Data Streaming) comes BEFORE this +> phase.** Format support — specifically the derived **Ogg Opus fullband 320** low-data delivery path +> (`product-notes/phase-18-opus-low-data-streaming.md`) — is a prerequisite that sequences ahead of +> windowing. Phase 21's windowing must work across **both** delivery formats (lossless WAV and Opus). +> Its C5 invariant below already anticipated this ("must not foreclose MP3/FLAC"); **Opus is now the +> concrete VBR/containerized driver of C5.** Windowing an Opus stream uses the decoder's *approximate* +> byte↔time mapping (`OpusFormatDecoder.calculateByteOffset` — Ogg-page interpolation), exactly the C5 +> case — not the exact CBR-WAV `byteRate` math. Build the window machinery format-agnostically +> (§2 C3/C5) so it inherits Opus for free. 
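
The exact-versus-approximate distinction in the sequencing note can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's decoder code: the `WavInfo` and `PagePoint` shapes, the function names, and the linear interpolation between known page positions are all hypothetical stand-ins for whatever `calculateByteOffset` and `OpusSeekData` actually do. The one fixed fact used is that every Ogg page begins with the four-byte capture pattern `OggS`, which is what a re-sync scan looks for.

```typescript
// Hypothetical shapes for illustration; not taken from the DeepDrft codebase.
interface WavInfo { headerBytes: number; byteRate: number; blockAlign: number; }
interface PagePoint { byteOffset: number; timeSeconds: number; }

// CBR WAV: time maps to bytes exactly, rounded down to a whole sample frame.
function wavByteOffset(info: WavInfo, seconds: number): number {
  const raw = Math.floor(Math.max(seconds, 0) * info.byteRate);
  return info.headerBytes + (raw - (raw % info.blockAlign));
}

// VBR Ogg Opus: only an estimate. Interpolate linearly between the nearest
// known (byte, time) points; with no known pages this degrades to a plain
// proportional guess across the whole file.
function approxOpusByteOffset(
  seconds: number,
  durationSeconds: number,
  totalBytes: number,
  knownPages: PagePoint[] = [],
): number {
  if (durationSeconds <= 0 || totalBytes <= 0) return 0;
  const t = Math.min(Math.max(seconds, 0), durationSeconds);
  const points: PagePoint[] = [
    { byteOffset: 0, timeSeconds: 0 },
    ...knownPages,
    { byteOffset: totalBytes, timeSeconds: durationSeconds },
  ].sort((a, b) => a.timeSeconds - b.timeSeconds);
  for (let i = 1; i < points.length; i++) {
    const lo = points[i - 1]!;
    const hi = points[i]!;
    if (t <= hi.timeSeconds) {
      const span = hi.timeSeconds - lo.timeSeconds;
      const frac = span > 0 ? (t - lo.timeSeconds) / span : 0;
      return Math.floor(lo.byteOffset + frac * (hi.byteOffset - lo.byteOffset));
    }
  }
  return totalBytes;
}

// After fetching from the estimated offset, re-sync to the next page start by
// scanning for the Ogg capture pattern "OggS" (0x4f 0x67 0x67 0x53).
function findNextOggPage(bytes: Uint8Array, from = 0): number {
  for (let i = Math.max(from, 0); i + 3 < bytes.length; i++) {
    if (bytes[i] === 0x4f && bytes[i + 1] === 0x67 &&
        bytes[i + 2] === 0x67 && bytes[i + 3] === 0x53) {
      return i;
    }
  }
  return -1; // no page boundary in this chunk
}
```

A format-agnostic window layer would call only the decoder's offset method and never branch on which of these two strategies sits behind it; that is the property C5 asks for.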
+ --- ## 1. Goal @@ -45,19 +55,25 @@ docs. This phase **modifies that seam** — so the contract it must preserve is - **C2 — Playback start latency unchanged.** Today playback starts as soon as a configurable minimum buffer count is queued (header-derived duration, not full-file). The window model must keep first-audio latency at parity — bounding memory must not reintroduce a fetch-then-play stall. -- **C3 — The format-decoder abstraction is untouched.** `IFormatDecoder` (WAV active; MP3/FLAC - implemented, not yet wired) owns all format-specific byte math. Windowing lives in the +- **C3 — The format-decoder abstraction is untouched.** `IFormatDecoder` owns all format-specific + byte math; `AudioPlayer.createFormatDecoder` already dispatches on `Content-Type` (WAV/MP3/FLAC + decoders all wired today — verified 2026-06-23; an `OpusFormatDecoder` joins them in Phase 18). + Windowing lives in the **format-agnostic** layer (`PlaybackScheduler` eviction + `StreamDecoder`/player refill orchestration); it must add **no** format-specific branches. A future wired MP3/FLAC decoder inherits windowing for free. - **C4 — Read-only playback only.** This is a memory-management change, not a UX change. No new user-visible control, no change to seek/transport semantics beyond what the listener already experiences. Seek must still feel identical. -- **C5 — WAV-only is the shipping target; the design must not foreclose MP3/FLAC.** Byte↔time mapping - for refill is exact and cheap for WAV (CBR: `byteRate` from the header). For VBR formats the mapping is - approximate (the decoders already carry TOC/SEEKTABLE seek math). 
The window machinery must express - refill in terms of the decoder's existing `calculateByteOffset`, so the same code works when those - formats are wired — **no WAV-special-cased offset math in the window layer.** +- **C5 — Must window both delivery formats (WAV lossless AND Opus low-data).** Byte↔time mapping for + refill is exact and cheap for WAV (CBR: `byteRate` from the header). For VBR/containerized formats it + is approximate (the decoders carry TOC/SEEKTABLE/Ogg-page seek math). **Phase 18 (Opus) is sequenced + before this phase and is the concrete driver here:** an Ogg Opus 320 stream is VBR and page-paged, so + its `calculateByteOffset` is an *approximate* page-interpolation, not exact-offset. The window + machinery must express refill purely in terms of the decoder's existing `calculateByteOffset`, so the + same code windows WAV exactly and Opus approximately — **no WAV-special-cased offset math in the + window layer.** (MP3/FLAC decoders are already wired in the registry too — the registry dispatches on + content-type today; an `OpusFormatDecoder` joins them in Phase 18.) - **C6 — No regression to the single-instance JS decoder concurrency guarantees.** The current code is careful that only one streaming loop touches the single JS `StreamDecoder` at a time (`DrainActiveStreamingTaskAsync`, the `_streamingCancellation` identity dance). Windowed refill @@ -146,14 +162,15 @@ because the stack is a bespoke Web Audio graph, not `<audio>` + MSE. Stop hand-rolling the decode→schedule graph for long tracks; feed the Range stream into a `SourceBuffer` and let the browser evict via its built-in quota + `remove()`. Memory management becomes the platform's problem. -*Why not (now, but flag for Daniel):* MSE does not accept raw WAV/PCM — it wants containerized formats -(fragmented MP4/WebM, or MP3/AAC elementary streams). The current producer is WAV-only, and the entire -bespoke visualizer/spectrum graph is wired to the Web Audio `AudioContext`, not a `<audio>` element.
-Adopting MSE is a **rewrite of the playback substrate**, not a windowing change — out of scope for this -phase. But it is the *real* long-term answer and is entangled with Phase 1.2 (non-WAV formats): if -DeepDrft moves to a compressed delivery format, MSE becomes viable and could retire the hand-rolled -decoder, the seek-beyond-buffer path, *and* this phase's window machinery in one move. **Surfaced as -open question OQ5** — not to decide now, but so this phase is built knowing it may be superseded. +*Why not — RESOLVED, rejected (Daniel, 2026-06-23; see OQ5):* MSE does not accept raw WAV/PCM — it +wants containerized formats (fragmented MP4/WebM, or MP3/AAC elementary streams). The entire bespoke +visualizer/spectrum graph is wired to the Web Audio `AudioContext`, not a `<audio>` element. Adopting +MSE is a **rewrite of the playback substrate**, not a windowing change. It *looked* like the real +long-term answer once compressed delivery arrived — but Daniel has decided compressed delivery +(**Phase 18 Opus**) will feed the **same bespoke graph** via the `IFormatDecoder` seam, so the +compressed-delivery move that would have justified MSE happens *without* surrendering the graph. **The +bespoke graph is a deliberate long-term commitment; MSE is rejected.** Direction A is therefore the +permanent destination, not a stopgap that MSE will retire. Recorded as considered-and-declined. ### 3.3 Recommended direction: A, with B held as the documented fallback @@ -262,11 +279,17 @@ These are policy calls with user-visible or resource trade-offs — flagged rath tracks that never needed it. Recommend **window everything** (one path, C6-safe, and short tracks simply never hit a refill because they fit inside the forward window) — but Daniel may prefer a size threshold. `[Daniel decision]` -- **OQ5 — Is MSE (Direction C) the real destination?** Not for this phase, but it bears on how much to - invest here.
If DeepDrft will move to compressed delivery (Phase 1.2) and MSE within ~a year, Phase 21 - should be the *minimal* Direction-A change (don't gold-plate machinery MSE would retire). If WAV + - bespoke graph is the long-term commitment, a more thorough windowing investment is justified. - `[Daniel steer — informs scope, not a blocker]` +- **OQ5 — Is MSE (Direction C) the real destination? — RESOLVED: NO (Daniel, 2026-06-23).** **Do not + adopt MSE. The bespoke Web Audio decode→schedule graph stays — it is bespoke by deliberate choice, a + long-term commitment, not a stopgap.** Daniel's rationale: the player is intentionally a custom + graph, not an HTML `<audio>` element; the compressed-delivery move that *would* have made MSE + tempting is being met instead by **Phase 18 (Opus low-data path)** feeding the **same bespoke graph** + through the `IFormatDecoder` seam — so compressed delivery arrives *without* surrendering the graph. + Consequence for this phase: Direction A (the hand-rolled sliding window) is the destination, not a + placeholder; invest in it as permanent machinery. It will window both the WAV and the Opus path + (the sequencing note at the top). Direction C is recorded as **considered and declined** per file + convention; kept visible so a future reader sees the road not taken and why. + `[RESOLVED — bespoke graph retained; MSE rejected]` ---