docs(plan): add Phase 18 Opus low-data streaming; resolve Phase 21 OQ5 (no MSE)

daniel-c-harvey
2026-06-23 04:58:21 -04:00
parent a84a99c309
commit 1bdaeaa164
3 changed files with 610 additions and 29 deletions
+105 -9
@@ -443,6 +443,90 @@ not the same work; this phase does not satisfy or depend on that one.
---
## Phase 18 — Opus Low-Data Streaming (dual-format lossless + Opus delivery)
The concrete realization of the long-deferred **"Non-WAV formats"** intent (`CONTEXT.md §5`). Daniel's
direction (2026-06-23): **two delivery formats per track — the existing lossless WAV path, and a new
low-data Ogg Opus (fullband, 320 kbps) path — so the listener gets a choice, with Opus the
bandwidth-friendly default-candidate.** Lossless streaming becomes *optional*, not the only path. The
bespoke Web Audio decode→schedule graph is **retained by deliberate choice** — Opus feeds the same
`IFormatDecoder` seam, not an HTML `<media>` element or MSE (the decision shared with Phase 21 OQ5).
**Sequenced BEFORE Phase 21** — windowing must work across both formats. Surfaces: ingest/preprocessing
in `DeepDrftContent` (`AudioProcessor`/router/`WaveformProfileService`) + `DeepDrftAPI`
(`UnifiedTrackService.UploadAsync`, replace-audio); delivery/decode in `DeepDrftAPI` (stream endpoint +
`Range`) + `DeepDrftPublic` proxy + `DeepDrftPublic.Client` player stack + `DeepDrftPublic/Interop/audio`
TS decoders. Full design, the three directions with SOLID/road-not-taken rationale, the storage and
delivery options, the Opus decoder + seek math, acceptance criteria, open questions, and wave
decomposition: `product-notes/phase-18-opus-low-data-streaming.md`.
**Much further along than the backlog line implies (verified 2026-06-23).** The multi-format *substrate*
already exists on both sides: the producer-side `AudioProcessorRouter` routes `.wav`/`.mp3`/`.flac` and
`TrackContentService.AddTrackAsync` is format-agnostic (it **stores originals**, no transcode); the
decoder-side `AudioPlayer.createFormatDecoder` is a **wired** strategy registry dispatching on
`Content-Type` (WAV/MP3/FLAC decoders all present — correcting the Phase 21 spec's stale
"implemented-not-wired" note). **The actual gap is Daniel's specific ask:** (1) a **transcode-at-ingest**
step that *derives* an Opus 320 artifact per track (nothing derives Opus today), and (2) a **per-format
delivery selection** so one track serves as either WAV or Opus on request.
**Architectural spine — a derived artifact + a delivery param + one new decoder; three new leaf
implementations, zero changes to existing format code (the strong OCP signal).** Transcode is a new
processor sibling in `DeepDrftContent`, invoked post-store alongside `WaveformProfileService`,
**failure-tolerant and off the hot path** (background/queued — a 1 GB WAV transcode must not block the
upload response) — mirroring the landed waveform-datum pattern (derive at ingest, regenerate via a CMS
bulk action + ApiKey endpoint). The Opus bytes are a **derived artifact** stored like the high-res
waveform datum (recommend a dedicated `track-opus` vault, the `track-waveforms` precedent; final call
staff-engineer's). Delivery adds a **`?format=opus|lossless` param** (mirroring the existing `offset`
param threading through `TrackProxyController`) resolved server-side to the right artifact + content-type,
with a **lossless fallback** when no Opus artifact exists (additive, never 404/silence). The player gains
one `OpusFormatDecoder` (`IFormatDecoder`): Ogg-page-aligned segmenting (`OggS` scan — the FLAC
frame-sync analogue), `OpusHead` setup-bytes carry (the FLAC `streamInfoBytes` analogue), and an
**approximate** page-interpolation `calculateByteOffset` (Opus is VBR/paged — this is exactly the Phase
21 C5 case). **Browser constraint flagged:** Ogg-Opus `decodeAudioData` is Safari-18.4+ only (Chrome/FF
long-standing), so the Opus default must be **capability-gated** — fall back to the universal lossless
path on browsers that can't decode it.
**Constraints/invariants:** keep the bespoke graph (no MSE); preprocessing is **additive** (WAV path
untouched, byte-for-byte; a track with no Opus artifact still plays losslessly); reuse the landed
`Range`/offset seek path; no format branches leak outside the new decoder + one selection arm + the
transcode/delivery seam; transcode failure must not block ingest; format selection is a delivery-time
decision resolving one `EntryKey` to one of two artifacts (one source, two views — **not** a second
`TrackEntity` row, which would fracture share/queue/play-count/release identity).
Sequenced as five waves. `18.1 → 18.2 → {18.3, 18.4} → 18.5`. **18.1 (ingest transcode + derived
artifact) is the cold-start prerequisite** — nothing downstream has bytes to serve or decode until an
Opus artifact exists.
- **18.1 — Ingest transcode: derive + store the Opus artifact (cold-start; load-bearing).** New
`OpusTranscodeService`/processor in `DeepDrftContent`, invoked post-store from
`UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`; produces Ogg Opus fullband 320;
stores it as a derived artifact (recommend a `track-opus` vault). Failure-tolerant; off the hot path
(background/queued). **Independent of the delivery/decoder waves — can begin immediately.**
- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention + server-side "given
`EntryKey` + format, return the right `AudioBinary` + content-type," including the lossless fallback.
**Depends on 18.1.**
- **18.3 — Delivery: `?format=opus|lossless` param + proxy threading.** On the `DeepDrftAPI` stream
endpoint (resolves via 18.2), forwarded through `TrackProxyController` (mirror `offset`), `Range`
serving the chosen artifact; player sends it via `TrackMediaClient`. **Depends on 18.2; parallel-ok
with 18.4.**
- **18.4 — `OpusFormatDecoder` in the player stack.** New `IFormatDecoder` (Ogg-page segmenting,
`OpusHead` carry, approximate page-interpolation `calculateByteOffset` with an `OpusSeekData`
accelerator) + one arm in `createFormatDecoder` on `audio/ogg`/`audio/opus`; capability detection for
the lossless fallback. **Depends on 18.2; parallel-ok with 18.3.**
- **18.5 — Backfill + selection UX + end-to-end validation.** "Backfill Opus" CMS bulk action (third
sibling to Generate-Profiles / Backfill-High-res) + replace-audio Opus regeneration; the listener
selection control (recommend a global persisted quality toggle); the AC1–AC8 acceptance pass including
the Phase-21 handshake (Opus is windowable by the same machinery). **Depends on 18.1–18.4.**
**Dependency shape:** `18.1 → 18.2 → {18.3 ∥ 18.4} → 18.5`; 18.1 is the only cold-start wave.
**Phase-level: 18 precedes Phase 21.** **Open questions for Daniel (spec §6):** selection UX (recommend a
single global quality toggle); default policy (recommend Opus-by-default, capability-gated; defer
network-awareness); whether the choice is remembered + scope (recommend persisted cookie/`localStorage`,
the dark-mode precedent); per-upload Opus opt-out vs. always-on (recommend always-on); Ogg-vs-CAF/WebM
container (recommend Ogg Opus as directed); transcode execution model (background/queued — a track is
lossless-only briefly until its Opus finishes; confirm acceptable). None block 18.1.
---
## Phase 21 — Windowed Streaming Buffer (bounded client memory for long streams)
Bound the **client memory** a playing track consumes to a small, configurable forward window —
@@ -451,6 +535,14 @@ plays without the whole decoded PCM accumulating in the browser. **Public listen
(`DeepDrftPublic.Client` player stack + `DeepDrftPublic` TypeScript audio interop); no CMS, no API
endpoint, no schema change.
**Sequenced AFTER Phase 18 (Opus Low-Data Streaming) — Daniel, 2026-06-23.** Format support (the
derived Ogg Opus 320 low-data path, Phase 18) is a prerequisite that comes first; windowing must work
across **both** delivery formats. Phase 21's C5 invariant already anticipated this ("must not foreclose
MP3/FLAC"); **Opus is now the concrete VBR/paged driver** — windowing an Opus stream uses the decoder's
*approximate* byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, Ogg-page interpolation), not
the exact CBR-WAV `byteRate` math. Build the window machinery format-agnostically so it inherits Opus
for free.
The network path already streams in adaptive 16–64 KB chunks. The accumulation is on the **decode
side**: `PlaybackScheduler` holds an `AudioBuffer[]` it **never evicts** ("Supports pause/resume/seek by
retaining all buffers" — its own doc comment). Decoded PCM is larger than the source (Web Audio is
@@ -467,9 +559,11 @@ just triggered manually and one-shot. The only genuinely new mechanisms are **pa
scheduler and **back-pressure** on the forward read loop (stop calling `ReadAsync` above a high-water
mark, resume below low-water). Recommended **Direction A** (sliding window on the existing single forward
stream); **Direction B** (discrete Range-fetched segments — the HLS/DASH/MSE-eviction analogue) held as
the documented fallback; **Direction C** (adopt MSE and let the browser manage the buffer) flagged as the
real long-term answer but out of scope — it is a playback-substrate rewrite entangled with non-WAV
formats (Phase 1.2), surfaced as OQ5.
the documented fallback; **Direction C** (adopt MSE and let the browser manage the buffer) **rejected
(OQ5 = NO, Daniel 2026-06-23)** — the bespoke Web Audio graph is a deliberate long-term commitment, and
the compressed-delivery move that would have justified MSE is met instead by **Phase 18 (Opus) feeding
the same bespoke graph** through the `IFormatDecoder` seam. Direction A is therefore the permanent
destination, not a stopgap MSE would retire.
**Invariants that must hold (the §3.5 seam contract).** Reuse the Range path, don't fork it; playback-
start latency at parity; the `IFormatDecoder` abstraction untouched (windowing is format-agnostic, so
@@ -513,12 +607,14 @@ parameters fed in later).
running, rapid-seek concurrency. Largely measurement; breaks are tuning fixes in 21.1's anchor math or
21.2's water-marks. **Depends on 21.1–21.3.**
**Dependency shape:** `21.1 → 21.2 → 21.3 → 21.4`; 21.1 is the only cold-start wave. **Open questions for
Daniel (spec §6):** window-size policy axis (time-based window + memory guard — recommended); seek-back-
past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard total in-flight memory cap
as a guard rail (recommend yes); window everything vs. only long tracks (recommend everything — one path,
short tracks never hit a refill); and whether MSE is the real destination (steer informing scope, not a
blocker). None block 21.1.
**Dependency shape:** `21.1 → 21.2 → 21.3 → 21.4`; 21.1 is the only cold-start wave. **Phase-level
prerequisite: Phase 18 (Opus) lands first** so windowing is built against both formats. **Open questions
for Daniel (spec §6):** window-size policy axis (time-based window + memory guard — recommended); seek-
back-past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard total in-flight
memory cap as a guard rail (recommend yes); window everything vs. only long tracks (recommend everything
— one path, short tracks never hit a refill). **OQ5 (adopt MSE) — RESOLVED NO (Daniel 2026-06-23): the
bespoke graph stays by deliberate choice; recorded considered-and-declined, kept visible per file
convention.** None block 21.1.
---
@@ -0,0 +1,462 @@
# Phase 18 — Opus Low-Data Streaming (dual-format lossless + Opus delivery)
Product spec. Status: **design / framing — implementation-ready pending Daniel's open-question calls.**
Author: product-designer. Date: 2026-06-23. **No code has been written by this doc.**
This phase is the concrete realization of the long-deferred **"Non-WAV formats"** intent
(`CONTEXT.md §5`, the "1.2" the streaming-feature items reference). It supersedes the abstract "a
processor per format + a decoder strategy" framing with a specific, Daniel-directed product: **two
delivery formats per track — the existing lossless WAV path and a new low-data Ogg Opus path — so the
listener gets a choice, with Opus the bandwidth-friendly default-candidate.**
Surfaces (named precisely):
- **Ingest / preprocessing:** `DeepDrftContent` (`AudioProcessor` / `AudioProcessorRouter` /
`TrackContentService` / `WaveformProfileService`) + `DeepDrftAPI` (upload/persist —
`UnifiedTrackService.UploadAsync`, replace-audio) + `DeepDrftManager` (CMS upload form, only if a
per-upload control is wanted — see OQ4).
- **Delivery / decode:** `DeepDrftAPI` (the track stream endpoint + `Range` handler) +
`DeepDrftPublic` proxy (`TrackProxyController`) + `DeepDrftPublic.Client` player stack
(`StreamingAudioPlayerService`, `TrackMediaClient`) + `DeepDrftPublic/Interop/audio` TS decoders
(`AudioPlayer.createFormatDecoder` registry, a new `OpusFormatDecoder`).
**Sequencing headline: Phase 18 comes BEFORE Phase 21 (Windowed Streaming Buffer).** Phase 21's
windowing must work across both formats — its C5 invariant already anticipated this ("must not
foreclose MP3/FLAC"); Opus is now the concrete VBR/containerized driver of that invariant. See §6 and
the Phase 21 cross-reference.
---
## 0. State of the world (what already exists — verified 2026-06-23)
This phase is **much further along than the "Non-WAV formats" backlog line implies**, on both sides.
Two prior efforts already built most of the multi-format substrate; what is *missing* is specifically
the **derived-Opus-artifact** idea, not generic format support.
**Producer side is already multi-format (router landed):**
- `AudioProcessorRouter.ProcessAudioFileAsync(filePath)` routes by extension: `.wav` →
`AudioProcessor`, `.mp3` → `Mp3AudioProcessor`, `.flac` → `FlacAudioProcessor`
(`DeepDrftContent/CLAUDE.md`).
- `TrackContentService.AddTrackAsync(filePath, mimeType)` is **format-agnostic**: it selects the
processor, generates an entry GUID, and **stores the original bytes** with correct extension/MIME
in the `tracks` vault.
- So today the system can *ingest and store* WAV/MP3/FLAC. It **does not transcode** — it keeps the
original. There is no derived artifact and no second format per track.
**Decoder side is a wired strategy registry (not "implemented-not-wired" anymore):**
- `AudioPlayer.createFormatDecoder(contentType)` (`AudioPlayer.ts:117`) dispatches on `Content-Type`:
`audio/mpeg|audio/mp3` → `Mp3FormatDecoder`, `audio/flac|audio/x-flac` → `FlacFormatDecoder`,
default → `WavFormatDecoder`. All three decoders exist and implement `IFormatDecoder`.
- `IFormatDecoder` (`IFormatDecoder.ts`) is a clean per-format strategy: `tryParseHeader`,
`getAlignedSegmentSize`, `wrapSegment`, `calculateByteOffset`, plus a `FormatInfo` carrying
`byteRate`, `blockAlign`, `audioDataOffset`, and a `seekData` accelerator slot (already polymorphic:
`Mp3VbrSeekData | FlacSeekData`). **This is the seam an `OpusFormatDecoder` slots into.**
- **Correction to the Phase 21 spec's §2 C3 note** ("MP3/FLAC implemented, not yet wired"): the
registry *is* wired and dispatches on content-type today. Phase 21's invariant still holds; the
parenthetical is stale and is corrected by this phase's reconciliation.
**What this means for the gap.** Daniel's direction is **not** "add format support" — that substrate
exists. It is "**derive a second, low-data artifact (Opus fullband 320) at ingest and let the listener
pick which to stream.**" That is two genuinely new things: (1) a **transcode-at-ingest** step that
produces a derived artifact per track (the router stores originals; nothing derives Opus), and (2) a
**per-format delivery selection** so the same track can be served as either WAV or Opus on request.
---
## 1. Goal
**Dual-format delivery.** Every track is streamable in two formats:
- **Lossless** — the existing WAV path, unchanged. The archival / audiophile option.
- **Low-data** — a derived **Ogg Opus, fullband, 320 kbps** artifact. The bandwidth-friendly
default-candidate.
The listener chooses; Opus is the recommended default. The bespoke Web Audio decode→schedule graph is
**retained by deliberate choice** (Daniel) — Opus is fed through the same `IFormatDecoder` strategy
seam, not through an HTML `<media>` element or MSE.
**Why Opus fullband 320.** Opus is the modern, royalty-free, best-in-class lossy codec; "fullband"
(48 kHz, full 20 kHz audio bandwidth) at 320 kbps is transparent-to-most-listeners quality at roughly
**1/4 to 1/5 the bytes of 16-bit/44.1 kHz stereo WAV** (~1411 kbps). For a 1 GB DJ mix (Phase 9 `Mix`
medium), that is the difference between a ~1 GB transfer and a ~220 MB transfer — the headline
low-data win, and directly relevant to the Phase 21 long-stream case.
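The ratio follows from plain bitrate arithmetic. A minimal sketch, assuming 16-bit/44.1 kHz stereo PCM and treating 320 kbps as an average (Opus is effectively VBR, so real sizes will vary); the function names are illustrative and not taken from the codebase:

```typescript
// Illustrative arithmetic only; names are hypothetical.

/** Bitrate in kbps of uncompressed PCM audio. */
function pcmKbps(sampleRateHz: number, bitsPerSample: number, channels: number): number {
  return (sampleRateHz * bitsPerSample * channels) / 1000;
}

/** Estimated size of the derived artifact, scaling the source size by the bitrate ratio. */
function estimateOpusBytes(sourceBytes: number, sourceKbps: number, opusKbps: number): number {
  return Math.round(sourceBytes * (opusKbps / sourceKbps));
}

const wavKbps = pcmKbps(44100, 16, 2);              // 1411.2 kbps
const ratio = 320 / wavKbps;                        // a little under 1/4
const opusBytes = estimateOpusBytes(1_000_000_000, wavKbps, 320);

console.log(wavKbps, ratio.toFixed(3), opusBytes);
```

Container overhead (WAV header, Ogg page headers) is ignored here; it is negligible at these sizes.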
**Non-goals.** This phase does not retire WAV (it stays as the lossless option), does not change the
bespoke graph for MSE (explicitly rejected — see §2 / Phase 21 OQ5), and does not add new transport
mechanisms beyond the existing stream + `Range` primitive.
---
## 2. Constraints / invariants (the contract that must hold)
- **C1 — Keep the bespoke Web Audio graph. MSE is rejected (Daniel, deliberate).** The custom
decode→schedule graph is a long-term commitment, not a stopgap. Opus is fed through the existing
`IFormatDecoder` → `StreamDecoder` → `PlaybackScheduler` pipeline. (This is the same decision
recorded as **Phase 21 OQ5 = NO**; the two phases share it.)
- **C2 — Preprocessing is additive; the WAV path is untouched.** The Opus artifact is a **second
derived artifact per track**, not a replacement. The existing WAV in the `tracks` vault stays
byte-for-byte as it is today; the lossless stream path is unchanged. A track with no Opus artifact
(legacy rows, or a transcode that hasn't run yet) must still play losslessly — Opus is strictly
additive.
- **C3 — Reuse the landed `Range`/offset seek path; do not fork it.** Phase 4's
`Range: bytes=X-` → `206` primitive (client `TrackMediaClient` → `DeepDrftPublic` proxy →
`DeepDrftAPI`) is the substrate for Opus seek too. Opus seek math differs from WAV (VBR /
container-paged, see §3.4) but it is expressed through the **same** `IFormatDecoder.calculateByteOffset`
seam the MP3/FLAC decoders already use — no second seek mechanism.
- **C4 — Opus slots the `IFormatDecoder` registry; no format branches leak elsewhere.** The new
`OpusFormatDecoder` is selected by `AudioPlayer.createFormatDecoder` on `Content-Type:
audio/ogg`/`audio/opus`. The rest of the player stack stays format-agnostic. No `if (opus)` outside
the decoder and the one selection point.
- **C5 — Format selection is a delivery-time decision, resolved server-side from a listener
signal.** The same `TrackEntity` / `EntryKey` addresses both artifacts; the *format* is a parameter
on the stream request (query param or `Accept` negotiation — see §3.3), not a different track id and
not a different vault entry key. One track, two renderings (the standing "one source, multiple
views" preference applied to delivery).
- **C6 — Transcode failure must not block ingest.** If the Opus transcode fails or is slow, the
track still persists with its lossless artifact and is playable. Opus is generated best-effort and
can be (re)generated later — mirror the **waveform-datum** model (`WaveformProfileService`: compute
on upload, regenerate on demand via a CMS action), which is exactly the "derived artifact, generated
at ingest, regenerable" pattern this needs.
- **C7 — The vault model holds: derived artifact is a new entry, not a mutation.** The Opus bytes
live in the FileDatabase under the track's `EntryKey` — either in the existing `tracks` vault under
a derived key, or in a new sibling vault (see §3.2 options). Either way it is `AudioBinary` with the
`.opus`/`.ogg` extension and correct MIME, registered like any other vault resource.
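C4's "one selection arm" can be pictured as a single extra case in a content-type dispatch. This is a sketch under assumptions, not the actual `AudioPlayer.createFormatDecoder`: `DecoderLike`, `baseMimeType`, and `selectDecoder` are hypothetical stand-ins, and the real `IFormatDecoder` carries more members than shown.

```typescript
// Hypothetical stand-ins; the real decoders implement the fuller IFormatDecoder contract.
interface DecoderLike { readonly name: string; }

const wav: DecoderLike = { name: "wav" };
const mp3: DecoderLike = { name: "mp3" };
const flac: DecoderLike = { name: "flac" };
const opus: DecoderLike = { name: "opus" }; // the one new arm

/** Drops parameters such as `; codecs=opus` and normalises case. */
function baseMimeType(contentType: string): string {
  return (contentType.split(";")[0] ?? "").trim().toLowerCase();
}

function selectDecoder(contentType: string): DecoderLike {
  switch (baseMimeType(contentType)) {
    case "audio/mpeg":
    case "audio/mp3":
      return mp3;
    case "audio/flac":
    case "audio/x-flac":
      return flac;
    case "audio/ogg":
    case "audio/opus":
      return opus;
    default:
      return wav; // WAV stays the default, as in the existing dispatch
  }
}
```

Nothing outside this function needs to know that Opus exists, which is the property C4 asks for.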
---
## 3. Architectural shape
### 3.0 The mental model
A track has one **source artifact** (the uploaded WAV/MP3/FLAC, stored as-is today) and gains one
**derived low-data artifact** (Ogg Opus fullband 320, produced at ingest). The stream endpoint serves
*either*, selected per request. The player picks a decoder by the response `Content-Type` exactly as
it does today. Seeking uses the same `Range` primitive; the byte↔time math is the decoder's job.
```
INGEST (DeepDrftContent + DeepDrftAPI)
upload → AudioProcessorRouter (existing) → store SOURCE artifact in vault [unchanged]
→ TRANSCODE to Opus 320 → store DERIVED artifact [NEW]
→ WaveformProfileService (existing, unchanged)
DELIVERY (DeepDrftAPI → DeepDrftPublic proxy → DeepDrftPublic.Client → Interop/audio)
GET api/track/{id}?format=opus|lossless → serve the chosen artifact's bytes (+ Range) [NEW param]
player: createFormatDecoder(Content-Type) → OpusFormatDecoder | Wav | Mp3 | Flac [+1 decoder]
```
### 3.1 Where the transcode lives (relative to existing processing)
The transcode is a **new processor sibling** to the existing format processors, invoked **after** the
source is stored, in the same orchestration that already calls `WaveformProfileService`:
- It belongs in `DeepDrftContent` (the binary-content domain library) as e.g. an
`OpusTranscodeService` / `OpusProcessor`, **not** in a host and **not** in a controller (per the
`*.Services`-owns-domain-logic convention).
- It is invoked from `UnifiedTrackService.UploadAsync` (the same place `WaveformProfileService`
computes the high-res datum on every new track) and from the **replace-audio** path (which already
regenerates both waveform datums — Opus is the third derived thing to regenerate there).
- Like the waveform datum, it gets a **regenerate trigger**: a CMS per-track / bulk action and an
ApiKey-gated endpoint, so existing tracks can be backfilled. This mirrors the landed
"Generate All Profiles / Backfill High-res" bulk actions on `Releases.razor` — **Backfill Opus**
is the natural third bulk action.
**The transcode engine itself is staff-engineer's call** (FFmpeg/libopus via a process invocation, a
managed binding, or a libopus P/Invoke). The spec fixes the *artifact* (Ogg Opus, fullband, 320 kbps)
and the *seam* (a derived artifact produced post-store, regenerable, failure-tolerant), not the tool.
Note a real operational constraint to flag for implementation: transcoding a 1 GB WAV is **CPU- and
time-expensive** and must not block the upload response — it wants the same off-the-hot-path treatment
the upload body staging already gets (`Upload:StagingPath`), likely a background/queued step. This is
the single biggest implementation risk and is called out as such.
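The "store first, derive later, never fail the upload" shape can be sketched independently of the engine choice. The real ingest code is .NET; this is a language-neutral illustration written in TypeScript, and every name in it (`DerivedArtifactQueue`, `ingest`, the `transcode` callback) is hypothetical.

```typescript
// Language-neutral sketch; all names are hypothetical, not from the codebase.
type DerivedJob = { entryKey: string; kind: "opus"; run: () => void };

class DerivedArtifactQueue {
  private readonly pending: DerivedJob[] = [];
  readonly failures: { entryKey: string; error: string }[] = [];

  enqueue(job: DerivedJob): void {
    this.pending.push(job);
  }

  /** Runs queued jobs; a failing job is recorded and never rethrown (C6). */
  drain(): number {
    let succeeded = 0;
    for (let job = this.pending.shift(); job !== undefined; job = this.pending.shift()) {
      try {
        job.run();
        succeeded++;
      } catch (e) {
        this.failures.push({ entryKey: job.entryKey, error: String(e) });
      }
    }
    return succeeded;
  }
}

/** Upload path: persist the source first, then defer the transcode. */
function ingest(
  entryKey: string,
  store: Set<string>,
  queue: DerivedArtifactQueue,
  transcode: (key: string) => void,
): void {
  store.add(entryKey); // the source artifact is durable at this point
  queue.enqueue({ entryKey, kind: "opus", run: () => transcode(entryKey) });
  // returns without waiting for the transcode to run
}
```

A recorded failure is exactly what a later "Backfill Opus" pass would retry; the track itself is already playable losslessly.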
### 3.2 Where the Opus artifact is stored (two options)
**Option S1 — derived key in the existing `tracks` vault (recommended).** Store the Opus bytes under
a derived entry key alongside the source, e.g. `{entryKey}` for source and `{entryKey}.opus` (or a
parallel key convention) in the same `tracks` vault. *Pro:* no new vault type, co-located with the
source, simplest lookup. *Con:* mixes two artifacts per logical track in one vault's index.
**Option S2 — a new sibling vault (e.g. `track-opus`).** Mirror the `track-waveforms` precedent
(Phase 12 added a dedicated vault for the derived high-res datum). Opus bytes keyed by the same
`EntryKey` in a `track-opus` vault. *Pro:* clean separation of source vs. derived, matches the
established "derived artifacts get their own vault" pattern (`track-waveforms`), easy to enumerate /
backfill / purge independently. *Con:* one more vault to register.
**Recommendation: S2** — it is the pattern the codebase already chose for the *other* derived
per-track artifact (the high-res waveform datum), so it is the least surprising and keeps the source
`tracks` vault meaning exactly one thing. **Final call is staff-engineer's**; both are viable.
### 3.3 How a listener's format choice reaches the bytes
The stream endpoint gains a **format selector**. Two candidate mechanisms:
- **D-a — explicit query param** `GET api/track/{id}?format=opus|lossless` (recommended). Mirrors the
existing `offset` query param the proxy already forwards (`TrackProxyController`). Explicit,
cache-friendly (distinct URLs), trivial to thread through the proxy, and the player already knows
which it asked for. Server resolves the param → the right artifact → sets the right `Content-Type`,
which the player's existing `createFormatDecoder` then dispatches on. **No new decoder-selection
mechanism** — the response content-type does the work it already does.
- **D-b — HTTP content negotiation** (`Accept: audio/ogg` vs `audio/wav`). More "correct" REST, but
the proxy + WASM client wiring is fussier and caches are content-type-varied. Not worth it here.
**Recommended: D-a.** The selection *policy* (which format a given listener gets by default, and how
they switch) is a genuine **product call — see OQ1/OQ2**, deliberately not decided here. The
*mechanism* (a query param resolved server-side to an artifact + content-type) is settled.
Server-side fallback rule (C2): if `format=opus` is requested but no Opus artifact exists for that
track (not yet transcoded / backfilled), the endpoint **falls back to lossless** rather than 404ing —
Opus is additive, so its absence degrades to "you get the lossless one," never to "no audio."
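The fallback rule reduces to a small lookup. A minimal sketch, assuming the S2 storage option (a separate `track-opus` vault) and modelling each vault as an in-memory `Map`; `resolveArtifact`, `Artifact`, and `Resolved` are hypothetical names, not the server's actual API:

```typescript
// Hypothetical types; the vaults are modelled as Maps keyed by EntryKey.
type Format = "opus" | "lossless";
interface Artifact { bytes: Uint8Array; contentType: string; }
interface Resolved extends Artifact { served: Format; }

function resolveArtifact(
  entryKey: string,
  requested: Format,
  tracks: Map<string, Artifact>,    // source artifacts ("tracks" vault)
  trackOpus: Map<string, Artifact>, // derived artifacts (assumed "track-opus" vault)
): Resolved | undefined {
  if (requested === "opus") {
    const opus = trackOpus.get(entryKey);
    if (opus) return { ...opus, served: "opus" };
    // No Opus artifact yet: degrade to lossless rather than 404.
  }
  const source = tracks.get(entryKey);
  return source ? { ...source, served: "lossless" } : undefined; // unknown track only
}
```

Because the response `Content-Type` comes from whichever artifact was actually served, the player picks the right decoder even when it asked for Opus and received WAV.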
### 3.4 The Opus decoder + seek math (the genuinely new decode work)
`OpusFormatDecoder implements IFormatDecoder` is the new code on the delivery side. Two things make it
harder than the WAV decoder and need to be flagged:
- **Containerized, paged format — not raw-frame-sliceable.** WAV's `wrapSegment` prepends a 44-byte
PCM header to any PCM-aligned byte run; the current model assumes you can wrap an arbitrary aligned
raw-audio slice and hand it to `decodeAudioData`. **Ogg Opus is page-structured** (Ogg pages
carrying Opus packets, plus mandatory `OpusHead`/`OpusTags` setup pages at the start). A mid-stream
byte slice is not independently decodable without the setup header and without landing on Ogg page
boundaries. So `OpusFormatDecoder`'s `getAlignedSegmentSize` must align to **Ogg page boundaries**
(scan for the `OggS` capture pattern — analogous to FLAC's frame-sync scan, for which the
`IFormatDecoder` interface already passes `rawData` to `getAlignedSegmentSize`), and
`wrapSegment`/the continuation path must carry the `OpusHead` setup (analogous to FLAC's
`streamInfoBytes` in `FlacSeekData`). **The `IFormatDecoder` abstraction already has the shape for
this** — a format-specific `seekData` accelerator and a setup-bytes carry — because FLAC needed the
same kind of thing. A new `OpusSeekData` variant joins `Mp3VbrSeekData | FlacSeekData`.
- **VBR byte↔time mapping is approximate (the Phase 21 C5 case, concretely).** Opus at "320 kbps" is
effectively VBR; there is no exact `byteRate` for offset math the way CBR WAV has. Seek-by-offset
uses an **approximate** mapping (granule-position/Ogg-page interpolation, the Opus analogue of MP3's
Xing TOC or FLAC's SEEKTABLE). `calculateByteOffset` returns a best-effort page-aligned offset; the
decoder then re-syncs to the next Ogg page. This is exactly the "VBR formats: the mapping is
approximate" case Phase 21's C5 invariant anticipated — **Opus is the format that makes that
invariant load-bearing rather than hypothetical.**
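Page alignment can be sketched from the Ogg page layout in RFC 3533: a 27-byte fixed header beginning with `OggS`, a segment count at byte 26, a segment table, then the body whose length is the sum of the table entries. The helper names below are hypothetical and this is not the `OpusFormatDecoder` itself. Two caveats: a bare `OggS` scan can in principle match bytes inside a packet body, and a robust decoder would also verify the page CRC.

```typescript
const OGG_HEADER_FIXED = 27; // bytes before the segment table (RFC 3533)

function isCapture(d: Uint8Array, i: number): boolean {
  // "OggS" in ASCII
  return d[i] === 0x4f && d[i + 1] === 0x67 && d[i + 2] === 0x67 && d[i + 3] === 0x53;
}

/** Index of the next "OggS" capture pattern at or after `from`, or -1. */
function findOggPage(d: Uint8Array, from = 0): number {
  for (let i = from; i + 4 <= d.length; i++) {
    if (isCapture(d, i)) return i;
  }
  return -1;
}

/** Total length of the page starting at `pos`, or -1 if it is not fully buffered. */
function oggPageLength(d: Uint8Array, pos: number): number {
  if (pos + OGG_HEADER_FIXED > d.length || !isCapture(d, pos)) return -1;
  const segments = d[pos + 26] ?? 0;
  const tableEnd = pos + OGG_HEADER_FIXED + segments;
  if (tableEnd > d.length) return -1;
  let body = 0;
  for (let i = pos + OGG_HEADER_FIXED; i < tableEnd; i++) body += d[i] ?? 0;
  const total = OGG_HEADER_FIXED + segments + body;
  return pos + total <= d.length ? total : -1;
}

/** Largest prefix of `d` ending on a page boundary; `d` is assumed to start on a page. */
function alignedOggPrefix(d: Uint8Array): number {
  let pos = 0;
  for (let len = oggPageLength(d, pos); len > 0; len = oggPageLength(d, pos)) pos += len;
  return pos;
}
```

After a seek, `findOggPage` is the re-sync step (skip to the next capture pattern), and `alignedOggPrefix` plays the role `getAlignedSegmentSize` would need: hand the decoder only whole pages and keep the tail for the next chunk.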
**Browser decode-support constraint (real, must be designed around).** The bespoke graph decodes
segments via `AudioContext.decodeAudioData`. Ogg-Opus support in `decodeAudioData` is long-standing in
Chrome and Firefox but arrived in **Safari only at 18.4 (macOS 15.4 / iOS 18.4, March 2025)**; older
Safari decodes Opus only in a CAF container, not Ogg. iOS Safari is a primary music-listening surface,
so this is not a corner case. Implications: (1) the **lossless WAV path is the universal fallback** for
listeners whose browser can't decode Ogg Opus — which C2's additive design already provides for free;
(2) format-default policy (OQ2) should consider capability detection — don't hand Ogg Opus to a Safari
that can't decode it. This intersects Phase 1.7 (Safari compatibility) and is flagged there too.
([Browser support: caniuse / WebKit 18.4 release notes — see Sources.])
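One way to express the capability gate is a pure function over a probe result. The sketch below assumes the probe is `HTMLMediaElement.canPlayType('audio/ogg; codecs="opus"')`, which returns `""`, `"maybe"`, or `"probably"`. That call reports media-element support and is only a proxy for what `decodeAudioData` accepts, so it is an assumption to verify on target browsers; a trial decode of a tiny Ogg Opus sample would be a stricter check. `effectiveFormat` and its types are hypothetical names.

```typescript
// Hypothetical helper; the probe value is obtained in the browser and passed in,
// which keeps this function pure and testable outside a DOM.
type DeliveryFormat = "opus" | "lossless";
type CanPlayTypeResult = "" | "maybe" | "probably";

function effectiveFormat(
  preferred: DeliveryFormat,
  oggOpusProbe: CanPlayTypeResult,
): DeliveryFormat {
  if (preferred === "lossless") return "lossless";
  // Conservative policy (an assumption): only a confident answer enables Opus.
  return oggOpusProbe === "probably" ? "opus" : "lossless";
}
```

Treating `"maybe"` as capable is the looser alternative; whichever way that goes, the lossless path remains the universal fallback.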
### 3.5 The three candidate directions (shape-level)
Per file convention the alternatives are recorded; the recommendation follows.
**Direction A — Derived Opus artifact at ingest + format param on delivery (recommended).** What §3.1
through §3.4 describe: transcode to Opus 320 post-store, store as a derived artifact (S2 vault), serve via a
`?format=` param resolved server-side to bytes + content-type, decode via a new `OpusFormatDecoder` in
the existing registry. *Why recommended:* additive (C2), reuses every existing seam (the processor
orchestration, the waveform-datum derived-artifact pattern, the `Range` path, the decoder registry),
and the only genuinely new code is one transcode step + one decoder. Two derived artifacts per track,
both regenerable.
**Direction B — On-the-fly transcode at delivery (no stored Opus artifact).** Transcode WAV→Opus per
request in the stream endpoint, streaming the Opus out as it encodes. *Why not (default):* moves
expensive CPU onto the **hot request path** (a 1 GB mix transcoded per play is untenable), breaks
`Range`/seek (you can't byte-offset into a stream you're encoding live), and defeats caching. It *is*
storage-cheaper (no second artifact on disk), so it is the fallback only if disk cost ever dominates —
but for a music site where the same tracks are played repeatedly, precompute-once wins decisively.
Rejected as the primary.
**Direction C — Replace WAV ingest with Opus-only (transcode and discard the lossless source).** Make
Opus *the* stored format; drop WAV. *Why not:* violates Daniel's explicit "lossless streaming
*optional* — two delivery formats, listener gets a choice." Lossless is a kept option, not a thing to
transcode away. Also irreversibly lossy at ingest (you can never recover the WAV). Rejected outright;
recorded only because "just store Opus" is the tempting simplification and the spec should say why not.
### 3.6 SOLID / road-not-taken rationale
- **OCP, via the existing seams.** The transcode is a new processor sibling (the router pattern is
already open for extension); the decoder is a new `IFormatDecoder` (the registry is already open for
extension); the artifact is a new derived vault resource (the `track-waveforms` precedent is exactly
this). Phase 18 adds **three new leaf implementations** and **zero changes to existing format code**
— the strongest possible OCP signal that the seams were designed right.
- **SRP, preserved.** Transcoding is a content-domain processor concern (`DeepDrftContent`); delivery
selection is a thin endpoint concern (`DeepDrftAPI` resolves a param to an artifact); decode is the
`OpusFormatDecoder`'s concern; byte↔time math stays inside that decoder via `calculateByteOffset`.
No responsibility crosses a boundary it doesn't already own.
- **DIP / "one source, multiple views."** One `TrackEntity`/`EntryKey` is the single source; "lossless
WAV" and "low-data Opus" are two *views* (renderings) of it, diverging only at the delivery/decode
layer — the same discipline the dark-mode and track-browse surfaces follow.
- **Road not taken — a separate `TrackEntity` row (or a new track id) per format.** Tempting (one row
= one streamable file) but it fractures the track identity: shares, queues, play-counts (Phase 16),
release membership, and waveform data all key on one track, and doubling rows to carry a format
would force every one of those surfaces to dedupe. Format is a *delivery attribute of one track*,
not a *second track*. Rejected — keep one identity, two artifacts.
---
## 4. Format selection — the product surface (deliberately under-specified; see OQ1/OQ2)
Daniel has **not** specified the selection UX. What is settled by his direction: there are two formats,
Opus is the bandwidth-friendly **default-candidate**, lossless is the kept option. What is open: how a
listener expresses the choice, whether it is remembered, and whether the default is global or adapts.
These are genuine product calls — see §6. The *mechanism* (a `?format=` param the player sends; §3.3)
supports any of the policies, so the policy can be decided after the substrate lands.
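
A minimal sketch of that separation, assuming hypothetical names (`FormatPolicy`, `PolicyInputs`, `formatQuery`) that do not exist in the codebase: the policy is a swappable function, and the mechanism only serializes whatever the policy returns into the `?format=` param.

```typescript
type StreamFormat = "opus" | "lossless";

// Hypothetical inputs a policy might consult; none of these are settled (OQ1 to OQ3).
interface PolicyInputs {
  storedChoice: StreamFormat | null; // a remembered listener choice, if any
  canDecodeOggOpus: boolean;         // result of some capability probe
}

type FormatPolicy = (inputs: PolicyInputs) => StreamFormat;

// Two interchangeable policies. The mechanism below does not care which is used.
const opusByDefault: FormatPolicy = (i) =>
  i.canDecodeOggOpus ? (i.storedChoice ?? "opus") : "lossless";
const alwaysLossless: FormatPolicy = () => "lossless";

// Mechanism: turn the chosen format into the query string the player sends.
function formatQuery(policy: FormatPolicy, inputs: PolicyInputs): string {
  return `?format=${encodeURIComponent(policy(inputs))}`;
}
```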
---
## 5. Use cases
- **UC1 — Listener streams the low-data Opus of a long mix (the headline win).** A ~1 GB lossless mix
transfers as ~220 MB of Opus; playback through the bespoke graph is identical in feel, far cheaper
on bandwidth. (Compounds with Phase 21 windowing for the memory side.)
- **UC2 — Listener prefers lossless and switches to it.** The same track served as WAV via
`?format=lossless`; the bespoke graph decodes it exactly as today.
- **UC3 — Legacy / not-yet-transcoded track.** `?format=opus` requested, no Opus artifact yet →
server falls back to lossless (C2); the listener still hears the track. A later Backfill-Opus pass
produces the artifact.
- **UC4 — Admin backfills Opus for the existing catalogue.** A bulk "Backfill Opus" CMS action (the
third sibling to the existing Generate-Profiles / Backfill-High-res actions) transcodes every track
lacking an Opus artifact.
- **UC5 — Replace-audio regenerates Opus.** The existing replace-audio path (which already regenerates
both waveform datums and re-derives duration) also regenerates the Opus artifact from the new
source.
- **UC6 — Seek within an Opus stream.** Backward/forward seek resolves via the existing `Range` path;
the offset is the `OpusFormatDecoder`'s approximate page-aligned mapping (§3.4), re-syncing to the
next Ogg page — the VBR analogue of the WAV exact-offset seek.
- **UC7 — Safari that can't decode Ogg Opus.** Capability-gated to the lossless path (§3.4), so the
listener still plays audio. (Ties to OQ2 + Phase 1.7.)
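
The fallback in UC3 can be sketched as a pure resolution function. This is TypeScript for illustration only (the real resolution is server-side in `DeepDrftAPI`), and every name here (`TrackArtifacts`, `resolveArtifact`, `fellBack`) is hypothetical. It also assumes that a request with no `format` param keeps today's lossless behaviour, which the notes leave open under OQ2.

```typescript
type StreamFormat = "opus" | "lossless";

interface Artifact {
  contentType: string;
  byteLength: number;
}

// One track identity, up to two artifacts. The lossless source always exists.
interface TrackArtifacts {
  lossless: Artifact;
  opus?: Artifact; // absent for legacy or not-yet-transcoded tracks
}

interface Resolved {
  served: StreamFormat;
  artifact: Artifact;
  fellBack: boolean; // true when Opus was requested but lossless was served
}

function resolveArtifact(
  requested: StreamFormat | undefined,
  track: TrackArtifacts,
): Resolved {
  // Assumption: a missing param means "lossless", so the existing path is unchanged.
  const wanted: StreamFormat = requested ?? "lossless";
  if (wanted === "opus" && track.opus) {
    return { served: "opus", artifact: track.opus, fellBack: false };
  }
  // Either lossless was requested, or Opus was requested but does not exist yet.
  return {
    served: "lossless",
    artifact: track.lossless,
    fellBack: wanted === "opus",
  };
}
```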
---
## 6. Open questions for Daniel (genuine product decisions, not implementation detail)
- **OQ1 — Selection UX: how does a listener choose lossless vs. low-data?** Candidates: a global
toggle in the player bar / settings ("Stream quality: Low-data / Lossless"); a per-track control; an
automatic default with a manual override. Recommend a **single global quality toggle** (player bar
or a settings affordance) — it is the Spotify/Bandcamp/SoundCloud idiom (one account/session-level
"streaming quality" setting), low-friction, and matches a small-sharp-tool posture better than
per-track choosers. `[Daniel decision]`
- **OQ2 — Default policy: what does a listener get before they choose?** Opus is the
*default-candidate* per Daniel — confirm Opus-by-default. Sub-questions: should the default be
**capability-aware** (don't serve Ogg Opus to a browser that can't decode it — §3.4 Safari < 18.4)?
Should it be **network-aware** (Opus on cellular, lossless on wifi)? Recommend **Opus by default,
capability-gated** (fall back to lossless when the browser can't decode Ogg Opus), and **defer
network-awareness** as gold-plating for v1. `[Daniel decision]`
- **OQ3 — Is the choice remembered, and at what scope?** Per-session (resets each visit) vs.
persisted (cookie/`localStorage`, like the `darkMode` cookie) vs. (future) per-account once identity
exists. Recommend **persisted via a cookie/`localStorage` setting**, mirroring the dark-mode
precedent — one truth, seeded at prerender, carried to WASM. `[Daniel decision]`
- **OQ4 — Per-upload Opus control in the CMS, or always-on?** Should the CMS upload form let an admin
opt a track *out* of Opus generation (e.g. a track meant to be lossless-only), or is Opus always
generated for every track? Recommend **always-on** (simpler; Opus is additive and cheap to serve;
the listener's format choice already covers "I want lossless"). A per-track opt-out is a later
refinement if a real need appears. `[Daniel decision]`
- **OQ5 — Opus container/extension specifics.** Ogg Opus (`.opus` / `audio/ogg`) is the assumption
(broadest `decodeAudioData` support; Daniel said "Ogg Opus"). Confirm — vs. CAF-wrapped Opus (older
Safari) or WebM-Opus. Recommend **Ogg Opus** as Daniel directed; CAF-fallback for old Safari is not
worth it given the lossless fallback already covers those browsers (§3.4). `[Daniel steer — confirms
§3.4, not a blocker]`
- **OQ6 — Transcode execution model (flag, leans implementation).** Synchronous-at-upload is a
non-starter for 1 GB mixes (§3.1); the realistic option is a background/queued transcode after the
source is stored. This is largely staff-engineer's call, but it has a **product-visible
consequence**: a freshly uploaded track may be lossless-only for a short window until its Opus
artifact finishes. Confirm that "Opus appears shortly after upload, lossless available immediately"
is acceptable (it is the waveform-datum model already in place). `[Daniel steer]`
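
If the OQ2 and OQ3 recommendations were adopted, the client-side decision might look like the sketch below. The cookie name `streamQuality`, the function names, and the cookie attributes are all hypothetical, and the capability flag is passed in rather than probed so the logic stays testable. One possible probe is `HTMLMediaElement.canPlayType('audio/ogg; codecs="opus"')`, though that reports media-element support and may not match `decodeAudioData` exactly.

```typescript
type StreamFormat = "opus" | "lossless";

const COOKIE_NAME = "streamQuality"; // hypothetical; not an existing cookie

// Reads a persisted choice from a Cookie header / document.cookie style string.
// Unknown or malformed values are treated as "no choice stored".
function readStoredQuality(cookieHeader: string): StreamFormat | null {
  for (const part of cookieHeader.split(";")) {
    const eq = part.indexOf("=");
    if (eq < 0) continue;
    const name = part.slice(0, eq).trim();
    const value = part.slice(eq + 1).trim();
    if (name === COOKIE_NAME && (value === "opus" || value === "lossless")) {
      return value;
    }
  }
  return null;
}

// OQ2: the capability gate wins. OQ3: otherwise honour the stored choice,
// falling back to Opus as the default-candidate.
function effectiveFormat(
  cookieHeader: string,
  canDecodeOggOpus: boolean,
): StreamFormat {
  if (!canDecodeOggOpus) return "lossless";
  return readStoredQuality(cookieHeader) ?? "opus";
}

// Serializes the choice for persistence; the attributes are illustrative.
function writeStoredQuality(format: StreamFormat): string {
  return `${COOKIE_NAME}=${format}; Path=/; Max-Age=31536000; SameSite=Lax`;
}
```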
---
## 7. Acceptance criteria
- **AC1 (headline) — Dual-format delivery works.** A track can be streamed as either lossless WAV or
Ogg Opus 320 from the same `EntryKey`, selected per request; both play correctly through the bespoke
Web Audio graph.
- **AC2 — Opus is the low-data win.** The Opus artifact of a representative track is materially smaller
than its lossless source (target ~1/4–1/5 the bytes); a long mix's Opus transfer is correspondingly
smaller.
- **AC3 — Additive, non-breaking (C2).** The existing lossless WAV path is byte-for-byte unchanged; a
track with no Opus artifact still plays losslessly; `?format=opus` on such a track falls back to
lossless (no 404, no silence).
- **AC4 — Transcode at ingest, regenerable (C6).** A new upload produces an Opus artifact best-effort
after the source is stored; a transcode failure does not block the upload or break playback; a
Backfill-Opus action (re)generates artifacts for existing tracks; replace-audio regenerates the
Opus artifact from the new source.
- **AC5 — Opus seek via the existing `Range` path (C3).** Forward and backward seek in an Opus stream
resolve through the landed `Range: bytes=X-` primitive, with the offset coming from
`OpusFormatDecoder.calculateByteOffset`; no new seek mechanism is introduced.
- **AC6 — No format branches leak (C4).** The only Opus-specific code is `OpusFormatDecoder`, its
`OpusSeekData`, the one `createFormatDecoder` selection arm, and the transcode processor + delivery
param resolution. The format-agnostic player/scheduler code is unchanged.
- **AC7 — Capability-safe default (OQ2).** A browser that cannot decode Ogg Opus is served (or falls
back to) the lossless path and plays audio; no listener gets silence because of codec support.
- **AC8 — Windowing-ready (the Phase 21 handshake).** The `OpusFormatDecoder`'s approximate byte↔time
mapping is the one Phase 21's windowed refill will call; Opus playback must be windowable by the
same machinery (verified jointly when Phase 21 lands on top — see §8 / Phase 21 cross-ref).
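
The AC2 size target can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes the lossless source is 16-bit stereo PCM at 44.1 kHz, which the notes do not state, and treats 320 kbps as a flat rate even though real Opus output is VBR. It also ignores WAV header and Ogg container overhead. Under those assumptions the ratio is 320,000 / 1,411,200 ≈ 0.227, which sits inside the 1/4 to 1/5 band.

```typescript
// Assumed source format: 16-bit stereo PCM at 44.1 kHz (not stated in the notes).
const PCM_BITS_PER_SECOND = 44_100 * 16 * 2; // 1,411,200 bit/s
// Nominal Opus target; actual VBR output will vary around this.
const OPUS_BITS_PER_SECOND = 320_000;

// Fraction of the lossless byte count that the Opus artifact should occupy.
function opusToPcmRatio(): number {
  return OPUS_BITS_PER_SECOND / PCM_BITS_PER_SECOND;
}

// Estimates Opus artifact size from the lossless size via the track duration.
function estimateOpusBytes(wavBytes: number): number {
  const seconds = (wavBytes * 8) / PCM_BITS_PER_SECOND;
  return Math.round((seconds * OPUS_BITS_PER_SECOND) / 8);
}
```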
---
## 8. Wave decomposition
Dependency shape: `18.1 → 18.2 → {18.3, 18.4}`, with `18.5` validating end-to-end. **18.1 (the
transcode/derived-artifact ingest) is the cold-start prerequisite** — until an Opus artifact exists,
nothing downstream has bytes to serve or decode. 18.3 (delivery param) and 18.4 (the decoder) are
largely parallel once 18.2 (storage/lookup) settles, but both need an artifact to test against.
- **18.1 — Ingest transcode: derive + store the Opus artifact (cold-start; load-bearing).** New
`OpusTranscodeService`/processor in `DeepDrftContent`, invoked post-store from
`UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`; produces Ogg Opus fullband
320; stores it as a derived artifact (S2 vault recommended). Failure-tolerant (C6) and off the hot
path (background/queued — OQ6). **Independent of the delivery/decoder waves; can begin immediately.**
- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention and the server-side
resolution "given `EntryKey` + format, return the right `AudioBinary` + content-type," including the
C2 fallback (no Opus → lossless). **Depends on 18.1** (an artifact must exist to resolve to).
- **18.3 — Delivery: format param + proxy threading.** `?format=opus|lossless` on the
`DeepDrftAPI` track stream endpoint (resolves via 18.2), forwarded through the `DeepDrftPublic`
`TrackProxyController` (mirror the existing `offset` param threading), and the `Range` handler
serving the chosen artifact's bytes. The player sends the param via `TrackMediaClient`. **Depends on
18.2.** Parallel-ok with 18.4.
- **18.4 — `OpusFormatDecoder` in the player stack.** New `IFormatDecoder` implementation
(Ogg-page-aligned `getAlignedSegmentSize` via `OggS` scan, `OpusHead` setup carry in
`wrapSegment`/continuation, approximate page-interpolation `calculateByteOffset` with an
`OpusSeekData` accelerator); one new arm in `AudioPlayer.createFormatDecoder` on
`audio/ogg`/`audio/opus`. Capability detection for the lossless fallback (§3.4, OQ2). **Depends on
18.2** (needs Opus bytes to decode). Parallel-ok with 18.3; they meet at 18.5.
- **18.5 — Backfill + selection UX + end-to-end validation.** The Backfill-Opus CMS bulk action (third
sibling to Generate-Profiles / Backfill-High-res) and replace-audio Opus regeneration; the listener
selection control per OQ1/OQ3 (global persisted quality toggle, recommended); and the AC1–AC8
acceptance pass — including AC8's confirmation that Opus is windowable so Phase 21 can build on it.
**Depends on 18.1–18.4.** (Selection UX can be split out if Daniel wants the substrate proven before
the control lands — flag at planning time.)
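
The two byte-level ideas in 18.4 can be sketched as follows. This is an illustration under stated assumptions, not the planned `OpusFormatDecoder`: the `SeekPoint` shape stands in for the unspecified `OpusSeekData`, and the function names are hypothetical. The scan looks only for the 4-byte `OggS` capture pattern. Those bytes can occur by chance inside packet data, so a robust scanner would also validate the page header (version byte and CRC). The offset mapping is a linear interpolation between known page positions, so its result is approximate and the decoder would still re-sync to the next page after seeking.

```typescript
// Hypothetical seek-index entry; the real OpusSeekData shape is not specified.
interface SeekPoint {
  seconds: number;    // playback time at the start of an Ogg page
  byteOffset: number; // byte position of that page's "OggS" capture pattern
}

// Returns the index of the last "OggS" capture pattern in the buffer, or -1.
function findLastPageStart(bytes: Uint8Array): number {
  for (let i = bytes.length - 4; i >= 0; i--) {
    if (
      bytes[i] === 0x4f && bytes[i + 1] === 0x67 &&
      bytes[i + 2] === 0x67 && bytes[i + 3] === 0x53
    ) {
      return i;
    }
  }
  return -1;
}

// Largest prefix that ends on a page boundary. Everything from the last page
// start onward may be an incomplete page, so it is held back for the next read.
function alignedSegmentSize(bytes: Uint8Array): number {
  const lastStart = findLastPageStart(bytes);
  return lastStart > 0 ? lastStart : 0;
}

// Approximate time-to-byte mapping by interpolating between sorted seek points.
function approximateByteOffset(points: SeekPoint[], seconds: number): number {
  const first = points[0];
  const last = points[points.length - 1];
  if (!first || !last) return 0;
  if (seconds <= first.seconds) return first.byteOffset;
  if (seconds >= last.seconds) return last.byteOffset;
  for (let i = 1; i < points.length; i++) {
    const a = points[i - 1];
    const b = points[i];
    if (!a || !b || seconds > b.seconds) continue;
    const span = b.seconds - a.seconds;
    const frac = span > 0 ? (seconds - a.seconds) / span : 0;
    return Math.floor(a.byteOffset + frac * (b.byteOffset - a.byteOffset));
  }
  return last.byteOffset;
}
```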
---
## 9. Cross-references (read before implementing)
- `CONTEXT.md §5` "Non-WAV formats" — the deferred intent this phase realizes (now concrete: derived
Opus low-data path, not generic format support).
- `PLAN.md` Phase 21 / `product-notes/phase-21-windowed-streaming-buffer.md` — **sequenced AFTER this
phase.** Phase 21's C5 invariant ("WAV-only shipping target; must not foreclose MP3/FLAC") is now
driven by Opus's VBR/paged seek math; Phase 21 OQ5 (adopt MSE) is resolved **NO** — the bespoke
graph stays (the same C1 decision recorded here). Windowing a VBR/Opus stream uses
`OpusFormatDecoder.calculateByteOffset`'s approximate mapping — exactly the C5 case.
- `PLAN.md` Phase 4 (landed) / `COMPLETED.md` — the HTTP `Range: bytes=X-` primitive Opus seek reuses.
- `PLAN.md` Phase 1.5 (gapless) / 1.6 (track-skip on error) / 1.7 (Safari) — 1.5's "encoder
padding/priming" caveat applies to Opus (it has pre-skip samples in `OpusHead`); 1.6's
byte-scan-to-next-frame is the Ogg-page-sync analogue; 1.7's Safari floor intersects §3.4's Ogg-Opus
`decodeAudioData` support (Safari < 18.4).
- `PLAN.md` Phase 12 / `product-notes/phase-12-waveform-visualizer-generalization.md` — the
`WaveformProfileService` derived-artifact-at-ingest + regenerate pattern this transcode mirrors
(compute on upload, regenerate via CMS action / endpoint, its own `track-waveforms` vault → the S2
precedent).
- `PLAN.md` Phase 9 — defines the `Mix` medium (single long track), the canonical low-data case.
- `PLAN.md` Phase 16 — play/share telemetry keys on one track identity; the §3.6 road-not-taken
(one-row-per-format) would have fractured this — kept to one identity, two artifacts.
- `DeepDrftContent/Processors/AudioProcessor.cs` + `AudioProcessorRouter` + `DeepDrftContent/CLAUDE.md`
— the existing format-router and the `WaveformProfileService` derived-artifact seam; 18.1 lives here.
- `DeepDrftPublic/Interop/audio/IFormatDecoder.ts` — the strategy interface `OpusFormatDecoder`
implements; `FlacFormatDecoder.ts` is the nearest prior art (setup-bytes carry + frame-sync scan).
- `DeepDrftPublic/Interop/audio/AudioPlayer.ts` (`createFormatDecoder`, lines 117–125) — the decoder
registry gaining the Opus arm.
- `DeepDrftPublic.Client/Clients/TrackMediaClient.cs` + `DeepDrftPublic/Controllers/TrackProxyController.cs`
— the media fetch + proxy that thread the new `?format=` param (mirroring `offset`).
## Sources
- Ogg Opus support in `decodeAudioData`: Chrome/Firefox long-standing; Safari added Ogg-Opus at 18.4
(macOS 15.4 / iOS 18.4, March 2025) — prior Safari decoded Opus only in CAF.
https://chromestatus.com/feature/5649634416394240 ;
https://www.testmuai.com/learning-hub/opus-audio-codec-browser-support/
@@ -8,6 +8,16 @@ server touch is **reuse, not new surface**: the existing `DeepDrftAPI` HTTP `Ran
partial-content primitive (Phase 4, landed) is the load-bearing dependency; this phase adds no new API
endpoint.
> **Sequencing dependency (Daniel, 2026-06-23): Phase 18 (Opus Low-Data Streaming) comes BEFORE this
> phase.** Format support — specifically the derived **Ogg Opus fullband 320** low-data delivery path
> (`product-notes/phase-18-opus-low-data-streaming.md`) — is a prerequisite that sequences ahead of
> windowing. Phase 21's windowing must work across **both** delivery formats (lossless WAV and Opus).
> Its C5 invariant below already anticipated this ("must not foreclose MP3/FLAC"); **Opus is now the
> concrete VBR/containerized driver of C5.** Windowing an Opus stream uses the decoder's *approximate*
> byte↔time mapping (`OpusFormatDecoder.calculateByteOffset` — Ogg-page interpolation), exactly the C5
> case — not the exact CBR-WAV `byteRate` math. Build the window machinery format-agnostically
> (§2 C3/C5) so it inherits Opus for free.
---
## 1. Goal
@@ -45,19 +55,25 @@ docs. This phase **modifies that seam** — so the contract it must preserve is
- **C2 — Playback start latency unchanged.** Today playback starts as soon as a configurable minimum
buffer count is queued (header-derived duration, not full-file). The window model must keep first-audio
latency at parity — bounding memory must not reintroduce a fetch-then-play stall.
- **C3 — The format-decoder abstraction is untouched.** `IFormatDecoder` (WAV active; MP3/FLAC
implemented, not yet wired) owns all format-specific byte math. Windowing lives in the
- **C3 — The format-decoder abstraction is untouched.** `IFormatDecoder` owns all format-specific
byte math; `AudioPlayer.createFormatDecoder` already dispatches on `Content-Type` (WAV/MP3/FLAC
decoders all wired today — verified 2026-06-23; an `OpusFormatDecoder` joins them in Phase 18).
Windowing lives in the
**format-agnostic** layer (`PlaybackScheduler` eviction + `StreamDecoder`/player refill
orchestration); it must add **no** format-specific branches. A future wired MP3/FLAC decoder inherits
windowing for free.
- **C4 — Read-only playback only.** This is a memory-management change, not a UX change. No new
user-visible control, no change to seek/transport semantics beyond what the listener already
experiences. Seek must still feel identical.
- **C5 — WAV-only is the shipping target; the design must not foreclose MP3/FLAC.** Byte↔time mapping
for refill is exact and cheap for WAV (CBR: `byteRate` from the header). For VBR formats the mapping is
approximate (the decoders already carry TOC/SEEKTABLE seek math). The window machinery must express
refill in terms of the decoder's existing `calculateByteOffset`, so the same code works when those
formats are wired — **no WAV-special-cased offset math in the window layer.**
- **C5 — Must window both delivery formats (WAV lossless AND Opus low-data).** Byte↔time mapping for
refill is exact and cheap for WAV (CBR: `byteRate` from the header). For VBR/containerized formats it
is approximate (the decoders carry TOC/SEEKTABLE/Ogg-page seek math). **Phase 18 (Opus) is sequenced
before this phase and is the concrete driver here:** an Ogg Opus 320 stream is VBR and page-paged, so
its `calculateByteOffset` is an *approximate* page-interpolation, not exact-offset. The window
machinery must express refill purely in terms of the decoder's existing `calculateByteOffset`, so the
same code windows WAV exactly and Opus approximately — **no WAV-special-cased offset math in the
window layer.** (MP3/FLAC decoders are already wired in the registry too — the registry dispatches on
content-type today; an `OpusFormatDecoder` joins them in Phase 18.)
- **C6 — No regression to the single-instance JS decoder concurrency guarantees.** The current code is
careful that only one streaming loop touches the single JS `StreamDecoder` at a time
(`DrainActiveStreamingTaskAsync`, the `_streamingCancellation` identity dance). Windowed refill
@@ -146,14 +162,15 @@ because the stack is a bespoke Web Audio graph, not `<media>` + MSE.
Stop hand-rolling the decode→schedule graph for long tracks; feed the Range stream into a `SourceBuffer`
and let the browser evict via its built-in quota + `remove()`. Memory management becomes the platform's
problem.
*Why not (now, but flag for Daniel):* MSE does not accept raw WAV/PCM — it wants containerized formats
(fragmented MP4/WebM, or MP3/AAC elementary streams). The current producer is WAV-only, and the entire
bespoke visualizer/spectrum graph is wired to the Web Audio `AudioContext`, not a `<media>` element.
Adopting MSE is a **rewrite of the playback substrate**, not a windowing change — out of scope for this
phase. But it is the *real* long-term answer and is entangled with Phase 1.2 (non-WAV formats): if
DeepDrft moves to a compressed delivery format, MSE becomes viable and could retire the hand-rolled
decoder, the seek-beyond-buffer path, *and* this phase's window machinery in one move. **Surfaced as
open question OQ5** — not to decide now, but so this phase is built knowing it may be superseded.
*Why not — RESOLVED, rejected (Daniel, 2026-06-23; see OQ5):* MSE does not accept raw WAV/PCM — it
wants containerized formats (fragmented MP4/WebM, or MP3/AAC elementary streams). The entire bespoke
visualizer/spectrum graph is wired to the Web Audio `AudioContext`, not a `<media>` element. Adopting
MSE is a **rewrite of the playback substrate**, not a windowing change. It *looked* like the real
long-term answer once compressed delivery arrived — but Daniel has decided compressed delivery
(**Phase 18 Opus**) will feed the **same bespoke graph** via the `IFormatDecoder` seam, so the
compressed-delivery move that would have justified MSE happens *without* surrendering the graph. **The
bespoke graph is a deliberate long-term commitment; MSE is rejected.** Direction A is therefore the
permanent destination, not a stopgap that MSE will retire. Recorded as considered-and-declined.
### 3.3 Recommended direction: A, with B held as the documented fallback
@@ -262,11 +279,17 @@ These are policy calls with user-visible or resource trade-offs — flagged rath
tracks that never needed it. Recommend **window everything** (one path, C6-safe, and short tracks
simply never hit a refill because they fit inside the forward window) — but Daniel may prefer a
size threshold. `[Daniel decision]`
- **OQ5 — Is MSE (Direction C) the real destination?** Not for this phase, but it bears on how much to
invest here. If DeepDrft will move to compressed delivery (Phase 1.2) and MSE within ~a year, Phase 21
should be the *minimal* Direction-A change (don't gold-plate machinery MSE would retire). If WAV +
bespoke graph is the long-term commitment, a more thorough windowing investment is justified.
`[Daniel steer — informs scope, not a blocker]`
- **OQ5 — Is MSE (Direction C) the real destination? — RESOLVED: NO (Daniel, 2026-06-23).** **Do not
adopt MSE. The bespoke Web Audio decode→schedule graph stays — it is bespoke by deliberate choice, a
long-term commitment, not a stopgap.** Daniel's rationale: the player is intentionally a custom
graph, not an HTML `<media>` element; the compressed-delivery move that *would* have made MSE
tempting is being met instead by **Phase 18 (Opus low-data path)** feeding the **same bespoke graph**
through the `IFormatDecoder` seam — so compressed delivery arrives *without* surrendering the graph.
Consequence for this phase: Direction A (the hand-rolled sliding window) is the destination, not a
placeholder; invest in it as permanent machinery. It will window both the WAV and the Opus path
(the sequencing note at the top). Direction C is recorded as **considered and declined** per file
convention; kept visible so a future reader sees the road not taken and why.
`[RESOLVED — bespoke graph retained; MSE rejected]`
---