From e3a4364b8cd43e7f506a4210204c6f887e856176 Mon Sep 17 00:00:00 2001 From: daniel-c-harvey Date: Tue, 23 Jun 2026 05:26:58 -0400 Subject: [PATCH] docs(plan): Phase 18 OQ resolutions + VBR-safe accurate Opus seek model --- PLAN.md | 129 ++-- .../phase-18-opus-low-data-streaming.md | 562 ++++++++++++++---- .../phase-21-windowed-streaming-buffer.md | 35 +- 3 files changed, 543 insertions(+), 183 deletions(-) diff --git a/PLAN.md b/PLAN.md index aab4855..7bd65b1 100644 --- a/PLAN.md +++ b/PLAN.md @@ -468,23 +468,44 @@ decoder-side `AudioPlayer.createFormatDecoder` is a **wired** strategy registry step that *derives* an Opus 320 artifact per track (nothing derives Opus today), and (2) a **per-format delivery selection** so one track serves as either WAV or Opus on request. -**Architectural spine — a derived artifact + a delivery param + one new decoder; three new leaf -implementations, zero changes to existing format code (the strong OCP signal).** Transcode is a new -processor sibling in `DeepDrftContent`, invoked post-store alongside `WaveformProfileService`, -**failure-tolerant and off the hot path** (background/queued — a 1 GB WAV transcode must not block the -upload response) — mirroring the landed waveform-datum pattern (derive at ingest, regenerate via a CMS -bulk action + ApiKey endpoint). The Opus bytes are a **derived artifact** stored like the high-res -waveform datum (recommend a dedicated `track-opus` vault, the `track-waveforms` precedent; final call -staff-engineer's). Delivery adds a **`?format=opus|lossless` param** (mirroring the existing `offset` -param threading through `TrackProxyController`) resolved server-side to the right artifact + content-type, -with a **lossless fallback** when no Opus artifact exists (additive, never 404/silence). 
The player gains -one `OpusFormatDecoder` (`IFormatDecoder`): Ogg-page-aligned segmenting (`OggS` scan — the FLAC -frame-sync analogue), `OpusHead` setup-bytes carry (the FLAC `streamInfoBytes` analogue), and an -**approximate** page-interpolation `calculateByteOffset` (Opus is VBR/paged — this is exactly the Phase -21 C5 case). **Browser constraint flagged:** Ogg-Opus `decodeAudioData` is Safari-18.4+ only (Chrome/FF -long-standing), so the Opus default must be **capability-gated** — fall back to the universal lossless +**Open questions RESOLVED (Daniel, 2026-06-23).** OQ1 selection UX → **global, via a new public-site +Settings menu** (not a bare app-bar control); OQ2 default → **Opus by default, capability-gated** (defer +network-awareness); OQ3 remembered → **persisted via the dark-mode seam** (cookie → prerender-read → +`PersistentComponentState` → client cookie service); OQ4 → **always-on Opus + Backfill-Opus**; OQ5 → +**Ogg Opus**; OQ6 transcode model → **background job after the file is available, with a visible +Post-Processing phase on the CMS upload meter.** One new tuning OQ (OQ7: seek-index granularity — recommend +~1–2 s buckets) is non-blocking. + +**Architectural spine — a derived artifact set + a delivery param + one new decoder + a precomputed +accurate seek index; leaf implementations only, zero changes to existing format code (the strong OCP +signal).** Transcode is a new processor sibling in `DeepDrftContent`, invoked post-store alongside +`WaveformProfileService` **as a background job** (a 1 GB WAV transcode must not block the upload; the source +is stored and the track plays lossless *first*, then Opus is derived) — mirroring the landed waveform-datum +pattern (derive at ingest, regenerate via a CMS bulk action + ApiKey endpoint). The Opus bytes are a +**derived artifact** stored like the high-res waveform datum (recommend a dedicated `track-opus` vault, the +`track-waveforms` precedent; final call staff-engineer's). 
Delivery adds a **`?format=opus|lossless` param** +(mirroring the existing `offset` param threading through `TrackProxyController`) resolved server-side to the +right artifact + content-type, with a **lossless fallback** when no Opus artifact exists (additive, never +404/silence). The player gains one `OpusFormatDecoder` (`IFormatDecoder`): Ogg-page-aligned segmenting +(`OggS` scan — the FLAC frame-sync analogue) and `OpusHead`/`OpusTags` setup-bytes carry (the FLAC +`streamInfoBytes` analogue). **Browser constraint flagged:** Ogg-Opus `decodeAudioData` is Safari-18.4+ only +(Chrome/FF long-standing), so the Opus default is **capability-gated** — fall back to the universal lossless path on browsers that can't decode it. +**VBR-safe ACCURATE seeking (Daniel, 2026-06-23 — supersedes the earlier "approximate" hand-wave).** Raw +byte-offset seek and rough page interpolation are inadequate for VBR Opus — there is no linear time↔byte +relationship. The fix is an **accurate transfer function built at transcode time** (the one moment the +whole encoded stream is walked): a precomputed **seek index** mapping Ogg-page `granulepos` (48 kHz sample +counts → time) → exact byte offset (recommend ~1–2 s buckets snapped to page starts; ~58 KB for a 1-hour +mix). The decode **setup header** (`OpusHead`/`OpusTags`, needed to decode any mid-stream slice) is made +available too. Recommended concrete design: **one sidecar artifact per track = `[setup header][seek +index]`, built at transcode, stored beside the Opus bytes, fetched once on track load**, parsed into +`OpusSeekData`. Client seek flow: `calculateByteOffset(t)` binary-searches the index for the exact page +offset → `Range: bytes=X-` fetch (landed Phase 4 primitive, unchanged) → prepend the cached setup header → +decode → fine re-sync to `t` within the bucket. 
**The listener lands at the correct time, not +approximately** (AC9), **without** the full PCM in memory — so it composes with Phase 21 windowed refill, +which calls the **same** index resolver. The earlier "approximate page-interpolation" language is rejected. + **Constraints/invariants:** keep the bespoke graph (no MSE); preprocessing is **additive** (WAV path untouched, byte-for-byte; a track with no Opus artifact still plays losslessly); reuse the landed `Range`/offset seek path; no format branches leak outside the new decoder + one selection arm + the @@ -492,38 +513,49 @@ transcode/delivery seam; transcode failure must not block ingest; format selecti decision resolving one `EntryKey` to one of two artifacts (one source, two views — **not** a second `TrackEntity` row, which would fracture share/queue/play-count/release identity). -Sequenced as five waves. `18.1 → 18.2 → {18.3, 18.4} → 18.5`. **18.1 (ingest transcode + derived -artifact) is the cold-start prerequisite** — nothing downstream has bytes to serve or decode until an -Opus artifact exists. +Sequenced as six waves. `18.1 → 18.2 → {18.3, 18.4} → 18.5`, with `18.6` (Settings menu) able to run in +parallel (it needs only 18.3's format mechanism before its toggle is live). **18.1 (ingest transcode + +seek-index + setup-header derived artifacts) is the cold-start prerequisite** — nothing downstream has +bytes to serve, decode, or seek against until those artifacts exist. -- **18.1 — Ingest transcode: derive + store the Opus artifact (cold-start; load-bearing).** New +- **18.1 — Ingest transcode + seek-index + setup-header (cold-start; load-bearing).** New `OpusTranscodeService`/processor in `DeepDrftContent`, invoked post-store from - `UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`; produces Ogg Opus fullband 320; - stores it as a derived artifact (recommend a `track-opus` vault). Failure-tolerant; off the hot path - (background/queued). 
**Independent of the delivery/decoder waves — can begin immediately.** -- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention + server-side "given - `EntryKey` + format, return the right `AudioBinary` + content-type," including the lossless fallback. - **Depends on 18.1.** -- **18.3 — Delivery: `?format=opus|lossless` param + proxy threading.** On the `DeepDrftAPI` stream - endpoint (resolves via 18.2), forwarded through `TrackProxyController` (mirror `offset`), `Range` - serving the chosen artifact; player sends it via `TrackMediaClient`. **Depends on 18.2; parallel-ok - with 18.4.** -- **18.4 — `OpusFormatDecoder` in the player stack.** New `IFormatDecoder` (Ogg-page segmenting, - `OpusHead` carry, approximate page-interpolation `calculateByteOffset` with an `OpusSeekData` - accelerator) + one arm in `createFormatDecoder` on `audio/ogg`/`audio/opus`; capability detection for - the lossless fallback. **Depends on 18.2; parallel-ok with 18.3.** -- **18.5 — Backfill + selection UX + end-to-end validation.** "Backfill Opus" CMS bulk action (third - sibling to Generate-Profiles / Backfill-High-res) + replace-audio Opus regeneration; the listener - selection control (recommend a global persisted quality toggle); the AC1–AC8 acceptance pass including - the Phase-21 handshake (Opus is windowable by the same machinery). **Depends on 18.1–18.4.** + `UnifiedTrackService.UploadAsync` alongside `WaveformProfileService` **as a background job** (OQ6); + produces Ogg Opus fullband 320; **walks the encoded stream once to build the granule→byte seek index and + extract the `OpusHead`/`OpusTags` setup header**; stores the Opus bytes **and** the combined seek/setup + **sidecar** as derived artifacts (recommend a `track-opus` vault). Failure-tolerant. 
**Independent of the + delivery/decoder waves — can begin immediately.** +- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention (Opus bytes + sidecar) + + server-side "given `EntryKey` + format, return the right `AudioBinary` + content-type (+ the sidecar)," + including the lossless fallback. **Depends on 18.1.** +- **18.3 — Delivery: `?format=opus|lossless` param + sidecar serving + proxy threading.** On the + `DeepDrftAPI` stream endpoint (resolves via 18.2), forwarded through `TrackProxyController` (mirror + `offset`), `Range` serving the chosen artifact; **plus serving the seek/setup sidecar**; player sends the + format param via `TrackMediaClient`. **Depends on 18.2; parallel-ok with 18.4.** +- **18.4 — `OpusFormatDecoder` + index-based seek resolver in the player stack.** New `IFormatDecoder` + (Ogg-page segmenting via `OggS` scan, `OpusHead`/`OpusTags` setup carry from the cached sidecar, + **`calculateByteOffset` that binary-searches the precomputed seek index** — NOT interpolation — with an + `OpusSeekData` accelerator holding the parsed index + setup bytes, and the one-time sidecar fetch+parse on + track load) + one arm in `createFormatDecoder` on `audio/ogg`/`audio/opus`; capability detection for the + lossless fallback. **Depends on 18.2; parallel-ok with 18.3.** +- **18.5 — Backfill + replace-audio + end-to-end validation (incl. seek accuracy).** "Backfill Opus" CMS + bulk action (third sibling to Generate-Profiles / Backfill-High-res), rebuilding Opus bytes + sidecar for + existing tracks; replace-audio Opus + sidecar regeneration; the AC1–AC10 acceptance pass **including AC9 + (an Opus seek lands at the correct time, not approximately)** and the Phase-21 handshake (Opus windowable + via the index resolver + sidecar setup header). 
**Depends on 18.1–18.4.** +- **18.6 — Public Settings menu + quality toggle (the listener selection UX).** New public-site + Settings-menu shell (app-bar trigger + MudBlazor menu + a settings-item abstraction + a + `PublicSiteSettings`/`ListenerSettings` object + the dark-mode-pattern persistence seam: `streamQuality` + cookie, a `DeepDrftPublic` prerender-read service, `PersistentComponentState` bridge, client cookie + service); the **quality toggle is its first occupant** (Low-data/Lossless, Opus default, capability-gated) + + the CMS upload meter's **Post-Processing phase** (OQ6). Built design-for-adaptability so dark mode can + plug in later without restructuring (not migrated now). **Depends on 18.3** for the toggle; the menu shell + can be built ahead. *Splittable* (shell, then toggle) if Daniel wants the shell proven first. -**Dependency shape:** `18.1 → 18.2 → {18.3 ∥ 18.4} → 18.5`; 18.1 is the only cold-start wave. -**Phase-level: 18 precedes Phase 21.** **Open questions for Daniel (spec §6):** selection UX (recommend a -single global quality toggle); default policy (recommend Opus-by-default, capability-gated; defer -network-awareness); whether the choice is remembered + scope (recommend persisted cookie/`localStorage`, -the dark-mode precedent); per-upload Opus opt-out vs. always-on (recommend always-on); Ogg-vs-CAF/WebM -container (recommend Ogg Opus as directed); transcode execution model (background/queued — a track is -lossless-only briefly until its Opus finishes; confirm acceptable). None block 18.1. +**Dependency shape:** `18.1 → 18.2 → {18.3 ∥ 18.4} → 18.5`; `18.6 ∥` (needs 18.3 for the live toggle); +18.1 is the only cold-start wave. **Phase-level: 18 precedes Phase 21** (windowed refill consumes the Phase +18 seek-index resolver). **OQ1–OQ6 RESOLVED (above); OQ7 (seek-index granularity, recommend ~1–2 s buckets) +is a non-blocking tuning steer.** None block 18.1. --- @@ -539,9 +571,12 @@ endpoint, no schema change. 
derived Ogg Opus 320 low-data path, Phase 18) is a prerequisite that comes first; windowing must work across **both** delivery formats. Phase 21's C5 invariant already anticipated this ("must not foreclose MP3/FLAC"); **Opus is now the concrete VBR/paged driver** — windowing an Opus stream uses the decoder's -*approximate* byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, Ogg-page interpolation), not -the exact CBR-WAV `byteRate` math. Build the window machinery format-agnostically so it inherits Opus -for free. +**accurate index-based** byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, a binary search in the +Phase 18 precomputed seek index — *not* the exact CBR-WAV `byteRate` math, and *not* approximate page +interpolation: VBR-safe and exact, per the Phase 18 seek-model resolution 2026-06-23). The windowed refill +controller calls the **same** index resolver an explicit seek does, and a window opening away from byte 0 +still decodes via the Phase 18 sidecar setup header. Build the window machinery format-agnostically so it +inherits Opus for free. The network path already streams in adaptive 16–64 KB chunks. The accumulation is on the **decode side**: `PlaybackScheduler` holds an `AudioBuffer[]` it **never evicts** ("Supports pause/resume/seek by diff --git a/product-notes/phase-18-opus-low-data-streaming.md b/product-notes/phase-18-opus-low-data-streaming.md index b882904..a54ec7b 100644 --- a/product-notes/phase-18-opus-low-data-streaming.md +++ b/product-notes/phase-18-opus-low-data-streaming.md @@ -1,8 +1,17 @@ # Phase 18 — Opus Low-Data Streaming (dual-format lossless + Opus delivery) -Product spec. Status: **design / framing — implementation-ready pending Daniel's open-question calls.** +Product spec. Status: **design / framing — open questions RESOLVED (Daniel, 2026-06-23); implementation-ready.** Author: product-designer. Date: 2026-06-23. 
**No code has been written by this doc.** +> **Resolution pass (Daniel, 2026-06-23).** OQ1–OQ6 are resolved (see §6 — each marked RESOLVED, kept +> visible per file convention). Two resolutions reshaped the spec materially: (a) the listener quality +> selection lives inside a **new public-site Settings menu surface** (not a bare app-bar control) — §4 + +> §4a; and (b) Daniel rejected the "approximate page-interpolation" seek hand-wave outright — **VBR-safe +> *accurate* seeking is now a first-class part of the architecture** (a precomputed seek-index artifact + +> a separately-available setup header). §3.4 is rewritten and a dedicated seek-model section (§3.4a) +> added. The Phase 21 cross-reference is updated to read "accurate index-based mapping," not +> "approximate." + This phase is the concrete realization of the long-deferred **"Non-WAV formats"** intent (`CONTEXT.md §5`, the "1.2" the streaming-feature items reference). It supersedes the abstract "a processor per format + a decoder strategy" framing with a specific, Daniel-directed product: **two @@ -13,12 +22,17 @@ Surfaces (named precisely): - **Ingest / preprocessing:** `DeepDrftContent` (`AudioProcessor` / `AudioProcessorRouter` / `TrackContentService` / `WaveformProfileService`) + `DeepDrftAPI` (upload/persist — - `UnifiedTrackService.UploadAsync`, replace-audio) + `DeepDrftManager` (CMS upload form, only if a - per-upload control is wanted — see OQ4). -- **Delivery / decode:** `DeepDrftAPI` (the track stream endpoint + `Range` handler) + - `DeepDrftPublic` proxy (`TrackProxyController`) + `DeepDrftPublic.Client` player stack - (`StreamingAudioPlayerService`, `TrackMediaClient`) + `DeepDrftPublic/Interop/audio` TS decoders - (`AudioPlayer.createFormatDecoder` registry, a new `OpusFormatDecoder`). + `UnifiedTrackService.UploadAsync`, replace-audio) + `DeepDrftManager` (CMS upload form — the + **Post-Processing phase** on the existing upload progress meter, §3.1a). 
+- **Delivery / decode:** `DeepDrftAPI` (the track stream endpoint + `Range` handler + the new + **seek-index** and **setup-header** sidecar endpoints, §3.4a) + `DeepDrftPublic` proxy + (`TrackProxyController`) + `DeepDrftPublic.Client` player stack (`StreamingAudioPlayerService`, + `TrackMediaClient`) + `DeepDrftPublic/Interop/audio` TS decoders (`AudioPlayer.createFormatDecoder` + registry, a new `OpusFormatDecoder`). +- **Listener settings (NEW surface):** `DeepDrftPublic.Client` — a public-site **Settings menu** (app-bar + menu/popover) hosting the quality toggle as its first occupant, with a dark-mode-pattern persistence + seam (cookie → settings object → `PersistentComponentState` → client cookie service). §4a. The + prerender-cookie read lives in `DeepDrftPublic` (alongside `DarkModeService`). **Sequencing headline: Phase 18 comes BEFORE Phase 21 (Windowed Streaming Buffer).** Phase 21's windowing must work across both formats — its C5 invariant already anticipated this ("must not @@ -165,8 +179,44 @@ managed binding, or a libopus P/Invoke). The spec fixes the *artifact* (Ogg Opus and the *seam* (a derived artifact produced post-store, regenerable, failure-tolerant), not the tool. Note a real operational constraint to flag for implementation: transcoding a 1 GB WAV is **CPU- and time-expensive** and must not block the upload response — it wants the same off-the-hot-path treatment -the upload body staging already gets (`Upload:StagingPath`), likely a background/queued step. This is -the single biggest implementation risk and is called out as such. +the upload body staging already gets (`Upload:StagingPath`). This is the single biggest implementation +risk and is called out as such. The execution model is now **decided** (OQ6): **the source is stored and +the track is playable (lossless) first, then the Opus transcode runs as a background job** — see §3.1a +for the user-visible consequence on the upload UI. 
+ +### 3.1a Transcode execution model + the Post-Processing upload phase (RESOLVED — OQ6) + +**Execution model (Daniel, 2026-06-23): background process *after* the file is available.** The upload +flow is now two distinct server-side stages with a hard ordering: + +1. **Transfer + store + persist (existing, synchronous).** The WAV body streams in (the landed + `ProgressStreamContent` two-phase cancellation), the source is stored in the vault, the `TrackEntity` + is persisted, the waveform datums are computed. At the end of this stage **the track is fully playable + losslessly** — nothing about Opus gates a successful upload. +2. **Opus transcode (NEW, background, after stage 1 completes).** A queued/background job reads the + stored source, transcodes to Ogg Opus 320, builds the **seek index** and extracts the **setup header** + (§3.4a), and stores all three derived artifacts. Until it finishes, `?format=opus` for that track + falls back to lossless (C2). On failure the track stays lossless-only and is eligible for Backfill-Opus + (C6). + +**The upload progress meter gains a visible Post-Processing phase.** The CMS upload forms +(`BatchUpload.razor` / `BatchEdit.razor`) already render a progress meter driven by `ProgressStreamContent` +(byte-transfer progress) and the two-phase cancellation (idle window during transfer, response-wait budget +after the body completes). The transcode is a **third visible phase** appended to that meter — after the +existing "uploading bytes" and "server is persisting" phases, a **Post-Processing** phase reflects the +background transcode's status (queued → transcoding → done / failed). This is an *addition* to the +existing meter, not a new UI. + +- The admin sees: bytes transfer → server persists (track now exists + plays lossless) → **Post-Processing** + (Opus being derived). 
The form may complete/return the admin to the catalogue after stage 1 (the track + is live); the Post-Processing phase can continue to report against that track in the browse/release view + (the Opus waveform/profile columns on `Releases.razor` already poll-and-show per-track derived-artifact + status — Post-Processing status fits the same affordance family). +- **How status reaches the UI is staff-engineer's call** (poll the track's Opus-artifact presence, an SSE/ + long-poll job channel, or a status field on the track read). The spec fixes that the phase is *visible* + and *non-blocking* — the admin is never made to wait on the transcode to consider the upload done. +- This composes with the **always-on** decision (OQ4): every upload triggers the background transcode; + there is no per-upload opt-out, so the Post-Processing phase always appears. ### 3.2 Where the Opus artifact is stored (two options) @@ -206,30 +256,36 @@ Server-side fallback rule (C2): if `format=opus` is requested but no Opus artifa track (not yet transcoded / backfilled), the endpoint **falls back to lossless** rather than 404ing — Opus is additive, so its absence degrades to "you get the lossless one," never to "no audio." -### 3.4 The Opus decoder + seek math (the genuinely new decode work) +### 3.4 The Opus decoder (the genuinely new decode work) -`OpusFormatDecoder implements IFormatDecoder` is the new code on the delivery side. Two things make it -harder than the WAV decoder and need to be flagged: +`OpusFormatDecoder implements IFormatDecoder` is the new code on the delivery side. **Ogg Opus is a +containerized, paged format — not raw-frame-sliceable** the way WAV PCM is. WAV's `wrapSegment` prepends a +44-byte PCM header to any PCM-aligned byte run; the current model assumes you can wrap an arbitrary aligned +raw-audio slice and hand it to `decodeAudioData`. 
Ogg Opus is page-structured (Ogg pages carrying Opus +packets, plus mandatory `OpusHead`/`OpusTags` **setup pages** at the very start). A mid-stream byte slice +is **not** independently decodable: it needs (1) the setup header prepended, and (2) to begin on an Ogg +**page boundary**. So: -- **Containerized, paged format — not raw-frame-sliceable.** WAV's `wrapSegment` prepends a 44-byte - PCM header to any PCM-aligned byte run; the current model assumes you can wrap an arbitrary aligned - raw-audio slice and hand it to `decodeAudioData`. **Ogg Opus is page-structured** (Ogg pages - carrying Opus packets, plus mandatory `OpusHead`/`OpusTags` setup pages at the start). A mid-stream - byte slice is not independently decodable without the setup header and without landing on Ogg page - boundaries. So `OpusFormatDecoder`'s `getAlignedSegmentSize` must align to **Ogg page boundaries** - (scan for the `OggS` capture pattern — analogous to FLAC's frame-sync scan, for which the - `IFormatDecoder` interface already passes `rawData` to `getAlignedSegmentSize`), and - `wrapSegment`/the continuation path must carry the `OpusHead` setup (analogous to FLAC's - `streamInfoBytes` in `FlacSeekData`). **The `IFormatDecoder` abstraction already has the shape for - this** — a format-specific `seekData` accelerator and a setup-bytes carry — because FLAC needed the - same kind of thing. A new `OpusSeekData` variant joins `Mp3VbrSeekData | FlacSeekData`. -- **VBR byte↔time mapping is approximate (the Phase 21 C5 case, concretely).** Opus at "320 kbps" is - effectively VBR; there is no exact `byteRate` for offset math the way CBR WAV has. Seek-by-offset - uses an **approximate** mapping (granule-position/Ogg-page interpolation, the Opus analogue of MP3's - Xing TOC or FLAC's SEEKTABLE). `calculateByteOffset` returns a best-effort page-aligned offset; the - decoder then re-syncs to the next Ogg page. 
This is exactly the "VBR formats: the mapping is - approximate" case Phase 21's C5 invariant anticipated — **Opus is the format that makes that - invariant load-bearing rather than hypothetical.** +- `OpusFormatDecoder.getAlignedSegmentSize` aligns to **Ogg page boundaries** — scan for the `OggS` + capture pattern (analogous to FLAC's frame-sync scan; the `IFormatDecoder` interface already passes + `rawData` to `getAlignedSegmentSize` for exactly this reason). +- `wrapSegment` / the continuation path **prepends the `OpusHead`/`OpusTags` setup bytes** to a mid-stream + page run before handing it to `decodeAudioData` (analogous to FLAC's `streamInfoBytes` carry in + `FlacSeekData`). The setup bytes come from the **setup-header mechanism** (§3.4a), not from re-reading + the stream start. +- A new `OpusSeekData` variant joins `Mp3VbrSeekData | FlacSeekData` in the `seekData` accelerator slot — + but for Opus it carries the **accurate seek index** (§3.4a), not a heuristic TOC. + +**The `IFormatDecoder` abstraction already has the shape for both needs** — a format-specific `seekData` +accelerator and a setup-bytes carry — because FLAC needed the same kind of thing. The genuinely new part +is **where the seek index and setup header come from**, which §3.4a designs. + +> **Seek is NOT approximate for Opus (Daniel, 2026-06-23 — supersedes the earlier hand-wave).** An earlier +> draft of this section proposed "granule-position/Ogg-page interpolation" — a best-effort approximate +> offset, the Opus analogue of MP3's Xing TOC. **That is rejected.** Daniel: *"Killing seeking for +> decoding is unacceptable… Raw bytes offset for seeking is no longer adequate due to VBR. We need an +> accurate transfer function for seek time → true file byte offset."* Opus seeking is **accurate**, backed +> by a precomputed index built at transcode time. See §3.4a. 
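The page-boundary alignment described above can be sketched as follows. This is a minimal illustration, not the project's `getAlignedSegmentSize`: the helper names (`isCapturePattern`, `pageLengthAt`, `alignedPageRun`) are hypothetical. It relies only on the standard Ogg page layout: a 27-byte fixed header beginning with `OggS`, a segment count at byte 26, then the segment table whose entries sum to the body length. Rather than trusting a bare `OggS` scan (the four bytes can occur by chance inside a payload), it finds the first capture pattern and then walks whole pages by their declared length.

```typescript
// Hypothetical sketch: find the run of complete Ogg pages inside a raw buffer.
const OGG_FIXED_HEADER = 27; // bytes before the segment table

/** True if the 4-byte capture pattern "OggS" starts at index i. */
function isCapturePattern(buf: Uint8Array, i: number): boolean {
  return (
    i + 4 <= buf.length &&
    buf[i] === 0x4f && buf[i + 1] === 0x67 &&
    buf[i + 2] === 0x67 && buf[i + 3] === 0x53
  );
}

/** Total length of the page starting at i, or -1 if it is not fully buffered. */
function pageLengthAt(buf: Uint8Array, i: number): number {
  if (i + OGG_FIXED_HEADER > buf.length) return -1;
  const segmentCount = buf[i + 26] ?? 0;
  const tableEnd = i + OGG_FIXED_HEADER + segmentCount;
  if (tableEnd > buf.length) return -1;
  let bodyLength = 0;
  for (let s = i + OGG_FIXED_HEADER; s < tableEnd; s++) {
    bodyLength += buf[s] ?? 0; // each table entry is one lacing value
  }
  const total = OGG_FIXED_HEADER + segmentCount + bodyLength;
  return i + total <= buf.length ? total : -1;
}

/**
 * Returns [start, end) covering only whole pages: `start` is the first
 * capture pattern, `end` is just past the last fully buffered page.
 */
function alignedPageRun(buf: Uint8Array): { start: number; end: number } {
  let start = 0;
  while (start < buf.length && !isCapturePattern(buf, start)) start++;
  let end = start;
  while (isCapturePattern(buf, end)) {
    const length = pageLengthAt(buf, end);
    if (length < 0) break; // trailing partial page: leave it for the next chunk
    end += length;
  }
  return { start, end };
}
```

Bytes past `end` would stay buffered until the next network chunk completes the page. The setup-bytes prepend is a separate step and is not shown here.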
**Browser decode-support constraint (real, must be designed around).** The bespoke graph decodes segments via `AudioContext.decodeAudioData`. Ogg-Opus support in `decodeAudioData` is long-standing in @@ -237,21 +293,145 @@ Chrome and Firefox but arrived in **Safari only at 18.4 (macOS 15.4 / iOS 18.4, Safari decodes Opus only in a CAF container, not Ogg. iOS Safari is a primary music-listening surface, so this is not a corner case. Implications: (1) the **lossless WAV path is the universal fallback** for listeners whose browser can't decode Ogg Opus — which C2's additive design already provides for free; -(2) format-default policy (OQ2) should consider capability detection — don't hand Ogg Opus to a Safari -that can't decode it. This intersects Phase 1.7 (Safari compatibility) and is flagged there too. +(2) the format default is **capability-gated** (OQ2, RESOLVED) — don't hand Ogg Opus to a Safari that +can't decode it; detect support (a probe `decodeAudioData` on a tiny Opus blob, or a UA/version gate) and +fall back to lossless. This intersects Phase 1.7 (Safari compatibility) and is flagged there too. ([Browser support: caniuse / WebKit 18.4 release notes — see Sources.]) +### 3.4a VBR-safe accurate seeking (the seek-index artifact + the setup-header mechanism) + +This is the architectural core of the Opus delivery path, and it must compose with **Phase 21 windowed +refill** (where most of the stream is *not* in memory). The requirement, decomposed from Daniel's +direction: + +1. Seeking must be preserved for Opus **without** having the full PCM decoded in memory. +2. Raw byte-offset seek is inadequate — a VBR Opus stream has **no linear time↔byte relationship**, so + `byteRate` math and even rough page interpolation are not accurate enough. +3. We need an **accurate transfer function: seek-time → true file byte offset.** +4. 
The decode setup header must be **available separately** (or cached before seeking past it), because a + mid-stream slice is undecodable without `OpusHead`/`OpusTags`. + +**The key insight: the one moment we already walk the entire encoded stream is the transcode.** That is +precisely when an accurate index can be built for free. We never have to guess at delivery time — we read +the answer out of a precomputed artifact. + +#### A. The seek-index artifact (the accurate transfer function) + +At transcode time, after the Opus bytes are produced, **walk the encoded Ogg stream once and record, for +each Ogg page (or coarser bucket), the page's `granulepos` (a 48 kHz sample count → time) paired with its +**byte offset** in the file.** That granule→byte table *is* the exact transfer function. This is the Opus +analogue of FLAC's `SEEKTABLE` / MP3's Xing TOC — but **precomputed and exact**, not derived by +interpolation guessing. Ogg granule positions are authoritative sample counts, so the mapping is true, not +estimated. + +- **What it contains.** An ordered list of `(timeSeconds | granulepos, byteOffset)` entries, plus the + total duration and total byte length (for clamping a seek to range). A binary little-endian array of + fixed-width records is the natural shape (e.g. a `uint64 granulepos` + `uint64 byteOffset` per entry); + the exact encoding is staff-engineer's, but it should be a **compact binary blob**, fetched once and + parsed into a typed array client-side. +- **Granularity vs. size (the one real tuning knob).** One entry per Ogg page is the most precise but + largest; an Ogg page is typically a few KB of audio (~tens of ms to a few hundred ms), so a 1-hour mix + could be tens of thousands of pages. Recommend a **coarser bucket: one index entry per ~1–2 seconds of + audio** (snap each bucket boundary to the *nearest enclosing page start*, so every indexed offset is + still an exact page boundary). 
At ~1 s granularity a 1-hour mix is ~3,600 entries × 16 bytes ≈ **~58 KB** + — a trivial one-time fetch, and 1 s seek resolution is more than fine (the decoder re-syncs to the exact + page within the bucket anyway — see the client flow). **Per-page precision is the fallback if 1 s buckets + ever prove too coarse**, at a larger index. The number is staff-engineer's call; the *shape* (precomputed + exact granule→byte, bucketed, snapped to page starts) is fixed. +- **Sidecar, not embedded (recommended).** Store the index as a **third derived artifact** alongside the + Opus bytes and the waveform datum — the same "derived artifacts get their own vault" pattern this phase + already uses (S2 / `track-opus`; the `track-waveforms` precedent). Keep it a separate vault resource + (e.g. `{entryKey}.seekidx` in a `track-opus` vault, or its own `track-opus-index` vault) rather than + embedding it in the Ogg stream. *Why sidecar:* it is fetched **once, up front** (small, cacheable), + independent of the audio byte stream; embedding it in the Ogg would force the client to read into the + stream to find it, defeating the "resolve the offset *before* the Range fetch" flow. *Road not taken — + derive the index lazily on first seek by scanning server-side:* rejected, because it re-walks the stream + at request time (the cost we avoid by computing at transcode) and gives nothing the precomputed sidecar + doesn't. + +#### B. The setup-header mechanism (decodability of any mid-stream slice) + +Any post-seek slice needs `OpusHead` + `OpusTags` prepended to decode. Two ways to make those bytes +available to the client: + +- **B-a — Client-side caching of the leading setup pages on first read (recommended).** On first play, the + stream already begins at byte 0, so the client *already receives* the `OpusHead`/`OpusTags` pages as the + opening bytes. 
`OpusFormatDecoder.tryParseHeader` captures and **retains** those setup bytes (exactly as + `WavFormatDecoder` retains the parsed WAV header for `reinitializeForRangeContinuation` today, and FLAC + retains `streamInfoBytes`). Every subsequent post-seek continuation prepends the cached setup bytes. *No + new endpoint;* it reuses the header-retention discipline already in the codebase. +- **B-b — A dedicated setup-header sidecar endpoint** (`GET api/track/{id}/opus/header` → just the + `OpusHead`/`OpusTags` bytes, also derivable at transcode time and stored as a tiny artifact). *Pro:* a + seek can be served even if the listener seeks **before** the stream start has been read (e.g. a deep-link + that begins mid-track, or a Phase 21 window that opens away from byte 0). *Con:* one more endpoint + + artifact. + +**Recommendation: B-a as the primary, B-b as a cheap insurance artifact.** B-a covers the overwhelming +common case (play-then-seek) with **zero new surface** — it is the WAV-header-retention pattern applied to +Opus. But Phase 21 windowing and deep-links can legitimately open a window that never read byte 0, so the +setup header should **also** be derivable on demand. Cheapest reconciliation: **extract the setup bytes at +transcode time and store them as a tiny sidecar artifact** (they are a few hundred bytes), and expose them +**either** as a small endpoint **or** simply prepend them to the seek-index sidecar's header region so the +single up-front index fetch *also* delivers the setup bytes. The latter folds B-b into the B-a fetch: **the +client's one up-front sidecar fetch returns both the seek index and the setup header**, so it always has +both before it ever issues a seek — and never needs byte 0 to have been read. **Recommended concrete +design: one sidecar per track = `[setup-header bytes][seek-index table]`, fetched once on track load, +parsed into `OpusSeekData`.** This is the cleanest: one new artifact, one new fetch, both needs met. + +#### C. 
The client-side seek flow, end to end + +With the sidecar (`OpusSeekData` = setup header + granule→byte index) fetched and parsed at track load: + +1. **Resolve time → byte offset (accurate).** Listener seeks to `t` seconds. `OpusFormatDecoder.calculateByteOffset(t)` + does a binary search in the index for the largest entry with `time ≤ t`, returns its exact (page-start) + `byteOffset`. **No interpolation, no `byteRate` math.** (For WAV this method stays the exact CBR + calculation it is today — the seam is identical; only the Opus implementation reads an index.) +2. **Range fetch from the offset.** Issue `GET api/track/{id}?format=opus` with `Range: bytes={byteOffset}-` + — the **landed Phase 4 Range primitive, unchanged**. Server streams raw Opus bytes from that exact page + boundary (`206 Partial Content`). +3. **Prepend the cached setup header + decode.** The continuation path (the Opus analogue of + `StreamDecoder.reinitializeForRangeContinuation`) prepends the retained/sidecar `OpusHead`/`OpusTags` + bytes to the incoming page run, then feeds it to `decodeAudioData`. Because the index offset is an exact + page start, the stream is immediately Ogg-sync-aligned. +4. **Fine re-sync within the bucket.** The granule of the first decoded page tells the decoder the *exact* + time it landed at (at most one bucket interval before `t`, because the lookup returns the largest entry with `time ≤ t`); the scheduler trims/positions to land + playback at `t` precisely. With ~1 s buckets the trim is sub-second; with per-page granularity it is + near-zero. **Either way the listener lands at the correct time, not approximately** (AC9). + +#### D. Composition with Phase 21 windowed refill + +Phase 21's windowed refill controller resolves "I need bytes for playback position `P`" → a byte offset → +a Range fetch. **It calls the *same* `OpusFormatDecoder.calculateByteOffset` (the index-based resolver) +for Opus** that an explicit seek does — windowed refill is just a seek the listener didn't initiate. 
So the +seek index serves both: explicit seeks and the window's low-water refills both resolve through the index, +and both prepend the cached setup header. This is why §3.4a is in **Phase 18** (where the transcode that +builds the index lives), and Phase 21 *consumes* it. The Phase 21 spec's "approximate mapping" language for +Opus is now wrong and is corrected to **"accurate index-based mapping."** + +#### E. Reuse vs. extend (the seam discipline) + +- **Reused verbatim:** the Phase 4 `Range: bytes=X-` → 206 primitive (client → proxy → API); the + `IFormatDecoder.calculateByteOffset` seam; the header-retention/continuation discipline + (`reinitializeForRangeContinuation`'s Opus analogue); the derived-artifact-in-its-own-vault pattern + (`track-waveforms` → `track-opus`); the derive-at-transcode-regenerate-on-backfill lifecycle. +- **Extended (new):** the seek-index + setup-header **sidecar artifact** (built at transcode, stored + beside the Opus bytes); the one-time **sidecar fetch** on track load (parsed into `OpusSeekData`); the + index **binary-search resolver** inside `OpusFormatDecoder`. Three additions, all leaf-level — no change + to the Range mechanism, the proxy, or the format-agnostic player. + ### 3.5 The three candidate directions (shape-level) Per file convention the alternatives are recorded; the recommendation follows. **Direction A — Derived Opus artifact at ingest + format param on delivery (recommended).** What §3.1 -–3.4 describe: transcode to Opus 320 post-store, store as a derived artifact (S2 vault), serve via a -`?format=` param resolved server-side to bytes + content-type, decode via a new `OpusFormatDecoder` in -the existing registry. *Why recommended:* additive (C2), reuses every existing seam (the processor -orchestration, the waveform-datum derived-artifact pattern, the `Range` path, the decoder registry), -and the only genuinely new code is one transcode step + one decoder. Two derived artifacts per track, -both regenerable. 
+–3.4a describe: transcode to Opus 320 post-store as a **background job** (OQ6), store as derived artifacts +(S2 vault) — the Opus bytes **plus the seek-index/setup-header sidecar** (§3.4a) — serve via a `?format=` +param resolved server-side to bytes + content-type, decode via a new `OpusFormatDecoder` in the existing +registry, **seek accurately via the precomputed index**. *Why recommended:* additive (C2), reuses every +existing seam (the processor orchestration, the waveform-datum derived-artifact pattern, the `Range` path, +the decoder registry, the header-retention discipline), and the only genuinely new code is one transcode +step (+ index build) + one decoder (+ index resolver). **Three** derived artifacts per track (Opus bytes, +seek sidecar, and the existing waveform datum), all regenerable. **Direction B — On-the-fly transcode at delivery (no stored Opus artifact).** Transcode WAV→Opus per request in the stream endpoint, streaming the Opus out as it encodes. *Why not (default):* moves @@ -274,10 +454,13 @@ recorded only because "just store Opus" is the tempting simplification and the s extension); the artifact is a new derived vault resource (the `track-waveforms` precedent is exactly this). Phase 18 adds **three new leaf implementations** and **zero changes to existing format code** — the strongest possible OCP signal that the seams were designed right. -- **SRP, preserved.** Transcoding is a content-domain processor concern (`DeepDrftContent`); delivery - selection is a thin endpoint concern (`DeepDrftAPI` resolves a param to an artifact); decode is the - `OpusFormatDecoder`'s concern; byte↔time math stays inside that decoder via `calculateByteOffset`. - No responsibility crosses a boundary it doesn't already own. 
+- **SRP, preserved.** Transcoding **and the seek-index build** are content-domain processor concerns + (`DeepDrftContent`); delivery selection is a thin endpoint concern (`DeepDrftAPI` resolves a param to an + artifact, and serves the sidecar); decode is the `OpusFormatDecoder`'s concern; byte↔time math stays + inside that decoder via `calculateByteOffset` (now reading the index, not interpolating). No + responsibility crosses a boundary it doesn't already own. The seek index is built **once, where the + stream is already walked** (transcode) — the natural home for an exact transfer function, never + recomputed at request time. - **DIP / "one source, multiple views."** One `TrackEntity`/`EntryKey` is the single source; "lossless WAV" and "low-data Opus" are two *views* (renderings) of it, diverging only at the delivery/decode layer — the same discipline the dark-mode and track-browse surfaces follow. @@ -289,13 +472,77 @@ recorded only because "just store Opus" is the tempting simplification and the s --- -## 4. Format selection — the product surface (deliberately under-specified; see OQ1/OQ2) +## 4. Format selection — the product surface (RESOLVED — global, via a Settings menu) -Daniel has **not** specified the selection UX. What is settled by his direction: there are two formats, -Opus is the bandwidth-friendly **default-candidate**, lossless is the kept option. What is open: how a -listener expresses the choice, whether it is remembered, and whether the default is global or adapts. -These are genuine product calls — see §6. The *mechanism* (a `?format=` param the player sends; §3.3) -supports any of the policies, so the policy can be decided after the substrate lands. +**Resolved (Daniel, 2026-06-23):** the listener's quality choice is **global** (one session/visitor-level +"streaming quality" preference, not per-track), Opus is the **default** (capability-gated), and the choice +is **remembered** following the dark-mode persistence pattern. 
Crucially: *"Global is perfect, but we need +a menu system for settings, don't just slap the quality control directly in the app bar."* So the toggle +does **not** sit bare in the app bar — it lives inside a proper **public-site Settings menu** (§4a), of +which it is the **first occupant**. + +- **What the listener sees.** A Settings affordance in the public app bar opens a Settings menu; inside it, + a "Streaming quality" control with two options — **Low-data (Opus)** / **Lossless (WAV)** — defaulting to + Low-data. Picking lossless flips the global preference; the player sends the matching `?format=` on + subsequent stream requests (§3.3). On a browser that can't decode Ogg Opus, the control is shown but the + effective stream is lossless (capability gate, §3.4 / OQ2) — surface this honestly rather than letting + the listener pick a format that silently can't play. +- **Default before any choice:** Opus, capability-gated (OQ2 RESOLVED). A first-time visitor on a capable + browser streams Opus; on an incapable browser, lossless. +- **Persistence:** mirror the dark-mode seam exactly (OQ3 RESOLVED) — see §4a. + +### 4a. The Settings menu surface (NEW — scoping + the dark-mode persistence pattern) + +Daniel asked for a **menu system for settings**, not a control bolted onto the app bar, and noted the +existing **dark-mode toggle** is a natural future tenant of the same menu (design for adaptability — build +the menu so dark mode *could* move into it later, but **do not force that migration now**). + +**Scoping recommendation: a small sub-track *within* Phase 18 (wave 18.6), not its own phase.** Reasoning: + +- The menu's only **required** occupant right now is the quality toggle, which Phase 18 owns end to end — + splitting the shell into a separate phase would create a phase whose sole deliverable is an empty menu + waiting for Phase 18's toggle. That is ceremony, not separation of concerns. 
+- The menu is **small** — an app-bar trigger + a MudBlazor menu/popover + the persistence seam (which the + quality toggle needs *anyway*). It is not a platform; it is a container with one tenant. +- It carries a real **design-for-adaptability** obligation (it must be able to host dark mode and future + settings later), but that is a *shape* requirement on a small surface, not a phase's worth of work. + +So: **build the Settings-menu shell as part of Phase 18 (wave 18.6), with the quality toggle as its first +occupant, designed so dark mode and future preferences can plug in without restructuring.** Flag for +Daniel: *if he wants the menu shell proven/landed independently before the quality toggle plugs in*, 18.6 +can be split into "menu shell" then "quality toggle plugs in" — but they are small enough to land together. +This is **not** recommended as its own top-level phase. (If Daniel disagrees and wants a dedicated +"Public Settings Menu" phase that Phase 18's toggle then targets, that is a clean alternative — it just +front-loads a surface with no second tenant yet. Recommendation stands: sub-track.) + +**The menu shell — design-for-adaptability requirements (so it survives new tenants):** + +- A **settings-item abstraction**, not a hard-coded list. The menu renders a small set of settings entries; + adding dark mode later is adding an entry, not rewiring the menu. Each entry is a label + a control bound + to a persisted preference. +- A **single public-site settings object** carrying all listener preferences (today: streaming quality; + tomorrow: dark mode, and whatever follows). This is the `DarkModeSettings` analogue, generalized — call + it e.g. `PublicSiteSettings` / `ListenerSettings`. Dark mode's existing `DarkModeSettings` can fold into + it *later* without disturbing the menu. 
+ +**Persistence — mirror the dark-mode seam exactly (OQ3 RESOLVED).** The quality preference follows the +*identical* path dark mode already uses (root `CLAUDE.md` "Theming and dark mode"): + +1. **Cookie** — a `streamQuality` cookie (365-day, like `darkMode`), the durable truth. +2. **Server prerender read** — a service in `DeepDrftPublic` (sibling to `DarkModeService`) reads the + cookie during prerender and seeds the settings object, avoiding a wrong-default flash on first paint + (the streaming-quality analogue of the "wrong theme flash" fix). +3. **`PersistentComponentState` bridge** — the seeded preference carries from server prerender into the + WASM render (the same bridge `DarkModeSettings` and `NowPlayingStats`/`StatsClient` already use), so the + client boots already knowing the quality without a re-read flash or a re-fetch. +4. **Client cookie service** — a runtime client-side service (JS-interop cookie write, like the dark-mode + toggle) persists the choice when the listener changes it in the menu. + +**Why mirror rather than invent:** the dark-mode seam is the codebase's established, working pattern for "a +listener preference seeded at prerender, carried to WASM, persisted in a cookie." Reusing its shape means +the quality preference inherits the no-flash guarantee for free, and the eventual dark-mode-into-the-menu +migration is a *consolidation of two identical seams*, not a reconciliation of two different ones. (This is +the "one source, multiple views" / design-for-adaptability discipline applied to listener settings.) --- @@ -315,48 +562,65 @@ supports any of the policies, so the policy can be decided after the substrate l - **UC5 — Replace-audio regenerates Opus.** The existing replace-audio path (which already regenerates both waveform datums and re-derives duration) also regenerates the Opus artifact from the new source. 
-- **UC6 — Seek within an Opus stream.** Backward/forward seek resolves via the existing `Range` path; - the offset is the `OpusFormatDecoder`'s approximate page-aligned mapping (§3.4), re-syncing to the - next Ogg page — the VBR analogue of the WAV exact-offset seek. +- **UC6 — Seek within an Opus stream (accurately).** Backward/forward seek resolves via the existing + `Range` path; the offset comes from the `OpusFormatDecoder`'s **precomputed seek index** (§3.4a) — an + exact granule→byte lookup, then fine re-sync to the requested time within the bucket. The listener lands + at the **correct** time, not approximately, and without the full PCM decoded in memory. - **UC7 — Safari that can't decode Ogg Opus.** Capability-gated to the lossless path (§3.4), so the listener still plays audio. (Ties to OQ2 + Phase 1.7.) +- **UC8 — Listener switches streaming quality in the Settings menu.** The listener opens the public + Settings menu, flips "Streaming quality" from Low-data to Lossless (or back); the choice persists + (cookie, dark-mode pattern) and applies to subsequent stream requests via `?format=`. On next visit the + preference is seeded at prerender (no flash, no re-pick). (§4 / §4a.) +- **UC9 — Deep-link / windowed start away from byte 0.** A listener opens a stream at a mid-track position + (deep link, or a Phase 21 window that opens past byte 0) without ever reading the stream start. The + decoder still has the `OpusHead`/`OpusTags` setup bytes because they arrived with the up-front sidecar + fetch (§3.4a B), so the mid-stream slice is decodable immediately. (Composition case for Phase 21.) --- -## 6. Open questions for Daniel (genuine product decisions, not implementation detail) +## 6. Open questions — RESOLVED (Daniel, 2026-06-23) -- **OQ1 — Selection UX: how does a listener choose lossless vs. 
low-data?** Candidates: a global - toggle in the player bar / settings ("Stream quality: Low-data / Lossless"); a per-track control; an - automatic default with a manual override. Recommend a **single global quality toggle** (player bar - or a settings affordance) — it is the Spotify/Bandcamp/SoundCloud idiom (one account/session-level - "streaming quality" setting), low-friction, and matches a small-sharp-tool posture better than - per-track choosers. `[Daniel decision]` -- **OQ2 — Default policy: what does a listener get before they choose?** Opus is the - *default-candidate* per Daniel — confirm Opus-by-default. Sub-questions: should the default be - **capability-aware** (don't serve Ogg Opus to a browser that can't decode it — §3.4 Safari < 18.4)? - Should it be **network-aware** (Opus on cellular, lossless on wifi)? Recommend **Opus by default, - capability-gated** (fall back to lossless when the browser can't decode Ogg Opus), and **defer - network-awareness** as gold-plating for v1. `[Daniel decision]` -- **OQ3 — Is the choice remembered, and at what scope?** Per-session (resets each visit) vs. - persisted (cookie/`localStorage`, like the `darkMode` cookie) vs. (future) per-account once identity - exists. Recommend **persisted via a cookie/`localStorage` setting**, mirroring the dark-mode - precedent — one truth, seeded at prerender, carried to WASM. `[Daniel decision]` -- **OQ4 — Per-upload Opus control in the CMS, or always-on?** Should the CMS upload form let an admin - opt a track *out* of Opus generation (e.g. a track meant to be lossless-only), or is Opus always - generated for every track? Recommend **always-on** (simpler; Opus is additive and cheap to serve; - the listener's format choice already covers "I want lossless"). A per-track opt-out is a later - refinement if a real need appears. 
`[Daniel decision]` -- **OQ5 — Opus container/extension specifics.** Ogg Opus (`.opus` / `audio/ogg`) is the assumption - (broadest `decodeAudioData` support; Daniel said "Ogg Opus"). Confirm — vs. CAF-wrapped Opus (older - Safari) or WebM-Opus. Recommend **Ogg Opus** as Daniel directed; CAF-fallback for old Safari is not - worth it given the lossless fallback already covers those browsers (§3.4). `[Daniel steer — confirms - §3.4, not a blocker]` -- **OQ6 — Transcode execution model (flag, leans implementation).** Synchronous-at-upload is a - non-starter for 1 GB mixes (§3.1); the realistic options are a background/queued transcode after the - source is stored. This is largely staff-engineer's call, but it has a **product-visible - consequence**: a freshly uploaded track may be lossless-only for a short window until its Opus - artifact finishes. Confirm that "Opus appears shortly after upload, lossless available immediately" - is acceptable (it is the waveform-datum model already in place). `[Daniel steer]` +All six original open questions are resolved. Kept visible per file convention, each with the decision and +the section that now carries it. One new open question (OQ7) is raised by the seek-model design; it is a +narrow tuning/scoping call, not a blocker. + +- **OQ1 — Selection UX — RESOLVED: global, via a Settings *menu* (not a bare app-bar control).** Daniel: + *"Global is perfect, but we need a menu system for settings, don't just slap the quality control directly + in the app bar."* So: one global quality preference, surfaced inside a new **public-site Settings menu** + (§4 / §4a), of which the quality toggle is the first occupant. The menu is scoped as a **Phase 18 + sub-track (wave 18.6)**, designed so dark mode (its natural future tenant) can plug in later. 
`[RESOLVED + — §4 / §4a]` +- **OQ2 — Default policy — RESOLVED: Opus by default, capability-gated.** Opus is the default; on a browser + that cannot decode Ogg Opus (Safari < 18.4, §3.4), fall back to lossless rather than serving an + undecodable stream. Network-awareness (Opus on cellular / lossless on wifi) remains **deferred** as + gold-plating. `[RESOLVED — §3.4, §4]` +- **OQ3 — Remembered choice — RESOLVED: persisted, following the dark-mode pattern.** A `streamQuality` + cookie seeded at server prerender → settings object → `PersistentComponentState` bridge into WASM → + client cookie service for runtime writes. The full dark-mode seam mirrored (§4a). `[RESOLVED — §4a]` +- **OQ4 — Per-upload Opus control — RESOLVED: always-on + backfill.** Opus is generated for **every** + track, always (no per-upload opt-out). **Plus** a bulk **Backfill-Opus** CMS action processes the + existing catalogue. (The listener's lossless choice already covers "I want lossless," so a per-track + opt-out earns nothing.) `[RESOLVED — §3.1, UC4, wave 18.5]` +- **OQ5 — Container — RESOLVED: Ogg Opus.** `.opus` / `audio/ogg` (broadest `decodeAudioData` support). No + CAF/WebM fallback — the lossless path already covers browsers that can't decode Ogg Opus (§3.4). + `[RESOLVED — §3.4]` +- **OQ6 — Transcode execution model — RESOLVED: background job after the file is available; uploader shows + a Post-Processing phase.** The source is stored and the track is playable losslessly **first**; the Opus + transcode (+ seek-index build) runs as a **background job** afterward; the CMS upload progress meter + gains a visible **Post-Processing** phase reflecting the transcode status (§3.1a). A freshly uploaded + track is lossless-only until its Opus finishes — accepted, and now made visible rather than implicit. 
+ `[RESOLVED — §3.1a]` + +**New open question raised by the seek-model design (§3.4a) — narrow, non-blocking:** + +- **OQ7 — Seek-index granularity (tuning, leans implementation).** The seek index trades precision against + size: per-Ogg-page (most precise, largest) vs. coarser time buckets snapped to page starts. Recommend + **~1–2 s buckets** (~58 KB for a 1-hour mix at 1 s; the decoder fine-re-syncs within the bucket so seek + *accuracy* is unaffected — only the in-bucket trim distance changes). This is largely staff-engineer's + call at implementation; flagged because the *number* is a deliberate choice and Daniel may have a feel + for acceptable index size vs. in-bucket trim. Does **not** block — the shape (precomputed exact + granule→byte, page-snapped) is fixed regardless of the bucket size. `[Daniel steer — not a blocker]` --- @@ -371,56 +635,95 @@ supports any of the policies, so the policy can be decided after the substrate l - **AC3 — Additive, non-breaking (C2).** The existing lossless WAV path is byte-for-byte unchanged; a track with no Opus artifact still plays losslessly; `?format=opus` on such a track falls back to lossless (no 404, no silence). -- **AC4 — Transcode at ingest, regenerable (C6).** A new upload produces an Opus artifact best-effort - after the source is stored; a transcode failure does not block the upload or break playback; a - Backfill-Opus action (re)generates artifacts for existing tracks; replace-audio regenerates the - Opus artifact from the new source. +- **AC4 — Transcode at ingest as a background job, regenerable (C6, OQ6).** A new upload stores the source + and is playable losslessly **immediately**; the Opus artifact (+ seek-index/setup-header sidecar) is + produced by a **background job** afterward; a transcode failure does not block the upload or break + playback; a Backfill-Opus action (re)generates artifacts for existing tracks; replace-audio regenerates + the Opus artifact and its sidecar from the new source. 
+- **AC4a — Post-Processing phase is visible on the upload meter (OQ6, §3.1a).** After the byte-transfer and + server-persist phases, the CMS upload progress UI shows a **Post-Processing** phase reflecting the + background transcode (queued → transcoding → done/failed). The admin is never blocked waiting on the + transcode; the track is live before Post-Processing finishes. - **AC5 — Opus seek via the existing `Range` path (C3).** Forward and backward seek in an Opus stream resolve through the landed `Range: bytes=X-` primitive, with the offset coming from - `OpusFormatDecoder.calculateByteOffset`; no new seek mechanism is introduced. + `OpusFormatDecoder.calculateByteOffset`; no new seek *transport* mechanism is introduced. +- **AC5a — Seek-index + setup-header sidecar exists and is fetched once (§3.4a).** Every track with an Opus + artifact has a sidecar carrying the setup header (`OpusHead`/`OpusTags`) and the granule→byte seek index; + the client fetches and parses it once on track load (into `OpusSeekData`) before issuing any seek. +- **AC9 (the seek-accuracy criterion) — an Opus seek lands at the *correct* time, not approximately.** + Seeking to time `t` in an Opus stream resolves via the precomputed index and lands playback at `t` + (within the fine-resync tolerance — sub-second at the recommended bucket granularity), **measurably + accurate**, not a `byteRate`/interpolation estimate. Verifiable: seek to a known marker (e.g. a downbeat + at a known timestamp) and confirm playback resumes there, not seconds off. This holds **without** the + full PCM decoded in memory (composes with Phase 21). - **AC6 — No format branches leak (C4).** The only Opus-specific code is `OpusFormatDecoder`, its - `OpusSeekData`, the one `createFormatDecoder` selection arm, and the transcode processor + delivery - param resolution. The format-agnostic player/scheduler code is unchanged. 
+ `OpusSeekData` (carrying the index), the one `createFormatDecoder` selection arm, the transcode processor + (+ index build), the sidecar artifact + its serving, and the delivery param resolution. The + format-agnostic player/scheduler code is unchanged. - **AC7 — Capability-safe default (OQ2).** A browser that cannot decode Ogg Opus is served (or falls back to) the lossless path and plays audio; no listener gets silence because of codec support. -- **AC8 — Windowing-ready (the Phase 21 handshake).** The `OpusFormatDecoder`'s approximate byte↔time - mapping is the one Phase 21's windowed refill will call; Opus playback must be windowable by the - same machinery (verified jointly when Phase 21 lands on top — see §8 / Phase 21 cross-ref). +- **AC8 — Windowing-ready (the Phase 21 handshake).** The `OpusFormatDecoder`'s **index-based** byte↔time + resolver is the one Phase 21's windowed refill calls; Opus playback must be windowable by the same + machinery, and a windowed refill that opens away from byte 0 still decodes (setup header from the + sidecar — UC9). Verified jointly when Phase 21 lands on top (see §8 / Phase 21 cross-ref). +- **AC10 — The Settings menu hosts the quality toggle and persists the choice (§4 / §4a).** The public app + bar opens a Settings menu containing a "Streaming quality" control (Low-data / Lossless, defaulting to + Low-data, capability-gated); changing it persists via the `streamQuality` cookie and is seeded at + prerender on the next visit (no flash). The menu shell is built so a future dark-mode entry can plug in + without restructuring. --- ## 8. Wave decomposition -Dependency shape: `18.1 → 18.2 → {18.3, 18.4}`, with `18.5` validating end-to-end. **18.1 (the -transcode/derived-artifact ingest) is the cold-start prerequisite** — until an Opus artifact exists, -nothing downstream has bytes to serve or decode. 
18.3 (delivery param) and 18.4 (the decoder) are -largely parallel once 18.2 (storage/lookup) settles, but both need an artifact to test against. +Dependency shape: `18.1 → 18.2 → {18.3, 18.4}`, with `18.5` (backfill + e2e) and `18.6` (settings menu) +on top. **18.1 (the transcode + seek-index/setup-header derived artifacts) is the cold-start +prerequisite** — until those artifacts exist, nothing downstream has bytes to serve, decode, or seek +against. 18.3 (delivery param) and 18.4 (the decoder + index resolver) are largely parallel once 18.2 +(storage/lookup) settles, but both need artifacts to test against. **18.6 (the Settings menu) is the only +wave with no audio-pipeline dependency** — it can proceed in parallel with the whole stack; it merely needs +the `?format=` mechanism (18.3) wired before the toggle has anything to drive. -- **18.1 — Ingest transcode: derive + store the Opus artifact (cold-start; load-bearing).** New +- **18.1 — Ingest transcode + seek-index + setup-header (cold-start; load-bearing).** New `OpusTranscodeService`/processor in `DeepDrftContent`, invoked post-store from - `UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`; produces Ogg Opus fullband - 320; stores it as a derived artifact (S2 vault recommended). Failure-tolerant (C6) and off the hot - path (background/queued — OQ6). **Independent of the delivery/decoder waves; can begin immediately.** -- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention and the server-side - resolution "given `EntryKey` + format, return the right `AudioBinary` + content-type," including the - C2 fallback (no Opus → lossless). **Depends on 18.1** (an artifact must exist to resolve to). 
-- **18.3 — Delivery: format param + proxy threading.** `?format=opus|lossless` on the + `UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`, **as a background job** (OQ6, + §3.1a); produces Ogg Opus fullband 320; **walks the encoded stream once to build the granule→byte seek + index and extract the `OpusHead`/`OpusTags` setup header** (§3.4a A/B); stores the Opus bytes **and** the + combined seek/setup **sidecar** as derived artifacts (S2 vault recommended). Failure-tolerant (C6). + **Independent of the delivery/decoder waves; can begin immediately.** +- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention (Opus bytes + sidecar) + and the server-side resolution "given `EntryKey` + format, return the right `AudioBinary` + content-type + (+ the sidecar on its own endpoint/path)," including the C2 fallback (no Opus → lossless). **Depends on + 18.1** (artifacts must exist to resolve to). +- **18.3 — Delivery: format param + sidecar serving + proxy threading.** `?format=opus|lossless` on the `DeepDrftAPI` track stream endpoint (resolves via 18.2), forwarded through the `DeepDrftPublic` - `TrackProxyController` (mirror the existing `offset` param threading), and the `Range` handler - serving the chosen artifact's bytes. The player sends the param via `TrackMediaClient`. **Depends on - 18.2.** Parallel-ok with 18.4. -- **18.4 — `OpusFormatDecoder` in the player stack.** New `IFormatDecoder` implementation - (Ogg-page-aligned `getAlignedSegmentSize` via `OggS` scan, `OpusHead` setup carry in - `wrapSegment`/continuation, approximate page-interpolation `calculateByteOffset` with an - `OpusSeekData` accelerator); one new arm in `AudioPlayer.createFormatDecoder` on - `audio/ogg`/`audio/opus`. Capability detection for the lossless fallback (§3.4, OQ2). **Depends on - 18.2** (needs Opus bytes to decode). Parallel-ok with 18.3; they meet at 18.5. 
-- **18.5 — Backfill + selection UX + end-to-end validation.** The Backfill-Opus CMS bulk action (third - sibling to Generate-Profiles / Backfill-High-res) and replace-audio Opus regeneration; the listener - selection control per OQ1/OQ3 (global persisted quality toggle, recommended); and the AC1–AC8 - acceptance pass — including AC8's confirmation that Opus is windowable so Phase 21 can build on it. - **Depends on 18.1–18.4.** (Selection UX can be split out if Daniel wants the substrate proven before - the control lands — flag at planning time.) + `TrackProxyController` (mirror the existing `offset` param threading), and the `Range` handler serving + the chosen artifact's bytes; **plus serving the seek/setup sidecar** (a `GET …/opus/seekdata`-style path, + proxied the same way). The player sends the format param via `TrackMediaClient`. **Depends on 18.2.** + Parallel-ok with 18.4. +- **18.4 — `OpusFormatDecoder` + the index-based seek resolver in the player stack.** New `IFormatDecoder` + implementation: Ogg-page-aligned `getAlignedSegmentSize` via `OggS` scan; `OpusHead`/`OpusTags` setup + carry in `wrapSegment`/the continuation path (sourced from the cached sidecar, §3.4a B); **`calculateByteOffset` + that binary-searches the precomputed seek index** (NOT interpolation), with an `OpusSeekData` accelerator + holding the parsed index + setup bytes; the **one-time sidecar fetch + parse** on track load. One new arm + in `AudioPlayer.createFormatDecoder` on `audio/ogg`/`audio/opus`. Capability detection for the lossless + fallback (§3.4, OQ2). **Depends on 18.2** (needs Opus bytes + sidecar). Parallel-ok with 18.3; they meet + at 18.5. +- **18.5 — Backfill + replace-audio + end-to-end validation (incl. 
seek accuracy).** The Backfill-Opus CMS + bulk action (third sibling to Generate-Profiles / Backfill-High-res), which (re)builds Opus bytes + the + sidecar for existing tracks; replace-audio Opus + sidecar regeneration; and the AC1–AC10 acceptance pass + — **including AC9 (an Opus seek lands at the correct time, not approximately)** and AC8's confirmation + that Opus is windowable (index resolver + sidecar setup header) so Phase 21 can build on it. **Depends on + 18.1–18.4.** +- **18.6 — Public Settings menu + the quality toggle (the listener selection UX).** The new public-site + Settings-menu shell (§4a): an app-bar trigger + MudBlazor menu hosting a settings-item abstraction, the + `PublicSiteSettings`/`ListenerSettings` object, and the dark-mode-pattern persistence seam (`streamQuality` + cookie + a `DeepDrftPublic` prerender-read service + `PersistentComponentState` bridge + client cookie + service). The **quality toggle is its first occupant** (Low-data/Lossless, Opus default, capability-gated), + driving the `?format=` the player sends (needs 18.3). Built design-for-adaptability so dark mode can plug + in later without restructuring (not migrated now). **Depends on 18.3** (the toggle needs the format + mechanism); the menu *shell* can be built ahead of that. *Splittable* into "menu shell" + "toggle plugs + in" if Daniel wants the shell proven first — but small enough to land together (§4a). --- @@ -432,7 +735,10 @@ largely parallel once 18.2 (storage/lookup) settles, but both need an artifact t phase.** Phase 21's C5 invariant ("WAV-only shipping target; must not foreclose MP3/FLAC") is now driven by Opus's VBR/paged seek math; Phase 21 OQ5 (adopt MSE) is resolved **NO** — the bespoke graph stays (the same C1 decision recorded here). Windowing a VBR/Opus stream uses - `OpusFormatDecoder.calculateByteOffset`'s approximate mapping — exactly the C5 case. 
+ `OpusFormatDecoder.calculateByteOffset`'s **accurate index-based mapping** (§3.4a — *not* the earlier + "approximate page-interpolation"; that language in the Phase 21 doc is corrected). Phase 21's windowed + refill calls the **same** index resolver an explicit seek does (§3.4a D), and a window that opens away + from byte 0 still decodes via the sidecar setup header (UC9). - `PLAN.md` Phase 4 (landed) / `COMPLETED.md` — the HTTP `Range: bytes=X-` primitive Opus seek reuses. - `PLAN.md` Phase 1.5 (gapless) / 1.6 (track-skip on error) / 1.7 (Safari) — 1.5's "encoder padding/priming" caveat applies to Opus (it has pre-skip samples in `OpusHead`); 1.6's @@ -452,7 +758,15 @@ largely parallel once 18.2 (storage/lookup) settles, but both need an artifact t - `DeepDrftPublic/Interop/audio/AudioPlayer.ts` (`createFormatDecoder`, lines 117–125) — the decoder registry gaining the Opus arm. - `DeepDrftPublic.Client/Clients/TrackMediaClient.cs` + `DeepDrftPublic/Controllers/TrackProxyController.cs` - — the media fetch + proxy that thread the new `?format=` param (mirroring `offset`). + — the media fetch + proxy that thread the new `?format=` param (mirroring `offset`), and proxy the new + seek/setup sidecar fetch. +- Root `CLAUDE.md` "Theming and dark mode" + `DarkModeService` (in `DeepDrftPublic`) + `DarkModeSettings` + (`DeepDrftPublic.Client.Common`) — the cookie → prerender-read → `PersistentComponentState` → client + cookie-service seam the **streaming-quality preference** (§4a) mirrors exactly; the eventual dark-mode- + into-the-Settings-menu migration consolidates two copies of this seam. +- `DeepDrftPublic.Client` `NowPlayingStats.razor` / `StatsClient` — the `PersistentComponentState` + prerender-bridge precedent (prerender fetch carried into WASM without a re-fetch/flash), the pattern the + quality preference's bridge follows; see the `tracksview-persistent-state-seam` auto-memory. 
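A minimal sketch of the index-based seek resolution described for 18.4 (a binary search over the precomputed granule→byte index, not interpolation). The `SeekPoint` shape, the `resolveSeek` name, the `preSkip` parameter, and the convention that each entry records the granule at which its page begins are all assumptions for illustration; the real sidecar layout is whatever §3.4a specifies, which is not reproduced here.

```typescript
// Hypothetical shape of one seek-index entry (an assumption, not the sidecar format):
// the granule position at which an Ogg page begins, and the byte offset of that
// page's `OggS` capture pattern inside the Opus artifact.
interface SeekPoint {
  granule: number;
  byteOffset: number;
}

// Ogg Opus granule positions count samples at 48 kHz regardless of the input rate.
const OPUS_GRANULE_RATE = 48000;

// Returns the last indexed page that begins at or before the target time.
// `index` is assumed to be sorted by ascending granule.
// `preSkip` is the OpusHead pre-skip: PCM position = granule - preSkip.
function resolveSeek(index: SeekPoint[], seconds: number, preSkip = 0): SeekPoint {
  if (index.length === 0) {
    throw new Error("empty seek index");
  }
  const target = Math.max(0, Math.round(seconds * OPUS_GRANULE_RATE)) + preSkip;

  let lo = 0;
  let hi = index.length - 1;
  let best = 0; // falls back to the first entry when the target precedes every entry
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid].granule <= target) {
      best = mid; // candidate; keep looking for a later page that still qualifies
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return index[best];
}
```

Under these assumptions, the returned `byteOffset` would feed the existing `Range: bytes=X-` request, and the difference between the target granule and the entry's `granule` tells the decoder how many decoded samples to discard so that playback lands on the requested time rather than on the page boundary.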
## Sources diff --git a/product-notes/phase-21-windowed-streaming-buffer.md b/product-notes/phase-21-windowed-streaming-buffer.md index e188ece..dd5c3fe 100644 --- a/product-notes/phase-21-windowed-streaming-buffer.md +++ b/product-notes/phase-21-windowed-streaming-buffer.md @@ -13,10 +13,15 @@ endpoint. > (`product-notes/phase-18-opus-low-data-streaming.md`) — is a prerequisite that sequences ahead of > windowing. Phase 21's windowing must work across **both** delivery formats (lossless WAV and Opus). > Its C5 invariant below already anticipated this ("must not foreclose MP3/FLAC"); **Opus is now the -> concrete VBR/containerized driver of C5.** Windowing an Opus stream uses the decoder's *approximate* -> byte↔time mapping (`OpusFormatDecoder.calculateByteOffset` — Ogg-page interpolation), exactly the C5 -> case — not the exact CBR-WAV `byteRate` math. Build the window machinery format-agnostically -> (§2 C3/C5) so it inherits Opus for free. +> concrete VBR/containerized driver of C5.** Windowing an Opus stream uses the decoder's **accurate +> index-based** byte↔time mapping (`OpusFormatDecoder.calculateByteOffset` — a binary search in the Phase 18 +> precomputed seek index), exactly the C5 case — *not* the exact CBR-WAV `byteRate` math, and *not* +> approximate Ogg-page interpolation. **Correction (Daniel, 2026-06-23):** an earlier draft described the +> Opus mapping as "approximate page interpolation"; the Phase 18 seek-model resolution rejected that — Opus +> seeking is **accurate**, backed by a precomputed seek index built at transcode time, so refill resolves to +> the *exact* page offset. The windowed refill controller calls the **same** index resolver an explicit seek +> does (Phase 18 §3.4a D); a window opening away from byte 0 still decodes via the Phase 18 sidecar setup +> header. Build the window machinery format-agnostically (§2 C3/C5) so it inherits Opus for free. --- @@ -66,14 +71,20 @@ docs. 
This phase **modifies that seam** — so the contract it must preserve is user-visible control, no change to seek/transport semantics beyond what the listener already experiences. Seek must still feel identical. - **C5 — Must window both delivery formats (WAV lossless AND Opus low-data).** Byte↔time mapping for - refill is exact and cheap for WAV (CBR: `byteRate` from the header). For VBR/containerized formats it - is approximate (the decoders carry TOC/SEEKTABLE/Ogg-page seek math). **Phase 18 (Opus) is sequenced - before this phase and is the concrete driver here:** an Ogg Opus 320 stream is VBR and page-paged, so - its `calculateByteOffset` is an *approximate* page-interpolation, not exact-offset. The window - machinery must express refill purely in terms of the decoder's existing `calculateByteOffset`, so the - same code windows WAV exactly and Opus approximately — **no WAV-special-cased offset math in the - window layer.** (MP3/FLAC decoders are already wired in the registry too — the registry dispatches on - content-type today; an `OpusFormatDecoder` joins them in Phase 18.) + refill is exact and cheap for WAV (CBR: `byteRate` from the header). **Phase 18 (Opus) is sequenced + before this phase and is the concrete VBR driver here** — and its mapping is **also exact**, but by a + different mechanism: an Ogg Opus 320 stream has no linear time↔byte relationship, so + `OpusFormatDecoder.calculateByteOffset` resolves via a **precomputed seek index** (granule→byte, built at + transcode; Phase 18 §3.4a), a binary search that returns the exact page offset — **not** an approximate + page interpolation. (An earlier draft of this invariant said "approximate"; the Phase 18 seek-model + resolution, Daniel 2026-06-23, made Opus seeking accurate. Corrected here.) 
The window machinery must + express refill purely in terms of the decoder's existing `calculateByteOffset`, so the same code windows + WAV (via `byteRate`) and Opus (via the index) — **no WAV-special-cased offset math in the window layer**, + and no approximation for either. A window that opens away from byte 0 must also prepend the decoder's + retained/sidecar setup header (Phase 18 §3.4a B) — the format-agnostic refill path already routes + continuations through the decoder's header-carry, so this comes for free. (MP3/FLAC decoders are already + wired in the registry too — the registry dispatches on content-type today; an `OpusFormatDecoder` joins + them in Phase 18.) - **C6 — No regression to the single-instance JS decoder concurrency guarantees.** The current code is careful that only one streaming loop touches the single JS `StreamDecoder` at a time (`DrainActiveStreamingTaskAsync`, the `_streamingCancellation` identity dance). Windowed refill