docs(plan): Phase 18 OQ resolutions + VBR-safe accurate Opus seek model

daniel-c-harvey
2026-06-23 05:26:58 -04:00
parent 6af6677a12
commit e3a4364b8c
3 changed files with 543 additions and 183 deletions
+82 -47
@@ -468,23 +468,44 @@ decoder-side `AudioPlayer.createFormatDecoder` is a **wired** strategy registry
step that *derives* an Opus 320 artifact per track (nothing derives Opus today), and (2) a **per-format
delivery selection** so one track serves as either WAV or Opus on request.
**Architectural spine — a derived artifact + a delivery param + one new decoder; three new leaf
implementations, zero changes to existing format code (the strong OCP signal).** Transcode is a new
processor sibling in `DeepDrftContent`, invoked post-store alongside `WaveformProfileService`,
**failure-tolerant and off the hot path** (background/queued — a 1 GB WAV transcode must not block the
upload response) — mirroring the landed waveform-datum pattern (derive at ingest, regenerate via a CMS
bulk action + ApiKey endpoint). The Opus bytes are a **derived artifact** stored like the high-res
waveform datum (recommend a dedicated `track-opus` vault, the `track-waveforms` precedent; final call
staff-engineer's). Delivery adds a **`?format=opus|lossless` param** (mirroring the existing `offset`
param threading through `TrackProxyController`) resolved server-side to the right artifact + content-type,
with a **lossless fallback** when no Opus artifact exists (additive, never 404/silence). The player gains
one `OpusFormatDecoder` (`IFormatDecoder`): Ogg-page-aligned segmenting (`OggS` scan — the FLAC
frame-sync analogue), `OpusHead` setup-bytes carry (the FLAC `streamInfoBytes` analogue), and an
**approximate** page-interpolation `calculateByteOffset` (Opus is VBR/paged — this is exactly the Phase
21 C5 case). **Browser constraint flagged:** Ogg-Opus `decodeAudioData` is Safari-18.4+ only (Chrome/FF
long-standing), so the Opus default must be **capability-gated** — fall back to the universal lossless
**Open questions RESOLVED (Daniel, 2026-06-23).** OQ1 selection UX → **global, via a new public-site
Settings menu** (not a bare app-bar control); OQ2 default → **Opus by default, capability-gated** (defer
network-awareness); OQ3 remembered → **persisted via the dark-mode seam** (cookie → prerender-read →
`PersistentComponentState` → client cookie service); OQ4 → **always-on Opus + Backfill-Opus**; OQ5 →
**Ogg Opus**; OQ6 transcode model → **background job after the file is available, with a visible
Post-Processing phase on the CMS upload meter.** One new tuning OQ (OQ7: seek-index granularity — recommend
~1–2 s buckets) is non-blocking.
**Architectural spine — a derived artifact set + a delivery param + one new decoder + a precomputed
accurate seek index; leaf implementations only, zero changes to existing format code (the strong OCP
signal).** Transcode is a new processor sibling in `DeepDrftContent`, invoked post-store alongside
`WaveformProfileService` **as a background job** (a 1 GB WAV transcode must not block the upload; the source
is stored and the track plays lossless *first*, then Opus is derived), mirroring the landed waveform-datum
pattern (derive at ingest, regenerate via a CMS bulk action + ApiKey endpoint). The Opus bytes are a
**derived artifact** stored like the high-res waveform datum (recommend a dedicated `track-opus` vault, the
`track-waveforms` precedent; final call staff-engineer's). Delivery adds a **`?format=opus|lossless` param**
(mirroring the existing `offset` param threading through `TrackProxyController`) resolved server-side to the
right artifact + content-type, with a **lossless fallback** when no Opus artifact exists (additive, never
404/silence). The player gains one `OpusFormatDecoder` (`IFormatDecoder`): Ogg-page-aligned segmenting
(`OggS` scan — the FLAC frame-sync analogue) and `OpusHead`/`OpusTags` setup-bytes carry (the FLAC
`streamInfoBytes` analogue). **Browser constraint flagged:** Ogg-Opus `decodeAudioData` is Safari-18.4+ only
(Chrome/FF long-standing), so the Opus default is **capability-gated** — fall back to the universal lossless
path on browsers that can't decode it.
**VBR-safe ACCURATE seeking (Daniel, 2026-06-23 — supersedes the earlier "approximate" hand-wave).** Raw
byte-offset seek and rough page interpolation are inadequate for VBR Opus — there is no linear time↔byte
relationship. The fix is an **accurate transfer function built at transcode time** (the one moment the
whole encoded stream is walked): a precomputed **seek index** mapping Ogg-page `granulepos` (48 kHz sample
counts → time) → exact byte offset (recommend ~1–2 s buckets snapped to page starts; ~58 KB for a 1-hour
mix). The decode **setup header** (`OpusHead`/`OpusTags`, needed to decode any mid-stream slice) is made
available too. Recommended concrete design: **one sidecar artifact per track = `[setup header][seek
index]`, built at transcode, stored beside the Opus bytes, fetched once on track load**, parsed into
`OpusSeekData`. Client seek flow: `calculateByteOffset(t)` binary-searches the index for the exact page
offset → `Range: bytes=X-` fetch (landed Phase 4 primitive, unchanged) → prepend the cached setup header →
decode → fine re-sync to `t` within the bucket. **The listener lands at the correct time, not
approximately** (AC9), **without** the full PCM in memory — so it composes with Phase 21 windowed refill,
which calls the **same** index resolver. The earlier "approximate page-interpolation" language is rejected.
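
The index lookup described above can be sketched in TypeScript as follows. This is a minimal illustration, assuming the sidecar has already been parsed into an array sorted by time; the `SeekEntry` shape and the `resolveByteOffset` name are hypothetical and are not the plan's final API.

```typescript
// Hypothetical shape: one entry per bucket, sorted ascending by timeSeconds.
interface SeekEntry {
  timeSeconds: number; // start sample / 48000
  byteOffset: number;  // exact Ogg page start within the Opus file
}

// Returns the byte offset of the last indexed page that starts at or before t.
// No interpolation: the result is always an offset that was recorded at transcode time.
function resolveByteOffset(index: readonly SeekEntry[], t: number): number {
  if (index.length === 0) return 0;
  let lo = 0;
  let hi = index.length - 1;
  let best = 0; // falls back to the first entry when t precedes it
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid].timeSeconds <= t) {
      best = mid;     // candidate; look for a later one
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return index[best].byteOffset;
}
```

The returned offset would then feed the `Range: bytes=X-` fetch; the fine re-sync to `t` inside the bucket happens after decode and is not shown here.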
**Constraints/invariants:** keep the bespoke graph (no MSE); preprocessing is **additive** (WAV path
untouched, byte-for-byte; a track with no Opus artifact still plays losslessly); reuse the landed
`Range`/offset seek path; no format branches leak outside the new decoder + one selection arm + the
@@ -492,38 +513,49 @@ transcode/delivery seam; transcode failure must not block ingest; format selecti
decision resolving one `EntryKey` to one of two artifacts (one source, two views — **not** a second
`TrackEntity` row, which would fracture share/queue/play-count/release identity).
Sequenced as five waves. `18.1 → 18.2 → {18.3, 18.4} → 18.5`. **18.1 (ingest transcode + derived
artifact) is the cold-start prerequisite** — nothing downstream has bytes to serve or decode until an
Opus artifact exists.
Sequenced as six waves. `18.1 → 18.2 → {18.3, 18.4} → 18.5`, with `18.6` (Settings menu) able to run in
parallel (it needs only 18.3's format mechanism before its toggle is live). **18.1 (ingest transcode +
seek-index + setup-header derived artifacts) is the cold-start prerequisite** — nothing downstream has
bytes to serve, decode, or seek against until those artifacts exist.
- **18.1 — Ingest transcode: derive + store the Opus artifact (cold-start; load-bearing).** New
- **18.1 — Ingest transcode + seek-index + setup-header (cold-start; load-bearing).** New
`OpusTranscodeService`/processor in `DeepDrftContent`, invoked post-store from
`UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`; produces Ogg Opus fullband 320;
stores it as a derived artifact (recommend a `track-opus` vault). Failure-tolerant; off the hot path
(background/queued). **Independent of the delivery/decoder waves — can begin immediately.**
- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention + server-side "given
`EntryKey` + format, return the right `AudioBinary` + content-type," including the lossless fallback.
**Depends on 18.1.**
- **18.3 — Delivery: `?format=opus|lossless` param + proxy threading.** On the `DeepDrftAPI` stream
endpoint (resolves via 18.2), forwarded through `TrackProxyController` (mirror `offset`), `Range`
serving the chosen artifact; player sends it via `TrackMediaClient`. **Depends on 18.2; parallel-ok
with 18.4.**
- **18.4 — `OpusFormatDecoder` in the player stack.** New `IFormatDecoder` (Ogg-page segmenting,
`OpusHead` carry, approximate page-interpolation `calculateByteOffset` with an `OpusSeekData`
accelerator) + one arm in `createFormatDecoder` on `audio/ogg`/`audio/opus`; capability detection for
the lossless fallback. **Depends on 18.2; parallel-ok with 18.3.**
- **18.5 — Backfill + selection UX + end-to-end validation.** "Backfill Opus" CMS bulk action (third
sibling to Generate-Profiles / Backfill-High-res) + replace-audio Opus regeneration; the listener
selection control (recommend a global persisted quality toggle); the AC1–AC8 acceptance pass including
the Phase-21 handshake (Opus is windowable by the same machinery). **Depends on 18.1–18.4.**
`UnifiedTrackService.UploadAsync` alongside `WaveformProfileService` **as a background job** (OQ6);
produces Ogg Opus fullband 320; **walks the encoded stream once to build the granule→byte seek index and
extract the `OpusHead`/`OpusTags` setup header**; stores the Opus bytes **and** the combined seek/setup
**sidecar** as derived artifacts (recommend a `track-opus` vault). Failure-tolerant. **Independent of the
delivery/decoder waves — can begin immediately.**
- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention (Opus bytes + sidecar) +
server-side "given `EntryKey` + format, return the right `AudioBinary` + content-type (+ the sidecar),"
including the lossless fallback. **Depends on 18.1.**
- **18.3 — Delivery: `?format=opus|lossless` param + sidecar serving + proxy threading.** On the
`DeepDrftAPI` stream endpoint (resolves via 18.2), forwarded through `TrackProxyController` (mirror
`offset`), `Range` serving the chosen artifact; **plus serving the seek/setup sidecar**; player sends the
format param via `TrackMediaClient`. **Depends on 18.2; parallel-ok with 18.4.**
- **18.4 — `OpusFormatDecoder` + index-based seek resolver in the player stack.** New `IFormatDecoder`
(Ogg-page segmenting via `OggS` scan, `OpusHead`/`OpusTags` setup carry from the cached sidecar,
**`calculateByteOffset` that binary-searches the precomputed seek index** — NOT interpolation — with an
`OpusSeekData` accelerator holding the parsed index + setup bytes, and the one-time sidecar fetch+parse on
track load) + one arm in `createFormatDecoder` on `audio/ogg`/`audio/opus`; capability detection for the
lossless fallback. **Depends on 18.2; parallel-ok with 18.3.**
- **18.5 — Backfill + replace-audio + end-to-end validation (incl. seek accuracy).** "Backfill Opus" CMS
bulk action (third sibling to Generate-Profiles / Backfill-High-res), rebuilding Opus bytes + sidecar for
existing tracks; replace-audio Opus + sidecar regeneration; the AC1–AC10 acceptance pass **including AC9
(an Opus seek lands at the correct time, not approximately)** and the Phase-21 handshake (Opus windowable
via the index resolver + sidecar setup header). **Depends on 18.1–18.4.**
- **18.6 — Public Settings menu + quality toggle (the listener selection UX).** New public-site
Settings-menu shell (app-bar trigger + MudBlazor menu + a settings-item abstraction + a
`PublicSiteSettings`/`ListenerSettings` object + the dark-mode-pattern persistence seam: `streamQuality`
cookie, a `DeepDrftPublic` prerender-read service, `PersistentComponentState` bridge, client cookie
service); the **quality toggle is its first occupant** (Low-data/Lossless, Opus default, capability-gated)
+ the CMS upload meter's **Post-Processing phase** (OQ6). Built design-for-adaptability so dark mode can
plug in later without restructuring (not migrated now). **Depends on 18.3** for the toggle; the menu shell
can be built ahead. *Splittable* (shell, then toggle) if Daniel wants the shell proven first.
**Dependency shape:** `18.1 → 18.2 → {18.3 ∥ 18.4} → 18.5`; 18.1 is the only cold-start wave.
**Phase-level: 18 precedes Phase 21.** **Open questions for Daniel (spec §6):** selection UX (recommend a
single global quality toggle); default policy (recommend Opus-by-default, capability-gated; defer
network-awareness); whether the choice is remembered + scope (recommend persisted cookie/`localStorage`,
the dark-mode precedent); per-upload Opus opt-out vs. always-on (recommend always-on); Ogg-vs-CAF/WebM
container (recommend Ogg Opus as directed); transcode execution model (background/queued — a track is
lossless-only briefly until its Opus finishes; confirm acceptable). None block 18.1.
**Dependency shape:** `18.1 → 18.2 → {18.3 ∥ 18.4} → 18.5`; `18.6 ∥` (needs 18.3 for the live toggle);
18.1 is the only cold-start wave. **Phase-level: 18 precedes Phase 21** (windowed refill consumes the Phase
18 seek-index resolver). **OQ1–OQ6 RESOLVED (above); OQ7 (seek-index granularity, recommend ~1–2 s buckets)
is a non-blocking tuning steer.** None block 18.1.
---
@@ -539,9 +571,12 @@ endpoint, no schema change.
derived Ogg Opus 320 low-data path, Phase 18) is a prerequisite that comes first; windowing must work
across **both** delivery formats. Phase 21's C5 invariant already anticipated this ("must not foreclose
MP3/FLAC"); **Opus is now the concrete VBR/paged driver** — windowing an Opus stream uses the decoder's
*approximate* byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, Ogg-page interpolation), not
the exact CBR-WAV `byteRate` math. Build the window machinery format-agnostically so it inherits Opus
for free.
**accurate index-based** byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, a binary search in the
Phase 18 precomputed seek index — *not* the exact CBR-WAV `byteRate` math, and *not* approximate page
interpolation: VBR-safe and exact, per the Phase 18 seek-model resolution 2026-06-23). The windowed refill
controller calls the **same** index resolver an explicit seek does, and a window opening away from byte 0
still decodes via the Phase 18 sidecar setup header. Build the window machinery format-agnostically so it
inherits Opus for free.
The network path already streams in adaptive 16–64 KB chunks. The accumulation is on the **decode
side**: `PlaybackScheduler` holds an `AudioBuffer[]` it **never evicts** ("Supports pause/resume/seek by
+438 -124
@@ -1,8 +1,17 @@
# Phase 18 — Opus Low-Data Streaming (dual-format lossless + Opus delivery)
Product spec. Status: **design / framing — implementation-ready pending Daniel's open-question calls.**
Product spec. Status: **design / framing — open questions RESOLVED (Daniel, 2026-06-23); implementation-ready.**
Author: product-designer. Date: 2026-06-23. **No code has been written by this doc.**
> **Resolution pass (Daniel, 2026-06-23).** OQ1–OQ6 are resolved (see §6 — each marked RESOLVED, kept
> visible per file convention). Two resolutions reshaped the spec materially: (a) the listener quality
> selection lives inside a **new public-site Settings menu surface** (not a bare app-bar control) — §4 +
> §4a; and (b) Daniel rejected the "approximate page-interpolation" seek hand-wave outright — **VBR-safe
> *accurate* seeking is now a first-class part of the architecture** (a precomputed seek-index artifact +
> a separately-available setup header). §3.4 is rewritten and a dedicated seek-model section (§3.4a)
> added. The Phase 21 cross-reference is updated to read "accurate index-based mapping," not
> "approximate."
This phase is the concrete realization of the long-deferred **"Non-WAV formats"** intent
(`CONTEXT.md §5`, the "1.2" the streaming-feature items reference). It supersedes the abstract "a
processor per format + a decoder strategy" framing with a specific, Daniel-directed product: **two
@@ -13,12 +22,17 @@ Surfaces (named precisely):
- **Ingest / preprocessing:** `DeepDrftContent` (`AudioProcessor` / `AudioProcessorRouter` /
`TrackContentService` / `WaveformProfileService`) + `DeepDrftAPI` (upload/persist —
`UnifiedTrackService.UploadAsync`, replace-audio) + `DeepDrftManager` (CMS upload form, only if a
per-upload control is wanted — see OQ4).
- **Delivery / decode:** `DeepDrftAPI` (the track stream endpoint + `Range` handler) +
`DeepDrftPublic` proxy (`TrackProxyController`) + `DeepDrftPublic.Client` player stack
(`StreamingAudioPlayerService`, `TrackMediaClient`) + `DeepDrftPublic/Interop/audio` TS decoders
(`AudioPlayer.createFormatDecoder` registry, a new `OpusFormatDecoder`).
`UnifiedTrackService.UploadAsync`, replace-audio) + `DeepDrftManager` (CMS upload form — the
**Post-Processing phase** on the existing upload progress meter, §3.1a).
- **Delivery / decode:** `DeepDrftAPI` (the track stream endpoint + `Range` handler + the new
**seek-index** and **setup-header** sidecar endpoints, §3.4a) + `DeepDrftPublic` proxy
(`TrackProxyController`) + `DeepDrftPublic.Client` player stack (`StreamingAudioPlayerService`,
`TrackMediaClient`) + `DeepDrftPublic/Interop/audio` TS decoders (`AudioPlayer.createFormatDecoder`
registry, a new `OpusFormatDecoder`).
- **Listener settings (NEW surface):** `DeepDrftPublic.Client` — a public-site **Settings menu** (app-bar
menu/popover) hosting the quality toggle as its first occupant, with a dark-mode-pattern persistence
seam (cookie → settings object → `PersistentComponentState` → client cookie service). §4a. The
prerender-cookie read lives in `DeepDrftPublic` (alongside `DarkModeService`).
**Sequencing headline: Phase 18 comes BEFORE Phase 21 (Windowed Streaming Buffer).** Phase 21's
windowing must work across both formats — its C5 invariant already anticipated this ("must not
@@ -165,8 +179,44 @@ managed binding, or a libopus P/Invoke). The spec fixes the *artifact* (Ogg Opus
and the *seam* (a derived artifact produced post-store, regenerable, failure-tolerant), not the tool.
Note a real operational constraint to flag for implementation: transcoding a 1 GB WAV is **CPU- and
time-expensive** and must not block the upload response — it wants the same off-the-hot-path treatment
the upload body staging already gets (`Upload:StagingPath`), likely a background/queued step. This is
the single biggest implementation risk and is called out as such.
the upload body staging already gets (`Upload:StagingPath`). This is the single biggest implementation
risk and is called out as such. The execution model is now **decided** (OQ6): **the source is stored and
the track is playable (lossless) first, then the Opus transcode runs as a background job** — see §3.1a
for the user-visible consequence on the upload UI.
### 3.1a Transcode execution model + the Post-Processing upload phase (RESOLVED — OQ6)
**Execution model (Daniel, 2026-06-23): background process *after* the file is available.** The upload
flow is now two distinct server-side stages with a hard ordering:
1. **Transfer + store + persist (existing, synchronous).** The WAV body streams in (the landed
`ProgressStreamContent` two-phase cancellation), the source is stored in the vault, the `TrackEntity`
is persisted, the waveform datums are computed. At the end of this stage **the track is fully playable
losslessly** — nothing about Opus gates a successful upload.
2. **Opus transcode (NEW, background, after stage 1 completes).** A queued/background job reads the
stored source, transcodes to Ogg Opus 320, builds the **seek index** and extracts the **setup header**
(§3.4a), and stores all three derived artifacts. Until it finishes, `?format=opus` for that track
falls back to lossless (C2). On failure the track stays lossless-only and is eligible for Backfill-Opus
(C6).
**The upload progress meter gains a visible Post-Processing phase.** The CMS upload forms
(`BatchUpload.razor` / `BatchEdit.razor`) already render a progress meter driven by `ProgressStreamContent`
(byte-transfer progress) and the two-phase cancellation (idle window during transfer, response-wait budget
after the body completes). The transcode is a **third visible phase** appended to that meter — after the
existing "uploading bytes" and "server is persisting" phases, a **Post-Processing** phase reflects the
background transcode's status (queued → transcoding → done / failed). This is an *addition* to the
existing meter, not a new UI.
- The admin sees: bytes transfer → server persists (track now exists + plays lossless) → **Post-Processing**
(Opus being derived). The form may complete/return the admin to the catalogue after stage 1 (the track
is live); the Post-Processing phase can continue to report against that track in the browse/release view
(the Opus waveform/profile columns on `Releases.razor` already poll-and-show per-track derived-artifact
status — Post-Processing status fits the same affordance family).
- **How status reaches the UI is staff-engineer's call** (poll the track's Opus-artifact presence, an SSE/
long-poll job channel, or a status field on the track read). The spec fixes that the phase is *visible*
and *non-blocking* — the admin is never made to wait on the transcode to consider the upload done.
- This composes with the **always-on** decision (OQ4): every upload triggers the background transcode;
there is no per-upload opt-out, so the Post-Processing phase always appears.
### 3.2 Where the Opus artifact is stored (two options)
@@ -206,30 +256,36 @@ Server-side fallback rule (C2): if `format=opus` is requested but no Opus artifa
track (not yet transcoded / backfilled), the endpoint **falls back to lossless** rather than 404ing —
Opus is additive, so its absence degrades to "you get the lossless one," never to "no audio."
### 3.4 The Opus decoder + seek math (the genuinely new decode work)
### 3.4 The Opus decoder (the genuinely new decode work)
`OpusFormatDecoder implements IFormatDecoder` is the new code on the delivery side. Two things make it
harder than the WAV decoder and need to be flagged:
`OpusFormatDecoder implements IFormatDecoder` is the new code on the delivery side. **Ogg Opus is a
containerized, paged format — not raw-frame-sliceable** the way WAV PCM is. WAV's `wrapSegment` prepends a
44-byte PCM header to any PCM-aligned byte run; the current model assumes you can wrap an arbitrary aligned
raw-audio slice and hand it to `decodeAudioData`. Ogg Opus is page-structured (Ogg pages carrying Opus
packets, plus mandatory `OpusHead`/`OpusTags` **setup pages** at the very start). A mid-stream byte slice
is **not** independently decodable: it needs (1) the setup header prepended, and (2) to begin on an Ogg
**page boundary**. So:
- **Containerized, paged format — not raw-frame-sliceable.** WAV's `wrapSegment` prepends a 44-byte
PCM header to any PCM-aligned byte run; the current model assumes you can wrap an arbitrary aligned
raw-audio slice and hand it to `decodeAudioData`. **Ogg Opus is page-structured** (Ogg pages
carrying Opus packets, plus mandatory `OpusHead`/`OpusTags` setup pages at the start). A mid-stream
byte slice is not independently decodable without the setup header and without landing on Ogg page
boundaries. So `OpusFormatDecoder`'s `getAlignedSegmentSize` must align to **Ogg page boundaries**
(scan for the `OggS` capture pattern — analogous to FLAC's frame-sync scan, for which the
`IFormatDecoder` interface already passes `rawData` to `getAlignedSegmentSize`), and
`wrapSegment`/the continuation path must carry the `OpusHead` setup (analogous to FLAC's
`streamInfoBytes` in `FlacSeekData`). **The `IFormatDecoder` abstraction already has the shape for
this** — a format-specific `seekData` accelerator and a setup-bytes carry — because FLAC needed the
same kind of thing. A new `OpusSeekData` variant joins `Mp3VbrSeekData | FlacSeekData`.
- **VBR byte↔time mapping is approximate (the Phase 21 C5 case, concretely).** Opus at "320 kbps" is
effectively VBR; there is no exact `byteRate` for offset math the way CBR WAV has. Seek-by-offset
uses an **approximate** mapping (granule-position/Ogg-page interpolation, the Opus analogue of MP3's
Xing TOC or FLAC's SEEKTABLE). `calculateByteOffset` returns a best-effort page-aligned offset; the
decoder then re-syncs to the next Ogg page. This is exactly the "VBR formats: the mapping is
approximate" case Phase 21's C5 invariant anticipated — **Opus is the format that makes that
invariant load-bearing rather than hypothetical.**
- `OpusFormatDecoder.getAlignedSegmentSize` aligns to **Ogg page boundaries** — scan for the `OggS`
capture pattern (analogous to FLAC's frame-sync scan; the `IFormatDecoder` interface already passes
`rawData` to `getAlignedSegmentSize` for exactly this reason).
- `wrapSegment` / the continuation path **prepends the `OpusHead`/`OpusTags` setup bytes** to a mid-stream
page run before handing it to `decodeAudioData` (analogous to FLAC's `streamInfoBytes` carry in
`FlacSeekData`). The setup bytes come from the **setup-header mechanism** (§3.4a), not from re-reading
the stream start.
- A new `OpusSeekData` variant joins `Mp3VbrSeekData | FlacSeekData` in the `seekData` accelerator slot —
but for Opus it carries the **accurate seek index** (§3.4a), not a heuristic TOC.
**The `IFormatDecoder` abstraction already has the shape for both needs** — a format-specific `seekData`
accelerator and a setup-bytes carry — because FLAC needed the same kind of thing. The genuinely new part
is **where the seek index and setup header come from**, which §3.4a designs.
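
The page-boundary alignment described above can be sketched as a scan for the Ogg capture pattern. This is an illustration only: `findPageStarts` and `alignedPageRun` are hypothetical names, and the real `getAlignedSegmentSize` signature is not reproduced here. The scan checks the four `OggS` bytes plus the version byte (always 0 in the Ogg page header); a production decoder would presumably also validate the page CRC, since the pattern can in principle appear inside packet data.

```typescript
// Offsets of every "OggS" capture pattern followed by stream-structure version 0.
function findPageStarts(raw: Uint8Array): number[] {
  const starts: number[] = [];
  for (let i = 0; i + 5 <= raw.length; i++) {
    if (
      raw[i] === 0x4f &&     // 'O'
      raw[i + 1] === 0x67 && // 'g'
      raw[i + 2] === 0x67 && // 'g'
      raw[i + 3] === 0x53 && // 'S'
      raw[i + 4] === 0x00    // version byte
    ) {
      starts.push(i);
    }
  }
  return starts;
}

// Largest run of whole pages in raw: from the first page start up to (not including)
// the last page start, because the final page may be cut off by the chunk boundary.
// Returns null when fewer than two page starts are visible.
function alignedPageRun(raw: Uint8Array): { start: number; end: number } | null {
  const starts = findPageStarts(raw);
  if (starts.length < 2) return null;
  return { start: starts[0], end: starts[starts.length - 1] };
}
```

Bytes from `end` onward would be carried over and prepended to the next network chunk, the same way a partial trailing unit is carried for other formats.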
> **Seek is NOT approximate for Opus (Daniel, 2026-06-23 — supersedes the earlier hand-wave).** An earlier
> draft of this section proposed "granule-position/Ogg-page interpolation" — a best-effort approximate
> offset, the Opus analogue of MP3's Xing TOC. **That is rejected.** Daniel: *"Killing seeking for
> decoding is unacceptable… Raw bytes offset for seeking is no longer adequate due to VBR. We need an
> accurate transfer function for seek time → true file byte offset."* Opus seeking is **accurate**, backed
> by a precomputed index built at transcode time. See §3.4a.
**Browser decode-support constraint (real, must be designed around).** The bespoke graph decodes
segments via `AudioContext.decodeAudioData`. Ogg-Opus support in `decodeAudioData` is long-standing in
@@ -237,21 +293,145 @@ Chrome and Firefox but arrived in **Safari only at 18.4 (macOS 15.4 / iOS 18.4,
Safari decodes Opus only in a CAF container, not Ogg. iOS Safari is a primary music-listening surface,
so this is not a corner case. Implications: (1) the **lossless WAV path is the universal fallback** for
listeners whose browser can't decode Ogg Opus — which C2's additive design already provides for free;
(2) format-default policy (OQ2) should consider capability detection — don't hand Ogg Opus to a Safari
that can't decode it. This intersects Phase 1.7 (Safari compatibility) and is flagged there too.
(2) the format default is **capability-gated** (OQ2, RESOLVED) — don't hand Ogg Opus to a Safari that
can't decode it; detect support (a probe `decodeAudioData` on a tiny Opus blob, or a UA/version gate) and
fall back to lossless. This intersects Phase 1.7 (Safari compatibility) and is flagged there too.
([Browser support: caniuse / WebKit 18.4 release notes — see Sources.])
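
The probe-and-fall-back behaviour can be sketched as below. This is a hedged illustration: `canDecodeOggOpus` and `chooseFormat` are hypothetical names, and the decode function is injected so the logic does not depend on a live `AudioContext`. The probe is copied before decoding because `decodeAudioData` detaches the buffer it is given.

```typescript
type StreamFormat = "opus" | "lossless";

// Any function that resolves on a successful decode and rejects otherwise.
type DecodeFn = (data: ArrayBuffer) => Promise<unknown>;

// Tries to decode a tiny known-good Ogg Opus blob; a rejection means "not supported".
async function canDecodeOggOpus(decode: DecodeFn, probe: ArrayBuffer): Promise<boolean> {
  try {
    await decode(probe.slice(0)); // pass a copy so the caller's probe stays usable
    return true;
  } catch {
    return false;
  }
}

// Opus is only chosen when the listener prefers it AND the browser can decode it.
function chooseFormat(preferred: StreamFormat, opusSupported: boolean): StreamFormat {
  return preferred === "opus" && opusSupported ? "opus" : "lossless";
}
```

In a browser this might be called as `canDecodeOggOpus((b) => ctx.decodeAudioData(b), tinyOpusBytes)`, where `ctx` is an `AudioContext` and `tinyOpusBytes` is a small embedded Ogg Opus file; both are assumed here. The result would be cached per session rather than re-probed on every track.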
### 3.4a VBR-safe accurate seeking (the seek-index artifact + the setup-header mechanism)
This is the architectural core of the Opus delivery path, and it must compose with **Phase 21 windowed
refill** (where most of the stream is *not* in memory). The requirement, decomposed from Daniel's
direction:
1. Seeking must be preserved for Opus **without** having the full PCM decoded in memory.
2. Raw byte-offset seek is inadequate — a VBR Opus stream has **no linear time↔byte relationship**, so
`byteRate` math and even rough page interpolation are not accurate enough.
3. We need an **accurate transfer function: seek-time → true file byte offset.**
4. The decode setup header must be **available separately** (or cached before seeking past it), because a
mid-stream slice is undecodable without `OpusHead`/`OpusTags`.
**The key insight: the one moment we already walk the entire encoded stream is the transcode.** That is
precisely when an accurate index can be built for free. We never have to guess at delivery time — we read
the answer out of a precomputed artifact.
#### A. The seek-index artifact (the accurate transfer function)
At transcode time, after the Opus bytes are produced, **walk the encoded Ogg stream once and record, for
each Ogg page (or coarser bucket), the page's `granulepos` (a 48 kHz sample count → time) paired with its
byte offset in the file.** That granule→byte table *is* the exact transfer function. This is the Opus
analogue of FLAC's `SEEKTABLE` / MP3's Xing TOC — but **precomputed and exact**, not derived by
interpolation guessing. Ogg granule positions are authoritative sample counts, so the mapping is true, not
estimated.
- **What it contains.** An ordered list of `(timeSeconds | granulepos, byteOffset)` entries, plus the
total duration and total byte length (for clamping a seek to range). A binary little-endian array of
fixed-width records is the natural shape (e.g. a `uint64 granulepos` + `uint64 byteOffset` per entry);
the exact encoding is staff-engineer's, but it should be a **compact binary blob**, fetched once and
parsed into a typed array client-side.
- **Granularity vs. size (the one real tuning knob).** One entry per Ogg page is the most precise but
largest; an Ogg page is typically a few KB of audio (~tens of ms to a few hundred ms), so a 1-hour mix
could be tens of thousands of pages. Recommend a **coarser bucket: one index entry per ~1–2 seconds of
audio** (snap each bucket boundary to the *nearest enclosing page start*, so every indexed offset is
still an exact page boundary). At ~1 s granularity a 1-hour mix is ~3,600 entries × 16 bytes ≈ **~58 KB**
— a trivial one-time fetch, and 1 s seek resolution is more than fine (the decoder re-syncs to the exact
page within the bucket anyway — see the client flow). **Per-page precision is the fallback if 1 s buckets
ever prove too coarse**, at a larger index. The number is staff-engineer's call; the *shape* (precomputed
exact granule→byte, bucketed, snapped to page starts) is fixed.
- **Sidecar, not embedded (recommended).** Store the index as a **third derived artifact** alongside the
Opus bytes and the waveform datum — the same "derived artifacts get their own vault" pattern this phase
already uses (S2 / `track-opus`; the `track-waveforms` precedent). Keep it a separate vault resource
(e.g. `{entryKey}.seekidx` in a `track-opus` vault, or its own `track-opus-index` vault) rather than
embedding it in the Ogg stream. *Why sidecar:* it is fetched **once, up front** (small, cacheable),
independent of the audio byte stream; embedding it in the Ogg would force the client to read into the
stream to find it, defeating the "resolve the offset *before* the Range fetch" flow. *Road not taken —
derive the index lazily on first seek by scanning server-side:* rejected, because it re-walks the stream
at request time (the cost we avoid by computing at transcode) and gives nothing the precomputed sidecar
doesn't.
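
The bucketing rule above (one entry per bucket, snapped to the page start that encloses the bucket boundary) can be sketched as follows. The sketch is written in TypeScript for consistency with the player-side examples, although the real index would be built by the transcode service; `PageStart` and `buildSeekIndex` are hypothetical names. It also assumes the stream walker has already turned each page's `granulepos` into a start sample: an Ogg page's granule position marks the end of the last packet completed on that page, so a page's start is the previous page's granule position.

```typescript
const OPUS_SAMPLE_RATE = 48000;

// One record per Ogg audio page, in stream order.
interface PageStart {
  startSample: number; // 48 kHz sample position where this page's audio begins
  byteOffset: number;  // byte offset of the page's "OggS" header in the file
}

// Keeps, for each bucket boundary, the page whose audio encloses that boundary.
// Every kept entry is still an exact page start.
function buildSeekIndex(pages: readonly PageStart[], bucketSeconds: number): PageStart[] {
  if (pages.length === 0 || bucketSeconds <= 0) return [];
  const bucketSamples = Math.max(1, Math.round(bucketSeconds * OPUS_SAMPLE_RATE));
  const index: PageStart[] = [];
  let boundary = pages[0].startSample;
  for (let i = 0; i < pages.length; i++) {
    const page = pages[i];
    const isLast = i === pages.length - 1;
    const nextStart = isLast ? Infinity : pages[i + 1].startSample;
    // The page encloses the boundary if it starts at or before it and the next page starts after it.
    if (page.startSample <= boundary && boundary < nextStart) {
      index.push({ startSample: page.startSample, byteOffset: page.byteOffset });
      if (isLast) break;
      // Skip any further boundaries that fall inside this same page.
      while (boundary < nextStart) boundary += bucketSamples;
    }
  }
  return index;
}
```

With 1 s buckets a one-hour stream yields roughly 3,600 entries, which at 16 bytes per entry matches the ~58 KB estimate above.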
#### B. The setup-header mechanism (decodability of any mid-stream slice)
Any post-seek slice needs `OpusHead` + `OpusTags` prepended to decode. Two ways to make those bytes
available to the client:
- **B-a — Client-side caching of the leading setup pages on first read (recommended).** On first play, the
stream already begins at byte 0, so the client *already receives* the `OpusHead`/`OpusTags` pages as the
opening bytes. `OpusFormatDecoder.tryParseHeader` captures and **retains** those setup bytes (exactly as
`WavFormatDecoder` retains the parsed WAV header for `reinitializeForRangeContinuation` today, and FLAC
retains `streamInfoBytes`). Every subsequent post-seek continuation prepends the cached setup bytes. *No
new endpoint;* it reuses the header-retention discipline already in the codebase.
- **B-b — A dedicated setup-header sidecar endpoint** (`GET api/track/{id}/opus/header` → just the
`OpusHead`/`OpusTags` bytes, also derivable at transcode time and stored as a tiny artifact). *Pro:* a
seek can be served even if the listener seeks **before** the stream start has been read (e.g. a deep-link
that begins mid-track, or a Phase 21 window that opens away from byte 0). *Con:* one more endpoint +
artifact.
**Recommendation: B-a as the primary, B-b as a cheap insurance artifact.** B-a covers the overwhelming
common case (play-then-seek) with **zero new surface** — it is the WAV-header-retention pattern applied to
Opus. But Phase 21 windowing and deep-links can legitimately open a window that never read byte 0, so the
setup header should **also** be derivable on demand. Cheapest reconciliation: **extract the setup bytes at
transcode time and store them as a tiny sidecar artifact** (they are a few hundred bytes), and expose them
**either** as a small endpoint **or** simply prepend them to the seek-index sidecar's header region so the
single up-front index fetch *also* delivers the setup bytes. The latter folds B-b into the B-a fetch: **the
client's one up-front sidecar fetch returns both the seek index and the setup header**, so it always has
both before it ever issues a seek — and never needs byte 0 to have been read. **Recommended concrete
design: one sidecar per track = `[setup-header bytes][seek-index table]`, fetched once on track load,
parsed into `OpusSeekData`.** This is the cleanest: one new artifact, one new fetch, both needs met.
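
A minimal sketch of the combined sidecar, assuming a length-prefixed binary layout. The plan fixes only the shape `[setup-header bytes][seek-index table]`; the two length fields, the 16-byte entry, and the names `encodeSidecar`, `parseSidecar`, and `SeekEntry` are hypothetical choices made here so the example round-trips.

```typescript
// Sketch only: this length-prefixed layout is an assumed encoding of
// "[setup-header bytes][seek-index table]", not a format the plan specifies.
interface SeekEntry {
  granule: number;    // 48 kHz granule position at which the indexed page starts
  byteOffset: number; // absolute byte offset of that page in the Opus artifact
}

interface OpusSeekData {
  setupHeader: Uint8Array; // the OpusHead + OpusTags pages, ready to prepend
  index: SeekEntry[];      // ascending by granule
}

const ENTRY_BYTES = 16; // two little-endian uint64 fields per entry

// Values are handled as JS numbers, so they are exact only below 2^53.
function writeUint64(view: DataView, pos: number, value: number): void {
  view.setUint32(pos, value % 2 ** 32, true);
  view.setUint32(pos + 4, Math.floor(value / 2 ** 32), true);
}

function readUint64(view: DataView, pos: number): number {
  return view.getUint32(pos + 4, true) * 2 ** 32 + view.getUint32(pos, true);
}

function encodeSidecar(data: OpusSeekData): Uint8Array {
  const total = 4 + data.setupHeader.length + 4 + data.index.length * ENTRY_BYTES;
  const out = new Uint8Array(total);
  const view = new DataView(out.buffer);
  view.setUint32(0, data.setupHeader.length, true); // setup-header length prefix
  out.set(data.setupHeader, 4);
  let pos = 4 + data.setupHeader.length;
  view.setUint32(pos, data.index.length, true);     // entry count
  pos += 4;
  for (const entry of data.index) {
    writeUint64(view, pos, entry.granule);
    writeUint64(view, pos + 8, entry.byteOffset);
    pos += ENTRY_BYTES;
  }
  return out;
}

function parseSidecar(bytes: Uint8Array): OpusSeekData {
  if (bytes.length < 4) throw new Error("sidecar too short");
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const setupLength = view.getUint32(0, true);
  const countPos = 4 + setupLength;
  if (countPos + 4 > bytes.length) throw new Error("sidecar truncated in setup header");
  const setupHeader = bytes.slice(4, countPos); // copy, so it outlives the fetch buffer
  const count = view.getUint32(countPos, true);
  const tablePos = countPos + 4;
  if (tablePos + count * ENTRY_BYTES > bytes.length) {
    throw new Error("sidecar truncated in seek index");
  }
  const index: SeekEntry[] = [];
  for (let i = 0; i < count; i++) {
    const pos = tablePos + i * ENTRY_BYTES;
    index.push({ granule: readUint64(view, pos), byteOffset: readUint64(view, pos + 8) });
  }
  return { setupHeader, index };
}
```

Because the setup bytes arrive in the same payload as the index, a decoder holding the parsed `OpusSeekData` never depends on byte 0 of the audio stream having been read.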
#### C. The client-side seek flow, end to end
With the sidecar (`OpusSeekData` = setup header + granule→byte index) fetched and parsed at track load:
1. **Resolve time → byte offset (accurate).** Listener seeks to `t` seconds. `OpusFormatDecoder.calculateByteOffset(t)`
does a binary search in the index for the largest entry with `time ≤ t`, returns its exact (page-start)
`byteOffset`. **No interpolation, no `byteRate` math.** (For WAV this method stays the exact CBR
calculation it is today — the seam is identical; only the Opus implementation reads an index.)
2. **Range fetch from the offset.** Issue `GET api/track/{id}?format=opus` with `Range: bytes={byteOffset}-`
— the **landed Phase 4 Range primitive, unchanged**. Server streams raw Opus bytes from that exact page
boundary (`206 Partial Content`).
3. **Prepend the cached setup header + decode.** The continuation path (the Opus analogue of
`StreamDecoder.reinitializeForRangeContinuation`) prepends the retained/sidecar `OpusHead`/`OpusTags`
bytes to the incoming page run, then feeds it to `decodeAudioData`. Because the index offset is an exact
page start, the stream is immediately Ogg-sync-aligned.
4. **Fine re-sync within the bucket.** The granule of the first decoded page tells the decoder the *exact*
time it landed at (at most one bucket before `t`); the scheduler trims/positions to land
playback at `t` precisely. With ~1 s buckets the trim is sub-second; with per-page granularity it is
near-zero. **Either way the listener lands at the correct time, not approximately** (AC9).
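
Steps 1 to 4 above can be sketched as three small pure functions. This is an illustration under stated assumptions: `findSeekEntry`, `planSeek`, `prependSetup`, and `SeekPlan` are hypothetical names, and the index entries are assumed to have been converted from granule positions to seconds already. The real resolver would sit behind `OpusFormatDecoder.calculateByteOffset`.

```typescript
// Sketch only: helper names and the SeekPlan shape are illustrative assumptions.
interface SeekEntry {
  timeSeconds: number; // start time of the indexed page
  byteOffset: number;  // exact page-start offset in the Opus artifact
}

interface SeekPlan {
  rangeHeader: string; // value for the Range request header
  trimSeconds: number; // audio to discard after decode so playback lands on t
}

// Step 1: binary search for the largest entry with timeSeconds <= t.
// A t before the first entry clamps to the first entry.
function findSeekEntry(index: SeekEntry[], t: number): SeekEntry {
  if (index.length === 0) throw new Error("empty seek index");
  let lo = 0;
  let hi = index.length - 1;
  let best = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].timeSeconds <= t) {
      best = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return index[best];
}

// Steps 2 and 4: the open-ended Range from the page start, plus the in-bucket trim.
function planSeek(index: SeekEntry[], t: number): SeekPlan {
  const entry = findSeekEntry(index, t);
  return {
    rangeHeader: `bytes=${entry.byteOffset}-`,
    trimSeconds: Math.max(0, t - entry.timeSeconds),
  };
}

// Step 3: prepend the retained OpusHead/OpusTags bytes to the fetched page run.
function prependSetup(setupHeader: Uint8Array, pages: Uint8Array): Uint8Array {
  const out = new Uint8Array(setupHeader.length + pages.length);
  out.set(setupHeader, 0);
  out.set(pages, setupHeader.length);
  return out;
}
```

With entries at 0 s, 1 s, and 2 s, a seek to 1.4 s resolves to the 1 s entry and a trim of roughly 0.4 s. No bitrate enters the calculation.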
#### D. Composition with Phase 21 windowed refill
Phase 21's windowed refill controller resolves "I need bytes for playback position `P`" → a byte offset →
a Range fetch. **It calls the *same* `OpusFormatDecoder.calculateByteOffset` (the index-based resolver)
for Opus** that an explicit seek does — windowed refill is just a seek the listener didn't initiate. So the
seek index serves both: explicit seeks and the window's low-water refills both resolve through the index,
and both prepend the cached setup header. This is why §3.4a is in **Phase 18** (where the transcode that
builds the index lives), and Phase 21 *consumes* it. The Phase 21 spec's "approximate mapping" language for
Opus is now wrong and is corrected to **"accurate index-based mapping."**
#### E. Reuse vs. extend (the seam discipline)
- **Reused verbatim:** the Phase 4 `Range: bytes=X-` → 206 primitive (client → proxy → API); the
`IFormatDecoder.calculateByteOffset` seam; the header-retention/continuation discipline
(`reinitializeForRangeContinuation`'s Opus analogue); the derived-artifact-in-its-own-vault pattern
(`track-waveforms` → `track-opus`); the derive-at-transcode-regenerate-on-backfill lifecycle.
- **Extended (new):** the seek-index + setup-header **sidecar artifact** (built at transcode, stored
beside the Opus bytes); the one-time **sidecar fetch** on track load (parsed into `OpusSeekData`); the
index **binary-search resolver** inside `OpusFormatDecoder`. Three additions, all leaf-level — no change
to the Range mechanism, the proxy, or the format-agnostic player.
### 3.5 The three candidate directions (shape-level)
Per file convention the alternatives are recorded; the recommendation follows.
**Direction A — Derived Opus artifact at ingest + format param on delivery (recommended).** What §3.1–
3.4 describe: transcode to Opus 320 post-store, store as a derived artifact (S2 vault), serve via a
`?format=` param resolved server-side to bytes + content-type, decode via a new `OpusFormatDecoder` in
the existing registry. *Why recommended:* additive (C2), reuses every existing seam (the processor
orchestration, the waveform-datum derived-artifact pattern, the `Range` path, the decoder registry),
and the only genuinely new code is one transcode step + one decoder. Two derived artifacts per track,
both regenerable.
3.4a describe: transcode to Opus 320 post-store as a **background job** (OQ6), store as derived artifacts
(S2 vault) — the Opus bytes **plus the seek-index/setup-header sidecar** (§3.4a) — serve via a `?format=`
param resolved server-side to bytes + content-type, decode via a new `OpusFormatDecoder` in the existing
registry, **seek accurately via the precomputed index**. *Why recommended:* additive (C2), reuses every
existing seam (the processor orchestration, the waveform-datum derived-artifact pattern, the `Range` path,
the decoder registry, the header-retention discipline), and the only genuinely new code is one transcode
step (+ index build) + one decoder (+ index resolver). **Three** derived artifacts per track (Opus bytes,
seek sidecar, and the existing waveform datum), all regenerable.
**Direction B — On-the-fly transcode at delivery (no stored Opus artifact).** Transcode WAV→Opus per
request in the stream endpoint, streaming the Opus out as it encodes. *Why not (default):* moves
@@ -274,10 +454,13 @@ recorded only because "just store Opus" is the tempting simplification and the s
extension); the artifact is a new derived vault resource (the `track-waveforms` precedent is exactly
this). Phase 18 adds **three new leaf implementations** and **zero changes to existing format code**
— the strongest possible OCP signal that the seams were designed right.
- **SRP, preserved.** Transcoding is a content-domain processor concern (`DeepDrftContent`); delivery
selection is a thin endpoint concern (`DeepDrftAPI` resolves a param to an artifact); decode is the
`OpusFormatDecoder`'s concern; byte↔time math stays inside that decoder via `calculateByteOffset`.
No responsibility crosses a boundary it doesn't already own.
- **SRP, preserved.** Transcoding **and the seek-index build** are content-domain processor concerns
(`DeepDrftContent`); delivery selection is a thin endpoint concern (`DeepDrftAPI` resolves a param to an
artifact, and serves the sidecar); decode is the `OpusFormatDecoder`'s concern; byte↔time math stays
inside that decoder via `calculateByteOffset` (now reading the index, not interpolating). No
responsibility crosses a boundary it doesn't already own. The seek index is built **once, where the
stream is already walked** (transcode) — the natural home for an exact transfer function, never
recomputed at request time.
- **DIP / "one source, multiple views."** One `TrackEntity`/`EntryKey` is the single source; "lossless
WAV" and "low-data Opus" are two *views* (renderings) of it, diverging only at the delivery/decode
layer — the same discipline the dark-mode and track-browse surfaces follow.
@@ -289,13 +472,77 @@ recorded only because "just store Opus" is the tempting simplification and the s
---
## 4. Format selection — the product surface (deliberately under-specified; see OQ1/OQ2)
## 4. Format selection — the product surface (RESOLVED — global, via a Settings menu)
Daniel has **not** specified the selection UX. What is settled by his direction: there are two formats,
Opus is the bandwidth-friendly **default-candidate**, lossless is the kept option. What is open: how a
listener expresses the choice, whether it is remembered, and whether the default is global or adapts.
These are genuine product calls — see §6. The *mechanism* (a `?format=` param the player sends; §3.3)
supports any of the policies, so the policy can be decided after the substrate lands.
**Resolved (Daniel, 2026-06-23):** the listener's quality choice is **global** (one session/visitor-level
"streaming quality" preference, not per-track), Opus is the **default** (capability-gated), and the choice
is **remembered** following the dark-mode persistence pattern. Crucially: *"Global is perfect, but we need
a menu system for settings, don't just slap the quality control directly in the app bar."* So the toggle
does **not** sit bare in the app bar — it lives inside a proper **public-site Settings menu** (§4a), of
which it is the **first occupant**.
- **What the listener sees.** A Settings affordance in the public app bar opens a Settings menu; inside it,
a "Streaming quality" control with two options — **Low-data (Opus)** / **Lossless (WAV)** — defaulting to
Low-data. Picking lossless flips the global preference; the player sends the matching `?format=` on
subsequent stream requests (§3.3). On a browser that can't decode Ogg Opus, the control is shown but the
effective stream is lossless (capability gate, §3.4 / OQ2) — surface this honestly rather than letting
the listener pick a format that silently can't play.
- **Default before any choice:** Opus, capability-gated (OQ2 RESOLVED). A first-time visitor on a capable
browser streams Opus; on an incapable browser, lossless.
- **Persistence:** mirror the dark-mode seam exactly (OQ3 RESOLVED) — see §4a.
### 4a. The Settings menu surface (NEW — scoping + the dark-mode persistence pattern)
Daniel asked for a **menu system for settings**, not a control bolted onto the app bar, and noted the
existing **dark-mode toggle** is a natural future tenant of the same menu (design for adaptability — build
the menu so dark mode *could* move into it later, but **do not force that migration now**).
**Scoping recommendation: a small sub-track *within* Phase 18 (wave 18.6), not its own phase.** Reasoning:
- The menu's only **required** occupant right now is the quality toggle, which Phase 18 owns end to end —
splitting the shell into a separate phase would create a phase whose sole deliverable is an empty menu
waiting for Phase 18's toggle. That is ceremony, not separation of concerns.
- The menu is **small** — an app-bar trigger + a MudBlazor menu/popover + the persistence seam (which the
quality toggle needs *anyway*). It is not a platform; it is a container with one tenant.
- It carries a real **design-for-adaptability** obligation (it must be able to host dark mode and future
settings later), but that is a *shape* requirement on a small surface, not a phase's worth of work.
So: **build the Settings-menu shell as part of Phase 18 (wave 18.6), with the quality toggle as its first
occupant, designed so dark mode and future preferences can plug in without restructuring.** Flag for
Daniel: *if he wants the menu shell proven/landed independently before the quality toggle plugs in*, 18.6
can be split into "menu shell" then "quality toggle plugs in" — but they are small enough to land together.
This is **not** recommended as its own top-level phase. (If Daniel disagrees and wants a dedicated
"Public Settings Menu" phase that Phase 18's toggle then targets, that is a clean alternative — it just
front-loads a surface with no second tenant yet. Recommendation stands: sub-track.)
**The menu shell — design-for-adaptability requirements (so it survives new tenants):**
- A **settings-item abstraction**, not a hard-coded list. The menu renders a small set of settings entries;
adding dark mode later is adding an entry, not rewiring the menu. Each entry is a label + a control bound
to a persisted preference.
- A **single public-site settings object** carrying all listener preferences (today: streaming quality;
tomorrow: dark mode, and whatever follows). This is the `DarkModeSettings` analogue, generalized — call
it e.g. `PublicSiteSettings` / `ListenerSettings`. Dark mode's existing `DarkModeSettings` can fold into
it *later* without disturbing the menu.
**Persistence — mirror the dark-mode seam exactly (OQ3 RESOLVED).** The quality preference follows the
*identical* path dark mode already uses (root `CLAUDE.md` "Theming and dark mode"):
1. **Cookie** — a `streamQuality` cookie (365-day, like `darkMode`), the durable truth.
2. **Server prerender read** — a service in `DeepDrftPublic` (sibling to `DarkModeService`) reads the
cookie during prerender and seeds the settings object, avoiding a wrong-default flash on first paint
(the streaming-quality analogue of the "wrong theme flash" fix).
3. **`PersistentComponentState` bridge** — the seeded preference carries from server prerender into the
WASM render (the same bridge `DarkModeSettings` and `NowPlayingStats`/`StatsClient` already use), so the
client boots already knowing the quality without a re-read flash or a re-fetch.
4. **Client cookie service** — a runtime client-side service (JS-interop cookie write, like the dark-mode
toggle) persists the choice when the listener changes it in the menu.
**Why mirror rather than invent:** the dark-mode seam is the codebase's established, working pattern for "a
listener preference seeded at prerender, carried to WASM, persisted in a cookie." Reusing its shape means
the quality preference inherits the no-flash guarantee for free, and the eventual dark-mode-into-the-menu
migration is a *consolidation of two identical seams*, not a reconciliation of two different ones. (This is
the "one source, multiple views" / design-for-adaptability discipline applied to listener settings.)
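
A hedged sketch of the client-side half of this seam: reading and writing the `streamQuality` cookie and applying the capability gate. The cookie name and the 365-day lifetime come from the text above; the stored values (`opus` / `lossless`), the `Path` and `SameSite` attributes, and every function name are assumptions for illustration. Capability is passed in as a boolean because how the player probes Ogg Opus decode support is outside this sketch.

```typescript
// Sketch only: cookie values, attributes, and function names are assumptions.
type StreamQuality = "opus" | "lossless";

const COOKIE_NAME = "streamQuality";
const MAX_AGE_SECONDS = 365 * 24 * 60 * 60; // 365 days, like the darkMode cookie

// Parse a Cookie header (or document.cookie) string. Anything missing or
// unrecognized resolves to the default, Opus.
function readStreamQuality(cookieHeader: string): StreamQuality {
  for (const part of cookieHeader.split(";")) {
    const [rawName, ...rest] = part.split("=");
    if (rawName.trim() === COOKIE_NAME) {
      const value = rest.join("=").trim();
      if (value === "opus" || value === "lossless") return value;
    }
  }
  return "opus";
}

// Build the string a client cookie service would assign to document.cookie.
function serializeStreamQuality(quality: StreamQuality): string {
  return `${COOKIE_NAME}=${quality}; Max-Age=${MAX_AGE_SECONDS}; Path=/; SameSite=Lax`;
}

// The capability gate: a stored Opus preference only takes effect on a
// browser that can actually decode Ogg Opus.
function effectiveFormat(preference: StreamQuality, canDecodeOggOpus: boolean): StreamQuality {
  return preference === "opus" && canDecodeOggOpus ? "opus" : "lossless";
}
```

Keeping the stored preference separate from the effective format means an incapable browser never overwrites the listener's choice: the cookie can still say `opus` while the stream requested is lossless.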
---
@@ -315,48 +562,65 @@ supports any of the policies, so the policy can be decided after the substrate l
- **UC5 — Replace-audio regenerates Opus.** The existing replace-audio path (which already regenerates
both waveform datums and re-derives duration) also regenerates the Opus artifact from the new
source.
- **UC6 — Seek within an Opus stream.** Backward/forward seek resolves via the existing `Range` path;
the offset is the `OpusFormatDecoder`'s approximate page-aligned mapping (§3.4), re-syncing to the
next Ogg page — the VBR analogue of the WAV exact-offset seek.
- **UC6 — Seek within an Opus stream (accurately).** Backward/forward seek resolves via the existing
`Range` path; the offset comes from the `OpusFormatDecoder`'s **precomputed seek index** (§3.4a) — an
exact granule→byte lookup, then fine re-sync to the requested time within the bucket. The listener lands
at the **correct** time, not approximately, and without the full PCM decoded in memory.
- **UC7 — Safari that can't decode Ogg Opus.** Capability-gated to the lossless path (§3.4), so the
listener still plays audio. (Ties to OQ2 + Phase 1.7.)
- **UC8 — Listener switches streaming quality in the Settings menu.** The listener opens the public
Settings menu, flips "Streaming quality" from Low-data to Lossless (or back); the choice persists
(cookie, dark-mode pattern) and applies to subsequent stream requests via `?format=`. On next visit the
preference is seeded at prerender (no flash, no re-pick). (§4 / §4a.)
- **UC9 — Deep-link / windowed start away from byte 0.** A listener opens a stream at a mid-track position
(deep link, or a Phase 21 window that opens past byte 0) without ever reading the stream start. The
decoder still has the `OpusHead`/`OpusTags` setup bytes because they arrived with the up-front sidecar
fetch (§3.4a B), so the mid-stream slice is decodable immediately. (Composition case for Phase 21.)
---
## 6. Open questions for Daniel (genuine product decisions, not implementation detail)
## 6. Open questions — RESOLVED (Daniel, 2026-06-23)
- **OQ1 — Selection UX: how does a listener choose lossless vs. low-data?** Candidates: a global
toggle in the player bar / settings ("Stream quality: Low-data / Lossless"); a per-track control; an
automatic default with a manual override. Recommend a **single global quality toggle** (player bar
or a settings affordance) — it is the Spotify/Bandcamp/SoundCloud idiom (one account/session-level
"streaming quality" setting), low-friction, and matches a small-sharp-tool posture better than
per-track choosers. `[Daniel decision]`
- **OQ2 — Default policy: what does a listener get before they choose?** Opus is the
*default-candidate* per Daniel — confirm Opus-by-default. Sub-questions: should the default be
**capability-aware** (don't serve Ogg Opus to a browser that can't decode it — §3.4 Safari < 18.4)?
Should it be **network-aware** (Opus on cellular, lossless on wifi)? Recommend **Opus by default,
capability-gated** (fall back to lossless when the browser can't decode Ogg Opus), and **defer
network-awareness** as gold-plating for v1. `[Daniel decision]`
- **OQ3 — Is the choice remembered, and at what scope?** Per-session (resets each visit) vs.
persisted (cookie/`localStorage`, like the `darkMode` cookie) vs. (future) per-account once identity
exists. Recommend **persisted via a cookie/`localStorage` setting**, mirroring the dark-mode
precedent — one truth, seeded at prerender, carried to WASM. `[Daniel decision]`
- **OQ4 — Per-upload Opus control in the CMS, or always-on?** Should the CMS upload form let an admin
opt a track *out* of Opus generation (e.g. a track meant to be lossless-only), or is Opus always
generated for every track? Recommend **always-on** (simpler; Opus is additive and cheap to serve;
the listener's format choice already covers "I want lossless"). A per-track opt-out is a later
refinement if a real need appears. `[Daniel decision]`
- **OQ5 — Opus container/extension specifics.** Ogg Opus (`.opus` / `audio/ogg`) is the assumption
(broadest `decodeAudioData` support; Daniel said "Ogg Opus"). Confirm — vs. CAF-wrapped Opus (older
Safari) or WebM-Opus. Recommend **Ogg Opus** as Daniel directed; CAF-fallback for old Safari is not
worth it given the lossless fallback already covers those browsers (§3.4). `[Daniel steer — confirms
§3.4, not a blocker]`
- **OQ6 — Transcode execution model (flag, leans implementation).** Synchronous-at-upload is a
non-starter for 1 GB mixes (§3.1); the realistic options are a background/queued transcode after the
source is stored. This is largely staff-engineer's call, but it has a **product-visible
consequence**: a freshly uploaded track may be lossless-only for a short window until its Opus
artifact finishes. Confirm that "Opus appears shortly after upload, lossless available immediately"
is acceptable (it is the waveform-datum model already in place). `[Daniel steer]`
All six original open questions are resolved. Kept visible per file convention, each with the decision and
the section that now carries it. One new open question (OQ7) is raised by the seek-model design; it is a
narrow tuning/scoping call, not a blocker.
- **OQ1 — Selection UX — RESOLVED: global, via a Settings *menu* (not a bare app-bar control).** Daniel:
*"Global is perfect, but we need a menu system for settings, don't just slap the quality control directly
in the app bar."* So: one global quality preference, surfaced inside a new **public-site Settings menu**
(§4 / §4a), of which the quality toggle is the first occupant. The menu is scoped as a **Phase 18
sub-track (wave 18.6)**, designed so dark mode (its natural future tenant) can plug in later. `[RESOLVED
— §4 / §4a]`
- **OQ2 — Default policy — RESOLVED: Opus by default, capability-gated.** Opus is the default; on a browser
that cannot decode Ogg Opus (Safari < 18.4, §3.4), fall back to lossless rather than serving an
undecodable stream. Network-awareness (Opus on cellular / lossless on wifi) remains **deferred** as
gold-plating. `[RESOLVED — §3.4, §4]`
- **OQ3 — Remembered choice — RESOLVED: persisted, following the dark-mode pattern.** A `streamQuality`
cookie seeded at server prerender → settings object → `PersistentComponentState` bridge into WASM →
client cookie service for runtime writes. The full dark-mode seam mirrored (§4a). `[RESOLVED — §4a]`
- **OQ4 — Per-upload Opus control — RESOLVED: always-on + backfill.** Opus is generated for **every**
track, always (no per-upload opt-out). **Plus** a bulk **Backfill-Opus** CMS action processes the
existing catalogue. (The listener's lossless choice already covers "I want lossless," so a per-track
opt-out earns nothing.) `[RESOLVED — §3.1, UC4, wave 18.5]`
- **OQ5 — Container — RESOLVED: Ogg Opus.** `.opus` / `audio/ogg` (broadest `decodeAudioData` support). No
CAF/WebM fallback — the lossless path already covers browsers that can't decode Ogg Opus (§3.4).
`[RESOLVED — §3.4]`
- **OQ6 — Transcode execution model — RESOLVED: background job after the file is available; uploader shows
a Post-Processing phase.** The source is stored and the track is playable losslessly **first**; the Opus
transcode (+ seek-index build) runs as a **background job** afterward; the CMS upload progress meter
gains a visible **Post-Processing** phase reflecting the transcode status (§3.1a). A freshly uploaded
track is lossless-only until its Opus finishes — accepted, and now made visible rather than implicit.
`[RESOLVED — §3.1a]`
**New open question raised by the seek-model design (§3.4a) — narrow, non-blocking:**
- **OQ7 — Seek-index granularity (tuning, leans implementation).** The seek index trades precision against
size: per-Ogg-page (most precise, largest) vs. coarser time buckets snapped to page starts. Recommend
**~1–2 s buckets** (~58 KB for a 1-hour mix at 1 s; the decoder fine-re-syncs within the bucket so seek
*accuracy* is unaffected — only the in-bucket trim distance changes). This is largely staff-engineer's
call at implementation; flagged because the *number* is a deliberate choice and Daniel may have a feel
for acceptable index size vs. in-bucket trim. Does **not** block — the shape (precomputed exact
granule→byte, page-snapped) is fixed regardless of the bucket size. `[Daniel steer — not a blocker]`
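
The size side of this trade-off is simple arithmetic, sketched below. The 16-byte entry (a 64-bit granule plus a 64-bit byte offset) is an assumption; it is one encoding that reproduces the ~58 KB figure for a 1-hour mix at 1 s buckets, and a more compact entry would shrink the numbers proportionally.

```typescript
// Sketch only: the 16-byte entry size is an assumption, not a decided format.
function estimateIndexBytes(
  durationSeconds: number,
  bucketSeconds: number,
  bytesPerEntry = 16,
): number {
  const entries = Math.ceil(durationSeconds / bucketSeconds);
  return entries * bytesPerEntry;
}

const oneHour = 3600;
console.log(estimateIndexBytes(oneHour, 1)); // 57600 bytes, about 58 KB
console.log(estimateIndexBytes(oneHour, 2)); // 28800 bytes, about 29 KB
```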
---
@@ -371,56 +635,95 @@ supports any of the policies, so the policy can be decided after the substrate l
- **AC3 — Additive, non-breaking (C2).** The existing lossless WAV path is byte-for-byte unchanged; a
track with no Opus artifact still plays losslessly; `?format=opus` on such a track falls back to
lossless (no 404, no silence).
- **AC4 — Transcode at ingest, regenerable (C6).** A new upload produces an Opus artifact best-effort
after the source is stored; a transcode failure does not block the upload or break playback; a
Backfill-Opus action (re)generates artifacts for existing tracks; replace-audio regenerates the
Opus artifact from the new source.
- **AC4 — Transcode at ingest as a background job, regenerable (C6, OQ6).** A new upload stores the source
and is playable losslessly **immediately**; the Opus artifact (+ seek-index/setup-header sidecar) is
produced by a **background job** afterward; a transcode failure does not block the upload or break
playback; a Backfill-Opus action (re)generates artifacts for existing tracks; replace-audio regenerates
the Opus artifact and its sidecar from the new source.
- **AC4a — Post-Processing phase is visible on the upload meter (OQ6, §3.1a).** After the byte-transfer and
server-persist phases, the CMS upload progress UI shows a **Post-Processing** phase reflecting the
background transcode (queued → transcoding → done/failed). The admin is never blocked waiting on the
transcode; the track is live before Post-Processing finishes.
- **AC5 — Opus seek via the existing `Range` path (C3).** Forward and backward seek in an Opus stream
resolve through the landed `Range: bytes=X-` primitive, with the offset coming from
`OpusFormatDecoder.calculateByteOffset`; no new seek mechanism is introduced.
`OpusFormatDecoder.calculateByteOffset`; no new seek *transport* mechanism is introduced.
- **AC5a — Seek-index + setup-header sidecar exists and is fetched once (§3.4a).** Every track with an Opus
artifact has a sidecar carrying the setup header (`OpusHead`/`OpusTags`) and the granule→byte seek index;
the client fetches and parses it once on track load (into `OpusSeekData`) before issuing any seek.
- **AC9 (the seek-accuracy criterion) — an Opus seek lands at the *correct* time, not approximately.**
Seeking to time `t` in an Opus stream resolves via the precomputed index and lands playback at `t`
(within the fine-resync tolerance — sub-second at the recommended bucket granularity), **measurably
accurate**, not a `byteRate`/interpolation estimate. Verifiable: seek to a known marker (e.g. a downbeat
at a known timestamp) and confirm playback resumes there, not seconds off. This holds **without** the
full PCM decoded in memory (composes with Phase 21).
- **AC6 — No format branches leak (C4).** The only Opus-specific code is `OpusFormatDecoder`, its
`OpusSeekData`, the one `createFormatDecoder` selection arm, and the transcode processor + delivery
param resolution. The format-agnostic player/scheduler code is unchanged.
`OpusSeekData` (carrying the index), the one `createFormatDecoder` selection arm, the transcode processor
(+ index build), the sidecar artifact + its serving, and the delivery param resolution. The
format-agnostic player/scheduler code is unchanged.
- **AC7 — Capability-safe default (OQ2).** A browser that cannot decode Ogg Opus is served (or falls
back to) the lossless path and plays audio; no listener gets silence because of codec support.
- **AC8 — Windowing-ready (the Phase 21 handshake).** The `OpusFormatDecoder`'s approximate byte↔time
mapping is the one Phase 21's windowed refill will call; Opus playback must be windowable by the
same machinery (verified jointly when Phase 21 lands on top — see §8 / Phase 21 cross-ref).
- **AC8 — Windowing-ready (the Phase 21 handshake).** The `OpusFormatDecoder`'s **index-based** byte↔time
resolver is the one Phase 21's windowed refill calls; Opus playback must be windowable by the same
machinery, and a windowed refill that opens away from byte 0 still decodes (setup header from the
sidecar — UC9). Verified jointly when Phase 21 lands on top (see §8 / Phase 21 cross-ref).
- **AC10 — The Settings menu hosts the quality toggle and persists the choice (§4 / §4a).** The public app
bar opens a Settings menu containing a "Streaming quality" control (Low-data / Lossless, defaulting to
Low-data, capability-gated); changing it persists via the `streamQuality` cookie and is seeded at
prerender on the next visit (no flash). The menu shell is built so a future dark-mode entry can plug in
without restructuring.
---
## 8. Wave decomposition
Dependency shape: `18.1 → 18.2 → {18.3, 18.4}`, with `18.5` validating end-to-end. **18.1 (the
transcode/derived-artifact ingest) is the cold-start prerequisite** — until an Opus artifact exists,
nothing downstream has bytes to serve or decode. 18.3 (delivery param) and 18.4 (the decoder) are
largely parallel once 18.2 (storage/lookup) settles, but both need an artifact to test against.
Dependency shape: `18.1 → 18.2 → {18.3, 18.4}`, with `18.5` (backfill + e2e) and `18.6` (settings menu)
on top. **18.1 (the transcode + seek-index/setup-header derived artifacts) is the cold-start
prerequisite** — until those artifacts exist, nothing downstream has bytes to serve, decode, or seek
against. 18.3 (delivery param) and 18.4 (the decoder + index resolver) are largely parallel once 18.2
(storage/lookup) settles, but both need artifacts to test against. **18.6 (the Settings menu) is the only
wave with no audio-pipeline dependency** — it can proceed in parallel with the whole stack; it merely needs
the `?format=` mechanism (18.3) wired before the toggle has anything to drive.
- **18.1 — Ingest transcode: derive + store the Opus artifact (cold-start; load-bearing).** New
- **18.1 — Ingest transcode + seek-index + setup-header (cold-start; load-bearing).** New
`OpusTranscodeService`/processor in `DeepDrftContent`, invoked post-store from
`UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`; produces Ogg Opus fullband
320; stores it as a derived artifact (S2 vault recommended). Failure-tolerant (C6) and off the hot
path (background/queued — OQ6). **Independent of the delivery/decoder waves; can begin immediately.**
- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention and the server-side
resolution "given `EntryKey` + format, return the right `AudioBinary` + content-type," including the
C2 fallback (no Opus → lossless). **Depends on 18.1** (an artifact must exist to resolve to).
- **18.3 — Delivery: format param + proxy threading.** `?format=opus|lossless` on the
`UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`, **as a background job** (OQ6,
§3.1a); produces Ogg Opus fullband 320; **walks the encoded stream once to build the granule→byte seek
index and extract the `OpusHead`/`OpusTags` setup header** (§3.4a A/B); stores the Opus bytes **and** the
combined seek/setup **sidecar** as derived artifacts (S2 vault recommended). Failure-tolerant (C6).
**Independent of the delivery/decoder waves; can begin immediately.**
- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention (Opus bytes + sidecar)
and the server-side resolution "given `EntryKey` + format, return the right `AudioBinary` + content-type
(+ the sidecar on its own endpoint/path)," including the C2 fallback (no Opus → lossless). **Depends on
18.1** (artifacts must exist to resolve to).
- **18.3 — Delivery: format param + sidecar serving + proxy threading.** `?format=opus|lossless` on the
`DeepDrftAPI` track stream endpoint (resolves via 18.2), forwarded through the `DeepDrftPublic`
`TrackProxyController` (mirror the existing `offset` param threading), and the `Range` handler
serving the chosen artifact's bytes. The player sends the param via `TrackMediaClient`. **Depends on
18.2.** Parallel-ok with 18.4.
- **18.4 — `OpusFormatDecoder` in the player stack.** New `IFormatDecoder` implementation
(Ogg-page-aligned `getAlignedSegmentSize` via `OggS` scan, `OpusHead` setup carry in
`wrapSegment`/continuation, approximate page-interpolation `calculateByteOffset` with an
`OpusSeekData` accelerator); one new arm in `AudioPlayer.createFormatDecoder` on
`audio/ogg`/`audio/opus`. Capability detection for the lossless fallback (§3.4, OQ2). **Depends on
18.2** (needs Opus bytes to decode). Parallel-ok with 18.3; they meet at 18.5.
- **18.5 — Backfill + selection UX + end-to-end validation.** The Backfill-Opus CMS bulk action (third
sibling to Generate-Profiles / Backfill-High-res) and replace-audio Opus regeneration; the listener
selection control per OQ1/OQ3 (global persisted quality toggle, recommended); and the AC1–AC8
acceptance pass — including AC8's confirmation that Opus is windowable so Phase 21 can build on it.
**Depends on 18.1–18.4.** (Selection UX can be split out if Daniel wants the substrate proven before
the control lands — flag at planning time.)
`TrackProxyController` (mirror the existing `offset` param threading), and the `Range` handler serving
the chosen artifact's bytes; **plus serving the seek/setup sidecar** (a `GET …/opus/seekdata`-style path,
proxied the same way). The player sends the format param via `TrackMediaClient`. **Depends on 18.2.**
Parallel-ok with 18.4.
- **18.4 — `OpusFormatDecoder` + the index-based seek resolver in the player stack.** New `IFormatDecoder`
implementation: Ogg-page-aligned `getAlignedSegmentSize` via `OggS` scan; `OpusHead`/`OpusTags` setup
carry in `wrapSegment`/the continuation path (sourced from the cached sidecar, §3.4a B); **`calculateByteOffset`
that binary-searches the precomputed seek index** (NOT interpolation), with an `OpusSeekData` accelerator
holding the parsed index + setup bytes; the **one-time sidecar fetch + parse** on track load. One new arm
in `AudioPlayer.createFormatDecoder` on `audio/ogg`/`audio/opus`. Capability detection for the lossless
fallback (§3.4, OQ2). **Depends on 18.2** (needs Opus bytes + sidecar). Parallel-ok with 18.3; they meet
at 18.5.
- **18.5 — Backfill + replace-audio + end-to-end validation (incl. seek accuracy).** The Backfill-Opus CMS
bulk action (third sibling to Generate-Profiles / Backfill-High-res), which (re)builds Opus bytes + the
sidecar for existing tracks; replace-audio Opus + sidecar regeneration; and the AC1–AC10 acceptance pass
**including AC9 (an Opus seek lands at the correct time, not approximately)** and AC8's confirmation
that Opus is windowable (index resolver + sidecar setup header) so Phase 21 can build on it. **Depends on
18.1–18.4.**
- **18.6 — Public Settings menu + the quality toggle (the listener selection UX).** The new public-site
Settings-menu shell (§4a): an app-bar trigger + MudBlazor menu hosting a settings-item abstraction, the
`PublicSiteSettings`/`ListenerSettings` object, and the dark-mode-pattern persistence seam (`streamQuality`
cookie + a `DeepDrftPublic` prerender-read service + `PersistentComponentState` bridge + client cookie
service). The **quality toggle is its first occupant** (Low-data/Lossless, Opus default, capability-gated),
driving the `?format=` the player sends (needs 18.3). Built design-for-adaptability so dark mode can plug
in later without restructuring (not migrated now). **Depends on 18.3** (the toggle needs the format
mechanism); the menu *shell* can be built ahead of that. *Splittable* into "menu shell" + "toggle plugs
in" if Daniel wants the shell proven first — but small enough to land together (§4a).
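
The Ogg-page-aligned `getAlignedSegmentSize` described in 18.4 could be sketched roughly as below. This is a minimal illustration, not the planned implementation: the helper name `findLastPageStart`, the free-function signature, and the choice to hold back the trailing page as possibly incomplete are all assumptions. It also assumes the buffer starts on a page boundary and does not validate the page header or CRC, so a chance `OggS` byte run inside a payload would be misread as a boundary.

```typescript
// Ogg capture pattern "OggS" as raw bytes.
const OGGS: readonly number[] = [0x4f, 0x67, 0x67, 0x53];

// Hypothetical helper: offset of the last "OggS" capture pattern, or -1 if none.
function findLastPageStart(buf: Uint8Array): number {
  for (let i = buf.length - OGGS.length; i >= 0; i--) {
    if (
      buf[i] === OGGS[0] &&
      buf[i + 1] === OGGS[1] &&
      buf[i + 2] === OGGS[2] &&
      buf[i + 3] === OGGS[3]
    ) {
      return i;
    }
  }
  return -1;
}

// Sketch of an aligned segment size: everything before the last page start.
// The final page may still be arriving, so it is held back (assumption).
// Returns 0 when no complete page can be cut yet.
function getAlignedSegmentSize(buf: Uint8Array): number {
  const last = findLastPageStart(buf);
  return last > 0 ? last : 0;
}
```

Cutting at the start of the last page keeps every emitted segment ending on a page boundary, which is the same role the frame-sync scan plays for FLAC in the existing decoder.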
---
@@ -432,7 +735,10 @@ largely parallel once 18.2 (storage/lookup) settles, but both need an artifact t
phase.** Phase 21's C5 invariant ("WAV-only shipping target; must not foreclose MP3/FLAC") is now
driven by Opus's VBR/paged seek math; Phase 21 OQ5 (adopt MSE) is resolved **NO** — the bespoke
graph stays (the same C1 decision recorded here). Windowing a VBR/Opus stream uses
`OpusFormatDecoder.calculateByteOffset`'s approximate mapping — exactly the C5 case.
`OpusFormatDecoder.calculateByteOffset`'s **accurate index-based mapping** (§3.4a — *not* the earlier
"approximate page-interpolation"; that language in the Phase 21 doc is corrected). Phase 21's windowed
refill calls the **same** index resolver an explicit seek does (§3.4a D), and a window that opens away
from byte 0 still decodes via the sidecar setup header (UC9).
- `PLAN.md` Phase 4 (landed) / `COMPLETED.md` — the HTTP `Range: bytes=X-` primitive Opus seek reuses.
- `PLAN.md` Phase 1.5 (gapless) / 1.6 (track-skip on error) / 1.7 (Safari) — 1.5's "encoder
padding/priming" caveat applies to Opus (it has pre-skip samples in `OpusHead`); 1.6's
@@ -452,7 +758,15 @@ largely parallel once 18.2 (storage/lookup) settles, but both need an artifact t
- `DeepDrftPublic/Interop/audio/AudioPlayer.ts` (`createFormatDecoder`, lines 117–125) — the decoder
registry gaining the Opus arm.
- `DeepDrftPublic.Client/Clients/TrackMediaClient.cs` + `DeepDrftPublic/Controllers/TrackProxyController.cs`
— the media fetch + proxy that thread the new `?format=` param (mirroring `offset`).
— the media fetch + proxy that thread the new `?format=` param (mirroring `offset`), and proxy the new
seek/setup sidecar fetch.
- Root `CLAUDE.md` "Theming and dark mode" + `DarkModeService` (in `DeepDrftPublic`) + `DarkModeSettings`
(`DeepDrftPublic.Client.Common`) — the cookie → prerender-read → `PersistentComponentState` → client
cookie-service seam the **streaming-quality preference** (§4a) mirrors exactly; the eventual dark-mode-
into-the-Settings-menu migration consolidates two copies of this seam.
- `DeepDrftPublic.Client` `NowPlayingStats.razor` / `StatsClient` — the `PersistentComponentState`
prerender-bridge precedent (prerender fetch carried into WASM without a re-fetch/flash), the pattern the
quality preference's bridge follows; see the `tracksview-persistent-state-seam` auto-memory.
## Sources
@@ -13,10 +13,15 @@ endpoint.
> (`product-notes/phase-18-opus-low-data-streaming.md`) — is a prerequisite that sequences ahead of
> windowing. Phase 21's windowing must work across **both** delivery formats (lossless WAV and Opus).
> Its C5 invariant below already anticipated this ("must not foreclose MP3/FLAC"); **Opus is now the
> concrete VBR/containerized driver of C5.** Windowing an Opus stream uses the decoder's *approximate*
> byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, Ogg-page interpolation), exactly the C5
> case — not the exact CBR-WAV `byteRate` math. Build the window machinery format-agnostically
> (§2 C3/C5) so it inherits Opus for free.
> concrete VBR/containerized driver of C5.** Windowing an Opus stream uses the decoder's **accurate
> index-based** byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, a binary search in the Phase 18
> precomputed seek index), exactly the C5 case — *not* the exact CBR-WAV `byteRate` math, and *not*
> approximate Ogg-page interpolation. **Correction (Daniel, 2026-06-23):** an earlier draft described the
> Opus mapping as "approximate page interpolation"; the Phase 18 seek-model resolution rejected that — Opus
> seeking is **accurate**, backed by a precomputed seek index built at transcode time, so refill resolves to
> the *exact* page offset. The windowed refill controller calls the **same** index resolver an explicit seek
> does (Phase 18 §3.4a D); a window opening away from byte 0 still decodes via the Phase 18 sidecar setup
> header. Build the window machinery format-agnostically (§2 C3/C5) so it inherits Opus for free.
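
A minimal sketch of the index-based resolver the note above describes, assuming a hypothetical index layout (`startSample` as the first 48 kHz sample position on a page, `byteOffset` as that page's start) and hypothetical function names. The real sidecar format and the `OpusSeekData` shape are defined in Phase 18 and may differ. The point it illustrates is that an explicit seek and a windowed refill go through one resolver and one `Range: bytes=X-` request shape.

```typescript
// Hypothetical seek-index row; the actual sidecar layout is a Phase 18 decision.
interface SeekIndexEntry {
  startSample: number; // first 48 kHz sample position carried by the page (assumed)
  byteOffset: number;  // byte offset of the page's "OggS" capture pattern
}

// Opus granule positions count samples at 48 kHz regardless of input rate.
const OPUS_SAMPLE_RATE = 48000;

// Binary search: the last indexed page that starts at or before the target sample.
// The index is assumed sorted ascending by startSample.
function resolveByteOffset(
  index: SeekIndexEntry[],
  seconds: number,
  preSkip = 0,
): number {
  if (index.length === 0) return 0;
  const target = Math.max(0, Math.floor(seconds * OPUS_SAMPLE_RATE)) + preSkip;
  let lo = 0;
  let hi = index.length - 1;
  let best = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid].startSample <= target) {
      best = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return index[best].byteOffset;
}

// Seek and windowed refill both build their Range header from the same resolver.
function rangeHeaderFor(
  index: SeekIndexEntry[],
  seconds: number,
  preSkip = 0,
): string {
  return `bytes=${resolveByteOffset(index, seconds, preSkip)}-`;
}
```

Because the window layer only ever calls the resolver, it needs no knowledge of whether the offset came from WAV `byteRate` arithmetic or from this index lookup.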
---
@@ -66,14 +71,20 @@ docs. This phase **modifies that seam** — so the contract it must preserve is
user-visible control, no change to seek/transport semantics beyond what the listener already
experiences. Seek must still feel identical.
- **C5 — Must window both delivery formats (WAV lossless AND Opus low-data).** Byte↔time mapping for
refill is exact and cheap for WAV (CBR: `byteRate` from the header). For VBR/containerized formats it
is approximate (the decoders carry TOC/SEEKTABLE/Ogg-page seek math). **Phase 18 (Opus) is sequenced
before this phase and is the concrete driver here:** an Ogg Opus 320 stream is VBR and page-paged, so
its `calculateByteOffset` is an *approximate* page-interpolation, not exact-offset. The window
machinery must express refill purely in terms of the decoder's existing `calculateByteOffset`, so the
same code windows WAV exactly and Opus approximately — **no WAV-special-cased offset math in the
window layer.** (MP3/FLAC decoders are already wired in the registry too — the registry dispatches on
content-type today; an `OpusFormatDecoder` joins them in Phase 18.)
refill is exact and cheap for WAV (CBR: `byteRate` from the header). **Phase 18 (Opus) is sequenced
before this phase and is the concrete VBR driver here** — and its mapping is **also exact**, but by a
different mechanism: an Ogg Opus 320 stream has no linear time↔byte relationship, so
`OpusFormatDecoder.calculateByteOffset` resolves via a **precomputed seek index** (granule→byte, built at
transcode; Phase 18 §3.4a), a binary search that returns the exact page offset — **not** an approximate
page interpolation. (An earlier draft of this invariant said "approximate"; the Phase 18 seek-model
resolution, Daniel 2026-06-23, made Opus seeking accurate. Corrected here.) The window machinery must
express refill purely in terms of the decoder's existing `calculateByteOffset`, so the same code windows
WAV (via `byteRate`) and Opus (via the index) — **no WAV-special-cased offset math in the window layer**,
and no approximation for either. A window that opens away from byte 0 must also prepend the decoder's
retained/sidecar setup header (Phase 18 §3.4a B) — the format-agnostic refill path already routes
continuations through the decoder's header-carry, so this comes for free. (MP3/FLAC decoders are already
wired in the registry too — the registry dispatches on content-type today; an `OpusFormatDecoder` joins
them in Phase 18.)
- **C6 — No regression to the single-instance JS decoder concurrency guarantees.** The current code is
careful that only one streaming loop touches the single JS `StreamDecoder` at a time
(`DrainActiveStreamingTaskAsync`, the `_streamingCancellation` identity dance). Windowed refill