# Phase 18 — Opus Low-Data Streaming (dual-format lossless + Opus delivery)

Product spec. Status: **design / framing — open questions RESOLVED (Daniel, 2026-06-23); implementation-ready.**
Author: product-designer. Date: 2026-06-23. **No code has been written by this doc.**

> **Resolution pass (Daniel, 2026-06-23).** OQ1–OQ7 are resolved (see §6 — each marked RESOLVED, kept
> visible per file convention; OQ7 — seek-index granularity — set to **0.5 s buckets**). Two resolutions
> reshaped the spec materially: (a) the listener quality
> selection lives inside a **new public-site Settings menu surface** (not a bare app-bar control) — §4 +
> §4a; and (b) Daniel rejected the "approximate page-interpolation" seek hand-wave outright — **VBR-safe
> *accurate* seeking is now a first-class part of the architecture** (a precomputed seek-index artifact +
> a separately-available setup header). §3.4 is rewritten and a dedicated seek-model section (§3.4a)
> added. The Phase 21 cross-reference is updated to read "accurate index-based mapping," not
> "approximate."

This phase is the concrete realization of the long-deferred **"Non-WAV formats"** intent
(`CONTEXT.md §5`, the "1.2" the streaming-feature items reference). It supersedes the abstract "a
processor per format + a decoder strategy" framing with a specific, Daniel-directed product: **two
delivery formats per track — the existing lossless WAV path and a new low-data Ogg Opus path — so the
listener gets a choice, with Opus the bandwidth-friendly default-candidate.**

Surfaces (named precisely):

- **Ingest / preprocessing:** `DeepDrftContent` (`AudioProcessor` / `AudioProcessorRouter` /
  `TrackContentService` / `WaveformProfileService`) + `DeepDrftAPI` (upload/persist —
  `UnifiedTrackService.UploadAsync`, replace-audio) + `DeepDrftManager` (CMS upload form — the
  **Post-Processing phase** on the existing upload progress meter, §3.1a).
- **Delivery / decode:** `DeepDrftAPI` (the track stream endpoint + `Range` handler + the new
  **seek-index** and **setup-header** sidecar endpoints, §3.4a) + `DeepDrftPublic` proxy
  (`TrackProxyController`) + `DeepDrftPublic.Client` player stack (`StreamingAudioPlayerService`,
  `TrackMediaClient`) + `DeepDrftPublic/Interop/audio` TS decoders (`AudioPlayer.createFormatDecoder`
  registry, a new `OpusFormatDecoder`).
- **Listener settings (NEW surface):** `DeepDrftPublic.Client` — a public-site **Settings menu** (app-bar
  menu/popover) hosting the quality toggle as its first occupant, with a dark-mode-pattern persistence
  seam (cookie → settings object → `PersistentComponentState` → client cookie service). §4a. The
  prerender-cookie read lives in `DeepDrftPublic` (alongside `DarkModeService`).

**Sequencing headline: Phase 18 comes BEFORE Phase 21 (Windowed Streaming Buffer).** Phase 21's
windowing must work across both formats — its C5 invariant already anticipated this ("must not
foreclose MP3/FLAC"); Opus is now the concrete VBR/containerized driver of that invariant. See §6 and
the Phase 21 cross-reference.

---

## 0. State of the world (what already exists — verified 2026-06-23)

This phase is **much further along than the "Non-WAV formats" backlog line implies**, on both sides.
Two prior efforts already built most of the multi-format substrate; what is *missing* is specifically
the **derived-Opus-artifact** idea, not generic format support.

**Producer side is already multi-format (router landed):**

- `AudioProcessorRouter.ProcessAudioFileAsync(filePath)` routes by extension — `.wav` →
  `AudioProcessor`, `.mp3` → `Mp3AudioProcessor`, `.flac` → `FlacAudioProcessor`
  (`DeepDrftContent/CLAUDE.md`).
- `TrackContentService.AddTrackAsync(filePath, mimeType)` is **format-agnostic**: it selects the
  processor, generates an entry GUID, and **stores the original bytes** with correct extension/MIME
  in the `tracks` vault.
- So today the system can *ingest and store* WAV/MP3/FLAC. It **does not transcode** — it keeps the
  original. There is no derived artifact and no second format per track.

**Decoder side is a wired strategy registry (not "implemented-not-wired" anymore):**

- `AudioPlayer.createFormatDecoder(contentType)` (`AudioPlayer.ts:117`) dispatches on `Content-Type`:
  `audio/mpeg|audio/mp3` → `Mp3FormatDecoder`, `audio/flac|audio/x-flac` → `FlacFormatDecoder`,
  default → `WavFormatDecoder`. All three decoders exist and implement `IFormatDecoder`.
- `IFormatDecoder` (`IFormatDecoder.ts`) is a clean per-format strategy: `tryParseHeader`,
  `getAlignedSegmentSize`, `wrapSegment`, `calculateByteOffset`, plus a `FormatInfo` carrying
  `byteRate`, `blockAlign`, `audioDataOffset`, and a `seekData` accelerator slot (already polymorphic:
  `Mp3VbrSeekData | FlacSeekData`). **This is the seam an `OpusFormatDecoder` slots into.**
- **Correction to the Phase 21 spec's §2 C3 note** ("MP3/FLAC implemented, not yet wired"): the
  registry *is* wired and dispatches on content-type today. Phase 21's invariant still holds; the
  parenthetical is stale and is corrected by this phase's reconciliation.

**What this means for the gap.** Daniel's direction is **not** "add format support" — that substrate
exists. It is "**derive a second, low-data artifact (Opus fullband 320) at ingest and let the listener
pick which to stream.**" That is two genuinely new things: (1) a **transcode-at-ingest** step that
produces a derived artifact per track (the router stores originals; nothing derives Opus), and (2) a
**per-format delivery selection** so the same track can be served as either WAV or Opus on request.

---

## 1. Goal

**Dual-format delivery.** Every track is streamable in two formats:

- **Lossless** — the existing WAV path, unchanged. The archival / audiophile option.
- **Low-data** — a derived **Ogg Opus, fullband, 320 kbps** artifact. The bandwidth-friendly
  default-candidate.

The listener chooses; Opus is the recommended default. The bespoke Web Audio decode→schedule graph is
**retained by deliberate choice** (Daniel) — Opus is fed through the same `IFormatDecoder` strategy
seam, not through an HTML `<media>` element or MSE.

**Why Opus fullband 320.** Opus is the modern, royalty-free, best-in-class lossy codec; "fullband"
(48 kHz, full 20 kHz audio bandwidth) at 320 kbps is transparent-to-most-listeners quality at roughly
**1/4 to 1/5 the bytes of 16-bit/44.1 stereo WAV** (~1411 kbps). For a 1 GB DJ mix (Phase 9 `Mix`
medium), that is the difference between a ~1 GB transfer and a ~225 MB transfer — the headline
low-data win, and directly relevant to the Phase 21 long-stream case.

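The size ratio quoted above can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch: the function name is illustrative, the bitrates are nominal, and Ogg container overhead is ignored.

```typescript
// Nominal bitrates; real Opus output is VBR and the Ogg container adds a little overhead.
const WAV_KBPS = (44_100 * 16 * 2) / 1000; // 16-bit / 44.1 kHz stereo PCM = 1411.2 kbps
const OPUS_KBPS = 320;

/** Estimated Opus transfer size (bytes) for a WAV payload of `wavBytes` bytes. */
function estimateOpusBytes(wavBytes: number): number {
  return Math.round(wavBytes * (OPUS_KBPS / WAV_KBPS));
}

// Fraction of the WAV byte count that the Opus rendering needs (between 1/5 and 1/4).
const opusToWavRatio = OPUS_KBPS / WAV_KBPS;

// A 1 GB (10^9 byte) mix comes out at roughly 227 MB under these assumptions.
const oneGbMixAsOpus = estimateOpusBytes(1_000_000_000);
```

The estimate scales linearly with duration, so the saving is largest exactly where it matters most: long `Mix` uploads.
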
**Non-goals.** This phase does not retire WAV (it stays as the lossless option), does not swap the
bespoke graph for MSE (explicitly rejected — see §2 / Phase 21 OQ5), and does not add new transport
mechanisms beyond the existing stream + `Range` primitive.

---

## 2. Constraints / invariants (the contract that must hold)

- **C1 — Keep the bespoke Web Audio graph. MSE is rejected (Daniel, deliberate).** The custom
  decode→schedule graph is a long-term commitment, not a stopgap. Opus is fed through the existing
  `IFormatDecoder` → `StreamDecoder` → `PlaybackScheduler` pipeline. (This is the same decision
  recorded as **Phase 21 OQ5 = NO**; the two phases share it.)
- **C2 — Preprocessing is additive; the WAV path is untouched.** The Opus artifact is a **second
  derived artifact per track**, not a replacement. The existing WAV in the `tracks` vault stays
  byte-for-byte as it is today; the lossless stream path is unchanged. A track with no Opus artifact
  (legacy rows, or a transcode that hasn't run yet) must still play losslessly — Opus is strictly
  additive.
- **C3 — Reuse the landed `Range`/offset seek path; do not fork it.** Phase 4's
  `Range: bytes=X-` → `206` primitive (client `TrackMediaClient` → `DeepDrftPublic` proxy →
  `DeepDrftAPI`) is the substrate for Opus seek too. Opus seek math differs from WAV (VBR /
  container-paged, see §3.4) but it is expressed through the **same** `IFormatDecoder.calculateByteOffset`
  seam the MP3/FLAC decoders already use — no second seek mechanism.
- **C4 — Opus slots into the `IFormatDecoder` registry; no format branches leak elsewhere.** The new
  `OpusFormatDecoder` is selected by `AudioPlayer.createFormatDecoder` on `Content-Type:
  audio/ogg`/`audio/opus`. The rest of the player stack stays format-agnostic. No `if (opus)` outside
  the decoder and the one selection point.
- **C5 — Format selection is a delivery-time decision, resolved server-side from a listener
  signal.** The same `TrackEntity` / `EntryKey` addresses both artifacts; the *format* is a parameter
  on the stream request (query param or `Accept` negotiation — see §3.3), not a different track id and
  not a different vault entry key. One track, two renderings (the standing "one source, multiple
  views" preference applied to delivery).
- **C6 — Transcode failure must not block ingest.** If the Opus transcode fails or is slow, the
  track still persists with its lossless artifact and is playable. Opus is generated best-effort and
  can be (re)generated later — mirror the **waveform-datum** model (`WaveformProfileService`: compute
  on upload, regenerate on demand via a CMS action), which is exactly the "derived artifact, generated
  at ingest, regenerable" pattern this needs.
- **C7 — The vault model holds: derived artifact is a new entry, not a mutation.** The Opus bytes
  live in the FileDatabase under the track's `EntryKey` — either in the existing `tracks` vault under
  a derived key, or in a new sibling vault (see §3.2 options). Either way it is `AudioBinary` with the
  `.opus`/`.ogg` extension and correct MIME, registered like any other vault resource.

---

## 3. Architectural shape

### 3.0 The mental model

A track has one **source artifact** (the uploaded WAV/MP3/FLAC, stored as-is today) and gains one
**derived low-data artifact** (Ogg Opus fullband 320, produced at ingest). The stream endpoint serves
*either*, selected per request. The player picks a decoder by the response `Content-Type` exactly as
it does today. Seeking uses the same `Range` primitive; the byte↔time math is the decoder's job.

```
INGEST (DeepDrftContent + DeepDrftAPI)
  upload → AudioProcessorRouter (existing) → store SOURCE artifact in vault          [unchanged]
         → TRANSCODE to Opus 320 → store DERIVED artifact                            [NEW]
         → WaveformProfileService (existing, unchanged)

DELIVERY (DeepDrftAPI → DeepDrftPublic proxy → DeepDrftPublic.Client → Interop/audio)
  GET api/track/{id}?format=opus|lossless → serve the chosen artifact's bytes (+ Range)   [NEW param]
  player: createFormatDecoder(Content-Type) → OpusFormatDecoder | Wav | Mp3 | Flac        [+1 decoder]
```

### 3.1 Where the transcode lives (relative to existing processing)

The transcode is a **new processor sibling** to the existing format processors, invoked **after** the
source is stored, in the same orchestration that already calls `WaveformProfileService`:

- It belongs in `DeepDrftContent` (the binary-content domain library) as e.g. an
  `OpusTranscodeService` / `OpusProcessor`, **not** in a host and **not** in a controller (per the
  `*.Services`-owns-domain-logic convention).
- It is invoked from `UnifiedTrackService.UploadAsync` (the same place `WaveformProfileService`
  computes the high-res datum on every new track) and from the **replace-audio** path (which already
  regenerates both waveform datums — Opus is the third derived thing to regenerate there).
- Like the waveform datum, it gets a **regenerate trigger**: a CMS per-track / bulk action and an
  ApiKey-gated endpoint, so existing tracks can be backfilled. This mirrors the landed
  "Generate All Profiles / Backfill High-res" bulk actions on `Releases.razor` — **Backfill Opus**
  is the natural third bulk action.

**The transcode engine itself is staff-engineer's call** (FFmpeg/libopus via a process invocation, a
managed binding, or a libopus P/Invoke). The spec fixes the *artifact* (Ogg Opus, fullband, 320 kbps)
and the *seam* (a derived artifact produced post-store, regenerable, failure-tolerant), not the tool.
Note a real operational constraint to flag for implementation: transcoding a 1 GB WAV is **CPU- and
time-expensive** and must not block the upload response — it wants the same off-the-hot-path treatment
the upload body staging already gets (`Upload:StagingPath`). This is the single biggest implementation
risk and is called out as such. The execution model is now **decided** (OQ6): **the source is stored and
the track is playable (lossless) first, then the Opus transcode runs as a background job** — see §3.1a
for the user-visible consequence on the upload UI.

### 3.1a Transcode execution model + the Post-Processing upload phase (RESOLVED — OQ6)

**Execution model (Daniel, 2026-06-23): background process *after* the file is available.** The upload
flow is now two distinct server-side stages with a hard ordering:

1. **Transfer + store + persist (existing, synchronous).** The WAV body streams in (the landed
   `ProgressStreamContent` two-phase cancellation), the source is stored in the vault, the `TrackEntity`
   is persisted, the waveform datums are computed. At the end of this stage **the track is fully playable
   losslessly** — nothing about Opus gates a successful upload.
2. **Opus transcode (NEW, background, after stage 1 completes).** A queued/background job reads the
   stored source, transcodes to Ogg Opus 320, builds the **seek index** and extracts the **setup header**
   (§3.4a), and stores all three derived artifacts. Until it finishes, `?format=opus` for that track
   falls back to lossless (C2). On failure the track stays lossless-only and is eligible for Backfill-Opus
   (C6).

**The upload progress meter gains a visible Post-Processing phase.** The CMS upload forms
(`BatchUpload.razor` / `BatchEdit.razor`) already render a progress meter driven by `ProgressStreamContent`
(byte-transfer progress) and the two-phase cancellation (idle window during transfer, response-wait budget
after the body completes). The transcode is a **third visible phase** appended to that meter — after the
existing "uploading bytes" and "server is persisting" phases, a **Post-Processing** phase reflects the
background transcode's status (queued → transcoding → done / failed). This is an *addition* to the
existing meter, not a new UI.

- The admin sees: bytes transfer → server persists (track now exists + plays lossless) → **Post-Processing**
  (Opus being derived). The form may complete/return the admin to the catalogue after stage 1 (the track
  is live); the Post-Processing phase can continue to report against that track in the browse/release view
  (the Opus waveform/profile columns on `Releases.razor` already poll-and-show per-track derived-artifact
  status — Post-Processing status fits the same affordance family).
- **How status reaches the UI is staff-engineer's call** (poll the track's Opus-artifact presence, an SSE/
  long-poll job channel, or a status field on the track read). The spec fixes that the phase is *visible*
  and *non-blocking* — the admin is never made to wait on the transcode to consider the upload done.
- This composes with the **always-on** decision (OQ4): every upload triggers the background transcode;
  there is no per-upload opt-out, so the Post-Processing phase always appears.

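One way to picture the Post-Processing phase is as a small status model that the meter renders. The sketch below is illustrative only: the type names, field names, and label strings are assumptions, and how the status actually reaches the UI remains staff-engineer's call.

```typescript
// Hypothetical status model for the background Opus job (names are illustrative).
type OpusJobStatus = "queued" | "transcoding" | "done" | "failed";

interface UploadMeterState {
  phase: "uploading" | "persisting" | "post-processing";
  playableLossless: boolean; // true once stage 1 (store + persist) has completed
  opusAvailable: boolean;    // true only when the derived Opus artifact exists
  label: string;
}

/** Maps a background-job status onto what the third meter phase would show. */
function postProcessingState(status: OpusJobStatus): UploadMeterState {
  const labels: Record<OpusJobStatus, string> = {
    queued: "Post-Processing: queued",
    transcoding: "Post-Processing: transcoding to Opus",
    done: "Post-Processing: done",
    failed: "Post-Processing: failed (lossless only, eligible for Backfill Opus)",
  };
  return {
    phase: "post-processing",
    // Post-Processing only starts after stage 1, so lossless playback is always available here.
    playableLossless: true,
    opusAvailable: status === "done",
    label: labels[status],
  };
}
```

The point the model encodes is the ordering guarantee: no Post-Processing state ever reports the track as unplayable, and a `failed` job degrades to lossless-only rather than to an error.
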
### 3.2 Where the Opus artifact is stored (two options)

**Option S1 — derived key in the existing `tracks` vault (recommended).** Store the Opus bytes under
a derived entry key alongside the source, e.g. `{entryKey}` for source and `{entryKey}.opus` (or a
parallel key convention) in the same `tracks` vault. *Pro:* no new vault type, co-located with the
source, simplest lookup. *Con:* mixes two artifacts per logical track in one vault's index.

**Option S2 — a new sibling vault (e.g. `track-opus`).** Mirror the `track-waveforms` precedent
(Phase 12 added a dedicated vault for the derived high-res datum). Opus bytes keyed by the same
`EntryKey` in a `track-opus` vault. *Pro:* clean separation of source vs. derived, matches the
established "derived artifacts get their own vault" pattern (`track-waveforms`), easy to enumerate /
backfill / purge independently. *Con:* one more vault to register.

**Recommendation: S2** — it is the pattern the codebase already chose for the *other* derived
per-track artifact (the high-res waveform datum), so it is the least surprising and keeps the source
`tracks` vault meaning exactly one thing. **Final call is staff-engineer's**; both are viable.

### 3.3 How a listener's format choice reaches the bytes

The stream endpoint gains a **format selector**. Two candidate mechanisms:

- **D-a — explicit query param** `GET api/track/{id}?format=opus|lossless` (recommended). Mirrors the
  existing `offset` query param the proxy already forwards (`TrackProxyController`). Explicit,
  cache-friendly (distinct URLs), trivial to thread through the proxy, and the player already knows
  which it asked for. Server resolves the param → the right artifact → sets the right `Content-Type`,
  which the player's existing `createFormatDecoder` then dispatches on. **No new decoder-selection
  mechanism** — the response content-type does the work it already does.
- **D-b — HTTP content negotiation** (`Accept: audio/ogg` vs `audio/wav`). More "correct" REST, but
  the proxy + WASM client wiring is fussier and caches would have to vary on the `Accept` header for
  the same URL. Not worth it here.

**Recommended: D-a.** The selection *policy* (which format a given listener gets by default, and how
they switch) is a genuine **product call — see OQ1/OQ2**, deliberately not decided here. The
*mechanism* (a query param resolved server-side to an artifact + content-type) is settled.

Server-side fallback rule (C2): if `format=opus` is requested but no Opus artifact exists for that
track (not yet transcoded / backfilled), the endpoint **falls back to lossless** rather than 404ing —
Opus is additive, so its absence degrades to "you get the lossless one," never to "no audio."

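The D-a mechanism and the C2 fallback rule reduce to one small pure function. The sketch below is written in TypeScript for illustration only (the real resolver would live server-side); the type names are hypothetical, it assumes the S2 `track-opus` vault and an `audio/ogg` content type, and treating a missing or unknown `format` value as lossless is an assumption, since the default policy is a product call.

```typescript
type RequestedFormat = "opus" | "lossless";

interface ResolvedArtifact {
  vault: "tracks" | "track-opus"; // assumes storage option S2
  contentType: string;
  fellBack: boolean;              // true when Opus was requested but lossless is served
}

/**
 * Resolves the `?format=` query value to the artifact that will be served.
 * Assumption: anything other than "opus" is treated as a lossless request.
 */
function resolveArtifact(
  formatParam: string | undefined,
  hasOpusArtifact: boolean,
  sourceContentType: string = "audio/wav",
): ResolvedArtifact {
  const requested: RequestedFormat = formatParam === "opus" ? "opus" : "lossless";
  if (requested === "opus" && hasOpusArtifact) {
    return { vault: "track-opus", contentType: "audio/ogg", fellBack: false };
  }
  // Lossless was requested, or Opus is not available yet: serve the source artifact, never a 404.
  return { vault: "tracks", contentType: sourceContentType, fellBack: requested === "opus" };
}
```

Because the fallback changes only which bytes and which `Content-Type` come back, the player needs no special case: `createFormatDecoder` simply sees `audio/wav` and picks the WAV decoder.
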
### 3.4 The Opus decoder (the genuinely new decode work)

`OpusFormatDecoder implements IFormatDecoder` is the new code on the delivery side. **Ogg Opus is a
containerized, paged format — not raw-frame-sliceable** the way WAV PCM is. WAV's `wrapSegment` prepends a
44-byte PCM header to any PCM-aligned byte run; the current model assumes you can wrap an arbitrary aligned
raw-audio slice and hand it to `decodeAudioData`. Ogg Opus is page-structured (Ogg pages carrying Opus
packets, plus mandatory `OpusHead`/`OpusTags` **setup pages** at the very start). A mid-stream byte slice
is **not** independently decodable: it needs (1) the setup header prepended, and (2) to begin on an Ogg
**page boundary**. So:

- `OpusFormatDecoder.getAlignedSegmentSize` aligns to **Ogg page boundaries** — scan for the `OggS`
  capture pattern (analogous to FLAC's frame-sync scan; the `IFormatDecoder` interface already passes
  `rawData` to `getAlignedSegmentSize` for exactly this reason).
- `wrapSegment` / the continuation path **prepends the `OpusHead`/`OpusTags` setup bytes** to a mid-stream
  page run before handing it to `decodeAudioData` (analogous to FLAC's `streamInfoBytes` carry in
  `FlacSeekData`). The setup bytes come from the **setup-header mechanism** (§3.4a), not from re-reading
  the stream start.
- A new `OpusSeekData` variant joins `Mp3VbrSeekData | FlacSeekData` in the `seekData` accelerator slot —
  but for Opus it carries the **accurate seek index** (§3.4a), not a heuristic TOC.

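Page alignment can be sketched directly from the Ogg page layout: a 27-byte header that starts with the `OggS` capture pattern, whose last byte is the segment count, followed by a segment table whose entries sum to the page body length. The function names below are hypothetical (this is not the existing `getAlignedSegmentSize` signature), the buffer is assumed to begin on a page boundary, and the page CRC is not verified.

```typescript
const OGG_HEADER_BYTES = 27; // fixed part of an Ogg page header

/** True when the four bytes at `at` are the ASCII capture pattern "OggS". */
function isOggCapture(data: Uint8Array, at: number): boolean {
  return (
    at + 4 <= data.length &&
    data[at] === 0x4f && data[at + 1] === 0x67 &&
    data[at + 2] === 0x67 && data[at + 3] === 0x53
  );
}

/** Byte length of the complete Ogg page starting at `at`, or -1 if no whole page is there. */
function oggPageLength(data: Uint8Array, at: number): number {
  if (!isOggCapture(data, at) || at + OGG_HEADER_BYTES > data.length) return -1;
  if (data[at + 4] !== 0) return -1; // stream structure version is always 0
  const segmentCount = data[at + 26];
  const tableEnd = at + OGG_HEADER_BYTES + segmentCount;
  if (tableEnd > data.length) return -1;
  let bodyBytes = 0;
  for (let i = at + OGG_HEADER_BYTES; i < tableEnd; i++) bodyBytes += data[i];
  const total = OGG_HEADER_BYTES + segmentCount + bodyBytes;
  return at + total <= data.length ? total : -1;
}

/** Largest prefix of `data` that consists only of whole Ogg pages. */
function alignedOggPrefixLength(data: Uint8Array): number {
  let offset = 0;
  for (;;) {
    const length = oggPageLength(data, offset);
    if (length < 0) return offset; // partial or missing page: stop at the last boundary
    offset += length;
  }
}
```

Walking page lengths from a known boundary avoids false positives from `OggS` byte sequences that happen to occur inside packet data; a pure pattern scan would need the CRC check to be equally safe.
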
**The `IFormatDecoder` abstraction already has the shape for both needs** — a format-specific `seekData`
accelerator and a setup-bytes carry — because FLAC needed the same kind of thing. The genuinely new part
is **where the seek index and setup header come from**, which §3.4a designs.

> **Seek is NOT approximate for Opus (Daniel, 2026-06-23 — supersedes the earlier hand-wave).** An earlier
> draft of this section proposed "granule-position/Ogg-page interpolation" — a best-effort approximate
> offset, the Opus analogue of MP3's Xing TOC. **That is rejected.** Daniel: *"Killing seeking for
> decoding is unacceptable… Raw bytes offset for seeking is no longer adequate due to VBR. We need an
> accurate transfer function for seek time → true file byte offset."* Opus seeking is **accurate**, backed
> by a precomputed index built at transcode time. See §3.4a.

**Browser decode-support constraint (real, must be designed around).** The bespoke graph decodes
segments via `AudioContext.decodeAudioData`. Ogg-Opus support in `decodeAudioData` is long-standing in
Chrome and Firefox but arrived in **Safari only at 18.4 (macOS 15.4 / iOS 18.4, March 2025)**; older
Safari decodes Opus only in a CAF container, not Ogg. iOS Safari is a primary music-listening surface,
so this is not a corner case. Implications: (1) the **lossless WAV path is the universal fallback** for
listeners whose browser can't decode Ogg Opus — which C2's additive design already provides for free;
(2) the format default is **capability-gated** (OQ2, RESOLVED) — don't hand Ogg Opus to a Safari that
can't decode it; detect support (a probe `decodeAudioData` on a tiny Opus blob, or a UA/version gate) and
fall back to lossless. This intersects Phase 1.7 (Safari compatibility) and is flagged there too.
([Browser support: caniuse / WebKit 18.4 release notes — see Sources.])

### 3.4a VBR-safe accurate seeking (the seek-index artifact + the setup-header mechanism)

This is the architectural core of the Opus delivery path, and it must compose with **Phase 21 windowed
refill** (where most of the stream is *not* in memory). The requirement, decomposed from Daniel's
direction:

1. Seeking must be preserved for Opus **without** having the full PCM decoded in memory.
2. Raw byte-offset seek is inadequate — a VBR Opus stream has **no linear time↔byte relationship**, so
   `byteRate` math and even rough page interpolation are not accurate enough.
3. We need an **accurate transfer function: seek-time → true file byte offset.**
4. The decode setup header must be **available separately** (or cached before seeking past it), because a
   mid-stream slice is undecodable without `OpusHead`/`OpusTags`.

**The key insight: the one moment we already walk the entire encoded stream is the transcode.** That is
precisely when an accurate index can be built for free. We never have to guess at delivery time — we read
the answer out of a precomputed artifact.

#### A. The seek-index artifact (the accurate transfer function)

At transcode time, after the Opus bytes are produced, **walk the encoded Ogg stream once and record, for
each Ogg page (or coarser bucket), the page's `granulepos` (a 48 kHz sample count → time) paired with its
**byte offset** in the file.** That granule→byte table *is* the exact transfer function. This is the Opus
analogue of FLAC's `SEEKTABLE` / MP3's Xing TOC — but **precomputed and exact**, not derived by
interpolation guessing. Ogg granule positions are authoritative sample counts, so the mapping is true, not
estimated. (One detail for the index builder: a page's `granulepos` counts samples through the *end* of the
last packet completed on that page, and playback time is `granulepos` minus the `OpusHead` pre-skip, so
each page-start byte offset should be paired with the position at which that page *begins*, i.e. the
previous page's `granulepos`.)

- **What it contains.** An ordered list of `(timeSeconds | granulepos, byteOffset)` entries, plus the
  total duration and total byte length (for clamping a seek to range). A binary little-endian array of
  fixed-width records is the natural shape (e.g. a `uint64 granulepos` + `uint64 byteOffset` per entry);
  the exact encoding is staff-engineer's, but it should be a **compact binary blob**, fetched once and
  parsed into a typed array client-side.
- **Granularity vs. size — RESOLVED: 0.5 s (half-second) buckets (Daniel, 2026-06-23).** One entry per
  Ogg page is the most precise but largest; an Ogg page is typically a few KB of audio (~tens of ms to a
  few hundred ms), so a 1-hour mix could be tens of thousands of pages. The chosen bucket is **one index
  entry per 0.5 seconds of audio** (snap each bucket boundary to the start of the page that contains it,
  i.e. the nearest page start at or before the boundary, so every indexed offset is still an exact page
  boundary). At 0.5 s granularity a 1-hour mix is
  7,200 entries × 16 bytes ≈ **115 KB** — still a trivial one-time fetch, and 0.5 s seek resolution is
  finer than required (the decoder re-syncs to the exact page within the bucket anyway — see the client
  flow — so the in-bucket trim is *sub-half-second*, tighter than the earlier ~1–2 s recommendation).
  **Per-page precision remains the fallback if 0.5 s buckets ever prove too coarse**, at a larger index.
  The bucket size is now fixed; the *shape* (precomputed exact granule→byte, bucketed, snapped to page
  starts) is unchanged.
- **Sidecar, not embedded (recommended).** Store the index as a **third derived artifact** alongside the
  Opus bytes and the waveform datum — the same "derived artifacts get their own vault" pattern this phase
  already uses (S2 / `track-opus`; the `track-waveforms` precedent). Keep it a separate vault resource
  (e.g. `{entryKey}.seekidx` in a `track-opus` vault, or its own `track-opus-index` vault) rather than
  embedding it in the Ogg stream. *Why sidecar:* it is fetched **once, up front** (small, cacheable),
  independent of the audio byte stream; embedding it in the Ogg would force the client to read into the
  stream to find it, defeating the "resolve the offset *before* the Range fetch" flow. *Road not taken —
  derive the index lazily on first seek by scanning server-side:* rejected, because it re-walks the stream
  at request time (the cost we avoid by computing at transcode) and gives nothing the precomputed sidecar
  doesn't.

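The bucketing rule and the size estimate above can be sketched as follows. This is a minimal illustration, not the transcode implementation: `PageMark` and both function names are hypothetical, the page list is assumed to come from the single walk of the encoded stream, `startGranule` is assumed to already be the position at which each page begins, and the 16-byte record (two little-endian `uint64` values) is the example encoding from the text rather than a fixed format.

```typescript
// Hypothetical per-page record produced by the transcode-time walk.
interface PageMark {
  startGranule: number; // 48 kHz sample position at which the page begins
  byteOffset: number;   // byte offset of the page's "OggS" capture pattern
}

const OPUS_RATE = 48_000;
const BUCKET_SECONDS = 0.5;
const RECORD_BYTES = 16; // uint64 granule + uint64 byteOffset

/** One entry per 0.5 s bucket, each snapped to the page start at or before the boundary. */
function buildSeekIndex(pages: PageMark[], totalGranules: number): PageMark[] {
  if (pages.length === 0) return [];
  const bucketGranules = OPUS_RATE * BUCKET_SECONDS;
  const entries: PageMark[] = [];
  let p = 0;
  for (let boundary = 0; boundary < totalGranules; boundary += bucketGranules) {
    while (p + 1 < pages.length && pages[p + 1].startGranule <= boundary) p++;
    const page = pages[p];
    const last = entries[entries.length - 1];
    // A page longer than one bucket would repeat; keep each page start only once.
    if (!last || last.byteOffset !== page.byteOffset) entries.push({ ...page });
  }
  return entries;
}

/** Writes `value` (a non-negative integer below 2^53) as a little-endian uint64. */
function writeUint64LE(view: DataView, at: number, value: number): void {
  view.setUint32(at, value >>> 0, true);                           // low 32 bits
  view.setUint32(at + 4, Math.floor(value / 0x100000000), true);   // high 32 bits
}

/** Packs the entries into the compact fixed-width binary table. */
function encodeSeekIndex(entries: PageMark[]): Uint8Array {
  const out = new Uint8Array(entries.length * RECORD_BYTES);
  const view = new DataView(out.buffer);
  entries.forEach((entry, i) => {
    writeUint64LE(view, i * RECORD_BYTES, entry.startGranule);
    writeUint64LE(view, i * RECORD_BYTES + 8, entry.byteOffset);
  });
  return out;
}
```

For a one-hour stream this yields 7,200 entries and a 115,200-byte table, matching the estimate in the granularity bullet.
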
#### B. The setup-header mechanism (decodability of any mid-stream slice)

Any post-seek slice needs `OpusHead` + `OpusTags` prepended to decode. Two ways to make those bytes
available to the client:

- **B-a — Client-side caching of the leading setup pages on first read (recommended).** On first play, the
  stream already begins at byte 0, so the client *already receives* the `OpusHead`/`OpusTags` pages as the
  opening bytes. `OpusFormatDecoder.tryParseHeader` captures and **retains** those setup bytes (exactly as
  `WavFormatDecoder` retains the parsed WAV header for `reinitializeForRangeContinuation` today, and FLAC
  retains `streamInfoBytes`). Every subsequent post-seek continuation prepends the cached setup bytes. *No
  new endpoint;* it reuses the header-retention discipline already in the codebase.
- **B-b — A dedicated setup-header sidecar endpoint** (`GET api/track/{id}/opus/header` → just the
  `OpusHead`/`OpusTags` bytes, also derivable at transcode time and stored as a tiny artifact). *Pro:* a
  seek can be served even if the listener seeks **before** the stream start has been read (e.g. a deep-link
  that begins mid-track, or a Phase 21 window that opens away from byte 0). *Con:* one more endpoint +
  artifact.

**Recommendation: B-a as the primary, B-b as a cheap insurance artifact.** B-a covers the overwhelming
common case (play-then-seek) with **zero new surface** — it is the WAV-header-retention pattern applied to
Opus. But Phase 21 windowing and deep-links can legitimately open a window that never read byte 0, so the
setup header should **also** be derivable on demand. Cheapest reconciliation: **extract the setup bytes at
transcode time and store them as a tiny sidecar artifact** (they are a few hundred bytes), and expose them
**either** as a small endpoint **or** by simply prepending them to the seek-index sidecar's header region so
the single up-front index fetch *also* delivers the setup bytes. The latter folds B-b into the B-a fetch: **the
client's one up-front sidecar fetch returns both the seek index and the setup header**, so it always has
both before it ever issues a seek — and never needs byte 0 to have been read. **Recommended concrete
design: one sidecar per track = `[setup-header bytes][seek-index table]`, fetched once on track load,
parsed into `OpusSeekData`.** This is the cleanest: one new artifact, one new fetch, both needs met.

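A client-side parse of the combined sidecar could look like the sketch below. The byte layout is an assumption made for illustration, since the spec leaves the exact encoding to staff-engineer: a little-endian `uint32` giving the setup-header length, then the setup-header bytes, then 16-byte records of two little-endian `uint64` values. The `OpusSeekData` field names are likewise hypothetical.

```typescript
// Hypothetical in-memory shape for the parsed sidecar.
interface OpusSeekData {
  setupHeader: Uint8Array;   // OpusHead + OpusTags pages, prepended to every mid-stream slice
  granules: Float64Array;    // page-start positions in 48 kHz samples, ascending
  byteOffsets: Float64Array; // matching page-start byte offsets in the Opus file
}

/** Reads a little-endian uint64 as a number (exact for values below 2^53). */
function readUint64LE(view: DataView, at: number): number {
  return view.getUint32(at, true) + view.getUint32(at + 4, true) * 0x100000000;
}

/** Parses the assumed layout: [uint32 headerLength][setup header][16-byte records...]. */
function parseSidecar(blob: Uint8Array): OpusSeekData {
  if (blob.byteLength < 4) throw new Error("sidecar too short");
  const view = new DataView(blob.buffer, blob.byteOffset, blob.byteLength);
  const headerLength = view.getUint32(0, true);
  const tableStart = 4 + headerLength;
  const tableBytes = blob.byteLength - tableStart;
  if (tableBytes < 0 || tableBytes % 16 !== 0) throw new Error("malformed sidecar");

  const count = tableBytes / 16;
  const granules = new Float64Array(count);
  const byteOffsets = new Float64Array(count);
  for (let i = 0; i < count; i++) {
    granules[i] = readUint64LE(view, tableStart + i * 16);
    byteOffsets[i] = readUint64LE(view, tableStart + i * 16 + 8);
  }
  // slice() copies, so the retained header does not pin the whole fetched blob in memory.
  return { setupHeader: blob.slice(4, tableStart), granules, byteOffsets };
}
```

`Float64Array` is used instead of `BigUint64Array` so the values can feed ordinary arithmetic; it represents byte offsets and sample counts exactly up to 2^53, far beyond any realistic track.
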
#### C. The client-side seek flow, end to end

With the sidecar (`OpusSeekData` = setup header + granule→byte index) fetched and parsed at track load:

1. **Resolve time → byte offset (accurate).** Listener seeks to `t` seconds. `OpusFormatDecoder.calculateByteOffset(t)`
   does a binary search in the index for the largest entry with `time ≤ t`, returns its exact (page-start)
   `byteOffset`. **No interpolation, no `byteRate` math.** (For WAV this method stays the exact CBR
   calculation it is today — the seam is identical; only the Opus implementation reads an index.)
2. **Range fetch from the offset.** Issue `GET api/track/{id}?format=opus` with `Range: bytes={byteOffset}-`
   — the **landed Phase 4 Range primitive, unchanged**. Server streams raw Opus bytes from that exact page
   boundary (`206 Partial Content`).
3. **Prepend the cached setup header + decode.** The continuation path (the Opus analogue of
   `StreamDecoder.reinitializeForRangeContinuation`) prepends the retained/sidecar `OpusHead`/`OpusTags`
   bytes to the incoming page run, then feeds it to `decodeAudioData`. Because the index offset is an exact
   page start, the stream is immediately Ogg-sync-aligned.
4. **Fine re-sync within the bucket.** The granule of the first decoded page tells the decoder the *exact*
   time it landed at (at most one bucket width before `t`); the scheduler trims/positions to land
   playback at `t` precisely. With 0.5 s buckets the trim is sub-half-second; with per-page granularity it
   is near-zero. **Either way the listener lands at the correct time, not approximately** (AC9).

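Steps 1 and 4 of the flow above amount to an upper-bound binary search plus a subtraction. The sketch below is a standalone illustration, not the actual `OpusFormatDecoder.calculateByteOffset` signature: `SeekIndexView`, `SeekResolution`, and `resolveSeek` are hypothetical names, the index is assumed to be ascending page-start positions in 48 kHz samples, and pre-skip is ignored for brevity.

```typescript
// Hypothetical view over the parsed index (ascending, one entry per bucket).
interface SeekIndexView {
  granules: Float64Array;    // page-start positions in 48 kHz samples
  byteOffsets: Float64Array; // matching page-start byte offsets
}

interface SeekResolution {
  byteOffset: number;    // exact page-start offset to use in `Range: bytes={byteOffset}-`
  landedSeconds: number; // time at which the fetched page run begins
  trimSeconds: number;   // how much decoded audio to skip so playback starts at t
}

const OPUS_SAMPLE_RATE = 48_000;

/** Finds the largest index entry with time <= t; no interpolation, no byteRate math. */
function resolveSeek(index: SeekIndexView, tSeconds: number): SeekResolution {
  if (index.granules.length === 0) throw new Error("empty seek index");
  const target = Math.max(0, tSeconds) * OPUS_SAMPLE_RATE;

  let lo = 0;
  let hi = index.granules.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >>> 1; // bias upward so the loop always makes progress
    if (index.granules[mid] <= target) lo = mid;
    else hi = mid - 1;
  }

  const landedSeconds = index.granules[lo] / OPUS_SAMPLE_RATE;
  return {
    byteOffset: index.byteOffsets[lo],
    landedSeconds,
    trimSeconds: Math.max(0, tSeconds - landedSeconds),
  };
}
```

A seek past the end of the index resolves to the last entry, so the caller should still clamp `t` to the track duration before scheduling. The same resolver would serve Phase 21's windowed refills, since a refill is a seek the listener did not initiate.
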
#### D. Composition with Phase 21 windowed refill

Phase 21's windowed refill controller resolves "I need bytes for playback position `P`" → a byte offset →
a Range fetch. **It calls the *same* `OpusFormatDecoder.calculateByteOffset` (the index-based resolver)
for Opus** that an explicit seek does — windowed refill is just a seek the listener didn't initiate. So the
seek index serves both: explicit seeks and the window's low-water refills both resolve through the index,
and both prepend the cached setup header. This is why §3.4a is in **Phase 18** (where the transcode that
builds the index lives), and Phase 21 *consumes* it. The Phase 21 spec's "approximate mapping" language for
Opus is now wrong and is corrected to **"accurate index-based mapping."**

#### E. Reuse vs. extend (the seam discipline)
|
||
|
||
- **Reused verbatim:** the Phase 4 `Range: bytes=X-` → 206 primitive (client → proxy → API); the
|
||
`IFormatDecoder.calculateByteOffset` seam; the header-retention/continuation discipline
|
||
(`reinitializeForRangeContinuation`'s Opus analogue); the derived-artifact-in-its-own-vault pattern
|
||
(`track-waveforms` → `track-opus`); the derive-at-transcode-regenerate-on-backfill lifecycle.
|
||
- **Extended (new):** the seek-index + setup-header **sidecar artifact** (built at transcode, stored
|
||
beside the Opus bytes); the one-time **sidecar fetch** on track load (parsed into `OpusSeekData`); the
|
||
index **binary-search resolver** inside `OpusFormatDecoder`. Three additions, all leaf-level — no change
|
||
to the Range mechanism, the proxy, or the format-agnostic player.
|
||
|
||
### 3.5 The three candidate directions (shape-level)

Per file convention the alternatives are recorded; the recommendation follows.

**Direction A — Derived Opus artifact at ingest + format param on delivery (recommended).** What
§3.1–3.4a describe: transcode to Opus 320 post-store as a **background job** (OQ6), store as derived artifacts
(S2 vault) — the Opus bytes **plus the seek-index/setup-header sidecar** (§3.4a) — serve via a `?format=`
param resolved server-side to bytes + content-type, decode via a new `OpusFormatDecoder` in the existing
registry, **seek accurately via the precomputed index**. *Why recommended:* additive (C2), reuses every
existing seam (the processor orchestration, the waveform-datum derived-artifact pattern, the `Range` path,
the decoder registry, the header-retention discipline), and the only genuinely new code is one transcode
step (+ index build) + one decoder (+ index resolver). **Three** derived artifacts per track (Opus bytes,
seek sidecar, and the existing waveform datum), all regenerable.

**Direction B — On-the-fly transcode at delivery (no stored Opus artifact).** Transcode WAV→Opus per
request in the stream endpoint, streaming the Opus out as it encodes. *Why not (default):* moves
expensive CPU onto the **hot request path** (a 1 GB mix transcoded per play is untenable), breaks
`Range`/seek (you can't byte-offset into a stream you're encoding live), and defeats caching. It *is*
storage-cheaper (no second artifact on disk), so it is the fallback only if disk cost ever dominates —
but for a music site where the same tracks are played repeatedly, precompute-once wins decisively.
Rejected as the primary.

**Direction C — Replace WAV ingest with Opus-only (transcode and discard the lossless source).** Make
Opus *the* stored format; drop WAV. *Why not:* violates Daniel's explicit "lossless streaming
*optional* — two delivery formats, listener gets a choice." Lossless is a kept option, not a thing to
transcode away. Also irreversibly lossy at ingest (you can never recover the WAV). Rejected outright;
recorded only because "just store Opus" is the tempting simplification and the spec should say why not.

### 3.6 SOLID / road-not-taken rationale

- **OCP, via the existing seams.** The transcode is a new processor sibling (the router pattern is
already open for extension); the decoder is a new `IFormatDecoder` (the registry is already open for
extension); the artifact is a new derived vault resource (the `track-waveforms` precedent is exactly
this). Phase 18 adds **three new leaf implementations** and **zero changes to existing format code**
— the strongest possible OCP signal that the seams were designed right.
- **SRP, preserved.** Transcoding **and the seek-index build** are content-domain processor concerns
(`DeepDrftContent`); delivery selection is a thin endpoint concern (`DeepDrftAPI` resolves a param to an
artifact, and serves the sidecar); decode is the `OpusFormatDecoder`'s concern; byte↔time math stays
inside that decoder via `calculateByteOffset` (now reading the index, not interpolating). No
responsibility crosses a boundary it doesn't already own. The seek index is built **once, where the
stream is already walked** (transcode) — the natural home for an exact transfer function, never
recomputed at request time.
- **DIP / "one source, multiple views."** One `TrackEntity`/`EntryKey` is the single source; "lossless
WAV" and "low-data Opus" are two *views* (renderings) of it, diverging only at the delivery/decode
layer — the same discipline the dark-mode and track-browse surfaces follow.
- **Road not taken — a separate `TrackEntity` row (or a new track id) per format.** Tempting (one row
= one streamable file) but it fractures the track identity: shares, queues, play-counts (Phase 16),
release membership, and waveform data all key on one track, and doubling rows to carry a format
would force every one of those surfaces to dedupe. Format is a *delivery attribute of one track*,
not a *second track*. Rejected — keep one identity, two artifacts.

---

## 4. Format selection — the product surface (RESOLVED — global, via a Settings menu)

**Resolved (Daniel, 2026-06-23):** the listener's quality choice is **global** (one session/visitor-level
"streaming quality" preference, not per-track), Opus is the **default** (capability-gated), and the choice
is **remembered** following the dark-mode persistence pattern. Crucially: *"Global is perfect, but we need
a menu system for settings, don't just slap the quality control directly in the app bar."* So the toggle
does **not** sit bare in the app bar — it lives inside a proper **public-site Settings menu** (§4a), of
which it is the **first occupant**.

- **What the listener sees.** A Settings affordance in the public app bar opens a Settings menu; inside it,
a "Streaming quality" control with two options — **Low-data (Opus)** / **Lossless (WAV)** — defaulting to
Low-data. Picking lossless flips the global preference; the player sends the matching `?format=` on
subsequent stream requests (§3.3). On a browser that can't decode Ogg Opus, the control is shown but the
effective stream is lossless (capability gate, §3.4 / OQ2) — surface this honestly rather than letting
the listener pick a format that silently can't play.
- **Default before any choice:** Opus, capability-gated (OQ2 RESOLVED). A first-time visitor on a capable
browser streams Opus; on an incapable browser, lossless.
- **Persistence:** mirror the dark-mode seam exactly (OQ3 RESOLVED) — see §4a.

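
The default-plus-capability-gate rule can be written as a small pure function. This is a hedged TypeScript sketch: `StreamQuality`, `effectiveFormat`, and `streamUrl` are hypothetical names, and how the `canDecodeOggOpus` flag is obtained is deliberately left to the caller, since the spec does not fix a detection method. The `api/track/{id}?format=` URL shape is the one used in §3.4a.

```typescript
// The two delivery formats the listener can choose between.
type StreamQuality = "opus" | "lossless";

// Combine the stored preference (if any) with the browser capability flag.
// Assumption: the caller supplies canDecodeOggOpus from whatever probe the
// player settles on; this function does not perform the probe itself.
function effectiveFormat(
  preference: StreamQuality | undefined,
  canDecodeOggOpus: boolean
): StreamQuality {
  const wanted: StreamQuality = preference ?? "opus"; // Opus is the default before any choice
  // Never hand the player a format it cannot decode: gate Opus down to lossless.
  return wanted === "opus" && !canDecodeOggOpus ? "lossless" : wanted;
}

// Build the stream request path carrying the chosen format.
function streamUrl(trackId: string, format: StreamQuality): string {
  return `api/track/${encodeURIComponent(trackId)}?format=${format}`;
}
```

Keeping the gate separate from the stored preference means the Settings control can still show what the listener picked while the UI reports the effective format honestly.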
### 4a. The Settings menu surface (NEW — scoping + the dark-mode persistence pattern)

Daniel asked for a **menu system for settings**, not a control bolted onto the app bar, and noted the
existing **dark-mode toggle** is a natural future tenant of the same menu (design for adaptability — build
the menu so dark mode *could* move into it later, but **do not force that migration now**).

**Scoping recommendation: a small sub-track *within* Phase 18 (wave 18.6), not its own phase.** Reasoning:

- The menu's only **required** occupant right now is the quality toggle, which Phase 18 owns end to end —
splitting the shell into a separate phase would create a phase whose sole deliverable is an empty menu
waiting for Phase 18's toggle. That is ceremony, not separation of concerns.
- The menu is **small** — an app-bar trigger + a MudBlazor menu/popover + the persistence seam (which the
quality toggle needs *anyway*). It is not a platform; it is a container with one tenant.
- It carries a real **design-for-adaptability** obligation (it must be able to host dark mode and future
settings later), but that is a *shape* requirement on a small surface, not a phase's worth of work.

So: **build the Settings-menu shell as part of Phase 18 (wave 18.6), with the quality toggle as its first
occupant, designed so dark mode and future preferences can plug in without restructuring.** Flag for
Daniel: *if he wants the menu shell proven/landed independently before the quality toggle plugs in*, 18.6
can be split into "menu shell" then "quality toggle plugs in" — but they are small enough to land together.
This is **not** recommended as its own top-level phase. (If Daniel disagrees and wants a dedicated
"Public Settings Menu" phase that Phase 18's toggle then targets, that is a clean alternative — it just
front-loads a surface with no second tenant yet. Recommendation stands: sub-track.)

**The menu shell — design-for-adaptability requirements (so it survives new tenants):**

- A **settings-item abstraction**, not a hard-coded list. The menu renders a small set of settings entries;
adding dark mode later is adding an entry, not rewiring the menu. Each entry is a label + a control bound
to a persisted preference.
- A **single public-site settings object** carrying all listener preferences (today: streaming quality;
tomorrow: dark mode, and whatever follows). This is the `DarkModeSettings` analogue, generalized — call
it e.g. `PublicSiteSettings` / `ListenerSettings`. Dark mode's existing `DarkModeSettings` can fold into
it *later* without disturbing the menu.

**Persistence — mirror the dark-mode seam exactly (OQ3 RESOLVED).** The quality preference follows the
*identical* path dark mode already uses (root `CLAUDE.md` "Theming and dark mode"):

1. **Cookie** — a `streamQuality` cookie (365-day, like `darkMode`), the durable truth.
2. **Server prerender read** — a service in `DeepDrftPublic` (sibling to `DarkModeService`) reads the
cookie during prerender and seeds the settings object, avoiding a wrong-default flash on first paint
(the streaming-quality analogue of the "wrong theme flash" fix).
3. **`PersistentComponentState` bridge** — the seeded preference carries from server prerender into the
WASM render (the same bridge `DarkModeSettings` and `NowPlayingStats`/`StatsClient` already use), so the
client boots already knowing the quality without a re-read flash or a re-fetch.
4. **Client cookie service** — a runtime client-side service (JS-interop cookie write, like the dark-mode
toggle) persists the choice when the listener changes it in the menu.

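
The cookie end of this seam (steps 1 and 4) might look like the sketch below on the JS-interop side. Only the cookie name `streamQuality` and the 365-day lifetime come from the spec; the stored values `opus` / `lossless`, the `Path` and `SameSite` attributes, and both function names are assumptions for illustration. The server prerender read (step 2) would be C# in `DeepDrftPublic` and is not shown. The parser rejects unknown values so that a stale or hand-edited cookie falls through to the default instead of selecting an unsupported format.

```typescript
type StreamQuality = "opus" | "lossless";

const QUALITY_COOKIE = "streamQuality";
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60; // 365-day lifetime, like darkMode

// Produce the string a client cookie service would assign to document.cookie.
// Path and SameSite are illustrative choices, not taken from the spec.
function serializeQualityCookie(quality: StreamQuality): string {
  return `${QUALITY_COOKIE}=${quality}; Max-Age=${ONE_YEAR_SECONDS}; Path=/; SameSite=Lax`;
}

// Read the preference back out of a Cookie header / document.cookie string.
// Returns undefined when the cookie is absent or holds an unrecognized value.
function parseQualityCookie(cookieString: string): StreamQuality | undefined {
  for (const part of cookieString.split(";")) {
    const [name, ...rest] = part.trim().split("=");
    if (name !== QUALITY_COOKIE) continue;
    const value = rest.join("=");
    if (value === "opus" || value === "lossless") return value;
  }
  return undefined;
}
```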

**Why mirror rather than invent:** the dark-mode seam is the codebase's established, working pattern for "a
listener preference seeded at prerender, carried to WASM, persisted in a cookie." Reusing its shape means
the quality preference inherits the no-flash guarantee for free, and the eventual dark-mode-into-the-menu
migration is a *consolidation of two identical seams*, not a reconciliation of two different ones. (This is
the "one source, multiple views" / design-for-adaptability discipline applied to listener settings.)

---

## 5. Use cases

- **UC1 — Listener streams the low-data Opus of a long mix (the headline win).** A ~1 GB lossless mix
transfers as ~220 MB of Opus; playback through the bespoke graph is identical in feel, far cheaper
on bandwidth. (Compounds with Phase 21 windowing for the memory side.)
- **UC2 — Listener prefers lossless and switches to it.** The same track served as WAV via
`?format=lossless`; the bespoke graph decodes it exactly as today.
- **UC3 — Legacy / not-yet-transcoded track.** `?format=opus` requested, no Opus artifact yet →
server falls back to lossless (C2); the listener still hears the track. A later Backfill-Opus pass
produces the artifact.
- **UC4 — Admin backfills Opus for the existing catalogue.** A bulk "Backfill Opus" CMS action (the
third sibling to the existing Generate-Profiles / Backfill-High-res actions) transcodes every track
lacking an Opus artifact.
- **UC5 — Replace-audio regenerates Opus.** The existing replace-audio path (which already regenerates
both waveform datums and re-derives duration) also regenerates the Opus artifact from the new
source.
- **UC6 — Seek within an Opus stream (accurately).** Backward/forward seek resolves via the existing
`Range` path; the offset comes from the `OpusFormatDecoder`'s **precomputed seek index** (§3.4a) — an
exact granule→byte lookup, then fine re-sync to the requested time within the bucket. The listener lands
at the **correct** time, not approximately, and without the full PCM decoded in memory.
- **UC7 — Safari that can't decode Ogg Opus.** Capability-gated to the lossless path (§3.4), so the
listener still plays audio. (Ties to OQ2 + Phase 1.7.)
- **UC8 — Listener switches streaming quality in the Settings menu.** The listener opens the public
Settings menu, flips "Streaming quality" from Low-data to Lossless (or back); the choice persists
(cookie, dark-mode pattern) and applies to subsequent stream requests via `?format=`. On next visit the
preference is seeded at prerender (no flash, no re-pick). (§4 / §4a.)
- **UC9 — Deep-link / windowed start away from byte 0.** A listener opens a stream at a mid-track position
(deep link, or a Phase 21 window that opens past byte 0) without ever reading the stream start. The
decoder still has the `OpusHead`/`OpusTags` setup bytes because they arrived with the up-front sidecar
fetch (§3.4a B), so the mid-stream slice is decodable immediately. (Composition case for Phase 21.)

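
The UC3 fallback (no Opus artifact yet, so serve lossless) reduces to one decision. The sketch below is a language-neutral illustration written in TypeScript; the real resolver would live server-side in `DeepDrftAPI` and be C#. The `Artifact` and `TrackArtifacts` shapes, the `resolveArtifact` name, the `audio/wav` content type, and the rule that a missing `format` param means lossless are all assumptions; only `audio/ogg` for the Opus artifact comes from the spec. Returning the `served` format alongside the bytes lets the response carry the content type that matches what was actually sent, which the client needs in order to pick the right decoder.

```typescript
type Format = "opus" | "lossless";

// Hypothetical artifact record: the bytes plus the content type to serve them with.
interface Artifact {
  bytes: Uint8Array;
  contentType: string;
}

// Hypothetical per-track view: lossless always exists, Opus only after transcode.
interface TrackArtifacts {
  lossless: Artifact;
  opus?: Artifact;
}

// Resolve the requested format to an artifact, falling back to lossless
// when Opus was requested but has not been produced yet (no 404, no silence).
function resolveArtifact(
  track: TrackArtifacts,
  requested: Format | undefined
): { served: Format; artifact: Artifact } {
  if (requested === "opus" && track.opus !== undefined) {
    return { served: "opus", artifact: track.opus };
  }
  // Assumption: an absent format param keeps today's lossless behavior.
  return { served: "lossless", artifact: track.lossless };
}
```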
---

## 6. Open questions — RESOLVED (Daniel, 2026-06-23)

All seven open questions are resolved. Kept visible per file convention, each with the decision and
the section that now carries it. OQ7 (raised by the seek-model design) is a narrow tuning call, now set to
0.5 s buckets.

- **OQ1 — Selection UX — RESOLVED: global, via a Settings *menu* (not a bare app-bar control).** Daniel:
*"Global is perfect, but we need a menu system for settings, don't just slap the quality control directly
in the app bar."* So: one global quality preference, surfaced inside a new **public-site Settings menu**
(§4 / §4a), of which the quality toggle is the first occupant. The menu is scoped as a **Phase 18
sub-track (wave 18.6)**, designed so dark mode (its natural future tenant) can plug in later. `[RESOLVED
— §4 / §4a]`
- **OQ2 — Default policy — RESOLVED: Opus by default, capability-gated.** Opus is the default; on a browser
that cannot decode Ogg Opus (Safari < 18.4, §3.4), fall back to lossless rather than serving an
undecodable stream. Network-awareness (Opus on cellular / lossless on wifi) remains **deferred** as
gold-plating. `[RESOLVED — §3.4, §4]`
- **OQ3 — Remembered choice — RESOLVED: persisted, following the dark-mode pattern.** A `streamQuality`
cookie seeded at server prerender → settings object → `PersistentComponentState` bridge into WASM →
client cookie service for runtime writes. The full dark-mode seam mirrored (§4a). `[RESOLVED — §4a]`
- **OQ4 — Per-upload Opus control — RESOLVED: always-on + backfill.** Opus is generated for **every**
track, always (no per-upload opt-out). **Plus** a bulk **Backfill-Opus** CMS action processes the
existing catalogue. (The listener's lossless choice already covers "I want lossless," so a per-track
opt-out earns nothing.) `[RESOLVED — §3.1, UC4, wave 18.5]`
- **OQ5 — Container — RESOLVED: Ogg Opus.** `.opus` / `audio/ogg` (broadest `decodeAudioData` support). No
CAF/WebM fallback — the lossless path already covers browsers that can't decode Ogg Opus (§3.4).
`[RESOLVED — §3.4]`
- **OQ6 — Transcode execution model — RESOLVED: background job after the file is available; uploader shows
a Post-Processing phase.** The source is stored and the track is playable losslessly **first**; the Opus
transcode (+ seek-index build) runs as a **background job** afterward; the CMS upload progress meter
gains a visible **Post-Processing** phase reflecting the transcode status (§3.1a). A freshly uploaded
track is lossless-only until its Opus finishes — accepted, and now made visible rather than implicit.
`[RESOLVED — §3.1a]`

**New open question raised by the seek-model design (§3.4a) — RESOLVED:**

- **OQ7 — Seek-index granularity — RESOLVED: 0.5 s (half-second) buckets (Daniel, 2026-06-23).** The seek
index trades precision against size: per-Ogg-page (most precise, largest) vs. coarser time buckets snapped
to page starts. Daniel set the bucket at **0.5 s** (finer than the ~1–2 s the spec had recommended):
~7,200 entries × 16 bytes ≈ **~115 KB** for a 1-hour mix — still a trivial one-time fetch. The decoder
fine-re-syncs within the bucket so seek *accuracy* is unaffected; at 0.5 s the in-bucket trim is
sub-half-second, tighter than before. The shape (precomputed exact granule→byte, page-snapped) is
unchanged. `[RESOLVED — §3.4a A]`

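
The size arithmetic in OQ7, and one possible way of thinning a per-page list down to page-snapped buckets, can be sketched as below. The 16-byte entry size and the 0.5 s bucket come from the spec; the `OggPage` and `SeekEntry` shapes, both function names, and the specific thinning rule are assumptions. The rule used here is greedy: from each kept page, advance to the last page that still starts within one bucket of it. When pages are shorter than a bucket this keeps consecutive entries at most one bucket apart; a page longer than the bucket is simply kept as the next entry.

```typescript
const GRANULE_RATE = 48000; // Opus granule clock, samples per second

// Hypothetical per-page record gathered while walking the encoded stream.
interface OggPage {
  startGranule: number; // sample position where the page's audio begins
  byteOffset: number;   // byte offset of the page start
}

interface SeekEntry {
  granule: number;
  byteOffset: number;
}

// ceil(duration / bucket) entries at a fixed size per entry.
function estimateIndexBytes(durationSeconds: number, bucketSeconds = 0.5, bytesPerEntry = 16): number {
  return Math.ceil(durationSeconds / bucketSeconds) * bytesPerEntry;
}

// Greedy thinning: every kept entry is a real page start (page-snapped).
function buildBucketedIndex(pages: OggPage[], bucketSeconds = 0.5): SeekEntry[] {
  if (pages.length === 0) return [];
  const bucket = Math.round(bucketSeconds * GRANULE_RATE);
  const index: SeekEntry[] = [{ granule: pages[0].startGranule, byteOffset: pages[0].byteOffset }];
  let i = 0;
  while (i < pages.length - 1) {
    const limit = pages[i].startGranule + bucket;
    let j = i + 1; // always advance at least one page so the loop terminates
    while (j + 1 < pages.length && pages[j + 1].startGranule <= limit) j++;
    index.push({ granule: pages[j].startGranule, byteOffset: pages[j].byteOffset });
    i = j;
  }
  return index;
}
```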
---

## 7. Acceptance criteria

- **AC1 (headline) — Dual-format delivery works.** A track can be streamed as either lossless WAV or
Ogg Opus 320 from the same `EntryKey`, selected per request; both play correctly through the bespoke
Web Audio graph.
- **AC2 — Opus is the low-data win.** The Opus artifact of a representative track is materially smaller
than its lossless source (target ~1/4–1/5 the bytes); a long mix's Opus transfer is correspondingly
smaller.
- **AC3 — Additive, non-breaking (C2).** The existing lossless WAV path is byte-for-byte unchanged; a
track with no Opus artifact still plays losslessly; `?format=opus` on such a track falls back to
lossless (no 404, no silence).
- **AC4 — Transcode at ingest as a background job, regenerable (C6, OQ6).** A new upload stores the source
and is playable losslessly **immediately**; the Opus artifact (+ seek-index/setup-header sidecar) is
produced by a **background job** afterward; a transcode failure does not block the upload or break
playback; a Backfill-Opus action (re)generates artifacts for existing tracks; replace-audio regenerates
the Opus artifact and its sidecar from the new source.
- **AC4a — Post-Processing phase is visible on the upload meter (OQ6, §3.1a).** After the byte-transfer and
server-persist phases, the CMS upload progress UI shows a **Post-Processing** phase reflecting the
background transcode (queued → transcoding → done/failed). The admin is never blocked waiting on the
transcode; the track is live before Post-Processing finishes.
- **AC5 — Opus seek via the existing `Range` path (C3).** Forward and backward seek in an Opus stream
resolve through the landed `Range: bytes=X-` primitive, with the offset coming from
`OpusFormatDecoder.calculateByteOffset`; no new seek *transport* mechanism is introduced.
- **AC5a — Seek-index + setup-header sidecar exists and is fetched once (§3.4a).** Every track with an Opus
artifact has a sidecar carrying the setup header (`OpusHead`/`OpusTags`) and the granule→byte seek index;
the client fetches and parses it once on track load (into `OpusSeekData`) before issuing any seek.
- **AC9 (the seek-accuracy criterion) — an Opus seek lands at the *correct* time, not approximately.**
Seeking to time `t` in an Opus stream resolves via the precomputed index and lands playback at `t`
(within the fine-resync tolerance — sub-half-second at the chosen 0.5 s bucket granularity), **measurably
accurate**, not a `byteRate`/interpolation estimate. Verifiable: seek to a known marker (e.g. a downbeat
at a known timestamp) and confirm playback resumes there, not seconds off. This holds **without** the
full PCM decoded in memory (composes with Phase 21).
- **AC6 — No format branches leak (C4).** The only Opus-specific code is `OpusFormatDecoder`, its
`OpusSeekData` (carrying the index), the one `createFormatDecoder` selection arm, the transcode processor
(+ index build), the sidecar artifact + its serving, and the delivery param resolution. The
format-agnostic player/scheduler code is unchanged.
- **AC7 — Capability-safe default (OQ2).** A browser that cannot decode Ogg Opus is served (or falls
back to) the lossless path and plays audio; no listener gets silence because of codec support.
- **AC8 — Windowing-ready (the Phase 21 handshake).** The `OpusFormatDecoder`'s **index-based** byte↔time
resolver is the one Phase 21's windowed refill calls; Opus playback must be windowable by the same
machinery, and a windowed refill that opens away from byte 0 still decodes (setup header from the
sidecar — UC9). Verified jointly when Phase 21 lands on top (see §8 / Phase 21 cross-ref).
- **AC10 — The Settings menu hosts the quality toggle and persists the choice (§4 / §4a).** The public app
bar opens a Settings menu containing a "Streaming quality" control (Low-data / Lossless, defaulting to
Low-data, capability-gated); changing it persists via the `streamQuality` cookie and is seeded at
prerender on the next visit (no flash). The menu shell is built so a future dark-mode entry can plug in
without restructuring.

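
AC2's size target can be sanity-checked with bitrate arithmetic. The sketch below assumes a 16-bit / 44.1 kHz stereo PCM source, which this section does not state; a higher-resolution source would make the ratio more favorable to Opus. It also ignores WAV and Ogg container overhead and treats 320 kbps as the average Opus rate. Both function names are hypothetical. Under those assumptions PCM runs at 1411.2 kbps, so 320 kbps Opus is about 0.227 of the bytes, and a 1 GB mix comes out near 227 MB, in line with the ~220 MB figure in UC1 and the 1/4 to 1/5 target here.

```typescript
// Uncompressed PCM bitrate in kilobits per second.
function pcmBitrateKbps(sampleRateHz: number, bitDepth: number, channels: number): number {
  return (sampleRateHz * bitDepth * channels) / 1000;
}

// Scale the source size by the ratio of the two bitrates.
// Assumption: opusKbps is the average rate actually achieved by the encoder.
function estimateOpusBytes(sourceBytes: number, sourceKbps: number, opusKbps = 320): number {
  return sourceBytes * (opusKbps / sourceKbps);
}

// Worked example under the 16-bit / 44.1 kHz stereo assumption.
const exampleSourceKbps = pcmBitrateKbps(44100, 16, 2);
const exampleOpusBytes = estimateOpusBytes(1e9, exampleSourceKbps);
console.log(`source ${exampleSourceKbps} kbps, Opus estimate ${(exampleOpusBytes / 1e6).toFixed(0)} MB`);
```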
---

## 8. Wave decomposition

Dependency shape: `18.1 → 18.2 → {18.3, 18.4}`, with `18.5` (backfill + e2e) and `18.6` (settings menu)
on top. **18.1 (the transcode + seek-index/setup-header derived artifacts) is the cold-start
prerequisite** — until those artifacts exist, nothing downstream has bytes to serve, decode, or seek
against. 18.3 (delivery param) and 18.4 (the decoder + index resolver) are largely parallel once 18.2
(storage/lookup) settles, but both need artifacts to test against. **18.6 (the Settings menu) is the only
wave with no audio-pipeline dependency** — it can proceed in parallel with the whole stack; it merely needs
the `?format=` mechanism (18.3) wired before the toggle has anything to drive.

- **18.1 — Ingest transcode + seek-index + setup-header (cold-start; load-bearing).** New
`OpusTranscodeService`/processor in `DeepDrftContent`, invoked post-store from
`UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`, **as a background job** (OQ6,
§3.1a); produces Ogg Opus fullband 320; **walks the encoded stream once to build the granule→byte seek
index and extract the `OpusHead`/`OpusTags` setup header** (§3.4a A/B); stores the Opus bytes **and** the
combined seek/setup **sidecar** as derived artifacts (S2 vault recommended). Failure-tolerant (C6).
**Independent of the delivery/decoder waves; can begin immediately.**
- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention (Opus bytes + sidecar)
and the server-side resolution "given `EntryKey` + format, return the right `AudioBinary` + content-type
(+ the sidecar on its own endpoint/path)," including the C2 fallback (no Opus → lossless). **Depends on
18.1** (artifacts must exist to resolve to).
- **18.3 — Delivery: format param + sidecar serving + proxy threading.** `?format=opus|lossless` on the
`DeepDrftAPI` track stream endpoint (resolves via 18.2), forwarded through the `DeepDrftPublic`
`TrackProxyController` (mirror the existing `offset` param threading), and the `Range` handler serving
the chosen artifact's bytes; **plus serving the seek/setup sidecar** (a `GET …/opus/seekdata`-style path,
proxied the same way). The player sends the format param via `TrackMediaClient`. **Depends on 18.2.**
Parallel-ok with 18.4.
- **18.4 — `OpusFormatDecoder` + the index-based seek resolver in the player stack.** New `IFormatDecoder`
implementation: Ogg-page-aligned `getAlignedSegmentSize` via `OggS` scan; `OpusHead`/`OpusTags` setup
carry in `wrapSegment`/the continuation path (sourced from the cached sidecar, §3.4a B); **`calculateByteOffset`
that binary-searches the precomputed seek index** (NOT interpolation), with an `OpusSeekData` accelerator
holding the parsed index + setup bytes; the **one-time sidecar fetch + parse** on track load. One new arm
in `AudioPlayer.createFormatDecoder` on `audio/ogg`/`audio/opus`. Capability detection for the lossless
fallback (§3.4, OQ2). **Depends on 18.2** (needs Opus bytes + sidecar). Parallel-ok with 18.3; they meet
at 18.5.
- **18.5 — Backfill + replace-audio + end-to-end validation (incl. seek accuracy).** The Backfill-Opus CMS
bulk action (third sibling to Generate-Profiles / Backfill-High-res), which (re)builds Opus bytes + the
sidecar for existing tracks; replace-audio Opus + sidecar regeneration; and the AC1–AC10 acceptance pass
— **including AC9 (an Opus seek lands at the correct time, not approximately)** and AC8's confirmation
that Opus is windowable (index resolver + sidecar setup header) so Phase 21 can build on it. **Depends on
18.1–18.4.**
- **18.6 — Public Settings menu + the quality toggle (the listener selection UX).** The new public-site
Settings-menu shell (§4a): an app-bar trigger + MudBlazor menu hosting a settings-item abstraction, the
`PublicSiteSettings`/`ListenerSettings` object, and the dark-mode-pattern persistence seam (`streamQuality`
cookie + a `DeepDrftPublic` prerender-read service + `PersistentComponentState` bridge + client cookie
service). The **quality toggle is its first occupant** (Low-data/Lossless, Opus default, capability-gated),
driving the `?format=` the player sends (needs 18.3). Built design-for-adaptability so dark mode can plug
in later without restructuring (not migrated now). **Depends on 18.3** (the toggle needs the format
mechanism); the menu *shell* can be built ahead of that. *Splittable* into "menu shell" + "toggle plugs
in" if Daniel wants the shell proven first — but small enough to land together (§4a).

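
For 18.4's page alignment, one way to find how many leading bytes of a buffer form whole Ogg pages is to walk the page headers. Note the swap: the spec describes an `OggS` scan, whereas this sketch walks headers from a buffer that already starts on a page boundary (which an index-resolved Range fetch provides), because a raw pattern scan can match payload bytes by accident. The header layout used (4-byte `OggS` capture pattern, segment count at byte 26, segment table from byte 27, body length equal to the sum of the table) is the standard Ogg page format. The names `alignedSegmentSize` and `isOggPageAt` are hypothetical, and the real `getAlignedSegmentSize` signature on `IFormatDecoder` may differ.

```typescript
const OGG_FIXED_HEADER_BYTES = 27; // bytes before the segment table

// True when the 4-byte capture pattern "OggS" sits at pos.
function isOggPageAt(buf: Uint8Array, pos: number): boolean {
  return (
    pos + 4 <= buf.length &&
    buf[pos] === 0x4f &&     // 'O'
    buf[pos + 1] === 0x67 && // 'g'
    buf[pos + 2] === 0x67 && // 'g'
    buf[pos + 3] === 0x53    // 'S'
  );
}

// Number of leading bytes made up of complete Ogg pages.
// Assumes buf starts exactly on a page boundary; returns 0 if it does not.
function alignedSegmentSize(buf: Uint8Array): number {
  let pos = 0;
  while (isOggPageAt(buf, pos)) {
    if (pos + OGG_FIXED_HEADER_BYTES > buf.length) break; // header cut off
    const segmentCount = buf[pos + 26];
    const tableEnd = pos + OGG_FIXED_HEADER_BYTES + segmentCount;
    if (tableEnd > buf.length) break; // segment table cut off
    let bodySize = 0;
    for (let i = pos + OGG_FIXED_HEADER_BYTES; i < tableEnd; i++) bodySize += buf[i];
    const pageEnd = tableEnd + bodySize;
    if (pageEnd > buf.length) break; // body cut off: stop before this page
    pos = pageEnd;
  }
  return pos;
}
```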
---

## 9. Cross-references (read before implementing)

- `CONTEXT.md §5` "Non-WAV formats" — the deferred intent this phase realizes (now concrete: derived
Opus low-data path, not generic format support).
- `PLAN.md` Phase 21 / `product-notes/phase-21-windowed-streaming-buffer.md` — **sequenced AFTER this
phase.** Phase 21's C5 invariant ("WAV-only shipping target; must not foreclose MP3/FLAC") is now
driven by Opus's VBR/paged seek math; Phase 21 OQ5 (adopt MSE) is resolved **NO** — the bespoke
graph stays (the same C1 decision recorded here). Windowing a VBR/Opus stream uses
`OpusFormatDecoder.calculateByteOffset`'s **accurate index-based mapping** (§3.4a — *not* the earlier
"approximate page-interpolation"; that language in the Phase 21 doc is corrected). Phase 21's windowed
refill calls the **same** index resolver an explicit seek does (§3.4a D), and a window that opens away
from byte 0 still decodes via the sidecar setup header (UC9).
- `PLAN.md` Phase 4 (landed) / `COMPLETED.md` — the HTTP `Range: bytes=X-` primitive Opus seek reuses.
- `PLAN.md` Phase 1.5 (gapless) / 1.6 (track-skip on error) / 1.7 (Safari) — 1.5's "encoder
padding/priming" caveat applies to Opus (it has pre-skip samples in `OpusHead`); 1.6's
byte-scan-to-next-frame is the Ogg-page-sync analogue; 1.7's Safari floor intersects §3.4's Ogg-Opus
`decodeAudioData` support (Safari < 18.4).
- `PLAN.md` Phase 12 / `product-notes/phase-12-waveform-visualizer-generalization.md` — the
`WaveformProfileService` derived-artifact-at-ingest + regenerate pattern this transcode mirrors
(compute on upload, regenerate via CMS action / endpoint, its own `track-waveforms` vault → the S2
precedent).
- `PLAN.md` Phase 9 — defines the `Mix` medium (single long track), the canonical low-data case.
- `PLAN.md` Phase 16 — play/share telemetry keys on one track identity; the §3.6 road-not-taken
(one-row-per-format) would have fractured this — kept to one identity, two artifacts.
- `DeepDrftContent/Processors/AudioProcessor.cs` + `AudioProcessorRouter` + `DeepDrftContent/CLAUDE.md`
— the existing format-router and the `WaveformProfileService` derived-artifact seam; 18.1 lives here.
- `DeepDrftPublic/Interop/audio/IFormatDecoder.ts` — the strategy interface `OpusFormatDecoder`
implements; `FlacFormatDecoder.ts` is the nearest prior art (setup-bytes carry + frame-sync scan).
- `DeepDrftPublic/Interop/audio/AudioPlayer.ts` (`createFormatDecoder`, lines 117–125) — the decoder
registry gaining the Opus arm.
- `DeepDrftPublic.Client/Clients/TrackMediaClient.cs` + `DeepDrftPublic/Controllers/TrackProxyController.cs`
— the media fetch + proxy that thread the new `?format=` param (mirroring `offset`), and proxy the new
seek/setup sidecar fetch.
- Root `CLAUDE.md` "Theming and dark mode" + `DarkModeService` (in `DeepDrftPublic`) + `DarkModeSettings`
(`DeepDrftPublic.Client.Common`) — the cookie → prerender-read → `PersistentComponentState` → client
cookie-service seam the **streaming-quality preference** (§4a) mirrors exactly; the eventual dark-mode-
into-the-Settings-menu migration consolidates two copies of this seam.
- `DeepDrftPublic.Client` `NowPlayingStats.razor` / `StatsClient` — the `PersistentComponentState`
prerender-bridge precedent (prerender fetch carried into WASM without a re-fetch/flash), the pattern the
quality preference's bridge follows; see the `tracksview-persistent-state-seam` auto-memory.

## Sources

- Ogg Opus support in `decodeAudioData`: Chrome/Firefox long-standing; Safari added Ogg-Opus at 18.4
(macOS 15.4 / iOS 18.4, March 2025) — prior Safari decoded Opus only in CAF.
https://chromestatus.com/feature/5649634416394240 ;
https://www.testmuai.com/learning-hub/opus-audio-codec-browser-support/