docs(plan): Phase 18 OQ resolutions + VBR-safe accurate Opus seek model
This commit is contained in:
@@ -1,8 +1,17 @@
|
||||
# Phase 18 — Opus Low-Data Streaming (dual-format lossless + Opus delivery)
|
||||
|
||||
Product spec. Status: **design / framing — implementation-ready pending Daniel's open-question calls.**
|
||||
Product spec. Status: **design / framing — open questions RESOLVED (Daniel, 2026-06-23); implementation-ready.**
|
||||
Author: product-designer. Date: 2026-06-23. **No code has been written by this doc.**
|
||||
|
||||
> **Resolution pass (Daniel, 2026-06-23).** OQ1–OQ6 are resolved (see §6 — each marked RESOLVED, kept
|
||||
> visible per file convention). Two resolutions reshaped the spec materially: (a) the listener quality
|
||||
> selection lives inside a **new public-site Settings menu surface** (not a bare app-bar control) — §4 +
|
||||
> §4a; and (b) Daniel rejected the "approximate page-interpolation" seek hand-wave outright — **VBR-safe
|
||||
> *accurate* seeking is now a first-class part of the architecture** (a precomputed seek-index artifact +
|
||||
> a separately-available setup header). §3.4 is rewritten and a dedicated seek-model section (§3.4a)
|
||||
> added. The Phase 21 cross-reference is updated to read "accurate index-based mapping," not
|
||||
> "approximate."
|
||||
|
||||
This phase is the concrete realization of the long-deferred **"Non-WAV formats"** intent
|
||||
(`CONTEXT.md §5`, the "1.2" the streaming-feature items reference). It supersedes the abstract "a
|
||||
processor per format + a decoder strategy" framing with a specific, Daniel-directed product: **two
|
||||
@@ -13,12 +22,17 @@ Surfaces (named precisely):
|
||||
|
||||
- **Ingest / preprocessing:** `DeepDrftContent` (`AudioProcessor` / `AudioProcessorRouter` /
|
||||
`TrackContentService` / `WaveformProfileService`) + `DeepDrftAPI` (upload/persist —
|
||||
`UnifiedTrackService.UploadAsync`, replace-audio) + `DeepDrftManager` (CMS upload form, only if a
|
||||
per-upload control is wanted — see OQ4).
|
||||
- **Delivery / decode:** `DeepDrftAPI` (the track stream endpoint + `Range` handler) +
|
||||
`DeepDrftPublic` proxy (`TrackProxyController`) + `DeepDrftPublic.Client` player stack
|
||||
(`StreamingAudioPlayerService`, `TrackMediaClient`) + `DeepDrftPublic/Interop/audio` TS decoders
|
||||
(`AudioPlayer.createFormatDecoder` registry, a new `OpusFormatDecoder`).
|
||||
`UnifiedTrackService.UploadAsync`, replace-audio) + `DeepDrftManager` (CMS upload form — the
|
||||
**Post-Processing phase** on the existing upload progress meter, §3.1a).
|
||||
- **Delivery / decode:** `DeepDrftAPI` (the track stream endpoint + `Range` handler + the new
|
||||
**seek-index** and **setup-header** sidecar endpoints, §3.4a) + `DeepDrftPublic` proxy
|
||||
(`TrackProxyController`) + `DeepDrftPublic.Client` player stack (`StreamingAudioPlayerService`,
|
||||
`TrackMediaClient`) + `DeepDrftPublic/Interop/audio` TS decoders (`AudioPlayer.createFormatDecoder`
|
||||
registry, a new `OpusFormatDecoder`).
|
||||
- **Listener settings (NEW surface):** `DeepDrftPublic.Client` — a public-site **Settings menu** (app-bar
|
||||
menu/popover) hosting the quality toggle as its first occupant, with a dark-mode-pattern persistence
|
||||
seam (cookie → settings object → `PersistentComponentState` → client cookie service). §4a. The
|
||||
prerender-cookie read lives in `DeepDrftPublic` (alongside `DarkModeService`).
|
||||
|
||||
**Sequencing headline: Phase 18 comes BEFORE Phase 21 (Windowed Streaming Buffer).** Phase 21's
|
||||
windowing must work across both formats — its C5 invariant already anticipated this ("must not
|
||||
@@ -165,8 +179,44 @@ managed binding, or a libopus P/Invoke). The spec fixes the *artifact* (Ogg Opus
|
||||
and the *seam* (a derived artifact produced post-store, regenerable, failure-tolerant), not the tool.
|
||||
Note a real operational constraint to flag for implementation: transcoding a 1 GB WAV is **CPU- and
|
||||
time-expensive** and must not block the upload response — it wants the same off-the-hot-path treatment
|
||||
the upload body staging already gets (`Upload:StagingPath`), likely a background/queued step. This is
|
||||
the single biggest implementation risk and is called out as such.
|
||||
the upload body staging already gets (`Upload:StagingPath`). This is the single biggest implementation
|
||||
risk and is called out as such. The execution model is now **decided** (OQ6): **the source is stored and
|
||||
the track is playable (lossless) first, then the Opus transcode runs as a background job** — see §3.1a
|
||||
for the user-visible consequence on the upload UI.
|
||||
|
||||
### 3.1a Transcode execution model + the Post-Processing upload phase (RESOLVED — OQ6)
|
||||
|
||||
**Execution model (Daniel, 2026-06-23): background process *after* the file is available.** The upload
|
||||
flow is now two distinct server-side stages with a hard ordering:
|
||||
|
||||
1. **Transfer + store + persist (existing, synchronous).** The WAV body streams in (the landed
|
||||
`ProgressStreamContent` two-phase cancellation), the source is stored in the vault, the `TrackEntity`
|
||||
is persisted, the waveform datums are computed. At the end of this stage **the track is fully playable
|
||||
losslessly** — nothing about Opus gates a successful upload.
|
||||
2. **Opus transcode (NEW, background, after stage 1 completes).** A queued/background job reads the
|
||||
stored source, transcodes to Ogg Opus 320, builds the **seek index** and extracts the **setup header**
|
||||
(§3.4a), and stores all three derived artifacts. Until it finishes, `?format=opus` for that track
|
||||
falls back to lossless (C2). On failure the track stays lossless-only and is eligible for Backfill-Opus
|
||||
(C6).
|
||||
|
||||
**The upload progress meter gains a visible Post-Processing phase.** The CMS upload forms
|
||||
(`BatchUpload.razor` / `BatchEdit.razor`) already render a progress meter driven by `ProgressStreamContent`
|
||||
(byte-transfer progress) and the two-phase cancellation (idle window during transfer, response-wait budget
|
||||
after the body completes). The transcode is a **third visible phase** appended to that meter — after the
|
||||
existing "uploading bytes" and "server is persisting" phases, a **Post-Processing** phase reflects the
|
||||
background transcode's status (queued → transcoding → done / failed). This is an *addition* to the
|
||||
existing meter, not a new UI.
|
||||
|
||||
- The admin sees: bytes transfer → server persists (track now exists + plays lossless) → **Post-Processing**
|
||||
(Opus being derived). The form may complete/return the admin to the catalogue after stage 1 (the track
|
||||
is live); the Post-Processing phase can continue to report against that track in the browse/release view
|
||||
(the Opus waveform/profile columns on `Releases.razor` already poll-and-show per-track derived-artifact
|
||||
status — Post-Processing status fits the same affordance family).
|
||||
- **How status reaches the UI is staff-engineer's call** (poll the track's Opus-artifact presence, an SSE/
|
||||
long-poll job channel, or a status field on the track read). The spec fixes that the phase is *visible*
|
||||
and *non-blocking* — the admin is never made to wait on the transcode to consider the upload done.
|
||||
- This composes with the **always-on** decision (OQ4): every upload triggers the background transcode;
|
||||
there is no per-upload opt-out, so the Post-Processing phase always appears.
|
||||
|
||||
### 3.2 Where the Opus artifact is stored (two options)
|
||||
|
||||
@@ -206,30 +256,36 @@ Server-side fallback rule (C2): if `format=opus` is requested but no Opus artifa
|
||||
track (not yet transcoded / backfilled), the endpoint **falls back to lossless** rather than 404ing —
|
||||
Opus is additive, so its absence degrades to "you get the lossless one," never to "no audio."
|
||||
|
||||
### 3.4 The Opus decoder + seek math (the genuinely new decode work)
|
||||
### 3.4 The Opus decoder (the genuinely new decode work)
|
||||
|
||||
`OpusFormatDecoder implements IFormatDecoder` is the new code on the delivery side. Two things make it
|
||||
harder than the WAV decoder and need to be flagged:
|
||||
`OpusFormatDecoder implements IFormatDecoder` is the new code on the delivery side. **Ogg Opus is a
|
||||
containerized, paged format — not raw-frame-sliceable** the way WAV PCM is. WAV's `wrapSegment` prepends a
|
||||
44-byte PCM header to any PCM-aligned byte run; the current model assumes you can wrap an arbitrary aligned
|
||||
raw-audio slice and hand it to `decodeAudioData`. Ogg Opus is page-structured (Ogg pages carrying Opus
|
||||
packets, plus mandatory `OpusHead`/`OpusTags` **setup pages** at the very start). A mid-stream byte slice
|
||||
is **not** independently decodable: it needs (1) the setup header prepended, and (2) to begin on an Ogg
|
||||
**page boundary**. So:
|
||||
|
||||
- **Containerized, paged format — not raw-frame-sliceable.** WAV's `wrapSegment` prepends a 44-byte
|
||||
PCM header to any PCM-aligned byte run; the current model assumes you can wrap an arbitrary aligned
|
||||
raw-audio slice and hand it to `decodeAudioData`. **Ogg Opus is page-structured** (Ogg pages
|
||||
carrying Opus packets, plus mandatory `OpusHead`/`OpusTags` setup pages at the start). A mid-stream
|
||||
byte slice is not independently decodable without the setup header and without landing on Ogg page
|
||||
boundaries. So `OpusFormatDecoder`'s `getAlignedSegmentSize` must align to **Ogg page boundaries**
|
||||
(scan for the `OggS` capture pattern — analogous to FLAC's frame-sync scan, for which the
|
||||
`IFormatDecoder` interface already passes `rawData` to `getAlignedSegmentSize`), and
|
||||
`wrapSegment`/the continuation path must carry the `OpusHead` setup (analogous to FLAC's
|
||||
`streamInfoBytes` in `FlacSeekData`). **The `IFormatDecoder` abstraction already has the shape for
|
||||
this** — a format-specific `seekData` accelerator and a setup-bytes carry — because FLAC needed the
|
||||
same kind of thing. A new `OpusSeekData` variant joins `Mp3VbrSeekData | FlacSeekData`.
|
||||
- **VBR byte↔time mapping is approximate (the Phase 21 C5 case, concretely).** Opus at "320 kbps" is
|
||||
effectively VBR; there is no exact `byteRate` for offset math the way CBR WAV has. Seek-by-offset
|
||||
uses an **approximate** mapping (granule-position/Ogg-page interpolation, the Opus analogue of MP3's
|
||||
Xing TOC or FLAC's SEEKTABLE). `calculateByteOffset` returns a best-effort page-aligned offset; the
|
||||
decoder then re-syncs to the next Ogg page. This is exactly the "VBR formats: the mapping is
|
||||
approximate" case Phase 21's C5 invariant anticipated — **Opus is the format that makes that
|
||||
invariant load-bearing rather than hypothetical.**
|
||||
- `OpusFormatDecoder.getAlignedSegmentSize` aligns to **Ogg page boundaries** — scan for the `OggS`
|
||||
capture pattern (analogous to FLAC's frame-sync scan; the `IFormatDecoder` interface already passes
|
||||
`rawData` to `getAlignedSegmentSize` for exactly this reason).
|
||||
- `wrapSegment` / the continuation path **prepends the `OpusHead`/`OpusTags` setup bytes** to a mid-stream
|
||||
page run before handing it to `decodeAudioData` (analogous to FLAC's `streamInfoBytes` carry in
|
||||
`FlacSeekData`). The setup bytes come from the **setup-header mechanism** (§3.4a), not from re-reading
|
||||
the stream start.
|
||||
- A new `OpusSeekData` variant joins `Mp3VbrSeekData | FlacSeekData` in the `seekData` accelerator slot —
|
||||
but for Opus it carries the **accurate seek index** (§3.4a), not a heuristic TOC.
|
||||
|
||||
**The `IFormatDecoder` abstraction already has the shape for both needs** — a format-specific `seekData`
|
||||
accelerator and a setup-bytes carry — because FLAC needed the same kind of thing. The genuinely new part
|
||||
is **where the seek index and setup header come from**, which §3.4a designs.
|
||||
|
||||
> **Seek is NOT approximate for Opus (Daniel, 2026-06-23 — supersedes the earlier hand-wave).** An earlier
|
||||
> draft of this section proposed "granule-position/Ogg-page interpolation" — a best-effort approximate
|
||||
> offset, the Opus analogue of MP3's Xing TOC. **That is rejected.** Daniel: *"Killing seeking for
|
||||
> decoding is unacceptable… Raw bytes offset for seeking is no longer adequate due to VBR. We need an
|
||||
> accurate transfer function for seek time → true file byte offset."* Opus seeking is **accurate**, backed
|
||||
> by a precomputed index built at transcode time. See §3.4a.
|
||||
|
||||
**Browser decode-support constraint (real, must be designed around).** The bespoke graph decodes
|
||||
segments via `AudioContext.decodeAudioData`. Ogg-Opus support in `decodeAudioData` is long-standing in
|
||||
@@ -237,21 +293,145 @@ Chrome and Firefox but arrived in **Safari only at 18.4 (macOS 15.4 / iOS 18.4,
|
||||
Safari decodes Opus only in a CAF container, not Ogg. iOS Safari is a primary music-listening surface,
|
||||
so this is not a corner case. Implications: (1) the **lossless WAV path is the universal fallback** for
|
||||
listeners whose browser can't decode Ogg Opus — which C2's additive design already provides for free;
|
||||
(2) format-default policy (OQ2) should consider capability detection — don't hand Ogg Opus to a Safari
|
||||
that can't decode it. This intersects Phase 1.7 (Safari compatibility) and is flagged there too.
|
||||
(2) the format default is **capability-gated** (OQ2, RESOLVED) — don't hand Ogg Opus to a Safari that
|
||||
can't decode it; detect support (a probe `decodeAudioData` on a tiny Opus blob, or a UA/version gate) and
|
||||
fall back to lossless. This intersects Phase 1.7 (Safari compatibility) and is flagged there too.
|
||||
([Browser support: caniuse / WebKit 18.4 release notes — see Sources.])
|
||||
|
||||
### 3.4a VBR-safe accurate seeking (the seek-index artifact + the setup-header mechanism)
|
||||
|
||||
This is the architectural core of the Opus delivery path, and it must compose with **Phase 21 windowed
|
||||
refill** (where most of the stream is *not* in memory). The requirement, decomposed from Daniel's
|
||||
direction:
|
||||
|
||||
1. Seeking must be preserved for Opus **without** having the full PCM decoded in memory.
|
||||
2. Raw byte-offset seek is inadequate — a VBR Opus stream has **no linear time↔byte relationship**, so
|
||||
`byteRate` math and even rough page interpolation are not accurate enough.
|
||||
3. We need an **accurate transfer function: seek-time → true file byte offset.**
|
||||
4. The decode setup header must be **available separately** (or cached before seeking past it), because a
|
||||
mid-stream slice is undecodable without `OpusHead`/`OpusTags`.
|
||||
|
||||
**The key insight: the one moment we already walk the entire encoded stream is the transcode.** That is
|
||||
precisely when an accurate index can be built for free. We never have to guess at delivery time — we read
|
||||
the answer out of a precomputed artifact.
|
||||
|
||||
#### A. The seek-index artifact (the accurate transfer function)
|
||||
|
||||
At transcode time, after the Opus bytes are produced, **walk the encoded Ogg stream once and record, for
|
||||
each Ogg page (or coarser bucket), the page's `granulepos` (a 48 kHz sample count → time) paired with its
|
||||
**byte offset** in the file.** That granule→byte table *is* the exact transfer function. This is the Opus
|
||||
analogue of FLAC's `SEEKTABLE` / MP3's Xing TOC — but **precomputed and exact**, not derived by
|
||||
interpolation guessing. Ogg granule positions are authoritative sample counts, so the mapping is true, not
|
||||
estimated.
|
||||
|
||||
- **What it contains.** An ordered list of `(timeSeconds | granulepos, byteOffset)` entries, plus the
|
||||
total duration and total byte length (for clamping a seek to range). A binary little-endian array of
|
||||
fixed-width records is the natural shape (e.g. a `uint64 granulepos` + `uint64 byteOffset` per entry);
|
||||
the exact encoding is staff-engineer's, but it should be a **compact binary blob**, fetched once and
|
||||
parsed into a typed array client-side.
|
||||
- **Granularity vs. size (the one real tuning knob).** One entry per Ogg page is the most precise but
|
||||
largest; an Ogg page is typically a few KB of audio (~tens of ms to a few hundred ms), so a 1-hour mix
|
||||
could be tens of thousands of pages. Recommend a **coarser bucket: one index entry per ~1–2 seconds of
|
||||
audio** (snap each bucket boundary to the *nearest enclosing page start*, so every indexed offset is
|
||||
still an exact page boundary). At ~1 s granularity a 1-hour mix is ~3,600 entries × 16 bytes ≈ **~58 KB**
|
||||
— a trivial one-time fetch, and 1 s seek resolution is more than fine (the decoder re-syncs to the exact
|
||||
page within the bucket anyway — see the client flow). **Per-page precision is the fallback if 1 s buckets
|
||||
ever prove too coarse**, at a larger index. The number is staff-engineer's call; the *shape* (precomputed
|
||||
exact granule→byte, bucketed, snapped to page starts) is fixed.
|
||||
- **Sidecar, not embedded (recommended).** Store the index as a **third derived artifact** alongside the
|
||||
Opus bytes and the waveform datum — the same "derived artifacts get their own vault" pattern this phase
|
||||
already uses (S2 / `track-opus`; the `track-waveforms` precedent). Keep it a separate vault resource
|
||||
(e.g. `{entryKey}.seekidx` in a `track-opus` vault, or its own `track-opus-index` vault) rather than
|
||||
embedding it in the Ogg stream. *Why sidecar:* it is fetched **once, up front** (small, cacheable),
|
||||
independent of the audio byte stream; embedding it in the Ogg would force the client to read into the
|
||||
stream to find it, defeating the "resolve the offset *before* the Range fetch" flow. *Road not taken —
|
||||
derive the index lazily on first seek by scanning server-side:* rejected, because it re-walks the stream
|
||||
at request time (the cost we avoid by computing at transcode) and gives nothing the precomputed sidecar
|
||||
doesn't.
|
||||
|
||||
#### B. The setup-header mechanism (decodability of any mid-stream slice)
|
||||
|
||||
Any post-seek slice needs `OpusHead` + `OpusTags` prepended to decode. Two ways to make those bytes
|
||||
available to the client:
|
||||
|
||||
- **B-a — Client-side caching of the leading setup pages on first read (recommended).** On first play, the
|
||||
stream already begins at byte 0, so the client *already receives* the `OpusHead`/`OpusTags` pages as the
|
||||
opening bytes. `OpusFormatDecoder.tryParseHeader` captures and **retains** those setup bytes (exactly as
|
||||
`WavFormatDecoder` retains the parsed WAV header for `reinitializeForRangeContinuation` today, and FLAC
|
||||
retains `streamInfoBytes`). Every subsequent post-seek continuation prepends the cached setup bytes. *No
|
||||
new endpoint;* it reuses the header-retention discipline already in the codebase.
|
||||
- **B-b — A dedicated setup-header sidecar endpoint** (`GET api/track/{id}/opus/header` → just the
|
||||
`OpusHead`/`OpusTags` bytes, also derivable at transcode time and stored as a tiny artifact). *Pro:* a
|
||||
seek can be served even if the listener seeks **before** the stream start has been read (e.g. a deep-link
|
||||
that begins mid-track, or a Phase 21 window that opens away from byte 0). *Con:* one more endpoint +
|
||||
artifact.
|
||||
|
||||
**Recommendation: B-a as the primary, B-b as a cheap insurance artifact.** B-a covers the overwhelming
|
||||
common case (play-then-seek) with **zero new surface** — it is the WAV-header-retention pattern applied to
|
||||
Opus. But Phase 21 windowing and deep-links can legitimately open a window that never read byte 0, so the
|
||||
setup header should **also** be derivable on demand. Cheapest reconciliation: **extract the setup bytes at
|
||||
transcode time and store them as a tiny sidecar artifact** (they are a few hundred bytes), and expose them
|
||||
**either** as a small endpoint **or** simply prepend them to the seek-index sidecar's header region so the
|
||||
single up-front index fetch *also* delivers the setup bytes. The latter folds B-b into the B-a fetch: **the
|
||||
client's one up-front sidecar fetch returns both the seek index and the setup header**, so it always has
|
||||
both before it ever issues a seek — and never needs byte 0 to have been read. **Recommended concrete
|
||||
design: one sidecar per track = `[setup-header bytes][seek-index table]`, fetched once on track load,
|
||||
parsed into `OpusSeekData`.** This is the cleanest: one new artifact, one new fetch, both needs met.
|
||||
|
||||
#### C. The client-side seek flow, end to end
|
||||
|
||||
With the sidecar (`OpusSeekData` = setup header + granule→byte index) fetched and parsed at track load:
|
||||
|
||||
1. **Resolve time → byte offset (accurate).** Listener seeks to `t` seconds. `OpusFormatDecoder.calculateByteOffset(t)`
|
||||
does a binary search in the index for the largest entry with `time ≤ t`, returns its exact (page-start)
|
||||
`byteOffset`. **No interpolation, no `byteRate` math.** (For WAV this method stays the exact CBR
|
||||
calculation it is today — the seam is identical; only the Opus implementation reads an index.)
|
||||
2. **Range fetch from the offset.** Issue `GET api/track/{id}?format=opus` with `Range: bytes={byteOffset}-`
|
||||
— the **landed Phase 4 Range primitive, unchanged**. Server streams raw Opus bytes from that exact page
|
||||
boundary (`206 Partial Content`).
|
||||
3. **Prepend the cached setup header + decode.** The continuation path (the Opus analogue of
|
||||
`StreamDecoder.reinitializeForRangeContinuation`) prepends the retained/sidecar `OpusHead`/`OpusTags`
|
||||
bytes to the incoming page run, then feeds it to `decodeAudioData`. Because the index offset is an exact
|
||||
page start, the stream is immediately Ogg-sync-aligned.
|
||||
4. **Fine re-sync within the bucket.** The granule of the first decoded page tells the decoder the *exact*
|
||||
time it landed at (≤ the bucket granularity ahead of `t`); the scheduler trims/positions to land
|
||||
playback at `t` precisely. With ~1 s buckets the trim is sub-second; with per-page granularity it is
|
||||
near-zero. **Either way the listener lands at the correct time, not approximately** (AC9).
|
||||
|
||||
#### D. Composition with Phase 21 windowed refill
|
||||
|
||||
Phase 21's windowed refill controller resolves "I need bytes for playback position `P`" → a byte offset →
|
||||
a Range fetch. **It calls the *same* `OpusFormatDecoder.calculateByteOffset` (the index-based resolver)
|
||||
for Opus** that an explicit seek does — windowed refill is just a seek the listener didn't initiate. So the
|
||||
seek index serves both: explicit seeks and the window's low-water refills both resolve through the index,
|
||||
and both prepend the cached setup header. This is why §3.4a is in **Phase 18** (where the transcode that
|
||||
builds the index lives), and Phase 21 *consumes* it. The Phase 21 spec's "approximate mapping" language for
|
||||
Opus is now wrong and is corrected to **"accurate index-based mapping."**
|
||||
|
||||
#### E. Reuse vs. extend (the seam discipline)
|
||||
|
||||
- **Reused verbatim:** the Phase 4 `Range: bytes=X-` → 206 primitive (client → proxy → API); the
|
||||
`IFormatDecoder.calculateByteOffset` seam; the header-retention/continuation discipline
|
||||
(`reinitializeForRangeContinuation`'s Opus analogue); the derived-artifact-in-its-own-vault pattern
|
||||
(`track-waveforms` → `track-opus`); the derive-at-transcode-regenerate-on-backfill lifecycle.
|
||||
- **Extended (new):** the seek-index + setup-header **sidecar artifact** (built at transcode, stored
|
||||
beside the Opus bytes); the one-time **sidecar fetch** on track load (parsed into `OpusSeekData`); the
|
||||
index **binary-search resolver** inside `OpusFormatDecoder`. Three additions, all leaf-level — no change
|
||||
to the Range mechanism, the proxy, or the format-agnostic player.
|
||||
|
||||
### 3.5 The three candidate directions (shape-level)
|
||||
|
||||
Per file convention the alternatives are recorded; the recommendation follows.
|
||||
|
||||
**Direction A — Derived Opus artifact at ingest + format param on delivery (recommended).** What §3.1
|
||||
–3.4 describe: transcode to Opus 320 post-store, store as a derived artifact (S2 vault), serve via a
|
||||
`?format=` param resolved server-side to bytes + content-type, decode via a new `OpusFormatDecoder` in
|
||||
the existing registry. *Why recommended:* additive (C2), reuses every existing seam (the processor
|
||||
orchestration, the waveform-datum derived-artifact pattern, the `Range` path, the decoder registry),
|
||||
and the only genuinely new code is one transcode step + one decoder. Two derived artifacts per track,
|
||||
both regenerable.
|
||||
–3.4a describe: transcode to Opus 320 post-store as a **background job** (OQ6), store as derived artifacts
|
||||
(S2 vault) — the Opus bytes **plus the seek-index/setup-header sidecar** (§3.4a) — serve via a `?format=`
|
||||
param resolved server-side to bytes + content-type, decode via a new `OpusFormatDecoder` in the existing
|
||||
registry, **seek accurately via the precomputed index**. *Why recommended:* additive (C2), reuses every
|
||||
existing seam (the processor orchestration, the waveform-datum derived-artifact pattern, the `Range` path,
|
||||
the decoder registry, the header-retention discipline), and the only genuinely new code is one transcode
|
||||
step (+ index build) + one decoder (+ index resolver). **Three** derived artifacts per track (Opus bytes,
|
||||
seek sidecar, and the existing waveform datum), all regenerable.
|
||||
|
||||
**Direction B — On-the-fly transcode at delivery (no stored Opus artifact).** Transcode WAV→Opus per
|
||||
request in the stream endpoint, streaming the Opus out as it encodes. *Why not (default):* moves
|
||||
@@ -274,10 +454,13 @@ recorded only because "just store Opus" is the tempting simplification and the s
|
||||
extension); the artifact is a new derived vault resource (the `track-waveforms` precedent is exactly
|
||||
this). Phase 18 adds **three new leaf implementations** and **zero changes to existing format code**
|
||||
— the strongest possible OCP signal that the seams were designed right.
|
||||
- **SRP, preserved.** Transcoding is a content-domain processor concern (`DeepDrftContent`); delivery
|
||||
selection is a thin endpoint concern (`DeepDrftAPI` resolves a param to an artifact); decode is the
|
||||
`OpusFormatDecoder`'s concern; byte↔time math stays inside that decoder via `calculateByteOffset`.
|
||||
No responsibility crosses a boundary it doesn't already own.
|
||||
- **SRP, preserved.** Transcoding **and the seek-index build** are content-domain processor concerns
|
||||
(`DeepDrftContent`); delivery selection is a thin endpoint concern (`DeepDrftAPI` resolves a param to an
|
||||
artifact, and serves the sidecar); decode is the `OpusFormatDecoder`'s concern; byte↔time math stays
|
||||
inside that decoder via `calculateByteOffset` (now reading the index, not interpolating). No
|
||||
responsibility crosses a boundary it doesn't already own. The seek index is built **once, where the
|
||||
stream is already walked** (transcode) — the natural home for an exact transfer function, never
|
||||
recomputed at request time.
|
||||
- **DIP / "one source, multiple views."** One `TrackEntity`/`EntryKey` is the single source; "lossless
|
||||
WAV" and "low-data Opus" are two *views* (renderings) of it, diverging only at the delivery/decode
|
||||
layer — the same discipline the dark-mode and track-browse surfaces follow.
|
||||
@@ -289,13 +472,77 @@ recorded only because "just store Opus" is the tempting simplification and the s
|
||||
|
||||
---
|
||||
|
||||
## 4. Format selection — the product surface (deliberately under-specified; see OQ1/OQ2)
|
||||
## 4. Format selection — the product surface (RESOLVED — global, via a Settings menu)
|
||||
|
||||
Daniel has **not** specified the selection UX. What is settled by his direction: there are two formats,
|
||||
Opus is the bandwidth-friendly **default-candidate**, lossless is the kept option. What is open: how a
|
||||
listener expresses the choice, whether it is remembered, and whether the default is global or adapts.
|
||||
These are genuine product calls — see §6. The *mechanism* (a `?format=` param the player sends; §3.3)
|
||||
supports any of the policies, so the policy can be decided after the substrate lands.
|
||||
**Resolved (Daniel, 2026-06-23):** the listener's quality choice is **global** (one session/visitor-level
|
||||
"streaming quality" preference, not per-track), Opus is the **default** (capability-gated), and the choice
|
||||
is **remembered** following the dark-mode persistence pattern. Crucially: *"Global is perfect, but we need
|
||||
a menu system for settings, don't just slap the quality control directly in the app bar."* So the toggle
|
||||
does **not** sit bare in the app bar — it lives inside a proper **public-site Settings menu** (§4a), of
|
||||
which it is the **first occupant**.
|
||||
|
||||
- **What the listener sees.** A Settings affordance in the public app bar opens a Settings menu; inside it,
|
||||
a "Streaming quality" control with two options — **Low-data (Opus)** / **Lossless (WAV)** — defaulting to
|
||||
Low-data. Picking lossless flips the global preference; the player sends the matching `?format=` on
|
||||
subsequent stream requests (§3.3). On a browser that can't decode Ogg Opus, the control is shown but the
|
||||
effective stream is lossless (capability gate, §3.4 / OQ2) — surface this honestly rather than letting
|
||||
the listener pick a format that silently can't play.
|
||||
- **Default before any choice:** Opus, capability-gated (OQ2 RESOLVED). A first-time visitor on a capable
|
||||
browser streams Opus; on an incapable browser, lossless.
|
||||
- **Persistence:** mirror the dark-mode seam exactly (OQ3 RESOLVED) — see §4a.
|
||||
|
||||
### 4a. The Settings menu surface (NEW — scoping + the dark-mode persistence pattern)
|
||||
|
||||
Daniel asked for a **menu system for settings**, not a control bolted onto the app bar, and noted the
|
||||
existing **dark-mode toggle** is a natural future tenant of the same menu (design for adaptability — build
|
||||
the menu so dark mode *could* move into it later, but **do not force that migration now**).
|
||||
|
||||
**Scoping recommendation: a small sub-track *within* Phase 18 (wave 18.6), not its own phase.** Reasoning:
|
||||
|
||||
- The menu's only **required** occupant right now is the quality toggle, which Phase 18 owns end to end —
|
||||
splitting the shell into a separate phase would create a phase whose sole deliverable is an empty menu
|
||||
waiting for Phase 18's toggle. That is ceremony, not separation of concerns.
|
||||
- The menu is **small** — an app-bar trigger + a MudBlazor menu/popover + the persistence seam (which the
|
||||
quality toggle needs *anyway*). It is not a platform; it is a container with one tenant.
|
||||
- It carries a real **design-for-adaptability** obligation (it must be able to host dark mode and future
|
||||
settings later), but that is a *shape* requirement on a small surface, not a phase's worth of work.
|
||||
|
||||
So: **build the Settings-menu shell as part of Phase 18 (wave 18.6), with the quality toggle as its first
|
||||
occupant, designed so dark mode and future preferences can plug in without restructuring.** Flag for
|
||||
Daniel: *if he wants the menu shell proven/landed independently before the quality toggle plugs in*, 18.6
|
||||
can be split into "menu shell" then "quality toggle plugs in" — but they are small enough to land together.
|
||||
This is **not** recommended as its own top-level phase. (If Daniel disagrees and wants a dedicated
|
||||
"Public Settings Menu" phase that Phase 18's toggle then targets, that is a clean alternative — it just
|
||||
front-loads a surface with no second tenant yet. Recommendation stands: sub-track.)
|
||||
|
||||
**The menu shell — design-for-adaptability requirements (so it survives new tenants):**
|
||||
|
||||
- A **settings-item abstraction**, not a hard-coded list. The menu renders a small set of settings entries;
|
||||
adding dark mode later is adding an entry, not rewiring the menu. Each entry is a label + a control bound
|
||||
to a persisted preference.
|
||||
- A **single public-site settings object** carrying all listener preferences (today: streaming quality;
|
||||
tomorrow: dark mode, and whatever follows). This is the `DarkModeSettings` analogue, generalized — call
|
||||
it e.g. `PublicSiteSettings` / `ListenerSettings`. Dark mode's existing `DarkModeSettings` can fold into
|
||||
it *later* without disturbing the menu.
|
||||
|
||||
**Persistence — mirror the dark-mode seam exactly (OQ3 RESOLVED).** The quality preference follows the
|
||||
*identical* path dark mode already uses (root `CLAUDE.md` "Theming and dark mode"):
|
||||
|
||||
1. **Cookie** — a `streamQuality` cookie (365-day, like `darkMode`), the durable truth.
|
||||
2. **Server prerender read** — a service in `DeepDrftPublic` (sibling to `DarkModeService`) reads the
|
||||
cookie during prerender and seeds the settings object, avoiding a wrong-default flash on first paint
|
||||
(the streaming-quality analogue of the "wrong theme flash" fix).
|
||||
3. **`PersistentComponentState` bridge** — the seeded preference carries from server prerender into the
|
||||
WASM render (the same bridge `DarkModeSettings` and `NowPlayingStats`/`StatsClient` already use), so the
|
||||
client boots already knowing the quality without a re-read flash or a re-fetch.
|
||||
4. **Client cookie service** — a runtime client-side service (JS-interop cookie write, like the dark-mode
|
||||
toggle) persists the choice when the listener changes it in the menu.
|
||||
|
||||
**Why mirror rather than invent:** the dark-mode seam is the codebase's established, working pattern for "a
|
||||
listener preference seeded at prerender, carried to WASM, persisted in a cookie." Reusing its shape means
|
||||
the quality preference inherits the no-flash guarantee for free, and the eventual dark-mode-into-the-menu
|
||||
migration is a *consolidation of two identical seams*, not a reconciliation of two different ones. (This is
|
||||
the "one source, multiple views" / design-for-adaptability discipline applied to listener settings.)
|
||||
|
||||
---
|
||||
|
||||
@@ -315,48 +562,65 @@ supports any of the policies, so the policy can be decided after the substrate l
|
||||
- **UC5 — Replace-audio regenerates Opus.** The existing replace-audio path (which already regenerates
|
||||
both waveform datums and re-derives duration) also regenerates the Opus artifact from the new
|
||||
source.
|
||||
- **UC6 — Seek within an Opus stream.** Backward/forward seek resolves via the existing `Range` path;
|
||||
the offset is the `OpusFormatDecoder`'s approximate page-aligned mapping (§3.4), re-syncing to the
|
||||
next Ogg page — the VBR analogue of the WAV exact-offset seek.
|
||||
- **UC6 — Seek within an Opus stream (accurately).** Backward/forward seek resolves via the existing
|
||||
`Range` path; the offset comes from the `OpusFormatDecoder`'s **precomputed seek index** (§3.4a) — an
|
||||
exact granule→byte lookup, then fine re-sync to the requested time within the bucket. The listener lands
|
||||
at the **correct** time, not approximately, and without the full PCM decoded in memory.
|
||||
- **UC7 — Safari that can't decode Ogg Opus.** Capability-gated to the lossless path (§3.4), so the
|
||||
listener still plays audio. (Ties to OQ2 + Phase 1.7.)
|
||||
- **UC8 — Listener switches streaming quality in the Settings menu.** The listener opens the public
|
||||
Settings menu, flips "Streaming quality" from Low-data to Lossless (or back); the choice persists
|
||||
(cookie, dark-mode pattern) and applies to subsequent stream requests via `?format=`. On next visit the
|
||||
preference is seeded at prerender (no flash, no re-pick). (§4 / §4a.)
|
||||
- **UC9 — Deep-link / windowed start away from byte 0.** A listener opens a stream at a mid-track position
|
||||
(deep link, or a Phase 21 window that opens past byte 0) without ever reading the stream start. The
|
||||
decoder still has the `OpusHead`/`OpusTags` setup bytes because they arrived with the up-front sidecar
|
||||
fetch (§3.4a B), so the mid-stream slice is decodable immediately. (Composition case for Phase 21.)
|
||||
|
||||
---
|
||||
|
||||
## 6. Open questions for Daniel (genuine product decisions, not implementation detail)
|
||||
## 6. Open questions — RESOLVED (Daniel, 2026-06-23)
|
||||
|
||||
- **OQ1 — Selection UX: how does a listener choose lossless vs. low-data?** Candidates: a global
|
||||
toggle in the player bar / settings ("Stream quality: Low-data / Lossless"); a per-track control; an
|
||||
automatic default with a manual override. Recommend a **single global quality toggle** (player bar
|
||||
or a settings affordance) — it is the Spotify/Bandcamp/SoundCloud idiom (one account/session-level
|
||||
"streaming quality" setting), low-friction, and matches a small-sharp-tool posture better than
|
||||
per-track choosers. `[Daniel decision]`
|
||||
- **OQ2 — Default policy: what does a listener get before they choose?** Opus is the
|
||||
*default-candidate* per Daniel — confirm Opus-by-default. Sub-questions: should the default be
|
||||
**capability-aware** (don't serve Ogg Opus to a browser that can't decode it — §3.4 Safari < 18.4)?
|
||||
Should it be **network-aware** (Opus on cellular, lossless on wifi)? Recommend **Opus by default,
|
||||
capability-gated** (fall back to lossless when the browser can't decode Ogg Opus), and **defer
|
||||
network-awareness** as gold-plating for v1. `[Daniel decision]`
|
||||
- **OQ3 — Is the choice remembered, and at what scope?** Per-session (resets each visit) vs.
|
||||
persisted (cookie/`localStorage`, like the `darkMode` cookie) vs. (future) per-account once identity
|
||||
exists. Recommend **persisted via a cookie/`localStorage` setting**, mirroring the dark-mode
|
||||
precedent — one truth, seeded at prerender, carried to WASM. `[Daniel decision]`
|
||||
- **OQ4 — Per-upload Opus control in the CMS, or always-on?** Should the CMS upload form let an admin
|
||||
opt a track *out* of Opus generation (e.g. a track meant to be lossless-only), or is Opus always
|
||||
generated for every track? Recommend **always-on** (simpler; Opus is additive and cheap to serve;
|
||||
the listener's format choice already covers "I want lossless"). A per-track opt-out is a later
|
||||
refinement if a real need appears. `[Daniel decision]`
|
||||
- **OQ5 — Opus container/extension specifics.** Ogg Opus (`.opus` / `audio/ogg`) is the assumption
|
||||
(broadest `decodeAudioData` support; Daniel said "Ogg Opus"). Confirm — vs. CAF-wrapped Opus (older
|
||||
Safari) or WebM-Opus. Recommend **Ogg Opus** as Daniel directed; CAF-fallback for old Safari is not
|
||||
worth it given the lossless fallback already covers those browsers (§3.4). `[Daniel steer — confirms
|
||||
§3.4, not a blocker]`
|
||||
- **OQ6 — Transcode execution model (flag, leans implementation).** Synchronous-at-upload is a
|
||||
non-starter for 1 GB mixes (§3.1); the realistic options are a background/queued transcode after the
|
||||
source is stored. This is largely staff-engineer's call, but it has a **product-visible
|
||||
consequence**: a freshly uploaded track may be lossless-only for a short window until its Opus
|
||||
artifact finishes. Confirm that "Opus appears shortly after upload, lossless available immediately"
|
||||
is acceptable (it is the waveform-datum model already in place). `[Daniel steer]`
|
||||
All six original open questions are resolved. Kept visible per file convention, each with the decision and
|
||||
the section that now carries it. One new open question (OQ7) is raised by the seek-model design; it is a
|
||||
narrow tuning/scoping call, not a blocker.
|
||||
|
||||
- **OQ1 — Selection UX — RESOLVED: global, via a Settings *menu* (not a bare app-bar control).** Daniel:
|
||||
*"Global is perfect, but we need a menu system for settings, don't just slap the quality control directly
|
||||
in the app bar."* So: one global quality preference, surfaced inside a new **public-site Settings menu**
|
||||
(§4 / §4a), of which the quality toggle is the first occupant. The menu is scoped as a **Phase 18
|
||||
sub-track (wave 18.6)**, designed so dark mode (its natural future tenant) can plug in later. `[RESOLVED
|
||||
— §4 / §4a]`
|
||||
- **OQ2 — Default policy — RESOLVED: Opus by default, capability-gated.** Opus is the default; on a browser
|
||||
that cannot decode Ogg Opus (Safari < 18.4, §3.4), fall back to lossless rather than serving an
|
||||
undecodable stream. Network-awareness (Opus on cellular / lossless on wifi) remains **deferred** as
|
||||
gold-plating. `[RESOLVED — §3.4, §4]`
|
||||
- **OQ3 — Remembered choice — RESOLVED: persisted, following the dark-mode pattern.** A `streamQuality`
|
||||
cookie seeded at server prerender → settings object → `PersistentComponentState` bridge into WASM →
|
||||
client cookie service for runtime writes. The full dark-mode seam mirrored (§4a). `[RESOLVED — §4a]`
|
||||
- **OQ4 — Per-upload Opus control — RESOLVED: always-on + backfill.** Opus is generated for **every**
|
||||
track, always (no per-upload opt-out). **Plus** a bulk **Backfill-Opus** CMS action processes the
|
||||
existing catalogue. (The listener's lossless choice already covers "I want lossless," so a per-track
|
||||
opt-out earns nothing.) `[RESOLVED — §3.1, UC4, wave 18.5]`
|
||||
- **OQ5 — Container — RESOLVED: Ogg Opus.** `.opus` / `audio/ogg` (broadest `decodeAudioData` support). No
|
||||
CAF/WebM fallback — the lossless path already covers browsers that can't decode Ogg Opus (§3.4).
|
||||
`[RESOLVED — §3.4]`
|
||||
- **OQ6 — Transcode execution model — RESOLVED: background job after the file is available; uploader shows
|
||||
a Post-Processing phase.** The source is stored and the track is playable losslessly **first**; the Opus
|
||||
transcode (+ seek-index build) runs as a **background job** afterward; the CMS upload progress meter
|
||||
gains a visible **Post-Processing** phase reflecting the transcode status (§3.1a). A freshly uploaded
|
||||
track is lossless-only until its Opus finishes — accepted, and now made visible rather than implicit.
|
||||
`[RESOLVED — §3.1a]`
|
||||
|
||||
**New open question raised by the seek-model design (§3.4a) — narrow, non-blocking:**
|
||||
|
||||
- **OQ7 — Seek-index granularity (tuning, leans implementation).** The seek index trades precision against
|
||||
size: per-Ogg-page (most precise, largest) vs. coarser time buckets snapped to page starts. Recommend
|
||||
**~1–2 s buckets** (~58 KB for a 1-hour mix at 1 s; the decoder fine-re-syncs within the bucket so seek
|
||||
*accuracy* is unaffected — only the in-bucket trim distance changes). This is largely staff-engineer's
|
||||
call at implementation; flagged because the *number* is a deliberate choice and Daniel may have a feel
|
||||
for acceptable index size vs. in-bucket trim. Does **not** block — the shape (precomputed exact
|
||||
granule→byte, page-snapped) is fixed regardless of the bucket size. `[Daniel steer — not a blocker]`
|
||||
|
||||
---
|
||||
|
||||
@@ -371,56 +635,95 @@ supports any of the policies, so the policy can be decided after the substrate l
|
||||
- **AC3 — Additive, non-breaking (C2).** The existing lossless WAV path is byte-for-byte unchanged; a
|
||||
track with no Opus artifact still plays losslessly; `?format=opus` on such a track falls back to
|
||||
lossless (no 404, no silence).
|
||||
- **AC4 — Transcode at ingest, regenerable (C6).** A new upload produces an Opus artifact best-effort
|
||||
after the source is stored; a transcode failure does not block the upload or break playback; a
|
||||
Backfill-Opus action (re)generates artifacts for existing tracks; replace-audio regenerates the
|
||||
Opus artifact from the new source.
|
||||
- **AC4 — Transcode at ingest as a background job, regenerable (C6, OQ6).** A new upload stores the source
|
||||
and is playable losslessly **immediately**; the Opus artifact (+ seek-index/setup-header sidecar) is
|
||||
produced by a **background job** afterward; a transcode failure does not block the upload or break
|
||||
playback; a Backfill-Opus action (re)generates artifacts for existing tracks; replace-audio regenerates
|
||||
the Opus artifact and its sidecar from the new source.
|
||||
- **AC4a — Post-Processing phase is visible on the upload meter (OQ6, §3.1a).** After the byte-transfer and
|
||||
server-persist phases, the CMS upload progress UI shows a **Post-Processing** phase reflecting the
|
||||
background transcode (queued → transcoding → done/failed). The admin is never blocked waiting on the
|
||||
transcode; the track is live before Post-Processing finishes.
|
||||
- **AC5 — Opus seek via the existing `Range` path (C3).** Forward and backward seek in an Opus stream
|
||||
resolve through the landed `Range: bytes=X-` primitive, with the offset coming from
|
||||
`OpusFormatDecoder.calculateByteOffset`; no new seek mechanism is introduced.
|
||||
`OpusFormatDecoder.calculateByteOffset`; no new seek *transport* mechanism is introduced.
|
||||
- **AC5a — Seek-index + setup-header sidecar exists and is fetched once (§3.4a).** Every track with an Opus
|
||||
artifact has a sidecar carrying the setup header (`OpusHead`/`OpusTags`) and the granule→byte seek index;
|
||||
the client fetches and parses it once on track load (into `OpusSeekData`) before issuing any seek.
|
||||
- **AC9 (the seek-accuracy criterion) — an Opus seek lands at the *correct* time, not approximately.**
|
||||
Seeking to time `t` in an Opus stream resolves via the precomputed index and lands playback at `t`
|
||||
(within the fine-resync tolerance — sub-second at the recommended bucket granularity), **measurably
|
||||
accurate**, not a `byteRate`/interpolation estimate. Verifiable: seek to a known marker (e.g. a downbeat
|
||||
at a known timestamp) and confirm playback resumes there, not seconds off. This holds **without** the
|
||||
full PCM decoded in memory (composes with Phase 21).
|
||||
- **AC6 — No format branches leak (C4).** The only Opus-specific code is `OpusFormatDecoder`, its
|
||||
`OpusSeekData`, the one `createFormatDecoder` selection arm, and the transcode processor + delivery
|
||||
param resolution. The format-agnostic player/scheduler code is unchanged.
|
||||
`OpusSeekData` (carrying the index), the one `createFormatDecoder` selection arm, the transcode processor
|
||||
(+ index build), the sidecar artifact + its serving, and the delivery param resolution. The
|
||||
format-agnostic player/scheduler code is unchanged.
|
||||
- **AC7 — Capability-safe default (OQ2).** A browser that cannot decode Ogg Opus is served (or falls
|
||||
back to) the lossless path and plays audio; no listener gets silence because of codec support.
|
||||
- **AC8 — Windowing-ready (the Phase 21 handshake).** The `OpusFormatDecoder`'s approximate byte↔time
|
||||
mapping is the one Phase 21's windowed refill will call; Opus playback must be windowable by the
|
||||
same machinery (verified jointly when Phase 21 lands on top — see §8 / Phase 21 cross-ref).
|
||||
- **AC8 — Windowing-ready (the Phase 21 handshake).** The `OpusFormatDecoder`'s **index-based** byte↔time
|
||||
resolver is the one Phase 21's windowed refill calls; Opus playback must be windowable by the same
|
||||
machinery, and a windowed refill that opens away from byte 0 still decodes (setup header from the
|
||||
sidecar — UC9). Verified jointly when Phase 21 lands on top (see §8 / Phase 21 cross-ref).
|
||||
- **AC10 — The Settings menu hosts the quality toggle and persists the choice (§4 / §4a).** The public app
|
||||
bar opens a Settings menu containing a "Streaming quality" control (Low-data / Lossless, defaulting to
|
||||
Low-data, capability-gated); changing it persists via the `streamQuality` cookie and is seeded at
|
||||
prerender on the next visit (no flash). The menu shell is built so a future dark-mode entry can plug in
|
||||
without restructuring.
|
||||
|
||||
---
|
||||
|
||||
## 8. Wave decomposition
|
||||
|
||||
Dependency shape: `18.1 → 18.2 → {18.3, 18.4}`, with `18.5` validating end-to-end. **18.1 (the
|
||||
transcode/derived-artifact ingest) is the cold-start prerequisite** — until an Opus artifact exists,
|
||||
nothing downstream has bytes to serve or decode. 18.3 (delivery param) and 18.4 (the decoder) are
|
||||
largely parallel once 18.2 (storage/lookup) settles, but both need an artifact to test against.
|
||||
Dependency shape: `18.1 → 18.2 → {18.3, 18.4}`, with `18.5` (backfill + e2e) and `18.6` (settings menu)
|
||||
on top. **18.1 (the transcode + seek-index/setup-header derived artifacts) is the cold-start
|
||||
prerequisite** — until those artifacts exist, nothing downstream has bytes to serve, decode, or seek
|
||||
against. 18.3 (delivery param) and 18.4 (the decoder + index resolver) are largely parallel once 18.2
|
||||
(storage/lookup) settles, but both need artifacts to test against. **18.6 (the Settings menu) is the only
|
||||
wave with no audio-pipeline dependency** — it can proceed in parallel with the whole stack; it merely needs
|
||||
the `?format=` mechanism (18.3) wired before the toggle has anything to drive.
|
||||
|
||||
- **18.1 — Ingest transcode: derive + store the Opus artifact (cold-start; load-bearing).** New
|
||||
- **18.1 — Ingest transcode + seek-index + setup-header (cold-start; load-bearing).** New
|
||||
`OpusTranscodeService`/processor in `DeepDrftContent`, invoked post-store from
|
||||
`UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`; produces Ogg Opus fullband
|
||||
320; stores it as a derived artifact (S2 vault recommended). Failure-tolerant (C6) and off the hot
|
||||
path (background/queued — OQ6). **Independent of the delivery/decoder waves; can begin immediately.**
|
||||
- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention and the server-side
|
||||
resolution "given `EntryKey` + format, return the right `AudioBinary` + content-type," including the
|
||||
C2 fallback (no Opus → lossless). **Depends on 18.1** (an artifact must exist to resolve to).
|
||||
- **18.3 — Delivery: format param + proxy threading.** `?format=opus|lossless` on the
|
||||
`UnifiedTrackService.UploadAsync` alongside `WaveformProfileService`, **as a background job** (OQ6,
|
||||
§3.1a); produces Ogg Opus fullband 320; **walks the encoded stream once to build the granule→byte seek
|
||||
index and extract the `OpusHead`/`OpusTags` setup header** (§3.4a A/B); stores the Opus bytes **and** the
|
||||
combined seek/setup **sidecar** as derived artifacts (S2 vault recommended). Failure-tolerant (C6).
|
||||
**Independent of the delivery/decoder waves; can begin immediately.**
|
||||
- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention (Opus bytes + sidecar)
|
||||
and the server-side resolution "given `EntryKey` + format, return the right `AudioBinary` + content-type
|
||||
(+ the sidecar on its own endpoint/path)," including the C2 fallback (no Opus → lossless). **Depends on
|
||||
18.1** (artifacts must exist to resolve to).
|
||||
- **18.3 — Delivery: format param + sidecar serving + proxy threading.** `?format=opus|lossless` on the
|
||||
`DeepDrftAPI` track stream endpoint (resolves via 18.2), forwarded through the `DeepDrftPublic`
|
||||
`TrackProxyController` (mirror the existing `offset` param threading), and the `Range` handler
|
||||
serving the chosen artifact's bytes. The player sends the param via `TrackMediaClient`. **Depends on
|
||||
18.2.** Parallel-ok with 18.4.
|
||||
- **18.4 — `OpusFormatDecoder` in the player stack.** New `IFormatDecoder` implementation
|
||||
(Ogg-page-aligned `getAlignedSegmentSize` via `OggS` scan, `OpusHead` setup carry in
|
||||
`wrapSegment`/continuation, approximate page-interpolation `calculateByteOffset` with an
|
||||
`OpusSeekData` accelerator); one new arm in `AudioPlayer.createFormatDecoder` on
|
||||
`audio/ogg`/`audio/opus`. Capability detection for the lossless fallback (§3.4, OQ2). **Depends on
|
||||
18.2** (needs Opus bytes to decode). Parallel-ok with 18.3; they meet at 18.5.
|
||||
- **18.5 — Backfill + selection UX + end-to-end validation.** The Backfill-Opus CMS bulk action (third
|
||||
sibling to Generate-Profiles / Backfill-High-res) and replace-audio Opus regeneration; the listener
|
||||
selection control per OQ1/OQ3 (global persisted quality toggle, recommended); and the AC1–AC8
|
||||
acceptance pass — including AC8's confirmation that Opus is windowable so Phase 21 can build on it.
|
||||
**Depends on 18.1–18.4.** (Selection UX can be split out if Daniel wants the substrate proven before
|
||||
the control lands — flag at planning time.)
|
||||
`TrackProxyController` (mirror the existing `offset` param threading), and the `Range` handler serving
|
||||
the chosen artifact's bytes; **plus serving the seek/setup sidecar** (a `GET …/opus/seekdata`-style path,
|
||||
proxied the same way). The player sends the format param via `TrackMediaClient`. **Depends on 18.2.**
|
||||
Parallel-ok with 18.4.
|
||||
- **18.4 — `OpusFormatDecoder` + the index-based seek resolver in the player stack.** New `IFormatDecoder`
|
||||
implementation: Ogg-page-aligned `getAlignedSegmentSize` via `OggS` scan; `OpusHead`/`OpusTags` setup
|
||||
carry in `wrapSegment`/the continuation path (sourced from the cached sidecar, §3.4a B); **`calculateByteOffset`
|
||||
that binary-searches the precomputed seek index** (NOT interpolation), with an `OpusSeekData` accelerator
|
||||
holding the parsed index + setup bytes; the **one-time sidecar fetch + parse** on track load. One new arm
|
||||
in `AudioPlayer.createFormatDecoder` on `audio/ogg`/`audio/opus`. Capability detection for the lossless
|
||||
fallback (§3.4, OQ2). **Depends on 18.2** (needs Opus bytes + sidecar). Parallel-ok with 18.3; they meet
|
||||
at 18.5.
|
||||
- **18.5 — Backfill + replace-audio + end-to-end validation (incl. seek accuracy).** The Backfill-Opus CMS
|
||||
bulk action (third sibling to Generate-Profiles / Backfill-High-res), which (re)builds Opus bytes + the
|
||||
sidecar for existing tracks; replace-audio Opus + sidecar regeneration; and the AC1–AC10 acceptance pass
|
||||
— **including AC9 (an Opus seek lands at the correct time, not approximately)** and AC8's confirmation
|
||||
that Opus is windowable (index resolver + sidecar setup header) so Phase 21 can build on it. **Depends on
|
||||
18.1–18.4.**
|
||||
- **18.6 — Public Settings menu + the quality toggle (the listener selection UX).** The new public-site
|
||||
Settings-menu shell (§4a): an app-bar trigger + MudBlazor menu hosting a settings-item abstraction, the
|
||||
`PublicSiteSettings`/`ListenerSettings` object, and the dark-mode-pattern persistence seam (`streamQuality`
|
||||
cookie + a `DeepDrftPublic` prerender-read service + `PersistentComponentState` bridge + client cookie
|
||||
service). The **quality toggle is its first occupant** (Low-data/Lossless, Opus default, capability-gated),
|
||||
driving the `?format=` the player sends (needs 18.3). Built design-for-adaptability so dark mode can plug
|
||||
in later without restructuring (not migrated now). **Depends on 18.3** (the toggle needs the format
|
||||
mechanism); the menu *shell* can be built ahead of that. *Splittable* into "menu shell" + "toggle plugs
|
||||
in" if Daniel wants the shell proven first — but small enough to land together (§4a).
|
||||
|
||||
---
|
||||
|
||||
@@ -432,7 +735,10 @@ largely parallel once 18.2 (storage/lookup) settles, but both need an artifact t
|
||||
phase.** Phase 21's C5 invariant ("WAV-only shipping target; must not foreclose MP3/FLAC") is now
|
||||
driven by Opus's VBR/paged seek math; Phase 21 OQ5 (adopt MSE) is resolved **NO** — the bespoke
|
||||
graph stays (the same C1 decision recorded here). Windowing a VBR/Opus stream uses
|
||||
`OpusFormatDecoder.calculateByteOffset`'s approximate mapping — exactly the C5 case.
|
||||
`OpusFormatDecoder.calculateByteOffset`'s **accurate index-based mapping** (§3.4a — *not* the earlier
|
||||
"approximate page-interpolation"; that language in the Phase 21 doc is corrected). Phase 21's windowed
|
||||
refill calls the **same** index resolver an explicit seek does (§3.4a D), and a window that opens away
|
||||
from byte 0 still decodes via the sidecar setup header (UC9).
|
||||
- `PLAN.md` Phase 4 (landed) / `COMPLETED.md` — the HTTP `Range: bytes=X-` primitive Opus seek reuses.
|
||||
- `PLAN.md` Phase 1.5 (gapless) / 1.6 (track-skip on error) / 1.7 (Safari) — 1.5's "encoder
|
||||
padding/priming" caveat applies to Opus (it has pre-skip samples in `OpusHead`); 1.6's
|
||||
@@ -452,7 +758,15 @@ largely parallel once 18.2 (storage/lookup) settles, but both need an artifact t
|
||||
- `DeepDrftPublic/Interop/audio/AudioPlayer.ts` (`createFormatDecoder`, lines 117–125) — the decoder
|
||||
registry gaining the Opus arm.
|
||||
- `DeepDrftPublic.Client/Clients/TrackMediaClient.cs` + `DeepDrftPublic/Controllers/TrackProxyController.cs`
|
||||
— the media fetch + proxy that thread the new `?format=` param (mirroring `offset`).
|
||||
— the media fetch + proxy that thread the new `?format=` param (mirroring `offset`), and proxy the new
|
||||
seek/setup sidecar fetch.
|
||||
- Root `CLAUDE.md` "Theming and dark mode" + `DarkModeService` (in `DeepDrftPublic`) + `DarkModeSettings`
|
||||
(`DeepDrftPublic.Client.Common`) — the cookie → prerender-read → `PersistentComponentState` → client
|
||||
cookie-service seam the **streaming-quality preference** (§4a) mirrors exactly; the eventual dark-mode-
|
||||
into-the-Settings-menu migration consolidates two copies of this seam.
|
||||
- `DeepDrftPublic.Client` `NowPlayingStats.razor` / `StatsClient` — the `PersistentComponentState`
|
||||
prerender-bridge precedent (prerender fetch carried into WASM without a re-fetch/flash), the pattern the
|
||||
quality preference's bridge follows; see the `tracksview-persistent-state-seam` auto-memory.
|
||||
|
||||
## Sources
|
||||
|
||||
|
||||
@@ -13,10 +13,15 @@ endpoint.
|
||||
> (`product-notes/phase-18-opus-low-data-streaming.md`) — is a prerequisite that sequences ahead of
|
||||
> windowing. Phase 21's windowing must work across **both** delivery formats (lossless WAV and Opus).
|
||||
> Its C5 invariant below already anticipated this ("must not foreclose MP3/FLAC"); **Opus is now the
|
||||
> concrete VBR/containerized driver of C5.** Windowing an Opus stream uses the decoder's *approximate*
|
||||
> byte↔time mapping (`OpusFormatDecoder.calculateByteOffset` — Ogg-page interpolation), exactly the C5
|
||||
> case — not the exact CBR-WAV `byteRate` math. Build the window machinery format-agnostically
|
||||
> (§2 C3/C5) so it inherits Opus for free.
|
||||
> concrete VBR/containerized driver of C5.** Windowing an Opus stream uses the decoder's **accurate
|
||||
> index-based** byte↔time mapping (`OpusFormatDecoder.calculateByteOffset` — a binary search in the Phase 18
|
||||
> precomputed seek index), exactly the C5 case — *not* the exact CBR-WAV `byteRate` math, and *not*
|
||||
> approximate Ogg-page interpolation. **Correction (Daniel, 2026-06-23):** an earlier draft described the
|
||||
> Opus mapping as "approximate page interpolation"; the Phase 18 seek-model resolution rejected that — Opus
|
||||
> seeking is **accurate**, backed by a precomputed seek index built at transcode time, so refill resolves to
|
||||
> the *exact* page offset. The windowed refill controller calls the **same** index resolver an explicit seek
|
||||
> does (Phase 18 §3.4a D); a window opening away from byte 0 still decodes via the Phase 18 sidecar setup
|
||||
> header. Build the window machinery format-agnostically (§2 C3/C5) so it inherits Opus for free.
|
||||
|
||||
---
|
||||
|
||||
@@ -66,14 +71,20 @@ docs. This phase **modifies that seam** — so the contract it must preserve is
|
||||
user-visible control, no change to seek/transport semantics beyond what the listener already
|
||||
experiences. Seek must still feel identical.
|
||||
- **C5 — Must window both delivery formats (WAV lossless AND Opus low-data).** Byte↔time mapping for
|
||||
refill is exact and cheap for WAV (CBR: `byteRate` from the header). For VBR/containerized formats it
|
||||
is approximate (the decoders carry TOC/SEEKTABLE/Ogg-page seek math). **Phase 18 (Opus) is sequenced
|
||||
before this phase and is the concrete driver here:** an Ogg Opus 320 stream is VBR and page-paged, so
|
||||
its `calculateByteOffset` is an *approximate* page-interpolation, not exact-offset. The window
|
||||
machinery must express refill purely in terms of the decoder's existing `calculateByteOffset`, so the
|
||||
same code windows WAV exactly and Opus approximately — **no WAV-special-cased offset math in the
|
||||
window layer.** (MP3/FLAC decoders are already wired in the registry too — the registry dispatches on
|
||||
content-type today; an `OpusFormatDecoder` joins them in Phase 18.)
|
||||
refill is exact and cheap for WAV (CBR: `byteRate` from the header). **Phase 18 (Opus) is sequenced
|
||||
before this phase and is the concrete VBR driver here** — and its mapping is **also exact**, but by a
|
||||
different mechanism: an Ogg Opus 320 stream has no linear time↔byte relationship, so
|
||||
`OpusFormatDecoder.calculateByteOffset` resolves via a **precomputed seek index** (granule→byte, built at
|
||||
transcode; Phase 18 §3.4a), a binary search that returns the exact page offset — **not** an approximate
|
||||
page interpolation. (An earlier draft of this invariant said "approximate"; the Phase 18 seek-model
|
||||
resolution, Daniel 2026-06-23, made Opus seeking accurate. Corrected here.) The window machinery must
|
||||
express refill purely in terms of the decoder's existing `calculateByteOffset`, so the same code windows
|
||||
WAV (via `byteRate`) and Opus (via the index) — **no WAV-special-cased offset math in the window layer**,
|
||||
and no approximation for either. A window that opens away from byte 0 must also prepend the decoder's
|
||||
retained/sidecar setup header (Phase 18 §3.4a B) — the format-agnostic refill path already routes
|
||||
continuations through the decoder's header-carry, so this comes for free. (MP3/FLAC decoders are already
|
||||
wired in the registry too — the registry dispatches on content-type today; an `OpusFormatDecoder` joins
|
||||
them in Phase 18.)
|
||||
- **C6 — No regression to the single-instance JS decoder concurrency guarantees.** The current code is
|
||||
careful that only one streaming loop touches the single JS `StreamDecoder` at a time
|
||||
(`DrainActiveStreamingTaskAsync`, the `_streamingCancellation` identity dance). Windowed refill
|
||||
|
||||
Reference in New Issue
Block a user