Merge streaming-overhaul into dev (Opus low-data streaming, windowed streaming, HW-accel-off stabilization)
This commit is contained in:
@@ -443,213 +443,15 @@ not the same work; this phase does not satisfy or depend on that one.
|
||||
|
||||
---
|
||||
|
||||
## Phase 18 — Opus Low-Data Streaming (dual-format lossless + Opus delivery)
|
||||
## Phase 18 — Opus Low-Data Streaming (Completed)
|
||||
|
||||
The concrete realization of the long-deferred **"Non-WAV formats"** intent (`CONTEXT.md §5`). Daniel's
|
||||
direction (2026-06-23): **two delivery formats per track — the existing lossless WAV path, and a new
|
||||
low-data Ogg Opus (fullband, 320 kbps) path — so the listener gets a choice, with Opus the
|
||||
bandwidth-friendly default-candidate.** Lossless streaming becomes *optional*, not the only path. The
|
||||
bespoke Web Audio decode→schedule graph is **retained by deliberate choice** — Opus feeds the same
|
||||
`IFormatDecoder` seam, not an HTML `<media>` element or MSE (the decision shared with Phase 21 OQ5).
|
||||
**Sequenced BEFORE Phase 21** — windowing must work across both formats. Surfaces: ingest/preprocessing
|
||||
in `DeepDrftContent` (`AudioProcessor`/router/`WaveformProfileService`) + `DeepDrftAPI`
|
||||
(`UnifiedTrackService.UploadAsync`, replace-audio); delivery/decode in `DeepDrftAPI` (stream endpoint +
|
||||
`Range`) + `DeepDrftPublic` proxy + `DeepDrftPublic.Client` player stack + `DeepDrftPublic/Interop/audio`
|
||||
TS decoders. Full design, the three directions with SOLID/road-not-taken rationale, the storage and
|
||||
delivery options, the Opus decoder + seek math, acceptance criteria, open questions, and wave
|
||||
decomposition: `product-notes/phase-18-opus-low-data-streaming.md`.
|
||||
|
||||
**Much further along than the backlog line implies (verified 2026-06-23).** The multi-format *substrate*
|
||||
already exists on both sides: the producer-side `AudioProcessorRouter` routes `.wav`/`.mp3`/`.flac` and
|
||||
`TrackContentService.AddTrackAsync` is format-agnostic (it **stores originals**, no transcode); the
|
||||
decoder-side `AudioPlayer.createFormatDecoder` is a **wired** strategy registry dispatching on
|
||||
`Content-Type` (WAV/MP3/FLAC decoders all present — correcting the Phase 21 spec's stale
|
||||
"implemented-not-wired" note). **The actual gap is Daniel's specific ask:** (1) a **transcode-at-ingest**
|
||||
step that *derives* an Opus 320 artifact per track (nothing derives Opus today), and (2) a **per-format
|
||||
delivery selection** so one track serves as either WAV or Opus on request.
|
||||
|
||||
**Open questions RESOLVED (Daniel, 2026-06-23).** OQ1 selection UX → **global, via a new public-site
|
||||
Settings menu** (not a bare app-bar control); OQ2 default → **Opus by default, capability-gated** (defer
|
||||
network-awareness); OQ3 remembered → **persisted via the dark-mode seam** (cookie → prerender-read →
|
||||
`PersistentComponentState` → client cookie service); OQ4 → **always-on Opus + Backfill-Opus**; OQ5 →
|
||||
**Ogg Opus**; OQ6 transcode model → **background job after the file is available, with a visible
|
||||
Post-Processing phase on the CMS upload meter.** OQ7 (seek-index granularity) → **0.5 s (half-second)
|
||||
buckets** (~115 KB index for a 1-hour mix).
|
||||
|
||||
**Architectural spine — a derived artifact set + a delivery param + one new decoder + a precomputed
|
||||
accurate seek index; leaf implementations only, zero changes to existing format code (the strong OCP
|
||||
signal).** Transcode is a new processor sibling in `DeepDrftContent`, invoked post-store alongside
|
||||
`WaveformProfileService` **as a background job** (a 1 GB WAV transcode must not block the upload; the source
|
||||
is stored and the track plays lossless *first*, then Opus is derived) — mirroring the landed waveform-datum
|
||||
pattern (derive at ingest, regenerate via a CMS bulk action + ApiKey endpoint). The Opus bytes are a
|
||||
**derived artifact** stored like the high-res waveform datum (recommend a dedicated `track-opus` vault, the
|
||||
`track-waveforms` precedent; final call staff-engineer's). Delivery adds a **`?format=opus|lossless` param**
|
||||
(mirroring the existing `offset` param threading through `TrackProxyController`) resolved server-side to the
|
||||
right artifact + content-type, with a **lossless fallback** when no Opus artifact exists (additive, never
|
||||
404/silence). The player gains one `OpusFormatDecoder` (`IFormatDecoder`): Ogg-page-aligned segmenting
|
||||
(`OggS` scan — the FLAC frame-sync analogue) and `OpusHead`/`OpusTags` setup-bytes carry (the FLAC
|
||||
`streamInfoBytes` analogue). **Browser constraint flagged:** Ogg-Opus `decodeAudioData` is Safari-18.4+ only
|
||||
(Chrome/FF long-standing), so the Opus default is **capability-gated** — fall back to the universal lossless
|
||||
path on browsers that can't decode it.
|
||||
|
||||
**VBR-safe ACCURATE seeking (Daniel, 2026-06-23 — supersedes the earlier "approximate" hand-wave).** Raw
|
||||
byte-offset seek and rough page interpolation are inadequate for VBR Opus — there is no linear time↔byte
|
||||
relationship. The fix is an **accurate transfer function built at transcode time** (the one moment the
|
||||
whole encoded stream is walked): a precomputed **seek index** mapping Ogg-page `granulepos` (48 kHz sample
|
||||
counts → time) → exact byte offset (**0.5 s buckets** snapped to page starts — OQ7; ~7,200 entries ×
|
||||
16 bytes ≈ ~115 KB for a 1-hour mix). The decode **setup header** (`OpusHead`/`OpusTags`, needed to decode any mid-stream slice) is made
|
||||
available too. Recommended concrete design: **one sidecar artifact per track = `[setup header][seek
|
||||
index]`, built at transcode, stored beside the Opus bytes, fetched once on track load**, parsed into
|
||||
`OpusSeekData`. Client seek flow: `calculateByteOffset(t)` binary-searches the index for the exact page
|
||||
offset → `Range: bytes=X-` fetch (landed Phase 4 primitive, unchanged) → prepend the cached setup header →
|
||||
decode → fine re-sync to `t` within the bucket. **The listener lands at the correct time, not
|
||||
approximately** (AC9), **without** the full PCM in memory — so it composes with Phase 21 windowed refill,
|
||||
which calls the **same** index resolver. The earlier "approximate page-interpolation" language is rejected.
|
||||
|
||||
**Constraints/invariants:** keep the bespoke graph (no MSE); preprocessing is **additive** (WAV path
|
||||
untouched, byte-for-byte; a track with no Opus artifact still plays losslessly); reuse the landed
|
||||
`Range`/offset seek path; no format branches leak outside the new decoder + one selection arm + the
|
||||
transcode/delivery seam; transcode failure must not block ingest; format selection is a delivery-time
|
||||
decision resolving one `EntryKey` to one of two artifacts (one source, two views — **not** a second
|
||||
`TrackEntity` row, which would fracture share/queue/play-count/release identity).
|
||||
|
||||
Sequenced as six waves. `18.1 → 18.2 → {18.3, 18.4} → 18.5`, with `18.6` (Settings menu) able to run in
|
||||
parallel (it needs only 18.3's format mechanism before its toggle is live). **18.1 (ingest transcode +
|
||||
seek-index + setup-header derived artifacts) is the cold-start prerequisite** — nothing downstream has
|
||||
bytes to serve, decode, or seek against until those artifacts exist.
|
||||
|
||||
- **18.1 — Ingest transcode + seek-index + setup-header (cold-start; load-bearing).** New
|
||||
`OpusTranscodeService`/processor in `DeepDrftContent`, invoked post-store from
|
||||
`UnifiedTrackService.UploadAsync` alongside `WaveformProfileService` **as a background job** (OQ6);
|
||||
produces Ogg Opus fullband 320; **walks the encoded stream once to build the granule→byte seek index and
|
||||
extract the `OpusHead`/`OpusTags` setup header**; stores the Opus bytes **and** the combined seek/setup
|
||||
**sidecar** as derived artifacts (recommend a `track-opus` vault). Failure-tolerant. **Independent of the
|
||||
delivery/decoder waves — can begin immediately.**
|
||||
- **18.2 — Storage + lookup contract.** The derived-artifact key/vault convention (Opus bytes + sidecar) +
|
||||
server-side "given `EntryKey` + format, return the right `AudioBinary` + content-type (+ the sidecar),"
|
||||
including the lossless fallback. **Depends on 18.1.**
|
||||
- **18.3 — Delivery: `?format=opus|lossless` param + sidecar serving + proxy threading.** On the
|
||||
`DeepDrftAPI` stream endpoint (resolves via 18.2), forwarded through `TrackProxyController` (mirror
|
||||
`offset`), `Range` serving the chosen artifact; **plus serving the seek/setup sidecar**; player sends the
|
||||
format param via `TrackMediaClient`. **Depends on 18.2; parallel-ok with 18.4.**
|
||||
- **18.4 — `OpusFormatDecoder` + index-based seek resolver in the player stack.** New `IFormatDecoder`
|
||||
(Ogg-page segmenting via `OggS` scan, `OpusHead`/`OpusTags` setup carry from the cached sidecar,
|
||||
**`calculateByteOffset` that binary-searches the precomputed seek index** — NOT interpolation — with an
|
||||
`OpusSeekData` accelerator holding the parsed index + setup bytes, and the one-time sidecar fetch+parse on
|
||||
track load) + one arm in `createFormatDecoder` on `audio/ogg`/`audio/opus`; capability detection for the
|
||||
lossless fallback. **Depends on 18.2; parallel-ok with 18.3.**
|
||||
- **18.5 — Backfill + replace-audio + end-to-end validation (incl. seek accuracy).** "Backfill Opus" CMS
|
||||
bulk action (third sibling to Generate-Profiles / Backfill-High-res), rebuilding Opus bytes + sidecar for
|
||||
existing tracks; replace-audio Opus + sidecar regeneration; the AC1–AC10 acceptance pass **including AC9
|
||||
(an Opus seek lands at the correct time, not approximately)** and the Phase-21 handshake (Opus windowable
|
||||
via the index resolver + sidecar setup header). **Depends on 18.1–18.4.**
|
||||
- **18.6 — Public Settings menu + quality toggle (the listener selection UX).** New public-site
|
||||
Settings-menu shell (app-bar trigger + MudBlazor menu + a settings-item abstraction + a
|
||||
`PublicSiteSettings`/`ListenerSettings` object + the dark-mode-pattern persistence seam: `streamQuality`
|
||||
cookie, a `DeepDrftPublic` prerender-read service, `PersistentComponentState` bridge, client cookie
|
||||
service); the **quality toggle is its first occupant** (Low-data/Lossless, Opus default, capability-gated)
|
||||
+ the CMS upload meter's **Post-Processing phase** (OQ6). Built design-for-adaptability so dark mode can
|
||||
plug in later without restructuring (not migrated now). **Depends on 18.3** for the toggle; the menu shell
|
||||
can be built ahead. *Splittable* (shell, then toggle) if Daniel wants the shell proven first.
|
||||
|
||||
**Dependency shape:** `18.1 → 18.2 → {18.3 ∥ 18.4} → 18.5`; `18.6 ∥` (needs 18.3 for the live toggle);
|
||||
18.1 is the only cold-start wave. **Phase-level: 18 precedes Phase 21** (windowed refill consumes the Phase
|
||||
18 seek-index resolver). **OQ1–OQ7 RESOLVED (above); OQ7 (seek-index granularity) = 0.5 s buckets.** None
|
||||
block 18.1.
|
||||
See `COMPLETED.md` for Phase 18 — dual-format lossless + Opus delivery, including the as-built WebCodecs decoder divergence — which landed on `streaming-overhaul` (2026-06-23).
|
||||
|
||||
---
|
||||
|
||||
## Phase 21 — Windowed Streaming Buffer (bounded client memory for long streams)
|
||||
## Phase 21 — Windowed Streaming Buffer (Completed)
|
||||
|
||||
Bound the **client memory** a playing track consumes to a small, configurable forward window —
|
||||
**independent of total stream length** — so a 1 GB+ DJ MIX (Phase 9 `Mix` medium: a single long track)
|
||||
plays without the whole decoded PCM accumulating in the browser. **Public listener site only**
|
||||
(`DeepDrftPublic.Client` player stack + `DeepDrftPublic` TypeScript audio interop); no CMS, no API
|
||||
endpoint, no schema change.
|
||||
|
||||
**Sequenced AFTER Phase 18 (Opus Low-Data Streaming) — Daniel, 2026-06-23.** Format support (the
|
||||
derived Ogg Opus 320 low-data path, Phase 18) is a prerequisite that comes first; windowing must work
|
||||
across **both** delivery formats. Phase 21's C5 invariant already anticipated this ("must not foreclose
|
||||
MP3/FLAC"); **Opus is now the concrete VBR/paged driver** — windowing an Opus stream uses the decoder's
|
||||
**accurate index-based** byte↔time mapping (`OpusFormatDecoder.calculateByteOffset`, a binary search in the
|
||||
Phase 18 precomputed seek index — *not* the exact CBR-WAV `byteRate` math, and *not* approximate page
|
||||
interpolation: VBR-safe and exact, per the Phase 18 seek-model resolution 2026-06-23). The windowed refill
|
||||
controller calls the **same** index resolver an explicit seek does, and a window opening away from byte 0
|
||||
still decodes via the Phase 18 sidecar setup header. Build the window machinery format-agnostically so it
|
||||
inherits Opus for free.
|
||||
|
||||
The network path already streams in adaptive 16–64 KB chunks. The accumulation is on the **decode
|
||||
side**: `PlaybackScheduler` holds an `AudioBuffer[]` it **never evicts** ("Supports pause/resume/seek by
|
||||
retaining all buffers" — its own doc comment). Decoded PCM is larger than the source (Web Audio is
|
||||
32-bit float per sample/channel — a 16-bit stereo WAV roughly doubles once decoded), so a 1 GB WAV
|
||||
becomes ~2 GB of retained float data. That is the OOM. The fix: hold only a sliding forward window plus a
|
||||
small back-retain, discard already-played buffers, and refill on demand.
|
||||
|
||||
**Architectural spine — a sliding window keyed on playback position, built as a generalization of the
|
||||
landed seek-beyond-buffer path.** The Phase 4 HTTP `Range: bytes=X-` → 206 primitive already does every
|
||||
plumbing primitive the window needs (discard-buffers-keep-offset via `clearForSeek`/`setPlaybackOffset`;
|
||||
fetch-from-offset via `TrackMediaClient`; decode-header-less-body via
|
||||
`StreamDecoder.reinitializeForRangeContinuation`; time→byte via `IFormatDecoder.calculateByteOffset`),
|
||||
just triggered manually and one-shot. The only genuinely new mechanisms are **partial eviction** on the
|
||||
scheduler and **back-pressure** on the forward read loop (stop calling `ReadAsync` above a high-water
|
||||
mark, resume below low-water). Recommended **Direction A** (sliding window on the existing single forward
|
||||
stream); **Direction B** (discrete Range-fetched segments — the HLS/DASH/MSE-eviction analogue) held as
|
||||
the documented fallback; **Direction C** (adopt MSE and let the browser manage the buffer) **rejected
|
||||
(OQ5 = NO, Daniel 2026-06-23)** — the bespoke Web Audio graph is a deliberate long-term commitment, and
|
||||
the compressed-delivery move that would have justified MSE is met instead by **Phase 18 (Opus) feeding
|
||||
the same bespoke graph** through the `IFormatDecoder` seam. Direction A is therefore the permanent
|
||||
destination, not a stopgap MSE would retire.
|
||||
|
||||
**Invariants that must hold (the §3.5 seam contract).** Reuse the Range path, don't fork it; playback-
|
||||
start latency at parity; the `IFormatDecoder` abstraction untouched (windowing is format-agnostic, so
|
||||
wiring MP3/FLAC later inherits it free); read-only playback (no new control); the single-instance JS
|
||||
decoder stays single-writer (every refill routes through the existing cancellation/drain discipline). The
|
||||
**Mix visualizer is provably unaffected** — it renders from the preprocessed per-track high-res datum
|
||||
(Phase 10/12), never from live decoded PCM, so evicting played buffers cannot starve it. The 1 GB mix is
|
||||
both the canonical case *and* the proof the eviction is safe.
|
||||
|
||||
**Interaction with deferred Phase 1 features (same seam):** windowing should land **before** preload
|
||||
(1.3) — it makes preload of long tracks memory-safe by construction (a staged next-track decoder inherits
|
||||
the bounded scheduler); it makes crossfade (1.4) between two long mixes affordable (the overlap doubles
|
||||
the *window*, not the track); it adds a minor "don't evict the final window before the gapless boundary"
|
||||
care point for 1.5. It **enlarges the error surface** (1.6): windowed refill issues mid-stream fetches
|
||||
the listener didn't initiate, one of which can fail deep into a 1 GB mix — so the *cheap* half of 1.6
|
||||
(clean refill-failure handling, no wedged player) is folded into this phase's acceptance criteria, not
|
||||
left fully to 1.6.
|
||||
|
||||
Full design, the three directions with SOLID/road-not-taken rationale, use cases, acceptance criteria,
|
||||
the open-question set, and the wave decomposition: `product-notes/phase-21-windowed-streaming-buffer.md`.
|
||||
|
||||
Sequenced as four waves. `21.1 → 21.2 → 21.3`, with `21.4` validating the whole. **21.1 is the cold-start
|
||||
prerequisite and the load-bearing change** — independent of the open questions (window *sizes* are
|
||||
parameters fed in later).
|
||||
|
||||
- **21.1 — Partial eviction in `PlaybackScheduler` (cold-start; load-bearing).** Drop already-played
|
||||
buffers while keeping the position/index/time-anchor bookkeeping exact against a buffer array that no
|
||||
longer begins at absolute time 0 (today `getCurrentPosition`/`playFromPosition`/the schedule loop all
|
||||
assume `buffers[0]` is the track start). The hardest correctness work in the phase. No refill yet.
|
||||
**Independent of the open questions — can begin immediately.**
|
||||
- **21.2 — Back-pressure on the forward read loop.** Stop `ReadAsync` above the high-water mark, resume
|
||||
below low-water; together with 21.1 this bounds *both* the played and unplayed regions (the AC1
|
||||
guarantee). Routes resume/pause through the existing single-loop cancellation discipline. **Depends on
|
||||
21.1.**
|
||||
- **21.3 — Seek-back-past-window refill.** When a backward seek lands earlier than the retained tail,
|
||||
refetch via the existing seek-beyond-buffer Range path pointed at the earlier offset; plus the minimal
|
||||
clean refill-failure handling (the 1.6 adjacency). Mostly reuse of the landed seek path. **Depends on
|
||||
21.1 + 21.2.**
|
||||
- **21.4 — Validation against the 1 GB target (acceptance).** Memory profiling (bounded under 1 GB is the
|
||||
headline), latency parity, edge-to-edge playback, the seek matrix, induced refill failure, visualizer-
|
||||
running, rapid-seek concurrency. Largely measurement; breaks are tuning fixes in 21.1's anchor math or
|
||||
21.2's water-marks. **Depends on 21.1–21.3.**
|
||||
|
||||
**Dependency shape:** `21.1 → 21.2 → 21.3 → 21.4`; 21.1 is the only cold-start wave. **Phase-level
|
||||
prerequisite: Phase 18 (Opus) lands first** so windowing is built against both formats. **Open questions
|
||||
for Daniel (spec §6):** window-size policy axis (time-based window + memory guard — recommended); seek-
|
||||
back-past-window re-buffer acceptable (recommend yes, symmetric to forward); a hard total in-flight
|
||||
memory cap as a guard rail (recommend yes); window everything vs. only long tracks (recommend everything
|
||||
— one path, short tracks never hit a refill). **OQ5 (adopt MSE) — RESOLVED NO (Daniel 2026-06-23): the
|
||||
bespoke graph stays by deliberate choice; recorded considered-and-declined, kept visible per file
|
||||
convention.** None block 21.1.
|
||||
See `COMPLETED.md` for Phase 21 — bounded client memory for long streams, including the as-built Direction A→B pivot — which landed on `streaming-overhaul` (2026-06-24).
|
||||
|
||||
---
|
||||
|
||||
|
||||
Reference in New Issue
Block a user