Merge streaming-overhaul into dev (Opus low-data streaming, windowed streaming, HW-accel-off stabilization)

This commit is contained in:
daniel-c-harvey
2026-06-26 11:14:59 -04:00
97 changed files with 10086 additions and 591 deletions
+21 -4
View File
@@ -66,11 +66,28 @@ If step 1 succeeds and step 2 fails, audio is orphaned in the vault (no rollback
The player is not fetch-then-play:
1. Client calls `GET api/track/{id}` on DeepDrftContent and receives WAV bytes as a stream (`HttpCompletionOption.ResponseHeadersRead`).
2. `StreamingAudioPlayerService` reads in adaptive 1664 KB chunks, pushes each via `AudioInteropService.processStreamingChunk`.
3. TypeScript `StreamDecoder` parses WAV header, decodes chunks to `AudioBuffer`s. `PlaybackScheduler` schedules them on a Web Audio graph.
1. Client calls `GET api/track/{id}` on DeepDrftContent and receives the first bounded segment (`Range: bytes=0-{SegmentSizeBytes-1}`, 4 MB) via `HttpCompletionOption.ResponseHeadersRead`.
2. `StreamingAudioPlayerService` reads in adaptive 1664 KB chunks within each segment, pushes each via `AudioInteropService.processStreamingChunk`.
3. TypeScript `StreamDecoder` parses the format header, decodes chunks to `AudioBuffer`s. `PlaybackScheduler` schedules them on a Web Audio graph.
4. Playback starts as soon as a min buffer is queued; UI duration from parsed header (not waiting for full file).
5. **Seek beyond buffer**: if seek target is past what's decoded, client issues `GET api/track/{id}` with `Range: bytes={byteOffset}-`. Server streams raw bytes from that file-absolute offset with a `206 Partial Content` response. Player retains the parsed WAV header and feeds the raw PCM continuation into the existing decode pipeline.
5. **Seek beyond buffer**: if seek target is past what's decoded, `StreamingAudioPlayerService` issues a new bounded segment starting at the seek byte offset. Server responds 206; player retains the parsed header and feeds the raw continuation into the existing decode pipeline.
**Memory bounding (three complementary layers, all required):**
- **Raw-queue bound (`StreamDecoder`):** `releaseConsumedChunks()` front-compacts `rawChunks` after each aligned segment is decoded, using a `discardedBytes` absolute cursor so all offsets remain absolute even as the array's front moves. Without this, a long WAV (e.g. a 92-min mix ≈ 970 MB raw) accumulates its entire decoded-from body in `rawChunks` regardless of the decode-side bounds below.
- **Decoded-queue bound (`PlaybackScheduler`):** `evictPlayedBuffers()` discards already-played `AudioBuffer`s, capping the scheduler's forward fill to a 96 MB ceiling (Phase 21.2). Decoded PCM is larger than source (Web Audio uses 32-bit float — a 16-bit stereo WAV roughly doubles; Opus decodes to the same float footprint).
- **Network bound (segmented fetch, Phase 21 Direction B):** The forward stream is fetched as sequential bounded `bytes=cursor-{cursor+4MB-1}` Range requests via `RunSegmentedStreamAsync`; the next segment is fetched only after `DrainBackpressureAsync` confirms the scheduler is below low-water. Because each segment is fully consumed before the next is issued, the browser holds at most ~one segment of raw bytes. A per-chunk drain inside the segment loop (gated on `_streamingPlaybackStarted` so it cannot deadlock first-audio) additionally prevents high-density codecs (e.g. Opus, where a 4 MB segment is ~100 s of audio) from decoding the whole segment eagerly ahead of the playhead before the inter-segment gate runs.
**Playback stability invariants (streaming-stabilization arc):**
- **Back-pressure water marks:** forward fill 60s high / 30s low (`PlaybackScheduler.DEFAULT_FORWARD_HIGH_WATER_SECONDS` / `...LOW_WATER_SECONDS`). Production pauses on `lookahead ≥ high OR decoded bytes > 96 MB ceiling`, whichever first. The time window is a jitter cushion for Opus's async decode ramp; the byte cap is the hard OOM guarantee and is unchanged.
- **Genuine end-of-playback:** `PlaybackScheduler` uses a `streamComplete` flag (set by `setStreamComplete`) combined with an `underrun_` park/resume state to distinguish a drained-but-still-streaming queue (startup/underrun gap → park, resume on refill) from a truly finished track (stream complete AND queue drained → `finishPlayback()`, the single genuine-end path). Prevents the false `onPlaybackEnded` that previously fired when the Opus WebCodecs queue momentarily drained during the decode ramp.
- **Rebuffer hysteresis:** Opus playback start and underrun-resume are gated on `hasMinimumPlaybackLead()` (1s decoded lead, `DEFAULT_MIN_PLAYBACK_LEAD_SECONDS`); WAV keeps `hasMinimumBuffers(6)`. `streamComplete` overrides the gate so a short tail still plays out. `StreamingAudioPlayerService.TryStartPlaybackAsync()` force-starts after `MarkStreamCompleteAsync` when the threshold was never crossed (ultra-short track protection).
- **Opus `AudioContext` pre-aligned to 48 kHz** in `AudioPlayer.initializeStreaming` before any bytes flow, eliminating the mid-decode context teardown that previously OOM'd the tab under software rendering.
**Visualizer / decode contention (decodePressure + hwAccel):**
- `DeepDrftPublic/Interop/audio/decodePressure.ts` exports a shared `DecodePressureSignal` singleton. The audio pipeline (`OpusStreamDecoder` yield-cap events, `PlaybackScheduler` underrun-parking events) calls `report()` on sustained lag; the visualizer calls `isUnderPressure()` each rAF frame. The signal engages only after ≥ 5 reports within 2500 ms with a 1s minimum hold (hysteresis); a lone startup-ramp blip never engages.
- Under pressure, `WaveformVisualizer.ts` throttles its rAF loop to ~15 fps (`PRESSURE_THROTTLE_FRAME_MS = 1000/15`), cutting main-thread WebGL software-render + physics cost so WebCodecs decode recovers. A no-op under HW accel.
- `DeepDrftPublic/Interop/visualizer/hwAccel.ts` probes `WEBGL_debug_renderer_info.UNMASKED_RENDERER_WEBGL` for software-renderer signatures (SwiftShader, llvmpipe, softpipe, Microsoft Basic Render, etc.). Absence of the debug extension is treated as accelerated — lava is disabled only on positive evidence. On first interactive render, `WaveformVisualizerControlState.ApplyCapabilityDefault(bool)` applies a one-time scoped default: `LavaEnabled = false` (the expensive subsystem) when no HW accel is detected, `WaveformEnabled` stays on. Guarded by `_capabilityDefaultApplied`; never overrides an explicit user toggle.
- **Known limitation / deferred escalation:** HW-accel-off Opus playback is made usable by defaulting lava off and throttling under pressure. If starvation recurs (e.g. waveform-only path also proves too costly, or a software renderer slips the probe), the documented next step is moving Opus WebCodecs decode off the main thread (Web Worker / AudioWorklet) so it stops competing with main-thread rendering. Not a current implementation — a recorded fallback.
Keep this seam clean — it is the most architecturally load-bearing part of the playback path.