# PLAN.md — DeepDrftHome forward roadmap

Forward-looking roadmap. Sits alongside `CONTEXT.md` (architecture orientation) and `COMPLETED.md` (history). Per `CONTEXT.md §6`, items move from here to `COMPLETED.md` when work lands; do not delete completed entries.

Organised by **theme**, not by date. Themes are roughly ordered by current product weight, not commitment. Nothing here carries a timeline unless it explicitly says so.

---

## 0. Baseline — what just landed

A two-part audit (design + streaming) ran on 2026-05-17 and the fixes for Critical, Major, and Minor findings are now on `dev`. The remainder of this plan assumes that baseline. In summary, the audit pass fixed:

- **Index concurrency** — `VaultIndexDirectory` no longer drops the lock before its async disk write; the index file can no longer be clobbered by interleaved writers.
- **Repository semantics** — `TrackRepository.Update` now fails fast when an `Id` is not found instead of silently issuing an `INSERT`.
- **Streaming Criticals** — concurrent-seek race in the client, dirty trailing bytes leaking out of the `ArrayPool`-rented buffer, final-tail audio dropped at EOF below the minimum decode frame, and the assumption that the first network chunk contains the whole WAV header.
- **17 design and streaming Majors/Minors** across all eight projects — format-validation alignment between processor/offset/decoder, `IAsyncDisposable` on the player provider, cancellation tokens threaded through the HTTP path, structured logging into the FileDatabase subsystem, sort-sentinel cleanup, sundry DRY/SRP tightenings.

What this means for the roadmap: the streaming substrate is solid. Future work can build on top of it rather than around it. The remaining items in `TODO-V2.md` that did not land are **deferred as features, not bugs** — they are captured below under Phase 1.

---

## Phase 1 — Streaming features deferred from the audit

These were flagged during the audit but classified as feature work, not defect fixes. They are listed in rough order of user-visible impact.

### 1.2 Audio format diversity

- **What:** Today `AudioProcessor`, `WavOffsetService`, and the JS decoder are PCM/WAV-only. `MimeTypeExtensions` already maps MP3, FLAC, Ogg, AAC, M4A — none are wired.
- **Why it matters:** WAV-only is a real ceiling for any non-internal release. Distribution-grade formats (MP3, FLAC at minimum) are table stakes for a music site.
- **Shape:** Two seams need a strategy pattern.
  - Server side: replace `AudioProcessor.ProcessWavFileAsync` with a format-router that selects a per-format processor; replace `WavOffsetService` with a per-format offset strategy (some formats — MP3, Ogg — have natural frame boundaries; FLAC has block headers; AAC has ADTS).
  - Client side: the JS decoder is currently a WAV byte-walker. For non-WAV, the simplest path is `decodeAudioData` over the full payload (loses streaming-start). The richer path is per-format chunked decoders. Worth a design pass before committing.
- **Prerequisite:** None functionally, but consider settling **Phase 4 (HTTP Range)** first — native range/cache is much more important for large MP3s than for WAVs.
- **Constraint:** Spectrum FFT tap currently relies on raw `AudioBuffer`s through `decodeAudioData`. If a future path uses `MediaElementAudioSourceNode` (see 4.1), the FFT tap still works but the early-playback story changes.
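
As a rough illustration of the strategy seam described under Shape, the sketch below routes by MIME type to a per-format offset strategy. It is written in TypeScript for brevity even though the server side is C#. `FormatInfo`, `OffsetStrategy`, `wavStrategy`, `strategies`, and `alignSeekOffset` are all hypothetical names, and only the WAV case is filled in.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not the project's API.
interface FormatInfo {
  blockAlign: number; // bytes per sample frame (PCM)
}

interface OffsetStrategy {
  // Map a requested audio-relative byte offset to a safe decode boundary.
  align(audioOffset: number, info: FormatInfo): number;
}

// PCM/WAV: any multiple of blockAlign is a valid frame boundary.
const wavStrategy: OffsetStrategy = {
  align: (audioOffset: number, info: FormatInfo): number =>
    Math.floor(audioOffset / info.blockAlign) * info.blockAlign,
};

// Only WAV is registered here; other formats would add their own strategies.
const strategies = new Map<string, OffsetStrategy>([["audio/wav", wavStrategy]]);

function alignSeekOffset(mime: string, audioOffset: number, info: FormatInfo): number {
  const strategy = strategies.get(mime);
  if (!strategy) {
    throw new Error(`No offset strategy registered for ${mime}`);
  }
  return strategy.align(audioOffset, info);
}
```

An unregistered format fails loudly here rather than falling back to WAV alignment, which would silently land seeks mid-frame.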

### 1.3 Preload / prefetch of the next track

- **What:** No mechanism to begin the next track's stream during the tail of the current. Each play is a cold fetch.
- **Why it matters:** Prerequisite for both crossfade (1.4) and gapless (1.5). Also a perceived-latency win on its own — track-change feels instant when the bytes are already in flight.
- **Shape:** A second `HttpClient` request kicked off when the current track passes a configurable threshold (e.g. last 10 seconds). Bytes accumulate into a staged `StreamDecoder` instance rather than the live one. Promotion to "current" happens at end-of-stream or on user-selected next.
- **Prerequisite:** Requires a notion of "next track" — today the player only knows the current one. That implies either a playlist/queue model in `IPlayerService` or a passive "what was the next row in the gallery" inference.
- **Open question:** Does a queue model belong in `IPlayerService`, or is the player a single-slot device that a future `PlaylistService` orchestrates above? Worth a design note before implementation. Capture in product notes when picked up.
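
The threshold trigger in the Shape bullet could look something like the following. This is a minimal sketch only: `PrefetchState` and `shouldStartPrefetch` are hypothetical names, and the 10-second default simply mirrors the example threshold given above.

```typescript
// Hypothetical sketch of the prefetch trigger; not the project's player code.
interface PrefetchState {
  started: boolean; // true once the next track's request has been kicked off
}

function shouldStartPrefetch(
  currentTimeSec: number,
  durationSec: number,
  state: PrefetchState,
  thresholdSec: number = 10, // example threshold, meant to be configurable
): boolean {
  // Never fire twice, and never fire before the duration is known.
  if (state.started || durationSec <= 0) {
    return false;
  }
  const remainingSec = durationSec - currentTimeSec;
  return remainingSec <= thresholdSec;
}
```

The caller would set `state.started = true` as soon as it issues the second request, so repeated position updates inside the threshold window do not start duplicate fetches.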

### 1.4 Crossfade

- **What:** Smooth A→B transition with overlapping fade-out / fade-in.
- **Why it matters:** DJ/mix aesthetic that fits the DeepDrft collective's electronic-music context. Distinguishes the UX from a generic "next track."
- **Shape:** Architecturally two simultaneous `PlaybackScheduler` instances suffice — each owns its own gain node, crossfaded via `GainNode.gain.linearRampToValueAtTime`. The wiring is the work, not the audio graph itself.
- **Prerequisite:** **1.3 (Preload)** — there is nothing to fade *into* without prefetch.
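
A minimal sketch of the two-gain-node ramp described under Shape. `GainParamLike`, `GainNodeLike`, and `scheduleCrossfade` are hypothetical names. The structural interfaces exist only so the sketch runs outside a browser; a real Web Audio `GainNode` should satisfy them.

```typescript
// Minimal structural types so the sketch does not need a browser AudioContext.
interface GainParamLike {
  setValueAtTime(value: number, startTime: number): unknown;
  linearRampToValueAtTime(value: number, endTime: number): unknown;
}

interface GainNodeLike {
  gain: GainParamLike;
}

// Hypothetical helper: fade `outgoing` down and `incoming` up over the same window.
function scheduleCrossfade(
  outgoing: GainNodeLike,
  incoming: GainNodeLike,
  startTime: number,
  durationSec: number,
): void {
  const endTime = startTime + durationSec;
  // Pin the starting values so each ramp begins from a known point.
  outgoing.gain.setValueAtTime(1, startTime);
  incoming.gain.setValueAtTime(0, startTime);
  outgoing.gain.linearRampToValueAtTime(0, endTime);
  incoming.gain.linearRampToValueAtTime(1, endTime);
}
```

A linear ramp is what the Shape bullet names. An equal-power curve is a common alternative when a dip in perceived loudness at the midpoint matters.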

### 1.5 Gapless playback

- **What:** Eliminate the inter-track silence that exists today.
- **Why it matters:** Important for live-set rips, mix tapes, anything authored to flow continuously.
- **Shape:** The decoder must be able to start the next track's first buffer scheduled exactly at the end of the current one's last buffer (sample-accurate, not wall-clock). With `PlaybackScheduler`'s existing 500 ms lookahead this is mechanically achievable — the next track's first `AudioBufferSourceNode.start(t)` is set to the previous track's end time.
- **Prerequisite:** **1.3 (Preload)**. Also needs to play nicely with **1.2** because gapless across formats is hard (encoder padding/priming on MP3 in particular).
- **Constraint:** Truly sample-accurate gapless requires knowing the priming/padding sample counts of the source format. Out of scope for WAV-only; revisit when format diversity lands.
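
To make "sample-accurate, not wall-clock" concrete, the sketch below derives the next track's start time from the current track's frame count. `trackEndTime` is a hypothetical helper and the numbers are illustrative; it assumes PCM with no encoder priming or padding, which is the WAV-only case.

```typescript
// Hypothetical sketch: compute the end time of a track on the audio clock so
// the next track's first buffer can be started exactly there.
function trackEndTime(startTime: number, totalFrames: number, sampleRate: number): number {
  if (sampleRate <= 0) {
    throw new Error("sampleRate must be positive");
  }
  // Duration comes from the frame count, not from a wall-clock measurement.
  return startTime + totalFrames / sampleRate;
}

const currentTrackStart = 2.0; // audio-clock time at which the current track began
const nextTrackStart = trackEndTime(currentTrackStart, 441000, 44100); // 10 s at 44.1 kHz
```

`nextTrackStart` is the value that would be passed to the next track's first `AudioBufferSourceNode.start(t)`.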

### 1.6 Track-skip on error

- **What:** A failed `processStreamingChunk` aborts the entire load with no recovery path.
- **Why it matters:** One corrupt frame at byte 4M of a 100 MB stream currently means the listener loses the entire track. Should at minimum surface a clear error and (optionally) skip past the bad region.
- **Shape:** Two-level response.
  - Cheap: catch in the streaming loop, surface a user-visible error, advance the gallery to the next track if a queue exists.
  - Richer: byte-scan forward to the next valid frame header for the format and resume. Format-dependent — only worth doing once **1.2** lands.
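
The "cheap" response could be sketched roughly as below. `ChunkProcessor`, `StreamOutcome`, and `runStreamingLoop` are hypothetical names, and the chunk processor is injected as a stand-in for `processStreamingChunk` so the control flow can be shown on its own.

```typescript
// Hypothetical sketch of the cheap path: stop on the first bad chunk, report
// it, and tell the caller whether skipping to the next track is possible.
type ChunkProcessor = (chunk: Uint8Array) => void;

interface StreamOutcome {
  processedChunks: number;
  error?: string;        // user-visible message when a chunk failed
  skipToNext: boolean;   // true only when a chunk failed and a queue exists
}

function runStreamingLoop(
  chunks: Uint8Array[],
  processChunk: ChunkProcessor,
  hasQueue: boolean,
): StreamOutcome {
  let processed = 0;
  for (const chunk of chunks) {
    try {
      processChunk(chunk);
      processed++;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      // Surface the failure instead of aborting silently.
      return { processedChunks: processed, error: message, skipToNext: hasQueue };
    }
  }
  return { processedChunks: processed, skipToNext: false };
}
```

The richer byte-scan recovery would replace the early `return` with a resync step, which is format-dependent and so is left out here.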

### 1.7 Safari compatibility

- **What:** Two known Safari edge cases.
  - `webkitAudioContext.close()` is async-but-not-Promise on older Safari (≤ ~14); `await` resolves immediately and the next `initialize()` can run against a not-yet-closed context.
  - iOS Safari < 15 had streaming-fetch quirks; `HttpCompletionOption.ResponseHeadersRead` behaviour is not guaranteed there.
- **Why it matters:** Real listener share. iOS in particular is a primary listening surface for music.
- **Shape:** For the `close()` race — detect `webkitAudioContext` and poll `state === "closed"` with a short timeout instead of trusting the `await`. For the fetch quirks — first decide the minimum supported iOS version; if pre-15 is in scope, fall back to a non-streaming fetch path and accept the latency.
- **Open question:** What's the floor? Decide before designing the fallback. iOS 15+ as the floor would let us drop the second concern entirely.

---

## Phase 2 — Product surface: gallery, browsing, ingestion

These follow from `CONTEXT.md §5`. Direction is strongly implied but no specific UI has been committed.

### 2.2 Album and genre views

- **What:** `TrackCard` already renders album/genre/release date; the data is there. Missing are gallery groupings (album view, genre view), filters, and the API-side support for filter expressions in `TrackService.GetPaged`.
- **Why it matters:** The track gallery is the only working content surface. Multiple views over the same library is how it earns the "gallery" name.
- **Shape:** Per `CONTEXT.md §6`, the convention is one source of truth, multiple views over it. New views should consume the same `TracksViewModel` / `PagedResult<TrackEntity>` and differ only at the rendering layer.
  - `TrackService.GetPaged` extended to accept a filter expression (or a simple structured filter DTO).
  - `PagingParameters<T>` extended with a `Where: Expression<Func<T, bool>>?` or a parallel `FilterParameters<T>` — pick one to avoid drift.
  - New routes (`/albums`, `/genres`) consume the same VM with different grouping / filter inputs.
- **Prerequisite:** **2.1** for any view that prominently features cover art (album view especially is impoverished without it).

### 2.3 Search and filter on the gallery

- **What:** `TracksViewModel` exposes sort but no filter. `TrackService.GetPaged` accepts only sort. Simple text search across `TrackName` / `Artist` / `Album` is the obvious first cut.
- **Why it matters:** Once the library has more than ~30 entries, sort-only browsing is friction.
- **Shape:** Same extension to `GetPaged` as 2.2. UI is a debounced text input bound to the VM's filter property. EF Core translates `Contains` to SQLite `LIKE`.
- **Prerequisite:** Fold into 2.2 if both are being done — the same `GetPaged` extension serves both. Doing them separately doubles the API churn.

---

## Phase 3 — New content kinds

### 3.1 Live / session content

- **What:** The home page advertises "Live Sessions" and "Video Content (coming soon)". No data model exists for these.
- **Why it matters:** Honour the home page copy. Also differentiates the site from a generic track gallery — live sessions and video are the collective's authored output.
- **Shape:** Speculative; no commitment yet.
  - Likely new entity table(s) sibling to `TrackEntity` (`SessionEntity`, `VideoEntity`?) — or a polymorphic `MediaEntity` with discriminator. The choice affects how much code in `TrackService` / `TrackController` can be reused.
  - New vault type(s). `MediaVaultType.Media` exists and is the obvious home for video; sessions are probably still `Audio`.
  - New routes, new UI surfaces, new player considerations (video has its own playback element and does not go through the WAV decoder).
- **Prerequisite:** Probably **2.1** (vault wiring proof) and a decision on the entity model before any code lands.
- **`[speculative]`** — direction inferred from home-page copy, not a Daniel-confirmed commitment.

---

## Phase 4 — Infrastructure / delivery

### 4.1 HTTP Range + CDN caching

- **What:** Today's `?offset=` query parameter defeats HTTP caching — a CDN sees `?offset=1234567` as a distinct URL from the un-offset request. The architecture re-invents byte-range on top of a custom query param. Move the player's transport to standard HTTP `Range` headers against one canonical URL.
- **Why it matters:** Material once the site has real listener traffic. Also relevant to non-WAV formats (1.2) where decoder-side seek is cheaper natively.
- **Chosen approach (design pass 2026-06-09): Option A1 — Range headers in the JS fetch, keep the custom `AudioBuffer` decoder.**
  - **Rejected Option B (`MediaElementAudioSourceNode`):** it surrenders early-playback (the `minBuffersForPlayback` start-as-soon-as-buffered behaviour, a listed quality feature) and forces a redesign of the waveform-seek and early-play UX, while delivering no caching benefit beyond what the HTTP layer already gives.
  - **Rejected A2 (synthesised header delivered over Range):** keeping `WavOffsetService` on the hot path means each `bytes=X-` request produces a distinct synthesised prefix that can't share cache lineage with the canonical `bytes=0-` object, defeating half the caching win. A1 makes the cached object the *real file*, so every Range request is a true sub-range of one entity.
  - **Key enabling insight:** `StreamDecoder` already synthesises a per-segment 44-byte header internally for every `decodeAudioData` call (`createWavFile`), so a Range continuation only needs to *retain* the parsed `WavHeader` and feed raw PCM — it does not need a header in the network stream.
- **Shape (implementation direction):**
  - **Server (`DeepDrftAPI/Controllers/TrackController.cs` ~L407):** flip `enableRangeProcessing: false → true` on the no-offset seekable `FileStream` path; ASP.NET Core slices natively and emits `206` + `Content-Range`. Leave the `?offset=` / `WavOffsetService` branch reachable but off the player hot path — its removal is a clean follow-up commit, not part of this change.
  - **Proxy (`DeepDrftPublic/Controllers/TrackProxyController.cs` ~L175):** forward the incoming `Range` request header upstream; pass through upstream status (`206`/`200`/`416`) and the `Content-Range` / `Accept-Ranges` / `Content-Length` response headers verbatim. The proxy is a transparent relay — it does **not** slice the (non-seekable) upstream stream. Keep `ResponseHeadersRead` + `RegisterForDispose`.
  - **Client transport (`DeepDrftPublic.Client/Clients/TrackMediaClient`):** send `Range: bytes={byteOffset}-` instead of the `?offset=` query param (`byteOffset == 0` → `bytes=0-`, single code path). Confirm `TrackMediaResponse.ContentLength` carries the 206 remaining-length for continuations and full length for the initial request.
  - **JS decoder (`StreamDecoder.ts` — the real work):** add a continuation mode. Replace `reinitializeForOffset` (which nulls `wavHeader` and re-parses) with a `reinitializeForRangeContinuation(remainingByteLength)` that **retains** the parsed `WavHeader`, resets `rawChunks`/`totalRawBytes`/`processedBytes`/`streamComplete`, and routes incoming bytes straight to `addRawData` (the existing `if (!this.wavHeader)` branch already does this when the header is set). Add an `isContinuation` flag so `updateStreamCompleteFlag()` uses `totalRawBytes` **without** the `+ headerSize` addend on continuations. `createWavFile`, the decode pipeline, and the spectrum/level tap are all unchanged.
  - **`AudioPlayer.ts` / `index.ts`:** keep the public `reinitializeFromOffset` interop name (so `AudioInteropService` and the C# caller are untouched); internally call the continuation reinit. C# `StreamingAudioPlayerService.SeekBeyondBuffer` is otherwise unchanged.
- **Acceptance criteria:**
  1. Initial load sends `Range: bytes=0-`; server responds `206`/`200` with `Accept-Ranges: bytes`; time-to-first-audio unchanged (early playback after `minBuffersForPlayback`).
  2. Seek-beyond-buffer sends `Range: bytes=X-` (block-aligned, file-absolute X) with **no `?offset=` anywhere**; server responds `206` + `Content-Range`; audio resumes with no click/pop and no header bytes leaking into PCM.
  3. Displayed total duration is unchanged across a seek (original full-track duration, not remaining-segment).
  4. A track seeked-near-end then played out fires the end callback exactly once (continuation `streamComplete` math correct).
  5. Spectrum visualiser and `LevelMeterFab` behave identically pre/post on a loud master (−3 dBFS).
  6. Same-URL invariant: two different-offset requests hit an identical URL differing only in the `Range` header (verifiable in the network panel; live CDN cache-hit verification is out of scope — no CDN in dev).
  7. No `MediaElement` introduced; the `AudioBufferSourceNode` graph remains the playback path.
- **Constraints (non-obvious):**
  - **Range offset is file-absolute, not audio-relative.** The old `?offset=` contract was audio-data-relative (`WavOffsetService` added `HeaderSize` server-side). The Range offset must be `header.headerSize + blockAlignedAudioOffset`. Omitting `headerSize` lands the seek ~44 bytes early — audible click + position drift. **Most likely bug; verify first.**
  - Only the *continuation* skips header parse; the initial `bytes=0-` response still flows through `tryParseHeader` unchanged. Don't let the continuation flag bleed into initial load.
  - Proxy must pass `Accept-Ranges` / `Content-Range` (and a `416`) through verbatim — stripping them blinds the browser and any future CDN.
  - A1 preserves the multi-format (1.2) seam: the decoder stays the format integration point; the "retain format, skip header, treat bytes as frame data" pattern generalises (frame-boundary alignment differs per format). Add no new WAV-specific coupling in the transport/proxy layers beyond what already exists.
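
The file-absolute offset constraint and the continuation `streamComplete` arithmetic are the two places where an off-by-`headerSize` error is most likely, so a small sketch may help. Everything here is an assumption for illustration: `WavHeaderLike` guesses at the parsed header's shape, and `blockAlignedAudioOffset`, `rangeOffset`, `rangeHeaderValue`, and `isStreamComplete` are hypothetical helpers, not the existing `StreamDecoder` methods.

```typescript
// Hypothetical sketch of the Range arithmetic; the real WavHeader may differ.
interface WavHeaderLike {
  headerSize: number; // bytes before PCM data (44 for a canonical WAV)
  byteRate: number;   // bytes of PCM per second
  blockAlign: number; // bytes per sample frame
}

// Audio-relative offset, rounded down to a whole sample frame.
function blockAlignedAudioOffset(header: WavHeaderLike, seekSec: number): number {
  const raw = Math.floor(seekSec * header.byteRate);
  return raw - (raw % header.blockAlign);
}

// File-absolute offset: the header bytes must be added back on the client,
// because the server no longer adds them as the old ?offset= contract did.
function rangeOffset(header: WavHeaderLike, seekSec: number): number {
  return header.headerSize + blockAlignedAudioOffset(header, seekSec);
}

// Open-ended range, so offset 0 and a seek share one code path.
function rangeHeaderValue(byteOffset: number): string {
  return `bytes=${byteOffset}-`;
}

// Completion check. On a continuation the response carries PCM only, so the
// received count has no headerSize addend. Assumes contentLength is the
// response body length (remaining length on a 206 continuation).
function isStreamComplete(
  totalRawBytes: number,
  contentLength: number,
  headerSize: number,
  isContinuation: boolean,
): boolean {
  const received = isContinuation ? totalRawBytes : totalRawBytes + headerSize;
  return received >= contentLength;
}
```

For 16-bit stereo at 44.1 kHz (`byteRate` 176400, `blockAlign` 4), a seek to 1.5 s gives an audio-relative offset of 264600 and therefore `Range: bytes=264644-`. Sending `bytes=264600-` instead is the 44-bytes-early bug the first constraint warns about.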

### 4.2 Server-side stream from disk (no buffer materialisation)

- **What:** The no-offset path **already** streams from disk — `TrackController` (~L390) takes `mediaStream.Stream` (a `FileStream` from `LoadResourceStreamAsync`), reads `streamLength` from `.Length`, and hands ownership to `File(...)`; no `LoadResourceAsync` buffer materialisation on the default path. The remaining buffer materialisation is **only** the legacy `?offset=` branch (~L414): `GetAudioBinaryAsync` loads the full `AudioBinary` into memory because `WavOffsetService` reslices over the in-memory buffer.
- **Why it matters:** Scaling ceiling on the offset path specifically. Once 4.1 (A1) lands, the offset branch is off the player hot path, so its buffer cost stops mattering in practice.
- **Shape:** Resolved for the default path. The only outstanding work is retiring the offset branch entirely — which is the 4.1 follow-up commit (remove the `?offset=` server branch, `WavOffsetService`, and the now-unused `ConcatStream`). No separate work item beyond that cleanup.

### 4.3 Dual-write rollback / dead-letter log

- **What:** If content-side write succeeds and SQL-side write fails, audio is orphaned in the vault. No compensating mechanism exists.
- **Why it matters:** A latent data-integrity issue. Materially riskier once web upload (2.4) exists.
- **Shape:** Audit suggested a `DeadLetterLog` recording orphaned `entryKey`s for a periodic maintenance pass. Lighter than full transactional rollback (which the dual-database split fundamentally cannot give us).
- **Prerequisite:** None. Worth landing alongside or just before 2.4.
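
The dead-letter idea under Shape could be sketched as follows. This is TypeScript for illustration only, since the real write path is server-side C#. `DeadLetterEntry` and `dualWrite` are hypothetical names, and both writes are injected so the compensating path can be shown without either real store.

```typescript
// Hypothetical sketch: record an orphaned entryKey when the SQL write fails
// after the content write has already succeeded.
interface DeadLetterEntry {
  entryKey: string;
  reason: string;
}

function dualWrite(
  entryKey: string,
  writeContent: (key: string) => void,
  writeSql: (key: string) => void,
  deadLetters: DeadLetterEntry[],
): boolean {
  // A failure here leaves nothing orphaned, so it is allowed to propagate.
  writeContent(entryKey);
  try {
    writeSql(entryKey);
    return true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    // Content now exists without a SQL row: log it for the maintenance pass.
    deadLetters.push({ entryKey, reason });
    return false;
  }
}
```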

---

## Phase 5 — Documentation backlog

### 5.1 Folder-level CLAUDE.md sweep

- **What:** Eight folder-level `CLAUDE.md` files need writing/rewriting per the brief in `DOC_PLAN.md`. Five are rewrites (drift from the `.NET 10` upgrade and structural moves); three are new (`DeepDrftWeb.Services`, `DeepDrftContent.Services` — the two libraries where most domain logic now lives — plus the open question on `DeepDrftContent.Services/FileDatabase/README.md`).
- **Why it matters:** The agent guidance files are how every future implementer (human or agent) gets oriented in a directory. They are currently misleading in ways that will cause wrong assumptions on first contact — claiming `.NET 9`, referencing `MediaPath` that has been `EntryKey` for two migrations, describing a `FileDatabase/` tree inside `DeepDrftContent` that has moved out, and missing entirely for the two `*.Services` libraries.
- **Shape:** Doc-keeper executes against `DOC_PLAN.md`. Order of operations and the per-folder briefs are already specified there.
- **Prerequisite:** None. Can run fully in parallel with any feature work.
- **Constraint:** Wait on Daniel for the `DeepDrftContent.Services/FileDatabase/README.md` judgement call before that file changes (retire, keep + refresh, or replace with a CLAUDE.md). The other seven can proceed without that decision.

---

## Cross-cutting / not yet themed

A small set of items that are real but don't fit a phase yet. Surface them when they become relevant rather than committing now.

- **Identity / accounts.** Currently no user concept. Needed before web upload (2.4); also a precondition for favourites, listening history, per-user playlists. Decide the shape before any of those lands. `[speculative]` until Daniel signals interest.
- **`ITrackService` interface.** Audit-suggested. Low value today (one consumer pair); higher value when the test surface expands beyond FileDatabase.
- **Test coverage outside FileDatabase.** Tests today cover the FileDatabase subsystem comprehensively and nothing else. As features in Phases 1–4 land, test scope should expand — at minimum `WavOffsetService`, `AudioProcessor`, `TrackService` (both sides), and the streaming player services. Not a phase of its own; an attached cost to feature work.

---

## Working with this file

- **Add items by extending an existing phase first**; only create a new phase when the addition genuinely doesn't fit any of 1–5. Phase numbers are organisational, not sequencing.
- **When something lands, move it to `COMPLETED.md`** rather than deleting it. Keep the original "What / Why / Shape" body intact so the history reads as a record of the decision, not just the outcome.
- **Mark genuinely uncertain items `[speculative]`** so future readers can tell what is direction vs. commitment.
- **Open questions belong in the item that raises them**, not in a separate "questions" list — they expire when the item does.