daniel-c-harvey 5c3c3c3d0c docs(plan): commit Phase 4.1 to Option A1 (Range headers, custom decoder)
Record the design-gate decision for HTTP Range support: Range headers in
the JS fetch retaining the AudioBuffer decoder, rejecting MediaElement
(loses early-playback) and synthesized-header-over-Range (breaks caching
invariant). Add per-file shape, acceptance criteria, and the file-absolute
offset constraint. Tighten 4.2 — disk-streaming already done on the
default path; only the legacy offset branch remains.
2026-06-09 06:33:29 -04:00

PLAN.md — DeepDrftHome forward roadmap

Forward-looking roadmap. Sits alongside CONTEXT.md (architecture orientation) and COMPLETED.md (history). Per CONTEXT.md §6, items move from here to COMPLETED.md when work lands; do not delete completed entries.

Organised by theme, not by date. Themes are roughly ordered by current product weight, not commitment. Nothing here carries a timeline unless it explicitly says so.


0. Baseline — what just landed

A two-part audit (design + streaming) ran on 2026-05-17 and the fixes for Critical, Major, and Minor findings are now on dev. The remainder of this plan assumes that baseline. In summary, the audit pass fixed:

  • Index concurrency: VaultIndexDirectory no longer drops the lock before its async disk write; the index file can no longer be clobbered by interleaved writers.
  • Repository semantics: TrackRepository.Update now fails fast when an Id is not found instead of silently issuing an INSERT.
  • Streaming Criticals: concurrent-seek race in the client, dirty trailing bytes leaking out of the ArrayPool-rented buffer, final-tail audio dropped at EOF below the minimum decode frame, and the assumption that the first network chunk contains the whole WAV header.
  • 17 design and streaming Majors/Minors across all eight projects: format-validation alignment between processor/offset/decoder, IAsyncDisposable on the player provider, cancellation tokens threaded through the HTTP path, structured logging into the FileDatabase subsystem, sort-sentinel cleanup, sundry DRY/SRP tightenings.

What this means for the roadmap: the streaming substrate is solid. Future work can build on top of it rather than around it. The remaining items in TODO-V2.md that did not land are deferred as features, not bugs — they are captured below under Phase 1.


Phase 1 — Streaming features deferred from the audit

These were flagged during the audit but classified as feature work, not defect fixes. They are listed in rough order of user-visible impact.

1.2 Audio format diversity

  • What: Today AudioProcessor, WavOffsetService, and the JS decoder are PCM/WAV-only. MimeTypeExtensions already maps MP3, FLAC, Ogg, AAC, M4A — none are wired.
  • Why it matters: WAV-only is a real ceiling for any non-internal release. Distribution-grade formats (MP3, FLAC at minimum) are table stakes for a music site.
  • Shape: Two seams need a strategy pattern.
    • Server side: replace AudioProcessor.ProcessWavFileAsync with a format-router that selects a per-format processor; replace WavOffsetService with a per-format offset strategy (some formats — MP3, Ogg — have natural frame boundaries; FLAC has block headers; AAC has ADTS).
    • Client side: the JS decoder is currently a WAV byte-walker. For non-WAV, the simplest path is decodeAudioData over the full payload (loses streaming-start). The richer path is per-format chunked decoders. Worth a design pass before committing.
  • Prerequisite: None functionally, but consider settling Phase 4 (HTTP Range) first — native range/cache is much more important for large MP3s than for WAVs.
  • Constraint: Spectrum FFT tap currently relies on raw AudioBuffers through decodeAudioData. If a future path uses MediaElementAudioSourceNode (see 4.1), the FFT tap still works but the early-playback story changes.
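
The client-side seam described above could be sketched as a small strategy registry keyed by MIME type. This is only an illustration under stated assumptions: FormatDecoderStrategy, DecoderRouter, and the two strategy objects are hypothetical names that do not exist in the codebase, and the MIME strings are common values that may not match what MimeTypeExtensions actually maps.

```typescript
// Hypothetical strategy shape; not an existing type in the project.
interface FormatDecoderStrategy {
  readonly mimeTypes: readonly string[];
  // True when playback can begin before the full payload has arrived.
  readonly supportsStreamingStart: boolean;
}

// Stands in for the existing WAV byte-walker.
const wavStrategy: FormatDecoderStrategy = {
  mimeTypes: ["audio/wav", "audio/x-wav"],
  supportsStreamingStart: true,
};

// Stands in for the "decodeAudioData over the full payload" path.
// The MIME list is an assumption, not taken from MimeTypeExtensions.
const wholeFileStrategy: FormatDecoderStrategy = {
  mimeTypes: ["audio/mpeg", "audio/flac", "audio/ogg", "audio/aac", "audio/mp4"],
  supportsStreamingStart: false,
};

class DecoderRouter {
  private readonly byMime = new Map<string, FormatDecoderStrategy>();

  register(strategy: FormatDecoderStrategy): void {
    for (const mime of strategy.mimeTypes) {
      this.byMime.set(mime.toLowerCase(), strategy);
    }
  }

  // Strips parameters such as "; codecs=..." before the lookup.
  resolve(contentType: string): FormatDecoderStrategy | undefined {
    const mime = (contentType.split(";")[0] ?? "").trim().toLowerCase();
    return this.byMime.get(mime);
  }
}
```

A caller would register both strategies once and resolve against the response Content-Type; an undefined result is the place to surface an "unsupported format" error rather than guessing a decoder.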

1.3 Preload / prefetch of the next track

  • What: No mechanism to begin the next track's stream during the tail of the current. Each play is a cold fetch.
  • Why it matters: Prerequisite for both crossfade (1.4) and gapless (1.5). Also a perceived-latency win on its own — track-change feels instant when the bytes are already in flight.
  • Shape: A second HttpClient request kicked off when the current track passes a configurable threshold (e.g. last 10 seconds). Bytes accumulate into a staged StreamDecoder instance rather than the live one. Promotion to "current" happens at end-of-stream or on user-selected next.
  • Prerequisite: Requires a notion of "next track" — today the player only knows the current one. That implies either a playlist/queue model in IPlayerService or a passive "what was the next row in the gallery" inference.
  • Open question: Does a queue model belong in IPlayerService, or is the player a single-slot device that a future PlaylistService orchestrates above? Worth a design note before implementation. Capture in product notes when picked up.
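
The threshold trigger in the Shape above can be sketched as a pure decision function. PrefetchState and shouldStartPrefetch are hypothetical names, and the sketch assumes the player can report position and duration in seconds; it says nothing about where the "next track" comes from.

```typescript
// Hypothetical per-track state; reset when a new track becomes current.
interface PrefetchState {
  started: boolean;
}

// Returns true exactly once per track, when the remaining time
// drops to or below the configured threshold.
function shouldStartPrefetch(
  positionSeconds: number,
  durationSeconds: number,
  thresholdSeconds: number,
  state: PrefetchState,
): boolean {
  if (state.started) return false; // already kicked off for this track
  if (!Number.isFinite(durationSeconds) || durationSeconds <= 0) return false;
  const remaining = durationSeconds - positionSeconds;
  if (remaining > thresholdSeconds) return false;
  state.started = true;
  return true;
}
```

Keeping the decision separate from the request itself means the same check works whether a queue model or a gallery-row inference supplies the next track.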

1.4 Crossfade

  • What: Smooth A→B transition with overlapping fade-out / fade-in.
  • Why it matters: DJ/mix aesthetic that fits the DeepDrft collective's electronic-music context. Distinguishing UX from generic "next track."
  • Shape: Architecturally two simultaneous PlaybackScheduler instances suffice — each owns its own gain node, crossfaded via GainNode.gain.linearRampToValueAtTime. The wiring is the work, not the audio graph itself.
  • Prerequisite: 1.3 (Preload) — there is nothing to fade into without prefetch.
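
As a rough sketch of the gain wiring named above: the two ramps can be scheduled against each scheduler's gain parameter. scheduleCrossfade and RampParam are hypothetical; RampParam is a minimal structural slice of the Web Audio AudioParam so the sketch also type-checks outside a browser. It assumes the outgoing track is at full gain when the fade begins.

```typescript
// Minimal structural slice of AudioParam (GainNode.gain satisfies it).
interface RampParam {
  setValueAtTime(value: number, time: number): unknown;
  linearRampToValueAtTime(value: number, endTime: number): unknown;
}

// Schedules a linear A-to-B crossfade and returns the time at which it ends.
function scheduleCrossfade(
  outgoing: RampParam,
  incoming: RampParam,
  startTime: number,
  durationSeconds: number,
): number {
  const endTime = startTime + durationSeconds;
  // A linear ramp starts from the previous scheduled event,
  // so both params are anchored at the fade start first.
  outgoing.setValueAtTime(1, startTime);
  incoming.setValueAtTime(0, startTime);
  outgoing.linearRampToValueAtTime(0, endTime);
  incoming.linearRampToValueAtTime(1, endTime);
  return endTime;
}
```

In a browser this would be called as scheduleCrossfade(gainA.gain, gainB.gain, ctx.currentTime, 4). A linear ramp is what the plan names; an equal-power curve is a common alternative if the mid-fade level dip is audible.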

1.5 Gapless playback

  • What: Eliminate the inter-track silence that exists today.
  • Why it matters: Important for live-set rips, mix tapes, anything authored to flow continuously.
  • Shape: The decoder must be able to start the next track's first buffer scheduled exactly at the end of the current one's last buffer (sample-accurate, not wall-clock). With PlaybackScheduler's existing 500 ms lookahead this is mechanically achievable — the next track's first AudioBufferSourceNode.start(t) is set to the previous track's end time.
  • Prerequisite: 1.3 (Preload). Also needs to play nicely with 1.2 because gapless across formats is hard (encoder padding/priming on MP3 in particular).
  • Constraint: Truly sample-accurate gapless requires knowing the priming/padding sample counts of the source format. Out of scope for WAV-only; revisit when format diversity lands.
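
The scheduling arithmetic in the Shape above can be sketched as follows. ScheduledTrack, trackEndTime, and gaplessStartTime are hypothetical names; the sketch assumes the scheduler can report the frame count it has queued for the current track, and it ignores encoder priming/padding, which is the constraint noted above.

```typescript
// Hypothetical summary of what a scheduler has queued for one track.
interface ScheduledTrack {
  startTime: number;   // AudioContext time at which the first buffer starts
  totalFrames: number; // sample frames scheduled across all buffers
  sampleRate: number;  // Hz
}

// Derives the end time from the frame count rather than from
// wall-clock measurement or summed per-buffer durations.
function trackEndTime(track: ScheduledTrack): number {
  return track.startTime + track.totalFrames / track.sampleRate;
}

// Time to pass to the next track's first AudioBufferSourceNode.start().
function gaplessStartTime(previous: ScheduledTrack, contextNow: number): number {
  // A start time in the past plays immediately, so clamp to "now"
  // to keep later buffer offsets consistent if the handoff is late.
  return Math.max(trackEndTime(previous), contextNow);
}
```

If the clamp ever engages, the transition is no longer gapless; that case is worth logging because it means the preload did not arrive inside the lookahead window.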

1.6 Track-skip on error

  • What: A failed processStreamingChunk aborts the entire load with no recovery path.
  • Why it matters: One corrupt frame at byte 4M of a 100 MB stream currently means the listener loses the entire track. Should at minimum surface a clear error and (optionally) skip past the bad region.
  • Shape: Two-level response.
    • Cheap: catch in the streaming loop, surface a user-visible error, advance the gallery to the next track if a queue exists.
    • Richer: byte-scan forward to the next valid frame header for the format and resume. Format-dependent — only worth doing once 1.2 lands.
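
The cheap level could look roughly like the loop below. runChunkLoop and ChunkLoopResult are hypothetical; processChunk stands in for processStreamingChunk, and onError is where the caller would surface the message and, if a queue exists, advance to the next track. The real loop is asynchronous; this synchronous form only shows the control flow.

```typescript
// Hypothetical result shape for one pass over a track's chunks.
interface ChunkLoopResult {
  processedChunks: number;
  failedAtChunk: number | null;
  error: string | null;
}

// Stops at the first failing chunk and reports it instead of throwing.
function runChunkLoop(
  chunks: readonly Uint8Array[],
  processChunk: (chunk: Uint8Array) => void,
  onError: (message: string, chunkIndex: number) => void,
): ChunkLoopResult {
  let index = 0;
  for (const chunk of chunks) {
    try {
      processChunk(chunk);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      onError(message, index);
      return { processedChunks: index, failedAtChunk: index, error: message };
    }
    index++;
  }
  return { processedChunks: index, failedAtChunk: null, error: null };
}
```

Returning the failing index gives the richer level a starting point for its forward byte-scan once per-format frame detection exists.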

1.7 Safari compatibility

  • What: Two known Safari edge cases.
    • webkitAudioContext.close() is async-but-not-Promise on older Safari (≤ ~14); await resolves immediately and the next initialize() can run against a not-yet-closed context.
    • iOS Safari < 15 had streaming-fetch quirks; HttpCompletionOption.ResponseHeadersRead behaviour is not guaranteed there.
  • Why it matters: Real listener share. iOS in particular is a primary listening surface for music.
  • Shape: For the close() race — detect webkitAudioContext and poll state === "closed" with a short timeout instead of trusting the await. For the fetch quirks — first decide the minimum supported iOS version; if pre-15 is in scope, fall back to a non-streaming fetch path and accept the latency.
  • Open question: What's the floor? Decide before designing the fallback. iOS 15+ as the floor would let us drop the second concern entirely.
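
For the close() race, the polling approach described in the Shape above might be sketched like this. closeAndWait and ClosableContext are hypothetical names, and the timeout and poll interval are illustrative defaults rather than tuned values.

```typescript
// Minimal structural slice of AudioContext / webkitAudioContext.
interface ClosableContext {
  readonly state: string;
  close(): unknown; // may not return a Promise on older WebKit
}

// Resolves true once the context reports "closed", false on timeout.
async function closeAndWait(
  ctx: ClosableContext,
  timeoutMs = 500,
  pollIntervalMs = 10,
): Promise<boolean> {
  // Awaiting a non-Promise resolves immediately, which is the race
  // described above, so the state is polled instead of trusted.
  await ctx.close();
  const deadline = Date.now() + timeoutMs;
  while (ctx.state !== "closed") {
    if (Date.now() >= deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, pollIntervalMs));
  }
  return true;
}
```

On browsers where close() returns a real Promise the loop exits on its first check, so the same path can serve both cases. A false result is the signal to delay or refuse the next initialize().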

These follow from CONTEXT.md §5. Direction is strongly implied but no specific UI has been committed.

2.2 Album and genre views

  • What: TrackCard already renders album/genre/release date; the data is there. Missing are gallery groupings (album view, genre view), filters, and the API-side support for filter expressions in TrackService.GetPaged.
  • Why it matters: The track gallery is the only working content surface. Multiple views over the same library is how it earns the "gallery" name.
  • Shape: Per CONTEXT.md §6, the convention is one source of truth, multiple views over it. New views should consume the same TracksViewModel / PagedResult<TrackEntity> and differ only at the rendering layer.
    • TrackService.GetPaged extended to accept a filter expression (or a simple structured filter DTO).
    • PagingParameters<T> extended with a Where: Expression<Func<T, bool>>? or a parallel FilterParameters<T> — pick one to avoid drift.
    • New routes (/albums, /genres) consume the same VM with different grouping / filter inputs.
  • Prerequisite: 2.1 for any view that prominently features cover art (album view especially is impoverished without it).

2.3 Search and filter

  • What: TracksViewModel exposes sort but no filter. TrackService.GetPaged accepts only sort. Simple text search across TrackName / Artist / Album is the obvious first cut.
  • Why it matters: Once the library has more than ~30 entries, sort-only browsing is friction.
  • Shape: Same extension to GetPaged as 2.2. UI is a debounced text input bound to the VM's filter property. EF Core translates Contains to SQLite LIKE.
  • Prerequisite: Fold into 2.2 if both are being done — the same GetPaged extension serves both. Doing them separately doubles the API churn.

Phase 3 — New content kinds

3.1 Live / session content

  • What: The home page advertises "Live Sessions" and "Video Content (coming soon)". No data model exists for these.
  • Why it matters: Honour the home page copy. Also differentiates the site from a generic track gallery — live sessions and video are the collective's authored output.
  • Shape: Speculative; no commitment yet.
    • Likely new entity table(s) sibling to TrackEntity (SessionEntity, VideoEntity?) — or a polymorphic MediaEntity with discriminator. The choice affects how much code in TrackService / TrackController can be reused.
    • New vault type(s). MediaVaultType.Media exists and is the obvious home for video; sessions are probably still Audio.
    • New routes, new UI surfaces, new player considerations (video has its own playback element and does not go through the WAV decoder).
  • Prerequisite: Probably 2.1 (vault wiring proof) and a decision on the entity model before any code lands.
  • [speculative] — direction inferred from home-page copy, not a Daniel-confirmed commitment.

Phase 4 — Infrastructure / delivery

4.1 HTTP Range + CDN caching

  • What: Today's ?offset= query parameter defeats HTTP caching — a CDN sees ?offset=1234567 as a distinct URL from the un-offset request. The architecture re-invents byte-range on top of a custom query param. Move the player's transport to standard HTTP Range headers against one canonical URL.
  • Why it matters: Material once the site has real listener traffic. Also relevant to non-WAV formats (1.2) where decoder-side seek is cheaper natively.
  • Chosen approach (design pass 2026-06-09): Option A1, Range headers in the JS fetch while keeping the custom AudioBuffer decoder.
    • Rejected Option B (MediaElementAudioSourceNode): it surrenders early-playback (the minBuffersForPlayback start-as-soon-as-buffered behaviour, a listed quality feature) and forces a redesign of the waveform-seek and early-play UX, while delivering no caching benefit beyond what the HTTP layer already gives.
    • Rejected Option A2 (synthesised header delivered over Range): keeping WavOffsetService on the hot path means each bytes=X- request produces a distinct synthesised prefix that can't share cache lineage with the canonical bytes=0- object, defeating half the caching win.
    • Why A1: it makes the cached object the real file, so every Range request is a true sub-range of one entity.
    • Key enabling insight: StreamDecoder already synthesises a per-segment 44-byte header internally for every decodeAudioData call (createWavFile), so a Range continuation only needs to retain the parsed WavHeader and feed raw PCM; it does not need a header in the network stream.
  • Shape (implementation direction):
    • Server (DeepDrftAPI/Controllers/TrackController.cs ~L407): flip enableRangeProcessing: false → true on the no-offset seekable FileStream path; ASP.NET Core slices natively and emits 206 + Content-Range. Leave the ?offset= / WavOffsetService branch reachable but off the player hot path — its removal is a clean follow-up commit, not part of this change.
    • Proxy (DeepDrftPublic/Controllers/TrackProxyController.cs ~L175): forward the incoming Range request header upstream; pass through upstream status (206/200/416) and the Content-Range / Accept-Ranges / Content-Length response headers verbatim. The proxy is a transparent relay — it does not slice the (non-seekable) upstream stream. Keep ResponseHeadersRead + RegisterForDispose.
    • Client transport (DeepDrftPublic.Client/Clients/TrackMediaClient): send Range: bytes={byteOffset}- instead of the ?offset= query param (byteOffset == 0 → bytes=0-, single code path). Confirm TrackMediaResponse.ContentLength carries the 206 remaining-length for continuations and full length for the initial request.
    • JS decoder (StreamDecoder.ts — the real work): add a continuation mode. Replace reinitializeForOffset (which nulls wavHeader and re-parses) with a reinitializeForRangeContinuation(remainingByteLength) that retains the parsed WavHeader, resets rawChunks/totalRawBytes/processedBytes/streamComplete, and routes incoming bytes straight to addRawData (the existing if (!this.wavHeader) branch already does this when the header is set). Add an isContinuation flag so updateStreamCompleteFlag() uses totalRawBytes without the + headerSize addend on continuations. createWavFile, the decode pipeline, and the spectrum/level tap are all unchanged.
    • AudioPlayer.ts / index.ts: keep the public reinitializeFromOffset interop name (so AudioInteropService and the C# caller are untouched); internally call the continuation reinit. C# StreamingAudioPlayerService.SeekBeyondBuffer is otherwise unchanged.
  • Acceptance criteria:
    1. Initial load sends Range: bytes=0-; server responds 206/200 with Accept-Ranges: bytes; time-to-first-audio unchanged (early playback after minBuffersForPlayback).
    2. Seek-beyond-buffer sends Range: bytes=X- (block-aligned, file-absolute X) with no ?offset= anywhere; server responds 206 + Content-Range; audio resumes with no click/pop and no header bytes leaking into PCM.
    3. Displayed total duration is unchanged across a seek (original full-track duration, not remaining-segment).
    4. A track seeked-near-end then played out fires the end callback exactly once (continuation streamComplete math correct).
    5. Spectrum visualiser and LevelMeterFab behave identically pre/post on a loud master (-3 dBFS).
    6. Same-URL invariant: two different-offset requests hit an identical URL differing only in the Range header (verifiable in the network panel; live CDN cache-hit verification is out of scope — no CDN in dev).
    7. No MediaElement introduced; the AudioBufferSourceNode graph remains the playback path.
  • Constraints (non-obvious):
    • Range offset is file-absolute, not audio-relative. The old ?offset= contract was audio-data-relative (WavOffsetService added HeaderSize server-side). The Range offset must be header.headerSize + blockAlignedAudioOffset. Omitting headerSize lands the seek ~44 bytes early — audible click + position drift. Most likely bug; verify first.
    • Only the continuation skips header parse; the initial bytes=0- response still flows through tryParseHeader unchanged. Don't let the continuation flag bleed into initial load.
    • Proxy must pass Accept-Ranges / Content-Range (and a 416) through verbatim — stripping them blinds the browser and any future CDN.
    • A1 preserves the multi-format (1.2) seam: the decoder stays the format integration point; the "retain format, skip header, treat bytes as frame data" pattern generalises (frame-boundary alignment differs per format). Add no new WAV-specific coupling in the transport/proxy layers beyond what already exists.
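
The file-absolute offset constraint and the continuation completeness rule can be sketched together. All names here (WavHeaderInfo, rangeStartForSeek, rangeHeaderValue, isStreamComplete) are hypothetical, and the completeness check is an assumption about what updateStreamCompleteFlag compares, based only on the description above.

```typescript
// Assumed slice of the parsed WAV header; field names are illustrative.
interface WavHeaderInfo {
  headerSize: number; // bytes before the PCM data (44 for a canonical WAV)
  blockAlign: number; // bytes per sample frame (channels * bytes per sample)
}

// Converts an audio-data-relative byte offset into a file-absolute,
// block-aligned Range start. Omitting headerSize lands the seek early.
function rangeStartForSeek(header: WavHeaderInfo, audioByteOffset: number): number {
  const aligned = audioByteOffset - (audioByteOffset % header.blockAlign);
  return header.headerSize + aligned;
}

// Open-ended range: from byteOffset to the end of the file.
function rangeHeaderValue(byteOffset: number): string {
  return `bytes=${byteOffset}-`;
}

// Initial load: header bytes were consumed separately, so they are added
// back before comparing against the response length. Continuation: the
// response is pure PCM, so no header addend applies.
function isStreamComplete(
  totalRawBytes: number,
  expectedBytes: number,
  headerSize: number,
  isContinuation: boolean,
): boolean {
  const received = isContinuation ? totalRawBytes : totalRawBytes + headerSize;
  return received >= expectedBytes;
}
```

For a 16-bit stereo file (blockAlign 4, headerSize 44), an audio offset of 1000003 aligns down to 1000000 and yields Range: bytes=1000044-. The initial load uses rangeHeaderValue(0) directly and does not go through rangeStartForSeek.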

4.2 Server-side stream from disk (no buffer materialisation)

  • What: The no-offset path already streams from disk — TrackController (~L390) takes mediaStream.Stream (a FileStream from LoadResourceStreamAsync), reads streamLength from .Length, and hands ownership to File(...); no LoadResourceAsync buffer materialisation on the default path. The remaining buffer materialisation is only the legacy ?offset= branch (~L414): GetAudioBinaryAsync loads the full AudioBinary into memory because WavOffsetService reslices over the in-memory buffer.
  • Why it matters: Scaling ceiling on the offset path specifically. Once 4.1 (A1) lands, the offset branch is off the player hot path, so its buffer cost stops mattering in practice.
  • Shape: Resolved for the default path. The only outstanding work is retiring the offset branch entirely — which is the 4.1 follow-up commit (remove the ?offset= server branch, WavOffsetService, and the now-unused ConcatStream). No separate work item beyond that cleanup.

4.3 Dual-write rollback / dead-letter log

  • What: If content-side write succeeds and SQL-side write fails, audio is orphaned in the vault. No compensating mechanism exists.
  • Why it matters: A latent data-integrity issue. Materially riskier once web upload (2.4) exists.
  • Shape: Audit suggested a DeadLetterLog recording orphaned entryKeys for a periodic maintenance pass. Lighter than full transactional rollback (which the dual-database split fundamentally cannot give us).
  • Prerequisite: None. Worth landing alongside or just before 2.4.

Phase 5 — Documentation backlog

5.1 Folder-level CLAUDE.md sweep

  • What: Eight folder-level CLAUDE.md files need writing/rewriting per the brief in DOC_PLAN.md. Five are rewrites (drift from the .NET 10 upgrade and structural moves); three are new (DeepDrftWeb.Services, DeepDrftContent.Services — the two libraries where most domain logic now lives — plus the open question on DeepDrftContent.Services/FileDatabase/README.md).
  • Why it matters: The agent guidance files are how every future implementer (human or agent) gets oriented in a directory. They are currently misleading in ways that will cause wrong assumptions on first contact — claiming .NET 9, referencing MediaPath that has been EntryKey for two migrations, describing a FileDatabase/ tree inside DeepDrftContent that has moved out, and missing entirely for the two *.Services libraries.
  • Shape: Doc-keeper executes against DOC_PLAN.md. Order of operations and the per-folder briefs are already specified there.
  • Prerequisite: None. Can run fully in parallel with any feature work.
  • Constraint: Wait on Daniel for the DeepDrftContent.Services/FileDatabase/README.md judgement call before that file changes (retire, keep + refresh, or replace with a CLAUDE.md). The other seven can proceed without that decision.

Cross-cutting / not yet themed

A small set of items that are real but don't fit a phase yet. Surface them when they become relevant rather than committing now.

  • Identity / accounts. Currently no user concept. Needed before web upload (2.4); also a precondition for favourites, listening history, per-user playlists. Decide the shape before any of those lands. [speculative] until Daniel signals interest.
  • ITrackService interface. Audit-suggested. Low value today (one consumer pair); higher value when the test surface expands beyond FileDatabase.
  • Test coverage outside FileDatabase. Tests today cover the FileDatabase subsystem comprehensively and nothing else. As features in Phases 1–4 land, test scope should expand: at minimum WavOffsetService, AudioProcessor, TrackService (both sides), and the streaming player services. Not a phase of its own; an attached cost to feature work.

Working with this file

  • Add items by extending an existing phase first; only create a new phase when the addition genuinely doesn't fit any of 1–5. Phase numbers are organisational, not sequencing.
  • When something lands, move it to COMPLETED.md rather than deleting it. Keep the original "What / Why / Shape" body intact so the history reads as a record of the decision, not just the outcome.
  • Mark genuinely uncertain items [speculative] so future readers can tell what is direction vs. commitment.
  • Open questions belong in the item that raises them, not in a separate "questions" list — they expire when the item does.