Files
deepdrft/TODO-V2.md
T

299 lines
24 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# DeepDrft — Design & Streaming Audit (2026-05-17)
Two parallel analyses commissioned on 2026-05-17:
1. **Design Audit** — SOLID compliance, DRY violations, encapsulation, documentation, modularity across all 8 projects.
2. **Streaming Evaluation** — Functional correctness, error handling, state machine integrity, performance, and missing capabilities in the audio streaming pipeline.
---
## Part 1: SOLID / DRY / Modularity Design Audit
**Reviewer:** code-reviewer
**Scope:** All 8 projects — DeepDrftModels, DeepDrftWeb.Services, DeepDrftWeb (host), DeepDrftWeb.Client, DeepDrftContent (host), DeepDrftContent.Services, DeepDrftCli, DeepDrftTests.
**Overall Assessment:** The macro architecture is sound — the host/services split is honoured, dual-database responsibilities are cleanly divided, and the FileDatabase subsystem is notably well-structured. The most pressing issues are a concurrency hazard in the `VaultIndexDirectory` lock, a silent data-loss risk in `TrackRepository.Update`, and several smaller modularity leaks. Most findings are in the Major/Minor tier; there are no broad design failures.
---
### 🔴 Critical
**`DeepDrftContent.Services/FileDatabase/Services/IndexSystem.cs:98128` — Lock released before async disk write; race window on index corruption**
`VaultIndexDirectory.AddToIndexAsync` takes `_indexLock` to mutate `_vaultIndex`, then releases it before calling `SaveIndexAsync` (which writes JSON to disk). If a second call arrives after the lock drop but before the file write finishes, it can read stale in-memory state and overwrite with an older version of the index. The `ReloadIndexAsync` path shares the same lock pattern. Fix: persist the in-memory state to a local snapshot before the lock releases, or ensure `SaveIndexAsync` is called inside the lock (acceptable because file I/O is the only async operation and the lock is not held across unrelated waits).
**`DeepDrftWeb.Services/Repositories/TrackRepository.cs:4765``Update` silently creates a new record when `Id` is not found**
If a caller passes a `TrackEntity` with a non-existent `Id`, `Update` falls through to `Create`, issuing an `INSERT` instead of signalling an error. The service wraps this in a `ResultContainer` that will report success, giving the caller no indication that the semantics changed. This matters in the CLI's `ValidateAndUpdateTrackAsync` path. Fix: return a `Result.CreateFailResult("Track not found")` instead of calling `Create`.
---
### 🟡 Major
**`DeepDrftContent.Services/FileDatabase/Services/FileDatabase.cs:120138` — Vacuous try/catch/throw in `CreateVaultAsync`**
The try block catches everything and immediately rethrows. It adds no logging, no transformation, and no cleanup — it is pure noise that also prevents the method from following the project's established exception-swallowing contract for public FileDatabase operations. Either adopt the swallow-and-return-false pattern (matching the rest of the public API) or remove the try/catch entirely.
**`DeepDrftContent.Services/TrackService.cs:7881` — Re-throws `InvalidOperationException` wrapping every exception, including cancellation**
`AddTrackFromWavAsync` catches `Exception ex` in the outer handler and always rethrows as `new InvalidOperationException(...)`. This swallows `OperationCanceledException`, which will lose the cooperative-cancellation semantics expected by the CLI and any future async callers. Fix: add `when (ex is not OperationCanceledException)` to the catch filter, or rethrow cancelled exceptions unwrapped.
**`DeepDrftContent.Services/FileDatabase/Services/IndexSystem.cs:122``Console.WriteLine` used for structured diagnostics in library code**
`VaultIndexDirectory.ReloadIndexAsync` and `IndexWatcher.cs` both emit diagnostic output directly to `Console.WriteLine`. The `DeepDrftContent.Services` library has no dependency on `ILogger`, so when hosted inside ASP.NET Core the messages bypass the structured logging pipeline (no log level, no correlation, no sink routing). Fix: inject `ILogger<T>` into `FileDatabase` and propagate it to `VaultIndexDirectory` and `IndexWatcher`, or accept the logger as an optional constructor parameter so tests can omit it.
**`DeepDrftWeb.Services/TrackService.cs:6076` — Sort-sentinel strings constructed per-service-instance with 64-char aggregations**
The `_sortLastAscending` and `_sortLastDescending` fields use `Enumerable.Repeat(..., 64).Aggregate(...)` — a per-instance heap allocation at construction time. The idiomatic version is `entity.Album ?? ""` (for ascending nulls-to-end). Also note that `_sortLastDescending` is never actually used in the switch — only `_sortLastAscending` is referenced — so one field is entirely dead.
**`DeepDrftWeb.Services/TrackRepository.cs:2836``GetPage` counts all rows then queries a potentially sorted subset; count and page can be inconsistent under concurrent writes**
`CountAsync()` and `ToListAsync()` are two separate database round-trips without a transaction. Under concurrent inserts/deletes a caller could receive a page count for N rows but a page containing N±k rows. More concretely: if filtering is ever added, the `count` query will silently return the wrong `TotalCount`. Consider combining into a single query or adding a comment explaining the deliberate trade-off.
**`DeepDrftContent.Services/FileDatabase/Services/FileDatabase.cs:7999``IndexFactoryService` instantiated on every `GetVaultTypeFromIndex` call**
Each call constructs a fresh `IndexFactoryService` complete with two `Dictionary<IndexType, Func<...>>` initializations. `FileDatabase` already holds `_vaults` and `_indexWatcher` as fields; the factory should be the same. Promote to a private field.
**`DeepDrftModels/DTOs/TrackDto.cs:78``TrackName` and `Artist` are not `required` while their entity counterparts are**
A `new TrackDto()` object silently has null `TrackName` and `Artist`, which can cause NREs or invalid-data bugs if a DTO is instantiated directly. Add `required` to both or annotate with `[NotNull]` and a comment explaining the intentional deviation.
**`DeepDrftWeb.Client/Services/AudioPlayerService.cs` — Base class `StreamAudio` allocates a new `byte[32KB]` buffer on every loop iteration**
Produces GC pressure proportional to audio file size. `StreamingAudioPlayerService` correctly uses `ArrayPool<byte>.Shared`. Use `ArrayPool` in both paths.
**`DeepDrftCli/Services/GuiService.cs``TruncateString` is duplicated verbatim from `CliService`**
`GuiService.TruncateString` and `CliService.TruncateString` are identical private methods. Extract to a shared `CliUtils` static class in the `DeepDrftCli` project.
---
### 🟢 Minor / Style
**`DeepDrftContent.Services/FileDatabase/Models/MediaModels.cs:119123` and `169173``GetExtensionType` duplicated in `ImageBinary` and `AudioBinary`**
Both subclasses define identical single-line forwarding methods. The call sites could reference `MimeTypeExtensions.GetExtension` directly, or the method could live once in `MediaBinary` as `protected static`.
**`DeepDrftContent.Services/FileDatabase/Models/MediaFactories.cs:58` — Multiple `SimpleMediaTypeRegistry` instances across static factory classes**
`MediaVaultTypeMap`, `MetaDataFactory`, `MediaParamsFactory`, `FileBinaryFactory`, and `FileBinaryDtoFactory` each hold their own `private static readonly IMediaTypeRegistry` instance. They should share one.
**`DeepDrftWeb.Services/TrackService.cs` — No interface for `TrackService`; concrete type injected directly**
`TrackController` and `CliService` both depend on concrete `TrackService`. Defining `ITrackService` (or even a minimal `IReadTrackService`) would improve testability and remove the namespace disambiguation noise in `CliService.cs`.
**`DeepDrftWeb.Services/TrackService.cs:1112` — Dead field `_sortLastDescending`**
Declared but never used in the sort switch. Remove it.
**`DeepDrftWeb.Client/Services/AudioPlayerService.cs` — Commented-out code**
`DarkModeSettings.cs` has commented-out `IsDarkModeChanged` event. `AudioPlayerService.SelectTrack` has `// NotifyStateChanged();` commented out. Either remove or document with a `// TODO` explaining intent.
**`DeepDrftWeb.Services/TrackRepository.cs:32``Skip` recalculated manually**
Uses `(pageParameters.Page - 1) * pageParameters.PageSize` instead of `pageParameters.Skip`. Use `pageParameters.Skip`.
**`DeepDrftModels/Models/PagedResult.cs:30``?? new List<T>()` guard is unreachable**
`items.ToList() ?? new List<T>()``ToList()` never returns null. Remove the null-coalescing branch.
**`DeepDrftContent/Controllers/TrackController.cs` — Controller calls `FileDatabase` directly, bypassing `TrackService`**
`GetTrack` and `PutTrack` call `_fileDatabase.LoadResourceAsync` and `RegisterResourceAsync` directly. The controller and CLI service have different entry paths into FileDatabase. If vault initialisation logic changes in `TrackService`, the controller will not pick it up.
**`DeepDrftWeb.Client/Services/AudioInteropService.cs:223241``InvokeJsAsync<T>` uses runtime type-checking switch**
Adding a new result type requires updating this method. Adding an `IOperationResult` marker interface would let the method use a generic constraint and remove the `typeof` switch.
---
### 💡 Suggestions
**No rollback for orphaned vault entries on dual-database write failure.** A compensating mechanism — even a simple `DeadLetterLog` recording orphaned `entryKey` values — would allow a maintenance pass to clean up without manually auditing the vault.
**`StructuralMap` and `StructuralSet` are `Dictionary<string, T>` wrappers for string keys.** The `GetKeyString` fast-path means no JSON overhead occurs in production. Consider whether plain `Dictionary<string, T>` and `HashSet<string>` would serve the same purpose with less indirection.
**`AudioPlayerService.StreamAudio` and `StreamingAudioPlayerService.StreamAudioWithEarlyPlayback` share the read-loop skeleton.** A shared `protected async Task ReadChunks(...)` helper in the base class would unify progress reporting and the do/while structure.
**`IPlayerService` interface exposes `EventCallback?` (a Blazor-specific type) as a settable property.** This leaks a UI framework dependency into a service contract. An `event Action StateChanged` pattern is framework-agnostic.
---
### ✅ Positive Observations
- **The host/services split is genuinely honoured.** Every piece of domain logic is in a `*.Services` project; hosts contain only controllers, middleware, DI wiring, and config. `DeepDrftCli` consuming both service libraries without taking ASP.NET dependencies is a direct payoff of this discipline.
- **The FileDatabase abstraction layer is well-designed.** The `IMediaTypeRegistry` / `SimpleMediaTypeRegistry` pattern cleanly centralises all type-specific factory logic, avoids reflection entirely, and the registration-time lambda approach is expressive.
- **`WavOffsetService` is precise and well-explained.** Block alignment is correctly applied, the chunk-walking loop handles non-fixed-header WAV files (LIST chunks, INFO chunks), and the early-return guards are tight.
- **Result pattern usage is consistent.** Services return `ResultContainer<T>` / `Result`; callers check `.Success` rather than catching; applied uniformly across the SQL-side and CLI paths.
- **Test isolation is solid.** Unique temp-directory per test, best-effort `[TearDown]` cleanup, and `ArrayPool`-safe test data via `TestData.cs` mean test runs are genuinely independent and fast.
- **`TrackDto` / `TrackEntity` separation is preserved** even though today they are structurally identical. This seam means the API response shape can diverge from the database shape without a refactor.
---
## Part 2: Streaming System Functional Evaluation
**Evaluator:** staff-engineer
**Scope:** TypeScript audio pipeline (StreamDecoder, PlaybackScheduler, AudioContextManager, AudioPlayer, SpectrumAnalyzer, wavutils.ts) + C# streaming server (WavOffsetService, AudioProcessor, StreamingAudioPlayerService, AudioInteropService, TrackController).
---
### ✅ What's Working Well
- **Clean SRP decomposition on the JS side.** `AudioContextManager` / `StreamDecoder` / `PlaybackScheduler` / `SpectrumAnalyzer` each own one responsibility; `AudioPlayer` is a thin composer.
- **The seek-beyond-buffer architecture is genuinely good.** Synthesising a fresh 44-byte WAV header server-side and tearing down only the decoder (not the AudioContext or scheduler offset bookkeeping) is the right seam. `playbackOffset` cleanly bridges "the buffer array starts at time T."
- **Block alignment is observed on both sides** (`WavOffsetService.alignedOffset` and `StreamDecoder.calculateByteOffset`). Both round down to `blockAlign` boundaries — correct invariant for click-free seek.
- **Adaptive chunk sizing on the C# side is sensible.** 1664 KB with hysteresis avoids thrashing.
- **`HttpCompletionOption.ResponseHeadersRead`** used correctly — playback can start before the full payload arrives.
- **Sample-rate matching via `recreateWithSampleRate`** re-creates the context at the source rate rather than forcing resampling.
- **Lookahead scheduler (500 ms)** is well-chosen — enough to absorb GC jitter without scheduling so far ahead that pause/seek latency becomes user-visible.
- **`ArrayPool<byte>.Shared` rental** in `StreamAudioWithEarlyPlayback` is correct for a hot loop on large files.
- **Spectrum animation throttled to ~30 fps** and gated by subscriber count — JS interop is the expensive seam, not RAF.
---
### 🔴 Critical
**1. Concurrent seeks race the streaming loop; cancellation does not await the previous loop's exit.**
`DeepDrftWeb.Client/Services/StreamingAudioPlayerService.cs``SeekBeyondBuffer` cancels `_streamingCancellation` and immediately disposes it, then constructs a new one and calls `StreamAudioWithEarlyPlayback`. The previously-running loop's `ReadAsync` will throw `OperationCanceledException` *asynchronously*, after the new loop has already begun pushing chunks to JS. Both loops can be calling `ProcessStreamingChunk` interleaved against the same `StreamDecoder` instance until the old one unwinds. The JS decoder is single-instance per player and cannot distinguish "old stream's late chunk" from "new stream's first chunk."
**Recommendation:** Track the active stream generation as a counter; on each new stream, increment and capture locally; gate `ProcessStreamingChunk` calls on `generation === currentGeneration`. Also `await` the previous streaming task before reinitializing.
**2. Garbage trailing bytes shipped to JS (silent audio corruption).**
`StreamingAudioPlayerService.cs:167``currentBytes == _currentBufferSize ? buffer : buffer[..currentBytes]`. When `currentBytes == _currentBufferSize`, the full `MaxBufferSize` rented array (dirty from `ArrayPool`) is passed to JS interop. Blazor serializes *all* of it including stale bytes past `currentBytes`. The decoder adds garbage audio data.
**Recommendation:** Drop the conditional; always pass `buffer.AsMemory(0, currentBytes).ToArray()`.
**3. Final-tail audio silently dropped at EOF.**
`DeepDrftWeb/Interop/wavutils.ts:176191``getSampleAlignedChunkSize` returns 0 when remaining bytes < minimum decode frame (`512 bytes / 10 frames`). Those bytes are never decoded. `isComplete` eventually flips true, but `tryDecodeNextSegment` has already returned `null` and the trailing audio is silently discarded.
**Recommendation:** Track "stream-complete" state in `StreamDecoder` and, once `isComplete`, decode whatever remains regardless of the minimum.
**4. First chunk assumed to contain the entire WAV header.**
`DeepDrftWeb/Interop/audio/StreamDecoder.ts:5879` and `wavutils.ts:12101` — If the first network chunk is < 44 bytes (possible at TCP segment boundaries), `parseHeader` returns `null` and the decoder throws fatal "Invalid WAV header in first chunk." Also breaks on non-trivial headers with LIST/INFO/JUNK chunks extending past the first chunk.
**Recommendation:** Accumulate raw bytes until `parseHeader` returns non-null before flipping `isFirstChunk`. The current `rawChunks` array is already designed to grow — feed pre-header bytes into it the same way and re-attempt parse on each chunk until success.
---
### 🟡 Major
**5. Format inconsistency: WavOffsetService / AudioProcessor / wavutils.ts disagree on PCM vs. IEEE Float.**
`AudioProcessor.ValidateWavStructure` only accepts PCM (format 1). `WavOffsetService` accepts PCM + IEEE Float (format 3). `WavOffsetService.CreateWavHeader` hard-codes `(short)1` in the synthesized header. A seek on a Float32 WAV would produce a corrupt header claiming PCM with Float-encoded data.
**Recommendation:** Standardize on PCM-only everywhere; enforce identically in all three places; have `CreateWavHeader` accept format as a parameter.
**6. `seekBeyondBuffer` rejects offset 0 as invalid.**
`DeepDrftWeb/Interop/audio/AudioPlayer.ts:288``if (byteOffset <= 0) return error`. `calculateByteOffset(0)` returns 0 legitimately (seek to start). A play-from-zero after a long-running stream cannot recover.
**Recommendation:** Change the guard to `byteOffset < 0`.
**7. Server allocates full `byte[]` for every request — no true server-side streaming.**
`DeepDrftContent/Controllers/TrackController.cs:47``return File(file.Buffer, mimeType)`. `LoadResourceAsync<AudioBinary>` reads the entire file into memory. A 100 MB WAV = 100 MB LOH allocation per request.
**Recommendation:** Add a `LoadResourceStreamAsync` overload returning an open `FileStream`. Have the controller pass it to `File(stream, mime, enableRangeProcessing: true)`.
**8. `?offset=` query parameter instead of HTTP `Range` headers.**
The architecture re-invents byte-range streaming on top of a custom query parameter, defeating HTTP caching (CDN sees `?offset=1234567` as a distinct URL) and preventing use of `enableRangeProcessing: true`. This is architecture-intentional (requires `decodeAudioData` for spectrum analysis) — but the trade-off should be documented. Note: `MediaElementAudioSourceNode` would still permit FFT tap on the audio graph with native range/seek/cache.
**9. 5-second decode timeout silently drops segments.**
`StreamDecoder.ts:166175` — on decode timeout, `tryDecodeNextSegment` catches and returns `null`. The bytes are consumed (`processedBytes` was already advanced), creating a silent ~0.37 s audio gap.
**Recommendation:** On decode failure, decrement `processedBytes` by the segment size for retry, or fail the stream loudly.
**10. `playFromPosition` silently no-ops when position equals total duration.**
`PlaybackScheduler.ts:101113` — if `position` exactly equals total duration, the loop falls through with `startBufferIndex == buffers.length`. A pause near end-of-stream followed by play cannot resume.
**Recommendation:** Clamp to `getTotalDuration() - epsilon` before searching, or detect end-of-stream and call `handlePlaybackEnded`.
**11. `AudioPlayer.play()` does not `await ensureReady()`.**
`AudioPlayer.ts:190``this.contextManager.ensureReady()` returns a Promise that is never awaited. On a backgrounded tab with a suspended AudioContext, `createBufferSource` and `source.start` run against a suspended context; audio silently fails.
**Recommendation:** Make `play()` async and `await ensureReady()`.
**12. Progress callbacks fire after disposal; `AudioPlayerProvider` has no `IAsyncDisposable`.**
`AudioPlayer.ts:455460` and `AudioInteropService.cs:288292` — the JS `setInterval` holds `DotNetObjectReference` after component unmount, throwing every 100 ms. `AudioPlayerProvider` has no `IAsyncDisposable`, so layout teardown leaks the streaming task, AudioContext, and HTTP connection.
**Recommendation:** Make `AudioPlayerProvider` implement `IAsyncDisposable` and call `_audioPlayerService.DisposeAsync()` on dispose.
**13. No cancellation token on the HTTP GET for streaming.**
`StreamingAudioPlayerService` streams without a `CancellationToken` passed to `_http.GetAsync`. Navigation mid-load leaves a server connection draining bytes into the void.
**Recommendation:** Pass the streaming `CancellationToken` through to `TrackMediaClient.GetTrackMedia` and the underlying `_http.GetAsync`.
**14. `WavOffsetService.CreateOffsetStream` allocates the full offset stream upfront in a `MemoryStream`.**
For a 1 MB seek into a 100 MB file, this allocates 99 MB into a fresh `MemoryStream`. Pure waste.
**Recommendation:** Return a custom composite `Stream` wrapping the synthesized header followed by `new MemoryStream(fullAudioBuffer, headerSize + alignedOffset, newDataSize, writable: false)`.
---
### 🟢 Minor
**15. `playbackAnchorTime == 0` as is-playing sentinel is brittle.** `AudioContext.currentTime` is technically allowed to be 0 at context creation. Use the `isActive_` flag that already exists.
**16. Non-WAV upload error message is unhelpful.** "Invalid WAV header in first chunk" falls through to `StreamingErrorHandler` as "Unable to play audio. Please try again." Add a `"header"` / `"wav"` match to the error handler.
**17. `new TextDecoder()` allocated inside chunk-walk loop in `parseHeader`.** `TextDecoder` instantiation is non-trivial. Allocate once at the top of `parseHeader`.
**18. `WavOffsetService.CreateOffsetStream` allocates full offset stream upfront.** (See Major finding 14 — this Minor note is the scope annotation; the fix is the same.)
**19. `MimeTypeExtensions.GetExtension` for unmapped mime returns `.bin`.** Silently renames unknown types. Reject unknown mime types at the PUT controller instead.
**20. `AudioProcessor.cs` `FindChunk` returns first match; `WavOffsetService.ParseWavHeader` uses last match.** Inconsistent behaviour on malformed WAV with two `fmt ` chunks. Pick one.
**21. `processedBytes` and `totalRawBytes` are `number` (double) — correct for ≤ 2^53 bytes.** Note for posterity; not a real risk at audio file sizes.
**22. `SpectrumAnalyzer.dataArray` typed `Float32Array<ArrayBuffer>` — requires TypeScript ≥ 5.7.** Confirm `tsconfig.json` target.
---
### 🚫 Missing Capabilities
- **Backward seek is broken.** Seeks below `playbackOffset` silently clamp to start of the current buffer segment, not the user's chosen time. To truly seek backward requires the same "re-fetch from offset" path as forward-beyond-buffer. **This is the highest-impact missing feature.**
- **Format diversity.** PCM/IEEE Float WAV only. MP3 / FLAC / Ogg / AAC / M4A are MIME-mapped but unimplemented.
- **Preload / prefetch next track.** No mechanism to start the next track's stream during the tail of the current.
- **Crossfade.** Two simultaneous `PlaybackScheduler`s would suffice architecturally; nothing wires it.
- **Gapless playback.** Inter-track gap is unmanaged.
- **HTTP Range + cache integration.** See finding 8.
- **Server-side stream from disk.** See finding 7.
- **Track-skip on error.** A failed `processStreamingChunk` aborts the entire load; no concept of skipping bad bytes to recover.
---
### 🌐 Browser Compatibility
- **`AudioContext` with explicit `sampleRate`** — Safari ≤ 14 ignored the `sampleRate` constructor option. Safari 14.1+ honors it.
- **`webkitAudioContext` fallback exists** (`AudioContextManager.ts:21`) — good. But `webkitAudioContext.close()` is async-but-not-Promise on older Safari; the `await` resolves immediately and the next `initialize` may run against a not-yet-closed context.
- **`AnalyserNode.getFloatFrequencyData`** — supported everywhere modern.
- **`AudioBufferSourceNode.onended`** — fires reliably; in Firefox, fires on next microtask rather than the audio thread. `handleSourceEnded` scheduling more buffers immediately is fine.
- **`HttpCompletionOption.ResponseHeadersRead` + WASM HttpClient** — depends on browser's streaming fetch support. All evergreen browsers handle this; iOS Safari < 15 had quirks.
- **`Uint8Array` over JS interop** — Blazor serializes via base64. 64 KB chunks incur base64 overhead on every chunk push.
---
### 🏆 Top 5 Priorities (Streaming)
1. Fix finding 2 — garbage trailing bytes shipped to JS (silent audio corruption)
2. Fix finding 3 — final-tail audio dropped at EOF (silent audio truncation)
3. Fix finding 1 — concurrent seek race (reproducible by rapid scrub bar clicks)
4. Implement backward seek (fundamental UX gap)
5. Fix findings 12 & 13 — `AudioPlayerProvider` disposal + HTTP cancellation (session leaks)
---
*Generated 2026-05-17. Both analyses are read-only — no code was changed.*