diff --git a/DeepDrftAPI/CLAUDE.md b/DeepDrftAPI/CLAUDE.md index 6d93bc6..a4fa35a 100644 --- a/DeepDrftAPI/CLAUDE.md +++ b/DeepDrftAPI/CLAUDE.md @@ -50,21 +50,21 @@ Returns the WAV bytes from the `tracks` vault with HTTP Range support. ### POST api/track/upload ([ApiKeyAuthorize]) -**Authenticated endpoint.** Accepts a raw WAV upload + metadata as `multipart/form-data`, processes the WAV, stores it in the vault, and persists metadata to SQL. Returns the fully persisted `TrackDto` with `Id` populated. +**Authenticated endpoint.** Accepts a raw audio file upload (.wav, .mp3, .flac) + metadata as `multipart/form-data`, processes the file, stores it in the vault, and persists metadata to SQL. Returns the fully persisted `TrackDto` with `Id` populated. - **Header `ApiKey`**: required. Validated by `ApiKeyAuthenticationMiddleware`. - **Form fields**: - - `wav` (`IFormFile`, required): the WAV bytes. File name must end in `.wav`. + - `audioFile` (`IFormFile`, required): the audio bytes. File name must end in `.wav`, `.mp3`, or `.flac`. - `trackName` (string, required) - `artist` (string, required) - `album` (string, optional) - `genre` (string, optional) - `releaseDate` (string, optional, format `YYYY-MM-DD`) - `createdByUserId` (long, required): audit trail — who uploaded this track. -- The upload stream is copied to a `.wav`-suffixed temp file under `Path.GetTempPath()` (the audio processor requires that extension and reads from disk). The temp file is always deleted in a `finally` block — success or failure. -- `[RequestSizeLimit(1 GB)]` + `[RequestFormLimits(MultipartBodyLengthLimit = 1 GB)]` lift the per-request ceiling above the framework default (~28 MB) so production-sized WAVs are accepted. The body is streamed to the temp file, not buffered in memory. -- Calls `UnifiedTrackService.UploadAsync`, which orchestrates: `TrackService.AddTrackFromWavAsync` (vault write) → `TrackManager` (SQL persist with `createdByUserId`). -- Returns 200 with the **persisted** `TrackDto` JSON (Id populated) on success. Returns 400 for missing/invalid form fields. Returns 500 if processing fails. +- The upload stream is copied to a temp file under `Path.GetTempPath()` with the appropriate extension (`.wav`, `.mp3`, or `.flac`). The audio processor reads from disk and requires the correct extension for format detection. The temp file is always deleted in a `finally` block — success or failure. +- `[RequestSizeLimit(1 GB)]` + `[RequestFormLimits(MultipartBodyLengthLimit = 1 GB)]` lift the per-request ceiling above the framework default (~28 MB) so production-sized files are accepted. The body is streamed to the temp file, not buffered in memory. +- Calls `UnifiedTrackService.UploadAsync`, which orchestrates: `TrackContentService.AddTrackAsync` (format-agnostic vault write via router) → `TrackManager` (SQL persist with `createdByUserId`). +- Returns 200 with the **persisted** `TrackDto` JSON (Id populated) on success. Returns 400 for missing/invalid form fields or unsupported audio format. Returns 500 if processing fails. ### DELETE api/track/{id:long} ([ApiKeyAuthorize]) diff --git a/DeepDrftContent/CLAUDE.md b/DeepDrftContent/CLAUDE.md index 62205e8..0a1a35e 100644 --- a/DeepDrftContent/CLAUDE.md +++ b/DeepDrftContent/CLAUDE.md @@ -74,19 +74,18 @@ public async Task RegisterResourceAsync(string vaultId, string entryId, Fi **Callers must check return values.** Do not change this without a deliberate design pass — it's embedded in all FileDatabase tests and client code. -## Audio processor +## Audio processors -`AudioProcessor.ProcessWavFileAsync(filePath)`: +Multi-format support via router pattern. All processors live in `DeepDrftContent/Processors/`: -1. Validates the RIFF/WAVE structure and format code. -2. Accepts standard PCM (audioFormat=1) and WAVE_FORMAT_EXTENSIBLE (audioFormat=0xFFFE) when the SubFormat GUID indicates PCM. -3. Normalizes EXTENSIBLE-PCM uploads to standard 44-byte PCM WAV before storing in the vault. -4. Parses the fmt and data chunks. -5. Extracts duration (sample count / sample rate) and bitrate (file size / duration). -6. Returns `AudioBinary` with all metadata. -7. **Fallback**: If parsing fails, logs a warning and returns defaults (180s / 1411 kbps / 44.1 kHz / 16-bit stereo). +- `AudioProcessor.ProcessWavFileAsync(filePath)`: WAV-specific processor. Validates RIFF/WAVE structure and format code. Accepts standard PCM (audioFormat=1) and WAVE_FORMAT_EXTENSIBLE (audioFormat=0xFFFE) when the SubFormat GUID indicates PCM. Normalizes EXTENSIBLE-PCM uploads to standard 44-byte PCM WAV before storing. Parses fmt and data chunks; extracts duration and bitrate. Returns `AudioBinary` with metadata. On parse failure, logs warning and returns defaults (180s / 1411 kbps / 44.1 kHz / 16-bit stereo). Accepts standard PCM (audioFormat=1), WAVE_FORMAT_EXTENSIBLE with PCM SubFormat (0x0001), IEEE Float SubFormat (0x0003), and Padded 24-in-32 containers; normalizes Float and padded inputs to standard 24-bit PCM before storage. +- `Mp3AudioProcessor.ProcessMp3FileAsync(filePath)`: MP3 processor. Skips ID3v2 tag, finds first valid MPEG frame sync, decodes frame header (bitrate, sample rate, channels). Reads Xing/Info header for VBR total-frame count (accurate duration); VBRI header as fallback; CBR estimate from file size otherwise. Returns `AudioBinary` with original bytes and `.mp3` extension. On parse failure, falls back to defaults (180s / 320 kbps). +- `FlacAudioProcessor.ProcessFlacFileAsync(filePath)`: FLAC processor. Validates `fLaC` magic, reads STREAMINFO metadata block (20-bit sample rate, 3-bit channels, 5-bit bits-per-sample, 36-bit total samples — all bit-packed). Computes duration from `totalSamples / sampleRate`; average bitrate from file size. Returns `AudioBinary` with original bytes and `.flac` extension. On parse failure, falls back to defaults (180s / 1411 kbps). +- `AudioProcessorRouter.ProcessAudioFileAsync(filePath)`: Routes by extension — `.wav` → `AudioProcessor`, `.mp3` → `Mp3AudioProcessor`, `.flac` → `FlacAudioProcessor`. Throws `ArgumentException` for unsupported extensions. -PCM-only. Accepts standard PCM (audioFormat=1), WAVE_FORMAT_EXTENSIBLE (audioFormat=0xFFFE) with PCM SubFormat (0x0001), IEEE Float SubFormat (0x0003), and Padded 24-in-32 containers (wValidBitsPerSample=24 in a 32-bit container). Float and padded-container inputs are normalized to standard 24-bit PCM at storage time via `ConvertFloatTo24BitPcm` and `RepackPaddedContainer` respectively. Other formats (mp3, flac, aac, ogg, m4a) are listed in `MimeTypeExtensions` but not implemented. The processor validates RIFF/WAVE structure — anything else is rejected. +Vault stores original bytes with correct extension and MIME type (inferred from file extension or content-type header at upload time). + +The primary entry point is `TrackContentService.AddTrackAsync(filePath, mimeType)` — format-agnostic. It selects the right processor via `AudioProcessorRouter`, processes the file, generates an entry GUID, stores in vault, returns unpersisted `TrackEntity`. Legacy `AddTrackFromWavAsync(filePath)` is now a shim over `AddTrackAsync` for backward compatibility. ## Image processor diff --git a/DeepDrftPublic.Client/CLAUDE.md b/DeepDrftPublic.Client/CLAUDE.md index 4a4fc61..a9d9a85 100644 --- a/DeepDrftPublic.Client/CLAUDE.md +++ b/DeepDrftPublic.Client/CLAUDE.md @@ -61,18 +61,27 @@ Both are configured with JSON serializer settings (case-insensitive property mat ### Implementation - `AudioPlayerService` (abstract base): Lifecycle. Stores current track, playback state, volume. `SelectTrack` throws `NotSupportedException` (buffered path is dead); derived classes override `SelectTrackStreaming`. - `StreamingAudioPlayerService` (production): Constructor takes `TrackMediaClient`, `AudioInteropService`, logger. `SelectTrackStreaming`: - 1. Calls `TrackMediaClient.GetAudioStreamAsync(trackId)`. - 2. `StreamingAudioPlayerService.StreamAudioAsync` reads chunks (16–64 KB adaptive), pushes each via `AudioInteropService.ProcessStreamingChunkAsync` (JS interop call). - 3. TypeScript `StreamDecoder` parses WAV header (first chunk), decodes subsequent chunks to `AudioBuffer`s. + 1. Calls `TrackMediaClient.GetAudioStreamAsync(trackId)`, which returns a response object including `ContentType` (e.g., `audio/wav`, `audio/mpeg`, `audio/flac`). + 2. `StreamingAudioPlayerService.StreamAudioAsync` reads chunks (16–64 KB adaptive), pushes each via `AudioInteropService.ProcessStreamingChunkAsync(contentType, chunk)` (JS interop call with format hint). + 3. TypeScript `StreamDecoder` is format-agnostic; delegates format-specific header parsing and chunked decoding to the appropriate `IFormatDecoder` implementation (e.g., `WavFormatDecoder` for WAV, TBD MP3/FLAC decoders for other formats). Decoder parses header (first chunk), decodes subsequent chunks to `AudioBuffer`s. 4. `PlaybackScheduler` schedules buffers on Web Audio `AudioContext`. 5. Playback starts as soon as a configurable min buffer count is queued. - 6. **Seek beyond buffer**: if seek target is past the decoded range, `Seek(position)` calls `TrackMediaClient.GetAudioStreamAsync(trackId, byteOffset)` with a file-absolute byte offset. Client sends `Range: bytes={offset}-`; server responds 206 with raw PCM; decoder retains the parsed WAV header and feeds the continuation directly into the decode pipeline. + 6. **Seek beyond buffer**: if seek target is past the decoded range, `Seek(position)` calls `TrackMediaClient.GetAudioStreamAsync(trackId, byteOffset)` with a file-absolute byte offset. Client sends `Range: bytes={offset}-`; server responds 206 with raw bytes (same format as original file); decoder retains the parsed header and feeds the continuation directly into the decode pipeline. ### Interop bridge - `AudioInteropService.CreatePlayerAsync` polls `DeepDrftAudio.isReady()` before proceeding; `index.ts` sets `ready = true` after attaching the API to `window`. This guards against slow WASM boot / cache misses. -- `AudioInteropService.ProcessStreamingChunkAsync(chunk)` calls JS `window.DeepDrftAudio.processStreamingChunk(chunk)` and awaits the Promise. +- `AudioInteropService.ProcessStreamingChunkAsync(contentType, chunk)` calls JS `window.DeepDrftAudio.processStreamingChunk(contentType, chunk)` and awaits the Promise. The `contentType` parameter is passed through to the format-decoder factory. - `AudioInteropService` also manages callback registrations for progress (fired by `PlaybackScheduler`), end-of-playback (fired by `PlaybackScheduler`), and spectrum data (fired by `SpectrumAnalyzer`). Each callback is a `DotNetObjectReference` to a delegate. +### Format decoders (TypeScript) +New modules in `DeepDrftPublic/Interop/audio/`: + +- `IFormatDecoder.ts`: Interface. Defines contract for format-specific decoders: `parseHeader(chunk, offset)` → header metadata; `decodeChunk(chunk, offset)` → `AudioBuffer`. +- `WavFormatDecoder.ts`: Concrete WAV implementation. Parses RIFF/WAVE structure, fmt and data chunks. All WAV-specific byte-parsing logic lives here. Exported as the default WAV decoder. +- Future decoders: MP3, FLAC, etc. (TBD). + +`StreamDecoder.ts` remains the orchestrator — it accepts the first chunk, selects the right format decoder via factory (based on `contentType`), delegates all format-specific work to it, and chains subsequent chunks through the same decoder instance. + ### Component integration - `AudioPlayerProvider.razor` is the cascading host. It injects `IStreamingPlayerService` (resolved to `StreamingAudioPlayerService` in DI), stores it in a cascade with `IsFixed="true"`, and keeps it alive across navigation. - `AudioPlayerBar.razor` is the dock UI. It cascades the player, binds buttons to `Play()` / `Pause()` / `Seek()` / `SetVolume()`, and displays current time / duration / progress bar. Minimize-state mutations (`Expand`, `ToggleMinimized`, `Close`) all route through a private `SetMinimized(bool value)` mutator, which guards no-ops, fires the `OnMinimized` callback, and calls `StateHasChanged()`. Subscribes to `IPlayerService.StateChanged` in `OnParametersSet` (reference-guarded, idempotent) and unsubscribes on dispose to re-render itself when the cascade updates. diff --git a/PLAN.md b/PLAN.md index 6c816e1..49edd1a 100644 --- a/PLAN.md +++ b/PLAN.md @@ -32,6 +32,7 @@ These were flagged during the audit but classified as feature work, not defect f - Client side: the JS decoder is currently a WAV byte-walker. For non-WAV, the simplest path is `decodeAudioData` over the full payload (loses streaming-start). The richer path is per-format chunked decoders. Worth a design pass before committing. - **Prerequisite:** None functionally, but consider settling **Phase 4 (HTTP Range)** first — native range/cache is much more important for large MP3s than for WAVs. - **Constraint:** Spectrum FFT tap currently relies on raw `AudioBuffer`s through `decodeAudioData`. If a future path uses `MediaElementAudioSourceNode` (see 4.1), the FFT tap still works but the early-playback story changes. +- **Progress:** Wave 1 (foundation) landed on `dev` (2026-06-10). Format router (`Mp3AudioProcessor`, `FlacAudioProcessor`, `AudioProcessorRouter`) and client-side `IFormatDecoder` interface + `WavFormatDecoder` implementation are in place. `TrackContentService.AddTrackAsync` is the new format-agnostic primary method; `AddTrackFromWavAsync` is a shim. Client `StreamDecoder` is now format-agnostic; WAV-specific logic moved to `WavFormatDecoder`. Wave 2 (per-format chunked decoders) and Wave 3 (full wiring) pending. ### 1.3 Preload / prefetch of the next track