# WaveformSeeker — loudness-waveform seekbar to replace the MudSlider

Status: approved. Decisions resolved 2026-06-05. Author: product-designer. Date: 2026-06-05.
**Plan only — no code edits made by this doc.**

---

## 1. Summary

Replace the `MudSlider`-based scrub bar in `PlayerSeekZone.razor` with a new
`<WaveformSeeker/>` component that renders the track's **loudness profile** as a
high-density vertical bar chart and serves as the seek surface (click / drag to seek).

The point is to make the seekbar *informative*: instead of a featureless line, the
listener sees the track's energy shape — the quiet intro, the drop, the breakdown, the
outro — and can scrub against that shape. This is the established "waveform scrubber"
idiom from SoundCloud, Overcast, and most DAW transport bars. We are borrowing it
deliberately; the novel part for us is only that the profile is **preprocessed server-side
and shipped as a small quantized array**, so the visual paints the instant a track loads rather
than waiting for the audio to decode.

The loudness measure is **not hardcoded to RMS**. The first implementation computes RMS, but
the compute path is built around a swappable `ILoudnessAlgorithm` abstraction (§5a) so a
different perceptual loudness profile (e.g. LUFS) can be substituted later without touching the
component, the wire format, or the storage. The component and the data are named for the
*concept* (waveform / loudness profile), not the algorithm.

Two visualizations currently coexist in the seek zone. They are being separated by
*kind*:

- **Real-time spectrum** (FFT frequency bars, `SpectrumVisualizer.razor`) — a *live* readout
  of "what is sounding right now." This moves **up, above the volume slider**.
- **Static loudness-over-time** (the new `WaveformSeeker`) — a *whole-track* readout of "how loud
  is each moment." This takes over the seek area.

This is a clean conceptual split: live-frequency lives with the output level (volume),
whole-track-amplitude lives with the transport position (seek). The current arrangement
(real-time spectrum behind the seek slider) conflates the two.

### Naming (decided)

"Spectrum" properly means frequency content; what this component shows is **amplitude over
time**, not spectrum. The component is named honestly: **`WaveformSeeker`** (decided), which
reads correctly against the live `SpectrumVisualizer` (frequency) without implying FFT data is
in the payload. The *data* is named for the concept, not the algorithm: **`WaveformProfile`** /
`WaveformProfileDto` / `waveformBuckets` / a `profile` field — so substituting the loudness
algorithm (RMS → LUFS, §5a) never forces a rename of the type that carries it.

---

## 2. Current state (what we're changing)

The seek zone today (`PlayerSeekZone.razor`):

```razor
<MudStack Row="false" Spacing="0" Class="@Class">
    <SpectrumVisualizer/> @* live FFT bars, sits on top *@
    <div class="mx-3" @onpointerdown/up/leave>
        <MudSlider .../> @* the scrub bar *@
    </div>
    <TimestampLabel CurrentTime=... Duration=.../> @* time text *@
</MudStack>
```

Relevant mechanics already in place that the new component must preserve:

- **Seek gesture plumbing** lives in `PlayerSeekZone.razor.cs`: `OnSeekStart` /
  `OnSeekChange` / `OnSeekEnd` callbacks bubble to `AudioPlayerBar.razor.cs`, which sets
  `_isSeeking`, tracks `_seekPosition`, and calls `PlayerService.Seek(position)` on release.
  `DisplayTime` shows the drag position while seeking, real `CurrentTime` otherwise.
- **`CanSeek`** = `IsLoaded && Duration.HasValue && Duration > 0`. Seek is allowed during
  streaming, including beyond the buffer (the offset-refetch path in
  `StreamingAudioPlayerService` / `AudioPlayer.ts.seekBeyondBuffer`). The new component does
  **not** touch that path — it only produces a target time and hands it to the same
  `Seek(double)` call.
- **`SpectrumVisualizer`** is driven entirely by `AudioInteropService.StartSpectrumAnimationAsync`,
  which subscribes a callback to the TS `SpectrumAnalyzer` (live FFT, ~30fps). It already
  self-manages animation lifecycle off `PlayerService.StateChanged`. Moving it is a pure
  layout move — no logic change.
- **Player layout** (`AudioPlayerBar.razor.css`) is pure-CSS responsive: at ≥600px the row is
  `[transport] [seek grows] [volume]`; at <600px it's `[transport][volume]` then full-width
  seek below. Wherever the spectrum lands, it must respect this.

---

## 3. UI layout changes

### 3a. What moves

| Element | Today | After |
|---|---|---|
| Live FFT spectrum (`SpectrumVisualizer`) | Inside `PlayerSeekZone`, above the slider | Inside the **volume cluster**, above the volume slider |
| Scrub bar (`MudSlider`) | `PlayerSeekZone` | Replaced by `WaveformSeeker` (loudness bars + playhead) |
| Timestamp (`TimestampLabel`) | Below the slider in `PlayerSeekZone` | Stays with the seeker (below or overlaid on the bars) |
| Volume slider (`VolumeControls`) | Right cluster | Unchanged position; now has the live spectrum stacked above it |

### 3b. Resulting zones

- **Transport zone** — unchanged (play/pause/stop + load spinner).
- **Volume zone** — becomes a small vertical stack: live FFT spectrum on top, volume
  slider below. This is a natural pairing ("here's the live output, here's how loud").
  `VolumeControls.razor` gets the `<SpectrumVisualizer/>` stacked above its existing
  `MudStack`. The wrapper is renamed `VolumeZone` (**decided**) for symmetry with the other
  two zones.
- **Seek zone** — becomes the `WaveformSeeker`: a wide loudness bar chart that grows to fill the
  available width (it inherits the `flex-grow:1` the seek zone has today), with the
  timestamp beneath.

### 3c. Layout risk

The live spectrum is currently a wide element. Stacking it above the *volume* slider
constrains it to the narrow right cluster — at ≥600px the volume cluster is only as wide as
the slider (the CSS halves and flex-start-pins it per commit `78c6803`). A 32-bucket FFT bar
chart squeezed into ~120px will look cramped.

**Decided: 24 buckets in the volume cluster, parameterized.** The live spectrum renders **24
buckets** in the narrow volume slot, set via the existing `BucketCount` parameter on
`SpectrumVisualizer` so the count can be tuned without a code change to the component. 24 reads
denser than 16 while still fitting the ~120px cluster comfortably.

---

## 4. WaveformSeeker component design

### 4a. Data → geometry

The component receives a normalized loudness profile: `double[] profile`, each value in `[0,1]`,
representing the loudness measure of a contiguous time slice. Profile length is **N buckets**
covering the whole track regardless of duration (fixed bucket count, variable bucket
*duration*). Each bucket renders as one vertical bar; bar height = `profile[i]` scaled to the
component height (with a small floor, ~2%, so silence is still visible as a hairline — mirrors
`SpectrumVisualizer.GetBarHeight`).

**Bar count.** Two regimes:

- **Preprocessed resolution (N):** how many buckets the backend computes and stores. **N is
  configurable** (e.g. via `WaveformProfileOptions` bound from DI/config), **default 512**. A
  high source resolution lets the front end downsample to whatever fits the rendered width
  without re-fetching. Storage is tiny regardless of N (see §5).
- **Rendered resolution:** how many bars actually draw, = pixels-available / (bar + gap). **The
  front end derives its rendered bar count from the available width, regardless of N** — it does
  not assume the stored N is the bar count. At a typical ~600px seek zone with 2px bars + 1px
  gaps that's ~200 bars. The component **downsamples N → rendered count** by max-or-mean over
  each rendered bucket's source range. Use **max** (peak) for the visual — peak-per-bucket gives
  the punchy DAW look; mean flattens transients.

**Decided: N configurable, default 512; rendered count derived from width; downsample by peak.**
512 is a clean power-of-two, downsamples evenly to 256/128/64, and is ~512B on the wire as
quantized bytes (§5b). The wire format is the quantized `byte[]` base64 either way; N being
configurable does not change the format.
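
A minimal sketch of the peak downsample and the width-derived bar count described above. The class and method names (`WaveformDownsampler`, `DownsampleByPeak`, `RenderedBarCount`) are hypothetical, chosen for illustration; the 2px bar and 1px gap defaults come from the example in this section.

```csharp
using System;

// Hypothetical helper, not an existing type in the codebase.
public static class WaveformDownsampler
{
    // Reduce a stored profile (N buckets, values in [0,1]) to renderedCount bars,
    // taking the peak (max) of each bar's source range.
    public static double[] DownsampleByPeak(double[] profile, int renderedCount)
    {
        if (profile.Length == 0 || renderedCount <= 0) return Array.Empty<double>();

        // Never render more bars than there are source buckets.
        int count = Math.Min(renderedCount, profile.Length);
        var bars = new double[count];
        for (int i = 0; i < count; i++)
        {
            // Integer arithmetic keeps the source ranges contiguous and non-overlapping.
            int start = (int)((long)i * profile.Length / count);
            int end = (int)((long)(i + 1) * profile.Length / count); // exclusive
            double peak = 0;
            for (int j = start; j < end; j++) peak = Math.Max(peak, profile[j]);
            bars[i] = peak;
        }
        return bars;
    }

    // Bars that fit in the available width: width / (bar + gap), at least one.
    public static int RenderedBarCount(double widthPx, double barPx = 2, double gapPx = 1)
        => Math.Max(1, (int)(widthPx / (barPx + gapPx)));
}
```

For a 600px seek zone, `RenderedBarCount(600)` gives 200, so a stored N of 512 would be reduced to 200 bars, each the peak of two or three source buckets.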

### 4b. Playhead / progress indication

The current position is shown two ways simultaneously (both cheap, both standard):

1. **Played/unplayed split** — bars left of the playhead render in the played colour (moss
   green `--deepdrft-green-accent`, matching the house waveform identity called out in
   `track-card-theming.md`), bars right render muted. The split point = `CurrentTime / Duration`.
2. **Playhead line** — a 1–2px vertical rule at the split, for precision.

While dragging, the split/line follow the pointer (`DisplayTime`), not playback — same
`_isSeeking` discipline as today.

### 4c. Interaction model

Pointer-based, reusing the existing callback contract so `AudioPlayerBar.razor.cs` is barely
touched:

- **Hover** → a faint preview line at the cursor + a tooltip/label showing the time under the
  cursor (`hoverTime = (cursorX / width) * Duration`). Preview only; no seek. (New affordance;
  the MudSlider had none. Borrowed from SoundCloud/YouTube scrubbers.)
- **Click** → seek to `clickX / width * Duration`. Fires `OnSeekStart` then immediately
  `OnSeekEnd(clickTime)`.
- **Drag** → `pointerdown` starts seeking (`OnSeekStart`), `pointermove` updates the preview
  position and fires `OnSeekChange(t)` (so `DisplayTime` and the played/unplayed split track
  the drag live), `pointerup` commits (`OnSeekEnd(t)` → `PlayerService.Seek(t)`).
  `pointerleave` while dragging commits at the last position (matches current
  `HandlePointerLeave` behaviour) — or, better, use **pointer capture** (`setPointerCapture`)
  so a drag that leaves the element keeps tracking until release. Recommend pointer capture;
  it's the more forgiving gesture and avoids the "lost the drag" feel.

Position math needs the element's pixel width and the pointer's offset. Two implementations:

- **Pure Blazor:** use `@onpointermove`/`@onpointerdown` with `PointerEventArgs.OffsetX` and a
  cached bounding width (one JS `getBoundingClientRect` call on resize). Simple, no per-frame
  interop.
- **Thin JS helper:** a tiny interop that does hit-testing and returns a normalized `[0,1]`
  fraction. Only worth it if `OffsetX` proves unreliable across the responsive reflows.

**Recommend pure-Blazor pointer events first**, with `OffsetX`/cached width; fall back to a JS
helper only if hit-testing is flaky. Keeps the new surface out of the TS bundle (see §7).
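
The hover, click, and drag cases all reduce to the same offset-to-time mapping. A minimal sketch, assuming the pure-Blazor path supplies `OffsetX` and a cached width; `SeekMath.PointerToTime` is a hypothetical name. The clamp is an addition for safety: with pointer capture the offset can fall outside the element, and the target time should still land within the track.

```csharp
using System;

// Hypothetical helper, not an existing type in the codebase.
public static class SeekMath
{
    // Map a pointer offset within the seeker to a track time in seconds.
    public static double PointerToTime(double offsetX, double widthPx, double durationSeconds)
    {
        // No geometry or no duration yet: there is nothing meaningful to seek to.
        if (widthPx <= 0 || durationSeconds <= 0) return 0;

        // Clamp so a captured drag outside the element pins to the start or end.
        double fraction = Math.Clamp(offsetX / widthPx, 0.0, 1.0);
        return fraction * durationSeconds;
    }
}
```

The same value would feed the hover label, `OnSeekChange(t)` during a drag, and `OnSeekEnd(t)` on release.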

### 4d. Rendering approach

- **DOM bars** (one `<div>` per rendered bar, CSS `--bar-height`) — exactly how
  `SpectrumVisualizer` works today, so it's consistent and themeable via existing
  `deepdrft-` tokens. At ~200 bars this is fine; Blazor diffing over 200 static divs that only
  change a CSS var on seek is cheap.
- **Canvas** — one `<canvas>`, drawn via a small JS interop on load + on playhead move. Scales
  to thousands of bars and avoids 200-node diffs, but pulls the component into the JS interop
  layer and complicates theming (canvas can't read CSS vars without plumbing).

**Recommend DOM bars** to match the existing visualizer and stay in pure Blazor/CSS. Revisit
canvas only if profiling shows the seek-time re-render (recolouring the split) janks. The
played/unplayed split can be done **without** re-rendering every bar by overlaying a clipped
coloured layer — render the bars once in the played colour, lay a muted-colour copy clipped to
`width * (1 - progress)` from the right on top. Then a seek only moves one clip rect, not 200
divs. This is the key perf trick; call it out for the implementer.
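
One way the clipped overlay could be driven from C# is a single inline style on the muted layer. This is a sketch under assumptions: the plan does not name the CSS mechanism, so the use of `clip-path: inset(...)` is a choice made here for illustration, and `SeekerStyles.UnplayedOverlayStyle` is a hypothetical name.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not an existing type in the codebase.
public static class SeekerStyles
{
    // Inline style for the muted overlay: hide its left `progress` share so the
    // played-colour bars underneath show through up to the playhead.
    public static string UnplayedOverlayStyle(double currentTime, double duration)
    {
        double progress = duration > 0
            ? Math.Clamp(currentTime / duration, 0.0, 1.0)
            : 0.0;

        // Invariant culture so the decimal separator is always '.', as CSS requires.
        string leftInset = (progress * 100).ToString("0.###", CultureInfo.InvariantCulture);

        // clip-path: inset(top right bottom left)
        return $"clip-path: inset(0 0 0 {leftInset}%);";
    }
}
```

With this shape, a seek or a playback tick changes one style string on one element; the bar divs themselves are not touched.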

### 4e. No-profile-yet state (important)

A track may have no stored loudness profile (legacy tracks uploaded before this feature; profile
fetch failed; profile still computing). The component must degrade, not break:

- **Fallback bars:** render a flat row of floor-height bars (or a gentle idle shimmer) so the
  control still reads as a seekbar and **remains fully seekable** (geometry is just time/width;
  it needs no profile data to seek). Seek must never depend on the profile being present.
- **Optional client-side compute:** once audio is decoded, the front end *could* compute a
  loudness profile from the decoded `AudioBuffer`s and fill the bars live (progressive reveal as
  the stream decodes). This is a real fallback but adds a TS path (§7); treat as a **later
  enhancement**, not part of the first cut. First cut: preprocessed profile or flat fallback.

**Recommend: first cut ships preprocessed-or-flat. Seekability is never gated on the profile.**

---

## 5. Backend loudness preprocessing

This is the load-bearing design decision. Three sub-questions: **how to compute**, **when to
compute**, **where to store**.

### 5a. How to compute (swappable loudness algorithm)

**The loudness measure is an abstraction, not a hardwired RMS pass** (decided). `WaveformProfileService`
in `DeepDrftContent` owns the PCM walk, bucketing, normalization, and storage; the per-bucket
loudness calculation is delegated to an injected **`ILoudnessAlgorithm`** strategy. The first
implementation is RMS (`RmsLoudnessAlgorithm`); **LUFS** (or another perceptual profile) is the
named future alternative, droppable in as a second `ILoudnessAlgorithm` without touching the
service, the wire format, the storage, or the component.

Sketch of the seam (illustrative, not prescriptive):

```csharp
interface ILoudnessAlgorithm {
    // given the mono samples for one time slice, return its loudness in [0,1]-able units
    double Measure(ReadOnlySpan<float> sliceSamples);
}
// first impl: RMS — sqrt(mean(sample²)). future: LUFS (K-weighting + gating).
```

We already own a PCM-WAV parser: `AudioProcessor` in `DeepDrftContent` parses RIFF/WAVE/fmt/
data, validates PCM, and knows channels / sampleRate / bitsPerSample / blockAlign / dataSize.
Computing the profile is a straightforward extension of that same buffer walk — **no new audio
library needed for RMS**. The stack is PCM-only WAV today (`AudioProcessor` rejects non-PCM), so
`WaveformProfileService` can read samples directly:

1. Locate the `data` chunk (already done in `ValidateWavStructure` / `FindChunk`).
2. Walk the PCM samples, decode per `bitsPerSample` (16/24/32-bit signed; 8-bit unsigned),
   average channels to mono.
3. Partition the sample stream into **N equal time slices** (N from `WaveformProfileOptions`,
   default 512); hand each slice to `ILoudnessAlgorithm.Measure` to get `bucket[i]`.
4. Normalize: divide by the max bucket (**peak-normalize** to `[0,1]`, decided) so quiet tracks
   still show shape. (Trade-off: peak-normalize loses absolute-loudness comparison *between*
   tracks. Acceptable — the seeker is about *this* track's shape, not cross-track loudness. A
   future LUFS algorithm that wants absolute units can normalize differently behind the same
   interface.)

The PCM walk + bucketing + RMS `Measure` is ~40 lines. No external dependency for the RMS path.
**Do not pull in NAudio or similar** for RMS — the existing parser already does the hard part. A
future LUFS implementation may justify a dep; if so, that decision rides with *that* algorithm,
not the service.

Cost: one linear pass over the PCM buffer. For a 100MB 16-bit stereo WAV that's ~25M stereo
sample frames (4 bytes per frame) — a few hundred ms, done **once at upload**, never on the
playback path.
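
Steps 2 to 4 can be sketched as follows. This is an illustration under stated assumptions, not the service itself: it restates the `ILoudnessAlgorithm` seam so the block stands alone, handles only 16-bit little-endian PCM (the 8/24/32-bit cases are left out), and takes the `data` chunk bytes as already located. `WaveformProfileSketch` and its method names are hypothetical.

```csharp
using System;
using System.Buffers.Binary;

public interface ILoudnessAlgorithm
{
    double Measure(ReadOnlySpan<float> sliceSamples);
}

public sealed class RmsLoudnessAlgorithm : ILoudnessAlgorithm
{
    // sqrt(mean(sample^2)); an empty slice measures as silence.
    public double Measure(ReadOnlySpan<float> sliceSamples)
    {
        if (sliceSamples.IsEmpty) return 0;
        double sumSquares = 0;
        foreach (float s in sliceSamples) sumSquares += (double)s * s;
        return Math.Sqrt(sumSquares / sliceSamples.Length);
    }
}

// Hypothetical helper standing in for the relevant part of WaveformProfileService.
public static class WaveformProfileSketch
{
    // Step 2: decode interleaved 16-bit little-endian PCM and average channels to mono.
    public static float[] DecodePcm16ToMono(ReadOnlySpan<byte> data, int channels)
    {
        int frameBytes = 2 * channels;
        int frames = data.Length / frameBytes; // a trailing partial frame is ignored
        var mono = new float[frames];
        for (int f = 0; f < frames; f++)
        {
            float sum = 0;
            for (int c = 0; c < channels; c++)
            {
                short sample = BinaryPrimitives.ReadInt16LittleEndian(
                    data.Slice(f * frameBytes + c * 2, 2));
                sum += sample / 32768f; // scale to [-1, 1)
            }
            mono[f] = sum / channels;
        }
        return mono;
    }

    // Steps 3 and 4: N equal slices, one Measure call each, then peak-normalize to [0,1].
    public static double[] ComputeProfile(
        ReadOnlySpan<float> mono, int bucketCount, ILoudnessAlgorithm algorithm)
    {
        var buckets = new double[bucketCount];
        double max = 0;
        for (int b = 0; b < bucketCount; b++)
        {
            int start = (int)((long)b * mono.Length / bucketCount);
            int end = (int)((long)(b + 1) * mono.Length / bucketCount); // exclusive
            buckets[b] = algorithm.Measure(mono.Slice(start, end - start));
            max = Math.Max(max, buckets[b]);
        }

        // An all-silent track stays all zeros instead of dividing by zero.
        if (max > 0)
            for (int b = 0; b < bucketCount; b++) buckets[b] /= max;
        return buckets;
    }
}
```

Because normalization happens after every bucket is measured, a different algorithm only has to return comparable per-slice values; the bucketing loop does not change.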

### 5b. Data format on the wire

Front end needs `double[] profile` length N, each `[0,1]`. **Decided: quantized `byte[]` (each
bucket 0–255), base64 in JSON**, decoded to `[0,1]` client-side (`b/255.0`). 8-bit quantization
is *visually* lossless for a bar chart; at N=512 that's 512 bytes raw / ~684 chars base64 —
negligible to store and ship, and it keeps the profile from bloating the metadata payload if it
ever rides along with `TrackDto` (see §5d). The format is independent of N and of the loudness
algorithm — both RMS and a future LUFS profile quantize to the same `[0,1]`→byte wire shape.
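
A minimal sketch of both directions of the wire format, using only `System.Convert`. The class name `WaveformProfileCodec` is hypothetical; the rounding and clamping on encode are assumptions added here, since the section fixes only the `b/255.0` decode.

```csharp
using System;

// Hypothetical helper, not an existing type in the codebase.
public static class WaveformProfileCodec
{
    // Server side: quantize each [0,1] bucket to one byte, then base64 the array.
    public static string Encode(double[] profile)
    {
        var bytes = new byte[profile.Length];
        for (int i = 0; i < profile.Length; i++)
            bytes[i] = (byte)Math.Round(Math.Clamp(profile[i], 0.0, 1.0) * 255.0);
        return Convert.ToBase64String(bytes);
    }

    // Client side: base64 back to bytes, each byte back to [0,1] via b / 255.0.
    public static double[] Decode(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        var profile = new double[bytes.Length];
        for (int i = 0; i < bytes.Length; i++) profile[i] = bytes[i] / 255.0;
        return profile;
    }
}
```

Base64 emits 4 characters per 3 input bytes, rounded up, so 512 bytes become 4 × 171 = 684 characters, which matches the figure above.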

### 5c. When to compute

- **On upload (decided, for new tracks):** `UnifiedTrackService.UploadAsync` already processes
  the WAV (`AddTrackFromWavAsync` → `AudioProcessor`). Add the `WaveformProfileService` pass there,
  in the same read, and persist the profile alongside the track. Cost is paid once, by the
  uploader (CMS admin), off the listener's path. This is the natural seam.
- **CMS PreProcessing panel (decided, for existing tracks):** **not** a CLI command. Existing
  vault tracks predate the feature, so they need an explicit generation path — surfaced **in the
  CMS** (`DeepDrftManager`) rather than as an offline job. The CMS track grid shows which tracks
  are missing a profile and offers **1-click generation** per track (and/or a bulk action). The
  compute runs server-side via the same `WaveformProfileService`. See Phase 5 (§12) for the panel
  design.
- **On demand + cache (rejected):** computing lazily on first profile request spreads cost to
  first-listen and needs a cache layer + cold-start penalty. Not worth its complexity given
  upload is the only ingest and the CMS panel covers the backlog explicitly.

**Decided: compute on upload for new tracks; CMS PreProcessing panel for existing ones.** The
no-profile fallback (§4e) carries the UI in the meantime, so the seeker can ship before every
existing track has been processed. (Memory note: Daniel favours designing the seam now even when
deferring the feature — the no-profile fallback *is* that seam.)

### 5d. Where the data lives — vault sidecar (decided)

**Decided: option 3 — a sidecar in the FileDatabase vault + a dedicated endpoint (§6).** Store
the profile as its own vault entry (e.g. a `profiles` vault keyed by `EntryKey`, or a
`.profile`/`.wfp` companion next to the audio). The candidates and the reasoning for the choice:

1. **New column on `TrackEntity` / `track` table** (`WaveformProfile byte[]` or `text`). Profile
   rides with metadata. Pro: one fetch (`GET api/track/page` or `meta/{id}` already returns
   `TrackDto`). Con: bloats every paged list response by ~512B × pageSize (20 → ~10KB/page) even
   when the player isn't open; `TrackEntity` is described in `CLAUDE.md` as "a join, *only*
   metadata" — a binary blob stretches that contract.

2. **New column, but only returned by `meta/{id}` / a dedicated fetch — not by `page`.** Keeps
   the list lean; the player fetches the profile when a track is selected. Needs the profile
   field to be omittable from the paged DTO. (Second choice if a vault type is unwelcome — see
   below.)

3. **A sidecar in the FileDatabase vault** (**chosen**) — store the profile as its own vault
   entry keyed by `EntryKey`. Pro: keeps it out of SQL entirely, near the binary it describes,
   consistent with "binary content lives in the vault." Con: a second vault round-trip to serve
   it; new endpoint.

4. **Computed into the audio stream's header response** — no separate storage; return the profile
   as a response header / preamble on `GET api/track/{id}`. Couples profile delivery to the audio
   fetch. Awkward (headers for 512B, or a framing change to the WAV stream). Rejected.

**Rationale for the vault sidecar:**

- It honours the architectural line `CLAUDE.md` draws — `TrackEntity` stays pure metadata, the
  vault owns "binary stuff about the audio." A loudness profile is derived binary content; it
  belongs with the binary.
- It keeps the paged list response unchanged (no regression to `TracksView` load weight).
- It parallels the existing audio path exactly: the player already does a *separate* content
  fetch (`TrackMediaClient` → `api/track/{id}`) distinct from the metadata fetch
  (`TrackClient` → `api/track/page`). The profile is one more content fetch on track-select.

**Fallback if the vault type proves unwelcome: option 2** (SQL column, served only on `meta/{id}`
or a dedicated route). Simpler to migrate (one EF column) but puts derived binary in SQL. Not the
chosen path; recorded so the alternative is on the table if the vault sidecar hits friction.

The dual-database split here is real: metadata (SQL) vs derived-binary (vault). The profile is
derived binary. The vault sidecar keeps the split clean.

---

## 6. New API surface

Per the vault-sidecar storage (§5d, decided), add **one unauthenticated GET** that mirrors
the existing audio route's shape and proxy path:

### `GET api/track/{trackId}/waveform` (DeepDrftAPI, unauthenticated)

- **Route param `trackId`** (string) = `EntryKey`, same as `GET api/track/{trackId}`.
- Loads the stored profile for that entry (from the `profiles` vault / sidecar).
- Returns `200` with `WaveformProfileDto { int BucketCount; string Data; }` (base64 quantized
  bytes), or `404` if no profile exists for that track (front end then renders the flat fallback,
  §4e).
- Unauthenticated, like audio streaming — it's public listener data.

**Proxy:** add the matching forward in `DeepDrftPublic/Controllers/TrackProxyController.cs`
(currently forwards `page` and `{trackId}`); add `{trackId}/waveform`. Same thin-proxy pattern,
no logic.

**Client:** a method on `TrackMediaClient` (it owns the `DeepDrft.Content` client and the
content base address) — `GetWaveformProfileAsync(trackId) → ApiResult<WaveformProfileDto>`. Keeps
the profile fetch on the content client, consistent with §5d's "profile is content."

The CMS PreProcessing panel (§12 Phase 5) also needs server-side endpoints: a way to query which
tracks lack a profile and a way to trigger generation. Those are **authenticated CMS routes** on
`DeepDrftAPI` (ApiKey), distinct from this public read — see Phase 5 for their shape.

### New model

`WaveformProfileDto` in `DeepDrftModels` (new `DTOs/WaveformProfileDto.cs`):

```csharp
public class WaveformProfileDto {
    public int BucketCount { get; set; }
    public string Data { get; set; } = string.Empty; // base64 of byte[BucketCount], each 0..255
}
```

`DeepDrftModels` is referenced by every project (`CLAUDE.md`), so both API and client see it. The
DTO carries no algorithm tag — it is loudness-in-`[0,1]` regardless of how it was computed.

---

## 7. TypeScript seam

**First cut: no TS changes required.** The preprocessed profile arrives as data over HTTP,
is decoded in C# (`WaveformProfileDto.Data` → `double[]`), and rendered by the Blazor component
with pure pointer events (§4c/§4d). The TS audio bundle (`DeepDrftPublic/Interop/audio/`) is
untouched. The live `SpectrumVisualizer` keeps using the existing
`startSpectrumAnimation`/`SpectrumAnalyzer` path verbatim — only its *position* in the markup
changes.

**Deliberately deferred TS work (later enhancement, see §4e):** client-side loudness computation
from decoded `AudioBuffer`s for the no-profile fallback. That *would* need a new TS module
(e.g. `WaveformProfiler.ts`) reading `scheduler`'s decoded buffers and bucketing amplitude, plus
an interop method to stream buckets to Blazor as they fill. It mirrors `SpectrumAnalyzer`'s
callback pattern. **Not in the first cut** — the flat fallback covers the gap, and the CMS
PreProcessing panel removes most no-profile cases. Keep this seam in mind so the component's data
input is an abstract `double[]` that *could* later be fed by either source.

This matters for the component contract: `WaveformSeeker` should take its profile as a
parameter/observable it doesn't care about the origin of — preprocessed today, possibly
live-computed later. Don't hard-wire it to the HTTP fetch.

---

## 8. Frontend data flow

```
Track selected (TracksView.PlayTrack → PlayerService.SelectTrackStreaming)
  │
  ├── (existing) audio: TrackMediaClient.GetTrackMedia(entryKey) → stream → TS decode → playback
  │
  └── (new) profile: TrackMediaClient.GetWaveformProfileAsync(entryKey) → WaveformProfileDto
        │
        └── decode base64 → double[] profile → WaveformSeeker.Profile
```

Wiring options for *who* fetches the profile and holds it:

- **A. Player service holds it.** `StreamingAudioPlayerService` (or the base
  `AudioPlayerService`) gains a `WaveformProfile` property, fetched when a track is selected,
  exposed like `Duration`/`CurrentTime`. `WaveformSeeker` reads it off the cascaded
  `IStreamingPlayerService`, re-rendering on `StateChanged` — the same pattern
  `SpectrumVisualizer` and `AudioPlayerBar` already use. **Recommended:** the profile is part
  of "current track state," and the player service is already the single source the seek zone
  binds to. One place fetches, one place caches per track, cleared on `Unload`.

- **B. WaveformSeeker fetches its own.** Component takes `EntryKey` + `TrackMediaClient`,
  fetches in `OnParametersSet` when the key changes. Simpler to reason about in isolation but
  duplicates "current track" knowledge the player already owns and risks double-fetch / stale
  key on rapid track switches.

- **C. A dedicated `WaveformProfileViewModel`** (MVVM convention in `CLAUDE.md`) scoped in DI,
  fetches and caches by `EntryKey`, injected into the component. Cleanest separation, an extra
  moving part. Reasonable if profiles get reused across views (e.g. mini-waveforms on track
  cards later — see §10).

**Recommend A for the first cut** (profile as player-service state — matches the established
binding pattern and the "one source, multiple views" instinct: the seeker is just another view
over current-track state). Promote to C later if profiles need to be consumed outside the
player (track-card waveforms).
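
A rough sketch of the state option A would add, written as a standalone class so it can be read in isolation. Everything here is hypothetical: `WaveformProfileState`, `LoadAsync`, and the `fetch` delegate stand in for whatever the player service and `TrackMediaClient` actually expose. The part worth noting is the key check after the await, which drops a slow response for a track the listener has already switched away from.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical holder; in the plan this state would live on the player service.
public sealed class WaveformProfileState
{
    private string? _requestedKey;

    // null means "no profile": the seeker renders the flat fallback.
    public double[]? Profile { get; private set; }
    public event Action? StateChanged;

    // fetch returns null when no profile exists for the track (the 404 case).
    public async Task LoadAsync(string entryKey, Func<string, Task<double[]?>> fetch)
    {
        _requestedKey = entryKey;
        Profile = null; // show the fallback while the new profile loads
        StateChanged?.Invoke();

        double[]? result = await fetch(entryKey);

        // Drop the result if another track was selected, or unloaded, in the meantime.
        if (_requestedKey != entryKey) return;
        Profile = result;
        StateChanged?.Invoke();
    }

    public void Unload()
    {
        _requestedKey = null;
        Profile = null;
        StateChanged?.Invoke();
    }
}
```

The fetch failing or returning nothing leaves `Profile` null, so seeking is unaffected either way.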

`CurrentTime` / `Duration` for the playhead come from the player service exactly as
`PlayerSeekZone` reads them today — no change.

---

## 9. Component & file inventory

New:

- `DeepDrftPublic.Client/Controls/AudioPlayerBar/WaveformSeeker.razor` (+ `.razor.cs`, `.razor.css`)
- `DeepDrftModels/DTOs/WaveformProfileDto.cs`
- `DeepDrftContent/Processors/WaveformProfileService.cs` — owns the PCM walk, bucketing,
  normalization, storage; takes an `ILoudnessAlgorithm`.
- `DeepDrftContent/Processors/ILoudnessAlgorithm.cs` — the swappable loudness strategy (§5a).
- `DeepDrftContent/Processors/RmsLoudnessAlgorithm.cs` — first implementation (RMS). LUFS is a
  future sibling implementation, not built now.
- `WaveformProfileOptions` (config-bound) — carries `BucketCount` (default 512) and any future
  algorithm-selection knob.
- DeepDrftAPI public read route `GET api/track/{trackId}/waveform` in `TrackController.cs` +
  proxy in `TrackProxyController.cs`.
- DeepDrftAPI CMS routes (ApiKey) for the PreProcessing panel: query missing-profile tracks +
  trigger generation (§12 Phase 5).
- `TrackMediaClient.GetWaveformProfileAsync`
- (storage) new `profiles` vault constant in `VaultConstants`.
- (CMS) PreProcessing panel surface in `DeepDrftManager` — see Phase 5 for the component/service
  inventory (it lands with that phase, not the first cut).

Changed:

- `PlayerSeekZone.razor` — swap `MudSlider` block for `<WaveformSeeker/>`; drop the
  `<SpectrumVisualizer/>` (moves to volume).
- `VolumeControls.razor` → renamed **`VolumeZone.razor`** (decided) — stack `<SpectrumVisualizer/>`
  above the volume slider.
- `AudioPlayerBar.razor.css` — adjust volume cluster to host the spectrum; seeker sizing.
- `SpectrumVisualizer` — set `BucketCount=24` for the narrow volume slot (§3c).
- `AudioPlayerBar.razor.cs` — minimal; seek callbacks already abstract. Possibly hold/clear
  `WaveformProfile` if §8-A.
- `StreamingAudioPlayerService` / `AudioPlayerService` — add `WaveformProfile` state + fetch (§8-A).
- `UnifiedTrackService.UploadAsync` — compute + persist profile on upload via `WaveformProfileService`.

Untouched (important): the entire TS audio bundle, the seek-beyond-buffer offset path,
`WavOffsetService`, the streaming decode pipeline.

---

## 10. Future options this unlocks (don't build now, leave room for)

- **LUFS (or other perceptual) loudness profile.** The `ILoudnessAlgorithm` seam (§5a) exists
  precisely so this drops in as a second strategy without touching the component, wire format, or
  storage. The cheapest of the future moves because the abstraction is built up front.
- **Track-card mini-waveforms.** Once profiles exist as a reusable resource, `TrackCard` could
  show a tiny loudness sparkline. This is the argument for the §8-C `WaveformProfileViewModel`
  eventually, and for storing profiles where non-player surfaces can fetch them cheaply (favours
  the vault sidecar + endpoint, §5d-3).
- **Loudness-normalized playback / waveform colouring by energy.** The same profile data could
  drive auto-gain or heat-coloured bars.
- **Live-computed profiles** for the no-profile case (§7 deferred TS).
- **Higher-res zoomed scrub** on long tracks (re-fetch a denser profile for a time window) —
  why a generous, configurable stored N and client-side downsampling is worth it now.

Keep the component's profile input origin-agnostic and the stored resolution generous so these
stay cheap to add.

---

## 11. Decisions (resolved 2026-06-05)

All seven forks below are **decided**. Recorded here so the rationale travels with the spec.

1. **Storage location (§5d): vault sidecar + dedicated endpoint — decided ✓.** Profile is derived
   binary; it lives in the vault, `TrackEntity` stays pure metadata, the paged list stays lean.
   SQL-column-on-`meta/{id}` is the recorded fallback only if the vault type hits friction.
2. **Names (§1): component `WaveformSeeker`; data `WaveformProfile` (`WaveformProfileDto`,
   `waveformBuckets`, `profile`) — decided ✓.** Honest naming; the data is named for the concept,
   not the algorithm, so RMS→LUFS never forces a rename.
3. **Live-spectrum bucket count (§3c): 24 buckets, parameterized — decided ✓.** Set via
   `BucketCount` on `SpectrumVisualizer` so it can be tuned without a code change.
4. **Stored resolution + wire format (§4a/§5b): N configurable (default 512) via
   `WaveformProfileOptions`; quantized `byte[]` base64 — decided ✓.** Front end derives its
   rendered bar count from available width regardless of N.
5. **Backfill (§5c): CMS PreProcessing panel, not a CLI — decided ✓.** The CMS track grid shows
   missing-profile tracks and offers 1-click generation per track (and/or bulk); compute runs
   server-side via `WaveformProfileService`. See §12 Phase 5.
6. **Normalization (§5a): peak-normalize — decided ✓.** Per-track shape over cross-track absolute
   loudness; a future LUFS algorithm can normalize differently behind the same interface.
7. **`VolumeControls` → `VolumeZone` rename (§3b) — decided ✓.** Symmetry with the transport and
   seek zones.

**Cross-cutting decision (§5a):** the loudness measure is a swappable `ILoudnessAlgorithm`, RMS
first, LUFS the named future alternative — not hardwired to RMS.

---
|
||
|
||
## 12. Implementation phases (ordered, delegable)

Sequenced so each phase has a shippable deliverable and the UI can land before existing tracks
are all preprocessed. Phases 1–2 (backend) and phase 3 (layout move) are **parallelizable** —
they touch disjoint files and meet only at the client fetch in phase 4. §11 decisions are all
resolved, so there is no decisions-gate phase.

**Phase 1 — Loudness computation + storage (backend).** `WaveformProfileService` in
`DeepDrftContent` (extend the existing PCM walk) with an `ILoudnessAlgorithm` strategy and the
`RmsLoudnessAlgorithm` first implementation. Wire into `UnifiedTrackService.UploadAsync` to
compute + persist on upload (vault sidecar, §5d). Add `WaveformProfileDto` to `DeepDrftModels` and
`WaveformProfileOptions` (default N=512).
*Deliverable:* new uploads get a stored profile; unit-test the RMS math against a known WAV, and
unit-test that a second `ILoudnessAlgorithm` swaps in cleanly (guards the abstraction).
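
A minimal sketch of the seam this phase guards, assuming a simplified `ILoudnessAlgorithm` shape (the real signature is whatever §5a defines and may differ). The `Measure` signature, `PeakLoudnessAlgorithm`, `BucketWalk`, and `WaveformQuantizer` are hypothetical names used only for illustration, and samples are assumed to be mono floats in [-1, 1]. It shows RMS per bucket, peak-normalization to the loudest bucket, quantization to one byte per bucket, and a second strategy swapping in behind the same interface.

```csharp
using System;
using System.Linq;

// Tiny synthetic signal: a quieter alternating half, then a full-scale half.
float[] samples = { 0f, 0.4f, 0f, 0.4f, 1f, 1f, 1f, 1f };

// Same call site for every strategy: nothing downstream changes when one is swapped.
ILoudnessAlgorithm[] strategies = { new RmsLoudnessAlgorithm(), new PeakLoudnessAlgorithm() };
foreach (ILoudnessAlgorithm algorithm in strategies)
{
    byte[] profile = WaveformQuantizer.Quantize(algorithm.Measure(samples, bucketCount: 2));
    string joined = string.Join(",", profile);
    Console.WriteLine(algorithm.GetType().Name + ": " + joined);
}
// RmsLoudnessAlgorithm: 72,255
// PeakLoudnessAlgorithm: 102,255

// Assumed interface shape: one raw (un-normalized) loudness value per bucket.
public interface ILoudnessAlgorithm
{
    double[] Measure(float[] samples, int bucketCount);
}

public sealed class RmsLoudnessAlgorithm : ILoudnessAlgorithm
{
    public double[] Measure(float[] samples, int bucketCount) =>
        BucketWalk.Map(samples, bucketCount, (start, end) =>
        {
            double sumSquares = 0;
            for (int i = start; i < end; i++) sumSquares += (double)samples[i] * samples[i];
            return Math.Sqrt(sumSquares / (end - start)); // root of the mean square
        });
}

// Stand-in second strategy, present only to show the swap; it is not LUFS.
public sealed class PeakLoudnessAlgorithm : ILoudnessAlgorithm
{
    public double[] Measure(float[] samples, int bucketCount) =>
        BucketWalk.Map(samples, bucketCount, (start, end) =>
        {
            double max = 0;
            for (int i = start; i < end; i++) max = Math.Max(max, Math.Abs(samples[i]));
            return max;
        });
}

public static class BucketWalk
{
    // Splits the sample range into bucketCount contiguous spans covering every sample once.
    public static double[] Map(float[] samples, int bucketCount, Func<int, int, double> measure)
    {
        var result = new double[bucketCount];
        for (int b = 0; b < bucketCount; b++)
        {
            int start = (int)((long)b * samples.Length / bucketCount);
            int end = (int)((long)(b + 1) * samples.Length / bucketCount);
            result[b] = end > start ? measure(start, end) : 0; // empty span: treat as silence
        }
        return result;
    }
}

public static class WaveformQuantizer
{
    // Peak-normalize so the loudest bucket maps to 255, then quantize to one byte per bucket.
    public static byte[] Quantize(double[] raw)
    {
        var bytes = new byte[raw.Length];
        double peak = raw.Length == 0 ? 0 : raw.Max();
        if (peak <= 0) return bytes; // silent track: all zeros
        for (int i = 0; i < raw.Length; i++)
            bytes[i] = (byte)Math.Round(raw[i] / peak * 255.0);
        return bytes;
    }
}
```

Keeping normalization and quantization outside the algorithm is one possible split, chosen here so a swapped strategy only has to produce raw per-bucket values; §11 decision 6 leaves room for a future algorithm to normalize differently.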

**Phase 2 — Public read API + proxy + client (backend/transport).** Add
`GET api/track/{trackId}/waveform`, the proxy forward, and `TrackMediaClient.GetWaveformProfileAsync`.
*Deliverable:* a track's profile is fetchable end-to-end over HTTP. Can be tested with curl
before any UI.
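
A small sketch of what the payload could look like on the wire, assuming the DTO is serialized with `System.Text.Json`, which writes a `byte[]` as a base64 string. The record shape below (`Algorithm`, `Buckets`) is a guess for illustration only; the actual `WaveformProfileDto` fields are defined elsewhere in this spec and may differ.

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Web defaults give camelCase property names, as ASP.NET Core does for API responses.
var options = new JsonSerializerOptions(JsonSerializerDefaults.Web);

var sent = new WaveformProfileDto("rms", new byte[] { 0, 64, 128, 255 });
string json = JsonSerializer.Serialize(sent, options);
Console.WriteLine(json); // the buckets appear as one base64 string, not a JSON number array

// The same bytes encoded directly, for comparison with the JSON above.
Console.WriteLine(Convert.ToBase64String(sent.Buckets)); // AECA/w==

// Round trip: the client gets back exactly the bytes the server stored.
WaveformProfileDto received = JsonSerializer.Deserialize<WaveformProfileDto>(json, options)!;
Console.WriteLine(received.Buckets.SequenceEqual(sent.Buckets)); // True

// Hypothetical DTO shape, for illustration only.
public sealed record WaveformProfileDto(string Algorithm, byte[] Buckets);
```

At the default N=512 the base64 string is 684 characters, which is why shipping the profile as a quantized array stays cheap.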

**Phase 3 — Layout move (frontend, parallel with 1–2).** Move `SpectrumVisualizer` from
`PlayerSeekZone` into the volume cluster (renamed `VolumeZone`); adjust CSS (§3c); set
`BucketCount=24`.
*Deliverable:* live spectrum sits above the volume slider; seek zone temporarily keeps the
MudSlider (or a placeholder). Player still fully works. This de-risks the layout independently
of the new component.

**Phase 4 — WaveformSeeker component (frontend, needs 2 + 3).** Build `WaveformSeeker.razor`:
DOM bars, played/unplayed split via clip overlay (§4d), pointer-capture seek (§4c), flat
fallback (§4e), rendered bar count derived from width. Wire profile via player-service state
(§8-A). Replace the MudSlider in `PlayerSeekZone` with it.
*Deliverable:* the new seekbar is live for tracks that have a profile; flat-but-seekable for
those that don't.
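
A rough sketch of "rendered bar count derived from width", assuming fixed bar and gap widths in pixels and a max-per-bar reduction. `BarCountForWidth`, `Resample`, and the 2 px / 1 px defaults are illustrative assumptions, not values from this spec; averaging instead of taking the max would be an equally valid reduction.

```csharp
using System;

byte[] stored = { 10, 20, 30, 40 }; // stand-in for a stored N-bucket profile

Console.WriteLine(BarCountForWidth(300));                    // 100
Console.WriteLine(string.Join(",", Resample(stored, 2)));    // 20,40
Console.WriteLine(string.Join(",", Resample(stored, 8)));    // 10,10,20,20,30,30,40,40
Console.WriteLine(string.Join(",", Resample(new byte[0], 3))); // 0,0,0 (flat fallback)

// How many bars of barPx width, separated by gapPx, fit into widthPx.
static int BarCountForWidth(double widthPx, double barPx = 2, double gapPx = 1) =>
    Math.Max(1, (int)Math.Floor((widthPx + gapPx) / (barPx + gapPx)));

// Maps a stored profile of any length onto barCount bars.
static byte[] Resample(byte[] stored, int barCount)
{
    var bars = new byte[barCount];
    if (stored.Length == 0) return bars; // no profile: flat but still seekable

    for (int i = 0; i < barCount; i++)
    {
        int start = (int)((long)i * stored.Length / barCount);
        int end = (int)((long)(i + 1) * stored.Length / barCount);
        if (end <= start) end = start + 1; // more bars than buckets: repeat the nearest bucket

        // Taking the max keeps short loud moments visible after downsampling.
        byte max = 0;
        for (int j = start; j < end; j++) max = Math.Max(max, stored[j]);
        bars[i] = max;
    }
    return bars;
}
```

Because the reduction runs client-side against whatever N was stored, changing the default in `WaveformProfileOptions` would not require a component change.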

**Phase 5 — CMS PreProcessing panel (CMS, after 1).** In `DeepDrftManager`, add a PreProcessing
feature to the CMS track grid: a column/indicator showing which tracks **lack a waveform profile**
and a per-track **Generate** action (and/or a bulk "generate all missing" action). The grid
queries missing-profile state and triggers generation through authenticated CMS API routes on
`DeepDrftAPI`; the compute runs server-side via the same `WaveformProfileService` (no CLI). New
surface roughly: a CMS service method on `ICmsTrackService`/`CmsTrackService` for
list-missing + generate, the backing `DeepDrftAPI` routes (ApiKey), and the grid column/action in
the Tracks CMS page.
*Deliverable:* a CMS admin can see and one-click-fill any missing profile; the no-profile fallback
becomes rare/never as the backlog is worked off in-app.

**Deferred (not scheduled):** live client-side loudness compute (§7), track-card mini-waveforms
(§10), a LUFS `ILoudnessAlgorithm` (§5a/§10). Tracked here so the component contract stays
origin-agnostic and the algorithm stays swappable.

---
## 13. What this plan deliberately does NOT do

- Does not touch the streaming decode pipeline, seek-beyond-buffer, or `WavOffsetService`.
- Does not add an audio-processing dependency (NAudio etc.) for the RMS path — the existing PCM
parser suffices. (A future LUFS `ILoudnessAlgorithm` may revisit this, on its own merits.)
- Does not compute the profile on the playback path — preprocessed only (the whole point).
- Does not change `TrackEntity`'s metadata contract — the profile lives in the vault sidecar.
- Does not add a CLI; existing-track preprocessing is the in-CMS PreProcessing panel (§12 Phase 5).
- Does not require TS bundle changes in the first cut.