# Phase 16 — Anonymous Play & Share Tracking (Design Spec)

Status: **design-complete — decisions D1–D7 resolved by Daniel 2026-06-19.** Author:
product-designer. Drafted 2026-06-18; decisions resolved and phasing re-sequenced 2026-06-19.
**No code has been written by this doc.** This is the phase deferred behind the home-hero "Plays"
stat card, which today renders a static `XXX / Plays (Coming Soon)` odometer placeholder in
`NowPlayingStats.razor`.

**Phasing note (Daniel directive, 2026-06-19):** the waves run **bottom-up** — foundation first,
metrics stacked on the substrate, the user-visible Plays-card flip is the **capstone (built last)**.
This deliberately **reverses** the earlier "visible win comes early" framing: Daniel does not care
about the live card until everything underneath it is finished, so the whole telemetry substrate and
all metrics (including unique listeners) land before the card lights up. See §6.

This spec adds a **privacy-light, anonymous play & share telemetry layer** to the public site:
counting plays (bucketed by completion) and shares, tied to individual tracks and releases, with an
optional unique-listener "plus" metric. It does **not** add accounts, PII, or any per-user identity
model — that is a hard constraint, not a deferral.

## Phase numbering

This is **Phase 16**. Phase 15 (Visualizer Controls Enhancements) is the highest-numbered phase in
`PLAN.md`. Phases 11 and 10-Reframe are landed; no phase 16 exists yet. If a concurrent worktree has
claimed 16 by the time this is scoped, bump to the next free number — the content is
number-independent.

## Cross-references (read these before implementing)

- `DeepDrftPublic.Client/Services/StreamingAudioPlayerService.cs` — the production player. The
  instrumentation seam lives here: `LoadTrackStreaming` (track-load = play-start candidate),
  the progress callback path, and `ResetToIdle` (stop/unload/switch). `_currentTrackId` holds the
  current `EntryKey`. **No release id is currently held by the player** — see §2.3.
- `DeepDrftPublic.Client/Services/AudioPlayerService.cs` — base class.
  `OnProgressCallback(double currentTime)` is the per-tick position seam; `OnPlaybackEndCallback` is
  the organic end-of-stream seam (and the only place `TrackEnded` fires). `Duration` is set from the
  WAV header during load.
- `DeepDrftPublic.Client/Services/AudioInteropService.cs` — `SetOnProgressCallbackAsync` /
  `SetOnEndCallbackAsync` are the JS→.NET callbacks already wired in `InitializeAsync`. Progress is
  already throttled to ~10/sec on the JS side.
- `DeepDrftPublic.Client/Services/QueueService.cs` — auto-advance orchestrator. Album playthroughs
  flow `PlayRelease → PlayCurrent → SelectTrackStreaming` per track; `OnTrackEnded` advances. Every
  track in an album play is an independent `SelectTrackStreaming` call, so per-track play events
  arise naturally without queue-specific instrumentation.
- `DeepDrftPublic.Client/Controls/SharePopover.razor[.cs]` — the **real** share surface. Two share
  actions exist today: **Copy link** (track mode + release mode) and **Copy embed** (track mode
  only, an `<iframe>` snippet to `/FramePlayer?TrackEntryKey=…`). Both are clipboard writes — there
  is no network call, so **no share is recorded today**. This is the share-event origin (§2.1).
- `DeepDrftPublic.Client/Pages/FramePlayer.razor` — the embeddable single-track player the embed
  snippet points at. Plays inside a third-party `<iframe>` should count (§1d, edge cases).
- `DeepDrftAPI/Controllers/StatsController.cs` + `DeepDrftModels/DTOs/HomeStatsDto.cs` — the landed
  home-stats pattern the Plays card will consume. `GET api/stats/home` returns a bare DTO; aggregation
  lives in `TrackRepository.GetHomeStatsAsync` / `ITrackService`; the controller is a thin boundary.
  **This is the template the play/share read path mirrors** (§4, §5).
- `DeepDrftPublic.Client/Controls/NowPlayingStats.razor` — the three-card hero stat row. The Plays
  card is the third card; it currently shows the static placeholder. This phase fills it.
- `DeepDrftData` (`DeepDrftContext`, `TrackRepository`, `TrackManager`, `Migrations`) — where new
  SQL tables/migrations and aggregation queries live. `DeepDrftAPI` owns the HTTP surface.
- `DeepDrftPublic` proxy controller (`api/track/*`) — the public site's browser→API proxy hop. A new
  `api/event/*` or `api/stats/*` write path needs a matching proxy route (the WASM client cannot
  reach `DeepDrftAPI` directly; SSR can).

---

## 1. The metrics, defined precisely

Three metrics, two core + one stretch:

### 1a. Play (core)

A **play** is a single listener's session of listening to one track. It is recorded once per
track-listen, classified by how far the listener got:

- **partial** — playback reached **< 30%** of the track's duration before the session ended
  (switched track, stopped, navigated away, closed tab).
- **complete** — playback reached **> 80%** of duration.
- **middle band (30%–80%)** — **D1 RESOLVED (Daniel 2026-06-19): three-bucket `sampled`.** The
  middle band is its own bucket, `sampled` (a real listen that wasn't a skip and wasn't a finish), so
  the three buckets are exhaustive and non-overlapping: `partial` [0, 30%), `sampled` [30%, 80%],
  `complete` (80%, 100%]. The headline "Plays" number is the **sum of all three** (every started
  listen counts as a play); the buckets are the texture beneath it.

  *Road not taken:* folding the middle into "complete" (threshold "≥30% = a real play, else partial")
  — simpler, two buckets, but it discards the "listened to half" vs. "listened to the end"
  distinction, which is the most editorially interesting signal for a music collective ("which mixes
  do people actually finish?"). Three buckets chosen for that texture.

**What starts a play candidate:** a track's audio actually begins streaming for playback — i.e.
`SelectTrackStreaming` reaches the point where `StartStreamingPlayback` succeeds and `IsPlaying`
becomes true (the `_streamingPlaybackStarted` transition in `StreamAudioWithEarlyPlayback`). A track
that is *staged* (`StageTrack`) but never played does **not** count. A track that fails to load does
not count.

**What classifies the bucket:** the **maximum playback position reached** as a fraction of duration,
captured when the listen ends (track switch, stop, organic end, or page unload). Not the position at
the moment of ending — the *high-water mark* — so that seeking backward near the end doesn't demote a
complete play to partial. (See seeks under edge cases.)

**Why a high-water mark, not elapsed-listen-time:** elapsed time would require accumulating play
duration across pauses and is more state to carry. The high-water position is already trivially
derivable from the progress callback (`max(currentTime)/duration`). For v1 the simpler model is the
right call; if "engaged listening time" becomes interesting later it's an additive metric, not a
reshape.

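
The bucket rule above can be sketched as a pure function. This is an illustrative sketch only, written in TypeScript for brevity; the production tracker would presumably live in the C# player service. The names `PlayBucket` and `classifyBucket` are hypothetical, and returning `null` for an unknown duration is an assumption the spec does not state.

```typescript
// Hypothetical names; thresholds follow the D1 buckets:
// partial [0, 30%), sampled [30%, 80%], complete (80%, 100%].
type PlayBucket = "partial" | "sampled" | "complete";

function classifyBucket(highWaterSeconds: number, durationSeconds: number): PlayBucket | null {
  // Assumption: without a positive duration the listen cannot be classified.
  if (!(durationSeconds > 0)) return null;
  // Clamp so a progress tick slightly past the end still maps into [0, 1].
  const fraction = Math.min(Math.max(highWaterSeconds / durationSeconds, 0), 1);
  if (fraction < 0.3) return "partial";
  if (fraction <= 0.8) return "sampled";
  return "complete";
}
```

Because the input is the high-water mark rather than the final position, a listener who reaches the outro and then seeks back still classifies as `complete`.
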
### 1b. Share (core)

A **share** is recorded when a listener performs a share *action* — not when they open the share
popover. Two actions exist today (`SharePopover`):

- **Copy link** (track or release) → records a share against that track or release, with a
  `channel = link` tag.
- **Copy embed** (track only) → records a share with `channel = embed`.

A future "native share" (Web Share API) or per-platform button would add channels without reshaping
the metric. The share count on the Plays-card payoff is **total shares**; the channel split is
texture (and probably CMS-only, not public-facing, for v1).

**De-dupe:** copying the same link three times in a row is one intent, not three shares. Recommend a
client-side **debounce** — at most one share event per (target, channel) per short window (e.g. 60s)
per session. Cheap, prevents the obvious gaming, and matches how "copied!" already feels like one act.

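
A minimal sketch of the per-(target, channel) debounce, assuming an in-memory map that lives for the session. `ShareDebouncer` and its method names are hypothetical, and the 60-second window is the example value from the text rather than a settled constant. The clock is passed in so the logic can be exercised without real time passing.

```typescript
type ShareTargetType = "track" | "release";
type ShareChannel = "link" | "embed";

// Hypothetical helper: suppresses repeat shares of the same target/channel
// inside a short window.
class ShareDebouncer {
  private readonly windowMs: number;
  private readonly lastSent = new Map<string, number>();

  constructor(windowMs: number = 60000) {
    this.windowMs = windowMs;
  }

  // Returns true when a share event should be emitted, false when suppressed.
  shouldEmit(targetType: ShareTargetType, targetKey: string, channel: ShareChannel, nowMs: number): boolean {
    const key = `${targetType}:${targetKey}:${channel}`;
    const last = this.lastSent.get(key);
    if (last !== undefined && nowMs - last < this.windowMs) return false;
    this.lastSent.set(key, nowMs);
    return true;
  }
}
```

A caller would check `shouldEmit(...)` after a successful clipboard write and send the share event only when it returns `true`.
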
### 1c. Unique listeners (stretch / "plus" — lower priority)

A **unique listener** is an approximate distinct-listener count, tied to a track or release. **D3
RESOLVED (Daniel 2026-06-19, by default): all-time window** — not rolling-30-day. All-time fits the
"N listeners reached" framing and the Option-A mechanism (§3). This is the metric most in tension
with the no-PII constraint; it is approximate by design (we estimate reach, we do not build
identity). Mechanism: **§3**. *Low-risk to revisit during its wave (§6) if implementation surfaces a
reason* — a rolling window is an additive aggregation, not a reshape. Built as part of the substrate
(no longer an indefinite tail — see §6).

### 1d. Edge cases (apply to plays unless noted)

- **Replays.** Playing the same track twice in one session = two plays. The play event fires per
  `SelectTrackStreaming` that reaches playback. If a listener loops a track ten times, that's ten
  plays. Acceptable for v1 — looping is genuine listening. (Unique listeners absorbs the "but it's
  the same person" concern at the reach level.)
- **Seeks (within a play).** Seeking does not start a new play. The high-water position keeps
  climbing; seeking backward never lowers it. So "seek to the end to check the outro, then seek
  back" still classifies as `complete` — correct, they heard the end.
- **Seek-beyond-buffer re-request.** The `GET api/track/{id}` with a `Range` header during
  `SeekBeyondBuffer` is a *byte* re-fetch of the **same** play — it must **not** start a new play
  event. Because the play event is keyed off the player-service `SelectTrackStreaming` lifecycle (not
  the HTTP fetch), this is free: `SeekBeyondBuffer` reuses `_currentTrackId` and never calls
  `SelectTrackStreaming`. **Instrument at the player-service level, never at the HTTP/media-client
  level** — the media client fires multiple times per play.
- **Very short tracks.** A 4-second sting: 30%/80% still apply proportionally. No special-casing for
  v1. (If "complete" on a 4s clip feels too cheap, a minimum-absolute-seconds floor is an additive
  tweak — flag as a tuning knob, not a v1 requirement.)
- **Rapid skips.** Listener clicks through ten tracks in five seconds. Each reaches `< 30%` →
  ten `partial` plays. **D2 RESOLVED (Daniel 2026-06-19): apply a minimum-engagement floor.** A
  listen registers as a play only once playback reaches **≥ 3 seconds OR ≥ 5% of duration, whichever
  is smaller** (so a sub-60s clip floors on the percentage; anything longer floors on the 3-second
  wall). Below the floor it is a *preview/skip*, dropped entirely (no event sent). This keeps the
  headline number honest — a skim through the archive isn't 40 plays — while still capturing genuine
  short partial listens. The floor is a single tunable constant (one place to change if the band
  later wants it looser/tighter). *Road not taken:* floor = 0 ("they hit play, it's a play") —
  simpler and defensible, but inflates the count on a browsing session.
- **Tab close / navigation mid-play.** The play must still be recorded with its high-water bucket.
  This is the hardest delivery case and drives the beacon recommendation in §2.2.
- **Embedded (`FramePlayer`) plays.** A play inside a third-party iframe is a real play and should
  count. The embed runs the same player stack, so it instruments for free — but it may run
  cross-origin, which bears on the unique-listener cookie/storage mechanism (§3) and the share
  attribution (an embed play could optionally carry the embedding page as a referrer dimension —
  out of scope for v1, flag as adjacent).
- **CMS / admin playback.** Plays generated by an admin auditing tracks in the CMS should ideally not
  pollute the public counts. The CMS is a separate app (`DeepDrftManager`) and does not run this
  player stack, so this is mostly free — but a logged-in admin browsing the *public* site would
  count. Acceptable for v1 (low volume); flag as a known caveat.

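
The minimum-engagement floor from the rapid-skips case (D2) reduces to a single comparison. The sketch below is illustrative: the constant and function names are hypothetical, and treating an unknown duration as "below the floor" is an assumption. For a 10-minute mix the floor is 3 seconds; for a 4-second sting it is 0.2 seconds (5% of duration).

```typescript
// Hypothetical tuning constants, kept in one place as the spec suggests.
const FLOOR_SECONDS = 3;
const FLOOR_FRACTION = 0.05;

// True when the listen is long enough to register as a play at all.
function meetsEngagementFloor(highWaterSeconds: number, durationSeconds: number): boolean {
  // Assumption: without a positive duration the listen is treated as a skip.
  if (!(durationSeconds > 0)) return false;
  // "Whichever is smaller": short clips floor on the percentage,
  // anything over 60 s floors on the 3-second wall.
  const floorSeconds = Math.min(FLOOR_SECONDS, FLOOR_FRACTION * durationSeconds);
  return highWaterSeconds >= floorSeconds;
}
```
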
---

## 2. Instrumentation

### 2.1 Where events originate

**Play events: in `StreamingAudioPlayerService`, not the UI and not the HTTP layer.** The player
service is the single chokepoint every playback path flows through (home StreamNow, gallery, queue
auto-advance, detail-page play, embed). Instrumenting once there covers all of them and dodges the
seek-beyond-buffer double-count trap.

Concretely, the player service grows a small internal **play-session tracker**:

- On the `_streamingPlaybackStarted` transition (playback actually begins): open a play session for
  `_currentTrackId` — record the track `EntryKey`, the resolved release id (§2.3), the duration once
  the WAV header sets it, and start the high-water mark at 0.
- On each progress tick (`OnProgressCallback`, already firing ≤10/sec): advance the high-water mark
  (`max(highWater, currentTime)`).
- On session-end — whichever comes first: organic end (`OnPlaybackEndCallback`), a superseding
  `LoadTrackStreaming` / `ResetToIdle` (track switch / stop / unload / dispose), or page unload
  (§2.2) — **close the session**: compute `highWater/duration`, apply the §1d floor, classify the
  bucket, and emit one play event. Then clear the tracker.

This keeps all play-counting logic in one place, behind one seam, testable against a fake interop.
**It does not change the playback path** — it's an observer on transitions the service already makes.

**Recommendation:** factor the tracker into a small injectable collaborator (e.g. `IPlayTracker` /
`PlayTracker`) that the player service calls (`OnPlaybackStarted`, `OnProgress`, `OnPlaybackEnded`),
rather than inlining HTTP calls into the player. Keeps the player's single responsibility intact and
matches the repo's "logic in services, not in the playback path" discipline. The tracker owns the
beacon call and the de-dupe/floor logic. (Final structural call is staff-engineer's; this is the
steer.)

**Share events: in `SharePopover` (or a small `IShareTracker` it calls).** The `CopyLink` /
`CopyEmbed` handlers already exist and already know the target (track `EntryKey` or release
`EntryKey` + medium) and the channel. After a successful clipboard write, fire a share event. Apply
the §1b debounce in the tracker.

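
The open / advance / close lifecycle of the play-session tracker can be sketched as a small state machine. This is a TypeScript illustration of the shape only; the recommended `IPlayTracker` would be a C# collaborator, and `PlayTrackerSketch`, `PlayEvent`, and the injected `emit` callback are hypothetical names. The sketch assumes the duration is known when the session opens, which simplifies the "once the WAV header sets it" step.

```typescript
type Bucket = "partial" | "sampled" | "complete";

interface PlayEvent {
  trackEntryKey: string;
  bucket: Bucket;
}

class PlayTrackerSketch {
  private trackKey: string | null = null;
  private durationSeconds = 0;
  private highWater = 0;
  private readonly emit: (event: PlayEvent) => void;

  constructor(emit: (event: PlayEvent) => void) {
    this.emit = emit;
  }

  // Playback actually began: a superseding load closes any open session first.
  onPlaybackStarted(trackEntryKey: string, durationSeconds: number): void {
    this.onPlaybackEnded();
    this.trackKey = trackEntryKey;
    this.durationSeconds = durationSeconds;
    this.highWater = 0;
  }

  // Progress tick: the high-water mark only ever climbs, so seeks back are ignored.
  onProgress(currentTime: number): void {
    if (this.trackKey !== null) this.highWater = Math.max(this.highWater, currentTime);
  }

  // Session end (organic end, switch, stop, or page unload): emit at most one event.
  onPlaybackEnded(): void {
    if (this.trackKey === null) return;
    const key = this.trackKey;
    this.trackKey = null; // clear first so a second close cannot emit again
    const duration = this.durationSeconds;
    if (!(duration > 0)) return;
    // Engagement floor: 3 s or 5% of duration, whichever is smaller.
    if (this.highWater < Math.min(3, 0.05 * duration)) return;
    const fraction = Math.min(this.highWater / duration, 1);
    const bucket: Bucket = fraction < 0.3 ? "partial" : fraction <= 0.8 ? "sampled" : "complete";
    this.emit({ trackEntryKey: key, bucket });
  }
}
```

Injecting `emit` keeps the transport (the beacon call) out of the tracker logic, so the same class can be driven by a fake in tests.
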
### 2.2 What gets sent, when, and how (fire-and-forget vs. tracked)

**Recommendation: fire-and-forget `sendBeacon` for play and share events.** These are telemetry, not
transactions — a dropped event is acceptable; blocking the UI or the navigation for one is not.
`navigator.sendBeacon` is purpose-built for exactly this (it survives page unload, which the
tab-close edge case requires) and is a tiny TS interop addition alongside the existing audio interop.

- **Play event payload (sent once at session close):**
  `{ trackEntryKey, bucket: "partial"|"sampled"|"complete", anonId? }`. The release is not sent; it
  is resolved server-side (§2.3). No duration or position is sent — the bucket is
  computed client-side, so the server stores a classification, not raw listening data (a privacy
  plus: we never transmit *how long* someone listened, only a coarse bucket). `anonId` is the
  unique-listener token and is **omitted entirely** if the unique-listener feature is off or the
  listener has opted out (§3).
- **Share event payload:**
  `{ targetType: "track"|"release", targetKey, channel: "link"|"embed", anonId? }`.
- **Transport:** a single new endpoint family `POST api/event/play` and `POST api/event/share`
  (proxied through `DeepDrftPublic` for the WASM client, same hop as `api/track/*`). `sendBeacon`
  issues a `POST` with a small JSON body; the endpoint returns `202 Accepted` and does the write
  async. **Unauthenticated** (same posture as the public reads) — but rate-limited and validated
  (§2.5).

**Why not a tracked/awaited call:** the only thing we'd gain is a delivery confirmation we don't need
and a failure path we don't want (a telemetry 500 must never surface to a listener). The beacon's
"can't read the response" limitation is irrelevant — we don't act on the response.

**Page-unload delivery:** register a `pagehide`/`visibilitychange→hidden` handler that closes any open
play session via the beacon. This is the canonical pattern (analytics libraries all do this) and is
why `sendBeacon` over `fetch` matters — a plain `fetch` can be cancelled on unload (unless it is sent
with `keepalive`), `sendBeacon` is not.

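
A hedged sketch of the fire-and-forget send. The beacon function is injected so the logic runs outside a browser; in the page it would wrap `navigator.sendBeacon`, as the commented wiring shows. `BeaconFn`, `sendPlayEvent`, and the `tracker` object referenced in the comments are hypothetical names, and the relative URL assumes the same-origin proxy route described above.

```typescript
// Transport abstraction: returns true if the event was queued for delivery.
type BeaconFn = (url: string, body: string) => boolean;

interface PlayEventPayload {
  trackEntryKey: string;
  bucket: "partial" | "sampled" | "complete";
  anonId?: string; // omitted entirely when unique listeners is off or opted out
}

function sendPlayEvent(beacon: BeaconFn, payload: PlayEventPayload): boolean {
  try {
    // JSON.stringify drops undefined properties, so a missing anonId is not sent.
    return beacon("api/event/play", JSON.stringify(payload));
  } catch {
    // Telemetry failures must never surface to the listener.
    return false;
  }
}

// Browser wiring (illustrative, not executed here):
//
// const beacon: BeaconFn = (url, body) =>
//   navigator.sendBeacon(url, new Blob([body], { type: "application/json" }));
//
// window.addEventListener("pagehide", () => tracker.onPlaybackEnded());
// document.addEventListener("visibilitychange", () => {
//   if (document.visibilityState === "hidden") tracker.onPlaybackEnded();
// });
```

The boolean result only says whether the browser accepted the request into its queue; the caller ignores it, consistent with the fire-and-forget posture.
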
### 2.3 Resolving the release id for a play

The player today holds `_currentTrackId` (the track `EntryKey`) but **not** the release. A play event
wants both (the metric is "tied to individual tracks and releases"). Options:

1. **Carry release context on the `TrackDto`/play call.** When a play originates from a release detail
   page or the queue (`PlayRelease`), the caller knows the release. Thread a release `EntryKey` +
   medium into `SelectTrackStreaming` (optional param) or onto the staged context. Clean, explicit.
2. **Resolve server-side from the track.** The play event sends only `trackEntryKey`; the API joins
   track→release at write or aggregation time. The track→release link already exists in SQL. Zero
   client plumbing; the release dimension is always correct even for plays that started without
   release context (e.g. StreamNow random track).

**RESOLVED: option 2 (resolve server-side).** The track→release join is authoritative and already in
`DeepDrftData`; sending only the track key keeps the client dumb and the payload minimal, and a
random-track play still gets correctly attributed to its release without the client knowing. The
client sends what it cheaply knows (track key); the server enriches. **D4 RESOLVED (Daniel
2026-06-19): release plays are derived** (sum of plays of the release's tracks), not a
separately-counted "release was played" event — exactly right for multi-track Cuts and trivially
correct for single-track Session/Mix. The client sends only the track key; release attribution and
release-total derivation are both server-side.

Shares already carry the right target directly (`SharePopover` knows track vs. release), so share
attribution needs no resolution step.

### 2.4 Avoiding double-counting

- **Plays:** one session = one event, enforced by the player-service tracker (only
  `SelectTrackStreaming` opens a session; seek-beyond-buffer and progress ticks never do). Covered by
  §2.1.
- **Shares:** the §1b per-(target,channel) debounce.
- **Beacon retries:** `sendBeacon` does not retry, so no transport-level duplication. If a future
  switch to `fetch` adds retries, an idempotency key on the event would be needed — not for v1.

### 2.5 Abuse / inflation resistance (privacy-light, so light-touch)

Anonymous unauthenticated write endpoints invite inflation. v1 posture: **make casual gaming
annoying, accept that determined gaming is possible** (this is a band's vanity-and-texture counter,
not ad-revenue telemetry — the "90s visitor counter vibe" Daniel wants). Measures:

- Server-side **rate-limit per IP** on the event endpoints (e.g. N events/minute) — coarse, stateless,
  standard ASP.NET rate-limiting middleware.
- The play **engagement floor** (§1d, D2) already drops trivial skim-spam.
- Reject malformed payloads (unknown bucket, missing track key, oversized body) at the controller.
- **Out of scope for v1:** bot detection, CAPTCHA, signed events, IP reputation. Flag as adjacent if
  the counts ever start mattering enough to defend.

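
The "reject malformed payloads" measure amounts to a strict parse of the play-event body. The sketch below shows the checks in TypeScript for consistency with the client examples; the real validation would sit in the API controller. `parsePlayEvent` and the `MAX_BODY_CHARS` limit are hypothetical, and the size check counts characters rather than bytes for simplicity.

```typescript
const BUCKETS: readonly string[] = ["partial", "sampled", "complete"];
const MAX_BODY_CHARS = 512; // hypothetical limit; the spec only says "oversized body"

interface PlayEventBody {
  trackEntryKey: string;
  bucket: "partial" | "sampled" | "complete";
  anonId?: string;
}

// Returns the validated body, or null when the payload should be rejected.
function parsePlayEvent(rawBody: string): PlayEventBody | null {
  if (rawBody.length > MAX_BODY_CHARS) return null; // oversized body
  let data: unknown;
  try {
    data = JSON.parse(rawBody);
  } catch {
    return null; // not JSON at all
  }
  if (typeof data !== "object" || data === null) return null;
  const fields = data as Record<string, unknown>;
  const trackEntryKey = fields["trackEntryKey"];
  const bucket = fields["bucket"];
  const anonId = fields["anonId"];
  if (typeof trackEntryKey !== "string" || trackEntryKey.length === 0) return null; // missing track key
  if (typeof bucket !== "string" || BUCKETS.indexOf(bucket) === -1) return null; // unknown bucket
  if (anonId !== undefined && typeof anonId !== "string") return null;
  const body: PlayEventBody = { trackEntryKey, bucket: bucket as PlayEventBody["bucket"] };
  if (typeof anonId === "string") body.anonId = anonId;
  return body;
}
```
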
---

## 3. Privacy-light anonymity mechanism (unique listeners)

**D5 RESOLVED (Daniel 2026-06-19): Option A — client-minted random first-party `localStorage` id,
metric labelled "listeners," fingerprinting (Option C) rejected.** Rationale and the road not taken
are preserved below — the three mechanism families and their trade-offs are kept as the record of
why A was chosen over B and C. The unique-listener metric needs *some* notion of "the same anonymous
listener seen again" without identifying who they are.

### Option A — Anonymous client-minted id (random GUID in localStorage)

On first visit, the client mints a random GUID, stores it in `localStorage` (or a first-party cookie),
and sends it as `anonId` on play/share events. The server counts distinct `anonId`s per
track/release.

- **Privacy:** strong. The id is random, contains no PII, is first-party only, never leaves as
  anything but an opaque token, and the listener can clear it (clear site data) at will. It is
  effectively a "this browser, until you clear it" token — not a person.
- **Accuracy:** good-but-inflating. Same person on phone + laptop = 2 listeners. Cleared storage =
  new listener. Incognito = new listener each session. So it **over-counts** unique listeners
  (counts unique *browser-installs-since-last-clear*, which we relabel honestly as "listeners").
- **Cross-origin embed caveat:** an embedded `FramePlayer` on a third-party site cannot read the
  first-party `localStorage` of `deepdrft.com` (storage is partitioned). So embed plays would mint a
  *separate* id per embedding site. Acceptable — embed plays are a minority and over-counting is the
  known direction of error.
- **Consent:** a random first-party id with no cross-site tracking is the lightest-touch case under
  GDPR/ePrivacy — widely treated as not requiring a consent banner when used purely for first-party
  aggregate counts (this is the "privacy-light" sweet spot). **Recommend a short privacy-note line**
  rather than a cookie wall.

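
A minimal sketch of the Option A mint-or-reuse step. The storage and the id generator are injected so the logic can run outside a browser; in the page they would be `window.localStorage` and `crypto.randomUUID()`. The storage key `deepdrft.anonId` and the function name are hypothetical, and returning `undefined` when storage is unavailable (so the event simply omits `anonId`) is an assumption consistent with the optional field in §2.2.

```typescript
// The subset of the Web Storage API this sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const ANON_ID_KEY = "deepdrft.anonId"; // hypothetical key name

function getOrMintAnonId(store: KeyValueStore, mint: () => string): string | undefined {
  try {
    const existing = store.getItem(ANON_ID_KEY);
    if (existing) return existing; // same browser seen again
    const fresh = mint();
    store.setItem(ANON_ID_KEY, fresh);
    return fresh;
  } catch {
    // Storage blocked or full: send events without an anonId rather than fail.
    return undefined;
  }
}

// Browser usage (illustrative):
// const anonId = getOrMintAnonId(window.localStorage, () => crypto.randomUUID());
```

Clearing site data removes the stored value, so the next visit mints a new id; that is the over-counting behaviour described under Accuracy.
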
### Option B — Salted/rotating daily identifier (no stored id at all)

The server (or client) derives a per-day token by hashing `IP + User-Agent + a daily-rotating salt`.
No id is stored on the client. Distinct tokens per day ≈ distinct visitors per day; the salt rotation
means yesterday's tokens can't be correlated to today's (no long-term tracking).

- **Privacy:** very strong on the *no-persistent-identifier* axis (nothing stored client-side, no
  cross-day linkage). This is the pattern Plausible/Fathom-style privacy analytics use.
- **Accuracy:** coarser. IP+UA collides (everyone behind one NAT/office/carrier-CGNAT shares a token →
  under-counts) and rotates daily (can only ever produce "unique per day," never "unique all-time" —
  by design). Mobile IPs shift constantly → over-counts. Net: noisy, and **only meaningful as a daily
  figure**, which fights the "all-time plays counter" vibe.
- **Architecture cost:** the IP is only reliably available **server-side** (the client doesn't know
  its own public IP, and `X-Forwarded-For` is already handled server-side per `DeepDrftAPI` config).
  So the token must be computed at the API, not the client — meaning the unique-listener dimension is
  derived entirely server-side from request metadata, and the client sends *no* `anonId` at all. That
  is actually a privacy *plus* (the client transmits nothing identifying) but ties unique-listeners to
  a daily-rolling model.

### Option C — Coarse browser fingerprint

Derive an id from canvas/font/hardware fingerprinting signals.

- **Privacy:** worst. Fingerprinting is precisely what privacy regulation and browser vendors are
  actively killing; it tracks across sites and survives storage-clearing. **Directly contradicts
  "privacy-light."** Rejected outright — listed only for completeness.

### Resolution (D5 — Option A)

**Option A (client-minted random first-party id) is the chosen mechanism, with the metric honestly
labelled "listeners."** Reasons:

- It fits the product: a band wants an *all-time* "N listeners reached" figure, which A supports and
  B (daily-only) structurally cannot.
- It is genuinely privacy-light: opaque random token, first-party, clearable, no cross-site
  correlation, no fingerprinting — the lightest mechanism that still answers "all-time unique."
- It keeps the server simple (count distinct tokens) and the client honest (one tiny localStorage
  read).
- We label the metric **"listeners," not "people,"** and accept the known over-count. For a vanity
  texture stat this is the right honesty/effort trade.

*Road not taken:* Option B (server-derived salted daily token — stores nothing client-side) was the
fallback for a stronger "stores nothing" posture, at the cost of the metric becoming daily-unique and
noisier; rejected because the product wants an *all-time* reach figure, which B structurally cannot
give. Option C (fingerprint) rejected outright — it is exactly what privacy regulation and browser
vendors are killing, and contradicts "privacy-light." A and B are not mutually exclusive long-term (A
for all-time reach, B-style for daily actives), but v1 commits to A.

**Sequencing note (changed 2026-06-19):** unique-listeners is **no longer an indefinite stretch
tail.** Under the bottom-up re-sequencing, Daniel wants *everything* finished before the card lights
up — so the `anonId` layer is folded into the substrate build (§6, wave 16.3) as the lowest-priority
/ last of the metric layers, not an optional dangler. The play/share counters still need no `anonId`
and land first; unique-listeners stacks on top before the capstone card.

---

## 4. Storage & aggregation model

### 4.1 Event capture: log vs. counter vs. both

Three shapes:

1. **Rolled-up counters only.** One row per (track, bucket) with an integer count, incremented on each
   event. Tiny storage, trivial read, no history, no unique-listener support (can't count distinct
   without storing the distinct tokens). Pure "90s hit counter."
2. **Append-only event log only.** One row per event. Full fidelity, supports any future metric
   (unique listeners, time-of-day, channel splits, retention), but every read is an aggregation query
   over a growing table and write volume = play volume.
3. **Both — event log + periodic/online rollup.** Events append to a log; counters are maintained
   (either incrementally on write, or by a periodic aggregation pass) for the hot read path.

**Recommendation: a lean version of (3) — an append-only event log as the source of truth, plus a
small rolled-up counter table for the home-card hot read.** Reasoning:

- The home Plays card is read on every public-site landing; it must be a single cheap indexed read,
  not a `COUNT(*)` over an event table. That argues for a counter.
- But unique-listeners *requires* retaining distinct tokens, and "which mixes get finished" texture
  wants the bucket/per-target fidelity — both argue for a log.
- (3) gets both without forcing a premature choice, and matches the existing `HomeStatsDto` pattern
  (the card reads a pre-aggregated DTO; it never queries raw).

**D6 RESOLVED (Daniel 2026-06-19, by default): incremental-on-write rollup.** The event-write
transaction also bumps the counter row — no background job to stand up. *Road not taken:* periodic
batch aggregation, the escape hatch if write volume ever makes the incremental bump contended; for a
collective-scale site incremental is fine and simplest. *Low-risk to revisit during its wave (§6) if
implementation surfaces a reason* — switching to a periodic pass later doesn't change the schema, only
how the counter is fed.

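
The "log plus incremental rollup" shape can be illustrated with an in-memory model. This is a conceptual sketch only: the real store is SQL in `DeepDrftData`, where the append and the counter bump would share one transaction. `PlayStoreSketch` and its member names are hypothetical.

```typescript
type Bucket = "partial" | "sampled" | "complete";

interface PlayEventRow {
  trackEntryKey: string;
  bucket: Bucket;
  anonId?: string;
}

interface PlayCounterRow {
  partial: number;
  sampled: number;
  complete: number;
}

class PlayStoreSketch {
  readonly log: PlayEventRow[] = []; // append-only source of truth
  readonly counters = new Map<string, PlayCounterRow>(); // hot-read rollup

  // One logical write: append to the log and bump the counter together.
  recordPlay(row: PlayEventRow): void {
    this.log.push(row);
    let counter = this.counters.get(row.trackEntryKey);
    if (counter === undefined) {
      counter = { partial: 0, sampled: 0, complete: 0 };
      this.counters.set(row.trackEntryKey, counter);
    }
    counter[row.bucket] += 1;
  }

  // Headline figure: read from the rollup, never by scanning the log.
  totalPlays(): number {
    let total = 0;
    this.counters.forEach((c) => { total += c.partial + c.sampled + c.complete; });
    return total;
  }

  // Distinct-token count: needs the log, because counters keep no tokens.
  uniqueListeners(trackEntryKey: string): number {
    const seen = new Set<string>();
    this.log.forEach((row) => {
      if (row.trackEntryKey === trackEntryKey && row.anonId !== undefined) seen.add(row.anonId);
    });
    return seen.size;
  }
}
```

The two read methods show why both structures are kept: the total comes from the counters alone, while the listener count cannot be answered without the retained tokens in the log.
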
### 4.2 Conceptual SQL shape (not full schema — that's staff-engineer's)

In `DeepDrftData`, new tables roughly:

- **`play_event`** (the log): `id`, `track_entry_key` (or `track_id` FK), `release_id` (resolved at
  write, §2.3), `bucket` (enum: partial/sampled/complete), `anon_id` (nullable — present only when
  unique-listeners is on and the listener didn't opt out), `created_at`. Indexed on
  `(track_id)`, `(release_id)`, and `(anon_id)` for the distinct-count query.
- **`share_event`** (the log): `id`, `target_type` (track/release), `target_id`, `channel`
  (link/embed), `anon_id?`, `created_at`.
- **`play_counter`** (the rollup, optional in v1 if reads stay cheap): per (track_id) and per
  (release_id), columns for `partial_count`, `sampled_count`, `complete_count`, `share_count`, and a
  `total_plays` derived/stored. This is what the home-stats aggregation reads.

`anon_id` is the only field that touches anonymity, it is nullable, and it is never joined to any
identity table because none exists. Dropping the unique-listener feature later = stop writing the
column; the play/share counts are unaffected.

**Note on the dual-database split:** this is **all SQL** (it's metadata/counters, not binary content).
The FileDatabase vault is not involved. Aggregation lives in `TrackRepository` alongside
`GetHomeStatsAsync`; the API surface lives in `DeepDrftAPI`.

### 4.3 New API surface (sketch)

Writes (unauthenticated, rate-limited, proxied through `DeepDrftPublic`):

- `POST api/event/play` — body `{ trackEntryKey, bucket, anonId? }`; returns `202`. Resolves release
  server-side, appends to `play_event`, bumps `play_counter`.
- `POST api/event/share` — body `{ targetType, targetKey, channel, anonId? }`; returns `202`.

Reads (unauthenticated, mirror `GET api/stats/home`):

- Extend **`GET api/stats/home`** to include the site-wide play total (and optionally share total)
  for the home card — *the minimal payoff* (§5). The card already does one round-trip to this
  endpoint; adding fields is the smallest possible change.
- `GET api/stats/track/{entryKey}` and `GET api/stats/release/{entryKey}` — per-target play/share/
  listener figures, for the (future) detail-page display. **Not required for the home-card payoff**;
  build when a detail-page surface wants them.

CMS-facing reads (the channel splits, bucket breakdowns, per-track leaderboards) are a separate,
later, `DeepDrftManager`-side concern — explicitly out of v1 scope, flagged adjacent.

---

## 5. The Plays card payoff
|
||
|
||
Today `NowPlayingStats.razor`'s third card renders `XXX / Plays (Coming Soon)` — a static odometer
|
||
placeholder. Once this phase lands, the card shows a **real site-wide play total** in the same
|
||
odometer treatment (the "90s visitor counter" vibe is *already the intended aesthetic* — the
|
||
placeholder is literally styled as an odometer). The minimal, correct payoff:
|
||
|
||
- **Primary figure:** total plays site-wide (sum across all tracks), rendered in the odometer.
|
||
- **Secondary line: D7 RESOLVED (Daniel 2026-06-19, by default).** The card gets a secondary line
|
||
(the other two cards both have primary + secondary). **Because the bottom-up re-sequencing now lands
|
||
unique listeners *before* the card (§6), the secondary line can be unique listeners ("N listeners")
|
||
from day one** — the metric the card was always reaching for. Completion-rate ("N% finished") and
|
||
total-shares ("N shared") remain available alternatives if listeners reads oddly in the odometer
|
||
treatment; final pick is a small render-time call during the capstone wave, low-risk to revisit.
|
||
*(Under the prior early-card sequencing this had to be completion-rate or shares because listeners
|
||
shipped later; the re-sequencing removes that constraint.)*
|
||
|
Mechanically: add `TotalPlays` (and the chosen secondary) to `HomeStatsDto`, populate it in
`TrackRepository.GetHomeStatsAsync` from `play_counter`, and the card reads it through the existing
`IStatsDataService.GetHomeStats()` path — **the same persistent-state-bridged single round-trip the
other two cards already use.** No new client data path; the card just stops being static.

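
The aggregation behind the two new home-card figures is small enough to show in full. The sketch below is an assumption-laden illustration in TypeScript: `HomePlayFields` is a hypothetical stand-in for the fields added to `HomeStatsDto`, `BucketCounts` stands in for a `play_counter` row, and the distinct count mirrors the "nullable `anonId`, count distinct server-side" rule from 16.3. The production version would be a C# repository query.

```typescript
// Hypothetical stand-in for one play_counter row.
interface BucketCounts {
  partial: number;
  sampled: number;
  complete: number;
}

// Hypothetical stand-in for the fields added to HomeStatsDto.
interface HomePlayFields {
  totalPlays: number;
  uniqueListeners: number;
}

function computeHomePlayFields(
  counters: readonly BucketCounts[],
  eventAnonIds: readonly (string | null)[],
): HomePlayFields {
  // Primary figure: headline plays summed across every track and every bucket.
  let totalPlays = 0;
  for (const counter of counters) {
    totalPlays += counter.partial + counter.sampled + counter.complete;
  }
  // Secondary figure: distinct non-null anonIds, all-time (D3).
  // Events written without an anonId still count as plays but not as listeners.
  const distinctIds = new Set<string>();
  for (const id of eventAnonIds) {
    if (id !== null) distinctIds.add(id);
  }
  return { totalPlays, uniqueListeners: distinctIds.size };
}
```

One consequence worth noting: events logged during 16.1 and 16.2, before `anonId` is written, contribute to `totalPlays` but never to `uniqueListeners`, so the listener figure undercounts early traffic.
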
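This is deliberately the *smallest* visible payoff: the card goes live by adding fields to a
round-trip it already makes. Under the bottom-up sequencing (§6) it is also the *last* thing built,
after the unique-listener layer (16.3) and, if wanted, the per-target surfaces (16.4), so the whole
counter substrate is finished before the home number goes live.

---
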
## 6. Phasing — bottom-up (Daniel directive, 2026-06-19)

**Re-sequenced bottom-up: foundation first, metrics stacked on the substrate, the user-visible
Plays-card flip is the capstone built LAST.** This reverses the earlier "visible win comes early"
framing. Daniel: *"do the phasing from the bottom up, that seems more stable. I won't care about the
live card until everything is finished."* So the entire telemetry substrate and **all** metrics
(including unique listeners, which is no longer an indefinite tail) land before the card lights up.

Waves run in strict sequence — each builds on the layer beneath it.

- **16.1 — Foundation: capture seam + transport + event log (no card consumption).** The substrate,
  end to end, with nothing reading it yet:
  - Player-service **play-session tracker** (§2.1): opens on playback-start, advances the high-water
    mark on the existing progress callback, closes on track-switch / stop / organic-end / page-unload.
    Applies the §1d/D2 **engagement floor** (≥3s or ≥5%, whichever is smaller).
  - Share tracker in `SharePopover` (§1b) with the per-(target,channel) debounce.
  - **`sendBeacon` interop** + `pagehide`/`visibilitychange` unload handler (§2.2).
  - **`POST api/event/{play,share}`** endpoints, **proxied through `DeepDrftPublic`** (§2.2), with
    IP rate-limiting + payload validation (§2.5).
  - **Append-only `play_event` / `share_event` SQL log** + **incremental `play_counter` rollup** (§4,
    D6) in `DeepDrftData`.
  - **Server-side release resolution** + derived release totals (§2.3, D4) — client sends only the
    track key.
  - At this point events flow and counters accumulate, but **no `anonId` is written** and **nothing
    reads the counters**. This is the stable base everything stacks on. **Cold-start wave — nothing
    gates it.**
- **16.2 — Completion-bucket classification + shares.** The metric texture on top of the raw capture:
  the three-bucket classification (§1a/D1 — `partial`/`sampled`/`complete`, headline = sum) wired
  through the tracker → event payload → log → counter columns, and the share-channel split
  (`link`/`embed`) landing in `share_event`. (Much of the bucket plumbing is natural to build
  alongside 16.1; 16.2 is the boundary where the classification is *correct and exhaustive* end to
  end, including counter columns per bucket.) **Depends on 16.1.**
- **16.3 — Unique-listener `anonId` layer (D5, lowest-priority metric).** The anonymity mechanism
  (§3, Option A): mint/read the client first-party `localStorage` id, thread `anonId` onto the event
  payloads, store it nullable on the event log, count distinct server-side (all-time, D3). **The last
  of the metric layers** — folded into "everything finished" per the directive, but explicitly the
  lowest-priority and last-built of the substrate. **Depends on 16.1 (event payload + storage); builds
  on 16.2.**
- **16.4 — Per-target / CMS stats surfaces** `[speculative]`. `GET api/stats/{track,release}/{entryKey}`
  per-target reads; CMS analytics views (bucket splits, channel splits, leaderboards). **Speculative,
  not committed scope** — the event log already supports it. Position: **before the capstone** if a
  per-target surface is wanted (e.g. to validate the metrics visually in the CMS before the public
  card goes live), otherwise skippable. Does not gate the card. **Depends on 16.1–16.3** for the data
  it would read.
- **16.5 — Home Plays-card payoff (CAPSTONE, built LAST).** Extend `HomeStatsDto` +
  `GetHomeStatsAsync` with `TotalPlays` (+ the secondary line — unique listeners now available, D7);
  flip `NowPlayingStats`'s third card from the `XXX / Plays (Coming Soon)` placeholder to live,
  through the existing persistent-state-bridged `GET api/stats/home` round-trip (§5). **The final
  wave — built only once the full substrate and all metrics (16.1–16.3) are in place.** Per Daniel,
  the live card is explicitly the last thing built; there is no early-payoff intermediate.

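
The play-session tracker from 16.1, with the 16.2 classification applied on close, can be sketched as below. This is a language-neutral illustration written in TypeScript, not the C# player-service code. `PlaySession`, `PlayEvent` and the constant names are hypothetical. Two details are assumptions rather than spec: the floor is measured against the high-water mark, and the band edges are inclusive at the lower bound (a fraction of exactly 0.30 is `sampled`, exactly 0.80 is `complete`).

```typescript
type PlayBucket = "partial" | "sampled" | "complete";

interface PlayEvent {
  trackEntryKey: string;
  bucket: PlayBucket;
}

// D2 engagement floor: a listen counts at >= 3s or >= 5% of duration, whichever is smaller.
const FLOOR_SECONDS = 3;
const FLOOR_FRACTION = 0.05;
// D1 band edges (30% and 80%). Inclusive lower bounds are an assumption of this sketch.
const SAMPLED_FROM = 0.3;
const COMPLETE_FROM = 0.8;

class PlaySession {
  private highWaterSeconds = 0;
  private closed = false;
  private readonly trackEntryKey: string;
  private readonly durationSeconds: number;
  private readonly emit: (playEvent: PlayEvent) => void;

  constructor(trackEntryKey: string, durationSeconds: number, emit: (playEvent: PlayEvent) => void) {
    this.trackEntryKey = trackEntryKey;
    this.durationSeconds = durationSeconds;
    this.emit = emit;
  }

  // Called from the existing progress callback; only ever moves the mark forward.
  onProgress(positionSeconds: number): void {
    if (this.closed) return;
    const clamped = Math.min(Math.max(positionSeconds, 0), this.durationSeconds);
    this.highWaterSeconds = Math.max(this.highWaterSeconds, clamped);
  }

  // Called on track-switch, stop, organic end, or page unload. Idempotent:
  // overlapping close triggers (e.g. stop followed by unload) emit at most one event.
  close(): void {
    if (this.closed) return;
    this.closed = true;
    if (this.durationSeconds <= 0) return;
    const floorSeconds = Math.min(FLOOR_SECONDS, FLOOR_FRACTION * this.durationSeconds);
    if (this.highWaterSeconds < floorSeconds) return; // dropped preview/skip
    const fraction = this.highWaterSeconds / this.durationSeconds;
    const bucket: PlayBucket =
      fraction >= COMPLETE_FROM ? "complete" : fraction >= SAMPLED_FROM ? "sampled" : "partial";
    this.emit({ trackEntryKey: this.trackEntryKey, bucket });
  }
}
```

A high-water mark records the furthest position reached, not seconds actually listened, so a seek to the end of a track would classify as `complete`. Whether that matters is a product call the sketch does not settle.
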
**Hard dependencies (strict bottom-up chain):**
`16.1 → 16.2 → 16.3 → (16.4 optional) → 16.5`.
16.1 is the only cold-start wave. 16.5 (the card) sits at the top and depends on the whole stack
beneath it being finished. 16.4 is speculative and off the critical path to the card.

**Verify-before-build (gates 16.1):** confirm the `DeepDrftPublic` proxy can host a `POST api/event/*`
write route with the same proxy idiom as `api/track/*` (the WASM client cannot reach `DeepDrftAPI`
directly). This is the one infrastructural assumption the spec rests on; cheap to confirm, blocking if
wrong.

---

## 10. Product decisions — RESOLVED (Daniel 2026-06-19)

All seven decisions are settled. D1, D2, D4, D5 were resolved on Daniel's explicit pick of the
recommendation; D3, D6, D7 are **resolved-by-default** (recommendation adopted) and remain low-risk
to revisit during their wave if implementation surfaces a reason. The "road not taken" for each is
preserved in the body section so future implementers know what was rejected and why.

- **D1 — Middle-band (30–80%) treatment. → RESOLVED: three-bucket `sampled`.** Exhaustive
  partial/sampled/complete; headline plays = sum of all three. *Rejected:* binary partial/complete
  fold. (§1a)
- **D2 — Engagement floor before a play counts. → RESOLVED: floor on.** A listen counts only at
  **≥3s OR ≥5% of duration, whichever is smaller**; below that it's a dropped preview/skip. *Rejected:*
  floor = 0. Single tunable constant. (§1d)
- **D3 — Unique-listener window. → RESOLVED (by default): all-time.** Fits "N listeners reached" and
  the Option-A mechanism. *Rejected:* rolling-30-day. Low-risk to revisit in wave 16.3. (§1c)
- **D4 — Release plays: derived or separately counted. → RESOLVED: derived** (release plays = sum of
  its tracks' plays), server-side. Correct for multi-track Cuts and single-track Session/Mix alike.
  *Rejected:* separate "release was played" event. (§2.3)
- **D5 — Unique-listener anonymity mechanism. → RESOLVED: Option A** — client-minted random
  first-party `localStorage` id, metric honestly labelled **"listeners,"** all-time. *Rejected:*
  Option B (server-derived salted daily token — stores nothing but only daily-unique); Option C
  (fingerprint — contradicts privacy-light) rejected outright. Headline decision. (§3)
- **D6 — Rollup strategy. → RESOLVED (by default): incremental-on-write** counter, no background job.
  *Rejected:* periodic batch aggregation (kept as the escape hatch if write volume contends). Low-risk
  to revisit in wave 16.1. (§4.1)
- **D7 — Home Plays-card secondary line. → RESOLVED (by default): unique listeners ("N listeners").**
  The bottom-up re-sequence lands listeners (16.3) before the card (16.5), so the card can show the
  metric it was always reaching for. *Alternatives kept available:* completion-rate or total-shares,
  a small render-time call during 16.5. Low-risk to revisit in the capstone wave. (§5)

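
D5's Option A (a client-minted random id kept in first-party `localStorage`) comes down to a read-or-mint helper. The sketch below is one way it might look in the browser-side interop layer. `getOrMintAnonId`, `KeyValueStore` and the storage key `deepdrft.anonId` are hypothetical names, and the fallback of sending no `anonId` when storage is unavailable is an assumption that relies on the payload field being optional.

```typescript
// Minimal subset of the Web Storage interface, so the helper can be exercised outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const ANON_ID_KEY = "deepdrft.anonId"; // hypothetical key name

// Returns the stored id, minting and persisting one on first use.
// Returns undefined when storage throws (private mode, blocked storage, quota),
// in which case the caller simply omits anonId from the event payload.
function getOrMintAnonId(store: KeyValueStore, mint: () => string): string | undefined {
  try {
    const existing = store.getItem(ANON_ID_KEY);
    if (existing !== null && existing.length > 0) return existing;
    const fresh = mint();
    store.setItem(ANON_ID_KEY, fresh);
    return fresh;
  } catch {
    return undefined;
  }
}
```

In a browser this would be called as `getOrMintAnonId(window.localStorage, () => crypto.randomUUID())`. The id is random and carries no device or user information, which is why the metric is labelled "listeners" rather than "people": clearing storage or switching browsers mints a new one.
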
## Working with this spec

- Mirrors the established product-notes convention (cross-refs up top, numbered sections, decisions
  called out, phasing with explicit dependencies) — see `phase-15-visualizer-controls-enhancements.md`
  / `phase-11-public-site-enhancements.md`.
- Decisions D1–D7 are **resolved** (Daniel 2026-06-19) and folded inline; status is design-complete.
  Waves are re-sequenced bottom-up (16.1 → 16.5, card last) per Daniel's directive of the same date.
- When waves land, doc-keeper moves them to `COMPLETED.md`; the per-wave bodies are written to travel
  cleanly.