docs(plan): add Phase 16 spec — anonymous play & share tracking

Design spec for the telemetry layer behind the home-hero Plays card:
completion-bucketed plays, shares, optional anonymous unique listeners
under a no-PII constraint. Seven open decisions flagged for Daniel.
daniel-c-harvey
2026-06-18 14:28:02 -04:00
parent 47919a226e
commit abc832467d
2 changed files with 534 additions and 0 deletions
+21
@@ -239,6 +239,27 @@ Sequenced as **eight waves**; the critical path is `11.A → 11.B → 11.C → 1
---
## Phase 16 — Anonymous Play & Share Tracking
The phase deferred behind the home-hero **Plays** stat card (`NowPlayingStats.razor`'s third card, today a static `XXX / Plays (Coming Soon)` odometer placeholder). Adds a **privacy-light, anonymous** telemetry layer to the public site: counting **plays** (bucketed by completion) and **shares**, tied to individual **tracks and releases**, plus an optional **unique-listener** "plus" metric. Hard constraint: **no accounts, no PII, anonymous identification only** — the unique-listener metric in particular is solved within that constraint, not around it. Full design, metric definitions, instrumentation seam, anonymity-mechanism options, storage model, the card payoff, and wave decomposition: `product-notes/phase-16-play-share-tracking.md`.
**Architectural spine.** Plays are instrumented at **one seam** — the `StreamingAudioPlayerService` playback lifecycle (not the UI, not the HTTP/media-client layer, which fires multiple times per play via seek-beyond-buffer). A small play-session tracker opens on playback-start, advances a high-water position on the existing progress callback, and closes (classifying the §1 completion bucket) on track-switch / stop / organic-end / page-unload. Shares instrument at the **real** share surface (`SharePopover`'s Copy-link / Copy-embed actions — clipboard writes that record nothing today). Events ship **fire-and-forget via `sendBeacon`** to new `POST api/event/{play,share}` endpoints (proxied through `DeepDrftPublic`, same hop as `api/track/*`), land in an **append-only SQL event log + incremental counter rollup** in `DeepDrftData`, and the home card reads the total through the **existing** `GET api/stats/home` / `HomeStatsDto` / `IStatsDataService` path (the same single persistent-state-bridged round-trip the other two cards use). Release attribution is **resolved server-side** from the track→release join (client sends only the track key); release plays are **derived** (sum of their tracks' plays). All SQL — the FileDatabase vault is not involved.
**Completion buckets (agreed thresholds).** `partial` < 30%, `complete` > 80%; the 30–80% middle band is proposed as its own `sampled` bucket so the three are exhaustive and non-overlapping (headline plays = sum of all three). See spec §1a / decision **D1**.
**Sequenced as four waves.** `16.A → 16.B`, `16.A → 16.C`, `16.A → 16.D`. 16.A is the only cold-start wave.
- **16.A — Play & share counters (core).** Player-service play tracker (three-bucket classification + engagement floor), share tracker in `SharePopover`, `sendBeacon` interop + `POST api/event/{play,share}` (rate-limited), `play_event`/`share_event` log + incremental `play_counter` rollup, server-side release resolution. **No `anonId` yet.** Free-floating cold-start wave.
- **16.B — Home Plays-card payoff.** Extend `HomeStatsDto` + `GetHomeStatsAsync` with `TotalPlays` (+ a secondary line, D7); flip the third card from placeholder to live. **Depends on 16.A.** The visible win — sequence immediately after 16.A.
- **16.C — Unique listeners (stretch / "plus").** The anonymity mechanism (recommend a client-minted random first-party `localStorage` id; alt: server-derived salted daily token; fingerprinting rejected — see §3 / **D5**): thread an `anonId` onto event payloads, count distinct server-side, expose per-target and/or as the card's secondary line. **Depends on 16.A. Explicitly lower priority** — defers indefinitely without stranding 16.A/16.B.
- **16.D — Per-target stats surfaces.** `[speculative]` — detail-page play/share/listener display + CMS analytics views (bucket/channel splits, leaderboards). Not in the agreed scope; the event log already supports it. Build when a surface wants it.
**Open product decisions (D1–D7, spec §10): unresolved, awaiting Daniel.** Headline is **D5** (unique-listener anonymity mechanism). Others: D1 (middle-band bucket), D2 (engagement floor before a play counts), D3 (unique-listener window), D4 (release plays derived vs. counted), D6 (rollup strategy), D7 (card secondary line). Resolve before 16.A is decomposed; recommendations are carried in the spec.
**Adjacency to deferred Identity / accounts (the un-phased backlog item above).** This phase is the deliberate **anonymous** answer to "how many plays" — it does **not** need the accounts/identity work and must not be entangled with it. If identity ever lands, per-user listening history is an additive layer above this anonymous substrate, not a replacement.
---
## Working with this file
- **Add items by extending an existing phase first**; only create a new phase when the addition genuinely doesn't fit any of 15. Phase numbers are organisational, not sequencing.
@@ -0,0 +1,513 @@
# Phase 16 — Anonymous Play & Share Tracking (Design Spec)
Status: **design-draft, open for Daniel review** (decision points in §10 are unresolved). Author:
product-designer. Date: 2026-06-18. **This doc is design only; no code accompanies it.** This is the phase
deferred behind the home-hero "Plays" stat card, which today renders a static
`XXX / Plays (Coming Soon)` odometer placeholder in `NowPlayingStats.razor`.
This spec adds a **privacy-light, anonymous play & share telemetry layer** to the public site:
counting plays (bucketed by completion) and shares, tied to individual tracks and releases, with an
optional unique-listener "plus" metric. It does **not** add accounts, PII, or any per-user identity
model — that is a hard constraint, not a deferral.
## Phase numbering
This is **Phase 16**. Phase 15 (Visualizer Controls Enhancements) is the highest-numbered phase in
`PLAN.md`. Phases 11 and 10-Reframe are landed; no phase 16 exists yet. If a concurrent worktree has
claimed 16 by the time this is scoped, bump to the next free number — the content is
number-independent.
## Cross-references (read these before implementing)
- `DeepDrftPublic.Client/Services/StreamingAudioPlayerService.cs` — the production player. The
instrumentation seam lives here: `LoadTrackStreaming` (track-load = play-start candidate),
the progress callback path, and `ResetToIdle` (stop/unload/switch). `_currentTrackId` holds the
current `EntryKey`. **No release id is currently held by the player** — see §2.3.
- `DeepDrftPublic.Client/Services/AudioPlayerService.cs` — base class. `OnProgressCallback(double
currentTime)` is the per-tick position seam; `OnPlaybackEndCallback` is the organic end-of-stream
seam (and the only place `TrackEnded` fires). `Duration` is set from the WAV header during load.
- `DeepDrftPublic.Client/Services/AudioInteropService.cs` — `SetOnProgressCallbackAsync` /
`SetOnEndCallbackAsync` are the JS→.NET callbacks already wired in `InitializeAsync`. Progress is
throttled to ~10/sec on the JS side already.
- `DeepDrftPublic.Client/Services/QueueService.cs` — auto-advance orchestrator. Album playthroughs
flow `PlayRelease → PlayCurrent → SelectTrackStreaming` per track; `OnTrackEnded` advances. Every
track in an album play is an independent `SelectTrackStreaming` call, so per-track play events
arise naturally without queue-specific instrumentation.
- `DeepDrftPublic.Client/Controls/SharePopover.razor[.cs]` — the **real** share surface. Two share
actions exist today: **Copy link** (track mode + release mode) and **Copy embed** (track mode
only, an `<iframe>` snippet to `/FramePlayer?TrackEntryKey=…`). Both are clipboard writes — there
is no network call, so **no share is recorded today**. This is the share-event origin (§2.4).
- `DeepDrftPublic.Client/Pages/FramePlayer.razor` — the embeddable single-track player the embed
snippet points at. Plays inside a third-party `<iframe>` should count (§1, edge cases).
- `DeepDrftAPI/Controllers/StatsController.cs` + `DeepDrftModels/DTOs/HomeStatsDto.cs` — the landed
home-stats pattern the Plays card will consume. `GET api/stats/home` returns a bare DTO; aggregation
lives in `TrackRepository.GetHomeStatsAsync` / `ITrackService`; the controller is a thin boundary.
**This is the template the play/share read path mirrors** (§4, §5).
- `DeepDrftPublic.Client/Controls/NowPlayingStats.razor` — the three-card hero stat row. The Plays
card is the third card; it currently shows the static placeholder. This phase fills it.
- `DeepDrftData` (`DeepDrftContext`, `TrackRepository`, `TrackManager`, `Migrations`) — where new
SQL tables/migrations and aggregation queries live. `DeepDrftAPI` owns the HTTP surface.
- `DeepDrftPublic` proxy controller (`api/track/*`) — the public site's browser→API proxy hop. A new
`api/event/*` or `api/stats/*` write path needs a matching proxy route (the WASM client cannot
reach `DeepDrftAPI` directly; SSR can).
---
## 1. The metrics, defined precisely
Three metrics, two core + one stretch:
### 1a. Play (core)
A **play** is a single listener's session of listening to one track. It is recorded once per
track-listen, classified by how far the listener got:
- **partial** — playback reached **< 30%** of the track's duration before the session ended
(switched track, stopped, navigated away, closed tab).
- **complete** — playback reached **> 80%** of duration.
- **middle band (30%–80%)**: see decision **D1** below. **Recommendation: count the middle band as
its own bucket, `sampled`** (a real listen that wasn't a skip and wasn't a finish), so the three
buckets are exhaustive and non-overlapping: `partial` [0, 30%), `sampled` [30%, 80%], `complete`
(80%, 100%]. The headline "Plays" number is the **sum of all three** (every started listen counts
as a play); the buckets are the texture beneath it.
Alternative considered: fold the middle into "complete" (threshold becomes "≥30% = a real play,
else partial"). Simpler, two buckets — but it throws away the distinction between "listened to half"
and "listened to the end," which is the most editorially interesting signal for a music collective
("which mixes do people actually finish?"). Rejected in favour of three buckets, but it's a genuine
Daniel call (**D1**).
**What starts a play candidate:** a track's audio actually begins streaming for playback — i.e.
`SelectTrackStreaming` reaches the point where `StartStreamingPlayback` succeeds and `IsPlaying`
becomes true (the `_streamingPlaybackStarted` transition in `StreamAudioWithEarlyPlayback`). A track
that is *staged* (`StageTrack`) but never played does **not** count. A track that fails to load does
not count.
**What classifies the bucket:** the **maximum playback position reached** as a fraction of duration,
captured when the listen ends (track switch, stop, organic end, or page unload). Not the position at
the moment of ending — the *high-water mark* — so that seeking backward near the end doesn't demote a
complete play to partial. (See seeks under edge cases.)
**Why a high-water mark, not elapsed-listen-time:** elapsed time would require accumulating play
duration across pauses and is more state to carry. The high-water position is already trivially
derivable from the progress callback (`max(currentTime)/duration`). For v1 the simpler model is the
right call; if "engaged listening time" becomes interesting later it's an additive metric, not a
reshape.
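A minimal sketch of the high-water classification described above, assuming the recommended three-bucket split (D1 is still open). The function and constant names are hypothetical and do not exist in the codebase; positions and durations are taken to be in seconds.

```typescript
// Sketch only: names are hypothetical; thresholds follow the recommendation
// in this section (partial < 30%, sampled 30%..80%, complete > 80%).
type Bucket = "partial" | "sampled" | "complete";

const PARTIAL_BELOW = 0.3; // fraction strictly below this is "partial"
const COMPLETE_ABOVE = 0.8; // fraction strictly above this is "complete"

/** Running high-water mark: seeking backward never lowers it. */
function advanceHighWater(highWaterSec: number, currentTimeSec: number): number {
  return Math.max(highWaterSec, currentTimeSec);
}

/** Classify a finished listen by the furthest position reached. */
function classifyBucket(highWaterSec: number, durationSec: number): Bucket | null {
  // Duration not known (or invalid): the listen cannot be classified.
  if (!(durationSec > 0)) return null;
  const fraction = Math.min(highWaterSec / durationSec, 1);
  if (fraction < PARTIAL_BELOW) return "partial";
  if (fraction > COMPLETE_ABOVE) return "complete";
  return "sampled"; // the closed middle band [30%, 80%]
}
```

With progress ticks of 10s, 95s, then 20s on a 100s track (a seek back after hearing the outro), the high-water mark stays at 95s and the listen classifies as `complete`.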
### 1b. Share (core)
A **share** is recorded when a listener performs a share *action* — not when they open the share
popover. Two actions exist today (`SharePopover`):
- **Copy link** (track or release) → records a share against that track or release, with a
`channel = link` tag.
- **Copy embed** (track only) → records a share with `channel = embed`.
A future "native share" (Web Share API) or per-platform button would add channels without reshaping
the metric. The share count on the Plays-card payoff is **total shares**; the channel split is
texture (and probably CMS-only, not public-facing, for v1).
**De-dupe:** copying the same link three times in a row is one intent, not three shares. Recommend a
client-side **debounce** — at most one share event per (target, channel) per short window (e.g. 60s)
per session. Cheap, prevents the obvious gaming, and matches how "copied!" already feels like one act.
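The debounce could look roughly like the sketch below. The class name, the key format, and the choice to measure the window from the last *recorded* share (rather than the last click) are all assumptions for illustration; the 60s window is the example figure from the paragraph above.

```typescript
// Sketch only: hypothetical names. One instance per browser session.
type TargetType = "track" | "release";
type Channel = "link" | "embed";

class ShareDebouncer {
  private readonly lastRecorded = new Map<string, number>();
  private readonly windowMs: number;

  constructor(windowMs: number = 60000) {
    this.windowMs = windowMs;
  }

  /** Returns true if a share event should be emitted for this action. */
  shouldRecord(targetType: TargetType, targetKey: string, channel: Channel, nowMs: number): boolean {
    const key = targetType + ":" + targetKey + ":" + channel;
    const last = this.lastRecorded.get(key);
    // Inside the window: same intent as the earlier copy, so drop it.
    if (last !== undefined && nowMs - last < this.windowMs) return false;
    this.lastRecorded.set(key, nowMs);
    return true;
  }
}
```

Copying the same link at t = 0s, 1s, and 59s records one share; a copy at 60s records a second. A different channel or target is tracked independently.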
### 1c. Unique listeners (stretch / "plus" — lower priority)
A **unique listener** is an approximate distinct-listener count over a window (all-time, or rolling
30 days — **D3**), tied to a track or release. This is the metric most in tension with the no-PII
constraint, and it is explicitly the **last** thing to build (§6). It is approximate by design — we
are not building identity, we are estimating reach. Mechanism options and recommendation: **§3**.
### 1d. Edge cases (apply to plays unless noted)
- **Replays.** Playing the same track twice in one session = two plays. The play event fires per
`SelectTrackStreaming` that reaches playback. If a listener loops a track ten times, that's ten
plays. Acceptable for v1 — looping is genuine listening. (Unique listeners absorbs the "but it's
the same person" concern at the reach level.)
- **Seeks (within a play).** Seeking does not start a new play. The high-water position keeps
climbing; seeking backward never lowers it. So "seek to the end to check the outro, then seek
back" still classifies as `complete` — correct, they heard the end.
- **Seek-beyond-buffer re-request.** The `GET api/track/{id}` with a `Range` header during
`SeekBeyondBuffer` is a *byte* re-fetch of the **same** play — it must **not** start a new play
event. Because the play event is keyed off the player-service `SelectTrackStreaming` lifecycle (not
the HTTP fetch), this is free: `SeekBeyondBuffer` reuses `_currentTrackId` and never calls
`SelectTrackStreaming`. **Instrument at the player-service level, never at the HTTP/media-client
level** — the media client fires multiple times per play.
- **Very short tracks.** A 4-second sting: 30%/80% still apply proportionally. No special-casing for
v1. (If "complete" on a 4s clip feels too cheap, a minimum-absolute-seconds floor is an additive
tweak — flag as a tuning knob, not a v1 requirement.)
- **Rapid skips.** Listener clicks through ten tracks in five seconds. Each reaches `< 30%` →
ten `partial` plays. **Recommendation (D2): apply a minimum-engagement floor before a play counts
at all**, e.g. the high-water position must reach **the smaller of 3 seconds and 5% of duration** for
the listen to register as a play. Below the floor it's a *preview/skip*, not a play, and is dropped
entirely. This keeps the headline number honest (a skim through the archive isn't 40 plays) while
still capturing genuine short partial listens. The floor is a single tunable constant. **D2 is a
Daniel call** — the alternative is "every started playback counts, floor = 0," which is simpler and
defensible ("they hit play, it's a play") but inflates the number on a browsing session.
- **Tab close / navigation mid-play.** The play must still be recorded with its high-water bucket.
This is the hardest delivery case and drives the beacon recommendation in §2.2.
- **Embedded (`FramePlayer`) plays.** A play inside a third-party iframe is a real play and should
count. The embed runs the same player stack, so it instruments for free — but it may run
cross-origin, which bears on the unique-listener cookie/storage mechanism (§3) and the share
attribution (an embed play could optionally carry the embedding page as a referrer dimension —
out of scope for v1, flag as adjacent).
- **CMS / admin playback.** Plays generated by an admin auditing tracks in the CMS should ideally not
pollute the public counts. The CMS is a separate app (`DeepDrftManager`) and does not run this
player stack, so this is mostly free — but a logged-in admin browsing the *public* site would
count. Acceptable for v1 (low volume); flag as a known caveat.
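The engagement floor from the rapid-skips bullet above can be sketched as a single tunable check. This assumes the D2 recommendation is adopted with the example values (3 seconds, 5%); the names are hypothetical.

```typescript
// Sketch only: D2 is unresolved, so both constants are placeholders.
const FLOOR_SECONDS = 3;
const FLOOR_FRACTION = 0.05;

/** The floor is whichever is smaller: 3 seconds or 5% of the track. */
function engagementFloorSec(durationSec: number): number {
  return Math.min(FLOOR_SECONDS, durationSec * FLOOR_FRACTION);
}

/** True if the listen reached the floor and should register as a play. */
function countsAsPlay(highWaterSec: number, durationSec: number): boolean {
  if (!(durationSec > 0)) return false; // unknown duration: nothing to count
  return highWaterSec >= engagementFloorSec(durationSec);
}
```

On a 300s mix the floor is 3s, so a 2.9s skim is dropped. On a 4s sting the floor is 0.2s, so short tracks are not locked out by the absolute-seconds term.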
---
## 2. Instrumentation
### 2.1 Where events originate
**Play events: in `StreamingAudioPlayerService`, not the UI and not the HTTP layer.** The player
service is the single chokepoint every playback path flows through (home StreamNow, gallery, queue
auto-advance, detail-page play, embed). Instrumenting once there covers all of them and dodges the
seek-beyond-buffer double-count trap.
Concretely, the player service grows a small internal **play-session tracker**:
- On the `_streamingPlaybackStarted` transition (playback actually begins): open a play session for
`_currentTrackId` — record the track `EntryKey`, the resolved release id (§2.3), the duration once
the WAV header sets it, and start the high-water mark at 0.
- On each progress tick (`OnProgressCallback`, already firing ≤10/sec): advance the high-water mark
(`max(highWater, currentTime)`).
- On session-end — whichever comes first: organic end (`OnPlaybackEndCallback`), a superseding
`LoadTrackStreaming` / `ResetToIdle` (track switch / stop / unload / dispose), or page unload
(§2.2) — **close the session**: compute `highWater/duration`, apply the §1d floor, classify the
bucket, and emit one play event. Then clear the tracker.
This keeps all play-counting logic in one place, behind one seam, testable against a fake interop.
**It does not change the playback path** — it's an observer on transitions the service already makes.
**Recommendation:** factor the tracker into a small injectable collaborator (e.g. `IPlayTracker` /
`PlayTracker`) that the player service calls (`OnPlaybackStarted`, `OnProgress`, `OnPlaybackEnded`),
rather than inlining HTTP calls into the player. Keeps the player's single responsibility intact and
matches the repo's "logic in services, not in the playback path" discipline. The tracker owns the
beacon call and the de-dupe/floor logic. (Final structural call is staff-engineer's; this is the
steer.)
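To make the tracker's lifecycle concrete, here is a sketch of the session state machine in TypeScript. The production tracker would live in the .NET player service; this only models the transitions. It assumes the duration is known at playback start (the spec notes it arrives once the WAV header is parsed), hard-codes the recommended D1/D2 values, and takes an injected `emit` callback in place of the beacon. All names are hypothetical.

```typescript
// Sketch only: models open / advance / close for one play session at a time.
type Bucket = "partial" | "sampled" | "complete";

interface PlayEvent {
  trackEntryKey: string;
  bucket: Bucket;
}

class PlayTracker {
  private trackEntryKey: string | null = null;
  private durationSec = 0;
  private highWaterSec = 0;
  private readonly emit: (event: PlayEvent) => void;

  constructor(emit: (event: PlayEvent) => void) {
    this.emit = emit;
  }

  onPlaybackStarted(trackEntryKey: string, durationSec: number): void {
    this.onPlaybackEnded(); // a superseding start closes any open session
    this.trackEntryKey = trackEntryKey;
    this.durationSec = durationSec;
    this.highWaterSec = 0;
  }

  onProgress(currentTimeSec: number): void {
    if (this.trackEntryKey === null) return; // no open session
    this.highWaterSec = Math.max(this.highWaterSec, currentTimeSec);
  }

  /** Safe to call repeatedly: only the first call after a start can emit. */
  onPlaybackEnded(): void {
    const key = this.trackEntryKey;
    if (key === null) return;
    this.trackEntryKey = null; // clear first so a second close cannot double-emit
    if (!(this.durationSec > 0)) return;
    const floorSec = Math.min(3, this.durationSec * 0.05);
    if (this.highWaterSec < floorSec) return; // below the floor: preview, not a play
    const fraction = this.highWaterSec / this.durationSec;
    const bucket: Bucket = fraction < 0.3 ? "partial" : fraction > 0.8 ? "complete" : "sampled";
    this.emit({ trackEntryKey: key, bucket });
  }
}
```

Because closing is idempotent, organic end, track switch, and page unload can all call `onPlaybackEnded` without coordinating; at most one event is emitted per session.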
**Share events: in `SharePopover` (or a small `IShareTracker` it calls).** The `CopyLink` /
`CopyEmbed` handlers already exist and already know the target (track `EntryKey` or release
`EntryKey` + medium) and the channel. After a successful clipboard write, fire a share event. Apply
the §1b debounce in the tracker.
### 2.2 What gets sent, when, and how (fire-and-forget vs. tracked)
**Recommendation: fire-and-forget `sendBeacon` for play and share events.** These are telemetry, not
transactions — a dropped event is acceptable; blocking the UI or the navigation for one is not.
`navigator.sendBeacon` is purpose-built for exactly this (it survives page unload, which the
tab-close edge case requires) and is a tiny TS interop addition alongside the existing audio interop.
- **Play event payload (sent once at session close):** `{ trackEntryKey, releaseEntryKey?, medium?,
bucket: "partial"|"sampled"|"complete", anonId? }`. No duration or position is sent — the bucket is
computed client-side, so the server stores a classification, not raw listening data (a privacy
plus: we never transmit *how long* someone listened, only a coarse bucket). `anonId` is the
unique-listener token and is **omitted entirely** if the unique-listener feature is off or the
listener has opted out (§3).
- **Share event payload:** `{ targetType: "track"|"release", targetKey, channel:
"link"|"embed", anonId? }`.
- **Transport:** a single new endpoint family `POST api/event/play` and `POST api/event/share`
(proxied through `DeepDrftPublic` for the WASM client, same hop as `api/track/*`). `sendBeacon`
issues a `POST` with a small JSON body; the endpoint returns `202 Accepted` and does the write
async. **Unauthenticated** (same posture as the public reads) — but rate-limited and validated
(§2.5).
**Why not a tracked/awaited call:** the only thing we'd gain is a delivery confirmation we don't need
and a failure path we don't want (a telemetry 500 must never surface to a listener). The beacon's
"can't read the response" limitation is irrelevant — we don't act on the response.
**Page-unload delivery:** register a `pagehide`/`visibilitychange→hidden` handler that closes any open
play session via the beacon. This is the canonical pattern (analytics libraries commonly do this) and is
why `sendBeacon` is preferred over a plain `fetch`: a `fetch` issued without `keepalive: true` may be
cancelled when the page unloads, whereas a `sendBeacon` request is queued by the browser and outlives the page.
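A sketch of the interop side, under stated assumptions: the endpoint path is the one proposed in this section, but the function names and the injected-transport shape are hypothetical. The transport and the window/document objects are passed in so the sketch runs outside a browser; the comment shows how the real `navigator.sendBeacon` could be wired.

```typescript
// Sketch only: every name here is hypothetical.
type Bucket = "partial" | "sampled" | "complete";

interface PlayEventPayload {
  trackEntryKey: string;
  bucket: Bucket;
  anonId?: string; // omitted entirely when unique listeners is off
}

// In a browser this could be wired as:
//   (url, body) => navigator.sendBeacon(url, new Blob([body], { type: "application/json" }))
type BeaconFn = (url: string, body: string) => boolean;

/** Fire-and-forget: returns false on failure, never throws. */
function sendPlayEvent(send: BeaconFn, payload: PlayEventPayload): boolean {
  try {
    return send("api/event/play", JSON.stringify(payload));
  } catch {
    return false; // a telemetry failure must never surface to the listener
  }
}

interface EventTargetLike {
  addEventListener(type: string, listener: () => void): void;
}

interface DocumentLike extends EventTargetLike {
  readonly visibilityState: string;
}

/** Close the open play session when the page is hidden or unloaded. */
function registerUnloadFlush(win: EventTargetLike, doc: DocumentLike, closeOpenSession: () => void): void {
  win.addEventListener("pagehide", closeOpenSession);
  doc.addEventListener("visibilitychange", () => {
    if (doc.visibilityState === "hidden") closeOpenSession();
  });
}
```

`closeOpenSession` should be idempotent, since both events can fire for one unload. One point worth checking during implementation: audio often keeps playing in a hidden tab, so closing on `visibilitychange→hidden` may end a session early; whether to reopen it on the next progress tick is left open here.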
### 2.3 Resolving the release id for a play
The player today holds `_currentTrackId` (the track `EntryKey`) but **not** the release. A play event
wants both (the metric is "tied to individual tracks and releases"). Options:
1. **Carry release context on the `TrackDto`/play call.** When a play originates from a release detail
page or the queue (`PlayRelease`), the caller knows the release. Thread a release `EntryKey` +
medium into `SelectTrackStreaming` (optional param) or onto the staged context. Clean, explicit.
2. **Resolve server-side from the track.** The play event sends only `trackEntryKey`; the API joins
track→release at write or aggregation time. The track→release link already exists in SQL. Zero
client plumbing; the release dimension is always correct even for plays that started without
release context (e.g. StreamNow random track).
**Recommendation: option 2 (resolve server-side).** The track→release join is authoritative and
already in `DeepDrftData`; sending only the track key keeps the client dumb and the payload minimal,
and it means a random-track play still gets correctly attributed to its release without the client
knowing. The client sends what it cheaply knows (track key); the server enriches. This also means
**release play counts are derived** (sum of plays of the release's tracks) rather than separately
counted — which is exactly right for multi-track Cuts and trivially correct for single-track
Session/Mix. (**D4**: confirm release plays = sum-of-track-plays, not a separately-counted "release
was played" event. Recommend derived.)
Shares already carry the right target directly (`SharePopover` knows track vs. release), so share
attribution needs no resolution step.
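The derived-release-plays rule (D4, recommended) reduces to a group-and-sum over the track→release link. The sketch below models that with in-memory maps purely for illustration; in the real system this would be a SQL join in `DeepDrftData`, and all names here are hypothetical.

```typescript
// Sketch only: in-memory stand-in for the track→release join.
interface BucketCounts {
  partial: number;
  sampled: number;
  complete: number;
}

/** Server-side enrichment: the client sent only the track key. */
function resolveRelease(trackToRelease: Map<string, string>, trackEntryKey: string): string | null {
  const releaseKey = trackToRelease.get(trackEntryKey);
  return releaseKey === undefined ? null : releaseKey;
}

/** Release plays = sum of all buckets across the release's tracks. */
function derivedReleasePlays(
  trackToRelease: Map<string, string>,
  trackCounts: Map<string, BucketCounts>
): Map<string, number> {
  const totals = new Map<string, number>();
  trackCounts.forEach((counts, trackKey) => {
    const releaseKey = trackToRelease.get(trackKey);
    if (releaseKey === undefined) return; // no release link: nothing to attribute
    const plays = counts.partial + counts.sampled + counts.complete;
    totals.set(releaseKey, (totals.get(releaseKey) ?? 0) + plays);
  });
  return totals;
}
```

A two-track Cut with 6 and 4 plays yields 10 release plays; a single-track Session with 7 plays yields 7, matching its track.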
### 2.4 Avoiding double-counting
- **Plays:** one session = one event, enforced by the player-service tracker (only
`SelectTrackStreaming` opens a session; seek-beyond-buffer and progress ticks never do). Covered by §2.1.
- **Shares:** the §1b per-(target,channel) debounce.
- **Beacon retries:** `sendBeacon` does not retry, so no transport-level duplication. If a future
switch to `fetch` adds retries, an idempotency key on the event would be needed — not for v1.
### 2.5 Abuse / inflation resistance (privacy-light, so light-touch)
Anonymous unauthenticated write endpoints invite inflation. v1 posture: **make casual gaming
annoying, accept that determined gaming is possible** (this is a band's vanity-and-texture counter,
not ad-revenue telemetry — the "90s visitor counter vibe" Daniel wants). Measures:
- Server-side **rate-limit per IP** on the event endpoints (e.g. N events/minute) — coarse, stateless,
standard ASP.NET rate-limiting middleware.
- The play **engagement floor** (§1d, D2) already drops trivial skim-spam.
- Reject malformed payloads (unknown bucket, missing track key, oversized body) at the controller.
- **Out of scope for v1:** bot detection, CAPTCHA, signed events, IP reputation. Flag as adjacent if
the counts ever start mattering enough to defend.
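To illustrate the two server-side checks above, here is a sketch of payload validation and a fixed-window per-client limiter. This is a TypeScript model of the behaviour only: the spec's intent is to use the standard ASP.NET rate-limiting middleware, not hand-rolled code, and the size cap, limit values, and names below are all hypothetical.

```typescript
// Sketch only: models the checks, not the ASP.NET implementation.
const BUCKETS = ["partial", "sampled", "complete"];
const MAX_BODY_CHARS = 512; // placeholder cap on the raw JSON body

interface PlayEventBody {
  trackEntryKey: string;
  bucket: string;
  anonId?: string;
}

/** Returns the parsed event, or null if the payload should be rejected. */
function parsePlayEvent(rawBody: string): PlayEventBody | null {
  if (rawBody.length > MAX_BODY_CHARS) return null;
  let data: unknown;
  try {
    data = JSON.parse(rawBody);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const fields = data as Record<string, unknown>;
  const trackEntryKey = fields["trackEntryKey"];
  const bucket = fields["bucket"];
  const anonId = fields["anonId"];
  if (typeof trackEntryKey !== "string" || trackEntryKey.length === 0) return null;
  if (typeof bucket !== "string" || BUCKETS.indexOf(bucket) === -1) return null;
  if (anonId !== undefined && typeof anonId !== "string") return null;
  return anonId === undefined ? { trackEntryKey, bucket } : { trackEntryKey, bucket, anonId };
}

/** Fixed window: at most `limit` events per client key per window. */
class FixedWindowLimiter {
  private readonly windows = new Map<string, { startMs: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(clientKey: string, nowMs: number): boolean {
    const current = this.windows.get(clientKey);
    if (current === undefined || nowMs - current.startMs >= this.windowMs) {
      this.windows.set(clientKey, { startMs: nowMs, count: 1 }); // new window
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

A rejected or rate-limited event is simply dropped; since the client uses a beacon and never reads the response, there is no client-visible failure path.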
---
## 3. Privacy-light anonymity mechanism (unique listeners)
This is the decision most in tension with "no PII / anonymous only," and the one most wanting Daniel's
eyes (**D5**). The unique-listener metric needs *some* notion of "the same anonymous listener seen
again" without identifying who they are. Three mechanism families, with trade-offs:
### Option A — Anonymous client-minted id (random GUID in localStorage)
On first visit, the client mints a random GUID, stores it in `localStorage` (or a first-party cookie),
and sends it as `anonId` on play/share events. The server counts distinct `anonId`s per
track/release.
- **Privacy:** strong. The id is random, contains no PII, is first-party only, never leaves as
anything but an opaque token, and the listener can clear it (clear site data) at will. It is
effectively a "this browser, until you clear it" token — not a person.
- **Accuracy:** good-but-inflating. Same person on phone + laptop = 2 listeners. Cleared storage =
new listener. Incognito = new listener each session. So it **over-counts** unique listeners
(counts unique *browser-installs-since-last-clear*, which we relabel honestly as "listeners").
- **Cross-origin embed caveat:** an embedded `FramePlayer` on a third-party site cannot read the
first-party `localStorage` of `deepdrft.com` (storage is partitioned). So embed plays would mint a
*separate* id per embedding site. Acceptable — embed plays are a minority and over-counting is the
known direction of error.
- **Consent:** a random first-party id with no cross-site tracking is the lightest-touch case under
GDPR/ePrivacy — widely treated as not requiring a consent banner when used purely for first-party
aggregate counts (this is the "privacy-light" sweet spot). **Recommend a short privacy-note line**
rather than a cookie wall.
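Option A's client side is small enough to sketch in full. The storage key and function name below are hypothetical, and the store and id generator are injected so the sketch runs outside a browser; the comment shows a possible browser wiring. Returning `undefined` on failure reflects the spec's rule that `anonId` is omitted entirely when unavailable.

```typescript
// Sketch only: hypothetical names. In a browser this could be called as:
//   getOrMintAnonId(window.localStorage, () => crypto.randomUUID())
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const ANON_ID_KEY = "deepdrft.anonId"; // placeholder key name

/** Returns the stored id, minting one on first visit; undefined if storage is unusable. */
function getOrMintAnonId(store: KeyValueStore, mint: () => string): string | undefined {
  try {
    const existing = store.getItem(ANON_ID_KEY);
    if (existing !== null && existing.length > 0) return existing;
    const fresh = mint();
    store.setItem(ANON_ID_KEY, fresh);
    return fresh;
  } catch {
    // Storage blocked or full: send events without an anonId.
    return undefined;
  }
}
```

Clearing site data removes the key, so the next visit mints a new id. That is the over-count direction described in the accuracy bullet above.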
### Option B — Salted/rotating daily identifier (no stored id at all)
The server (or client) derives a per-day token by hashing `IP + User-Agent + a daily-rotating salt`.
No id is stored on the client. Distinct tokens per day ≈ distinct visitors per day; the salt rotation
means yesterday's tokens can't be correlated to today's (no long-term tracking).
- **Privacy:** very strong on the *no-persistent-identifier* axis (nothing stored client-side, no
cross-day linkage). This is the pattern Plausible/Fathom-style privacy analytics use.
- **Accuracy:** coarser. IP+UA collides (everyone behind one NAT/office/carrier-CGNAT shares a token →
under-counts) and rotates daily (can only ever produce "unique per day," never "unique all-time" —
by design). Mobile IPs shift constantly → over-counts. Net: noisy, and **only meaningful as a daily
figure**, which fights the "all-time plays counter" vibe.
- **Architecture cost:** the IP is only reliably available **server-side** (the client doesn't know
its own public IP, and `X-Forwarded-For` is already handled server-side per `DeepDrftAPI` config).
So the token must be computed at the API, not the client — meaning the unique-listener dimension is
derived entirely server-side from request metadata, and the client sends *no* `anonId` at all. That
is actually a privacy *plus* (the client transmits nothing identifying) but ties unique-listeners to
a daily-rolling model.
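For comparison, Option B's server-side derivation might look like the sketch below. Important assumption: the hash here is FNV-1a, a **non-cryptographic** stand-in chosen only so the sketch is self-contained; a real implementation would use a cryptographic hash such as SHA-256. The class and function names, the UTC day boundary, and the salt source are likewise hypothetical.

```typescript
// Sketch only. FNV-1a is NOT suitable for production use here; it stands in
// for a cryptographic hash so the example has no dependencies.
function fnv1aHex(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ("0000000" + (hash >>> 0).toString(16)).slice(-8);
}

/** UTC calendar day, e.g. "2026-06-18". */
function utcDay(nowMs: number): string {
  return new Date(nowMs).toISOString().slice(0, 10);
}

/** Holds one salt per UTC day; the previous day's salt is discarded on rotation. */
class DailySaltStore {
  private day = "";
  private salt = "";
  private readonly newSalt: () => string;

  constructor(newSalt: () => string) {
    this.newSalt = newSalt;
  }

  saltFor(nowMs: number): string {
    const today = utcDay(nowMs);
    if (today !== this.day) {
      this.day = today;
      this.salt = this.newSalt(); // old salt is gone: yesterday's tokens cannot be recomputed
    }
    return this.salt;
  }
}

/** Per-day listener token derived from request metadata; nothing is stored client-side. */
function dailyToken(ip: string, userAgent: string, salt: string): string {
  return fnv1aHex(salt + "|" + ip + "|" + userAgent);
}
```

Two requests from the same IP and User-Agent on the same day produce the same token; after rotation the same visitor produces a different one, which is why this option can only ever yield a per-day unique count.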
### Option C — Coarse browser fingerprint
Derive an id from canvas/font/hardware fingerprinting signals.
- **Privacy:** worst. Fingerprinting is precisely what privacy regulation and browser vendors are
actively killing; it tracks across sites and survives storage-clearing. **Directly contradicts
"privacy-light."** Rejected outright — listed only for completeness.
### Recommendation
**Option A (client-minted random first-party id) as the primary mechanism, with the metric honestly
labelled.** Reasons:
- It fits the product: a band wants an *all-time* "N listeners reached" figure, which A supports and
B (daily-only) structurally cannot.
- It is genuinely privacy-light: opaque random token, first-party, clearable, no cross-site
correlation, no fingerprinting — the lightest mechanism that still answers "all-time unique."
- It keeps the server simple (count distinct tokens) and the client honest (one tiny localStorage
read).
- We label the metric **"listeners," not "people,"** and accept the known over-count. For a vanity
texture stat this is the right honesty/effort trade.
**If Daniel wants the stronger "stores nothing" posture, Option B is the fallback** — at the cost of
the metric becoming daily-unique and noisier. The two are not mutually exclusive long-term (A for
all-time reach, B-style for daily actives) but v1 should pick one. **This is D5.**
**Either way, unique-listeners is the deferred §6 stretch** — the play/share counters (which need no
`anonId` at all) ship first and stand alone.
---
## 4. Storage & aggregation model
### 4.1 Event capture: log vs. counter vs. both
Three shapes:
1. **Rolled-up counters only.** One row per (track, bucket) with an integer count, incremented on each
event. Tiny storage, trivial read, no history, no unique-listener support (can't count distinct
without storing the distinct tokens). Pure "90s hit counter."
2. **Append-only event log only.** One row per event. Full fidelity, supports any future metric
(unique listeners, time-of-day, channel splits, retention), but every read is an aggregation query
over a growing table and write volume = play volume.
3. **Both — event log + periodic/online rollup.** Events append to a log; counters are maintained
(either incrementally on write, or by a periodic aggregation pass) for the hot read path.
**Recommendation: a lean version of (3) — an append-only event log as the source of truth, plus a
small rolled-up counter table for the home-card hot read.** Reasoning:
- The home Plays card is read on every public-site landing; it must be a single cheap indexed read,
not a `COUNT(*)` over an event table. That argues for a counter.
- But unique-listeners *requires* retaining distinct tokens, and "which mixes get finished" texture
wants the bucket/per-target fidelity — both argue for a log.
- (3) gets both without forcing a premature choice, and matches the existing `HomeStatsDto` pattern
(the card reads a pre-aggregated DTO; it never queries raw).
For v1 the rollup can be **incremental-on-write** (the event-write transaction also bumps the counter
row) to avoid standing up a background job. If write volume ever makes that contended, a periodic
aggregation pass is the escape hatch — but for a collective-scale site, incremental is fine and
simplest. (**D6**: incremental-on-write rollup vs. periodic-batch rollup. Recommend incremental for
v1.)
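The "log plus incremental rollup" shape can be modelled in a few lines. This is an in-memory illustration, not the storage layer: in `DeepDrftData` the append and the counter bump would share one SQL transaction. All names are hypothetical.

```typescript
// Sketch only: in-memory model of append-only log + incremental counter.
type Bucket = "partial" | "sampled" | "complete";

interface PlayEventRow {
  trackId: string;
  bucket: Bucket;
  anonId: string | null; // null when unique listeners is off
}

interface CounterRow {
  partial: number;
  sampled: number;
  complete: number;
}

class PlayStore {
  readonly log: PlayEventRow[] = [];
  readonly counters = new Map<string, CounterRow>();

  /** One write: append to the log and bump the rollup together. */
  recordPlay(row: PlayEventRow): void {
    this.log.push(row);
    let counter = this.counters.get(row.trackId);
    if (counter === undefined) {
      counter = { partial: 0, sampled: 0, complete: 0 };
      this.counters.set(row.trackId, counter);
    }
    counter[row.bucket] += 1;
  }

  /** Hot read for the home card: touches counters only, never the log. */
  totalPlays(): number {
    let total = 0;
    this.counters.forEach((c) => {
      total += c.partial + c.sampled + c.complete;
    });
    return total;
  }

  /** Cold read: distinct tokens need the log, which counters cannot answer. */
  uniqueListeners(trackId: string): number {
    const seen = new Set<string>();
    for (const row of this.log) {
      if (row.trackId === trackId && row.anonId !== null) seen.add(row.anonId);
    }
    return seen.size;
  }
}
```

The split mirrors the reasoning above: `totalPlays` stays cheap regardless of log size, while `uniqueListeners` is only possible because the log retains the tokens.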
### 4.2 Conceptual SQL shape (not full schema — that's staff-engineer's)
In `DeepDrftData`, new tables roughly:
- **`play_event`** (the log): `id`, `track_entry_key` (or `track_id` FK), `release_id` (resolved at
write, §2.3), `bucket` (enum: partial/sampled/complete), `anon_id` (nullable — present only when
unique-listeners is on and the listener didn't opt out), `created_at`. Indexed on
`(track_id)`, `(release_id)`, and `(anon_id)` for the distinct-count query.
- **`share_event`** (the log): `id`, `target_type` (track/release), `target_id`, `channel`
(link/embed), `anon_id?`, `created_at`.
- **`play_counter`** (the rollup, optional in v1 if reads stay cheap): per (track_id) and per
(release_id), columns for `partial_count`, `sampled_count`, `complete_count`, `share_count`, and a
`total_plays` derived/stored. This is what the home-stats aggregation reads.
`anon_id` is the only field that touches anonymity, it is nullable, and it is never joined to any
identity table because none exists. Dropping the unique-listener feature later = stop writing the
column; the play/share counts are unaffected.
**Note on the dual-database split:** this is **all SQL** (it's metadata/counters, not binary content).
The FileDatabase vault is not involved. Aggregation lives in `TrackRepository` alongside
`GetHomeStatsAsync`; the API surface lives in `DeepDrftAPI`.
### 4.3 New API surface (sketch)
Writes (unauthenticated, rate-limited, proxied through `DeepDrftPublic`):
- `POST api/event/play` — body `{ trackEntryKey, bucket, anonId? }`; returns `202`. Resolves release
server-side, appends to `play_event`, bumps `play_counter`.
- `POST api/event/share` — body `{ targetType, targetKey, channel, anonId? }`; returns `202`.
Reads (unauthenticated, mirror `GET api/stats/home`):
- Extend **`GET api/stats/home`** to include the site-wide play total (and optionally share total)
for the home card — *the minimal payoff* (§5). The card already does one round-trip to this
endpoint; adding fields is the smallest possible change.
- `GET api/stats/track/{entryKey}` and `GET api/stats/release/{entryKey}` — per-target play/share/
listener figures, for the (future) detail-page display. **Not required for the home-card payoff**;
build when a detail-page surface wants them.

CMS-facing reads (the channel splits, bucket breakdowns, per-track leaderboards) are a separate,
later, `DeepDrftManager`-side concern: explicitly out of v1 scope and flagged as adjacent.
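
The `POST api/event/play` write described above could be sketched as a pure function, shown here in Python for illustration only (the real endpoint would live in `DeepDrftAPI`). The function name `handle_play_event`, the `release_by_track` lookup, and the `400`/`404` responses are assumptions; the spec itself only commits to the body shape and the `202` on success.

```python
from typing import Optional, Tuple

VALID_BUCKETS = ("partial", "sampled", "complete")


def handle_play_event(body: dict, release_by_track: dict) -> Tuple[int, Optional[dict]]:
    """Validate a play-event body and resolve its release server-side.

    Returns (http_status, event_row). The row is None unless the event is accepted.
    `release_by_track` stands in for the track-to-release join (hypothetical).
    """
    key = body.get("trackEntryKey")
    bucket = body.get("bucket")
    anon_id = body.get("anonId")  # optional; may be absent entirely

    # Assumption: malformed bodies are rejected with 400.
    if not isinstance(key, str) or not key:
        return 400, None
    if bucket not in VALID_BUCKETS:
        return 400, None
    if anon_id is not None and not isinstance(anon_id, str):
        return 400, None

    # The client never sends a release; it is looked up from the track key.
    release_id = release_by_track.get(key)
    if release_id is None:
        return 404, None  # assumption: unknown track keys are not logged

    event = {
        "track_entry_key": key,
        "release_id": release_id,
        "bucket": bucket,
        "anon_id": anon_id,
    }
    return 202, event  # accepted: caller appends to the log and bumps the counter
```

Since the client fires events with `sendBeacon` and ignores the response, the status codes mostly matter for server-side logging and rate-limit diagnostics rather than for client behaviour.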
---
## 5. The Plays card payoff
Today `NowPlayingStats.razor`'s third card renders `XXX / Plays (Coming Soon)` — a static odometer
placeholder. Once this phase lands, the card shows a **real site-wide play total** in the same
odometer treatment (the "90s visitor counter" vibe is *already the intended aesthetic* — the
placeholder is literally styled as an odometer). The minimal, correct payoff:
- **Primary figure:** total plays site-wide (sum across all tracks), rendered in the odometer.
- **Secondary line (optional, recommend yes):** something with texture that fits the card's existing
two-line shape (the other two cards both have a primary + secondary). Candidates: total shares
("N shared"), or completion rate ("N% finished" = complete / total), or unique listeners once the
stretch lands. **Recommend completion-rate or share-count for v1**, swapping to listeners if/when
16.C ships. (**D7**: what the card's secondary line shows.)

Mechanically: add `TotalPlays` (and the chosen secondary) to `HomeStatsDto`, populate it in
`TrackRepository.GetHomeStatsAsync` from `play_counter`, and the card reads it through the existing
`IStatsDataService.GetHomeStats()` path — **the same persistent-state-bridged single round-trip the
other two cards already use.** No new client data path; the card just stops being static.

This is deliberately the *first* visible payoff and the *smallest*: the whole counter substrate earns
its keep the moment the home number goes live, before any per-track surface or unique-listener work.
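
To make the card arithmetic concrete, the sketch below sums per-track bucket counters into the primary figure and derives the completion-rate candidate for the secondary line (complete / total). It is written in Python purely for illustration; `HomePlayStats`, `summarize_home_plays`, the tuple layout of the counters, and the half-up rounding are assumptions, not the shape of `HomeStatsDto`.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class HomePlayStats:
    total_plays: int          # primary figure: every bucket counts as a play
    completion_rate_pct: int  # secondary candidate: complete / total, whole percent


def summarize_home_plays(counters: Iterable[Tuple[int, int, int]]) -> HomePlayStats:
    """Aggregate (partial, sampled, complete) counters across all tracks."""
    partial = sampled = complete = 0
    for p, s, c in counters:
        partial += p
        sampled += s
        complete += c

    total = partial + sampled + complete
    if total == 0:
        # Nothing played yet: avoid dividing by zero and report 0%.
        return HomePlayStats(total_plays=0, completion_rate_pct=0)

    # Integer arithmetic with half-up rounding (an assumption about display).
    rate = (100 * complete + total // 2) // total
    return HomePlayStats(total_plays=total, completion_rate_pct=rate)
```

For example, two tracks with counters `(3, 2, 5)` and `(1, 0, 1)` give 12 total plays and a 50% completion rate.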
---
## 6. Suggested phasing
Sequenced so the visible payoff lands early and the privacy-sensitive stretch lands last. Each wave is
independently shippable.
- **16.A — Play & share counters (core).** The whole spine: player-service play tracker (§2.1) with
the three-bucket classification (§1a) and engagement floor (§1d/D2); share tracker in `SharePopover`
(§1b); `sendBeacon` interop + `POST api/event/{play,share}` endpoints (§2.2) with rate-limiting
(§2.5); `play_event`/`share_event` log + incremental `play_counter` rollup (§4); server-side release
resolution (§2.3, D4). **No `anonId` written yet** (unique listeners is 16.C). **Free-floating —
the cold-start wave; nothing gates it.**
- **16.B — Home Plays-card payoff (§5).** Extend `HomeStatsDto` + `GetHomeStatsAsync` with `TotalPlays`
(+ chosen secondary, D7); flip `NowPlayingStats`'s third card from placeholder to live. **Depends on
16.A** (needs the counter to read). This is the visible win — sequence it immediately after 16.A.
- **16.C — Unique listeners (stretch / "plus").** The anonymity mechanism (§3, D5 — recommend Option
A): mint/read the anon token, thread it onto event payloads, count distinct server-side, expose via
the per-target stats reads and/or as the home card's secondary line. **Depends on 16.A** (extends the
event payload + storage). **Explicitly lower priority** — 16.A+16.B deliver the agreed core; 16.C is
the agreed stretch. Can be deferred indefinitely without stranding anything.
- **16.D — Per-target stats surfaces (adjacent, not committed).** Detail-page play/share/listener
display via `GET api/stats/{track,release}/{key}`; CMS analytics views (bucket splits, channel
splits, leaderboards). **Speculative** — flagged for when a surface actually wants these. Not part of
the agreed scope; listed so the substrate (the event log) is understood to already support it.

**Hard dependencies:** `16.A → 16.B`; `16.A → 16.C`; `16.A → 16.D`. 16.A is the only cold-start wave.
16.B and 16.C are parallel after 16.A (B is the priority; C is the stretch).

**Verify-before-build:** confirm the `DeepDrftPublic` proxy can host a `POST api/event/*` write route
with the same proxy idiom as `api/track/*` (the WASM client cannot reach `DeepDrftAPI` directly). This
is the one infrastructural assumption the spec rests on; cheap to confirm, blocking if wrong.
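
The hard dependencies listed above form a very small graph, and the build order they imply can be checked mechanically. The sketch below uses Python's standard `graphlib` (3.9+); the `WAVE_DEPENDENCIES` mapping simply transcribes the three arrows, and `build_stages` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# wave -> the waves it depends on, transcribed from the hard-dependency list
WAVE_DEPENDENCIES = {
    "16.A": set(),
    "16.B": {"16.A"},
    "16.C": {"16.A"},
    "16.D": {"16.A"},
}


def build_stages(dependencies: dict) -> list:
    """Group waves into stages; waves in the same stage can proceed in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        stages.append(ready)
        sorter.done(*ready)
    return stages
```

With the mapping above this yields two stages: 16.A alone, then 16.B, 16.C, and 16.D together. Priority among the second-stage waves (B first, C as stretch, D speculative) is a product call that the graph does not encode.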
---
## 7. Open product decisions (for Daniel)
Resolve these before 16.A is decomposed. Recommendations carried from the body; the call is Daniel's.
- **D1 — Middle-band (30–80%) treatment.** Recommend a third `sampled` bucket (exhaustive
partial/sampled/complete); headline plays = sum of all three. Alt: fold middle into a binary
partial/complete. (§1a)
- **D2 — Engagement floor before a play counts.** Recommend a floor (~≥3s or ≥5% of duration) so
archive-skimming doesn't inflate the count. Alt: floor = 0, every started playback counts. (§1d)
- **D3 — Unique-listener window.** All-time vs. rolling-30-day. Recommend all-time (fits the "N
listeners reached" framing and Option-A mechanism). (§1c) — only bites if 16.C is built.
- **D4 — Release plays: derived or separately counted.** Recommend derived (release plays = sum of its
tracks' plays), correct for multi-track Cuts and single-track Session/Mix alike. (§2.3)
- **D5 — Unique-listener anonymity mechanism.** Recommend Option A (client-minted random first-party
localStorage id, metric honestly labelled "listeners," all-time). Fallback Option B (server-derived
salted daily token — stores nothing client-side but only yields daily-unique). Option C
(fingerprint) rejected. **This is the headline decision.** (§3)
- **D6 — Rollup strategy.** Recommend incremental-on-write counter for v1 (no background job). Alt:
periodic batch aggregation. (§4.1)
- **D7 — Home Plays-card secondary line.** Recommend completion-rate or total-shares for v1, swapping
to unique listeners if/when 16.C ships. (§5)
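
If D1 and D2 resolve as recommended, the close-time classification could look like the sketch below (Python, for illustration only; the real logic would sit in the player-service tracker). Everything numeric here is an assumption pending Daniel's call: a 3-second absolute floor, a middle band running from 30% to 80% of duration, and boundary values falling into the higher bucket. The function name `classify_play` is hypothetical.

```python
from typing import Optional


def classify_play(
    high_water_s: float,
    duration_s: float,
    floor_s: float = 3.0,        # assumed D2 floor; a percent-of-duration floor is the alternative
    sampled_from: float = 0.30,  # assumed lower edge of the middle band (D1)
    complete_from: float = 0.80, # assumed upper edge of the middle band (D1)
) -> Optional[str]:
    """Map a play session's high-water position to a completion bucket.

    Returns None when the session never clears the engagement floor,
    meaning no play event should be sent at all.
    """
    if duration_s <= 0 or high_water_s < floor_s:
        return None

    # Clamp so a position slightly past the reported duration still reads as 100%.
    fraction = min(high_water_s / duration_s, 1.0)

    if fraction >= complete_from:
        return "complete"
    if fraction >= sampled_from:
        return "sampled"
    return "partial"
```

Under these assumptions, a listener who reaches 70 seconds of a 200-second track is classified `sampled`, and one who skips away after 2 seconds produces no event.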
## Working with this spec
- Mirrors the established product-notes convention (cross-refs up top, numbered sections, decisions
called out, phasing with explicit dependencies) — see `phase-15-visualizer-controls-enhancements.md`
/ `phase-11-public-site-enhancements.md`.
- When Daniel resolves the D-decisions, fold the resolutions inline (mark the section "resolved by
Daniel YYYY-MM-DD") and flip the status to design-complete, the same way Phase 15 did.
- When waves land, doc-keeper moves them to `COMPLETED.md`; the per-wave bodies are written to travel
cleanly.