docs(plan): add Phase 16 spec — anonymous play & share tracking
Design spec for the telemetry layer behind the home-hero Plays card: completion-bucketed plays, shares, optional anonymous unique listeners under a no-PII constraint. Seven open decisions flagged for Daniel.
@@ -239,6 +239,27 @@ Sequenced as **eight waves**; the critical path is `11.A → 11.B → 11.C → 1
---

## Phase 16 — Anonymous Play & Share Tracking

The phase deferred behind the home-hero **Plays** stat card (`NowPlayingStats.razor`'s third card, today a static `XXX / Plays (Coming Soon)` odometer placeholder). Adds a **privacy-light, anonymous** telemetry layer to the public site: counting **plays** (bucketed by completion) and **shares**, tied to individual **tracks and releases**, plus an optional **unique-listener** "plus" metric. Hard constraint: **no accounts, no PII, anonymous identification only** — the unique-listener metric in particular is solved within that constraint, not around it. Full design, metric definitions, instrumentation seam, anonymity-mechanism options, storage model, the card payoff, and wave decomposition: `product-notes/phase-16-play-share-tracking.md`.

**Architectural spine.** Plays are instrumented at **one seam** — the `StreamingAudioPlayerService` playback lifecycle (not the UI, not the HTTP/media-client layer, which fires multiple times per play via seek-beyond-buffer). A small play-session tracker opens on playback-start, advances a high-water position on the existing progress callback, and closes (classifying the §1 completion bucket) on track-switch / stop / organic-end / page-unload. Shares instrument at the **real** share surface (`SharePopover`'s Copy-link / Copy-embed actions — clipboard writes that record nothing today). Events ship **fire-and-forget via `sendBeacon`** to new `POST api/event/{play,share}` endpoints (proxied through `DeepDrftPublic`, same hop as `api/track/*`), land in an **append-only SQL event log + incremental counter rollup** in `DeepDrftData`, and the home card reads the total through the **existing** `GET api/stats/home` / `HomeStatsDto` / `IStatsDataService` path (the same single persistent-state-bridged round-trip the other two cards use). Release attribution is **resolved server-side** from the track→release join (client sends only the track key); release plays are **derived** (sum of their tracks' plays). All SQL — the FileDatabase vault is not involved.

**Completion buckets (agreed thresholds).** `partial` < 30%, `complete` > 80%; the 30–80% middle band is proposed as its own `sampled` bucket so the three are exhaustive and non-overlapping (headline plays = sum of all three) — see spec §1a / decision **D1**.

**Sequenced as four waves.** `16.A → 16.B`, `16.A → 16.C`, `16.A → 16.D`. 16.A is the only cold-start wave.

- **16.A — Play & share counters (core).** Player-service play tracker (three-bucket classification + engagement floor), share tracker in `SharePopover`, `sendBeacon` interop + `POST api/event/{play,share}` (rate-limited), `play_event`/`share_event` log + incremental `play_counter` rollup, server-side release resolution. **No `anonId` yet.** Free-floating cold-start wave.
- **16.B — Home Plays-card payoff.** Extend `HomeStatsDto` + `GetHomeStatsAsync` with `TotalPlays` (+ a secondary line, D7); flip the third card from placeholder to live. **Depends on 16.A.** The visible win — sequence immediately after 16.A.
- **16.C — Unique listeners (stretch / "plus").** The anonymity mechanism (recommend a client-minted random first-party `localStorage` id; alt: server-derived salted daily token; fingerprinting rejected — see §3 / **D5**): thread an `anonId` onto event payloads, count distinct server-side, expose per-target and/or as the card's secondary line. **Depends on 16.A. Explicitly lower priority** — defers indefinitely without stranding 16.A/16.B.
- **16.D — Per-target stats surfaces.** `[speculative]` — detail-page play/share/listener display + CMS analytics views (bucket/channel splits, leaderboards). Not in the agreed scope; the event log already supports it. Build when a surface wants it.

**Open product decisions (D1–D7, spec §10) — unresolved, awaiting Daniel.** Headline is **D5** (unique-listener anonymity mechanism). Others: D1 (middle-band bucket), D2 (engagement floor before a play counts), D3 (unique-listener window), D4 (release plays derived vs. counted), D6 (rollup strategy), D7 (card secondary line). Resolve before 16.A is decomposed; recommendations are carried in the spec.

**Adjacency to deferred Identity / accounts (the un-phased backlog item above).** This phase is the deliberate **anonymous** answer to "how many plays" — it does **not** need the accounts/identity work and must not be entangled with it. If identity ever lands, per-user listening history is an additive layer above this anonymous substrate, not a replacement.

---

## Working with this file

- **Add items by extending an existing phase first**; only create a new phase when the addition genuinely doesn't fit any of 1–5. Phase numbers are organisational, not sequencing.
@@ -0,0 +1,513 @@
# Phase 16 — Anonymous Play & Share Tracking (Design Spec)

Status: **design-draft, open for Daniel review** (decision points in §10 are unresolved). Author:
product-designer. Date: 2026-06-18. **This doc is design only; no code has been written.** This is
the phase deferred behind the home-hero "Plays" stat card, which today renders a static
`XXX / Plays (Coming Soon)` odometer placeholder in `NowPlayingStats.razor`.

This spec adds a **privacy-light, anonymous play & share telemetry layer** to the public site:
counting plays (bucketed by completion) and shares, tied to individual tracks and releases, with an
optional unique-listener "plus" metric. It does **not** add accounts, PII, or any per-user identity
model — that is a hard constraint, not a deferral.

## Phase numbering

This is **Phase 16**. Phase 15 (Visualizer Controls Enhancements) is the highest-numbered phase in
`PLAN.md`. Phases 11 and 10-Reframe are landed; no phase 16 exists yet. If a concurrent worktree has
claimed 16 by the time this is scoped, bump to the next free number — the content is
number-independent.
## Cross-references (read these before implementing)

- `DeepDrftPublic.Client/Services/StreamingAudioPlayerService.cs` — the production player. The
  instrumentation seam lives here: `LoadTrackStreaming` (track-load = play-start candidate),
  the progress callback path, and `ResetToIdle` (stop/unload/switch). `_currentTrackId` holds the
  current `EntryKey`. **No release id is currently held by the player** — see §2.3.
- `DeepDrftPublic.Client/Services/AudioPlayerService.cs` — base class. `OnProgressCallback(double
  currentTime)` is the per-tick position seam; `OnPlaybackEndCallback` is the organic end-of-stream
  seam (and the only place `TrackEnded` fires). `Duration` is set from the WAV header during load.
- `DeepDrftPublic.Client/Services/AudioInteropService.cs` — `SetOnProgressCallbackAsync` /
  `SetOnEndCallbackAsync` are the JS→.NET callbacks already wired in `InitializeAsync`. Progress is
  throttled to ~10/sec on the JS side already.
- `DeepDrftPublic.Client/Services/QueueService.cs` — auto-advance orchestrator. Album playthroughs
  flow `PlayRelease → PlayCurrent → SelectTrackStreaming` per track; `OnTrackEnded` advances. Every
  track in an album play is an independent `SelectTrackStreaming` call, so per-track play events
  arise naturally without queue-specific instrumentation.
- `DeepDrftPublic.Client/Controls/SharePopover.razor[.cs]` — the **real** share surface. Two share
  actions exist today: **Copy link** (track mode + release mode) and **Copy embed** (track mode
  only, an `<iframe>` snippet to `/FramePlayer?TrackEntryKey=…`). Both are clipboard writes — there
  is no network call, so **no share is recorded today**. This is the share-event origin (§2.1).
- `DeepDrftPublic.Client/Pages/FramePlayer.razor` — the embeddable single-track player the embed
  snippet points at. Plays inside a third-party `<iframe>` should count (§1d, edge cases).
- `DeepDrftAPI/Controllers/StatsController.cs` + `DeepDrftModels/DTOs/HomeStatsDto.cs` — the landed
  home-stats pattern the Plays card will consume. `GET api/stats/home` returns a bare DTO; aggregation
  lives in `TrackRepository.GetHomeStatsAsync` / `ITrackService`; the controller is a thin boundary.
  **This is the template the play/share read path mirrors** (§4, §5).
- `DeepDrftPublic.Client/Controls/NowPlayingStats.razor` — the three-card hero stat row. The Plays
  card is the third card; it currently shows the static placeholder. This phase fills it.
- `DeepDrftData` (`DeepDrftContext`, `TrackRepository`, `TrackManager`, `Migrations`) — where new
  SQL tables/migrations and aggregation queries live. `DeepDrftAPI` owns the HTTP surface.
- `DeepDrftPublic` proxy controller (`api/track/*`) — the public site's browser→API proxy hop. A new
  `api/event/*` or `api/stats/*` write path needs a matching proxy route (the WASM client cannot
  reach `DeepDrftAPI` directly; SSR can).

---
## 1. The metrics, defined precisely

Three metrics, two core + one stretch:

### 1a. Play (core)

A **play** is a single listener's session of listening to one track. It is recorded once per
track-listen, classified by how far the listener got:

- **partial** — playback reached **< 30%** of the track's duration before the session ended
  (switched track, stopped, navigated away, closed tab).
- **complete** — playback reached **> 80%** of duration.
- **middle band (30%–80%)** — see decision **D1** below. **Recommendation: count the middle band as
  its own bucket, `sampled`** (a real listen that wasn't a skip and wasn't a finish), so the three
  buckets are exhaustive and non-overlapping: `partial` [0, 30%), `sampled` [30%, 80%], `complete`
  (80%, 100%]. The headline "Plays" number is the **sum of all three** (every started listen counts
  as a play); the buckets are the texture beneath it.

Alternative considered: fold the middle into "complete" (threshold becomes "≥30% = a real play,
else partial"). Simpler, two buckets — but it throws away the distinction between "listened to half"
and "listened to the end," which is the most editorially interesting signal for a music collective
("which mixes do people actually finish?"). Rejected in favour of three buckets, but it's a genuine
Daniel call (**D1**).

**What starts a play candidate:** a track's audio actually begins streaming for playback — i.e.
`SelectTrackStreaming` reaches the point where `StartStreamingPlayback` succeeds and `IsPlaying`
becomes true (the `_streamingPlaybackStarted` transition in `StreamAudioWithEarlyPlayback`). A track
that is *staged* (`StageTrack`) but never played does **not** count. A track that fails to load does
not count.

**What classifies the bucket:** the **maximum playback position reached** as a fraction of duration,
captured when the listen ends (track switch, stop, organic end, or page unload). Not the position at
the moment of ending — the *high-water mark* — so that seeking backward near the end doesn't demote a
complete play to partial. (See seeks under edge cases.)

**Why a high-water mark, not elapsed-listen-time:** elapsed time would require accumulating play
duration across pauses and is more state to carry. The high-water position is already trivially
derivable from the progress callback (`max(currentTime)/duration`). For v1 the simpler model is the
right call; if "engaged listening time" becomes interesting later it's an additive metric, not a
reshape.
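
The bucket boundaries above can be written down as a small pure function. This is a minimal sketch, assuming the three-bucket recommendation (D1) is accepted; it is written in TypeScript for illustration only (the production tracker would presumably live in C# next to `StreamingAudioPlayerService`), and `classifyBucket` plus the two constants are hypothetical names, not existing code.

```typescript
// Hypothetical names throughout; thresholds are the §1a values (D1 recommendation).
type Bucket = "partial" | "sampled" | "complete";

const PARTIAL_BELOW = 0.3; // partial covers [0, 30%)
const COMPLETE_ABOVE = 0.8; // complete covers (80%, 100%]

// Classifies a finished listen from its high-water position and the track duration.
function classifyBucket(highWaterSeconds: number, durationSeconds: number): Bucket {
  // Assumption: an unknown or zero duration falls back to "partial".
  if (durationSeconds <= 0) return "partial";
  // Clamp so a progress tick slightly past the end still maps to 100%.
  const fraction = Math.min(1, Math.max(0, highWaterSeconds / durationSeconds));
  if (fraction < PARTIAL_BELOW) return "partial";
  if (fraction > COMPLETE_ABOVE) return "complete";
  return "sampled"; // the closed middle band [30%, 80%]
}
```

Because both band edges belong to `sampled`, a listen that stops at exactly 30% or exactly 80% lands in the middle bucket, matching the interval notation in the list above.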
### 1b. Share (core)

A **share** is recorded when a listener performs a share *action* — not when they open the share
popover. Two actions exist today (`SharePopover`):

- **Copy link** (track or release) → records a share against that track or release, with a
  `channel = link` tag.
- **Copy embed** (track only) → records a share with `channel = embed`.

A future "native share" (Web Share API) or per-platform button would add channels without reshaping
the metric. The share count on the Plays-card payoff is **total shares**; the channel split is
texture (and probably CMS-only, not public-facing, for v1).

**De-dupe:** copying the same link three times in a row is one intent, not three shares. Recommend a
client-side **debounce** — at most one share event per (target, channel) per short window (e.g. 60s)
per session. Cheap, prevents the obvious gaming, and matches how "copied!" already feels like one act.
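
One way to picture the per-(target, channel) debounce is a small in-memory gate that lives for the session. The sketch below is illustrative TypeScript with hypothetical names (`ShareDebouncer`, `shouldSend`); the 60-second default mirrors the "e.g. 60s" above and is an assumption, not a decided value. The caller passes the current time in, which keeps the logic testable without a real clock.

```typescript
// All names are hypothetical; the window length mirrors the spec's "e.g. 60s".
type TargetType = "track" | "release";
type Channel = "link" | "embed";

class ShareDebouncer {
  // Last accepted send time per (target, channel). A null-prototype object has no inherited keys.
  private readonly lastSentMs: Record<string, number> = Object.create(null);
  private readonly windowMs: number;

  constructor(windowMs: number = 60000) {
    this.windowMs = windowMs;
  }

  // Returns true when the share should be reported, false when it falls inside the window.
  shouldSend(targetType: TargetType, targetKey: string, channel: Channel, nowMs: number): boolean {
    const key = `${targetType}:${channel}:${targetKey}`;
    if (key in this.lastSentMs && nowMs - this.lastSentMs[key] < this.windowMs) {
      return false; // same target and channel, still inside the window
    }
    this.lastSentMs[key] = nowMs;
    return true;
  }
}
```

Suppressed attempts do not extend the window, so a listener who copies the same link repeatedly for several minutes still produces roughly one share event per minute rather than none.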
### 1c. Unique listeners (stretch / "plus" — lower priority)

A **unique listener** is an approximate distinct-listener count over a window (all-time, or rolling
30 days — **D3**), tied to a track or release. This is the metric most in tension with the no-PII
constraint, and it is explicitly the **last** thing to build (§6). It is approximate by design — we
are not building identity, we are estimating reach. Mechanism options and recommendation: **§3**.

### 1d. Edge cases (apply to plays unless noted)

- **Replays.** Playing the same track twice in one session = two plays. The play event fires per
  `SelectTrackStreaming` that reaches playback. If a listener loops a track ten times, that's ten
  plays. Acceptable for v1 — looping is genuine listening. (Unique listeners absorbs the "but it's
  the same person" concern at the reach level.)
- **Seeks (within a play).** Seeking does not start a new play. The high-water position keeps
  climbing; seeking backward never lowers it. So "seek to the end to check the outro, then seek
  back" still classifies as `complete` — correct, they heard the end.
- **Seek-beyond-buffer re-request.** The `GET api/track/{id}` with a `Range` header during
  `SeekBeyondBuffer` is a *byte* re-fetch of the **same** play — it must **not** start a new play
  event. Because the play event is keyed off the player-service `SelectTrackStreaming` lifecycle (not
  the HTTP fetch), this is free: `SeekBeyondBuffer` reuses `_currentTrackId` and never calls
  `SelectTrackStreaming`. **Instrument at the player-service level, never at the HTTP/media-client
  level** — the media client fires multiple times per play.
- **Very short tracks.** A 4-second sting: 30%/80% still apply proportionally. No special-casing for
  v1. (If "complete" on a 4s clip feels too cheap, a minimum-absolute-seconds floor is an additive
  tweak — flag as a tuning knob, not a v1 requirement.)
- **Rapid skips.** Listener clicks through ten tracks in five seconds. Each reaches `< 30%` →
  ten `partial` plays. **Recommendation (D2): apply a minimum-engagement floor before a play counts
  at all** — e.g. playback must reach **the smaller of 3 seconds and 5% of duration** for
  the listen to register as a play. Below the floor it's a *preview/skip*, not a play, and is dropped
  entirely. This keeps the headline number honest (a skim through the archive isn't 40 plays) while
  still capturing genuine short partial listens. The floor is a single tunable constant. **D2 is a
  Daniel call** — the alternative is "every started playback counts, floor = 0," which is simpler and
  defensible ("they hit play, it's a play") but inflates the number on a browsing session.
- **Tab close / navigation mid-play.** The play must still be recorded with its high-water bucket.
  This is the hardest delivery case and drives the beacon recommendation in §2.2.
- **Embedded (`FramePlayer`) plays.** A play inside a third-party iframe is a real play and should
  count. The embed runs the same player stack, so it instruments for free — but it may run
  cross-origin, which bears on the unique-listener cookie/storage mechanism (§3) and the share
  attribution (an embed play could optionally carry the embedding page as a referrer dimension —
  out of scope for v1, flag as adjacent).
- **CMS / admin playback.** Plays generated by an admin auditing tracks in the CMS should ideally not
  pollute the public counts. The CMS is a separate app (`DeepDrftManager`) and does not run this
  player stack, so this is mostly free — but a logged-in admin browsing the *public* site would
  count. Acceptable for v1 (low volume); flag as a known caveat.
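
The rapid-skips floor from the list above reduces to one comparison. The following is a rough TypeScript sketch under the D2 recommendation (the smaller of 3 seconds and 5% of duration); `meetsEngagementFloor` and both constants are hypothetical, and the real values remain a Daniel call. Note how the proportional term takes over on very short tracks: a 4-second sting needs only 0.2 seconds of playback to register.

```typescript
// Hypothetical constants; both are the "e.g." values from the D2 recommendation.
const FLOOR_SECONDS = 3;
const FLOOR_FRACTION = 0.05;

// Returns true when a listen got far enough to count as a play at all.
function meetsEngagementFloor(highWaterSeconds: number, durationSeconds: number): boolean {
  // Assumption: without a known duration the listen cannot be classified, so it is dropped.
  if (durationSeconds <= 0) return false;
  // "Whichever is smaller": long tracks use the 3 s floor, short tracks the 5% floor.
  const floorSeconds = Math.min(FLOOR_SECONDS, FLOOR_FRACTION * durationSeconds);
  return highWaterSeconds >= floorSeconds;
}
```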
---

## 2. Instrumentation

### 2.1 Where events originate

**Play events: in `StreamingAudioPlayerService`, not the UI and not the HTTP layer.** The player
service is the single chokepoint every playback path flows through (home StreamNow, gallery, queue
auto-advance, detail-page play, embed). Instrumenting once there covers all of them and dodges the
seek-beyond-buffer double-count trap.

Concretely, the player service grows a small internal **play-session tracker**:

- On the `_streamingPlaybackStarted` transition (playback actually begins): open a play session for
  `_currentTrackId` — record the track `EntryKey`, the resolved release id (§2.3), the duration once
  the WAV header sets it, and start the high-water mark at 0.
- On each progress tick (`OnProgressCallback`, already firing ≤10/sec): advance the high-water mark
  (`max(highWater, currentTime)`).
- On session-end — whichever comes first: organic end (`OnPlaybackEndCallback`), a superseding
  `LoadTrackStreaming` / `ResetToIdle` (track switch / stop / unload / dispose), or page unload
  (§2.2) — **close the session**: compute `highWater/duration`, apply the §1d floor, classify the
  bucket, and emit one play event. Then clear the tracker.

This keeps all play-counting logic in one place, behind one seam, testable against a fake interop.
**It does not change the playback path** — it's an observer on transitions the service already makes.

**Recommendation:** factor the tracker into a small injectable collaborator (e.g. `IPlayTracker` /
`PlayTracker`) that the player service calls (`OnPlaybackStarted`, `OnProgress`, `OnPlaybackEnded`),
rather than inlining HTTP calls into the player. Keeps the player's single responsibility intact and
matches the repo's "logic in services, not in the playback path" discipline. The tracker owns the
beacon call and the de-dupe/floor logic. (Final structural call is staff-engineer's; this is the
steer.)

**Share events: in `SharePopover` (or a small `IShareTracker` it calls).** The `CopyLink` /
`CopyEmbed` handlers already exist and already know the target (track `EntryKey` or release
`EntryKey` + medium) and the channel. After a successful clipboard write, fire a share event. Apply
the §1b debounce in the tracker.
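
The play-session tracker described in this section can be modelled in a few dozen lines. The sketch below is illustrative TypeScript, not the proposed C# `IPlayTracker`; the class name, the `emit` callback, and the inlined floor and threshold constants are all assumptions made for the example. It shows the three calls the player service would make and the one-event-per-session guarantee.

```typescript
// Hypothetical shapes; the real collaborator would be a C# service behind IPlayTracker.
type Bucket = "partial" | "sampled" | "complete";

interface PlayEvent {
  trackEntryKey: string;
  bucket: Bucket;
}

type Emit = (event: PlayEvent) => void;

class PlaySessionTracker {
  private trackEntryKey: string | null = null;
  private durationSeconds = 0;
  private highWaterSeconds = 0;
  private readonly emit: Emit;

  constructor(emit: Emit) {
    this.emit = emit;
  }

  // Called on the playback-started transition; a superseding start closes any open session.
  onPlaybackStarted(trackEntryKey: string, durationSeconds: number): void {
    this.onPlaybackEnded();
    this.trackEntryKey = trackEntryKey;
    this.durationSeconds = durationSeconds;
    this.highWaterSeconds = 0;
  }

  // Called on each progress tick; seeking backward never lowers the mark.
  onProgress(currentTimeSeconds: number): void {
    if (this.trackEntryKey === null) return;
    this.highWaterSeconds = Math.max(this.highWaterSeconds, currentTimeSeconds);
  }

  // Called on organic end, stop, track switch, or page unload. Emits at most one event.
  onPlaybackEnded(): void {
    if (this.trackEntryKey === null) return;
    const key = this.trackEntryKey;
    this.trackEntryKey = null; // clear first so a repeated close emits nothing
    const floorSeconds = Math.min(3, 0.05 * this.durationSeconds); // assumed D2 floor
    if (this.durationSeconds <= 0 || this.highWaterSeconds < floorSeconds) return;
    const fraction = Math.min(1, this.highWaterSeconds / this.durationSeconds);
    const bucket: Bucket = fraction < 0.3 ? "partial" : fraction > 0.8 ? "complete" : "sampled";
    this.emit({ trackEntryKey: key, bucket });
  }
}
```

Passing `emit` in as a callback is what makes the tracker testable against a fake: a test collects events in an array, while production wiring would hand it the beacon call.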
### 2.2 What gets sent, when, and how (fire-and-forget vs. tracked)

**Recommendation: fire-and-forget `sendBeacon` for play and share events.** These are telemetry, not
transactions — a dropped event is acceptable; blocking the UI or the navigation for one is not.
`navigator.sendBeacon` is purpose-built for exactly this (it survives page unload, which the
tab-close edge case requires) and is a tiny TS interop addition alongside the existing audio interop.

- **Play event payload (sent once at session close):** `{ trackEntryKey, releaseEntryKey?, medium?,
  bucket: "partial"|"sampled"|"complete", anonId? }`. No duration or position is sent — the bucket is
  computed client-side, so the server stores a classification, not raw listening data (a privacy
  plus: we never transmit *how long* someone listened, only a coarse bucket). `anonId` is the
  unique-listener token and is **omitted entirely** if the unique-listener feature is off or the
  listener has opted out (§3).
- **Share event payload:** `{ targetType: "track"|"release", targetKey, channel:
  "link"|"embed", anonId? }`.
- **Transport:** a single new endpoint family `POST api/event/play` and `POST api/event/share`
  (proxied through `DeepDrftPublic` for the WASM client, same hop as `api/track/*`). `sendBeacon`
  issues a `POST` with a small JSON body; the endpoint returns `202 Accepted` and does the write
  async. **Unauthenticated** (same posture as the public reads) — but rate-limited and validated
  (§2.5).

**Why not a tracked/awaited call:** the only thing we'd gain is a delivery confirmation we don't need
and a failure path we don't want (a telemetry 500 must never surface to a listener). The beacon's
"can't read the response" limitation is irrelevant — we don't act on the response.

**Page-unload delivery:** register a `pagehide`/`visibilitychange→hidden` handler that closes any open
play session via the beacon. This is a common pattern in analytics libraries and is why `sendBeacon`
over a plain `fetch` matters: a `fetch` without `keepalive` can be cancelled on unload, whereas
`sendBeacon` hands the request to the browser to deliver after the page is gone.
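
As a rough illustration of the fire-and-forget send, the sketch below wraps the beacon call so that it can never throw into the caller. It is TypeScript with hypothetical names (`sendPlayEvent`, `BeaconFn`); the beacon function is injected so the example runs outside a browser, and the payload carries only the fields the spec treats as always present plus the optional `anonId`. One detail worth knowing: when `navigator.sendBeacon` is given a plain string, the browser sends it as `text/plain`, so this sketch assumes the endpoint parses the body as JSON regardless of content type.

```typescript
// Hypothetical names; the endpoint path follows the spec's `POST api/event/play`.
type Bucket = "partial" | "sampled" | "complete";

interface PlayEventPayload {
  trackEntryKey: string;
  bucket: Bucket;
  anonId?: string; // omitted entirely when unique listeners is off or opted out
}

// Structural stand-in for navigator.sendBeacon(url, data), so the sketch runs without a DOM.
type BeaconFn = (url: string, data: string) => boolean;

// Queues one play event. Returns false when the browser refused to queue it or the call threw.
function sendPlayEvent(beacon: BeaconFn, payload: PlayEventPayload): boolean {
  try {
    // JSON.stringify drops undefined properties, so an absent anonId never reaches the wire.
    return beacon("/api/event/play", JSON.stringify(payload));
  } catch {
    return false; // telemetry must never surface an error to the listener
  }
}

// Assumed browser wiring (not executed here):
//   const beacon: BeaconFn = (url, data) => navigator.sendBeacon(url, data);
//   addEventListener("pagehide", () => { /* close the open play session, then sendPlayEvent */ });
```

The boolean result is useful only for local diagnostics; per the reasoning above, nothing in the playback path should branch on it.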
### 2.3 Resolving the release id for a play

The player today holds `_currentTrackId` (the track `EntryKey`) but **not** the release. A play event
wants both (the metric is "tied to individual tracks and releases"). Options:

1. **Carry release context on the `TrackDto`/play call.** When a play originates from a release detail
   page or the queue (`PlayRelease`), the caller knows the release. Thread a release `EntryKey` +
   medium into `SelectTrackStreaming` (optional param) or onto the staged context. Clean, explicit.
2. **Resolve server-side from the track.** The play event sends only `trackEntryKey`; the API joins
   track→release at write or aggregation time. The track→release link already exists in SQL. Zero
   client plumbing; the release dimension is always correct even for plays that started without
   release context (e.g. StreamNow random track).

**Recommendation: option 2 (resolve server-side).** The track→release join is authoritative and
already in `DeepDrftData`; sending only the track key keeps the client dumb and the payload minimal,
and it means a random-track play still gets correctly attributed to its release without the client
knowing. The client sends what it cheaply knows (track key); the server enriches. This also means
**release play counts are derived** (sum of plays of the release's tracks) rather than separately
counted — which is exactly right for multi-track Cuts and trivially correct for single-track
Session/Mix. (**D4**: confirm release plays = sum-of-track-plays, not a separately-counted "release
was played" event. Recommend derived.)

Shares already carry the right target directly (`SharePopover` knows track vs. release), so share
attribution needs no resolution step.
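
The derived release count recommended in this section is a plain group-and-sum over the track→release link. In production this would presumably be a SQL join in `DeepDrftData`; the TypeScript below is only a minimal model of the arithmetic, and `releasePlays` and its two input maps are hypothetical.

```typescript
// Hypothetical model: per-track totals plus a track -> release lookup, both keyed by EntryKey.
function releasePlays(
  trackPlays: Record<string, number>,
  trackToRelease: Record<string, string>
): Record<string, number> {
  // Null-prototype object so release keys never collide with inherited properties.
  const totals: Record<string, number> = Object.create(null);
  for (const trackKey of Object.keys(trackPlays)) {
    // Assumption: a track with no known release contributes to no release total.
    if (!Object.prototype.hasOwnProperty.call(trackToRelease, trackKey)) continue;
    const releaseKey = trackToRelease[trackKey];
    totals[releaseKey] = (totals[releaseKey] ?? 0) + trackPlays[trackKey];
  }
  return totals;
}
```

A multi-track Cut sums its tracks, and a single-track Session or Mix simply inherits its one track's total, which is the behaviour D4 asks Daniel to confirm.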
### 2.4 Avoiding double-counting

- **Plays:** one session = one event, enforced by the player-service tracker (only
  `SelectTrackStreaming` opens a session; seek-beyond-buffer and progress ticks never do). Covered by §2.1.
- **Shares:** the §1b per-(target,channel) debounce.
- **Beacon retries:** `sendBeacon` does not retry, so no transport-level duplication. If a future
  switch to `fetch` adds retries, an idempotency key on the event would be needed — not for v1.

### 2.5 Abuse / inflation resistance (privacy-light, so light-touch)

Anonymous unauthenticated write endpoints invite inflation. v1 posture: **make casual gaming
annoying, accept that determined gaming is possible** (this is a band's vanity-and-texture counter,
not ad-revenue telemetry — the "90s visitor counter vibe" Daniel wants). Measures:

- Server-side **rate-limit per IP** on the event endpoints (e.g. N events/minute) — coarse, stateless,
  standard ASP.NET rate-limiting middleware.
- The play **engagement floor** (§1d, D2) already drops trivial skim-spam.
- Reject malformed payloads (unknown bucket, missing track key, oversized body) at the controller.
- **Out of scope for v1:** bot detection, CAPTCHA, signed events, IP reputation. Flag as adjacent if
  the counts ever start mattering enough to defend.
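
The payload checks in the list above (unknown bucket, missing track key, oversized body) are simple enough to sketch. The real checks would sit in the ASP.NET controller; the TypeScript below only models the decision logic, and `validatePlayBody`, the result shape, and the 1024-character cap are all hypothetical choices for illustration.

```typescript
// Hypothetical validator; the size cap is an illustrative number, not a spec value.
const KNOWN_BUCKETS = ["partial", "sampled", "complete"];
const MAX_BODY_CHARS = 1024;

type ValidationResult =
  | { ok: true; trackEntryKey: string; bucket: string }
  | { ok: false; reason: string };

// Accepts the raw request body and returns either the two trusted fields or a rejection reason.
function validatePlayBody(rawBody: string): ValidationResult {
  if (rawBody.length > MAX_BODY_CHARS) return { ok: false, reason: "oversized body" };
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return { ok: false, reason: "malformed JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) return { ok: false, reason: "not an object" };
  const body = parsed as Record<string, unknown>;
  const key = body["trackEntryKey"];
  const bucket = body["bucket"];
  if (typeof key !== "string" || key.length === 0) return { ok: false, reason: "missing track key" };
  if (typeof bucket !== "string" || KNOWN_BUCKETS.indexOf(bucket) < 0) {
    return { ok: false, reason: "unknown bucket" };
  }
  return { ok: true, trackEntryKey: key, bucket };
}
```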
---

## 3. Privacy-light anonymity mechanism (unique listeners)

This is the decision most in tension with "no PII / anonymous only," and the one most wanting Daniel's
eyes (**D5**). The unique-listener metric needs *some* notion of "the same anonymous listener seen
again" without identifying who they are. Three mechanism families, with trade-offs:

### Option A — Anonymous client-minted id (random GUID in localStorage)

On first visit, the client mints a random GUID, stores it in `localStorage` (or a first-party cookie),
and sends it as `anonId` on play/share events. The server counts distinct `anonId`s per
track/release.

- **Privacy:** strong. The id is random, contains no PII, is first-party only, never leaves as
  anything but an opaque token, and the listener can clear it (clear site data) at will. It is
  effectively a "this browser, until you clear it" token — not a person.
- **Accuracy:** good-but-inflating. Same person on phone + laptop = 2 listeners. Cleared storage =
  new listener. Incognito = new listener each session. So it **over-counts** unique listeners
  (counts unique *browser-installs-since-last-clear*, which we relabel honestly as "listeners").
- **Cross-origin embed caveat:** an embedded `FramePlayer` on a third-party site cannot read the
  first-party `localStorage` of `deepdrft.com` (storage is partitioned). So embed plays would mint a
  *separate* id per embedding site. Acceptable — embed plays are a minority and over-counting is the
  known direction of error.
- **Consent:** a random first-party id with no cross-site tracking is the lightest-touch case under
  GDPR/ePrivacy — widely treated as not requiring a consent banner when used purely for first-party
  aggregate counts (this is the "privacy-light" sweet spot). **Recommend a short privacy-note line**
  rather than a cookie wall.
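
Option A's client side amounts to a read-or-mint against first-party storage. The sketch below is a hedged TypeScript illustration: `getOrCreateAnonId`, the `KeyValueStore` interface, and the storage key are hypothetical, and both the store and the id generator are injected so the example runs outside a browser. When storage is unavailable the function returns `undefined`, which lines up with the spec's rule that `anonId` is omitted rather than faked.

```typescript
// Structural subset of the Web Storage API, so `localStorage` can be passed in directly.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const ANON_ID_KEY = "deepdrft.anonId"; // hypothetical storage key

// Returns the stored id, minting one on first use. Returns undefined when storage is blocked.
function getOrCreateAnonId(store: KeyValueStore, mintId: () => string): string | undefined {
  try {
    const existing = store.getItem(ANON_ID_KEY);
    if (existing) return existing;
    const minted = mintId();
    store.setItem(ANON_ID_KEY, minted);
    return minted;
  } catch {
    // Storage can throw (disabled, partitioned, or full); events then go out without anonId.
    return undefined;
  }
}

// Assumed browser wiring (not executed here):
//   const anonId = getOrCreateAnonId(localStorage, () => crypto.randomUUID());
```

Clearing site data removes the key, so the next visit mints a fresh id; that is the over-count direction the accuracy bullet above already accepts.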
### Option B — Salted/rotating daily identifier (no stored id at all)

The server (or client) derives a per-day token by hashing `IP + User-Agent + a daily-rotating salt`.
No id is stored on the client. Distinct tokens per day ≈ distinct visitors per day; the salt rotation
means yesterday's tokens can't be correlated to today's (no long-term tracking).

- **Privacy:** very strong on the *no-persistent-identifier* axis (nothing stored client-side, no
  cross-day linkage). This is the pattern Plausible/Fathom-style privacy analytics use.
- **Accuracy:** coarser. IP+UA collides (everyone behind one NAT/office/carrier-CGNAT shares a token →
  under-counts) and rotates daily (can only ever produce "unique per day," never "unique all-time" —
  by design). Mobile IPs shift constantly → over-counts. Net: noisy, and **only meaningful as a daily
  figure**, which fights the "all-time plays counter" vibe.
- **Architecture cost:** the IP is only reliably available **server-side** (the client doesn't know
  its own public IP, and `X-Forwarded-For` is already handled server-side per `DeepDrftAPI` config).
  So the token must be computed at the API, not the client — meaning the unique-listener dimension is
  derived entirely server-side from request metadata, and the client sends *no* `anonId` at all. That
  is actually a privacy *plus* (the client transmits nothing identifying) but ties unique-listeners to
  a daily-rolling model.

### Option C — Coarse browser fingerprint

Derive an id from canvas/font/hardware fingerprinting signals.

- **Privacy:** worst. Fingerprinting is precisely what privacy regulation and browser vendors are
  actively killing; it tracks across sites and survives storage-clearing. **Directly contradicts
  "privacy-light."** Rejected outright — listed only for completeness.

### Recommendation

**Option A (client-minted random first-party id) as the primary mechanism, with the metric honestly
labelled.** Reasons:

- It fits the product: a band wants an *all-time* "N listeners reached" figure, which A supports and
  B (daily-only) structurally cannot.
- It is genuinely privacy-light: opaque random token, first-party, clearable, no cross-site
  correlation, no fingerprinting — the lightest mechanism that still answers "all-time unique."
- It keeps the server simple (count distinct tokens) and the client honest (one tiny localStorage
  read).
- We label the metric **"listeners," not "people,"** and accept the known over-count. For a vanity
  texture stat this is the right honesty/effort trade.

**If Daniel wants the stronger "stores nothing" posture, Option B is the fallback** — at the cost of
the metric becoming daily-unique and noisier. The two are not mutually exclusive long-term (A for
all-time reach, B-style for daily actives) but v1 should pick one. **This is D5.**

**Either way, unique-listeners is the deferred §6 stretch** — the play/share counters (which need no
`anonId` at all) ship first and stand alone.

---
## 4. Storage & aggregation model

### 4.1 Event capture: log vs. counter vs. both

Three shapes:

1. **Rolled-up counters only.** One row per (track, bucket) with an integer count, incremented on each
   event. Tiny storage, trivial read, no history, no unique-listener support (can't count distinct
   without storing the distinct tokens). Pure "90s hit counter."
2. **Append-only event log only.** One row per event. Full fidelity, supports any future metric
   (unique listeners, time-of-day, channel splits, retention), but every read is an aggregation query
   over a growing table and write volume = play volume.
3. **Both — event log + periodic/online rollup.** Events append to a log; counters are maintained
   (either incrementally on write, or by a periodic aggregation pass) for the hot read path.

**Recommendation: a lean version of (3) — an append-only event log as the source of truth, plus a
small rolled-up counter table for the home-card hot read.** Reasoning:

- The home Plays card is read on every public-site landing; it must be a single cheap indexed read,
  not a `COUNT(*)` over an event table. That argues for a counter.
- But unique-listeners *requires* retaining distinct tokens, and "which mixes get finished" texture
  wants the bucket/per-target fidelity — both argue for a log.
- (3) gets both without forcing a premature choice, and matches the existing `HomeStatsDto` pattern
  (the card reads a pre-aggregated DTO; it never queries raw).

For v1 the rollup can be **incremental-on-write** (the event-write transaction also bumps the counter
row) to avoid standing up a background job. If write volume ever makes that contended, a periodic
aggregation pass is the escape hatch — but for a collective-scale site, incremental is fine and
simplest. (**D6**: incremental-on-write rollup vs. periodic-batch rollup. Recommend incremental for
v1.)

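As an illustration only, the incremental-on-write shape might look like the sketch below. It assumes SQLite through Python's standard `sqlite3` module rather than the project's actual data layer, and the helper names (`open_db`, `record_play`) and the trimmed-down columns are hypothetical; the real schema remains the staff engineer's call.

```python
import sqlite3

# Bucket names follow the spec; everything else here is illustrative.
BUCKETS = ("partial", "sampled", "complete")


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a minimal log table and counter table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE play_event (
            id         INTEGER PRIMARY KEY,
            track_id   INTEGER NOT NULL,
            bucket     TEXT    NOT NULL,
            created_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TABLE play_counter (
            track_id       INTEGER PRIMARY KEY,
            partial_count  INTEGER NOT NULL DEFAULT 0,
            sampled_count  INTEGER NOT NULL DEFAULT 0,
            complete_count INTEGER NOT NULL DEFAULT 0
        );
        """
    )
    return conn


def record_play(conn: sqlite3.Connection, track_id: int, bucket: str) -> None:
    """Append one event and bump the matching counter in a single transaction."""
    if bucket not in BUCKETS:
        raise ValueError(f"unknown bucket: {bucket!r}")
    column = f"{bucket}_count"  # safe to interpolate: bucket was checked above
    with conn:  # commits both writes together, or rolls both back on error
        conn.execute(
            "INSERT INTO play_event (track_id, bucket) VALUES (?, ?)",
            (track_id, bucket),
        )
        conn.execute(
            f"INSERT INTO play_counter (track_id, {column}) VALUES (?, 1) "
            f"ON CONFLICT(track_id) DO UPDATE SET {column} = {column} + 1",
            (track_id,),
        )


if __name__ == "__main__":
    db = open_db()
    for b in ("complete", "complete", "partial"):
        record_play(db, 7, b)
    print(db.execute("SELECT * FROM play_counter").fetchall())
```

Because the log append and the counter bump share one transaction, the counter cannot drift from the log on a failed write. A periodic-batch variant would drop the second statement and rebuild the counters from the log instead.
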
### 4.2 Conceptual SQL shape (not full schema — that's staff-engineer's)

In `DeepDrftData`, new tables roughly:

- **`play_event`** (the log): `id`, `track_entry_key` (or `track_id` FK), `release_id` (resolved at
  write, §2.3), `bucket` (enum: partial/sampled/complete), `anon_id` (nullable — present only when
  unique-listeners is on and the listener didn't opt out), `created_at`. Indexed on
  `(track_id)`, `(release_id)`, and `(anon_id)` for the distinct-count query.
- **`share_event`** (the log): `id`, `target_type` (track/release), `target_id`, `channel`
  (link/embed), `anon_id?`, `created_at`.
- **`play_counter`** (the rollup, optional in v1 if reads stay cheap): per (track_id) and per
  (release_id), columns for `partial_count`, `sampled_count`, `complete_count`, `share_count`, and a
  `total_plays` derived/stored. This is what the home-stats aggregation reads.

`anon_id` is the only field that touches anonymity, it is nullable, and it is never joined to any
identity table because none exists. Dropping the unique-listener feature later = stop writing the
column; the play/share counts are unaffected.

**Note on the dual-database split:** this is **all SQL** (it's metadata/counters, not binary content).
The FileDatabase vault is not involved. Aggregation lives in `TrackRepository` alongside
`GetHomeStatsAsync`; the API surface lives in `DeepDrftAPI`.

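To make the log's two read shapes concrete, here is a rough sketch of derived release plays and a distinct-listener count over a `play_event`-like table. It again assumes SQLite via Python's `sqlite3` for illustration; `make_db`, `release_plays`, and `unique_listeners` are hypothetical names, and the column set is a simplification of the list above.

```python
import sqlite3
from typing import Iterable, Optional, Tuple

# (track_id, release_id, bucket, anon_id); anon_id may be None
EventRow = Tuple[int, int, str, Optional[str]]


def make_db(rows: Iterable[EventRow]) -> sqlite3.Connection:
    """Build an in-memory event log with the indexes the reads below rely on."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE play_event (
            id         INTEGER PRIMARY KEY,
            track_id   INTEGER NOT NULL,
            release_id INTEGER NOT NULL,
            bucket     TEXT    NOT NULL,
            anon_id    TEXT
        );
        CREATE INDEX ix_play_event_release ON play_event (release_id);
        CREATE INDEX ix_play_event_anon    ON play_event (anon_id);
        """
    )
    conn.executemany(
        "INSERT INTO play_event (track_id, release_id, bucket, anon_id) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def release_plays(conn: sqlite3.Connection, release_id: int) -> int:
    """Release plays are derived: every play of any of the release's tracks."""
    row = conn.execute(
        "SELECT COUNT(*) FROM play_event WHERE release_id = ?", (release_id,)
    ).fetchone()
    return row[0]


def unique_listeners(conn: sqlite3.Connection, release_id: int) -> int:
    """COUNT(DISTINCT ...) skips NULLs, so events without a token are ignored."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT anon_id) FROM play_event WHERE release_id = ?",
        (release_id,),
    ).fetchone()
    return row[0]
```

The nullable `anon_id` needs no special handling on the read side: rows written without a token still count as plays but never as listeners, which is the behaviour the paragraph above describes for dropping the feature later.
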
### 4.3 New API surface (sketch)

Writes (unauthenticated, rate-limited, proxied through `DeepDrftPublic`):

- `POST api/event/play` — body `{ trackEntryKey, bucket, anonId? }`; returns `202`. Resolves release
  server-side, appends to `play_event`, bumps `play_counter`.
- `POST api/event/share` — body `{ targetType, targetKey, channel, anonId? }`; returns `202`.

Reads (unauthenticated, mirror `GET api/stats/home`):

- Extend **`GET api/stats/home`** to include the site-wide play total (and optionally share total)
  for the home card — *the minimal payoff* (§5). The card already does one round-trip to this
  endpoint; adding fields is the smallest possible change.
- `GET api/stats/track/{entryKey}` and `GET api/stats/release/{entryKey}` — per-target
  play/share/listener figures, for the (future) detail-page display. **Not required for the
  home-card payoff**; build when a detail-page surface wants them.

CMS-facing reads (the channel splits, bucket breakdowns, per-track leaderboards) are a separate,
later, `DeepDrftManager`-side concern — explicitly out of v1 scope, flagged adjacent.

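For a sense of how small the play-write contract is, the following sketch validates a `POST api/event/play` body before anything is stored. It is written in Python for illustration only; the `PlayEvent` type and `parse_play_event` helper are hypothetical, and what the endpoint does with a rejected body is an assumption, since the spec above only fixes the `202` success response.

```python
import json
from dataclasses import dataclass
from typing import Optional

BUCKETS = {"partial", "sampled", "complete"}


@dataclass(frozen=True)
class PlayEvent:
    track_entry_key: str
    bucket: str
    anon_id: Optional[str] = None  # absent until unique listeners ships


def parse_play_event(raw: bytes) -> Optional[PlayEvent]:
    """Return a PlayEvent for a well-formed body, or None if it should be rejected."""
    try:
        body = json.loads(raw)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return None
    if not isinstance(body, dict):
        return None

    key = body.get("trackEntryKey")
    bucket = body.get("bucket")
    anon_id = body.get("anonId")

    if not isinstance(key, str) or not key:
        return None
    if not isinstance(bucket, str) or bucket not in BUCKETS:
        return None
    if anon_id is not None and not isinstance(anon_id, str):
        return None
    # Note: no release field is accepted; release is resolved server-side.
    return PlayEvent(track_entry_key=key, bucket=bucket, anon_id=anon_id)
```
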
---

## 5. The Plays card payoff

Today `NowPlayingStats.razor`'s third card renders `XXX / Plays (Coming Soon)` — a static odometer
placeholder. Once this phase lands, the card shows a **real site-wide play total** in the same
odometer treatment (the "90s visitor counter" vibe is *already the intended aesthetic* — the
placeholder is literally styled as an odometer). The minimal, correct payoff:

- **Primary figure:** total plays site-wide (sum across all tracks), rendered in the odometer.
- **Secondary line (optional, recommend yes):** something with texture that fits the card's existing
  two-line shape (the other two cards both have a primary + secondary). Candidates: total shares
  ("N shared"), or completion rate ("N% finished" = complete / total), or unique listeners once the
  stretch lands. **Recommend completion-rate or share-count for v1**, swapping to listeners if/when
  §6 ships. (**D7** — what's the card's secondary line.)

Mechanically: add `TotalPlays` (and the chosen secondary) to `HomeStatsDto`, populate it in
`TrackRepository.GetHomeStatsAsync` from `play_counter`, and the card reads it through the existing
`IStatsDataService.GetHomeStats()` path — **the same persistent-state-bridged single round-trip the
other two cards already use.** No new client data path; the card just stops being static.

This is deliberately the *first* visible payoff and the *smallest* — the whole counter substrate earns
its keep the moment the home number goes live, before any per-track surface or unique-listener work.

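The arithmetic behind the primary figure and the two v1 secondary candidates is small enough to sketch. The Python below is illustrative only: `CounterRow`, `HomePlayStats`, and `summarize` are hypothetical names, and it assumes the input is the per-track counter rows only, so that per-release rows (which are sums of the same plays) are not counted twice.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CounterRow:
    """One per-track rollup row, mirroring the play_counter columns."""
    partial_count: int
    sampled_count: int
    complete_count: int
    share_count: int


@dataclass(frozen=True)
class HomePlayStats:
    total_plays: int
    total_shares: int
    completion_rate: Optional[float]  # None until at least one play exists


def summarize(track_rows: Iterable[CounterRow]) -> HomePlayStats:
    """Fold per-track counters into the figures the home card would show."""
    total = complete = shares = 0
    for row in track_rows:
        # Headline plays = sum of all three buckets.
        total += row.partial_count + row.sampled_count + row.complete_count
        complete += row.complete_count
        shares += row.share_count
    # "N% finished" = complete / total; avoid dividing by zero on a fresh site.
    rate = complete / total if total else None
    return HomePlayStats(total_plays=total, total_shares=shares, completion_rate=rate)
```
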
---

## 6. Suggested phasing

Sequenced so the visible payoff lands early and the privacy-sensitive stretch lands last. Each wave is
independently shippable.

- **16.A — Play & share counters (core).** The whole spine: player-service play tracker (§2.1) with
  the three-bucket classification (§1a) and engagement floor (§1d/D2); share tracker in `SharePopover`
  (§1b); `sendBeacon` interop + `POST api/event/{play,share}` endpoints (§2.2) with rate-limiting
  (§2.5); `play_event`/`share_event` log + incremental `play_counter` rollup (§4); server-side release
  resolution (§2.3, D4). **No `anonId` written yet** (unique listeners is 16.C). **Free-floating —
  the cold-start wave; nothing gates it.**
- **16.B — Home Plays-card payoff (§5).** Extend `HomeStatsDto` + `GetHomeStatsAsync` with `TotalPlays`
  (+ chosen secondary, D7); flip `NowPlayingStats`'s third card from placeholder to live. **Depends on
  16.A** (needs the counter to read). This is the visible win — sequence it immediately after 16.A.
- **16.C — Unique listeners (stretch / "plus").** The anonymity mechanism (§3, D5 — recommend Option
  A): mint/read the anon token, thread it onto event payloads, count distinct server-side, expose via
  the per-target stats reads and/or as the home card's secondary line. **Depends on 16.A** (extends the
  event payload + storage). **Explicitly lower priority** — 16.A+16.B deliver the agreed core; 16.C is
  the agreed stretch. Can be deferred indefinitely without stranding anything.
- **16.D — Per-target stats surfaces (adjacent, not committed).** Detail-page play/share/listener
  display via `GET api/stats/{track,release}/{key}`; CMS analytics views (bucket splits, channel
  splits, leaderboards). **Speculative** — flagged for when a surface actually wants these. Not part of
  the agreed scope; listed so the substrate (the event log) is understood to already support it.

**Hard dependencies:** `16.A → 16.B`; `16.A → 16.C`; `16.A → 16.D`. 16.A is the only cold-start wave.
16.B and 16.C are parallel after 16.A (B is the priority; C is the stretch).

**Verify-before-build:** confirm the `DeepDrftPublic` proxy can host a `POST api/event/*` write route
with the same proxy idiom as `api/track/*` (the WASM client cannot reach `DeepDrftAPI` directly). This
is the one infrastructural assumption the spec rests on; cheap to confirm, blocking if wrong.

---

## 10. Open product decisions (for Daniel)

Resolve these before 16.A is decomposed. Recommendations carried from the body; the call is Daniel's.

- **D1 — Middle-band (30–80%) treatment.** Recommend a third `sampled` bucket (exhaustive
  partial/sampled/complete); headline plays = sum of all three. Alt: fold middle into a binary
  partial/complete. (§1a)
- **D2 — Engagement floor before a play counts.** Recommend a floor (~≥3s or ≥5% of duration) so
  archive-skimming doesn't inflate the count. Alt: floor = 0, every started playback counts. (§1d)
- **D3 — Unique-listener window.** All-time vs. rolling-30-day. Recommend all-time (fits the "N
  listeners reached" framing and Option-A mechanism). (§1c) — only bites if 16.C is built.
- **D4 — Release plays: derived or separately counted.** Recommend derived (release plays = sum of its
  tracks' plays), correct for multi-track Cuts and single-track Session/Mix alike. (§2.3)
- **D5 — Unique-listener anonymity mechanism.** Recommend Option A (client-minted random first-party
  localStorage id, metric honestly labelled "listeners," all-time). Fallback Option B (server-derived
  salted daily token — stores nothing client-side but only yields daily-unique). Option C
  (fingerprint) rejected. **This is the headline decision.** (§3)
- **D6 — Rollup strategy.** Recommend incremental-on-write counter for v1 (no background job). Alt:
  periodic batch aggregation. (§4.1)
- **D7 — Home Plays-card secondary line.** Recommend completion-rate or total-shares for v1, swapping
  to unique listeners if/when 16.C ships. (§5)

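How the D1 and D2 recommendations combine can be sketched as a single classification step, run when a play session closes. The Python below is a hedged illustration, not the tracker itself: `classify_play` and the constant names are hypothetical, the floor is read as "either condition suffices", and treating exactly 30% as `sampled` and exactly 80% as `complete` is an assumption, since the recommendations above do not fix the boundaries.

```python
from typing import Optional

# D2 recommendation (assumed reading: meeting either threshold is enough).
FLOOR_SECONDS = 3.0
FLOOR_FRACTION = 0.05

# D1 recommendation: the 30-80% middle band becomes its own bucket.
SAMPLED_FROM = 0.30   # assumption: lower bound inclusive
COMPLETE_FROM = 0.80  # assumption: lower bound inclusive


def classify_play(high_water_seconds: float, duration_seconds: float) -> Optional[str]:
    """Map a session's furthest-reached position to a bucket, or None if below the floor."""
    if duration_seconds <= 0:
        return None  # unknown duration: nothing sensible to classify
    fraction = min(high_water_seconds / duration_seconds, 1.0)

    meets_floor = high_water_seconds >= FLOOR_SECONDS or fraction >= FLOOR_FRACTION
    if not meets_floor:
        return None  # too brief to count as a play at all

    if fraction >= COMPLETE_FROM:
        return "complete"
    if fraction >= SAMPLED_FROM:
        return "sampled"
    return "partial"
```

Under these assumptions a one-second skim of a ten-minute mix records nothing, ten seconds of it records a `partial`, and reaching the four-minute mark records a `sampled`. Choosing the D2 alternative (floor = 0) amounts to setting both floor constants to zero.
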
## Working with this spec

- Mirrors the established product-notes convention (cross-refs up top, numbered sections, decisions
  called out, phasing with explicit dependencies) — see `phase-15-visualizer-controls-enhancements.md`
  / `phase-11-public-site-enhancements.md`.
- When Daniel resolves the D-decisions, fold the resolutions inline (mark the section "resolved by
  Daniel YYYY-MM-DD") and flip the status to design-complete, the same way Phase 15 did.
- When waves land, doc-keeper moves them to `COMPLETED.md`; the per-wave bodies are written to travel
  cleanly.