11 Commits

Author SHA1 Message Date
daniel-c-harvey 1fdbec2533 Merge cors-manager-origin into dev
Deploy DeepDrftAPI / Build, Publish & Bundle (push) Successful in 2m15s
Package install tarball / package (push) Successful in 6s
Deploy DeepDrftAPI / Deploy (push) Successful in 1m35s
2026-06-23 08:21:33 -04:00
daniel-c-harvey 70842cb576 docs: add production install checklist 2026-06-23 08:15:56 -04:00
daniel-c-harvey f2a0d39521 config: add app.deepdrft.com to API CORS allowlist 2026-06-23 08:15:55 -04:00
daniel-c-harvey 1bda2b7bea docs: reflect Phase 23 SEO crawl directives as landed
Deploy DeepDrftManager / Build & Publish (push) Successful in 1m29s
Deploy DeepDrftPublic / Build & Publish (push) Successful in 4m7s
Deploy DeepDrftManager / Deploy (push) Successful in 1m23s
Deploy DeepDrftPublic / Deploy (push) Successful in 1m28s
2026-06-23 07:40:57 -04:00
daniel-c-harvey 8773803712 feature: og default image 2026-06-23 07:40:42 -04:00
daniel-c-harvey 3cc11bcbb5 Merge p23-w1-t2-cms-noindex into dev
Phase 23 Track B: make DeepDrftManager uncrawlable — static robots.txt (Disallow: /) + blanket noindex meta in the CMS head. No env gate; the CMS is always uncrawlable.
2026-06-23 07:36:01 -04:00
daniel-c-harvey 0ba4fc6597 Merge p23-w1-t1-public-crawl-endpoints into dev
Phase 23 Track A: env-gated /robots.txt + /sitemap.xml on DeepDrftPublic. Thin controller + pure builders, reuses api/release + ReleaseRoutes + SeoOptions.BaseUrl. Non-prod uncrawlable; sitemap loc equals page canonical by construction.
2026-06-23 07:35:52 -04:00
daniel-c-harvey 7a0ccdd784 fix: correct WalkPageSize to 100 (actual server PageSize cap) and update comment 2026-06-23 07:33:24 -04:00
daniel-c-harvey ca057dc630 chore: make DeepDrftManager uncrawlable and noindex (Phase 23.3)
Static robots.txt (Disallow: /) in wwwroot + blanket noindex meta in App.razor head. No env gate — the CMS is always uncrawlable. Defense in depth per spec OQ-C1.
2026-06-23 07:23:49 -04:00
daniel-c-harvey 5f4807cc4a feature: Phase 23 Track A — env-gated /robots.txt + /sitemap.xml public crawl endpoints 2026-06-23 07:23:42 -04:00
daniel-c-harvey 9a4b79d377 docs: spec Phase 23 — SEO crawl directives (sitemap.xml, robots.txt, CMS noindex) 2026-06-23 07:10:20 -04:00
16 changed files with 964 additions and 6 deletions
+2 -2
@@ -8,9 +8,9 @@ DeepDrftHome is a **net10.0** solution consisting of ten projects implementing a
### Core Projects
-- **DeepDrftPublic**: ASP.NET Core host. Blazor Web App with Server + WASM render modes. Owns browser-facing proxy controller for `api/track/*` (metadata listing and audio streaming), MudBlazor theme prerender, and TypeScript→JS audio interop. Public-facing site for listeners.
+- **DeepDrftPublic**: ASP.NET Core host. Blazor Web App with Server + WASM render modes. Owns browser-facing proxy controller for `api/track/*` (metadata listing and audio streaming), crawl-directive endpoints (`GET /robots.txt` and `GET /sitemap.xml`, environment-gated via `IWebHostEnvironment.IsProduction()` directly — server-side only, no PersistentState bridge — served by `CrawlDirectiveController` with pure builders in `Seo/RobotsTxt.cs` and `Seo/SitemapXml.cs`), MudBlazor theme prerender, and TypeScript→JS audio interop. Public-facing site for listeners.
- **DeepDrftPublic.Client**: Blazor WebAssembly assembly. All interactive UI (pages, player stack, dark-mode plumbing, HTTP clients for both backends). Pages include the public `/about` editorial page (`Pages/About.razor` — three-movement **"Liner Notes"** editorial treatment: numbered left-rail (oversized Bodoni numerals + vertical hairline spine + mono marginalia captions), asymmetric content column, pull-quotes breaking into the margin, hand-authored SVG waveform movement dividers (self-contained motif, not the live `WaveformVisualizer`), and stacked editorial definition list for CUTS/SESSIONS/MIXES; active-movement highlight via `about-rail.ts` IntersectionObserver interop; registered in `Layout/Pages.cs`). Home hero stat row (`NowPlayingStats.razor`) is live-data-backed via `IStatsDataService` / `StatsClient` (named `"DeepDrft.API"` client) with a `PersistentComponentState` prerender bridge; `RuntimeFormat` helper converts mix runtime seconds to `hh:mm`.
**SEO component** (`Controls/SeoHead.razor` + `Common/SeoModel`, `SeoJsonLd`, `SeoOptions`, `SeoUrls`, `SeoEnvironment`): `SeoHead` is a presentational `<HeadContent>` emitter (one line per page, no fetch); `SeoModel` named factories (`ForRelease`/`ForHome`/`ForAbout`/`ForBrowse`/`ForNotFound`) encode the medium→schema.org mapping in one place; `SeoJsonLd` builds typed JSON-LD (MusicGroup / MusicAlbum+LiveAlbum / MusicRecording / CollectionPage) with inline-safe escaping; `SeoOptions` holds site-wide config (`BaseUrl https://deepdrft.com`, title suffix, default OG image seam, IG `sameAs`) registered via the static `Startup` seam; `SeoEnvironment` is a scoped `[PersistentState]` bridge (mirrors `DarkModeSettings`) seeded in `DeepDrftPublic/Components/App.razor` from `IWebHostEnvironment.IsProduction()` — robots defaults to `index,follow` only in Production, `noindex,nofollow` everywhere else (fail-safe is noindex); per-page `SeoModel.Robots` overrides the default. Tags are present in prerendered HTML (rides the existing `PersistentComponentState` bridge; no new fetch). Canonical/OG origins come from `SeoOptions.BaseUrl` (config), not `window.location` — no `window` at server prerender and the origin cannot be derived behind the nginx proxy. Consumed by the public site.
- **DeepDrftManager**: ASP.NET Core host. Blazor Web App with server-rendered `InteractiveServer` render mode. Hosts all CMS Razor components and pages under `Components/Pages/Cms/`, `Components/Pages/Tracks/`, `Components/Layout/CmsLayout.razor`, and `Components/Shared/` (all inlined from the former `DeepDrftCms` RCL). Public entry point: `Components/Pages/Home.razor` (`@page "/"`, no `[Authorize]`, uses lean `CmsHomeLayout`) — unauthenticated visitors see a DeepDrft-branded splash with a Login CTA; authenticated admins are redirected to `/catalogue` via `RedirectToCatalogue`. `Routes.razor` resolves `DefaultLayout` from the cascaded `Task<AuthenticationState>`: unauthenticated → `CmsHomeLayout`, authenticated → `CmsLayout`; this means the AuthBlocks `Login`/`Register` pages (which declare no `@layout`) render in the lean layout for unauthenticated visitors. `CmsLayout` carries a left `MudDrawer` (app-bar hamburger toggle) holding the CMS destinations (Catalogue `/catalogue`, Releases `/releases`, Upload `/tracks/upload`), the AuthBlocks `UserAdminMenu` fragment (self-gates to `UserAdmin`+, links Users/Registrations/Permissions), and a "Provision User" link to `/useradmin/users/new` wrapped in a `HierarchicalRoleAuthorizeView` (`UserAdmin`-gated) — making the AuthBlocks user-administration surface reachable from the CMS UI. The catalogue dashboard (`Components/Pages/Index.razor`) lives at `@page "/catalogue"` and remains `[Authorize]`-gated with `CmsLayout`; its cards are **CUTS / SESSIONS / MIXES**, each deep-linking to `/releases?medium=<medium>` with the matching tab pre-selected. 
The consolidated browse surface is `Components/Pages/Tracks/Releases.razor` (`@page "/releases"`): bulk-action buttons (Generate All Profiles / Backfill High-res) → medium tab strip (ALL / CUTS / SESSIONS / MIXES) → the active tab's grid; waveform columns (Profile / High-res) — each showing a status icon when a datum is present and an always-visible generate/regenerate button — and per-track info tooltip live in `CmsAlbumBrowser`'s expanded child-row track table. Old list routes `/tracks`, `/tracks/albums`, `/tracks/archive` are kept as aliases on `Releases.razor` so bookmarks don't 404; operational sub-routes (`/tracks/upload`, edit routes, etc.) remain at `/tracks/*`. Gated by AuthBlocks login and hierarchical `Admin` role authorization. All track operations (upload, metadata read/write, delete, replace audio) are HTTP proxies via `ICmsTrackService` / `CmsTrackService` injected directly into Blazor components; no in-process data layer. The per-track "Replace audio" affordance in `BatchEdit` / `BatchTrackList` / `BatchTrackDetail` swaps the vault bytes, regenerates both waveform datums server-side, and re-derives `DurationSeconds` from the new audio; the track id, `EntryKey`, release membership, position, and all other metadata are preserved. The remove control on a persisted track is hidden when it is the release's sole remaining persisted track — a release can reach zero live tracks only via replace or release-level delete, not per-track removal. Two named HttpClients: `DeepDrft.Content.Cms` (bounded 100 s default, for all non-upload calls) and `DeepDrft.Content.Cms.Upload` (`InfiniteTimeSpan`, for large WAV uploads). Upload progress and idle/heartbeat timeout are driven by a single `ProgressStreamContent` wrapper (`Services/ProgressStreamContent.cs`); `CmsTrackService.UploadTrackAsync` adds a two-phase cancellation (idle window resets per progress tick; separate response-wait budget arms when the body completes). 
The upload form is create-only: `BatchUpload.razor` calls `GET api/track/release/exists` as a pre-flight before transferring bytes and blocks the submit with a visible message if a (title, artist) match already exists; the server also rejects duplicates with 409. The authenticated user's id (`NameIdentifier` claim) is captured once into `_createdByUserId` at component initialization (`OnInitializedAsync`) — not re-read at submit — so a mid-session token expiry cannot discard a long-composed release; the page is `[Authorize]`-gated and runs `prerender: false`, so the auth state is fully available at init and only one init pass occurs. Within-batch multi-track Cuts still work by passing the release id from row 1 as `releaseId` on rows 2..N (the ATTACH path), while `BatchEdit.razor` uses the same ATTACH path for its legitimate adds-to-existing-release. - **DeepDrftManager**: ASP.NET Core host. Blazor Web App with server-rendered `InteractiveServer` render mode. **Always uncrawlable**: a static `wwwroot/robots.txt` (`Disallow: /`, no env gate) plus a blanket `<meta name="robots" content="noindex,nofollow">` in `Components/App.razor` — defense in depth so the CMS is never indexed regardless of how it is discovered. Hosts all CMS Razor components and pages under `Components/Pages/Cms/`, `Components/Pages/Tracks/`, `Components/Layout/CmsLayout.razor`, and `Components/Shared/` (all inlined from the former `DeepDrftCms` RCL). Public entry point: `Components/Pages/Home.razor` (`@page "/"`, no `[Authorize]`, uses lean `CmsHomeLayout`) — unauthenticated visitors see a DeepDrft-branded splash with a Login CTA; authenticated admins are redirected to `/catalogue` via `RedirectToCatalogue`. `Routes.razor` resolves `DefaultLayout` from the cascaded `Task<AuthenticationState>`: unauthenticated → `CmsHomeLayout`, authenticated → `CmsLayout`; this means the AuthBlocks `Login`/`Register` pages (which declare no `@layout`) render in the lean layout for unauthenticated visitors. 
`CmsLayout` carries a left `MudDrawer` (app-bar hamburger toggle) holding the CMS destinations (Catalogue `/catalogue`, Releases `/releases`, Upload `/tracks/upload`), the AuthBlocks `UserAdminMenu` fragment (self-gates to `UserAdmin`+, links Users/Registrations/Permissions), and a "Provision User" link to `/useradmin/users/new` wrapped in a `HierarchicalRoleAuthorizeView` (`UserAdmin`-gated) — making the AuthBlocks user-administration surface reachable from the CMS UI. The catalogue dashboard (`Components/Pages/Index.razor`) lives at `@page "/catalogue"` and remains `[Authorize]`-gated with `CmsLayout`; its cards are **CUTS / SESSIONS / MIXES**, each deep-linking to `/releases?medium=<medium>` with the matching tab pre-selected. The consolidated browse surface is `Components/Pages/Tracks/Releases.razor` (`@page "/releases"`): bulk-action buttons (Generate All Profiles / Backfill High-res) → medium tab strip (ALL / CUTS / SESSIONS / MIXES) → the active tab's grid; waveform columns (Profile / High-res) — each showing a status icon when a datum is present and an always-visible generate/regenerate button — and per-track info tooltip live in `CmsAlbumBrowser`'s expanded child-row track table. Old list routes `/tracks`, `/tracks/albums`, `/tracks/archive` are kept as aliases on `Releases.razor` so bookmarks don't 404; operational sub-routes (`/tracks/upload`, edit routes, etc.) remain at `/tracks/*`. Gated by AuthBlocks login and hierarchical `Admin` role authorization. All track operations (upload, metadata read/write, delete, replace audio) are HTTP proxies via `ICmsTrackService` / `CmsTrackService` injected directly into Blazor components; no in-process data layer. 
The per-track "Replace audio" affordance in `BatchEdit` / `BatchTrackList` / `BatchTrackDetail` swaps the vault bytes, regenerates both waveform datums server-side, and re-derives `DurationSeconds` from the new audio; the track id, `EntryKey`, release membership, position, and all other metadata are preserved. The remove control on a persisted track is hidden when it is the release's sole remaining persisted track — a release can reach zero live tracks only via replace or release-level delete, not per-track removal. Two named HttpClients: `DeepDrft.Content.Cms` (bounded 100 s default, for all non-upload calls) and `DeepDrft.Content.Cms.Upload` (`InfiniteTimeSpan`, for large WAV uploads). Upload progress and idle/heartbeat timeout are driven by a single `ProgressStreamContent` wrapper (`Services/ProgressStreamContent.cs`); `CmsTrackService.UploadTrackAsync` adds a two-phase cancellation (idle window resets per progress tick; separate response-wait budget arms when the body completes). The upload form is create-only: `BatchUpload.razor` calls `GET api/track/release/exists` as a pre-flight before transferring bytes and blocks the submit with a visible message if a (title, artist) match already exists; the server also rejects duplicates with 409. The authenticated user's id (`NameIdentifier` claim) is captured once into `_createdByUserId` at component initialization (`OnInitializedAsync`) — not re-read at submit — so a mid-session token expiry cannot discard a long-composed release; the page is `[Authorize]`-gated and runs `prerender: false`, so the auth state is fully available at init and only one init pass occurs. Within-batch multi-track Cuts still work by passing the release id from row 1 as `releaseId` on rows 2..N (the ATTACH path), while `BatchEdit.razor` uses the same ATTACH path for its legitimate adds-to-existing-release.
- **DeepDrftShared.Client**: Razor Class Library. Shared Blazor components consumed by both `DeepDrftPublic` and `DeepDrftManager` for consistency across public and admin surfaces.
- **DeepDrftData**: Class library. EF Core domain logic: `DeepDrftContext`, `TrackConfiguration`, `Migrations`, `TrackRepository`, `TrackService`, `TrackManager`. Consumed by `DeepDrftAPI` and tests.
- **DeepDrftAPI**: ASP.NET Core host. Dual-database authority (SQL metadata + FileDatabase binary). AuthBlocks API host (owns registration, migration/seed, JWT endpoints). Track endpoints: streaming, vault write, upload+persist, delete+cleanup, paged list with filters, single metadata (ApiKey-gated operations), metadata update, waveform profiles (512-bucket seeker + per-track high-res visualizer datum in the `track-waveforms` vault), release-track join operations, `POST api/track/duration/backfill` (ApiKey-gated one-time backfill of `DurationSeconds` for existing rows from vault audio). Stats endpoints: `GET api/stats/home` (unauthenticated; returns `HomeStatsDto` with cut track count, per-`ReleaseType` cut release counts, mix release count, and total mix runtime seconds). Release endpoints: paged list with medium filter, single read, session hero-image upload (all unauthenticated reads; authenticated writes via ApiKey).
Image endpoints: authenticated upload, unauthenticated streaming.
+19
@@ -6,6 +6,25 @@ Newest entries at the top. Group by phase/wave header (mirroring `PLAN.md` / `CM
---
## Phase 23 — SEO Crawl Directives (landed 2026-06-23)
**Landed:** 2026-06-23 on dev.
- **What:** Server-side crawl-directive endpoints for `DeepDrftPublic` (`GET /robots.txt` and `GET /sitemap.xml`) plus a defense-in-depth noindex layer for `DeepDrftManager`. The endpoint/file-shaped follow-on to Phase 22's per-page `SeoHead` component. Phase 22 is the *content* of discoverability; Phase 23 is the *directives* layer above it — telling crawlers **which** pages exist and **whether** to crawl at all. No new `DeepDrftAPI` endpoint, no schema change.
- **Why:** Without robots.txt a crawler has no machine-readable signal about which routes to include or exclude (e.g. `/FramePlayer`, `/api/*`). Without sitemap.xml Google/Bing must discover release detail pages by link-following alone. Without noindex/robots protection the CMS could be inadvertently crawled if an admin link ever appeared on a public page.
- **Shape:**
- **`DeepDrftPublic/Controllers/CrawlDirectiveController.cs`** (new): thin controller serving both endpoints. Reads `IWebHostEnvironment.IsProduction()` **directly** — no `SeoEnvironment` PersistentState bridge needed because these are server-side only (nothing crosses the server→WASM seam). Env gate is fail-safe closed: non-production robots.txt emits `Disallow: /` and the sitemap returns 404.
- **`DeepDrftPublic/Seo/RobotsTxt.cs`** (new): pure builder for the robots.txt body. Production: `Allow: /` + `Disallow: /FramePlayer` + `Disallow: /api/` + `Sitemap:` pointer. Non-production: `Disallow: /`.
- **`DeepDrftPublic/Seo/SitemapXml.cs`** (new): pure builder for the sitemap XML body. Walks `GET api/release` (server-to-server via the existing `"DeepDrft.API"` named client, paged) and emits a sitemaps.org `urlset`. Six explicit static roots (`/`, `/about`, `/cuts`, `/sessions`, `/mixes`, `/archive`) plus one `<url>` per release — `<loc>` = `SeoOptions.BaseUrl` + `ReleaseRoutes.DetailHref`, equal to the page's `SeoHead` canonical by construction; `<lastmod>` from `ReleaseDate`. Resilient: a partial/failed release read yields a well-formed roots-only document, never a 500.
- **`DeepDrftManager/wwwroot/robots.txt`** (new static file): `Disallow: /` with no environment gate — the CMS is always uncrawlable, including in production.
- **`DeepDrftManager/Components/App.razor`** (updated): blanket `<meta name="robots" content="noindex,nofollow">` in the CMS host `<head>` — defense in depth against de-indexing URLs discovered via external links, complementing the robots.txt directive.
- **Design memo:** `product-notes/phase-23-seo-crawl-directives.md`.
---
## Phase 22 — SEO Metadata Component (landed 2026-06-23)
**Landed:** 2026-06-23 on dev.
+2 -1
@@ -16,7 +16,8 @@
"https://localhost:5004",
"http://localhost:5003",
"https://deepdrft.com",
-"https://www.deepdrft.com"
+"https://www.deepdrft.com",
+"https://app.deepdrft.com"
]
},
"ForwardedHeaders": {
+1
@@ -13,6 +13,7 @@
<link rel="stylesheet" href="@Assets["_content/DeepDrftShared.Client/styles/deepdrft-tokens.css"]" />
<ImportMap />
<link rel="icon" type="image/ico" href="deepdrft-logo.ico" />
<meta name="robots" content="noindex,nofollow" />
<HeadOutlet @rendermode="ServerMode" />
</head>
+2
@@ -0,0 +1,2 @@
User-agent: *
Disallow: /
+8 -3
@@ -6,12 +6,15 @@ See the root `CLAUDE.md` for full architecture overview. This file covers what i
## One-line purpose
-The Blazor Web App host. Owns a browser-facing proxy controller for `api/track/*` (metadata and audio streaming), MudBlazor theme prerender, and TypeScript→JS audio interop.
+The Blazor Web App host. Owns a browser-facing proxy controller for `api/track/*` (metadata and audio streaming), crawl-directive endpoints (`/robots.txt` + `/sitemap.xml`), MudBlazor theme prerender, and TypeScript→JS audio interop.
## What lives here now (only)
- `Program.cs`, `Startup.cs`: HTTP host config, DI wiring, port binding.
- `Controllers/TrackProxyController.cs`: Thin proxy controller at `[Route("api/track")]`. Two actions: `GET api/track/page` (proxies paged track metadata) and `GET api/track/{trackId}` (proxies audio streaming without buffering, forwards `offset` query param for seek-beyond-buffer). Uses `RegisterForDispose` for clean connection cleanup.
- `Controllers/CrawlDirectiveController.cs`: Second controller; serves `GET /robots.txt` and `GET /sitemap.xml`. Reads `IWebHostEnvironment.IsProduction()` **directly** (server-side only — no PersistentState bridge). Production robots.txt: `Allow: /` + `Disallow: /FramePlayer` + `Disallow: /api/` + `Sitemap:` pointer. Non-production robots.txt: `Disallow: /`. Production sitemap.xml: walks `GET api/release` via the `"DeepDrft.API"` named client, emits six static roots + one `<url>` per release (loc = `SeoOptions.BaseUrl` + `ReleaseRoutes.DetailHref`, lastmod from `ReleaseDate`); resilient (partial read → well-formed roots-only doc, never 500). Non-production: sitemap returns 404. Routes automatically via `MapControllers()`.
- `Seo/RobotsTxt.cs`: Pure builder for the robots.txt body (no HTTP, no DI — composition only).
- `Seo/SitemapXml.cs`: Pure builder for the sitemap XML body (no HTTP, no DI — composition only).
- `Services/DarkModeService.cs`: Server-side dark-mode prerender (reads `darkMode` cookie, seeds `DarkModeSettings.IsDarkMode` via `IHttpContextAccessor`, carries to WASM via `PersistentComponentState`).
- `Components/App.razor`: Root component with `@rendermode="InteractiveAuto"`. Calls `DarkModeService.InitializeAsync()` in `OnInitialized`.
- `Components/Pages/Error.razor`: Error fallback.
@@ -84,9 +87,11 @@ The middleware pipeline in `Program.cs` is ordered as follows:
8. Development-only `UseStaticFiles()` — serves raw TypeScript from `/Interop/` for source-map debugging.
9. `MapControllers()` and `MapRazorComponents()` — route controller and component requests.
-## The proxy controller
+## Controllers
-`TrackProxyController` in `Controllers/` is the only HTTP controller. It is a thin proxy only — no domain logic, no data layer. The WASM client points both named HttpClients (`"DeepDrft.API"` and `"DeepDrft.Content"`) at the Blazor host's base address, so all browser requests route through this controller to DeepDrftAPI. Server-side SSR calls DeepDrftAPI directly (server-to-server) via the same named clients — no proxy hop on the server side.
+`Controllers/` now holds two controllers. Both are thin boundaries — no domain logic, no data layer.
+`TrackProxyController` is the audio/metadata proxy. The WASM client points both named HttpClients (`"DeepDrft.API"` and `"DeepDrft.Content"`) at the Blazor host's base address, so all browser requests route through this controller to DeepDrftAPI. Server-side SSR calls DeepDrftAPI directly (server-to-server) via the same named clients — no proxy hop on the server side.
The proxy forwards public, unauthenticated routes:
- `GET api/track/page` — paged metadata listing
@@ -0,0 +1,111 @@
using System.Net.Http.Json;
using System.Text.Json;
using DeepDrftModels.DTOs;
using Models.Common;
using DeepDrftPublic.Client.Common;
using DeepDrftPublic.Seo;
using Microsoft.AspNetCore.Mvc;

namespace DeepDrftPublic.Controllers;

/// <summary>
/// Serves the public crawl-directive surfaces (Phase 23): <c>GET /robots.txt</c> and
/// <c>GET /sitemap.xml</c>. Both are environment-gated server-side via
/// <see cref="IWebHostEnvironment.IsProduction"/> read directly here — not the WASM-only
/// <c>SeoEnvironment</c> bridge — and fail safe closed (non-production is uncrawlable, Invariant E1).
///
/// <para>
/// This is a thin host boundary: it owns the gate and the release walk, and delegates all body composition
/// to the pure <see cref="RobotsTxt"/> / <see cref="SitemapXml"/> builders. The sitemap walk reuses the
/// existing <c>"DeepDrft.API"</c> named client server-to-server (the same client SSR prerender uses) — it
/// <b>enumerates and transforms</b> releases into XML rather than relaying verbatim like the proxy controllers.
/// No new API endpoint, no schema change (Phase 22 C5 holds).
/// </para>
/// </summary>
[ApiController]
public class CrawlDirectiveController : ControllerBase
{
    // 100 is the server-side PageSize cap, so this is the largest page the walk can actually get.
    private const int WalkPageSize = 100;

    // The release walk deserializes a bare PagedResult<ReleaseDto> (no ApiResultDto envelope), matching TrackClient.
    private static readonly JsonSerializerOptions JsonOptions = new(JsonSerializerDefaults.Web);

    private readonly IWebHostEnvironment _environment;
    private readonly SeoOptions _seoOptions;
    private readonly HttpClient _upstream;
    private readonly ILogger<CrawlDirectiveController> _logger;

    public CrawlDirectiveController(
        IWebHostEnvironment environment,
        SeoOptions seoOptions,
        IHttpClientFactory httpClientFactory,
        ILogger<CrawlDirectiveController> logger)
    {
        _environment = environment;
        _seoOptions = seoOptions;
        _upstream = httpClientFactory.CreateClient("DeepDrft.API");
        _logger = logger;
    }

    /// <summary>
    /// <c>GET /robots.txt</c>. Production: allow + FramePlayer/api disallows + sitemap pointer. Any
    /// non-production environment: <c>Disallow: /</c> with no sitemap pointer (E1). Always <c>text/plain</c>.
    /// </summary>
    [HttpGet("/robots.txt")]
    public ContentResult GetRobots()
    {
        var body = RobotsTxt.Build(_environment.IsProduction(), _seoOptions.BaseUrl);
        return Content(body, "text/plain");
    }

    /// <summary>
    /// <c>GET /sitemap.xml</c>. Non-production: 404 (the non-prod robots carries no sitemap pointer, so
    /// nothing references it). Production: the static roots plus one entry per release. Resilient — a
    /// partial/empty/failed release read yields a well-formed (possibly roots-only) document, never a 500.
    /// </summary>
    [HttpGet("/sitemap.xml")]
    public async Task<ActionResult> GetSitemap(CancellationToken ct = default)
    {
        if (!_environment.IsProduction())
            return NotFound();

        var releases = await GatherReleasesAsync(ct);
        var xml = SitemapXml.Build(_seoOptions.BaseUrl, releases);
        return Content(xml, "application/xml");
    }

    // Walks GET api/release page by page until every release is read. On any upstream failure, returns the
    // releases gathered so far (possibly none) so the sitemap degrades to a well-formed roots-only document
    // rather than 500ing — a sitemap that errors trains crawlers to stop fetching it (AC-S5).
    private async Task<IReadOnlyList<ReleaseDto>> GatherReleasesAsync(CancellationToken ct)
    {
        var gathered = new List<ReleaseDto>();
        var page = 1;
        try
        {
            while (true)
            {
                var result = await _upstream.GetFromJsonAsync<PagedResult<ReleaseDto>>(
                    $"api/release?page={page}&pageSize={WalkPageSize}", JsonOptions, ct);
                if (result?.Items is null)
                    break;

                gathered.AddRange(result.Items);
                if (gathered.Count >= result.TotalCount || !result.Items.Any())
                    break;

                page++;
            }
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Sitemap release walk failed after gathering {Count} release(s); serving a partial sitemap", gathered.Count);
        }

        return gathered;
    }
}
+34
@@ -0,0 +1,34 @@
namespace DeepDrftPublic.Seo;

/// <summary>
/// Pure composition of the <c>robots.txt</c> body (Phase 23 wave 23.1). The environment gate is the
/// caller's: the endpoint reads <see cref="Microsoft.AspNetCore.Hosting.IWebHostEnvironment.IsProduction"/>
/// server-side and passes the boolean here, so the production-vs-beta branch lives in one testable place.
/// Fail-safe is closed — anything that is not Production yields <c>Disallow: /</c> (Invariant E1).
/// </summary>
public static class RobotsTxt
{
    /// <summary>
    /// Builds the directive body. In Production: allow everything except the embed shell and the proxy API
    /// paths, plus a <c>Sitemap:</c> pointer (OQ-R2). In any non-production environment: a closed door
    /// (<c>Disallow: /</c>) with no sitemap pointer, so a crawl of beta sees nothing and the sitemap is
    /// never advertised.
    /// </summary>
    /// <param name="isProduction">The server-side <c>IsProduction()</c> result — the single gate.</param>
    /// <param name="baseUrl">Canonical origin (no trailing slash) for the <c>Sitemap:</c> line; Production only.</param>
    public static string Build(bool isProduction, string baseUrl)
    {
        if (!isProduction)
        {
            return "User-agent: *\n" +
                   "Disallow: /\n";
        }

        var origin = baseUrl.TrimEnd('/');
        return "User-agent: *\n" +
               "Allow: /\n" +
               "Disallow: /FramePlayer\n" +
               "Disallow: /api/\n" +
               $"Sitemap: {origin}/sitemap.xml\n";
    }
}
+68
@@ -0,0 +1,68 @@
using System.Text;
using System.Xml;
using System.Xml.Linq;
using DeepDrftModels.DTOs;
using DeepDrftPublic.Client.Common;

namespace DeepDrftPublic.Seo;

/// <summary>
/// Pure composition of the sitemaps.org <c>urlset</c> document (Phase 23 wave 23.2). Enumerates the fixed
/// indexable roots plus one entry per release, every <c>&lt;loc&gt;</c> absolutized against
/// <see cref="SeoOptions.BaseUrl"/> and per-release paths resolved through
/// <see cref="ReleaseRoutes.DetailHref(string, DeepDrftModels.Enums.ReleaseMedium)"/> — so each sitemap URL
/// equals the page's <c>SeoHead</c> canonical by construction. No fetch, no env logic: the endpoint owns the
/// gate and the release walk; this turns the gathered DTOs into XML and never throws on partial input.
/// </summary>
public static class SitemapXml
{
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    /// <summary>
    /// The indexable static roots (OQ-S3). An explicit list, deliberately NOT derived from the nav index:
    /// the indexable set is not the nav set (e.g. <c>/FramePlayer</c> is nav-absent and must stay out, and a
    /// new nav entry is not automatically sitemap-worthy). Revisit here if the indexable-roots set grows.
    /// </summary>
    public static readonly IReadOnlyList<string> StaticRoots = ["/", "/about", "/cuts", "/sessions", "/mixes", "/archive"];

    /// <summary>
    /// Builds the full <c>urlset</c>: the static roots (no <c>lastmod</c>) followed by one <c>&lt;url&gt;</c>
    /// per release. A release carries a <c>&lt;lastmod&gt;</c> sourced from <see cref="ReleaseDto.ReleaseDate"/>
    /// in W3C <c>YYYY-MM-DD</c> form when present (OQ-S2 — the release date, accepted as a plausible crawl hint).
    /// A null/empty release set yields a well-formed roots-only document.
    /// </summary>
    /// <param name="baseUrl">Canonical origin (no trailing slash) every <c>&lt;loc&gt;</c> is built from.</param>
    /// <param name="releases">The gathered releases; may be empty or partial after an upstream failure.</param>
    public static string Build(string baseUrl, IEnumerable<ReleaseDto> releases)
    {
        var origin = baseUrl.TrimEnd('/');

        var roots = StaticRoots.Select(path => UrlElement(origin + path, lastmod: null));
        var releaseUrls = releases.Select(release => UrlElement(
            origin + ReleaseRoutes.DetailHref(release.EntryKey, release.Medium),
            release.ReleaseDate?.ToString("yyyy-MM-dd")));

        var urlset = new XElement(Ns + "urlset", roots.Concat(releaseUrls));
        var document = new XDocument(new XDeclaration("1.0", "UTF-8", null), urlset);

        // Save through a byte-based UTF-8 stream so the XML declaration reads encoding="utf-8". An
        // XmlWriter over a StringBuilder/StringWriter is character-based (UTF-16) and would stamp the
        // declaration utf-16, which is wrong for a body served as application/xml.
        using var stream = new MemoryStream();
        var settings = new XmlWriterSettings { Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false), Indent = true };
        using (var xmlWriter = XmlWriter.Create(stream, settings))
        {
            document.Save(xmlWriter);
        }

        return Encoding.UTF8.GetString(stream.ToArray());
    }

    private static XElement UrlElement(string loc, string? lastmod)
    {
        var element = new XElement(Ns + "url", new XElement(Ns + "loc", loc));
        if (lastmod is not null)
            element.Add(new XElement(Ns + "lastmod", lastmod));
        return element;
    }
}
Binary file not shown.


+3
@@ -40,6 +40,9 @@
The queue is pure domain logic, unit-testable against a fake IStreamingPlayerService
with no browser/JS. -->
<ProjectReference Include="..\DeepDrftPublic.Client\DeepDrftPublic.Client.csproj" />
<!-- Referenced for the Phase 23 crawl-directive builders (RobotsTxt / SitemapXml) — pure
string/XML composition over the env flag and release DTOs, unit-testable without HTTP. -->
<ProjectReference Include="..\DeepDrftPublic\DeepDrftPublic.csproj" />
</ItemGroup>
</Project>
+62
@@ -0,0 +1,62 @@
using DeepDrftPublic.Seo;

namespace DeepDrftTests;

/// <summary>
/// Unit tests for <see cref="RobotsTxt"/> — the pure environment-branch composition of the robots.txt body
/// (Phase 23 wave 23.1). The gate (Production vs. anything-else) is the load-bearing branch: Production
/// allows + points at the sitemap and disallows the non-page routes; every non-production environment is a
/// closed door with no sitemap pointer (Invariant E1).
/// </summary>
[TestFixture]
public class RobotsTxtTests
{
    private const string BaseUrl = "https://deepdrft.com";

    [Test]
    public void Build_Production_AllowsAndPointsAtSitemap()
    {
        var body = RobotsTxt.Build(isProduction: true, BaseUrl);

        Assert.Multiple(() =>
        {
            Assert.That(body, Does.Contain("User-agent: *"));
            Assert.That(body, Does.Contain("Allow: /"));
            Assert.That(body, Does.Contain("Sitemap: https://deepdrft.com/sitemap.xml"));
        });
    }

    [Test]
    public void Build_Production_DisallowsFramePlayerAndApi()
    {
        var body = RobotsTxt.Build(isProduction: true, BaseUrl);

        Assert.Multiple(() =>
        {
            Assert.That(body, Does.Contain("Disallow: /FramePlayer"));
            Assert.That(body, Does.Contain("Disallow: /api/"));
        });
    }

    [Test]
    public void Build_NonProduction_DisallowsEverythingWithNoSitemapPointer()
    {
        var body = RobotsTxt.Build(isProduction: false, BaseUrl);

        Assert.Multiple(() =>
        {
            Assert.That(body, Does.Contain("User-agent: *"));
            Assert.That(body, Does.Contain("Disallow: /"));
            Assert.That(body, Does.Not.Contain("Allow:"));
            Assert.That(body, Does.Not.Contain("Sitemap:"));
        });
    }

    [Test]
    public void Build_Production_TrimsTrailingSlashOnBaseUrl()
    {
        var body = RobotsTxt.Build(isProduction: true, "https://deepdrft.com/");

        Assert.That(body, Does.Contain("Sitemap: https://deepdrft.com/sitemap.xml"));
    }
}
+154
@@ -0,0 +1,154 @@
using System.Xml.Linq;
using DeepDrftModels.DTOs;
using DeepDrftModels.Enums;
using DeepDrftPublic.Client.Common;
using DeepDrftPublic.Seo;

namespace DeepDrftTests;

/// <summary>
/// Unit tests for <see cref="SitemapXml"/> — the pure sitemaps.org urlset composition (Phase 23 wave 23.2).
/// The document is parsed back to an <see cref="XDocument"/> so each assertion checks real structure, not a
/// substring: that every <c>&lt;loc&gt;</c> is absolute and built through <see cref="ReleaseRoutes"/> (so it
/// equals the page canonical), that <c>&lt;lastmod&gt;</c> tracks the release date, that the static roots are
/// present and FramePlayer is absent, and that empty input still yields a well-formed roots-only document.
/// </summary>
[TestFixture]
public class SitemapXmlTests
{
    private const string BaseUrl = "https://deepdrft.com";
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    private static ReleaseDto Release(string entryKey, ReleaseMedium medium, DateOnly? releaseDate = null) => new()
    {
        EntryKey = entryKey,
        Title = "Title",
        Artist = "Artist",
        Medium = medium,
        ReleaseDate = releaseDate,
    };

    private static List<string> Locs(string xml)
    {
        var doc = XDocument.Parse(xml);
        return doc.Root!.Elements(Ns + "url")
            .Select(u => u.Element(Ns + "loc")!.Value)
            .ToList();
    }

    [Test]
    public void Build_EmptyReleases_YieldsWellFormedRootsOnlyDocument()
    {
        var xml = SitemapXml.Build(BaseUrl, []);

        var locs = Locs(xml);
        Assert.Multiple(() =>
        {
            Assert.That(locs, Has.Count.EqualTo(SitemapXml.StaticRoots.Count));
            Assert.That(locs, Does.Contain("https://deepdrft.com/"));
            Assert.That(locs, Does.Contain("https://deepdrft.com/about"));
            Assert.That(locs, Does.Contain("https://deepdrft.com/cuts"));
            Assert.That(locs, Does.Contain("https://deepdrft.com/sessions"));
            Assert.That(locs, Does.Contain("https://deepdrft.com/mixes"));
            Assert.That(locs, Does.Contain("https://deepdrft.com/archive"));
        });
    }

    [Test]
    public void Build_IsWellFormedUrlsetWithSitemapsOrgNamespace()
    {
        var xml = SitemapXml.Build(BaseUrl, []);

        var doc = XDocument.Parse(xml);
        Assert.Multiple(() =>
        {
            Assert.That(doc.Root!.Name, Is.EqualTo(Ns + "urlset"));
            Assert.That(xml, Does.Contain("utf-8").IgnoreCase);
        });
    }

    [Test]
    public void Build_FramePlayerIsNeverAStaticRoot()
    {
        var xml = SitemapXml.Build(BaseUrl, []);

        Assert.That(Locs(xml), Has.None.Contains("FramePlayer"));
    }

    [TestCase(ReleaseMedium.Cut, "https://deepdrft.com/cuts/key-1")]
    [TestCase(ReleaseMedium.Session, "https://deepdrft.com/sessions/key-1")]
    [TestCase(ReleaseMedium.Mix, "https://deepdrft.com/mixes/key-1")]
    public void Build_ReleaseLoc_IsAbsoluteAndResolvedThroughReleaseRoutes(ReleaseMedium medium, string expectedLoc)
    {
        var xml = SitemapXml.Build(BaseUrl, [Release("key-1", medium)]);

        // The loc must equal BaseUrl + ReleaseRoutes.DetailHref — i.e. the page's SeoHead canonical, by construction.
        var expected = BaseUrl + ReleaseRoutes.DetailHref("key-1", medium);
        Assert.Multiple(() =>
        {
            Assert.That(expected, Is.EqualTo(expectedLoc));
            Assert.That(Locs(xml), Does.Contain(expectedLoc));
        });
    }

    [Test]
    public void Build_AllReleasesEnumerated_AppendedAfterStaticRoots()
    {
        var releases = new[]
        {
            Release("a", ReleaseMedium.Cut),
            Release("b", ReleaseMedium.Mix),
            Release("c", ReleaseMedium.Session),
        };

        var xml = SitemapXml.Build(BaseUrl, releases);

        Assert.That(Locs(xml), Has.Count.EqualTo(SitemapXml.StaticRoots.Count + releases.Length));
    }

    [Test]
    public void Build_ReleaseWithDate_EmitsW3CLastmod()
    {
        var xml = SitemapXml.Build(BaseUrl, [Release("key-1", ReleaseMedium.Cut, new DateOnly(2026, 5, 12))]);

        var doc = XDocument.Parse(xml);
        var releaseUrl = doc.Root!.Elements(Ns + "url")
            .Single(u => u.Element(Ns + "loc")!.Value.EndsWith("/cuts/key-1"));
        Assert.That(releaseUrl.Element(Ns + "lastmod")!.Value, Is.EqualTo("2026-05-12"));
    }

    [Test]
    public void Build_ReleaseWithoutDate_OmitsLastmod()
    {
        var xml = SitemapXml.Build(BaseUrl, [Release("key-1", ReleaseMedium.Cut)]);

        var doc = XDocument.Parse(xml);
        var releaseUrl = doc.Root!.Elements(Ns + "url")
            .Single(u => u.Element(Ns + "loc")!.Value.EndsWith("/cuts/key-1"));
        Assert.That(releaseUrl.Element(Ns + "lastmod"), Is.Null);
    }

    [Test]
    public void Build_StaticRoots_NeverCarryLastmod()
    {
        var xml = SitemapXml.Build(BaseUrl, []);

        var doc = XDocument.Parse(xml);
        Assert.That(doc.Root!.Elements(Ns + "url").All(u => u.Element(Ns + "lastmod") is null), Is.True);
    }

    [Test]
    public void Build_TrimsTrailingSlashOnBaseUrl()
    {
        var xml = SitemapXml.Build("https://deepdrft.com/", [Release("key-1", ReleaseMedium.Cut)]);

        Assert.Multiple(() =>
        {
            // No doubled slash on the root or the release URL.
            Assert.That(Locs(xml), Does.Contain("https://deepdrft.com/"));
            Assert.That(Locs(xml), Does.Contain("https://deepdrft.com/cuts/key-1"));
        });
    }
}
+1
@@ -654,6 +654,7 @@ convention.** None block 21.1.
---
## Working with this file
- **Add items by extending an existing phase first**; only create a new phase when the addition genuinely doesn't fit any of 15. Phase numbers are organisational, not sequencing.
+127
@@ -0,0 +1,127 @@
# DeepDrftHome — Production Installation Checklist
Fresh-box checklist for deploying the DeepDrftHome solution (DeepDrftPublic, DeepDrftManager, DeepDrftAPI) to a new production host. Every package, directory, and service is treated as absent. Gated steps — those requiring a decision, a secret, or a network action from the operator — are marked **[GATE]**.
> This document is a reference you run from. It can drift from the actual `deploy/` scripts (`bootstrap.sh`, `install.sh`, `setup-step10-creds.sh`, the systemd units, and the nginx templates) — when those change, update this checklist to match.
## Phase 0 — Prerequisites (build/admin machine)
- [ ] Confirm DNS A/AAAA records for `deepdrft.com` and `app.deepdrft.com` point at the new host's IP — certbot's HTTP-01 challenge fails if DNS hasn't propagated. **[GATE]**
- [ ] Generate a CI deploy ed25519 key on your local machine (not the host): `ssh-keygen -t ed25519 -C "gitea-ci-deepdrft-prod" -f ~/.ssh/gitea_deepdrft_prod`. Public key → installer prompt; private key → Gitea secret `DEEPDRFT_PROD_SSH_DEPLOY`. **[GATE]**
- [ ] Download the latest `deepdrft-install.tar.gz` release asset (built by `package-install.yml` on a `deploy/` push to `master`). If none exists, push a no-op change to `deploy/` on `master` and wait for the artifact. **[GATE]**
- [ ] `scp deepdrft-install.tar.gz root@<host>:/tmp/`
## Phase 1 — Bootstrap / installer (run as root)
- [ ] Run: `INSTALL_PKG_PATH=/tmp/deepdrft-install.tar.gz bash bootstrap.sh` (installs OS prereqs, hands off to `install.sh`).
- [ ] The installer is interactive — have ready: app user (`deepdrft` / `/deepdrft`), PG role (`deepdrft`), DB names (`deepdrft-meta`, `deepdrft-auth`), public domain, app subdomain (default `app.<public>`), ports (5000/5001/5002), certbot email, PG password (twice), CI deploy public key. **[GATE]**
Automated Steps 0–10: apt preflight (postgresql, nginx, rsync, openssl, jq, wget) → create user → `enable-linger` → directory layout → deploy scripts to `/opt/deepdrft/bin/` → systemd units enabled (not started) → credentials (Step 6, prompts) → PostgreSQL role + DBs → `authorized_keys` forced-command → nginx vhosts → summary.
## Phase 2 — Credentials (Step 6, interactive) **[GATE]**
Writes 6 files to `/deepdrft/.config/credentials/` (mode 600): `filedatabase.json` (no prompt; vault path hardcoded), `apikey.json` (auto-generate or paste), `connections.json` (PG password), `authblocks.json` (JWT secret, issuer/audience, SMTP host+token+From, admin user/email/password, support email), `api-public.json` + `api-manager.json` (auto-built). JWT issuer/audience must match `appsettings.json` `AuthBlocks:Jwt`.
Verify: `sudo -u deepdrft ls -la /deepdrft/.config/credentials/` → 6 × mode 600, owned `deepdrft:deepdrft`.
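The manual `ls -la` check above can also be scripted. The following is a minimal sketch, assuming GNU `stat` (coreutils) on the host; `check_creds` is a hypothetical helper name, not part of the `deploy/` scripts, and the six file names are the ones listed in this phase.

```shell
# check_creds DIR UID:GID
# Verifies that each of the six credential files exists in DIR, has mode 600,
# and is owned by the given numeric uid:gid. Prints one line per problem and
# returns non-zero if anything is missing or wrong.
check_creds() {
  dir=$1
  want=$2
  bad=0
  for name in filedatabase apikey connections authblocks api-public api-manager; do
    f="$dir/$name.json"
    if [ ! -f "$f" ]; then
      echo "MISSING $f"
      bad=1
      continue
    fi
    # GNU stat: %a = octal mode, %u = owner uid, %g = group gid
    got=$(stat -c '%a %u:%g' "$f")
    if [ "$got" != "600 $want" ]; then
      echo "BAD $f ($got, want 600 $want)"
      bad=1
    fi
  done
  return "$bad"
}
```

Possible usage on the host: `check_creds /deepdrft/.config/credentials "$(id -u deepdrft):$(id -g deepdrft)" && echo "credentials OK"`.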
## Phase 3 — CorsSettings **[GATE]**
`DeepDrftAPI/appsettings.json` `CorsSettings.AllowedOrigins` must include the Manager origin `https://app.deepdrft.com`. The API throws on startup if origins are empty; a missing Manager origin causes silent 401s on CMS auth. (Confirm this is present before building.)
## Phase 4 — Gitea secrets **[GATE]**
Add `DEEPDRFT_PROD_SSH_DEPLOY` = full private key contents. (The `dev`/beta host uses `DEEPDRFT_DCH7_SSH_DEPLOY`.)
## Phase 5 — TLS (after DNS propagates) **[GATE]**
```
host deepdrft.com
host app.deepdrft.com
snap install --classic certbot
ln -sf /snap/bin/certbot /usr/bin/certbot
certbot --nginx --email <email> --agree-tos --no-eff-email -d deepdrft.com -d app.deepdrft.com
nginx -t && systemctl reload nginx
certbot renew --dry-run
```
The installer's vhosts are HTTP-only (`listen 80`); certbot rewrites them in place to add the 443 blocks.
## Phase 6 — Verify SSH forced-command chain (before first deploy) **[GATE]**
- Layer 1: `ssh -i key deepdrft@host deploy-public` → prints the `[deploy-public]` prefix then a missing-archive error (non-zero exit expected).
- Layer 2: `ssh -i key deepdrft@host id` → `ssh-wrapper: unknown command: id` (no shell).
- Layer 3: rsync a smoke file → lands in `/deepdrft/staging/`.
- Layer 4: `deploy-public`, `deploy-manager`, `deploy-api` each print their prefix.
Do not proceed until all four pass.
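For orientation, the allowlist pattern that Layers 1, 2, and 4 exercise can be sketched as below. This is a hypothetical illustration, not the project's actual `ssh-wrapper`: it only mirrors the behaviour described above (three accepted command names, everything else rejected with the `unknown command` message), prints what it would run instead of running it, and omits the rsync handling that Layer 3 depends on.

```shell
# dispatch COMMAND
# Hypothetical allowlist dispatcher in the style of a forced-command wrapper.
# Only the three deploy command names are accepted; anything else is refused
# and no shell is ever spawned.
dispatch() {
  # Keep only the first word so trailing arguments cannot smuggle in a second command.
  cmd=${1%% *}
  case "$cmd" in
    deploy-public|deploy-manager|deploy-api)
      echo "[$cmd] would run /opt/deepdrft/bin/$cmd.sh"
      ;;
    *)
      echo "ssh-wrapper: unknown command: $cmd" >&2
      return 1
      ;;
  esac
}
```

With an OpenSSH forced command, the text the client asked to run arrives in the `SSH_ORIGINAL_COMMAND` environment variable, so a wrapper of this shape would end with something like `dispatch "${SSH_ORIGINAL_COMMAND:-}"`.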
## Phase 7 — First deploy (push `master`)
Three workflows trigger by path filter:
- `deploy-api.yml`: `DeepDrftAPI/`, `DeepDrftData/`, `DeepDrftContent/`, `DeepDrftModels/`
- `deploy-public.yml`: `DeepDrftPublic/`, `DeepDrftPublic.Client/`, `DeepDrftShared.Client/`, `DeepDrftModels/`
- `deploy-manager.yml`: `DeepDrftManager/`, `DeepDrftShared.Client/`, `DeepDrftModels/`
Root-file-only changes trigger none. `deploy-api` builds + publishes self-contained linux-x64, runs `ef migrations bundle`, rsyncs, then `deploy-api.sh` applies the EF bundle to `deepdrft-meta`, swaps `bin/`, restarts the unit. `deploy-public` also installs the `wasm-tools` workload. Watch the three parallel Gitea jobs. **[GATE]**
## Phase 8 — EF migration verification **[GATE]**
The EF bundle runs before the binary swap (metadata DB). The AuthBlocks `deepdrft-auth` schema self-migrates on first boot and seeds the admin. Verify `\dt` on both DBs plus `__EFMigrationsHistory`. On failure: `journalctl --user -u deepdrftapi`.
## Phase 9 — Service health
Check `deepdrftapi` / `deepdrftpublic` / `deepdrftmanager` via `systemctl --user status` and `journalctl`. Common failures: missing/wrong credential key; unreadable creds (must be 600 `deepdrft:deepdrft`); PostgreSQL not on peer auth; wrong vault path; OOM on droplets < 2 GB (add swap).
## Phase 10 — Smoke-test per host
**API (port 5002, internal):**
- `curl localhost:5002/api/track/page` → empty list on fresh install.
- `curl localhost:5002/api/stats/home` → zeros.
- `POST /api/auth/login` with admin creds → a JWT.
**Public site:**
- `curl https://deepdrft.com/` renders.
- `curl https://deepdrft.com/robots.txt` → contains `Allow: /` (NOT a bare `Disallow: /`, which means the env isn't Production).
- `curl https://deepdrft.com/sitemap.xml` → XML (static roots present even with 0 releases).
- Confirm the unit carries `Environment=ASPNETCORE_ENVIRONMENT=Production`.
**Manager (CMS):**
- `curl https://app.deepdrft.com/` renders.
- `curl https://app.deepdrft.com/robots.txt` → `Disallow: /` (always uncrawlable).
- `curl -o /dev/null -w "%{http_code}" https://app.deepdrft.com/` → 200.
**Blazor WebSocket:**
- `curl -H "Upgrade: websocket" -H "Connection: Upgrade" https://deepdrft.com/_blazor` → 101 (not 502/504 from nginx).
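The two `robots.txt` checks in this phase can be reduced to a single classifier. The sketch below is an assumption-laden convenience (`robots_state` is a hypothetical name): it reads a robots body on stdin and matches whole lines only, so the Production body's path-scoped `Disallow: /FramePlayer` and `Disallow: /api/` lines are not mistaken for the closed-door `Disallow: /`.

```shell
# robots_state: reads a robots.txt body on stdin and prints one of
#   closed  - a line that is exactly "Disallow: /" is present
#   open    - no such line, and a line that is exactly "Allow: /" is present
#   unknown - neither (for example an empty body from a failed fetch)
robots_state() {
  body=$(cat)
  if printf '%s\n' "$body" | grep -qx 'Disallow: /'; then
    echo closed
  elif printf '%s\n' "$body" | grep -qx 'Allow: /'; then
    echo open
  else
    echo unknown
  fi
}
```

Expected per the checks above: `curl -fsS https://deepdrft.com/robots.txt | robots_state` should print `open`, and the same pipeline against `https://app.deepdrft.com/robots.txt` should print `closed`.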
## Phase 11 — Hardening
- [ ] Change the admin password immediately — the seed credentials are stored plaintext on disk. **[GATE]**
- [ ] Firewall (UFW): allow 22/80/443, deny the rest. Port 5002 (API) is internal-only.
- [ ] `apt-get install -y unattended-upgrades && dpkg-reconfigure -plow unattended-upgrades`
- [ ] Review `pg_hba.conf` — no `host all all 0.0.0.0/0 md5` line; the `deepdrft` role connects via Unix socket (peer auth).
- [ ] Back up `~/api/deepdrft/vaults` — it is not in any deploy artifact or EF bundle; a server wipe loses all audio permanently. **[GATE]**
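The `pg_hba.conf` review can be partly automated. A rough sketch, assuming the standard column order (type, database, user, address, method); `open_hba_lines` is a hypothetical helper, and it only catches the wide-open CIDR form named above, not every permissive rule.

```shell
# open_hba_lines FILE
# Prints, with line numbers, every uncommented host-type rule that lets all
# users reach all databases from any IPv4 (0.0.0.0/0) or IPv6 (::/0) address.
# Exit status follows grep: 0 if such a rule exists, 1 if none was found.
open_hba_lines() {
  grep -En '^[[:space:]]*host[a-z]*[[:space:]]+all[[:space:]]+all[[:space:]]+(0\.0\.0\.0/0|::/0)([[:space:]]|$)' "$1"
}
```

The file location varies by distribution; `sudo -u postgres psql -tAc 'SHOW hba_file'` reports it. A clean host prints nothing: `open_hba_lines "$hba" || echo "no wide-open host rules"`.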
## Phase 12 — Iteration notes
- EF migrations auto-apply before the binary swap on every deploy — no manual `dotnet ef database update`.
- AuthBlocks self-migrates on each API start.
- The FileDatabase vault is never touched by deploys; new vault types are created by the app on first access.
- Credential rotation: rerun `setup-step10-creds.sh --force` on the host as `deepdrft`, then restart the affected services.
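- Each deploy script moves the current `bin/` to `bin.prev/`. For a manual rollback, move the failed `bin/` aside first (a plain `mv bin.prev bin` onto an existing directory nests `bin.prev` inside it), then restore and restart: `mv ~/public/bin ~/public/bin.failed && mv ~/public/bin.prev ~/public/bin && systemctl --user restart deepdrftpublic.service` (substitute `manager` / `api/deepdrft`; `bin.failed` is an arbitrary name).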
- Two distinct staging dirs: `~/staging/` (CI rsync jail) vs `~/api/deepdrft/vaults/staging/` (large-audio upload staging). Do not conflate them.
## Host path reference
| Path | What |
|---|---|
| `/deepdrft/.config/credentials/` | 6 × JSON credential files (600) |
| `/deepdrft/.config/systemd/user/` | 3 × `.service` unit files |
| `/deepdrft/public/bin/` | DeepDrftPublic publish output |
| `/deepdrft/manager/bin/` | DeepDrftManager publish output |
| `/deepdrft/api/deepdrft/bin/` | DeepDrftAPI publish output |
| `/deepdrft/api/deepdrft/vaults/` | FileDatabase vault — never delete, never in deploy |
| `/deepdrft/staging/` | rsync jail root (CI artifact drop zone) |
| `/opt/deepdrft/bin/ssh-wrapper` | Forced-command dispatcher |
| `/opt/deepdrft/bin/deploy-*.sh` | Per-service deploy scripts |
| `/etc/nginx/sites-available/deepdrft.com.conf` | Public nginx vhost |
| `/etc/nginx/sites-available/app.deepdrft.com.conf` | Manager nginx vhost |
@@ -0,0 +1,370 @@
# Phase 23 — SEO Crawl Directives (sitemap.xml, robots.txt, CMS noindex)
Product spec. Status: **design / framing — implementation-ready pending Daniel's open-question calls.**
Author: product-designer. Date: 2026-06-23. **No code has been written by this doc.**
Phase 23 is the **endpoint/file-shaped follow-on** to Phase 22's per-page `SeoHead` component. Phase 22 flagged
these three as "adjacent but separate concerns" (`product-notes/phase-22-seo-metadata-component.md §7`): they
are a different *unit of work* — server-side endpoints and static files that tell crawlers **which** pages exist
and **whether** to crawl them at all, as opposed to the per-page head surface that tells crawlers **what each
page is**. Phase 22 is the *content* of discoverability; Phase 23 is the *directives* layer above it.
Three items, each independently shippable:
1. **`sitemap.xml`** on the public host — a generated sitemap enumerating every indexable public URL.
2. **`robots.txt`** on the public host — allow + sitemap pointer in Production, `Disallow: /` everywhere else.
3. **CMS `noindex`** on `DeepDrftManager` — the admin app must never be indexed. The **one** item touching the CMS.
---
## 1. The environment gate is the through-line (read this first)
Phase 22 established the rule that **every non-production environment must be uncrawlable** — the beta/staging
host must not appear in search results, and a stray crawl of staging must not dilute or duplicate the production
site. Phase 22 expressed this for *page-level robots meta* via `SeoEnvironment` (a `[PersistentState]` bridge
seeded from `IWebHostEnvironment.IsProduction()`, because `SeoHead` renders in the **WASM** component graph and
WASM has no `IWebHostEnvironment`).
**Phase 23's three items all run server-side only** (endpoints and static files, never the WASM render tree), so
they read the gate the simplest possible way: **`IWebHostEnvironment.IsProduction()` injected directly.** They do
**not** need the `SeoEnvironment` PersistentState bridge — that bridge exists *solely* to ferry the flag across
the server→WASM seam, which these never cross. This is the correct reuse: same source of truth
(`IWebHostEnvironment.IsProduction()`, the exact predicate `App.razor` already seeds `SeoEnvironment` from), no
parallel gate invented, and no PersistentState plumbing where it isn't needed.
| Concern | Renders where | Gate mechanism |
|---|---|---|
| Phase 22 `SeoHead` robots meta | WASM component graph | `SeoEnvironment` `[PersistentState]` bridge (server seed → WASM read) |
| Phase 23 sitemap / robots / CMS | server-side endpoint or static file | `IWebHostEnvironment.IsProduction()` injected directly |
**Invariant E1 (the non-negotiable):** in any non-production environment, `robots.txt` is `Disallow: /` and the
sitemap is either not served or empty. A crawler must see a closed door on beta before it sees a single URL.
The fail-safe default (matching Phase 22's `SeoEnvironment` fail-safe-to-`noindex`) is **closed**: if environment
resolution is ever ambiguous, behave as non-production (disallow).
---
## 2. The architecture seam (where this code lives, and what it must not become)
Per the project convention (root `CLAUDE.md`; `DeepDrftPublic/CLAUDE.md`): **the public host owns thin HTTP
boundaries; domain logic lives in `*.Services` libraries or `DeepDrftAPI`.** Generated XML/text is a *rendering*
of data the host already has access to — it belongs in a **thin endpoint on `DeepDrftPublic`**, and any list
logic it needs must **reuse the existing release read**, not re-implement enumeration.
- **`sitemap.xml`** is *not* a pass-through proxy like `ReleaseProxyController` (which relays JSON verbatim). It
**enumerates** releases and **transforms** them into a different media type (XML). So it is a new endpoint that
*calls* the upstream `GET api/release` paged read (server-to-server via the existing `"DeepDrft.API"` named
`HttpClient`, the same client SSR prerender already uses — no proxy hop, no new data-layer code, no schema
change) and walks the pages to build the URL set. **C5 from Phase 22 holds:** no new API endpoint on
`DeepDrftAPI`, no schema change — the existing `PagedResult<ReleaseDto>` read is sufficient (it carries
`EntryKey`, `Medium`, and `ReleaseDate` — everything a `<url>` entry needs).
- **The URL composition reuses Phase 22's seams, not new ones:** absolute origin from `SeoOptions.BaseUrl`
(`https://deepdrft.com` — config, because the origin can't be derived behind the nginx proxy), and per-release
detail paths from `ReleaseRoutes.DetailHref(entryKey, medium)` (the single source of truth the Cut/Session/Mix
pages, the player bar, and `SharePopover` all already use). The sitemap thereby lists the *exact* canonical
URLs `SeoHead` emits as `<link rel="canonical">` — by construction, not by coincidence.
> **Seam note for staff-engineer.** `SeoOptions` and `ReleaseRoutes` currently live in `DeepDrftPublic.Client`
> (`Common/`). A server-side endpoint on `DeepDrftPublic` (the host) references the client assembly already (it
> loads `DeepDrftPublic.Client._Imports` as an additional WASM assembly and shares the static `Startup`), so the
> host can read these types. Confirm the reference direction at implementation; if `SeoOptions.BaseUrl` is not
> cleanly reachable from a host controller, the minimal move is to source `BaseUrl` from the same config the
> client `SeoOptions` is seeded from (it is a non-secret brand constant — `appsettings.json`, per Phase 22 §4.1),
> **not** to duplicate the constant. This is a wiring detail, not a design fork.
---
## 3. Item 1 — `sitemap.xml`
### 3.1 Mechanism and location
A new thin endpoint on `DeepDrftPublic` serving `GET /sitemap.xml` with content-type `application/xml`. It is an
endpoint (not a static file and not a Razor component) because the URL set is **dynamic** — it must include every
release detail URL, which changes as releases are added. A static file would go stale the moment a release lands.
Recommended placement: a small `SitemapController` (or a minimal-API endpoint in `Program.cs`) alongside the
existing proxy controllers in `DeepDrftPublic/Controllers/`. It is a host concern (HTTP surface + rendering),
exactly the layer the proxy controllers occupy. It injects `IWebHostEnvironment` (the gate) and
`IHttpClientFactory` (to call `"DeepDrft.API"`), mirroring `ReleaseProxyController`'s constructor shape.
### 3.2 What it enumerates
The indexable public URL set, all absolutized against `SeoOptions.BaseUrl`:
- **Static roots:** `/` (home), `/about`, and the four browse surfaces `/cuts`, `/sessions`, `/mixes`,
`/archive`. These are a fixed list (a small in-endpoint constant array, or — cleaner — derived from the same
nav index the site already maintains; see OQ-S3).
- **Every release detail URL:** walk `GET api/release?page=N&pageSize=…` until `PageNumber * PageSize >=
TotalCount`, and for each `ReleaseDto` emit `BaseUrl + ReleaseRoutes.DetailHref(dto.EntryKey, dto.Medium)` —
i.e. `/cuts/{key}`, `/sessions/{key}`, `/mixes/{key}`. No `medium` filter on the query (we want all media in
one pass); a generous `pageSize` (e.g. 100–200) keeps the walk to a handful of round-trips even for a large
catalogue.
### 3.3 XML shape
Standard sitemaps.org `urlset`:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>https://deepdrft.com/</loc></url>
<url><loc>https://deepdrft.com/about</loc></url>
<url><loc>https://deepdrft.com/cuts</loc></url>
<!-- … browse roots … -->
<url>
<loc>https://deepdrft.com/mixes/3f2a9c…</loc>
<lastmod>2026-05-12</lastmod> <!-- optional; from ReleaseDate — see OQ-S2 -->
</url>
<!-- … one <url> per release … -->
</urlset>
```
- `<loc>` is required and must be a fully-qualified absolute URL (the reason `BaseUrl` is mandatory).
- `<lastmod>` is **optional** and recommended from `ReleaseDto.ReleaseDate` (W3C date format `YYYY-MM-DD`) **for
release URLs only** — static roots have no natural lastmod and omit it. See **OQ-S2** (ReleaseDate is the
*release* date, not a content-modified date — it is a reasonable proxy but not strictly correct; the safe call
is to include it, as a stale-but-plausible lastmod is better than none and crawlers treat it as a hint).
- **No** `<changefreq>` / `<priority>` — both are widely ignored by Google and add noise. Omit them.
### 3.4 Failure posture
The endpoint must degrade gracefully — a sitemap that 500s trains crawlers to stop fetching it. If the upstream
`api/release` walk fails partway, **emit what was gathered** (static roots are always available; partial release
set is better than none) and log the failure. Never 500 the sitemap. (Mirrors `ReleaseProxyController`'s
philosophy of not collapsing valid-but-partial states, adapted to "always return a well-formed document.")
### 3.5 Acceptance criteria (sitemap)
- **AC-S1 — Valid + complete.** `GET /sitemap.xml` (in Production) returns well-formed `urlset` XML that
validates against the sitemaps.org schema and contains: the 6 static roots **and** exactly one `<url>` per
non-deleted release, addressed by `ReleaseRoutes.DetailHref` (so every `<loc>` equals the page's canonical).
- **AC-S2 — Absolute URLs.** Every `<loc>` is `https://deepdrft.com/…` (config origin, not a relative path, not
a proxy-derived host).
- **AC-S3 — Pagination walk is exhaustive.** A catalogue larger than one page is fully enumerated (no releases
dropped at a page boundary); a catalogue of zero releases yields a valid sitemap of just the static roots.
- **AC-S4 — Environment-gated.** In a non-production environment, `/sitemap.xml` is either not served (404) or
served empty/`Disallow`-consistent — it must never advertise beta release URLs to a crawler (E1). Recommend
**404 in non-production** (simplest; nothing references it because the non-prod `robots.txt` carries no
`Sitemap:` line — see Item 2).
- **AC-S5 — Resilient.** An upstream `api/release` failure yields a well-formed sitemap of the static roots (and
any releases gathered before the failure), logged — never a 500.
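
AC-S2 lends itself to a command-line spot check once an endpoint exists. The following is a rough sketch, not part of this spec's deliverables: it assumes each `<loc>` element is emitted unprefixed on the default namespace (as in the XML shape in 3.3), and `bad_locs` is a hypothetical helper name.

```shell
# bad_locs ORIGIN
# Reads sitemap XML on stdin and prints every <loc> value that does not start
# with "ORIGIN/". No output means every URL is absolute and on the expected
# origin; relative paths and URLs on another host (such as beta) are printed.
bad_locs() {
  grep -o '<loc>[^<]*</loc>' \
    | sed -e 's/^<loc>//' -e 's/<\/loc>$//' \
    | awk -v o="$1/" 'index($0, o) != 1'
}
```

For example, `curl -fsS https://deepdrft.com/sitemap.xml | bad_locs https://deepdrft.com` should print nothing when AC-S2 holds.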
---
## 4. Item 2 — `robots.txt`
### 4.1 Mechanism and location — the static-vs-endpoint tradeoff (flagged)
`robots.txt` must express the environment gate (`Disallow: /` on beta, allow + sitemap pointer in Production). A
**static file** in `wwwroot/` **cannot** do this — it serves identical bytes in every environment. So the
content is environment-dependent and wants a **tiny endpoint** (`GET /robots.txt`, content-type `text/plain`),
injecting `IWebHostEnvironment` for the gate.
Three options, with the recommendation:
- **(a) Endpoint `GET /robots.txt` [RECOMMENDED].** A few lines of code in the same place as the sitemap
endpoint; reads `IWebHostEnvironment.IsProduction()`; emits the production or non-production body. Single source
of truth for the gate, co-located with the sitemap, no infra dependency. The body is trivial.
- **(b) Static file + reverse-proxy rule.** Ship a production `robots.txt` in `wwwroot/` and have nginx serve a
`Disallow: /` variant (or block the file) on the beta host. **Cons:** splits the gate across app + nginx config
(two places to reason about, two places to get wrong); the beta protection lives in infra the app can't test;
Daniel would maintain an nginx rule per environment. Rejected unless Daniel specifically wants robots managed at
the proxy layer.
- **(c) Static file only.** Cannot express the gate at all — would either crawl-allow beta (violates E1) or
disallow production. **Rejected outright.**
The endpoint (a) is the natural sibling to the sitemap endpoint and keeps E1 in one testable place. Note the
ordering subtlety from `DeepDrftPublic/CLAUDE.md`: static-file middleware runs before component/controller
mapping, so **if** a literal `wwwroot/robots.txt` ever exists it would shadow the endpoint — the endpoint
approach requires that no static `robots.txt` is shipped (a one-line thing to verify, called out so it isn't
tripped over).
### 4.2 Content
**Production:**
```
User-agent: *
Allow: /
Sitemap: https://deepdrft.com/sitemap.xml
```
**Every non-production environment (beta/staging):**
```
User-agent: *
Disallow: /
```
- The `Sitemap:` line uses the absolute `SeoOptions.BaseUrl` origin (same config source as the sitemap's
`<loc>`s) — it is the one documented way to point crawlers at the sitemap without submitting it manually.
- The non-production body carries **no** `Sitemap:` line (consistent with AC-S4's "don't advertise beta URLs").
- Consider whether to additionally `Disallow: /FramePlayer` and the `api/*` proxy paths in Production (OQ-R2) —
the embed iframe and the JSON/stream proxy endpoints are not pages worth crawling.
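
The two bodies above reduce to a single pure function of the environment flag and the configured origin. The sketch below is illustrative Python, not the ASP.NET endpoint. `build_robots_txt` and its `extra_disallows` parameter are hypothetical names; `extra_disallows` only shows where the optional OQ-R2 rules would slot in if they are adopted.

```python
from typing import Iterable


def build_robots_txt(is_production: bool, base_url: str,
                     extra_disallows: Iterable[str] = ()) -> str:
    """Return the robots.txt body for the given environment."""
    lines = ["User-agent: *"]
    if is_production:
        lines.append("Allow: /")
        # Optional production-only exclusions (an open question, OQ-R2).
        lines.extend(f"Disallow: {path}" for path in extra_disallows)
        # The sitemap pointer uses the same configured origin as the <loc> values.
        lines.append(f"Sitemap: {base_url.rstrip('/')}/sitemap.xml")
    else:
        # Non-production: block everything and advertise no sitemap.
        lines.append("Disallow: /")
    return "\n".join(lines) + "\n"
```

Keeping the builder free of HTTP and hosting types means the gate can be unit-tested directly, with the controller reduced to reading the environment flag and writing the string as `text/plain`.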
### 4.3 Acceptance criteria (robots)
- **AC-R1 — Production allows + points.** `GET /robots.txt` on the production host returns `Allow: /` and a
`Sitemap: https://deepdrft.com/sitemap.xml` line.
- **AC-R2 — Beta disallows everything.** `GET /robots.txt` on any non-production host returns `User-agent: *` +
`Disallow: /` and **no** `Sitemap:` line (E1).
- **AC-R3 — Single gate.** The Production-vs-beta distinction is driven by `IWebHostEnvironment.IsProduction()`,
the same predicate as the sitemap and as Phase 22's `SeoEnvironment` seed — not a second config flag.
- **AC-R4 — `text/plain`.** Correct content-type; no BOM/HTML wrapper.
---
## 5. Item 3 — CMS `noindex` (the one CMS-touching item)
**This is the only Phase 23 item that touches `DeepDrftManager`.** Scoped, minimal, admin-chrome-only — **no
functional change** to any CMS page, no service/API/data change. `DeepDrftManager` is an authenticated admin app
that must never appear in any search index, in any environment (it has no "production is fine to index" case —
the CMS is *always* `noindex`, unlike the public site whose gate flips per environment).
### 5.1 Mechanism — defense in depth, cheapest-robust
Two layers; recommend **both** because they fail independently and the cost is trivial:
- **(a) `robots.txt` on the CMS host [primary].** A `Disallow: /` `robots.txt` served at the CMS root. Because the
CMS is *always* uncrawlable (no environment gate), this can be the **simplest possible static file** in the CMS
`wwwroot/` — no endpoint, no environment logic:
```
User-agent: *
Disallow: /
```
This is the cleanest single move and differs from the public `robots.txt` precisely because there is no
per-environment branch to express.
- **(b) Blanket `<meta name="robots" content="noindex,nofollow">` in the CMS layout `<head>` [belt-and-braces].**
A static meta tag in the CMS app's root `App.razor`/host `<head>` (the CMS's analogue of the public
`App.razor`'s static head block). This covers what a `robots.txt` disallow alone does not: disallow prevents
*crawling*, but a URL linked from elsewhere can still be *indexed* without being crawled, and an on-page
`noindex` is the directive that keeps a fetched page out of the index. One caveat: a crawler that honors
`Disallow: /` never fetches the page and so never sees the tag, so the meta takes effect only when a page is
fetched anyway (a crawler that ignores robots, or a missing or misserved `robots.txt`). It is a single static
line in the CMS host head: no per-page wiring, no component, no `SeoHead` port (the CMS does **not** get
Phase 22's component; this is one blanket tag).
Layer (a) is the floor; layer (b) is the robust ceiling. Together they cost a static file plus one `<head>` line.
### 5.2 Why the CMS does *not* reuse Phase 22's `SeoHead` / `SeoEnvironment`
Phase 22 C1/C9 explicitly kept the CMS out of scope ("Zero changes to `DeepDrftManager`"). Phase 23 makes the
**one** deliberate, minimal exception — but it does **not** drag the public component graph into the CMS. The CMS
need is a single constant directive ("never index"), not a parameterized per-page head surface; porting `SeoHead`
(a `DeepDrftPublic.Client` WASM component) into the server-rendered CMS would be wildly disproportionate. The
blanket meta + static robots is the right-sized answer. (And `SeoEnvironment`'s per-environment flip is
irrelevant here — the CMS is `noindex` in *all* environments, including production.)
### 5.3 Acceptance criteria (CMS noindex)
- **AC-C1 — CMS robots disallows.** `GET /robots.txt` on the CMS host returns `User-agent: *` + `Disallow: /`.
- **AC-C2 — Every CMS page carries `noindex`.** Any CMS page's prerendered `<head>` contains
`<meta name="robots" content="noindex,nofollow">` (the blanket layout tag), including the public-facing
`/account/login` and `/account/register` routes (which render in the lean `CmsHomeLayout`) and the home splash.
Confirm the meta lands in whichever head block both layouts inherit (the CMS host `App.razor`), so a
layout-specific head doesn't leave a route uncovered.
- **AC-C3 — No functional change.** No CMS page's behavior, auth gate, layout, or data path changes — the diff is
a static `robots.txt` and a static `<meta>` line. (Aligns with Phase 22 AC9's spirit, now scoped as the
intentional CMS exception.)
- **AC-C4 — Always-on (no env gate).** The CMS `noindex` holds in production too — it is unconditional, unlike the
public site.
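
AC-C2 can be checked mechanically against any fetched CMS page. The sketch below is an illustrative Python check using only the standard-library HTML parser. `RobotsMetaFinder` and `page_is_noindex` are hypothetical names, and the check assumes the prerendered HTML has already been fetched as a string.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> found inside <head>."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "robots":
                for token in (attributes.get("content") or "").split(","):
                    self.directives.add(token.strip().lower())

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def page_is_noindex(html: str) -> bool:
    """True when the page head declares both noindex and nofollow."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return {"noindex", "nofollow"} <= finder.directives
```

Running this over one route per layout (for example `/account/login` and an authenticated page) would confirm that the tag sits in the shared head block rather than in a single layout.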
---
## 6. Wave decomposition
These are **largely independent** — three separate surfaces with one shared concept (the env gate) and one shared
config value (`BaseUrl`). The dependency graph is shallow.
- **23.1 — Public env-gate primitives + `robots.txt` endpoint (cold-start, shared seam).** Stand up the
server-side `IWebHostEnvironment`-gated endpoint pattern on `DeepDrftPublic` and ship `GET /robots.txt`
(Production allow+sitemap-pointer / non-prod `Disallow: /`). This is the smallest item and it establishes the
**shared gate + BaseUrl wiring** that 23.2 also uses, so doing it first de-risks the seam. Resolves the
static-vs-endpoint call (OQ-R1). **Cold-start; nothing depends on it being done first except that 23.2 reuses
the same gate wiring.**
- **23.2 — `sitemap.xml` endpoint.** The release-enumeration walk over `GET api/release` + XML emission +
`ReleaseRoutes`/`BaseUrl` absolutization + the env gate (404 in non-prod). The largest item. **Shares the gate
+ BaseUrl wiring with 23.1** (do 23.1 first or co-develop; they touch the same controller area). The
`Sitemap:` line in 23.1's production `robots.txt` points at this — so 23.1's production body assumes 23.2 exists
(harmless if 23.2 lands slightly later: a `Sitemap:` pointer to a not-yet-built URL just 404s until it does).
- **23.3 — CMS `noindex` (the CMS-side item).** Static `robots.txt` (`Disallow: /`) in the `DeepDrftManager`
`wwwroot/` + blanket `<meta name="robots" content="noindex,nofollow">` in the CMS host `<head>`. **Fully
independent — touches only `DeepDrftManager`, shares nothing with 23.1/23.2, can run in parallel from day one.**
**Dependency shape:** `23.1 → 23.2` (shared gate/BaseUrl wiring + the `Sitemap:` pointer relationship); **23.3 ∥**
(parallel, independent, different app). The cold-start item is **23.1** (it proves the gate seam the public side
leans on); **23.3** can run start-to-finish alongside either.
**Validation (folded into each wave's ACs, not a separate wave):** the items are small enough that a dedicated
validation wave is overkill — each wave carries its own ACs (S/R/C above). A single end-of-phase check that
exercises the production-vs-beta matrix for all three (Google Search Console / a `curl` against both hosts, plus
the sitemaps.org validator) is worth doing once 23.1–23.3 land.
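
The end-of-phase matrix check for the public host could be scripted along these lines. This is a hedged Python sketch: `check_crawl_matrix` is a hypothetical helper that takes an already-fetched `robots.txt` body and the `/sitemap.xml` status code for one host, and it assumes the recommended 404-in-non-production behavior from AC-S4.

```python
from typing import List


def check_crawl_matrix(is_production: bool, robots_body: str, sitemap_status: int,
                       base_url: str = "https://deepdrft.com") -> List[str]:
    """Return the acceptance criteria one host violates (empty list means pass)."""
    lines = [line.strip() for line in robots_body.splitlines() if line.strip()]
    has_sitemap_line = any(line.lower().startswith("sitemap:") for line in lines)
    problems: List[str] = []
    if is_production:
        if "Allow: /" not in lines:
            problems.append("AC-R1: missing 'Allow: /'")
        if f"Sitemap: {base_url}/sitemap.xml" not in lines:
            problems.append("AC-R1: missing Sitemap pointer")
        if sitemap_status != 200:
            problems.append("AC-S1: sitemap not served")
    else:
        if "Disallow: /" not in lines:
            problems.append("AC-R2: missing 'Disallow: /'")
        if has_sitemap_line:
            problems.append("AC-R2: Sitemap line present on non-production")
        if sitemap_status != 404:  # assumes the recommended 404 gate
            problems.append("AC-S4: sitemap served on non-production")
    return problems
```

Feeding it the responses from both hosts gives a pass/fail list per environment; schema validation of the sitemap itself would still be done separately.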
---
## 7. Open questions for Daniel (product/infra calls, not implementation detail)
### Sitemap
- **OQ-S1 — Browse variants vs. canonical roots.** The sitemap lists the **canonical** browse roots (`/cuts`,
`/sessions`, `/mixes`, `/archive`). Phase 11 put Archive filters in the URL (`/archive?q=&medium=&genre=`).
**Recommend: do NOT enumerate filtered/paginated variants** — they are filtered *views* of the same release set,
not distinct content, and listing them invites duplicate-content dilution. The per-release detail URLs carry the
indexable content; the browse roots are navigational. `[Daniel decision — recommendation: canonical roots only]`
- **OQ-S2 — `lastmod` source.** Use `ReleaseDto.ReleaseDate` as the release URLs' `<lastmod>`? It is the *release*
date, not a content-last-modified date (a re-edited description or replaced cover would not bump it). **Recommend:
include it** — a plausible-but-imperfect lastmod is a useful crawl hint and strictly better than omitting it; the
alternative (a true content-modified timestamp) would need a schema column that doesn't exist (would violate
C5/no-schema-change). Static roots omit `lastmod`. `[Daniel decision — recommendation: ReleaseDate, accept the
imprecision]`
- **OQ-S3 — Static-root list source.** Hardcode the 6 static roots in the endpoint, or derive from the site's nav
index (`DeepDrftPublic.Client/Layout/Pages.cs` `AllPages`)? **Recommend: hardcode for v1** (the indexable-roots
set is *not* the same as the nav set — e.g. `/FramePlayer` is a nav-absent route that must stay out, and a new
nav entry isn't automatically sitemap-worthy), with a code comment to revisit if the set grows. Deriving couples
the sitemap to nav decisions in a way that can silently leak or drop URLs. `[Daniel decision — recommendation:
explicit list]`
### robots
- **OQ-R1 — Endpoint vs. static + nginx (§4.1).** **Recommend the endpoint** (single testable gate, co-located
with the sitemap). Confirm, or — if Daniel prefers robots managed at the reverse-proxy layer — the static +
nginx-rule variant (b), accepting the split gate. `[Daniel decision — recommendation: endpoint]`
- **OQ-R2 — Disallow non-page routes in Production?** Should the production `robots.txt` additionally
`Disallow: /FramePlayer` (the embed iframe) and/or `Disallow: /api/` (the proxy JSON/stream paths)? **Recommend:
yes for `/FramePlayer`** (an embed shell is not a destination page and would be thin/duplicate content if
crawled), **optional for `/api/`** (proxy paths return JSON/bytes, not HTML — crawlers mostly self-skip, but an
explicit disallow is tidy). `[Daniel decision — low stakes]`
### CMS
- **OQ-C1 — Both layers or just robots? (§5.1)** **Recommend both** (static `Disallow: /` robots **and** the
blanket `noindex` meta) — they fail independently and the combined cost is a file + one line; robots-disallow
alone does not de-index a URL discovered via an external link, which is exactly what the on-page `noindex`
closes. Confirm, or accept robots-only if the meta line is judged not worth the one CMS `<head>` touch. `[Daniel
decision — recommendation: both]`
### Cross-cutting
- **OQ-X1 — Is `https://deepdrft.com` the confirmed canonical origin?** This is Phase 22's OQ1, still load-bearing
here: every `<loc>` and the `Sitemap:` line assume `SeoOptions.BaseUrl = https://deepdrft.com`. If that value
was confirmed when Phase 22 landed (COMPLETED.md §22 shows it shipped as `https://deepdrft.com`), this is
closed — flagged only so the dependency is explicit. `[Likely closed — confirm BaseUrl is final]`
---
## 8. Cross-references (read before implementing)
- `product-notes/phase-22-seo-metadata-component.md` — the parent spec; §7 "Adjacent but separate concerns"
flagged all three Phase 23 items; the `SeoOptions.BaseUrl` / `ReleaseRoutes` / `SeoEnvironment` seams Phase 23
reuses are defined here.
- `COMPLETED.md §22` — what Phase 22 actually landed (the `SeoEnvironment` env gate, `SeoOptions.BaseUrl =
https://deepdrft.com`, the `ReleaseRoutes`-based canonical the sitemap must match).
- `DeepDrftPublic/Controllers/ReleaseProxyController.cs` — the thin-proxy shape and the `"DeepDrft.API"` named
client the sitemap endpoint reuses to walk releases (server-to-server, no proxy hop). **Note the distinction:**
the sitemap endpoint *enumerates + transforms*, it does not relay verbatim like this proxy.
- `DeepDrftPublic/CLAUDE.md` — the host's "thin HTTP boundary, no domain logic" contract; the middleware ordering
(static files before controller mapping — relevant to the robots endpoint-vs-static-file shadowing note); the
`IWebHostEnvironment` availability server-side.
- `DeepDrftPublic.Client/Common/ReleaseRoutes.cs`: `DetailHref(entryKey, medium)`, the single source of truth for
per-release detail URLs; every sitemap `<loc>` for a release goes through it.
- `DeepDrftPublic/Components/App.razor` — where `SeoEnvironment.IsProduction` is seeded from
`IWebHostEnvironment.IsProduction()` (lines 38–48); the Phase 23 endpoints read the **same** predicate directly.
- `DeepDrftAPI/Controllers/ReleaseController.cs` `GET api/release` — the paged `PagedResult<ReleaseDto>` read the
sitemap walks (returns `Items`, `TotalCount`, `PageNumber`, `PageSize`; `ReleaseDto` carries `EntryKey`,
`Medium`, `ReleaseDate`). No change to this endpoint (C5).
- `DeepDrftManager` host `App.razor` / `wwwroot/` — where Item 3's CMS robots file and blanket `noindex` meta land
(the one CMS-touching surface).
- sitemaps.org `0.9` schema + Google's "Manage your sitemaps" / robots.txt docs — the validation targets (AC-S1,
AC-R*).