docs: reflect Phase 23 SEO crawl directives as landed
This commit is contained in:
@@ -6,6 +6,25 @@ Newest entries at the top. Group by phase/wave header (mirroring `PLAN.md` / `CM
|
||||
|
||||
---
|
||||
|
||||
## Phase 23 — SEO Crawl Directives (landed 2026-06-23)
|
||||
|
||||
**Landed:** 2026-06-23 on dev.
|
||||
|
||||
- **What:** Server-side crawl-directive endpoints for `DeepDrftPublic` (`GET /robots.txt` and `GET /sitemap.xml`) plus a defense-in-depth noindex layer for `DeepDrftManager`. The endpoint/file-shaped follow-on to Phase 22's per-page `SeoHead` component. Phase 22 is the *content* of discoverability; Phase 23 is the *directives* layer above it — telling crawlers **which** pages exist and **whether** to crawl at all. No new `DeepDrftAPI` endpoint, no schema change.
|
||||
|
||||
- **Why:** Without robots.txt a crawler has no machine-readable signal about which routes to include or exclude (e.g. `/FramePlayer`, `/api/*`). Without sitemap.xml Google/Bing must discover release detail pages by link-following alone. Without noindex/robots protection the CMS could be inadvertently crawled if an admin link ever appeared on a public page.
|
||||
|
||||
- **Shape:**
|
||||
- **`DeepDrftPublic/Controllers/CrawlDirectiveController.cs`** (new): thin controller serving both endpoints. Reads `IWebHostEnvironment.IsProduction()` **directly** — no `SeoEnvironment` PersistentState bridge needed because these are server-side only (nothing crosses the server→WASM seam). Env gate is fail-safe closed: non-production robots.txt emits `Disallow: /` and the sitemap returns 404.
|
||||
- **`DeepDrftPublic/Seo/RobotsTxt.cs`** (new): pure builder for the robots.txt body. Production: `Allow: /` + `Disallow: /FramePlayer` + `Disallow: /api/` + `Sitemap:` pointer. Non-production: `Disallow: /`.
|
||||
- **`DeepDrftPublic/Seo/SitemapXml.cs`** (new): pure builder for the sitemap XML body. Walks `GET api/release` (server-to-server via the existing `"DeepDrft.API"` named client, paged) and emits a sitemaps.org `urlset`. Six explicit static roots (`/`, `/about`, `/cuts`, `/sessions`, `/mixes`, `/archive`) plus one `<url>` per release — `<loc>` = `SeoOptions.BaseUrl` + `ReleaseRoutes.DetailHref`, equal to the page's `SeoHead` canonical by construction; `<lastmod>` from `ReleaseDate`. Resilient: a partial/failed release read yields a well-formed roots-only document, never a 500.
|
||||
- **`DeepDrftManager/wwwroot/robots.txt`** (new static file): `Disallow: /` with no environment gate — the CMS is always uncrawlable, including in production.
|
||||
- **`DeepDrftManager/Components/App.razor`** (updated): blanket `<meta name="robots" content="noindex,nofollow">` in the CMS host `<head>` — defense in depth against de-indexing URLs discovered via external links, complementing the robots.txt directive.
|
||||
|
||||
- **Design memo:** `product-notes/phase-23-seo-crawl-directives.md`.
|
||||
|
||||
---
|
||||
|
||||
## Phase 22 — SEO Metadata Component (landed 2026-06-23)
|
||||
|
||||
**Landed:** 2026-06-23 on dev.
|
||||
|
||||
Reference in New Issue
Block a user