Documentation Audit & OKF Alignment Roadmap¶
A review of the docs/ tree against Google's Open Knowledge Format (OKF v0.1), with a prioritized roadmap and a file-by-file remediation appendix. This document recommends; it changes nothing on its own. The YAML header above deliberately models the frontmatter convention it proposes in §5.
1. Executive summary¶
LBS Foundry's documentation is already ~70% of the way to OKF philosophically: it is plain markdown, version-controlled in git, vendor-neutral, and even has an agent-facing index (the CLAUDE.md "Where to find things" table). What it lacks is machine-readability and consistency.
The single highest-leverage move is to adopt a lightweight YAML frontmatter convention with a required type field, then generate the index from it instead of hand-maintaining two competing indexes. Everything else (naming, placement, durable-vs-ephemeral separation) follows naturally once each document declares what it is.
Three numbers frame the work:
- 131 durable docs (excluding the auto-generated superpowers/specs/ tree).
- 0 of them currently carry YAML frontmatter, the heart of OKF.
- ~35 need relocation, renaming, re-filing, or reclassification (enumerated in the Appendix).
2. What OKF is, and why it is relevant to us¶
OKF is deliberately tiny. Knowledge is represented as markdown files with YAML frontmatter, stored in git, with no SDK. The only required frontmatter field is type; title, description, resource, tags, and timestamp are encouraged. Files are organised hierarchically (domain → concept type), cross-linked with ordinary markdown links, with optional index.md (progressive disclosure) and log.md (history).
The point of OKF is not the file format — it is convergence. Stop scattering knowledge across wikis, metadata catalogs, code comments, and people's heads; put it in one format that both humans and AI agents can read. Because the format is just markdown + a tiny metadata header, Google's reference tooling (a static HTML graph visualiser and an enrichment agent) works over any conformant bundle with no backend.
Why this matters for this repo specifically:
- We already feed docs to an agent. CLAUDE.md is a hand-curated agent index. Frontmatter would let that index be generated and let agents filter by type (e.g. "show me every runbook") and status (e.g. "ignore anything deprecated").
- We ship typed SDKs and an MCP server. Our knowledge is meant to be machine-consumed. OKF is the same instinct applied to prose.
- We have a clear domain hierarchy already (architecture, importers, outcome-context, models, sport). OKF's domain→type layout is a small step from where we are, not a rewrite.
OKF is "minimally opinionated" — extra frontmatter fields are explicitly allowed — so we can adopt its required type and encouraged fields and add our own (e.g. status) without breaking conformance.
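As a rough illustration of what filtering by type and status could look like once frontmatter exists, the TypeScript sketch below works on already-parsed metadata. The `DocMeta` shape and the `selectDocs` helper are hypothetical names for this example, not existing Foundry code, and one of the sample paths is invented.

```typescript
// Hypothetical shape for parsed frontmatter; only the fields needed here.
interface DocMeta {
  path: string;
  type: string;
  status?: string;
}

// Return docs of one type, skipping any whose status is in `excluded`.
// A doc with no status is treated as "current" (an assumption of this sketch).
function selectDocs(
  docs: DocMeta[],
  type: string,
  excluded: string[] = ["deprecated", "archived"],
): DocMeta[] {
  return docs.filter(
    (d) => d.type === type && excluded.indexOf(d.status ?? "current") === -1,
  );
}

const corpus: DocMeta[] = [
  { path: "runbooks/archive-non-luckbox-aggregates.md", type: "runbook", status: "current" },
  { path: "runbooks/old-procedure.md", type: "runbook", status: "deprecated" }, // invented path
  { path: "developer-guide/common-tasks.md", type: "guide" },
];

// "Show me every runbook, ignoring anything deprecated."
const runbooks = selectDocs(corpus, "runbook");
console.log(runbooks.map((d) => d.path));
```

Because the filter only reads two fields, any extra frontmatter we add later does not affect it.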
3. Current-state inventory¶
What is already good¶
- Markdown-in-git, vendor-neutral, human + agent readable — the OKF baseline, already met.
- Sensible top-level taxonomy with per-folder README.md indexes in many sections (architecture/, developer-guide/, adr/, outcome-context/, etc.).
- A curated agent index: the CLAUDE.md "Where to find things" table is, in effect, a partial OKF index already.
- Deliberate audience splits, e.g. event-history-viewer.md (backend) vs user-guides/event-history-viewer.md (end-user), and generic-query-pattern.md (design rationale) vs developer-guide/generic-queries.md (hands-on). These already cross-link each other. This is good information architecture; it just needs consistent placement and a type/audience tag.
Distribution (durable docs, excluding superpowers/)¶
| Area | Files | Notes |
|---|---|---|
| outcome-context/ | 21 | Self-contained, well-organised sub-tree (design + evaluations + sub-designs). |
| developer-guide/ | 20 | The daily-development core. |
| architecture/ | 12 | Concept + design docs. |
| adr/ | 11 | Mixed: 5 true ADRs, the rest are design/guide/ops docs (see gap 6). |
| api/, models/, integrations/, importers/ | 8 / 7 / 7 / 6 | Domain references. |
| sport/, testing/, samples/, getting-started/ | 5 / 4 / 4 / 4 | |
| plans/, user-guides/ | 3 / 2 | |
| usecases/, runbooks/, end-user-guides/, data-engineering/, configuration/ | 1 each | Several near-empty folders. |
| docs/ root (loose) | 12 | The largest single problem area (see gap 4). |
4. Gap analysis vs OKF¶
Each finding is tagged High / Medium / Low by leverage (impact ÷ effort).
Gap 1 — No frontmatter anywhere · High¶
Evidence: 0 of 131 docs begin with a YAML --- block. Nothing is machine-classifiable: an agent or tool cannot tell a runbook from a plan from a reference, cannot filter out deprecated docs, and cannot graph relationships beyond raw link-following.
OKF principle: frontmatter with a required type is the core of the format.
Fix: define the schema in §5, backfill (Phase 1 for hot docs, Phase 2 for the rest).
Gap 2 — Two indexes that drift · High¶
Evidence: docs/README.md and the CLAUDE.md "Where to find things" table are both hand-maintained, neither is complete, and they list different subsets. Neither references, e.g., runbooks/ or most of outcome-context/.
OKF principle: index.md for progressive disclosure — but an index that must be hand-synced will always lag.
Fix: generate the index from frontmatter (see §5.3). One source of truth, cannot drift.
Gap 3 — Durable docs mixed with ephemera · High¶
Evidence: point-in-time artifacts sit beside durable references with no signal distinguishing them — nfl-season-structure-pr-notes.md, clickhouse-schema-review.md, canonical-entity-mapping-deck-outline.md, integrations/clerk-webhook-implementation-plan.md, api/ballr/LBS-1051-TradeAssist-Implementation.md, the outcome-context/evaluations/storage-experiment/status*.md set.
OKF principle: type (and a status extension) is exactly what disambiguates "this is a permanent reference" from "this was a working note in March."
Fix: assign type: note | plan | evaluation and status: archived where appropriate; consider docs/archive/ for truly dead material.
Gap 4 — 12 homeless files at docs/ root · Medium¶
Evidence: see Appendix A.1. Only README.md (and arguably the pillar docs intro.md / security.md, both referenced by CLAUDE.md) belong at root. The ballr-*.md trio clearly belongs under api/ballr/ or sport/.
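Fix: relocate per the appendix; update the CLAUDE.md / README.md links that point at the moved docs (the "move + re-link" rows in Appendix A.1). The pillar docs stay at root, so links to them do not change.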
Gap 5 — Naming drift · Medium¶
Evidence: the tree is mostly kebab-case, but architecture/IDENTITY_LINKING_PROBLEM_SUMMARY.md (SHOUTY_SNAKE), api/ballr/LBS-1051-TradeAssist-Implementation.md (ticket-prefixed PascalCase), sport/SuperCoachParticipantStats/CricketStatsRules.md (PascalCase dir + file), and models/.../nfl_sim_flowchart.md (snake_case) break it. Ticket numbers belong in frontmatter, not filenames.
Fix: rename to kebab-case (Appendix A.2). Exception: nfl_sim_flowchart.md is a generated artifact referenced by CLAUDE.md and the regen-nfl-flowchart skill — leave it (or change the generator + references together, not the file alone). Conventional uppercase (README.md, CONTEXT.md, UPDATING.md) is fine and should stay.
Gap 6 — adr/ mixes types · Medium¶
Evidence: adr/ contains 5 genuine ADRs (adr-001/007/008/009, real-time-notification-system.md) alongside design docs (consumer-api-design.md, content-workflow-design.md, event-sourcing-content-implementation.md), an ops doc (deployment-operations.md), and a guide (strapi-integration-guide.md). An ADR is an immutable decision record; a design doc evolves. Conflating them weakens both.
Fix: keep true ADRs in adr/; move the rest to design/, runbooks/, integrations/ per Appendix A.3.
Gap 7 — No freshness signal · Low¶
Evidence: nothing in-doc indicates currency. Git knows, but a reader (or agent) scanning the file does not. Stale docs read as authoritative.
OKF principle: timestamp (and optional log.md).
Fix: add updated + status to frontmatter; optionally a CI check that flags docs untouched for N months.
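The freshness check could be as small as the sketch below. It assumes the `updated` field holds an ISO date string such as 2026-06-19; the function names and the six-month threshold are illustrative choices, not agreed values.

```typescript
// Whole calendar months elapsed between two dates, in UTC.
// A month only counts once the day-of-month has been reached.
function monthsBetween(from: Date, to: Date): number {
  const months =
    (to.getUTCFullYear() - from.getUTCFullYear()) * 12 +
    (to.getUTCMonth() - from.getUTCMonth());
  return to.getUTCDate() < from.getUTCDate() ? months - 1 : months;
}

// True when `updated` (YYYY-MM-DD) is at least `maxMonths` months before `now`.
function isStale(updated: string, now: Date, maxMonths: number): boolean {
  const parsed = new Date(updated + "T00:00:00Z");
  if (isNaN(parsed.getTime())) {
    throw new Error("unparseable updated date: " + updated);
  }
  return monthsBetween(parsed, now) >= maxMonths;
}

const now = new Date("2026-06-20T00:00:00Z");
console.log(isStale("2026-06-19", now, 6)); // touched yesterday
console.log(isStale("2025-11-01", now, 6)); // untouched for seven months
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check deterministic in CI and easy to test.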
5. Proposed conventions¶
5.1 The type taxonomy (for this repo)¶
A small, closed set. Every doc gets exactly one.
| type | Meaning | Lifecycle | Examples |
|---|---|---|---|
| index | Navigation / table of contents | living | every README.md, docs/README.md |
| concept | Explains how/why something works | living | architecture/event-sourcing.md, intro.md |
| guide | Task-oriented how-to | living | developer-guide/common-tasks.md, getting-started/* |
| reference | Authoritative factual lookup | living | security.md, architecture/database-schema-diagram.md, api/ballr/* |
| design | Technical design / rationale | evolves, then settles | architecture/*-design.md, outcome-context/design.md |
| adr | Architecture Decision Record | immutable once accepted | adr/adr-00X-*.md |
| runbook | Operational procedure | living | runbooks/*, developer-guide/deployment-pipeline-setup.md |
| model | Simulation/model documentation | living | models/americanfootball/* |
| plan | Roadmap / implementation plan | ephemeral | plans/*, outcome-context/roadmap.md |
| evaluation | R&D findings, experiments | ephemeral | outcome-context/evaluations/* |
| note | Working note, review, draft | ephemeral | clickhouse-schema-review.md, *-pr-notes.md |
| sample | Sample data / diagnostic output | ephemeral | samples/* |
5.2 Frontmatter schema¶
OKF-conformant field names (so Google's reference visualiser/tooling works out of the box), plus a status extension (OKF permits extra fields):
---
type: guide # REQUIRED — one value from the taxonomy in §5.1
title: Common Tasks # REQUIRED — human title (often matches H1)
description: How to add a query or command in Foundry. # one line; powers search + index
status: current # current | draft | deprecated | archived
tags: [cqrs, marten, commands] # for discovery / filtering
updated: 2026-06-19 # OKF calls this `timestamp`; `updated` is clearer for prose
resource: src/Domain/... # OPTIONAL — link to the code/system this documents
audience: developer # OPTIONAL extension — developer | end-user | ops
---
Minimum to be conformant and useful: type, title, description, status, updated. The rest are encouraged where they add value.
Decision needed: keep OKF's exact timestamp field, or use the friendlier updated? (See §8.) Whichever we pick, apply it uniformly.
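To make the schema above concrete, here is a minimal validation sketch in TypeScript. It assumes frontmatter has already been parsed into a plain object with string values (a YAML parser may hand back a date object for `updated`, which this sketch does not handle), and it uses `updated` pending the field-name decision. `validateFrontmatter` and the constant names are hypothetical.

```typescript
// Closed sets taken from the type taxonomy and the status values above.
const DOC_TYPES: readonly string[] = [
  "index", "concept", "guide", "reference", "design", "adr",
  "runbook", "model", "plan", "evaluation", "note", "sample",
];
const STATUSES: readonly string[] = ["current", "draft", "deprecated", "archived"];
// Minimum field set; swap "updated" for "timestamp" if that name is chosen.
const REQUIRED_FIELDS = ["type", "title", "description", "status", "updated"];

// Returns human-readable problems; an empty array means the doc passes.
function validateFrontmatter(fm: Record<string, unknown>): string[] {
  const problems: string[] = [];
  // Read a field as trimmed text; non-strings count as empty.
  const text = (field: string): string => {
    const value = fm[field];
    return typeof value === "string" ? value.trim() : "";
  };
  for (const field of REQUIRED_FIELDS) {
    if (text(field) === "") problems.push("missing or empty field: " + field);
  }
  const type = text("type");
  if (type !== "" && DOC_TYPES.indexOf(type) === -1) {
    problems.push("unknown type: " + type);
  }
  const status = text("status");
  if (status !== "" && STATUSES.indexOf(status) === -1) {
    problems.push("unknown status: " + status);
  }
  const updated = text("updated");
  if (updated !== "" && !/^\d{4}-\d{2}-\d{2}$/.test(updated)) {
    problems.push("updated is not YYYY-MM-DD: " + updated);
  }
  return problems;
}

const good = {
  type: "guide",
  title: "Common Tasks",
  description: "How to add a query or command in Foundry.",
  status: "current",
  updated: "2026-06-19",
};
const bad = { type: "tutorial", title: "Untitled" };

console.log(validateFrontmatter(good));
console.log(validateFrontmatter(bad));
```

Returning a list of problems instead of throwing lets the same function back either a warning or a hard failure in CI.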
5.3 The generated index (replaces hand-maintained indexes)¶
A small script, a natural fit for the existing TS toolchain (src/Tools/ or a pnpm docs:index task), walks docs/, parses frontmatter, and emits docs/README.md grouped by type/area, skipping anything with status: archived. Run it in CI; fail the build (or warn) if any doc lacks required frontmatter. This kills Gap 2 permanently and gives us a free OKF-style index.
The CLAUDE.md table can either be generated by the same pass or kept as a curated subset that links to the generated index — but it should stop trying to be a second full catalogue.
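A sketch of the core of such a generator follows. It is not the committed tool: it takes file contents in memory instead of walking docs/, groups by type only, and uses a deliberately naive "key: value" reader in place of a real YAML parser. All names (`DocFile`, `readFrontmatter`, `buildIndex`) and the sample contents are assumptions for illustration.

```typescript
interface DocFile { path: string; content: string; }
interface IndexResult { markdown: string; missing: string[]; }

// Naive reader for a leading --- block: flat "key: value" lines only.
// No nesting, lists, or quoting; a trailing " # comment" is dropped.
function readFrontmatter(content: string): Record<string, string> | null {
  const lines = content.split(/\r?\n/);
  if (lines[0] !== "---") return null;
  const end = lines.indexOf("---", 1);
  if (end === -1) return null;
  const fields: Record<string, string> = {};
  for (const line of lines.slice(1, end)) {
    const m = /^([A-Za-z_]+):\s*(.*)$/.exec(line);
    if (m) fields[m[1]] = m[2].replace(/\s+#.*$/, "").trim();
  }
  return fields;
}

// Group docs by type, skip archived ones, and report docs that lack
// the fields the index needs (type and title).
function buildIndex(files: DocFile[]): IndexResult {
  const byType: Record<string, string[]> = {};
  const missing: string[] = [];
  for (const file of files) {
    const fm = readFrontmatter(file.content);
    const type = fm ? fm["type"] : undefined;
    const title = fm ? fm["title"] : undefined;
    if (!fm || !type || !title) {
      missing.push(file.path); // candidates for a CI warning or failure
      continue;
    }
    if (fm["status"] === "archived") continue;
    const description = fm["description"] ? ": " + fm["description"] : "";
    const bucket = byType[type] || [];
    bucket.push("- [" + title + "](" + file.path + ")" + description);
    byType[type] = bucket;
  }
  const sections = Object.keys(byType).sort().map(
    (type) => "## " + type + "\n\n" + (byType[type] || []).sort().join("\n"),
  );
  return { markdown: sections.join("\n\n") + "\n", missing: missing.sort() };
}

const result = buildIndex([
  {
    path: "developer-guide/common-tasks.md",
    content: "---\ntype: guide\ntitle: Common Tasks\ndescription: How to add a query or command in Foundry.\nstatus: current\n---\n# Common Tasks\n",
  },
  {
    path: "clickhouse-schema-review.md",
    content: "---\ntype: note\ntitle: ClickHouse Schema Review\nstatus: archived\n---\n",
  },
  { path: "intro.md", content: "# Intro\n" },
]);
console.log(result.markdown);
console.log(result.missing);
```

Sorting both the sections and the entries keeps the output stable between runs, so a regenerated docs/README.md only shows a diff when the underlying frontmatter changes.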
5.4 Naming & placement rules¶
- kebab-case for all filenames and directories. Conventional uppercase entry files (README.md, CONTEXT.md, UPDATING.md) excepted.
- No ticket numbers in filenames: put LBS-XXXX in frontmatter (tags or a ticket field).
- No durable docs at docs/ root except README.md and the two pillar docs (intro.md, security.md) that CLAUDE.md links by stable path.
- One folder per type-or-domain: don't mix ADRs with designs (Gap 6).
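The first two rules above are mechanical enough to lint. The sketch below is one possible check, assuming paths are given relative to docs/ and that tickets always look like LBS- followed by digits; `checkDocPath` and the constants are hypothetical names. It does not cover the root-placement or one-folder-per-type rules.

```typescript
// Conventional uppercase entry files that are exempt from kebab-case.
const UPPERCASE_ENTRY_FILES = ["README.md", "CONTEXT.md", "UPDATING.md"];
// Lowercase letters and digits, separated by single hyphens.
const KEBAB_SEGMENT = /^[a-z0-9]+(-[a-z0-9]+)*$/;
// Assumed ticket pattern: LBS-<digits>, in any letter case.
const TICKET = /LBS-\d+/i;

// Returns naming problems for one path relative to docs/; empty means OK.
function checkDocPath(path: string): string[] {
  const problems: string[] = [];
  const segments = path.split("/");
  const file = segments[segments.length - 1];
  segments.forEach((segment, i) => {
    const isFile = i === segments.length - 1;
    if (isFile && UPPERCASE_ENTRY_FILES.indexOf(segment) !== -1) return;
    // Directories are checked whole; files are checked without ".md".
    const stem = isFile ? segment.replace(/\.md$/, "") : segment;
    if (!KEBAB_SEGMENT.test(stem)) problems.push("not kebab-case: " + segment);
  });
  if (TICKET.test(file)) problems.push("ticket number in filename: " + file);
  return problems;
}

console.log(checkDocPath("developer-guide/common-tasks.md"));
console.log(checkDocPath("api/ballr/LBS-1051-TradeAssist-Implementation.md"));
console.log(checkDocPath("sport/SuperCoachParticipantStats/CricketStatsRules.md"));
```

A real check would also need an allowlist for deliberate exceptions such as the generated nfl_sim_flowchart.md, which this sketch would flag.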
6. Target structure¶
The shape after alignment (only deltas from today shown):
docs/
├── README.md # GENERATED index (type: index)
├── intro.md # pillar (type: concept) — kept at root
├── security.md # pillar (type: reference) — kept at root
├── meta/ # NEW — docs about docs
│ ├── documentation-audit.md # this file
│ └── documentation-guide.md # the conventions in §5, as living guidance
├── design/ # NEW — technical designs split out of adr/
├── adr/ # true ADRs only
├── archive/ # OPTIONAL — status:archived material, or use the status field in place
├── architecture/ developer-guide/ api/ importers/ integrations/
├── models/ outcome-context/ sport/ testing/ samples/ runbooks/
├── getting-started/ user-guides/ end-user-guides/ usecases/
└── ... # (loose root files relocated into the above)
outcome-context/ is already well-structured and self-contained — leave its layout, just add frontmatter.
7. Phased roadmap¶
Phase 0 — Decide (this document)¶
Approve the type taxonomy (§5.1), the frontmatter schema (§5.2), and the open questions in §8. Output: a short docs/meta/documentation-guide.md capturing the agreed conventions. Effort: ~1 hr of review.
Phase 1 — Quick wins · ~half a day¶
Mechanical, high-visibility, low-risk:
- Add frontmatter to the ~30 hot docs (everything linked from CLAUDE.md + docs/README.md).
- Relocate the 12 loose root files (Appendix A.1) and fix the handful of cross-links.
- Rename the naming outliers (Appendix A.2).
- Tag the obvious ephemera with status: archived / type: note (Appendix A.4).
Unlocks: a clean root, consistent naming, and a critical mass of frontmatter to prototype the index generator.
Phase 2 — Structural · ~1–2 days¶
- Backfill frontmatter across all 131 docs.
- Split adr/ → adr/ + design/ (Appendix A.3).
- Build the index generator + CI check (§5.3); switch docs/README.md to generated.
- Decide and apply the archive strategy (folder vs status).
Unlocks: drift-proof index, machine-queryable corpus, enforced consistency.
Phase 3 — Optional / aspirational¶
- Point Google's OKF static HTML visualiser at docs/ for a browsable knowledge graph (zero backend).
- Add an enrichment step: an agent (we already run Claude + MCP) that proposes frontmatter and cross-links on new/changed docs in PRs.
- Extend the OKF idea beyond prose to the data catalog (Marten projections / BigQuery-style table + metric docs), which is OKF's original use case.
Unlocks: the full "knowledge graph + agent reasoning" payoff OKF is designed for.
8. Open questions / decisions for you¶
- Field name: OKF-exact timestamp, or human-friendly updated? (Recommend updated.)
- Archive strategy: a docs/archive/ folder, or a status: archived field with docs left in place? (Recommend the status field: less churn, preserves links.)
- Index ownership: generate docs/README.md and the CLAUDE.md table, or generate the README and keep CLAUDE.md as a curated subset? (Recommend the latter.)
- Ephemeral planning docs (plans/, parts of outcome-context/): keep them in docs/ tagged type: plan, or move working-process docs out of the published tree entirely?
- Enforcement: should missing frontmatter fail CI, or only warn? (Recommend warn first, fail after Phase 2 backfill is complete.)
Appendix: file-by-file remediation¶
Only files needing action are listed. Every other doc simply receives frontmatter in Phase 1/2 with no move or rename. "Re-link" means update the one or two references in CLAUDE.md / docs/README.md / sibling docs.
A.1 Relocate loose root files¶
| Current path | Action | Proposed path | type |
|---|---|---|---|
| ballr-get-supercoach-team-rankings.md | move | api/ballr/get-supercoach-team-rankings.md | reference |
| ballr-link-unlink-supercoach-team.md | move | api/ballr/link-unlink-supercoach-team.md | reference |
| ballr-search-supercoach-teams.md | move | api/ballr/search-supercoach-teams.md | reference |
| event-history-viewer.md | move + re-link | developer-guide/event-history-viewer-backend.md | guide |
| generic-query-pattern.md | move + re-link | architecture/generic-query-pattern.md | design |
| development-workflow.md | move | developer-guide/development-workflow.md | guide |
| clickhouse-schema-review.md | reclassify | meta/reviews/clickhouse-schema-review.md (or in place) | note (status: archived) |
| canonical-entity-mapping-deck-outline.md | reclassify | meta/notes/canonical-entity-mapping-deck-outline.md | note (status: archived) |
| nfl-season-structure-pr-notes.md | move + reclassify | models/americanfootball/nfl-season-structure-pr-notes.md | note (status: archived) |
| intro.md | keep at root | (unchanged) | concept |
| security.md | keep at root | (unchanged) | reference |
| README.md | keep at root | (unchanged) | index (generated) |
A.2 Rename for kebab-case¶
| Current path | Proposed path | Note |
|---|---|---|
| architecture/IDENTITY_LINKING_PROBLEM_SUMMARY.md | architecture/identity-linking-problem-summary.md | type: design (or note if stale) |
| api/ballr/LBS-1051-TradeAssist-Implementation.md | api/ballr/tradeassist-implementation.md | ticket LBS-1051 → frontmatter tags/ticket |
| sport/SuperCoachParticipantStats/CricketStatsRules.md | sport/supercoach-participant-stats/cricket-stats-rules.md | rename dir + file |
| models/americanfootball/nfl-flow/rendered/nfl_sim_flowchart.md | (leave as-is) | generated artifact; change generator + CLAUDE.md ref together if ever renamed |
A.3 Re-file out of adr/ (type confusion)¶
| Current path | Proposed path | type |
|---|---|---|
| adr/consumer-api-design.md | design/consumer-api-design.md | design |
| adr/content-workflow-design.md | design/content-workflow-design.md | design |
| adr/event-sourcing-content-implementation.md | design/event-sourcing-content-implementation.md | design |
| adr/deployment-operations.md | runbooks/deployment-operations.md | runbook |
| adr/strapi-integration-guide.md | integrations/strapi-integration-guide.md | guide |
| adr/adr-001/007/008/009-*.md, adr/real-time-notification-system.md, adr/README.md | (keep) | adr / index |
A.4 Classify ephemera (assign type + status; no move required)¶
| Path(s) | type | Suggested status |
|---|---|---|
| plans/* (3 files) | plan | current / archived per state |
| outcome-context/roadmap.md, phase-breakdown.md, sequencing-rationale.md | plan | current |
| outcome-context/query-layer/build-plan.md, gap-analysis.md, rd-findings.md | plan / evaluation | current |
| outcome-context/evaluations/** (incl. storage-experiment/status*.md) | evaluation | archived where the experiment is closed |
| integrations/clerk-webhook-implementation-plan.md | plan | archived if shipped |
| integrations/discord-verification-frontend-implementation.md | design / note | current |
| samples/* (4 files) | sample | current |
| runbooks/archive-non-luckbox-aggregates.md | runbook | current (already correctly placed) |
A.5 Audience-split pairs (keep both; co-locate + tag)¶
These are not duplicates — keep both, ensure cross-links survive any move, and add audience frontmatter:
| Pair | Action |
|---|---|
| event-history-viewer.md (backend) ↔ user-guides/event-history-viewer.md (end-user) | move backend half into developer-guide/ (A.1); add audience: developer / audience: end-user; keep the existing cross-links |
| generic-query-pattern.md (design) ↔ developer-guide/generic-queries.md (hands-on) | move design half into architecture/ (A.1); tag type: design / type: guide; keep cross-links |
Execution status (2026-06-20)¶
The structural recommendations and the frontmatter backfill were applied on branch docs/okf-alignment. What shipped, and where it deviated from the plan above:
Applied
- All relocations (A.1), renames (A.2), and the audience-split moves (A.5), with every inbound/outbound relative link updated. A link checker confirms the move introduced no broken links.
- OKF frontmatter added to all 131 durable docs (this file and the conventions guide already had it). updated reflects each file's last git commit date; 8 ephemeral docs are marked status: archived.
- docs/README.md regenerated as a complete index from frontmatter (closes Gap 2).
- CLAUDE.md gains a row pointing to the conventions guide; UTF-8 BOMs were stripped from 4 files.
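For readers who want to repeat the relative-link check after future moves, the sketch below shows the general idea. It is not the link checker that was used here; `resolveLink` and `findBrokenLinks` are hypothetical helpers that work on in-memory strings, treat every path as relative to docs/, and only recognise inline `[text](target)` links.

```typescript
// Resolve a relative link against the directory of the linking file.
// Paths are POSIX-style and relative to docs/.
function resolveLink(fromPath: string, link: string): string {
  const parts = fromPath.split("/").slice(0, -1); // directory of the linking file
  for (const piece of link.split("/")) {
    if (piece === "" || piece === ".") continue;
    if (piece === "..") parts.pop();
    else parts.push(piece);
  }
  return parts.join("/");
}

// Return link targets in `markdown` that do not resolve to a known file.
function findBrokenLinks(fromPath: string, markdown: string, known: string[]): string[] {
  const broken: string[] = [];
  const linkPattern = /\[[^\]]*\]\(([^)\s]+)\)/g;
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(markdown)) !== null) {
    const target = match[1].split("#")[0]; // drop any #anchor
    // Skip same-page anchors and anything with a scheme (http:, mailto:, ...).
    if (target === "" || /^[a-z][a-z0-9+.-]*:/i.test(target)) continue;
    if (known.indexOf(resolveLink(fromPath, target)) === -1) broken.push(target);
  }
  return broken;
}

const known = [
  "architecture/generic-query-pattern.md",
  "developer-guide/generic-queries.md",
];
const page =
  "See the [design rationale](../architecture/generic-query-pattern.md#why) " +
  "and the [old location](../generic-query-pattern.md).";

console.log(findBrokenLinks("developer-guide/generic-queries.md", page, known));
```

Because `known` holds files only, a directory-style target such as ./api/ is reported as broken, which is the same class of link listed under the pre-existing issues.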
Deviations from the plan (discovered while executing)
- A.3 dropped. The "design docs" in adr/ (consumer-api-design, content-workflow-design, event-sourcing-content-implementation, deployment-operations, strapi-integration-guide) are catalogued as ADR-002 through ADR-006 in adr/README.md — they are genuine ADRs, not mis-filed designs. They were kept in adr/ and tagged type: adr. Optional follow-up: rename them to the adr-00X- filename convention (touches the dense ADR cross-link web; deferred for deliberate review).
- clickhouse-schema-review.md kept at root. It is referenced by ~8 links from ADR-009 and the outcome-context evaluations as a permanent record; relocating it was high-churn for little gain. Tagged type: note, status: archived, left in place.
Deferred (Phase 2/3)
- The index generator + CI enforcement: docs/README.md was regenerated once by a hand-run script, but no generator is committed to the build (see §5.3).
- Hard CI gating on missing frontmatter; the OKF HTML visualiser / enrichment agent (Phase 3).
Pre-existing issues surfaced (not caused by this change) — for your triage
- clickhouse-schema-review.md links a deleted file (...StorageExperiment/.../ClickHouseSchemas.cs); that project no longer exists.
- Four directory-style links don't resolve: discord-integration.md → ./api/, ./deployment/; storage-experiment/status.md → samples/, samples/experiment-runs/.
This document began as the plan; the section above records the executed result on branch docs/okf-alignment.